Breaking Through the Limits with PromptPerfect: Generating the Images Models Don't Want You to See

💡

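Calm down, we're not going to look at that kind of image (whatever you think "that kind" is).

Straight to the point: sometimes you want to generate a perfectly harmless image, but the model (say, DALL-E 3 or Stable Diffusion XL) either flatly refuses or produces something completely wrong. PromptPerfect can help with this and get you better, more accurate results.

In this post, we'll compare the two models, explain how to use PromptPerfect to improve your experience, and run tests that show each model's results before and after using the PromptPerfect optimizer.

Of course, we won't generate (or try to generate) any inappropriate images. This is a family-friendly post, especially for families who like octodogs or dogtopuses. Or whatever we end up calling the strange many-legged puppies we create later in the post.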

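DALL-E 3 and Stable Diffusion XL

There are plenty of models out there, but today we'll focus on two shiny new stars: DALL-E 3 from OpenAI and Stable Diffusion XL from Stability AI. Both can deliver good results, but each has its strengths and weaknesses.

DALL-E 3 is inherently good at understanding long sentences and the relationships between objects, and it draws more realistic anatomy than Stable Diffusion XL (no Lovecraftian horror hands here). However, it often flatly refuses to generate images of celebrities (like Taylor Swift) or well-known characters (like Mickey Mouse, even when we ask for the out-of-copyright Steamboat Willie version). It also generates text better than any other image generation model (though that's a low bar).

Stable Diffusion XL is more open to generating celebrities and well-known characters, even if some of its Mickey Mouses look like they were drawn while on drugs. However, it often gets anatomy and object relationships wrong. And while you can ask it to generate text (and you can tell it's trying its very best), it lags far behind DALL-E 3 in this area.

With PromptPerfect we can overcome some of the shortcomings of both models. We'll compare DALL-E 3 and Stable Diffusion XL before and after optimization with PromptPerfect. You can skip straight to the end to see who the final winner is.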

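Using PromptPerfect's optimizer

In this battle of the models, we use PromptPerfect's optimizer to see how it can get better images out of our prompts. Here's how to use it:

Sign up for free credits at PromptPerfect:

💡

Try a paid plan free for 7 days. Subscribe within 24 hours of your first login and get 40% off!

Click the Interactive feature: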

Dark themed webpage of PromptPerfect! with a navigation bar and titles like "Interactive" and "Auto-tune."

In the "Optimizer" pane (on the right), enter something like generate a prompt to create an image of felix the cat using DALL-E 3:

Interface of an AI assistant tool with option to create a playful image of Felix the Cat using DALL-E 3.

Click "Send to assistant":

Screenshot of a DALL-E 3 interface with options to create a playful, whimsical image of Felix the Cat, including buttons for

It thinks for a moment, then generates images from the prompt in the "Interactive" pane on the left:

Progression of Felix the Cat illustrations from sketch to a lively pop art style creation.

Refine your prompt by chatting with the optimizer, then repeat the process:

Screenshot of an artistic request page for creating a Felix the Cat illustration in the 1930s rubber hose animation style, em

Contest method

For the "before" images, we use:

ChatGPT (GPT-4), generating images through DALL-E with the prompt generate an image of <thing>, for example generate an image of mickey mouse.
Replicate's interface, generating images through Stable Diffusion XL with the prompt <thing>, for example mickey mouse.

For the "after" images, we use PromptPerfect's interactive optimizer with the prompt generate a prompt to create an image of <thing> using <model name>.
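The prompt patterns above can be written down as simple templates. The following is only a sketch of how the "before" and "after" prompts for one subject might be assembled. The names BASELINE_TEMPLATES, OPTIMIZER_TEMPLATE, and build_prompts are hypothetical and exist purely for illustration; only the prompt wording comes from this post, and nothing here calls PromptPerfect, ChatGPT, or Replicate.

```python
# Hypothetical helper: only the prompt wording is taken from the article.
BASELINE_TEMPLATES = {
    "DALL-E 3": "generate an image of {thing}",  # typed into ChatGPT (GPT-4)
    "Stable Diffusion XL": "{thing}",            # sent to SDXL via Replicate
}

# Typed into PromptPerfect's interactive optimizer for the "after" images.
OPTIMIZER_TEMPLATE = "generate a prompt to create an image of {thing} using {model}"


def build_prompts(thing):
    """Return the 'before' and 'after' prompt text for each model."""
    return {
        model: {
            "before": template.format(thing=thing),
            "after": OPTIMIZER_TEMPLATE.format(thing=thing, model=model),
        }
        for model, template in BASELINE_TEMPLATES.items()
    }


prompts = build_prompts("mickey mouse")
for model, pair in prompts.items():
    print(model, "| before:", pair["before"], "| after:", pair["after"])
```

Keeping the templates in one place makes it easy to confirm that every round uses exactly the same wording for both models.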

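We'll show the first output that comes up. The actual number of images may vary: PromptPerfect always generates four, Stable Diffusion XL (via Replicate) generates one, and DALL-E 3 generates one or two.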

💡

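Although PromptPerfect's optimizer is interactive (so you can refine the prompt conversationally), to be as fair as possible we only use the first result. You'll get better results by making full use of the optimizer's interactive features.

We'll award medals according to the following criteria:

💩 - Flat-out refused to cooperate
🥉 - Tried, but none of the outputs is what we wanted
🥈 - At least one output is pretty good!
🥇 - Awesome, at least one output is really good!

At the end we'll sum up and see which model and method come out on top.

Who will be the next top model?

Models, start your engines!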

Round 1: Famous people

Let's start with our lord and savior Taylor Swift. Here's the real photo we're trying to match:

Taylor Swift wearing a black dress with deep V-neckline and gold necklace, posing with a hand on her hip, against a colorful — License: CC BY 3.0, attribution: iHeartRadioCA

Without PromptPerfect, DALL-E 3 flatly refuses to create Taylor:

Chat interface showing a denied request to generate an image of Taylor Swift by ChatGPT due to content policy.

With PromptPerfect, it generated images from the optimized prompt, but none of them really looks like her:

Taylor Swift in an edited portrait with suggestions for a red sequined dress, styled hair, and dramatic stage lighting.

With SDXL, we got a pretty good likeness even before using PromptPerfect:

Portrait of Taylor Swift with red lipstick, blondish-brown hair, wearing a dress against a pink background.

And PromptPerfect's optimized prompt again delivered good results:

Edited collage of Taylor Swift with multiple effects via a photo editor, highlighting options like Optimize and Assistant.

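Let's see which models can really generate-generate-generate:

	Before optimization	After optimization
DALL-E 3	💩 Flat-out refused	🥉 Blonde? Check. Singer? Check. Taylor? Nope
Stable Diffusion XL	🥇 Has that Swift feeling	🥇 Pretty Taylor-like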

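Round 2: "Copyrighted" material

We won't even attempt genuinely copyrighted material. That's a can of worms we don't want to open. However, the design of Mickey Mouse from Steamboat Willie has been out of copyright since 2024: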

Cartoon of Mickey Mouse dressed in captain attire, steering a ship's wheel with a joyful expression, in a classic black-and-w

Let's use him as our subject. DALL-E 3 flatly refuses right from the start:

Chat exchange in Slack showing a user request for a 'Mickey Mouse from Steamboat Willie' image and ChatGPT's polite policy vi

With PromptPerfect, we got results with a similar feel, but not in the 1930s rubber hose style:

Black and white image editing screen featuring classic Mickey Mouse on a steamboat, with detailed creative instructions.

Stable Diffusion really is trying. This Mickey gives you extra value on ears, eyes, and fingers:

Black and white illustration of Mickey Mouse gesturing in a playful stance.

After optimizing with PromptPerfect, Stable Diffusion still gives us a fever-dream Mickey, but it's more of a mild fever than a "just how strong were those mushrooms?" fever:

Series of Mickey Mouse images showcasing artistic transformation from vintage black and white to modern 3D CGI, created with

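Which model made the worst mess of Mickey?

	Before optimization	After optimization
DALL-E 3	💩 Policy, policy. This material is definitely out of copyright.	🥈 Definitely a Mickey vibe and no weirdness, just not the '30s style I wanted.
Stable Diffusion XL	🥉 Mickey, go home. You're possessed.	🥈 Barely scrapes a silver. More Mickey-like than DALL-E 3, but the deformities are really distracting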

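Round 3: Text

Let's generate a sign that says "Happy days are here again". There's no target image this time. Just imagine (hard as it may be) a sign with that text on it. In the words of John Lennon, it's easy if you try.

DALL-E 3 gives us a cheerful vibe, which is exactly what I wanted. It did, however, throw in the word "dye". Since that sounds like "die", it may send mixed messages: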

Colorful sign reading "Happy Days Are Here Again" in a creative font, with a sun and clouds, shared in a Slack conversation.

After optimization, we got the wording and spelling right at least once, with no extra words. Another attempt was almost perfect, with just one typo:

A vibrant, nostalgic sign reading "Happy Days Are Here Again" in bold, cheerful lettering on weathered wood, set against a lu

Stable Diffusion XL presents us with Herpy Days:

Playful sign with the phrase "Happy Days Are Here Again" painted in a unique and stylized manner.

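After optimizing the Stable Diffusion XL prompt, we got a lonely, misspelled signpost in the woods. It's less scary than before, but honestly, I wouldn't want to follow this sign to wherever it points: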

Old wooden sign reading "Happy days are here again" in a vibrant, detailed landscape with a backdrop of blue sky and trees.

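Who gets to see happy days, and who doesn't?

	Before optimization	After optimization
DALL-E 3	🥈 You can tell what the sign is saying, even though it added the extra word "dye" and the word order is a bit scrambled	🥇 At least one sign has completely correct text. The other has just one "small" typo (an extra "P" in "HAPPY", which is minor by image generation standards!)
Stable Diffusion XL	🥉 Looks like a motivational poster from hell	🥈 Not as good as unoptimized DALL-E 3, but it didn't make me want to gouge my eyes out the way unoptimized SDXL did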

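Round 4: "Cursed" creations

Let's see how the models cope with something weird, like a puppy with seven legs. There's no target image this time, because I don't want "deformed puppies" in my Google search history. Just imagine a puppy with seven legs.

DALL-E 3 gave us two outputs this time. We didn't ask for that. I guess it just likes doggos. Proof that AI is becoming more human? Anyway, the result is what we asked for, though I find it a bit bland. But in this round we're not judging style, only content. So a dog with a ridiculous number of legs superimposed on the Windows XP wallpaper will do: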

Cute brown and white puppy sitting on a grassy hill with colorful flowers, one paw raised, under a clear blue sky.

Playful brown and white puppy sitting on a flower-dotted green lawn with a blue sky and fluffy clouds overhead. — While this isn't strictly NSFW, it's unsettling enough that I blurred it

After optimization, so many legs! I wonder what the many-legged dog emoji is trying to say. Answers welcome!

Illustration showing four whimsical brown puppies with various poses alongside detailed instructions for creating a surreal s

Stable Diffusion XL misunderstood the assignment:

Small puppy with floppy ears and a black collar standing on a brown surface, looking at the camera against a gradient gray-bl

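Even after optimization, we want to ask "what part of seven legs did you not understand?":

Who's best in show this round, and who's at the bottom of the pack?

	Before optimization	After optimization
DALL-E 3	🥇 Both puppies have a weird number of legs. The first puppy even has seven, though some of them are barely visible. I don't know what those claw-like things on the second puppy are, and I don't want to know.	🥇 Awesome. All the puppies. All the legs. You could play shake-a-paw with these cuties for a long time. One of them even has exactly the right number of legs.
Stable Diffusion XL	🥉 When I asked for a leggy puppy, I didn't mean long legs	🥉 I prefer my puppies with a few more legs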

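Bonus round: Kegstand punk

In some cases, both DALL-E 3 and SDXL fail with or without optimization. One example is generating an image of a punk doing a kegstand.

Here's an image of a punk...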

Man with a green mohawk and black "DISCHARGE" coat on a city street, showcasing bold fashion and individuality. — Source: pexels.com

...and an illustration of a kegstand that looks like it came from a heartwarming children's book:

Joyful illustration of men around a barrel with another man playfully inside, indicating humor and fun.

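I couldn't find an actual picture of a punk doing a kegstand online. Alas, punks, so straitlaced!

DALL-E 3 shows us a punk in a bar, with weird but cool lighting. He's next to a keg, but he isn't doing a kegstand: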

A punk doing a kegstand at a lively party, with interface icons and chat text indicating an image generation command.

After optimization, I like the vibe, but there's still no kegstand:

Punk rocker with brightly colored spiked hair and tattoos, performing a keg stand at a lively underground party in a graffiti

They should rename it "Stable Diffusion ER", because this person (?) needs to go to the hospital:

Man performing a handstand on a wooden barrel outdoors, dressed in black, with a red and white building and a clear sky in th

After optimization it looks much better. There's a keg. There's a punk. Sadly, there's still no kegstand:

Energetic punk music scene in an underground venue with a crowd, punk in leather and mohawk hairstyles.

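Who's punk, and who's junk?

	Before optimization	After optimization
DALL-E 3	🥈 Punk, check. Keg, check. Kegstand, nope	🥈 Optimization changed the vibe a bit, but there's still no actual kegstand
Stable Diffusion XL	🥉 Yikes. Not a punk. Not a kegstand. Barely a person. And doing a kegstand that way, he won't be any kind of person for long.	🥈 Optimization gave us a better result, showing a punk interacting with a keg. No horrifying body deformities this time.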

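Final score tally

Now that the contest is over, we score it as follows:

💩: zero points
🥉: one point
🥈: two points
🥇: three points

The maximum score for each option is 15 points (gold in all five rounds). Let's look at the breakdown: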

Challenge	DALL-E 3		Stable Diffusion XL
	Before PromptPerfect	After PromptPerfect	Before PromptPerfect	After PromptPerfect
Famous people	💩 0	🥉 1	🥇 3	🥇 3
"Copyrighted" material	💩 0	🥈 2	🥉 1	🥈 2
Text	🥈 2	🥇 3	🥉 1	🥈 2
Cursed creations	🥇 3	🥇 3	🥉 1	🥉 1
Kegstand punk	🥈 2	🥈 2	🥉 1	🥈 2
Total	🥉 7	🥇 11	🥉 7	🥈 10
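For anyone who wants to check the arithmetic, here is a small sketch that recomputes the totals from the medals in the table. The medal names and the POINTS, RESULTS, and total_score identifiers are illustrative choices, not part of any tool; the medal values and per-round results are copied from the table above.

```python
# Points per medal, as defined in the scoring rules (names are illustrative).
POINTS = {"refused": 0, "bronze": 1, "silver": 2, "gold": 3}

# Medals per round, in order: famous people, "copyrighted" material,
# text, cursed creations, kegstand punk.
RESULTS = {
    "DALL-E 3 (before)": ["refused", "refused", "silver", "gold", "silver"],
    "DALL-E 3 (after)": ["bronze", "silver", "gold", "gold", "silver"],
    "Stable Diffusion XL (before)": ["gold", "bronze", "bronze", "bronze", "bronze"],
    "Stable Diffusion XL (after)": ["gold", "silver", "silver", "bronze", "silver"],
}


def total_score(medals):
    """Sum the points for a list of medal names."""
    return sum(POINTS[medal] for medal in medals)


totals = {name: total_score(medals) for name, medals in RESULTS.items()}
max_score = POINTS["gold"] * 5  # gold in all five rounds

for name, score in totals.items():
    print(f"{name}: {score} / {max_score}")
```

Both models start at 7 points without optimization; with it, DALL-E 3 reaches 11 and Stable Diffusion XL reaches 10, out of a possible 15.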

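In short, DALL-E 3 would have scored higher if not for the censorship in the early rounds. Overall, optimizing prompts with PromptPerfect made both models produce better results.

You can trust us, because this was an impartial contest (run by us, for us, for our product). Seriously though, the results really do speak for themselves. Try it yourself and see how it goes!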