Project 7: Stable Diffusion 3

glegrady
Posts: 203
Joined: Wed Sep 22, 2010 12:26 pm

Project 7: Stable Diffusion 3

Post by glegrady » Sat Oct 21, 2023 10:36 am

This is the last assignment in which we test Stable Diffusion parameters at http://vislab4.mat.ucsb.edu:7860/

Review articles that give recommendations such as: https://github.com/AUTOMATIC1111/stable ... e-showcase
https://onceuponanalgorithm.org/ai-prom ... ls-part-1/

Consider exploring the various parameters:

The assignment will be presented on Tuesday, November 20. If you are out of town, the presentation can be made by Zoom...
George Legrady
legrady@mat.ucsb.edu

pratyush
Posts: 9
Joined: Wed Oct 04, 2023 9:27 am

Re: Project 7: Stable Diffusion 3

Post by pratyush » Mon Nov 13, 2023 9:59 pm

As we approach the final weeks of the quarter, for this week's assignment I opted for a conceptual exploration of one of my final project ideas. Drawing inspiration from John Baldessari's exploration of the narrative potential and associative power of images within the realm of art, my approach involved combining images in collage-like formations. The primary objective was to investigate how geometric alignments within compositions guide the trajectory of our gaze within the frame.

To execute this concept, I utilised the new refiner model on SDXL, generating a series of black and white photographs. The prompts I employed focused on emphasising strong diagonal trajectories within these images, intending to establish a geometric path of coordination for the envisioned collage. Thematically, my contemplation centred around the concepts of isolation and unity within the elements portrayed in the photographs. I aimed to observe how the dialectic relationship between cluster and sparseness engaged in a meaningful conversation, particularly when their common thread was their geometric compositional alignment. The four images I sought to generate included a lonely lighthouse, a comet in space, a group of villagers in India, and a flock of birds on an electrical cable.

Achieving the precise geometric alignment I envisioned required testing at least 32 prompts. Given the substantial space and time constraints for posting and presenting all the images here, I've opted to showcase only a select few. These chosen examples will effectively illustrate the iterative process of refining the prompt and adjusting associated parameters to achieve the desired results. To underscore the cumulative effect of geometric shapes in guiding vision, the final composition incorporated a composite image with an overlaid mask. This mask serves to highlight the guiding trajectories by selectively revealing only them, while covering the remainder of the composition. Presented below are my attempts and the ultimate result.


The Lighthouse:

Prompt 2:


desaturated dramatic black and white photograph of a lighthouse at night shining a thick cone of directional light over the waters of the sea, grainy, push-processing, Kodak TriX 400 ISO film, slow shutter shake, compose left of frame, light falling on water from the lighthouse , dark, Chiaroscuro lighting style 
Negative prompt: watermark, human beings, center framing, moon
Steps: 45, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 200, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.3, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0


Time taken: 25.1 sec.

Image:

00313-200.png
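
Every image in this thread is followed by a settings line in the same "Name: value, Name: value" shape, so comparing runs is easier once that line is turned into a dictionary. The following is only a minimal sketch: the function name `parse_settings` and the regular expression are my own and are not taken from the web UI's code. It assumes values contain no commas unless they are wrapped in double quotes.

```python
import re

# Matches "Key: value" pairs; a value is either a double-quoted string
# (which may contain commas) or a run of characters up to the next comma.
_PAIR = re.compile(r'\s*([^:,]+):\s*("(?:\\.|[^"\\])*"|[^,]*)(?:,|$)')


def parse_settings(line: str) -> dict[str, str]:
    """Split one settings line into a {name: value} dictionary of strings."""
    settings = {}
    for key, value in _PAIR.findall(line):
        value = value.strip()
        # Drop the surrounding quotes of a quoted value.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


# The settings line posted for the lighthouse image above.
line = (
    "Steps: 45, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 200, "
    "Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, "
    "Denoising strength: 0.3, Hires upscale: 1.45, Hires upscaler: Latent, "
    "Version: v1.6.0"
)
settings = parse_settings(line)
width, height = (int(n) for n in settings["Size"].split("x"))
print(settings["Sampler"], settings["CFG scale"], width, height)
```

All values are kept as strings on purpose, so that the caller decides which ones to convert to numbers.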

Prompt 32:

extreme wide desaturated dramatic black and white photograph of a lighthouse at night shining a thick beam of diagonal directional light across the frame over the waters of the sea, grainy, push-processing, Kodak TriX 400 ISO film, slow shutter shake, compose left of frame, light falling on water from the lighthouse , dark, Chiaroscuro lighting style 
Negative prompt: watermark, human beings, center framing, moon, buildings
Negative prompt: watermark, human beings, center framing, moon, sun, buildings 
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 151, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.25, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00513-151.png


Image:

00514-150.png

Upscaled:

00517-152.png

The Comet:


Prompt 4:

desaturated dramatic black and white photograph of a comet flying across the night sky shining a thick cone of directional light over the dark black starry sky, grainy, push-processing, Kodak TriX 400 ISO film, slow shutter shake, compose left of frame , diagonal trajectory of light, dark, Chiaroscuro lighting style 
Negative prompt: watermark, human beings, center framing, moon, landscape
Steps: 45, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 200, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.3, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00324-200.png

Image:

00324-200.png

Prompt 5:

desaturated dramatic black and white photograph of a comet flying across the night sky shining a thick cone of directional light over the dark black starry sky, grainy, push-processing, Kodak TriX 400 ISO film, slow shutter shake, compose left of frame , diagonal trajectory of light, dark, Chiaroscuro lighting style 
Negative prompt: watermark, human beings, center framing, moon, landscape, streets, field
Steps: 45, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 200, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.3, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00329-200.png


Image:

00329-200.png

Upscaled:

00330-200.png

Electric Cables:


Prompt 10:

desaturated black and white photograph of electric cables diagonally against the sky with crows perched on cables, film grain, Kodak TriX 400 ISO film, compose left of frame, high contrast, Chiaroscuro lighting style, low-angle camera, silhouette against bright overexposed sky
Negative prompt: watermark, clouds, human beings, center framing, moon, landscape, streets, field, slow shutter camera shake, ugly, house, buildings, high-tension cable posts
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 250, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.25, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00360-250.png

Image:

00360-250.png

Prompt 12:

desaturated black and white photograph of electric cables diagonally against the sky with crows perched on cables, film grain, Kodak TriX 400 ISO film, compose left of frame, high contrast, Chiaroscuro lighting style, low-angle camera, silhouette against bright overexposed sky
Negative prompt: watermark, clouds, human beings, center framing, moon, landscape, streets, field, slow shutter camera shake, ugly, house, buildings, high-tension cable posts, rooftop
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 400, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.25, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00375-400.png


Image:

00380-400.png

Upscaled:

00383-402.png

Villagers:


Prompt 22:

desaturated extreme high-angle black and white photograph of Indian villagers queuing on a long road in an Indian village, poor farmers wearing turbans, film grain, Kodak TriX 400 ISO film, compose left of frame, high contrast, dramatic Chiaroscuro lighting style, top-angle camera, high angle, diagonal framing, strong diagonal lines, birds-eye view camera, silhouette against brightly exposed fields, diagonally framed from top, crowded village road, village women carrying harvest on top their heads 
Negative prompt: watermark, clouds, center framing, moon, slow shutter camera shake, ugly, eye-level camera angle, low angle camera, mountains 
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 20, Size: 640x480, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.25, Hires upscale: 1.45, Hires upscaler: Latent, Version: v1.6.0

Saved: 00442-20.png



Image:

00447-20.png

Prompt 26:

desaturated extreme high-angle black and white photograph of Indian villagers queuing on a long road in an Indian village, poor farmers wearing turbans, film grain, Kodak TriX 400 ISO film, compose left of frame, high contrast, dramatic Chiaroscuro lighting style, top-angle camera, high angle, diagonal framing, strong diagonal lines, birds-eye view camera, silhouette against brightly exposed fields, diagonally framed from top, crowded village road, village women carrying harvest on top their heads 
Negative prompt: watermark, clouds, center framing, moon, slow shutter camera shake, ugly, eye-level camera angle, low angle camera, mountains 
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 10, Size: 640x480, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Denoising strength: 0.35, Hires upscale: 1.45, Hires upscaler: Latent, Refiner: sd_xl_refiner_1.0 [7440042bbd], Refiner switch at: 0.6, Version: v1.6.0

Saved: 00472-10.png


Image:

00472-10.png

Upscaled:

00475-12.png
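
The settings for Prompt 26 combine "Hires upscale: 1.45" with "Refiner switch at: 0.6" over 60 steps. As a rough worked example of what those numbers imply, the sketch below computes the upscaled size and the base/refiner step split. The helper names are hypothetical, and the rounding used here is an assumption: the web UI may round sizes and switch points slightly differently.

```python
def upscaled_size(width, height, scale):
    """Approximate output size after the hires pass (rounding is assumed)."""
    return round(width * scale), round(height * scale)


def split_steps(total_steps, switch_at):
    """Approximate number of steps run by the base model and by the refiner."""
    base_steps = round(total_steps * switch_at)
    return base_steps, total_steps - base_steps


# Values taken from the Prompt 26 settings line above.
print(upscaled_size(640, 480, 1.45))
print(split_steps(60, 0.6))
```

Under these assumptions the 640x480 image is upscaled to about 928x696, and the base model handles roughly the first 36 steps before the refiner takes over for the remaining 24.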

Final Double - Diptych:

The chosen four were then compiled in Photoshop as a 4-panel artwork.

Comp_Photo_nomask@0.75x.png

Final Composite Image:

On top of the 4-panel composite, I created this layer mask to highlight the prominent compositional lines in each of the component images. The resulting shape resembled a butterfly with its wings spread on either side.

Final_Comp_Photo_mask@0.75x@0.75x.png
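
The mask in this post was made in Photoshop, but the underlying operation is simple enough to illustrate in a few lines. This is a toy sketch only, assuming a grayscale image stored as nested lists and a binary mask of the same shape; `apply_mask` is a hypothetical helper, not part of any tool used above.

```python
def apply_mask(image, mask):
    """Keep pixels where the mask is 1 and black out (0) the rest."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A 3x3 grayscale "image" and a mask that reveals only one diagonal.
image = [
    [200, 180, 160],
    [140, 120, 100],
    [80, 60, 40],
]
diagonal_mask = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
for row in apply_mask(image, diagonal_mask):
    print(row)
```

Only the pixels along the diagonal survive, which is the same idea as revealing the guiding trajectories while covering the rest of the composition.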

The initial attempt offered valuable insights into the considerations necessary for creating a collage of this nature. On a technical level, I recognised the need for more precise prompts, requiring extensive trial and error. Conceptually, I acknowledged the significance of selecting images that not only contrasted in composition but also in their pictorial elements. The power and appropriateness of these chosen images played a crucial role in effectively narrating the intended story.

However, unexpected technical issues surfaced with SDXL during this process. Although only the initial prompts mentioned "grainy" and "camera shakes," eliminating these elements later proved challenging. Even after removing the "camera shake" prompt, maximising the CFG scale, and keeping denoising values between 0.25 and 0.35, the results still showed visible camera shake and jarring artefacts resembling digital glitches in low-resolution images. While this aesthetic might be suitable in other contexts, it did not align with the specific objectives of my project, and multiple attempts to rectify these issues proved unsuccessful. In particular, several lighthouse images included an unwanted moon despite efforts to suppress it with negative prompts such as "moon" and "sun," and later with "moonless" in subsequent prompts.
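
One possible reason a maximised CFG scale produces harsh artefacts is visible in the standard classifier-free guidance formula, in which the final prediction starts from the "unconditional" prediction and is pushed along the difference between the prompted and unprompted predictions. The sketch below uses made-up toy numbers rather than real model outputs, and the function name is my own. As I understand it, the web UI puts the negative prompt in place of the unconditional input, which is treated here as an assumption.

```python
def guided_prediction(uncond, cond, cfg_scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy two-element "predictions" (illustrative values only).
negative_side = [0.5, -0.25]   # prediction for the negative/empty prompt
positive_side = [1.0, 0.25]    # prediction for the positive prompt

for scale in (1.0, 7.0, 30.0):
    print(scale, guided_prediction(negative_side, positive_side, scale))
```

At a scale of 1 the result is simply the prompted prediction, while at 30 the difference between the two is amplified thirtyfold, which pushes values far outside the range either prediction had on its own.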

autumnsmith
Posts: 10
Joined: Tue Oct 03, 2023 1:08 pm

Re: Project 7: Stable Diffusion 3

Post by autumnsmith » Tue Nov 21, 2023 11:28 am

Section 1:

1 .png

Prompt: comic book scene strip based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. Sepia colors. The story unfolding through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 784544155, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0
2.png

Prompt: 8 year old girl with a dog walking through a neighborhood. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. Sepia colors. The story unfolding through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. Saturation
Negative prompt: adults, grey
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 12.5, Seed: 1405801910, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0
5.png


Section 2:

Prompt: An 8-year-old girl walking through a suburban neighborhood with her dog and her balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters.
Negative prompt: adults, grey, silhouettes, abstract
Steps: 64, Sampler: DPM++ 2M Karras, CFG scale: 16.5, Seed: 3253048867, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0
9.png

Prompt: A little girl walking through a suburban neighborhood with her dog and her balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters.
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals
Steps: 99, Sampler: DPM++ 2M Karras, CFG scale: 25, Seed: 468724739, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0
10.png


Prompt: A little girl walking through a suburban neighborhood with her dog and her balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes about a young girl who has a pet dog and loses her balloon. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters.
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals
Steps: 99, Sampler: DPM++ 2M Karras, CFG scale: 25, Seed: 1017895697, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0
11.png

Prompt: A little girl walking through a suburban neighborhood with her dog and her yellow balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. On paper newsprint
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals
Steps: 99, Sampler: DPM++ 2M Karras, CFG scale: 25, Seed: 346451200, Size: 808x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Denoising strength: 0.75, Version: v1.6.0

12.png
Last edited by autumnsmith on Tue Nov 21, 2023 12:11 pm, edited 1 time in total.

autumnsmith
Posts: 10
Joined: Tue Oct 03, 2023 1:08 pm

Re: Project 7: Stable Diffusion 3

Post by autumnsmith » Tue Nov 21, 2023 11:33 am

Section 3


Prompt: A smiling single little girl walking through a neighborhood with her dog and a balloon in hand, she releases her yellow balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. On paper newsprint in pencil sketch
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals
Steps: 116, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 4217370162, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 1, Version: v1.6.0
16.png

Prompt: A smiling single little girl walking through a neighborhood with her dog and a balloon in hand, she releases her yellow balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. On paper newsprint in pencil sketch
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, uncanny, scary, high contrast, text bubbles
Steps: 82, Sampler: DPM++ 2M Karras, CFG scale: 30, Seed: 3717563406, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.8, Version: v1.6.0
21.png

Prompt: A smiling single little girl walking through a neighborhood with her dog and a balloon in hand, she releases her yellow balloon floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. On paper newsprint in pencil sketch, happy facial expressions
Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, uncanny, scary, high contrast, text bubbles, realism, multiple styles, words, cars
Steps: 122, Sampler: DPM++ 2M Karras, CFG scale: 20.5, Seed: 2598734361, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.51, Version: v1.6.0
23.png

Prompt: Cohesive storyline of a single smiling little girl walking through a neighborhood with her dog and a balloon in hand, she releases her yellow balloon and floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood. On paper newsprint in pencil sketch, happy facial expressions, humans and dogs

Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, uncanny, scary, high contrast, text bubbles, realism, multiple styles, words, cars , bluring, out of focus
Steps: 80, Sampler: DPM++ 2M Karras, CFG scale: 16, Seed: 218515530, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.51, Version: v1.6.0
26.png


Section 4


Prompt: Cohesive storyline of a single smiling little girl walking through a neighborhood with her dog and a balloon in hand, she releases her yellow balloon and floats into outer space. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. White framing around each scene. Sepia colors. The story unfolds through multiple scenes within the strips. Playful and fun setting in a neighborhood, low poly, happy facial expressions, humans and dogs

Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, uncanny, scary, high contrast, text bubbles, realism, multiple styles, words, cars , bluring, out of focus, weird, uncanny figures
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 12.5, Seed: 2751252493, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.7, Version: v1.6.0
33.png

Prompt: 8 year old girl with a dog walking through a neighborhood. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. Sepia colors. The story unfolding through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. Sepia colors. The story unfolds through multiple scenes within the strips, low poly, happy facial expressions, humans and dogs

Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, uncanny, scary, high contrast, text bubbles, realism, multiple styles, words, cars , bluring, out of focus, weird, uncanny figures
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 8.5, Seed: 804794048, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.7, Version: v1.6.0
34.png

Prompt: 8 year old girl with a dog walking through a neighborhood. Comic book scene, strip of several related scenes unfolding a story based on 1950s-style hand-drawn illustrations, simple colors, and heavy black outlines. Sepia colors. The story unfolding through multiple scenes about a six-year-old girl who has a pet dog and loses a yellow balloon in some trees. Looks like it is within a paper print newsprint. Playful and fun setting in a neighborhood, with simple cartoon human characters. Sepia colors. The story unfolds through multiple scenes within the strips, low poly, happy facial expressions, humans and dogs

Negative prompt: adults, grey, silhouettes, abstract, random shapes, random animals, symbols, scary, high contrast, text bubbles, realism, multiple styles, words, cars , bluring, out of focus, weird,
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 12, Seed: 2465326593, Size: 808x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.79, Version: v1.6.0
35.png

Beginning this project, I took the most successful AI-generated cartoon art that I had made in Midjourney in previous weeks. From there, I fed the image into the image-to-image (img2img) mode in Stable Diffusion. Between images 5-12, I adjusted the parameters and prompt text as needed. Image 5 became the most successful of this series, with the inclusion of a girl, a dog, a balloon, and the general feeling and color tone of the original input image. This set of compositions also successfully incorporates text without it being a text bubble (image 5). Additionally, in the upper left-hand image, we see the words "trees" and "tree," which is the first successful output of text thus far.
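
Since all of these images start from an input image, the denoising strength matters as much as the step count. A common rule of thumb for img2img is that only about steps x strength of the schedule is actually run on the input image, so a low strength stays close to the original while a strength of 1 effectively ignores it. The sketch below is just that rule of thumb, with a hypothetical helper name; the exact count in the web UI depends on its settings and may differ by a step or so.

```python
def approx_denoising_steps(steps, denoising_strength):
    """Rule-of-thumb count of denoising steps applied to the input image."""
    strength = min(max(denoising_strength, 0.0), 1.0)  # clamp to [0, 1]
    return min(int(steps * strength), steps)


# (steps, denoising strength) pairs taken from settings lines in these posts.
for steps, strength in [(20, 0.75), (80, 0.51), (116, 1.0)]:
    print(steps, strength, approx_denoising_steps(steps, strength))
```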

Between images 9-12, I began readjusting the parameters and making changes to the prompt, negative prompt, CFG scale, and steps. This set of images was more heavily focused on trying to make the storyline cohesive. At this point, the stories became less visually understandable than in previous versions. However, there were still some successful outputs that corresponded to what was prompted. In particular, the upper left-hand image within #9 successfully conveys the requested components, including a little girl, a dog, and a neighborhood. From image 9 to image 11 we begin to see a transition, with the images becoming increasingly uncanny. At this point, I attempted to adjust the prompt parameters to reverse the uncanny nature of the characters being depicted. In addition to fighting the prompt in general, I tried to change the prompt so that it would more closely convey a story across multiple scenes. Number 12 does not do this perfectly, but the first image within 12 does in some ways convey a successful story scene within a single frame, combining multiple components that were requested.

At some point, the uncanny nature of the images became a distraction from what the original prompt and negative prompt were requesting. Around image 16, I decided to backtrack to what I felt was a successful composition, so I used image 6 as the base point for the text prompt in conjunction with the imported image from Midjourney. In images 16, 21, 23, and 26, I was really fighting with the defaults that Stable Diffusion imposes on the imagery. From here I continually adjusted the prompt, negative prompt, steps, CFG scale, and other parameters to get the images closer compositionally and stylistically to image 6. The final version was image 26, which successfully incorporates visual aspects of the original Midjourney input image, including color, tone, and composition. In contrast, however, it added a lot of weird elements that almost feel pasted into the scene from different styles or periods.

In the last set of images, I tried to minimize the prompt while using other parameters from the online list. I began cutting down the prompts to match more closely the one used for image 6, where the story unfolding across multiple frames was the most understandable. At the same time, I was adjusting the CFG scale, the denoising strength, the steps, and the negative prompt. In the end, I still wasn't able to get these images as close as I would’ve liked. Finally, the program began inserting nonsense text in varying amounts, along with nonsensical human forms, despite prompts asking for human beings with more defined features.
Last edited by autumnsmith on Tue Nov 21, 2023 12:41 pm, edited 1 time in total.

gracefeng
Posts: 8
Joined: Tue Oct 03, 2023 1:12 pm

Re: Project 7: Stable Diffusion 3

Post by gracefeng » Tue Nov 21, 2023 11:38 am

My goal for this project was to see if Stable Diffusion could create unconventional poster designs. I didn't expect much from its typography, as I know AI has difficulty rendering legible words, but I was keen on testing its ability to create novel compositions. I used a reference image of a slightly unconventional poster design to provide a starting point. From then on, my prompts were tailored to produce posters that would diverge further and further from the original reference image while keeping some of its original elements intact (stippling, mixed fonts, organic shapes, typography).

Prompt 1: Experimental poster design, typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.
Negative prompt: Boring
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 3, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Script: X/Y/Z plot, X Type: Seed, X Values: 3, Fixed X Values: 3, Version: v1.6.0
img2img:
Poster.jpg
Result:
image (23).png
Prompt 2: Experimental poster design, typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.
Negative prompt: Experimental poster design, typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 3, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Script: X/Y/Z plot, X Type: Seed, X Values: 3, Fixed X Values: 3, Version: v1.6.0
Result:
image (25).png
Prompt 3: Experimental poster design, typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.
Negative prompt: Boring
Steps: 50, Sampler: Euler, CFG scale: 1.5, Seed: 1661130803, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 1.0, Decode prompt: "Experimental poster design, typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.", Decode negative prompt: Boring, Decode CFG scale: 1, Decode steps: 50, Randomness: 0.6, Sigma Adjustment: False, Version: v1.6.0
Result:
image (26).png
Prompt 4: Experimental poster design, legible typography, mixed fonts, mixed media, vibrant, randomness, eclectic, maximalist.
Negative prompt: Boring, illegible
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 16, Seed: 2958254688, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (27).png
Prompt 5: Experimental poster design with organic shapes, randomness, mixed media, collage, stippling, digitalism, eclectic.
Negative prompt: Boring, illegible, straight lines.
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 23.5, Seed: 2277534652, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (28).png
Prompt 6: Experimental poster design with organic shapes, mixed fonts, legible typography, digitalism, stippling, randomness, mixed media, collage, eclectic.
Negative prompt: Boring, illegible, straight lines.
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 23.5, Seed: 1744566949, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (29).png
Prompt 7: Experimental poster design with organic shapes, mixed fonts, legible typography, digitalism, stippling, randomness, mixed media, branded, slogan.
Negative prompt: Boring, illegible, straight lines.
Steps: 40, Sampler: DPM++ 2M Karras, CFG scale: 7.5, Seed: 3818463038, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (31).png
Prompt 8: Experimental poster design with organic shapes, album cover, branded, slogan, mixed fonts, legible typography, digitalism, stippling, randomness.
Negative prompt: Boring, illegible, straight lines.
Steps: 46, Sampler: DPM++ 2M Karras, CFG scale: 7.5, Seed: 72917936, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (32).png
Prompt 9: Experimental poster design with organic shapes, branded, slogan, legible typography, mixed media, digitalism, stippling, randomness.
Negative prompt: Boring, illegible, straight lines.
Steps: 46, Sampler: DPM++ 2M Karras, CFG scale: 7.5, Seed: 3496594824, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Result:
image (33).png
Prompt 10: Experimental poster design with organic shapes and mixed media randomness, duochromatic, branded, slogan, legible typography, digitalism, stippling, overlap.
Negative prompt: Boring, illegible, straight lines.
Steps: 46, Sampler: DPM++ 2M Karras, CFG scale: 7.5, Seed: 2958164571, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Denoising strength: 0.75, Version: v1.6.0
Results:
image (34).png
image (35).png
image (36).png
image (37).png
image (38).png
image (39).png
Last edited by gracefeng on Tue Nov 21, 2023 11:55 am, edited 1 time in total.

colindunne
Posts: 7
Joined: Tue Oct 03, 2023 1:09 pm

Re: Project 7: Stable Diffusion 3

Post by colindunne » Tue Nov 21, 2023 11:52 am

I explored the links above in the project description and noticed that the articles consistently used "Euler a" as their sampling method. So for this project, I decided to build on my previous work by producing a more extensive and presentable comparison of the different Stable Diffusion sampling methods. My research turned up some example comparisons, but none covering nearly as many sampling methods as are offered in our current Stable Diffusion Web UI. I chose to approach this with the prompt matrix parameter shown in this article: (https://github.com/AUTOMATIC1111/stable ... e-showcase). With this parameter I could create multiple prompt results per sampling method while retaining the same seed and prompts across all sampling methods for comparison. In particular, this made it possible to compare results across multiple different styles. Some results, such as "DPM fast" or "PLMS," proved very broken, and I am interested in seeing whether there are more appropriate uses for those sampling methods.

Original prompt
a busy city street in a modern city|illustration|cinematic lighting
Steps: 20, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 834184246, Size: 512x512, Model hash: 7440042bbd, Model: sd_xl_refiner_1.0, Version: v1.6.0
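
As a rough illustration of what the `|` separators do, the sketch below expands the prompt the way the prompt matrix feature is described in the linked showcase: the first part is always kept and every optional part is switched on or off. `expand_prompt_matrix` is a hypothetical helper, and the ordering and separator used by the web UI itself may differ:

```python
def expand_prompt_matrix(prompt: str) -> list[str]:
    """Return every combination: base part always, each optional part on or off."""
    base, *optional = [part.strip() for part in prompt.split("|")]
    prompts = []
    for mask in range(2 ** len(optional)):
        # Bit n of the mask decides whether optional part n is included.
        chosen = [text for bit, text in enumerate(optional) if mask & (1 << bit)]
        prompts.append(", ".join([base] + chosen))
    return prompts


combinations = expand_prompt_matrix(
    "a busy city street in a modern city|illustration|cinematic lighting"
)
for text in combinations:
    print(text)
```

Two optional parts give four prompts per sampler: the plain street scene, the scene with "illustration", the scene with "cinematic lighting", and the scene with both.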

______________________________________________________________
DPM++ SDE Karras
Image
______________________________________________________________
DPM++ 2M SDE Exponential
Image
______________________________________________________________
DPM++ 2M Karras
Image
______________________________________________________________
DPM++ 2M SDE Karras
Image
______________________________________________________________
Euler a
Image
______________________________________________________________
Euler
Image
______________________________________________________________
LMS
Image
______________________________________________________________
Heun
Image
______________________________________________________________
DPM2
Image
______________________________________________________________
DPM2 a
Image
______________________________________________________________
DPM++ 2S a
Image
______________________________________________________________
DPM++ 2M
Image
______________________________________________________________
DPM++ SDE
Image
______________________________________________________________
DPM++ 2M SDE
Image
______________________________________________________________
DPM++ 2M SDE Heun
Image
______________________________________________________________
DPM++ 2M SDE Heun Karras
Image
______________________________________________________________
DPM++ 2M SDE Heun Exponential
Image
______________________________________________________________
DPM++ 3M SDE
Image
______________________________________________________________
DPM++ 3M SDE Karras
Image
______________________________________________________________
DPM++ 3M SDE Exponential
Image
______________________________________________________________
DPM fast
Image
______________________________________________________________
DPM adaptive
Image
______________________________________________________________
LMS Karras
Image
______________________________________________________________
DPM2 Karras
Image
______________________________________________________________
DPM2 a Karras
Image
______________________________________________________________
DPM++ 2S a Karras
Image
______________________________________________________________
Restart
Image
______________________________________________________________
DDIM
Image
______________________________________________________________
PLMS
Image
______________________________________________________________
UniPC
Image
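
A comparison like this could also be scripted instead of run by hand. The sketch below only builds one request body per sampler with everything else held fixed; it sends nothing. It assumes the web UI was started with its API enabled and that each body would be POSTed to the `/sdapi/v1/txt2img` endpoint, which may not be the case on the class server. `build_payloads` and the shortened sampler list are illustrative:

```python
import json

# A subset of the samplers compared above, for brevity.
SAMPLERS = ["DPM++ SDE Karras", "DPM++ 2M Karras", "Euler a", "Euler", "DDIM", "UniPC"]


def build_payloads(prompt, samplers, seed=834184246, steps=20, cfg_scale=7.0):
    """One txt2img request body per sampler; all other settings stay identical."""
    return [
        {
            "prompt": prompt,
            "sampler_name": name,
            "seed": seed,
            "steps": steps,
            "cfg_scale": cfg_scale,
            "width": 512,
            "height": 512,
        }
        for name in samplers
    ]


payloads = build_payloads("a busy city street in a modern city, illustration", SAMPLERS)
print(json.dumps(payloads[0], indent=2))
```

Holding the seed, steps, CFG scale, and size constant means any difference between outputs can be attributed to the sampler alone.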

luischavezcarrillo
Posts: 8
Joined: Thu Oct 05, 2023 2:48 pm

Re: Project 7: Stable Diffusion 3

Post by luischavezcarrillo » Tue Nov 21, 2023 12:56 pm

The intent of this study was to explore how the RNG values generated through the GPU and CPU, as selected in the XYZ plot, affect the image; to test the extent to which negative prompts can remove features in an image-to-image generation; and to experiment with how the CFG scale and denoising strength determine how much of the original image changes in response to the prompt. The initial tests focused on attempts to remove features, but the negative prompt was later dropped, as it seemed unable to fully remove the undesired traits. Despite this, the results produced afterwards were satisfactory. Some refiner checkpoints appear to be strongly biased toward particular color schemes, and some even produced images wildly different from the others in their batch, without faces or humans, despite nothing in the prompt asking to remove human faces. It also appears that using both NV and GPU as RNG sources is redundant, since they produce identical images, likely because the random numbers they produce are used as seed values. For img2img prompts, the CFG scale appears to have less overall influence on changes to the original image than the denoising strength does. A proper balance is most likely necessary in order to create an entirely new, yet slightly similar image.
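
One possible reason a negative prompt steers an image rather than erasing a feature outright is the way classifier-free guidance combines predictions: the sampler starts from the prediction for the negative (or empty) prompt and extrapolates toward the prediction for the prompt, scaled by the CFG value. A toy sketch with made-up two-element vectors standing in for the model's noise predictions; `guided_prediction` is a hypothetical helper, and treating the negative prompt as the "unconditional" input is an assumption about how the web UI applies it:

```python
def guided_prediction(cond, uncond, cfg_scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), element by element."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


# Made-up stand-ins for the model's two noise predictions at one step.
cond = [1.0, 0.0]    # prediction given the prompt
uncond = [0.0, 1.0]  # prediction given the negative (or empty) prompt

# The CFG values used in the tests below.
for scale in (1.5, 6.0, 10.0, 25.0):
    print(scale, guided_prediction(cond, uncond, scale))
```

At a scale of 1 the result equals the prompt's prediction; larger values push further away from whatever the negative prompt predicts, but only along the direction in which the two predictions differ.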

All the following images were generated off this image:
SPOILER_20231107_121813.jpg
All settings are default unless otherwise indicated by the XYZ plot or the text accompanying the image.

cfg 10, denoising .35
p7rng2.png
cfg 10, denoising .35,
negative: human
p7rng3.png
cfg 10, denoising .35,
negative: face
p7rng4.png
cfg 10, denoising .35,
negative: human face, people, person
p7rng5.png
cfg 6, denoising .6, prompt: horror aesthetic, sepia filter
negative: human face, people, person
p7rng6.png
cfg 6, denoising .6, prompt: horror aesthetic, sepia filter
p7rng7.png
cfg 6, denoising .45, prompt: horror aesthetic, sepia filter
p7rng8.png
cfg 1.5, denoising .45, prompt: horror aesthetic, sepia filter
p7rng9.png
cfg 6, denoising .5, prompt: horror aesthetic, sepia filter
p7rng10.png
cfg 25, denoising .5, prompt: horror aesthetic, sepia filter
p7rng11.png
cfg 25, denoising .35, prompt: horror aesthetic, sepia filter
p7rng12.png
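
The stronger effect of denoising may come from how img2img uses it: the value controls how much noise is added to the source image and, roughly, how many of the sampling steps are actually run on it. A minimal sketch, assuming the default of 20 sampling steps and that the option to run exactly the number of steps on the slider is off; `img2img_steps` is a hypothetical helper and the web UI's real count may differ by a step:

```python
def img2img_steps(sampling_steps: int, denoising_strength: float) -> int:
    """Approximate number of denoising steps applied to the source image."""
    return int(min(denoising_strength, 0.999) * sampling_steps)


# The denoising values used in the tests above, at an assumed 20 sampling steps.
for strength in (0.35, 0.45, 0.5, 0.6):
    print(strength, img2img_steps(20, strength))
```

Under these assumptions a denoising value of .6 reworks the image for roughly 12 of 20 steps, while the CFG scale only changes how hard each of those steps is pushed toward the prompt.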
Last edited by luischavezcarrillo on Tue Nov 21, 2023 1:06 pm, edited 1 time in total.

bsierra
Posts: 8
Joined: Tue Oct 03, 2023 3:08 pm

Re: Project 7: Stable Diffusion 3

Post by bsierra » Tue Nov 21, 2023 1:00 pm

With this project, I set out to experiment with the refiner tool in Stable Diffusion. I started with the idea of creating a scene with a fuzzy, blurry filter and bright bursts of light that cut through it. Series 1 and 2 show people taking selfies in a mirror, as I wanted to explore commentary on the prevalence of social media and experiment with the way Stable Diffusion depicts selfies. Series 2 keeps the prompts the same, but I lowered the sampling steps to create a blurry effect using Stable Diffusion itself rather than through text queries alone. From here I decided to experiment with the refiner using a foggy forest concept I had tried out in previous projects. The refiner made the images extremely abstract and added very interesting textures to the composition. Using the refiner, I found that I like the way SD 1.5 composes its images, while SDXL applies its own polished filter effect over the top of the composition. Additionally, the SDXL refiner applied heavy scan lines, which I thought I could remove by adding more sampling steps. The scan lines stayed; however, I still feel that they added to the glitch art aesthetic I've been experimenting with all quarter, creating a really neat juxtaposition of nature and technology. These images reminded me of cover art for early 2000s electronica or rock music; all that's missing is the typography.

1

00184-549805674.png
00197-3233970869.png
00242-1222628091.png
mirrored image selfie edit, 2010s instagram flickr tumblr, emo scene, symmetry, indie sleaze bloghouse, bright flash of light, obstructed face, light covering face covered, phone covering face, hair covering face
Negative prompt: black and white, face visible, facial features, view of face, unobstructed, face free of object, phone not covering face
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1222628091, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Version: v1.6.0

2

00262-1759670514.png
00263-1759670515.png
tmpvwgh12n5.png
mirrored image selfie edit, 2010s instagram flickr tumblr, emo scene, symmetry, indie sleaze bloghouse, bright flash of light, obstructed face, light covering face covered, phone covering face, hair covering face
Negative prompt: black and white, face visible, facial features, view of face, unobstructed, face free of object, phone not covering face
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 4062007522, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Version: v1.6.0

3

00343-1908557176.png
00355-3970755789.png

ethereal forest
Steps: 8, Sampler: Euler a, CFG scale: 7, Seed: 1908557176, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Refiner: sd_xl_base_1.0 [31e35c80fc], Refiner switch at: 0.52, Version: v1.6.0

4

00417-943258453.png
00420-943258456.png
foggy grey ethereal forest, bright white flash of light, light burst, light refraction, fisheye perspective, volumetric lighting, intense motion blur, first person point of view, VHS camcorder footage
Negative prompt: fairytale, cartoon, yellow light,
Steps: 19, Sampler: Euler a, CFG scale: 7, Seed: 943258453, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Refiner: sd_xl_base_1.0 [31e35c80fc], Refiner switch at: 0.49, Version: v1.6.0

00436-3358035339.png
00438-140377908.png
00430-3830015318.png
foggy grey ethereal forest, bright white flash of light, light burst, light refraction, fisheye perspective, volumetric lighting, intense motion blur, first person point of view, VHS camcorder footage,
Negative prompt: fairytale, cartoon, yellow light, golden hour, sundown
Steps: 19, Sampler: Euler a, CFG scale: 7, Seed: 3358035339, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Refiner: sd_xl_base_1.0 [31e35c80fc], Refiner switch at: 0.3, Version: v1.6.0

00442-650278030.png
00441-650278029.png
foggy grey ethereal forest, bright white flash of light, light burst, light refraction, fisheye perspective, volumetric lighting, intense motion blur, first person point of view, VHS camcorder footage,
Negative prompt: fairytale, cartoon, yellow light, golden hour, sundown
Steps: 30, Sampler: Euler a, CFG scale: 7, Seed: 650278030, Size: 776x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Refiner: sd_xl_base_1.0 [31e35c80fc], Refiner switch at: 0.3, Version: v1.6.0
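
To see how the "Refiner switch at" values divide the work in series 3 and 4, the sketch below counts how many steps each model gets. It assumes the web UI hands over to the refiner at the first step whose completed fraction (step / total steps) reaches the switch value; the exact rule in v1.6.0 may differ by a step, and `split_steps` is a hypothetical helper:

```python
def split_steps(total_steps: int, switch_at: float) -> tuple[int, int]:
    """Return (steps run on the first model, steps run on the refiner)."""
    first = sum(1 for step in range(total_steps) if step / total_steps < switch_at)
    return first, total_steps - first


# (Steps, Refiner switch at) pairs from the settings lines above.
for steps, switch_at in [(8, 0.52), (19, 0.49), (19, 0.3), (30, 0.3)]:
    print(steps, switch_at, split_steps(steps, switch_at))
```

Under this assumption, lowering the switch value from 0.49 to 0.3 gives the refiner checkpoint most of the steps, which is consistent with its look dominating the later images.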
