MindMap Gallery AI painting
This mind map, based on the zero-to-advanced Stable Diffusion tutorial by Nenly, covers methods of creating paintings with artificial intelligence.
Edited at 2024-04-12 10:02:10
AI painting
Preface
A brief analysis of the principles of AI painting
Original image → Diffusion (add noise) → Generate (remove noise)
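The add-noise/remove-noise pipeline above can be sketched numerically. The following is a minimal, illustrative DDPM-style forward process (names such as `make_schedule` and `add_noise` are invented for this sketch); real Stable Diffusion operates on latent tensors, not single numbers.

```python
import math
import random

# Forward ("add noise") step of diffusion in closed form:
#   x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps,   eps ~ N(0, 1)
# A linear beta schedule; abar_t is the cumulative product of (1 - beta).
def make_schedule(steps=1000, beta_start=1e-4, beta_end=0.02):
    abar, prod = [], 1.0
    for i in range(steps):
        beta = beta_start + (beta_end - beta_start) * i / (steps - 1)
        prod *= 1.0 - beta
        abar.append(prod)
    return abar

def add_noise(x0, t, abar, rng):
    """Diffuse a clean value x0 straight to timestep t (toy: one 'pixel')."""
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(abar[t]) * x0 + math.sqrt(1.0 - abar[t]) * eps

abar = make_schedule()
rng = random.Random(0)
x0 = 0.5                       # a toy "pixel" of the original image
x_late = add_noise(x0, 999, abar, rng)
# abar[-1] is near 0, so the last step is almost pure noise; "generation"
# is learning to run this process in reverse, removing noise step by step.
```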
Configuration requirements
Computer (Windows/Mac system)
Recommended for Win10 or above systems
Graphics card (NVIDIA graphics card is preferred)
Discrete graphics card (not integrated graphics)
Graphics card performance and memory will affect the operating experience
Performance affects drawing efficiency
Video memory (VRAM) affects the maximum resolution you can generate and the scale of model training
configuration list
Installation and operation of WebUI and front-end software
For details, please refer to the latest installation guide
https://nenly.notion.site/c5805e7ae26b4683a277c5586ea05904
It is recommended to download an integration package from a well-known Chinese creator
autumn leaves aaaki
https://www.bilibili.com/video/BV1iM4y1y7OA
Independent Researcher-Starry Sky
https://www.bilibili.com/video/BV1bT411p7Gt
Phantom AI Painting Box
https://www.bilibili.com/video/BV1Vc411T7Nw/
Precautions for running the program
The installation path should not contain Chinese characters
The installation path means the folder where Stable Diffusion WebUI sits and every folder above it. If any of these folder names contains Chinese characters, errors will occur during path indexing. For example, a path such as D:/Program Files/AI绘画 would fail.
To reduce the chance of errors, also avoid spaces in the path as much as possible
Place the WebUI folder on a drive with plenty of storage space; try not to place it on the C drive.
Folders can be copied freely without affecting use.
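The two precautions above (no Chinese characters, as few spaces as possible) can be checked mechanically before installing. A hypothetical helper, not part of WebUI itself:

```python
# Hypothetical pre-flight check for a WebUI install path: flags the two
# conditions the precautions above warn about (non-ASCII characters such
# as Chinese, and spaces).
def path_problems(path: str) -> list:
    issues = []
    if any(ord(ch) > 127 for ch in path):
        issues.append("non-ASCII characters (e.g. Chinese) in path")
    if " " in path:
        issues.append("spaces in path")
    return issues

print(path_problems("D:/AI/stable-diffusion-webui"))      # -> []
print(path_problems("D:/AI 绘画/stable-diffusion-webui"))  # flags both issues
```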
Operation terminal and command line
StableDiffusion(SD)
Basic Operation Guide
Common Functions
Text-to-image (txt2img)
Image-to-image (img2img)
Basic introduction to interface functions
Model
Save and export
The generated pictures will be automatically saved in the local folder
Gallery browser: records various generation information of images
Picture view
View the generated images using the gallery browser
View local files
Location: go to the WebUI root directory (installation location) and find the outputs folder
txt2img-images: text-to-image results
img2img-images: image-to-image results
extras-images: upscaled images
txt2img-grids: preview grids for text-to-image batches
img2img-grids: preview grids for image-to-image batches
Model
Model concept
"Large model" in AI painting generally refers to Checkpoint
Definition: the result of AI training on a data set; it is what supports AI drawing and painting
The origin of the Checkpoint concept
Most models undergo continuous training, fine-tuning, and iteration; a "save point" (checkpoint) along the way is a large model.
Basic properties
Size: 1~7 GB
Common formats: .ckpt / .safetensors
Small model: Other models smaller than the large model
For example: LoRA, Embeddings, Hypernetwork, etc.
Play a "fine-tuning" role in large models
VAE: variational autoencoder
Simple understanding of the function: similar to "color filter"
If the model does not come with VAE, VAE needs to be configured correctly.
Otherwise the image will look gray and washed out, and generation quality will suffer.
Different models will generate different screen contents and styles.
Model download channel
The difference between official models and private furnace models
Official model: a base model trained at great cost. It is the foundation of AI painting, but its rendering results are average.
"Home-brewed" model: a stylized model "fine-tuned" from the official base model, trained by individual creators
Training models, also known as "alchemy"
Copyright issues are still controversial
Channels for downloading various models
Hugging Face: https://huggingface.co/models
A large, professional hub for deep learning and AI, but not very intuitive for browsing models.
Civitai (station C): https://civitai.com/
The most popular AI painting model sharing website in the world. In addition to models, there are also many excellent works on display.
How to filter models
Training model & fusion model
Training model: trained from the basic model
Fusion model: a new model obtained by mixing multiple training models
model tag
Fields in which models are "good at", such as real-life photos, animation, illustrations, architecture, cartoons, 3D, etc.
Learn to use models
View version
Some models may have different iterative versions, generally choose the latest one.
View ModelCard and model description
The author generally provides instructions for use and recommends samplers, VAE, etc.
View the example image "Copying homework"
On the model website, you can generally directly copy the prompt words of the pictures uploaded by the author or other creators and apply them
Model folder path
WebUI root directory/models/Stable-diffusion
Model style classification and recommendation
Recommended model
StableDiffusion1.4
AbyssOrangeMix (Abyss Orange)
Starter Pack
BV1Us4y117Rg
Two-dimensional model: Comic/illustration style, with a distinctive painting brushstroke texture
Recommended models: AbyssOrangeMix, Counterfeit, Anything, Dreamlike Diffusion
Real model: It is more realistic, has a high degree of simulation, and has a strong ability to restore the real world.
Recommended models: Deliberate, Realistic Vision, LOFI
2.5D model: between the first two, close to the current audience’s imagination of some games and 3D animations
Recommended models: NeverEndingDream (NED), Protogen, Guofeng 3
Other specialized style models
Such as: architectural design, graphic design, etc.
Principle analysis of advanced models - small models
Embeddings: word embedding model
Can be used to restore character image characteristics
Metaphor: a set of "bookmarks" pointing to specific pages
Example: What is "Nekomata"? → Cat, Human, Monster
Usage
Installation: Place the model in the root directory/Embeddings folder
Call: Enter the model file name in the positive/negative prompt word box
Application development
Three-view character design: use the CharTurner embedding, plus suitable prompt sentences to trigger it
Example:A character turnaround of a (corneo dva) wearing blue mechabodysuit,(CharTurnerV2:1.2)
Negative word embedding: solve the problem of wrong hands and low picture quality
Just add the corresponding Embeddings file name to the negative prompt words.
Interrogate prompts (reverse-infer prompts from an image)
Use CLIP or DeepBooru
DeepBooru is recommended; CLIP often has connection problems
LORA: low rank adaptation model
It is generally used to restore characters and image characteristics, and can also be used to train painting styles.
Metaphor: an extra "coloring page" in the book
Usage
Installation: Place the model in the root directory /models/LORA folder
Call: enter <lora:model file name:weight> in the positive/negative prompt box
When using it, you can control the weight to avoid overly affecting the painting style or other elements.
Hypernetwork: super network
Generally used for painting style training
(Not many people use it now)
Usage
Installation: Place the model in the root directory /models/Hypernetwork folder
Call: Enter <hypernet:model file name:weight> in the positive/negative prompt word box
LoRA application and practice
The birth and principle of LORA
First used in large language models
"Fine-tuning" a large model with less data
Used together with Checkpoint
The training threshold is low, and the ecology flourishes.
Three basic ways to apply LORA
Enter via prompt word
Format: <lora:filename:weight>
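As a sketch of how this syntax is machine-readable, here is a small parser for `<lora:filename:weight>` tags (the regex, function name, and LoRA names are illustrative, not taken from WebUI's source):

```python
import re

# Pull every <lora:filename:weight> tag out of a prompt string.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")

def parse_loras(prompt):
    return [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]

prompt = "1girl, night city, <lora:lucy_edgerunners:0.8>, <lora:ghibli_style:0.6>"
print(parse_loras(prompt))   # [('lucy_edgerunners', 0.8), ('ghibli_style', 0.6)]
```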
Called via built-in menu
Next to the "Generate" button, a panel lets you select various add-on models (including Embeddings, LoRA, and more)
Click the corresponding tab to automatically add the prompt word to the upper box
Can manage thumbnails
Called via additional extensions
Additional Networks extension
Supports up to five LORAs, with slider adjustment of weight parameters
Five LORA application directions
Character LoRA
Case: "CyberCoser" Lucy from Edgerunners
Tips: use LoRA together with Tagger to reverse-infer the character's feature tags and pin the character design down accurately.
Tips: Combining the same LORA with large models of different styles will produce different effects.
For example, if you choose a large model in a real-life style, you can draw a realistic character effect.
Painting style LORA
Case: Ghibli style
Tips: Multiple LORAs can be used in combination, one is responsible for the painting style, and the other is responsible for the characters.
Concept LoRA
Case: Gacha splash LoRA
Tips: Read ModelCard to get more usage suggestions from the author.
Such as: sampler, key prompt words, parameters, etc.
Clothing LoRA
Case: Mecha Girl
Tips: Multiple LORAs of the same type can be used in combination and observe their "chemical reactions" to create the appropriate effect.
The weight should not be too high, otherwise it will cause conflict and confusion in the picture.
Specific-element LoRA
Case: Cyberhelmet
Tips: LORA can also be used in partial redrawing, so that only part of the redrawing applies the LORA effect
For example, partially redraw the head to achieve accurate "helmet wearing"
Prompts
basic writing methods
Write entirely in English
Use phrases as the basic unit
Most of the time it is recommended to split long sentences into phrases
Insert a separator (a half-width English comma) between phrases
Line breaks are possible, but it is best not to have separators in each line.
Prompt word concept
Convey drawing requirements and let AI understand "what we want to draw"
Prompt word classification
Positive cue words: what to expect
Content prompt words
Describe the scene concretely
Personal appearance (girl, blonde hair, long hair, etc.)
Clothing characteristics (white dress, jeans, t-shirt, etc.)
Scenes and environments (forest, tree, white flower, day, sunlight, cloudy sky, etc.)
Picture perspective and composition (close-up, full body, distant, etc.)
Other screen elements
standardized prompt words
quality
High quality: best quality, ultra-detailed, masterpiece, highres, 8k
Specific high-quality types: extremely detailed CG unity 8k wallpaper (ultra-fine 8K Unity game CG), unreal engine rendered (unreal engine rendering)
style of painting
Illustration style: painting, illustration, drawing
Two-dimensional: Anime, Comic, Game CG
Realistic style: Photorealistic, Realistic
Negative prompt words: what you don’t want to happen
Common negative reminder words
Low quality: such as low quality, low res
Monochrome grayscale: such as monochrome, grayscale
Appearance and body shape: such as bad proportions, ugly
Problems with limbs: such as missing hands, extra fingers
You can also steer desired elements into the picture by putting their opposites in the negative prompt
Prompt word weight grammar
The role of weights: to enhance or weaken the priority of certain prompt words
Adjust weighting method
The first type: brackets
Example: (((white flower))) - strengthened by 1.1^3 times
Example: {{{white flower}}} - strengthened by 1.05^3 times
Example: [[[white flower]]] - weakened by 1.1^3 times (divided by 1.1^3)
The second type: brackets, colons, numbers
Example: (white flower:1.5),
Control between 0.5-1.5
Avoid adjusting the weight too much, otherwise it will distort the picture.
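The bracket multipliers above compound when nested, which is where values like 1.331 (= 1.1^3) and 1.61051 (= 1.1^5) seen in circulated prompts come from. A quick check (the helper name is invented for this sketch):

```python
# Effective attention multiplier of n nested brackets:
#   ( ... ) multiplies by 1.1, { ... } by 1.05 (NovelAI-style),
#   [ ... ] divides by 1.1.
def bracket_weight(depth, kind):
    base = {"(": 1.1, "{": 1.05, "[": 1 / 1.1}[kind]
    return base ** depth

print(round(bracket_weight(3, "("), 3))   # 1.331   -> (((white flower)))
print(round(bracket_weight(5, "("), 5))   # 1.61051
print(round(bracket_weight(3, "["), 3))   # 0.751   -> [[[white flower]]]
```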
Recommended "curse" (ready-made boilerplate prompts)
Positive prompt
(masterpiece:1.2),best quality,masterpiece,highres,original,extremely detailed wallpaper,perfect lighting,(extremely detailed CG:1.2),drawing,paintbrush,
Negative prompt
NSFW,(worst quality:2),(low quality:2),(normal quality:2),lowres,normal quality,((monochrome)),((grayscale)),skin spots,acnes,skin blemishes,age spot,(ugly:1.331),(duplicate:1.331),(morbid:1.21),(mutilated:1.21),(tranny:1.331),mutated hands,(poorly drawn hands:1.5),blurry,(bad anatomy:1.21),(bad proportions:1.331),extra limbs,(disfigured:1.331),(missing arms:1.331),(extra legs:1.331),(fused fingers:1.61051),(too many fingers:1.61051),(unclear eyes:1.331),lowres,bad hands,missing fingers,extra digit,bad hands,missing fingers,(((extra arms and legs)))
Three Tip Word Methods for Beginners
1. The translation method
Use translation software to directly convert your natural language into prompt words
Describe a specific scene first, and then slowly add entries later
2. Use tools
Use the prompt word tool to complete the prompt word writing by "selecting"
A toolbox: http://www.atoolbox.net/Tool.php?ld=1101
AI word accelerator: https://ai.dawnmark.cn/
Don’t be limited in your thinking by existing entries
3. Copy homework
Refer to example images on model websites and on prompt-collection websites.
OpenArt: https://openart.ai
ArtHubAi: https://arthub.ai/
Reference content/standardized prompt words as needed
parameter settings
Number of sampling steps
The higher the number of sampling steps, the more detailed the picture will be.
Above ~20 steps the improvement is small, but the extra compute cost grows.
Recommended range: between 10~30 (default 20)
Sampling method
Various generation algorithms
Recommend the ones marked with " " below
If the model has a recommendation algorithm, use it first
resolution
Resolution is too small: Pictures are inherently blurry and lack details
Resolution too large: computation slows, VRAM is easily exhausted, and duplicated figures may appear.
Trial and error is needed to learn which resolution balances quality and efficiency on your current hardware.
Other options
Prompt relevance (CFG Scale): how strictly the prompt is followed (safe range: 7~12)
Face restoration: recommended to enable
Tiling: leave unchecked unless you are making a seamless pattern
Generating images in batches
Generates repeatedly according to the batch count
It is recommended to keep the batch size (images per batch) at 1
Because multiple images in a single batch are "stitched" into one large image for generation
Denoising strength - applies to image-to-image mode
The recommended setting is between 0.6 and 0.8
The meaning of random seed
Refining prompts
Further definition of background content
( ) in background: precisely define the background content
Depth of field: helps create a photographic atmosphere
random seed
The core of "card drawing" (rerolling):
Each generation is randomized in a different way; that randomization is recorded as a set of numbers, the random seed.
Different random seeds bring randomness; the same random seed reproduces a similar result.
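A toy illustration of the point above, using Python's `random` module in place of the sampler's noise source:

```python
import random

# The "random generation method recorded as a number": the seed fixes the
# entire noise stream, so the same seed reproduces the same starting noise.
def noise_sequence(seed, n=4):
    rng = random.Random(seed)
    return [round(rng.gauss(0.0, 1.0), 6) for _ in range(n)]

a = noise_sequence(31337)
b = noise_sequence(31337)   # same seed
c = noise_sequence(31338)   # different seed
print(a == b)   # True  -> same seed, same "initial image noise"
print(a == c)   # False -> different seed, a different draw
```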
How to fix random seeds
The gallery browser also records the number of seeds
Keep the random seeds consistent and modify the prompt words to achieve a relatively consistent character style.
Image-to-image (img2img)
principle
Like text, pictures can also be sent to AI for analysis as a kind of information.
The essence of "redrawing": the image's pixel structure is analyzed, and a finished image similar to the original is generated.
The basic steps
upload image
Fill in the prompt word
Use prompt words to describe the content of the screen
Even when asking the AI to draw via img2img, you still need specific and accurate prompts.
Content-type standardized prompt words
parameter settings
Denoising strength
Controls how different the finished image is from the original
Too high and the image deforms; too low and no real "redrawing" happens.
The recommended setting is between 0.6-0.8
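One way to see why 0.6~0.8 is a sweet spot: denoising strength roughly decides what fraction of the sampling steps actually run on top of the uploaded image. This is a simplification of common img2img behavior, not the exact WebUI code:

```python
# Denoising strength ~ fraction of steps actually used to re-draw the input.
def img2img_steps(total_steps, strength):
    strength = min(max(strength, 0.0), 1.0)   # clamp to [0, 1]
    return round(total_steps * strength)

for s in (0.2, 0.6, 0.8, 1.0):
    print(s, img2img_steps(20, s))
# 0.2 -> 4 steps: barely redrawn; 1.0 -> 20 steps: the original is
# effectively discarded and the result is close to pure txt2img.
```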
resolution
Prioritize maintaining consistency with the original image
If the original image is too large, it can be scaled down to a safe range.
If the proportion of the finished product is different from the original image
Cut it on the computer first and then import it
Three different cropping/resize modes adapt mismatched sizes
Other parameters
Expand applications
Turning real portrait photos into anime ("2D") style
Using SD gives higher accuracy and more room for customization
"Personification" of still life and landscapes
Import non-human pictures and define them with prompts describing a person.
Making 2D characters "three-dimensional" (realistic)
Import pictures of anime and game characters and define them with realistic models and realistic boilerplate prompts.
You can use the Lora model to restore character characteristics more specifically and accurately
Advanced gameplay
Image synthesis + AI redrawing
abstract painting
Draw some colors and lines randomly and import them into AI to generate
ControlNet application and practice
Analysis of the ControlNet principle
Uses specific information to guide generation, achieving control that plain text-to-image and image-to-image cannot provide.
The meaning of precise control
If you can only rely on "card drawing" (rerolling) to get the required content, generation is highly uncontrollable.
The significance of precise control: facing specific requirements, only what is "controllable" can become "productive".
Basic structure: preprocessor → model
Preprocessor can extract feature information from images
The trained ControlNet model reads this information and guides the Stable Diffusion generation process
ControlNet basic application methods
Drag in the control image and select the matching preprocessor-and-model combination
Preprocess the image
Click the explosion (💥) button to preview the preprocessing result
Preprocessed control images can be saved and reused
When uploading an already-preprocessed control image, set the preprocessor to "None"
Detailed explanation of parameters
Control weight: mainly affects the "strength" of control
Guidance timing: when ControlNet "takes effect" during generation (from 0 to 1)
Control mode: Prefer prompt words or ControlNet
Ways to change the control intensity:
Increase: raise the weight, lower the guidance start step and raise the guidance end step, and select the "ControlNet is more important" mode
Decrease: lower the weight, raise the guidance start step and lower the guidance end step, and select the "My prompt is more important" mode
In img2img, the source image is automatically loaded as the control image.
Introduction to the five major ControlNet models
Openpose: Control posture, hands, facial details
Several different Openpose preprocessors
Hand: hand skeleton
Face (only): facial keypoints
Full: all of the above combined
Depth: Control space composition (depth)
Depth map: black = far, white = near
Several different Depth preprocessors
Leres has higher accuracy; Midas is more commonly used.
The higher the precision of preprocessing, the longer it generally takes.
Canny: Control line outlines
During preprocessing, the thresholds control line density; the lines should not be too dense.
Application: Line drawing coloring
Tips: for line art with black lines on a white background, use Invert to convert it to white lines on black so it is recognized correctly.
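Why the thresholds control line density can be seen in a 1-D toy version of Canny's hysteresis step (illustrative only, not a real Canny implementation): gradients above the high threshold become edges, and those between low and high survive only if connected to a strong edge.

```python
# 1-D toy of Canny's double-threshold hysteresis.
def hysteresis_1d(grad, low, high):
    keep = [g >= high for g in grad]          # strong edges
    changed = True
    while changed:                            # grow into weak neighbors
        changed = False
        for i, g in enumerate(grad):
            if not keep[i] and g >= low:
                if (i > 0 and keep[i - 1]) or (i + 1 < len(grad) and keep[i + 1]):
                    keep[i] = changed = True
    return keep

grad = [10, 80, 120, 90, 30, 5, 95, 40]
print(hysteresis_1d(grad, low=50, high=100))
# Raising low/high keeps fewer lines: 80 and 90 survive only because they
# touch the strong 120, while the isolated 95 is dropped.
```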
SoftEdge: Controls line contours, but is softer and more relaxed
Several different SoftEdge preprocessors
not much difference
Compared with Canny, SoftEdge's restoration of contours is more "vivid" and not too rigid.
Tips: Appropriately "relaxing" ControlNet's control will help AI exert more of its own creativity
Scribble: doodles guide image generation
The doodle can be extracted from a picture or drawn by yourself
Application: Soul Painter
Multiple ControlNet application logic
In Settings, enable multiple ControlNet units to use several ControlNets at once
The key to combinatorial logic: complement each other!
Correct example: Canny + Depth, using lines to fill in detail within the depth structure
Wrong example: Canny + SoftEdge; both control edges, so running two adds little over one.
Three native WebUI upscaling methods
Txt2img: hi-res fix (HD restoration)
Essence: First generate a low-quality version, then "redraw" to generate a high-resolution version
It is a bit like running "img2img" on a low-quality version of the picture
Currently it is part of most drawing workflows.
Advantages and Disadvantages Analysis
Advantage
Does not change the frame composition (fixed via random seed)
Reliably avoids resolution-induced problems such as duplicated figures and extra heads.
Easy, clear and intuitive operation
Disadvantages
Still limited by maximum video memory
Computational speed is relatively slow
Occasionally "adding drama", inexplicable additional elements appear
Parameter analysis
Upscale by: scale with a multiplier
Resize to: set the final size directly
Denoising strength
Depends on the algorithm; generally not set too high
Recommended 0.3~0.5
Algorithm selection
Latent series: rich in details, but easy to cause picture distortion
Note: with Latent the denoising strength generally cannot be below 0.5 (otherwise blur appears)
GAN series: keeps the original image as similar as possible; detail is not as rich as Latent's.
If unsure, you can pick R-ESRGAN 4x+ without a second thought.
*For anime images, choose the variant with Anime6B
The differences between different algorithms are actually not as big as imagined
Img2img: SD upscale
Essence: Divide the picture into small pieces, redraw them, and then put them together into one big picture.
Advantages and Disadvantages Analysis
Advantage
Can break through VRAM limits to reach larger resolutions (up to 4x width and height)
The picture is high in fineness and the richness of details is excellent.
Disadvantages
The split-and-redraw process is relatively uncontrollable (semantic misreadings and visible seams)
Cumbersome and relatively unintuitive to operate
Occasionally "adding drama", inexplicable additional elements appear
Parameter analysis
Denoising strength: generally no more than 0.5; 0.2~0.3 recommended
If duplicated figures or heads appear, lower the denoising strength.
Upscaler: the same as in hi-res fix
Tile overlap: the overlap between neighboring tiles, which makes the stitched image look natural
Understanding the role of overlap: like glue, or a buffer strip
If a seam looks too stiff, increase the overlapping pixels.
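The tiling-with-overlap idea can be sketched as coordinate arithmetic (a simplification; the function name is illustrative): tiles advance by (tile size minus overlap), so each neighboring pair shares a strip to blend across.

```python
# Where each tile starts along one axis, given tile size and overlap.
def tile_starts(length, tile, overlap):
    if tile >= length:
        return [0]
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)        # last tile flush with the edge
    return starts

starts = tile_starts(1024, 512, 64)
print(starts)                           # [0, 448, 512]
# Every pixel 0..1023 is covered, and consecutive tiles overlap by at
# least 64 px, the "glue strip" that hides the seams.
```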
Extras: post-processing
Essence: Simple resolution improvement of pictures through artificial intelligence algorithm, no redrawing involved
Convenient, efficient, always available; there is nothing to lose by using it
Advantages and Disadvantages Analysis
Advantage
Easy to use, simple to operate, can be called at any time
Fast calculation and no redrawing pressure
Does not change the image content at all
Disadvantages
The improvement is not very significant (somewhat underwhelming)
(...this one flaw alone is a real drawback)
Parameter analysis
Upscaler 2 (the second upscaling algorithm)
Its visibility works like the "weight" logic of prompts
Optional; using one algorithm is usually enough.
Can be processed in batches
After WebUI was updated to 1.6.0, the "hi-res fix" function changed from a checkbox to an expandable panel: expanded means enabled, collapsed means disabled. Usage is otherwise identical.
Partial redraw (Inpaint)
Partial redraw basics
Basic operating procedures
Upload pictures to the "Partial Redraw" workspace, or send them to the Partial Redraw through the gallery browser, etc.
Adjust the brush size so that the drawing area covers the part that needs to be redrawn
Adjust the redrawing range and various parameters, modify the prompt words, and click Generate to realize redrawing.
Essence: redraw the picture, and then "piece back" the redrawn area
Core parameter analysis
mask area
The painted part is the mask area
The other parts are "non-mask" areas
masked content
Original: fed in as-is, then noised and denoised; the redraw stays closest to the original
Fill: the original is heavily blurred first, then noised and denoised; the redraw differs somewhat more
Latent noise & latent nothing: the mask area is turned into pure latent-space noise, so the redraw changes the most
Generally this requires a higher denoising strength, otherwise the redrawn area turns chaotic.
For different images, the effect may be different and you need to try more.
Inpaint area: whole picture vs. only masked
Whole picture: redraw the entire image, then "patch back" the painted parts.
Only masked: redraw just the painted part plus a small surrounding area, then "patch back" the painted part
Mask blur and padding
Mask blur: edge softness, similar to "feathering" a selection
Only-masked padding: affects how much surrounding area is included in only-masked mode
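The "piece back" step, with mask blur acting as feathering, amounts to a per-pixel blend: out = m * redrawn + (1 - m) * original. A 1-D toy sketch (function names are invented for illustration):

```python
# Feather a hard 0/1 mask (the "mask blur" idea) with a simple box average,
# then blend the redrawn result back over the original.
def feather(mask, radius=1):
    out = []
    for i in range(len(mask)):
        lo, hi = max(0, i - radius), min(len(mask), i + radius + 1)
        out.append(sum(mask[lo:hi]) / (hi - lo))
    return out

def composite(original, redrawn, mask):
    return [m * r + (1 - m) * o for o, r, m in zip(original, redrawn, mask)]

original = [0.2] * 8                   # a toy row of pixels
redrawn  = [0.9] * 8
hard     = [0, 0, 0, 1, 1, 1, 0, 0]   # painted (masked) area in the middle
soft     = feather(hard)               # soft edges instead of a hard cut
result   = composite(original, redrawn, soft)
# Inside the mask the result is the redrawn value; at the edges it blends,
# which is why a hard seam disappears when mask blur (feathering) is used.
```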
Other redraw methods
Graffiti redraw (Inpaint sketch)
Use a colored brush to paint, then repaint the painted area and add the painted color to the original image
It is often used to modify the wrong parts of the picture, such as correcting the wrong hand.
Operation: first cover the wrong hand area with a color close to the background, then sketch the hand shape in a skin-like color.
Sketch
Paint with a colored brush, then the entire image is redrawn
Difference from graffiti redraw: one redraws only the local area, the other the whole picture
Mask redraw (Inpaint upload)
Upload a black-and-white mask image to define the redraw area
By default, white is masked (redrawn) and black is unmasked
You can use PS and other software to cut out images to create mask images and export them.