The video explains how to achieve consistent styling when generating images with AI, specifically using LoRA training in ComfyUI. It covers the installation of necessary nodes, the preparation of a dataset, and the process of training a LoRA model to achieve a desired aesthetic.
[0:01] The video begins with a striking image of a photorealistic doll, setting a visual tone.
[0:01] This is followed by an image of a donkey, introducing a different subject.
[0:02] The video then transitions to an image of an elderly man, showcasing diversity in subjects.
[0:03] Next, a beautiful floral arrangement in a park appears, highlighting a natural subject.
[0:05] The scene changes to a nurse in what appears to be a stylized or sketched hospital room, suggesting a focus on unique visual styles.
[0:08] A miniature house is then displayed, perhaps as an example of another subject type.
[0:11] Finally, a dog is shown, completing the initial montage of diverse imagery.
[0:14] The video then shifts to demonstrating the ComfyUI Manager, a crucial tool for managing nodes and extensions.
[0:15] The interface of the ComfyUI Manager is displayed, showing various options for managing the software.
[0:16] A status update message indicates that custom nodes are being updated and that new nodes are being checked for.
[0:17] The manager then lists available custom nodes, including descriptions and installation status.
[0:18] The video highlights the search functionality within the manager, suggesting how to find specific nodes.
[0:19] It’s mentioned that “all the links in the description of the video” will provide access to these tools.
[0:22] The process involves navigating the manager to find and install specific nodes related to training.
[0:23] The user searches for “training” within the ComfyUI Manager.
[0:24] A list of relevant nodes appears, including “Lora Training in ComfyUI” and “Lora Training in Comfy (Advanced)”.
[0:26] The video focuses on installing these two specific LoRA training nodes.
[0:28] The video then introduces the need for a tool that can tag images automatically.
[0:30] These tags become the captions the training process learns from, so accurate tagging is crucial.
[0:31] The video demonstrates the use of “WD14 Tagger,” a tool for automatically tagging images.
[0:33] It suggests that you can use any tool that knows how to “see and analyze images” for tagging.
[0:37] The video mentions “WD14 Tagger” again, emphasizing its role in the process.
[0:39] “The data set, or in other words, the images” are highlighted as one of the most important elements for LoRA training.
[0:42] The video shows an image of a snowy landscape with trees, likely a sample image from the dataset.
[0:45] Another image, this time of a woman in a graveyard in winter, further illustrates the variety of potential training data.
[0:48] The video displays an image of a woman and the associated text tags that describe the image, demonstrating the importance of accurate tagging.
[0:51] The tags shown are: “1girl, solo, long hair, looking_at_viewer, brown_hair, shirt, brown_eyes, jacket, white_shirt, outdoors, collared_shirt, bag, free, lips, backpack, building, realistic”.
[0:53] This emphasizes the need for descriptive and specific tags for effective LoRA training.
[0:56] The video illustrates the file structure for training data, showing the folder named “1_memory”.
[0:59] It highlights the convention of naming the dataset folder with a number followed by an underscore and the name of the object or style to be trained.
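The naming rule above can be sketched as a small validity check. This is a hypothetical helper, not part of ComfyUI; the pattern follows the kohya-style convention of repeat count, underscore, then a name without spaces:

```python
import re

# Kohya-style dataset folders are named <repeats>_<name>, e.g. "1_memory":
# the leading number is the per-epoch repeat count for the images inside,
# and the name after the underscore must contain no spaces.
FOLDER_PATTERN = re.compile(r"^(\d+)_(\S+)$")

def parse_dataset_folder(name: str):
    """Return (repeats, concept_name) for a valid folder name, else None."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

print(parse_dataset_folder("1_memory"))    # (1, 'memory')
print(parse_dataset_folder("my dataset"))  # None
```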
[1:06] The goal is to create a LoRA that captures a specific look: a central subject combined with a visual style.
[1:08] The video shows a colored subject against a black and white background, demonstrating the concept of a stylized subject.
[1:13] This is contrasted with a sketched background, further illustrating the idea of applying a specific style.
[1:15] The video shows a red bicycle, another example of an object that could be trained.
[1:17] A doll in a dress is shown, serving as a visual reference for the LoRA training.
[1:20] The video shows the ComfyUI workflow with multiple nodes, illustrating the process of generating images.
[1:23] It emphasizes that the workflow shown is intended for the exact purpose of training a LoRA.
[1:28] The video mentions that links to the necessary files will be provided in the description.
[1:31] The workflow shows the connection of various nodes, including image processing and sampling.
[1:33] The video then displays the contents of the image dataset folder.
[1:35] An image of a modern building on a beach is shown, with associated text tags.
[1:38] The tags for this image include: “memory1, monochrome, greyscale, outdoors, sky, cloud, tree, no_humans, beach, building, scenery, sand, palm_tree, house”.
[1:41] The video transitions to showing how to use the “BLIP Caption” or “LLaVA Caption” nodes to generate tags for the images.
[1:44] A “Caption” node is added to the ComfyUI workflow.
[1:47] The video shows how to connect the “Image Captioning” node to the workflow.
[1:48] It also demonstrates the addition of the “WD14 Tagger” node.
[1:50] The connection of the “WD14 Tagger” to the workflow is shown.
[1:53] The video explains that the path to the dataset folder needs to be pasted into the “WD14 Tagger” node.
[1:56] The path shown is “D:/RAWg/12_lora_check/memory1”.
[1:58] The video explains the need to choose a “trigger word” for the LoRA.
[2:01] In this case, the trigger word chosen is “memory1”.
[2:05] The video then shows the process of queuing the prompt for execution.
[2:09] The status of the task is displayed as “Running”.
[2:11] The video shows the output of the tagging process, with each image having a corresponding text file with descriptive tags.
[2:14] It suggests that users can correct any incorrect tags.
[2:17] The video highlights an example tag “water” and explains that if a tag is not relevant, it can be removed or corrected.
[2:22] The video then demonstrates the LoRA training process.
[2:24] It shows how to search for “training” nodes in ComfyUI.
[2:27] Two training nodes are presented: “Lora Training in ComfyUI” and “Lora Training in Comfy (Advanced)”.
[2:30] The video explains that the “Lora Training in ComfyUI” node is simpler, while the “Advanced” version offers more parameters.
[2:32] It encourages users to experiment with different parameters to see their effect.
[2:34] The video focuses on using the “Lora Training in ComfyUI” node.
[2:37] It shows how to select a base model, in this case, “DreamShaper_8_pruned.safetensors”.
[2:40] The video emphasizes the importance of accurately specifying the “data_path”, which should point to the folder containing the training images.
[2:43] The data path is set to “D:/RAWg/12_lora_check/memory1”.
[2:46] The video then moves to the “batch_size” parameter.
[2:48] It also highlights the “max_train_epochs” and “save_every_n_epochs” parameters.
[2:51] The video shows the file structure of the dataset again.
[2:54] It emphasizes that the “data_path” should point to the folder containing the images, not the images themselves.
[2:57] The video further stresses that the folder name should start with a number, followed by an underscore and the name without spaces.
[3:03] The “max_train_epochs” is set to “400”.
[3:11] The “output_name” is set to “memory1”.
[3:15] The “clip_skip” parameter is set to “2”.
[3:20] The “output_dir” is specified as “D:/stable-diffusion/ComfyUI/models/loras”.
[3:30] The video explains that the LoRA file will be saved in this directory by default.
[3:34] The “save_every_n_epochs” parameter is set to “100”.
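At this point all of the training node's settings shown on screen are in place. Collected in one place for reference (the dict keys only approximate the node's widget labels, and the values are those set in the video):

```python
# Settings as shown in the video for the "Lora Training in ComfyUI" node.
# Key names are approximations of the node's widgets, for reference only.
training_settings = {
    "base_model": "DreamShaper_8_pruned.safetensors",
    "data_path": "D:/RAWg/12_lora_check/memory1",
    "max_train_epochs": 400,
    "save_every_n_epochs": 100,
    "clip_skip": 2,
    "output_name": "memory1",
    "output_dir": "D:/stable-diffusion/ComfyUI/models/loras",
}

# With these values, a checkpoint is written four times during the run.
print(training_settings["max_train_epochs"]
      // training_settings["save_every_n_epochs"])  # 4
```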
[3:40] The video demonstrates queuing the training process.
[3:42] It then discusses how the training time depends on the number and size of the images, as well as the computational power.
[3:55] The video shows the output files, indicating the creation of multiple LoRA models with different epoch counts.
[4:02] It specifically points out the “save_every_n_epochs” parameter and its effect.
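With max_train_epochs at 400 and save_every_n_epochs at 100, the checkpoint schedule works out as below (a quick sanity check, assuming a file is saved at every multiple of the interval):

```python
def checkpoint_epochs(max_train_epochs: int, save_every_n_epochs: int) -> list:
    """Epochs at which a LoRA file is written to output_dir."""
    return list(range(save_every_n_epochs,
                      max_train_epochs + 1,
                      save_every_n_epochs))

print(checkpoint_epochs(400, 100))  # [100, 200, 300, 400]
```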
[4:11] The video then opens a basic ComfyUI workflow.
[4:13] It suggests using “DreamShaper_8” as a base model.
[4:16] The generated image of the doll is displayed.
[4:18] The video then loads another LoRA.
[4:21] It shows the output of this second LoRA training, which has fewer steps.
[4:24] The positive prompt used is “memory1 closeup photo of a doll on a side walk”.
[4:29] The video shows how to load the LoRA into the workflow.
[4:32] It then displays the generated image, noting that the quality is not as good as expected.
[4:35] The video proposes trying another LoRA with fewer steps.
[4:38] It shows the result of training with 200 epochs versus 400 epochs.
[4:41] The comparison highlights the difference in quality between the two training runs.
[4:44] The video then loads a LoRA with a different “strength” value.
[4:47] It displays the resulting image, demonstrating the effect of the “model_strength” parameter.
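Conceptually, model_strength scales the low-rank update a LoRA adds to each targeted weight matrix, W' = W + strength · (B·A): 0.0 leaves the base model untouched, 1.0 applies the full learned style. A dependency-free sketch with tiny matrices (an illustration of the idea, not ComfyUI's implementation):

```python
def matmul(B, A):
    """Multiply two matrices given as nested lists."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def apply_lora(W, A, B, strength=1.0):
    """Blend the low-rank update B@A into W, scaled by strength."""
    delta = matmul(B, A)
    return [[W[i][j] + strength * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]   # base weight matrix
A = [[1.0, 1.0]]               # low-rank factors (rank 1)
B = [[1.0], [1.0]]
print(apply_lora(W, A, B, 0.5))  # [[1.5, 0.5], [0.5, 1.5]]
```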
[4:53] The video concludes with a montage of the various images and models discussed.
[4:56] It encourages viewers to subscribe and ask questions.
[5:00] The video ends with a farewell.