Create Controllable Characters for your AI Movies [ADVANCED]

[FULL GUIDE]

Jul 17, 2025

by Mickmumpitz

ADVANCED WORKFLOW GUIDE

ATTENTION! This is the guide for the advanced workflow. While everyone can read this guide, only Patreon supporters can download the actual workflow file. If you want to support the creation of these workflows, guides, and YouTube videos, please consider becoming a patron!

The advanced workflow improves upon the Free SMPL Workflow by creating iterative batch-by-batch video generation groups, allowing us to overcome the 81-frame generation limit of Wan 2.1.

Introduction

Learn the game-changing free AI VFX workflow and discover how to add controllable creatures & consistent characters to any footage using just one reference image and Wan 2.1 VACE in ComfyUI. This tutorial walks you through cutting-edge AI animation techniques that compete with expensive studio methods, including motion control using Blender and After Effects. We'll show you these breakthrough AI video generation methods by creating a complete smartphone-shot short film featuring a magical crystal-cat-dragon. Whether you're making YouTube content, social media videos, or professional projects, these AI-powered VFX techniques will transform your creative process and help you produce high-quality creature animations using completely free, open-source AI tools.

🎨 Workflow Sections

🟨 Important Notes
⬜ Input / Output / Model Loaders
🟩 Preparation
🟪 Video Generation

Installation

Download the .json file and drag and drop it into your ComfyUI window.
Install the missing custom nodes via the manager and restart ComfyUI.

Download Models

Wan2_1-T2V-14B_fp8_e5m2.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1-T2V-14B_fp8_e5m2.safetensors
📁 ComfyUI/models/diffusion_models

Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors
📁 ComfyUI/models/loras

Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors
📁 ComfyUI/models/diffusion_models

umt5-xxl-enc-bf16.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/umt5-xxl-enc-bf16.safetensors
📁 ComfyUI/models/text_encoders

Wan2_1_VAE_bf16.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1_VAE_bf16.safetensors
📁 ComfyUI/models/vae
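
If you prefer to fetch everything in one go, here is a minimal sketch using the huggingface_hub Python package, assuming a default ComfyUI folder layout (the repo and filenames are exactly those linked above):

```python
# Minimal sketch: download the required models into a ComfyUI install.
# Assumes ComfyUI lives at ./ComfyUI; adjust COMFYUI_DIR for your setup.
from pathlib import Path
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

COMFYUI_DIR = Path("ComfyUI")
REPO = "Kijai/WanVideo_comfy"

MODELS = {
    "Wan2_1-T2V-14B_fp8_e5m2.safetensors": "models/diffusion_models",
    "Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors": "models/loras",
    "Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors": "models/diffusion_models",
    "umt5-xxl-enc-bf16.safetensors": "models/text_encoders",
    "Wan2_1_VAE_bf16.safetensors": "models/vae",
}

for filename, subdir in MODELS.items():
    target = COMFYUI_DIR / subdir
    target.mkdir(parents=True, exist_ok=True)
    hf_hub_download(repo_id=REPO, filename=filename, local_dir=str(target))
    print(f"Downloaded {filename} -> {target}")
```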

Before You Begin:

Thank you for considering supporting us! Since these workflows can be complex, we recommend testing the free versions first to ensure compatibility with your system. We cannot guarantee full compatibility with every system, which is why we always provide the main functionality for free!

Please take a moment to read through the entire guide. If you encounter any issues:

  1. Check the troubleshooting section at the end of the guide

  2. If problems persist, visit our Discord's #need-help channel and use the search function—many common issues have already been resolved

  3. If you cannot find a solution there, ask and we will try to help you. Give as much information as possible, including screenshots and the exact error message.

1. MODEL LOADERS

In this section, you can download the required models using the links on the left side. Additionally, you have the option to increase the blocks_to_swap value in the “Video Block Swap” step if you're working with limited VRAM.

2. VIDEO INPUT & SIZE SELECTION

We offer three predefined resolution settings. By default, 576p is selected, as it provides a good balance between quality and generation time and generally produces great animation, though results are still quite random, as we all know. You can also set the switch to 1 for 720p or 3 for 504p.

You can specify the number of frames to be generated in each video generation iteration using the "batch_size" node.
The “skip frames” option allows you to offset the video if needed.

At the bottom right, you can select whether to use the Inpainting method or the Start Frame + ControlNet method.

3. REFERENCE IMAGE

Upload the reference image that contains the character you want to integrate into the video. Nothing else is needed in this step – just load the image and you’re good to go.

4. POINT SYSTEM

The “Point System” section offers multiple options for defining point data for your animation.

In the center, you can set up the source of your Point Data:

  1. ComfyUI Spline Editor

  2. Coordinate data from external tools such as Blender or After Effects (plugins available)

  3. Spline-Path-Control Web interface

If you choose option 2, use this text field to paste your coordinate data. Below that field, you'll find the "Create Shape Image on Path" node. Here, you need to enter the resolution of the project your coordinates originated from. We limited the coordinate resolution to a maximum of 720p for performance reasons, as higher resolutions can significantly increase the processing time of this node.

If you have created coordinates in a high resolution, you can easily convert them to 720p using this web converter tool.
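
If you'd rather convert the coordinates yourself, the rescaling is simple. Here is a minimal sketch, assuming the common ComfyUI coordinate format (a JSON list of {"x": ..., "y": ...} points); adjust the parsing if your export differs:

```python
# Sketch: rescale exported point coordinates to a 720p canvas.
# Assumes coordinates are a JSON list of {"x": ..., "y": ...} dicts.
import json

def rescale_coords(coords_json: str, src_w: int, src_h: int,
                   dst_w: int = 1280, dst_h: int = 720) -> str:
    points = json.loads(coords_json)
    for p in points:
        p["x"] = round(p["x"] * dst_w / src_w)
        p["y"] = round(p["y"] * dst_h / src_h)
    return json.dumps(points)

# Example: coordinates exported from a 4K Blender/After Effects project.
src = '[{"x": 1920, "y": 1080}, {"x": 2400, "y": 900}]'
print(rescale_coords(src, src_w=3840, src_h=2160))
# -> [{"x": 640, "y": 360}, {"x": 800, "y": 300}]
```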

5. MANUAL POINTS

In the "Manual Points" section, the first frame of the video is loaded. You can then create splines on this frame, and the workflow will automatically generate point data for the total frame range based on these splines.
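
Conceptually, this turns a path you draw once into one point per frame. Here is a minimal sketch of the idea using straight-line interpolation (the actual spline editor uses smooth curves):

```python
# Sketch: resample a drawn path into one point per frame.
import numpy as np

def path_to_frame_points(control_points, num_frames):
    """Evenly resample a polyline into num_frames points along its length."""
    pts = np.asarray(control_points, dtype=float)
    seg_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    dist = np.concatenate([[0.0], np.cumsum(seg_lengths)])  # cumulative arc length
    targets = np.linspace(0.0, dist[-1], num_frames)
    x = np.interp(targets, dist, pts[:, 0])
    y = np.interp(targets, dist, pts[:, 1])
    return list(zip(x, y))

# A three-point path stretched across an 81-frame batch:
points = path_to_frame_points([(100, 500), (400, 300), (900, 350)], num_frames=81)
print(points[0], points[40], points[-1])
```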

6. SPLINE PATH CONTROL

If you are using the SPLINE PATH CONTROL tool, enable this section and upload your exported video there.

Pay attention to the recommended shape settings in the Editor:

  • Color: Use 100% red

  • Width: 4

  • Size: 64 × 64
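
These settings matter because a fully saturated red shape is trivial to isolate by color. A conceptual OpenCV sketch of that idea (the workflow's actual tracking may differ):

```python
# Sketch: find the center of a pure-red marker in one exported frame.
import cv2
import numpy as np

frame = cv2.imread("exported_frame.png")  # BGR frame from your exported video
b, g, r = cv2.split(frame)
mask = ((r > 200) & (g < 50) & (b < 50)).astype(np.uint8) * 255

M = cv2.moments(mask)
if M["m00"] > 0:
    cx, cy = M["m10"] / M["m00"], M["m01"] / M["m00"]
    print(f"Marker center: ({cx:.0f}, {cy:.0f})")
else:
    print("No red marker found - check the shape color settings")
```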

Inpainting vs. ControlNet

From this point, the workflow splits depending on which method you selected earlier in the “Video Input & Size Selection” section:

  • If you chose the Inpainting method, continue with the Inpainting Setup section.

  • If you selected the First Frame plus ControlNet method, move on to the Control Section at the bottom of the graph.

7. INPAINTING SETUP

In this section, you can choose between two options (1): an animated inpainting mask or a manual static mask.

The animated inpainting mask is generated from the animated points. The grey masked area (2) indicates where the character from your reference image will be generated.

Use the "GrowMask" node (3) to expand or shrink the masked area. Be cautious though—large values can significantly increase processing time.
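
Under the hood, growing or shrinking a mask is morphological dilation or erosion, which also explains the processing cost: each extra pixel of growth is another pass over every frame's mask. A conceptual sketch (not the node's actual implementation):

```python
# Sketch: grow (dilate) or shrink (erode) a binary mask, like GrowMask.
import numpy as np
from scipy import ndimage

def grow_mask(mask: np.ndarray, pixels: int) -> np.ndarray:
    if pixels > 0:
        return ndimage.binary_dilation(mask, iterations=pixels)
    if pixels < 0:
        return ndimage.binary_erosion(mask, iterations=-pixels)
    return mask

mask = np.zeros((64, 64), dtype=bool)
mask[28:36, 28:36] = True                     # an 8x8 masked square
print(mask.sum(), grow_mask(mask, 4).sum())   # grown mask covers more pixels
```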

Alternatively, you can draw a static mask on the first frame (4). With this option, the figure will only appear inside your manually defined area.

8. CONTROL

If your scene includes another humanoid figure, the DWPose Estimator will generate an additional Pose ControlNet.

You have three options:

  1. Pose + Points

  2. Points

  3. Pose

This allows you to flexibly choose how the motion is controlled.

9. START FRAME

Choose the first image for generation here:

  1. Use first frame of video - great if your character appears from behind an object or a person.

  2. Use first frame with inpainting mask - the best option if the character is already visible in the frame.

  3. Import your own Start Frame - if you're using the First Frame plus ControlNet workflow, you can upload a prepared frame where the figure has already been inserted (e.g., using the Flux Inpainting Workflow).

10. STACKED ITERATIVE VIDEO GENERATION GROUPS

To overcome Wan 2.1's 81-frame limit, we introduce an iterative video generation approach. Each iteration group generates a portion of the video according to the batch size, processing batch after batch as configured in the batch_size node of the VIDEO INPUT & SIZE SELECTION section.

The process works as follows: It first generates the initial batch (e.g., 81 frames). Then, it automatically takes the last frame of that sequence and uses it as the starting point for the next batch. This process continues until all frames are generated, seamlessly "stitching" together multiple segments into one continuous video.
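
In plain Python terms, the loop looks roughly like this. All helper names are hypothetical stand-ins for what the ComfyUI node groups do, and whether the seed frame is re-rendered or skipped is an internal detail; here we drop it to avoid a duplicated frame:

```python
# Conceptual sketch of the iterative stitching loop (hypothetical helpers;
# the real workflow wires this up with ComfyUI node groups, not Python).
import numpy as np

def generate_batch(start_frame, num_frames, prompt, seed):
    """Stand-in for one Wan 2.1 VACE sampling pass; returns dummy frames here."""
    return [start_frame.copy() for _ in range(num_frames)]

total_frames = 200                                 # length of the full shot
batch_size = 81                                    # Wan 2.1's per-pass limit
start = np.zeros((576, 1024, 3), dtype=np.uint8)   # first frame of the video

stitched = []
while len(stitched) < total_frames:
    batch = generate_batch(start, batch_size, prompt="...", seed=42)
    if stitched:
        batch = batch[1:]      # drop the seed frame so it isn't duplicated
    stitched.extend(batch)
    start = stitched[-1]       # last frame seeds the next iteration

stitched = stitched[:total_frames]
print(f"Stitched {len(stitched)} frames from multiple 81-frame batches")
```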

To process and prepare each batch, we have set up support groups for a maximum of 5 iterations. Feel free to duplicate and extend the nodes if you want to create more than 5 iterations of video generation.

Each video generation iteration group shares the same structure and functions as follows:

  1. A Bypass Node to temporarily deactivate each iteration, which helps when adjusting earlier settings or if you want to review the current iteration before continuing to the next.

  2. A Prompt Node, where detailed and descriptive prompts are essential for high-quality figure integration.

  3. The Video Sampler, set to 8 steps by default. You can reduce this to 2 steps for faster previews and seed testing.

  4. A color match node that ensures consistency when merging each batch.

  5. The generated video of this iteration, for review.

  6. The generated video with the point animation overlay, for review.

  7. The final merged video containing all generated clips from each iteration section. Enable save_output only for the last iteration batch you use, as saving each batch is unnecessary.

To maintain control over the generated video, we recommend a methodical approach. Work through each section in detail by disabling subsequent iteration groups until you're satisfied with the current one. For each batch, adjust the prompt and test different seeds using just 2 sampling steps until you achieve the desired result. Then generate this video iteration at final quality using 8 steps, proceed to the next iteration group, and repeat this process. This methodical approach gives you precise control to achieve exactly the results you want.

© 2025 Mickmumpitz
