Create Controllable Characters for your AI Movies

[FULL GUIDE]

Jul 17, 2025

by Mickmumpitz

Learn the game-changing free AI VFX workflow and discover how to add controllable creatures & consistent characters to any footage using just one reference image and Wan 2.1 VACE in ComfyUI. This tutorial walks you through cutting-edge AI animation techniques that compete with expensive studio methods, including motion control using Blender and After Effects. We'll show you these breakthrough AI video generation methods by creating a complete smartphone-shot short film featuring a magical crystal-cat-dragon. Whether you're making YouTube content, social media videos, or professional projects, these AI-powered VFX techniques will transform your creative process and help you produce high-quality creature animations using completely free, open-source AI tools.

🎨 Workflow Sections

🟨 Important Notes
⬜ Input / Output / Model Loaders
🟩 Preparation
🟪 Video Generation

Installation

Download the .json file and drag and drop it into your ComfyUI window.
Install the missing custom nodes via the manager and restart ComfyUI.

Download Models

Wan2_1-T2V-14B_fp8_e5m2.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1-T2V-14B_fp8_e5m2.safetensors
📁 ComfyUI/models/diffusion_models

Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors
📁 ComfyUI/models/loras

Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors
📁 ComfyUI/models/diffusion_models

umt5-xxl-enc-bf16.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/umt5-xxl-enc-bf16.safetensors
📁 ComfyUI/models/text_encoders

Wan2_1_VAE_bf16.safetensors:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/Wan2_1_VAE_bf16.safetensors
📁 ComfyUI/models/vae
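
If you'd rather script these downloads than click through the links, here is a minimal sketch using the huggingface_hub Python package. The package, the folder layout, and the COMFYUI path are assumptions on our side; adjust them to your installation:

```python
# download_models.py - fetch all five models into the ComfyUI folders.
# Assumes `pip install huggingface_hub`; adjust COMFYUI to your install path.
from pathlib import Path
from huggingface_hub import hf_hub_download

COMFYUI = Path("ComfyUI")  # your ComfyUI root
REPO = "Kijai/WanVideo_comfy"

MODELS = {
    "Wan2_1-T2V-14B_fp8_e5m2.safetensors": "models/diffusion_models",
    "Wan21_CausVid_14B_T2V_lora_rank32_v2.safetensors": "models/loras",
    "Wan2_1-VACE_module_14B_fp8_e4m3fn.safetensors": "models/diffusion_models",
    "umt5-xxl-enc-bf16.safetensors": "models/text_encoders",
    "Wan2_1_VAE_bf16.safetensors": "models/vae",
}

for filename, subdir in MODELS.items():
    target = COMFYUI / subdir
    target.mkdir(parents=True, exist_ok=True)
    hf_hub_download(repo_id=REPO, filename=filename, local_dir=str(target))
    print(f"{filename} -> {target}")
```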

Before You Begin:

Thank you for considering supporting us! Since these workflows can be complex, we recommend testing the free versions first to ensure compatibility with your system. We cannot guarantee full compatibility with every system, which is why we always provide the main functionality for free!

Please take a moment to read through the entire guide. If you encounter any issues:

  1. Check the troubleshooting section at the end of the guide

  2. If problems persist, visit our Discord's #need-help channel and use the search function—many common issues have already been resolved

  3. If you cannot find a solution there, ask and we will try to help you. Give as much information as possible, including screenshots and the exact error message.

1. MODEL LOADERS

In this section, you can download the required models using the links on the left side. Additionally, you have the option to increase the blocks_to_swap value in the “Video Block Swap” step if you're working with limited VRAM.

2. VIDEO INPUT & SIZE SELECTION

We offer three predefined resolution settings. By default, 576p is selected, as it provides a good balance between quality and generation time and also tends to produce great animation (though this is quite random, as we all know). You can also set the switch to 1 for 720p or 3 for 504p.

You can define how many frames should be generated in the “frames” node. The “skip frames” option allows you to offset the video if needed.

At the bottom right, you can select whether to use the Inpainting method or the Start Frame + ControlNet method.

3. REFERENCE IMAGE

Upload the reference image that contains the character you want to integrate into the video. Nothing else is needed in this step – just load the image and you’re good to go.

4. POINT SYSTEM

The “Point System” section offers multiple options for defining point data for your animation.

In the center, you can set up the source of your Point Data:

If you choose option 2, use this text field to paste your coordinate data. Below that field, you'll find the "Create Shape Image on Path" node. Here, you need to enter the resolution of the project your coordinates originated from. We limited the coordinate resolution to a maximum of 720p for performance reasons; higher resolutions can significantly increase the processing time of this node.

If you have created coordinates in a high resolution, you can easily convert them to 720p using this web converter tool.
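
If you want to do the conversion yourself instead, here is a minimal Python sketch. It assumes your coordinate data is a JSON list of {"x": ..., "y": ...} points (verify this against your own exported data) and simply rescales each point from the source resolution to 1280 × 720:

```python
# rescale_coords.py - scale exported point coordinates down to a 720p canvas.
# Assumes the data is a JSON list like [{"x": 1920, "y": 1080}, ...];
# check this against your own export before using.
import json

def rescale(points, src_w, src_h, dst_w=1280, dst_h=720):
    sx, sy = dst_w / src_w, dst_h / src_h
    return [{"x": round(p["x"] * sx, 2), "y": round(p["y"] * sy, 2)} for p in points]

with open("coords_4k.json") as f:  # e.g. coordinates created at 3840x2160
    points = json.load(f)
print(json.dumps(rescale(points, 3840, 2160)))
```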

5. MANUAL POINTS

In the "Manual Points" section, the first frame of the video is loaded. You can then create splines on this frame. Based on these splines, it will automatically generate point data for the total frame range.

6. SPLINE PATH CONTROL

If you are using the SPLINE PATH CONTROL tool, enable this section and upload your exported video there.

Pay attention to the recommended shape settings in the Editor:

  • Color: Use 100% red

  • Width: 4

  • Size: 64 × 64

Inpainting vs. ControlNet

From this point, the workflow splits depending on which method you selected earlier in the “Video Input & Size Selection” section:

  • If you chose the Inpainting method, continue with the Inpainting Setup section.

  • If you selected the Start Frame + ControlNet method, move on to the Control Section at the bottom of the graph.

7. INPAINTING SETUP

In this section, you can choose (1) between two options: an animated inpainting mask or a manual static mask.

The animated inpainting mask is generated from the animated points. The grey masked area (2) indicates where the character from your reference image will be generated.

Use the "GrowMask" node (3) to expand or shrink the masked area. Be cautious though—large values can significantly increase processing time.

Alternatively, you can draw a static mask on the first frame (4). With this option, the figure will only appear inside your manually defined area.

8. CONTROL

If your scene includes another humanoid figure, the DWPose Estimator will generate an additional Pose ControlNet.

You have three options:

  1. Pose + Points

  2. Points

  3. Pose

This allows you to flexibly choose how the motion is controlled.

9. START FRAME

Choose the first image for the generation here:

  1. Use first frame of video - great if your character appears from behind an object or a person.

  2. Use first frame with inpainting mask - the best option if the character is already visible in the frame.

  3. Import your own Start Frame - if you're using the Start Frame + ControlNet workflow, you can upload a prepared frame where the figure has already been inserted (e.g., using the Flux Inpainting Workflow).

10. VIDEO GENERATION

In the final section, you'll find:

  1. a Bypass node to temporarily deactivate the generation, which helps when adjusting earlier settings

  2. a Prompt Node, where it's best to use detailed and descriptive prompts to achieve high-quality figure integration

  3. the Video Sampler, which is set to 8 steps by default. You can lower this to 2 steps for faster previews and seed testing.

  4. On the right side, you can preview how the generated video behaves with the point system.

At the very end, you'll see the final video output without the overlaid point data.

© 2025 Mickmumpitz
