The Motion Brush Feature through Kling API

PiAPI

Introduction

Kling's Motion Brush is a newly introduced tool, launched by Kuaishou on September 19, 2024, alongside the release of Kling 1.5.

Kling's announcement of the release of Kling 1.5 and the Kling Motion Brush on their official website

PiAPI's Host-Your-Account users already have access to the Kling 1.5 API. The motion brush feature, however, is expected to become available to our API users around October 8, 2024.

In this blog, we’ll dive into what the motion brush tool is and compare the results with and without it to see just how much it enhances the AI's performance.

What is the Kling Motion Brush Feature?

Kling Motion Brush is a new feature in Kling’s image-to-video generative AI, currently available only in Kling 1.0 and not yet supported in Kling 1.5. This tool enables users to set movement paths for specific elements in the video.

By brushing over an area manually or using auto-segmentation, you can select the object in the image you want to move, then choose a path to set its movement directions.

In addition to setting movement paths, users can also apply the static brush, which stops the selected parts from moving.
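
To make the idea concrete, a motion brush setup can be thought of as a small data structure: a list of brushed regions, each with an optional movement path, where a region without a path acts as a static brush area. The Python sketch below is only an illustration of that idea. The class and field names (`BrushRegion`, `MotionBrushConfig`, `mask_points`, `trajectory`) and the coordinates are our own assumptions, not Kling's or PiAPI's actual request schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates in the input image


@dataclass
class BrushRegion:
    """One brushed area of the image (hypothetical representation)."""
    label: str
    mask_points: List[Point]  # outline of the brushed area
    trajectory: List[Point] = field(default_factory=list)  # empty => static brush

    @property
    def is_static(self) -> bool:
        # A region with no movement path is treated as a static-brush area.
        return len(self.trajectory) == 0


@dataclass
class MotionBrushConfig:
    """All brushed regions for one input image (hypothetical representation)."""
    regions: List[BrushRegion]

    def validate(self) -> None:
        for region in self.regions:
            if len(region.mask_points) < 3:
                raise ValueError(f"{region.label}: a mask needs at least 3 points")
            if len(region.trajectory) == 1:
                raise ValueError(f"{region.label}: a path needs at least 2 points")

    def moving(self) -> List[str]:
        return [r.label for r in self.regions if not r.is_static]

    def static(self) -> List[str]:
        return [r.label for r in self.regions if r.is_static]


# Illustrative values only: one region that moves to the right, one held still.
config = MotionBrushConfig(regions=[
    BrushRegion("character", [(100, 200), (180, 200), (140, 380)],
                trajectory=[(140, 300), (300, 300)]),
    BrushRegion("background sign", [(10, 10), (60, 10), (60, 40)]),
])
config.validate()
print(config.moving())  # ['character']
print(config.static())  # ['background sign']
```

Treating "no path" as "static" keeps the two brush types in a single structure, which mirrors how the tool lets you mix moving and frozen areas in one image.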

In theory, this level of control allows users to create more dynamic motion, bringing still images to life with more precise movement. If you are interested in learning how to use the tool more effectively, we recommend checking out Kuaishou's Official Motion Brush User Guide.

With this in mind, we'll now examine the image-to-video evaluation framework which serves as the foundation for analyzing the effectiveness of the Kling Motion Brush.

Image-to-Video Evaluation Framework

For this comparison, we adopted AIGCBench (Artificial Intelligence Generated Content Bench), an evaluation framework designed for AI-generated image-to-video content.

Although the framework is designed for automated evaluation, we've adapted it into a manual process for human evaluation. Below are the four criteria the framework uses, along with an explanation of each.

Control-Video Alignment

We assess how closely the output video aligns with the provided text prompt and image. This benchmark is essentially the same as the "prompt adherence" metric from our previous blog comparing Luma Dream Machine 1.5 vs 1.0.

Motion Effects

We evaluate whether the motion in the video is dynamic, realistic, smooth, and consistent with real-world physics.

Temporal Consistency

We'll assess whether adjacent frames show high coherence, maintain continuity throughout the video, and remain free of any visible artifacts, distortions, and errors.

Video Quality

This metric is straightforward: we check whether the video has a high resolution and look for any blurring.
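
For readers who want to run a similar manual review, the four criteria can be written down as a simple rubric. The Python sketch below is one possible way to do that. The 1 to 5 scale, the `Review` class, and the sample scores are assumptions for illustration; our own evaluation in this blog is qualitative and does not assign numeric scores.

```python
from dataclasses import dataclass
from typing import Dict

# The four criteria used in this comparison.
CRITERIA = (
    "Control-Video Alignment",
    "Motion Effects",
    "Temporal Consistency",
    "Video Quality",
)


@dataclass
class Review:
    """One reviewer's scores for one video (hypothetical 1-5 scale)."""
    workflow: str
    scores: Dict[str, int]

    def __post_init__(self) -> None:
        # Every criterion must be scored, and every score must be in range.
        missing = set(CRITERIA) - set(self.scores)
        if missing:
            raise ValueError(f"missing criteria: {sorted(missing)}")
        for criterion, score in self.scores.items():
            if not 1 <= score <= 5:
                raise ValueError(f"{criterion}: score {score} is outside 1-5")

    def overall(self) -> float:
        # Unweighted average across the four criteria.
        return sum(self.scores[c] for c in CRITERIA) / len(CRITERIA)


# Placeholder scores for demonstration only, not results from this blog.
sample = Review(
    workflow="sample clip",
    scores={
        "Control-Video Alignment": 4,
        "Motion Effects": 5,
        "Temporal Consistency": 3,
        "Video Quality": 4,
    },
)
print(sample.overall())  # 4.0
```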

Same Prompt Comparison

With the evaluation framework established, let's now outline our comparison method. We will have three comparison examples, all based on popular cultural trends. Within each example, we will compare the output of three different workflows shown below:

  • Workflow 1 - Kling 1.0 without Motion Brush
  • Workflow 2 - Kling 1.0 with Motion Brush
  • Workflow 3 - Kling 1.5 without Motion Brush

All workflows within each example will use the same prompt and input image. We will express how we want the object to move with clear textual commands in the prompt. Workflows 1 and 3 rely only on the prompt, whereas Workflow 2 relies on both the prompt and the drawn motion path. This lets us measure how much improvement the motion brush feature brings.

Note that Kuaishou's official guide (Motion Brush User Guide) also advises that when using the motion brush feature, the user should still specify the desired movement in the prompt for optimal performance. Our Workflow 2 adheres to this recommended practice.
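
The comparison setup can be summarized as a small test matrix: the prompt and image are held constant, and only the model version and the presence of motion brush settings vary. The Python sketch below builds that matrix. It does not call the Kling API; the `Workflow` class, the `build_runs` helper, the dictionary keys, the file name, and the brush description string are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Workflow:
    name: str
    model_version: str
    uses_motion_brush: bool


# The three workflows compared in each example.
WORKFLOWS = [
    Workflow("Workflow 1", "1.0", uses_motion_brush=False),
    Workflow("Workflow 2", "1.0", uses_motion_brush=True),
    Workflow("Workflow 3", "1.5", uses_motion_brush=False),
]


def build_runs(prompt: str, image_path: str,
               brush_settings: Optional[str]) -> List[dict]:
    """Return one run description per workflow, sharing prompt and image."""
    runs = []
    for wf in WORKFLOWS:
        runs.append({
            "workflow": wf.name,
            "model_version": wf.model_version,
            "prompt": prompt,        # identical across all workflows
            "image": image_path,     # identical across all workflows
            # Only the motion brush workflow receives the drawn path.
            "motion_brush": brush_settings if wf.uses_motion_brush else None,
        })
    return runs


# "sonic.png" and the brush description are placeholder values.
runs = build_runs("Sonic walks offscreen to the right", "sonic.png",
                  "path drawn toward the right edge")
for run in runs:
    print(run["workflow"], run["model_version"], run["motion_brush"])
```

Keeping the prompt and image fixed in one place makes it harder to accidentally change more than one variable between workflows.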

Now that you're familiar with our approach, let's jump right into the examples.

Example 1: Joker Folie à Deux

We are very excited about the upcoming release of Joker 2, so for this example we've chosen to revisit the iconic stair scene from the original Joker film.

Below is the image that we found of Joaquin Phoenix's Joker on the stairs, which we'll use with a prompt to generate the videos for the three workflows in this example.

The image of Joaquin Phoenix's Joker on the stairs that will be used as the input to Kling API

The screenshot of the motion brush settings that we used for Workflow 2 (the Kling 1.0 model with Motion Brush) is shown as follows. As the green path illustrates, we intend the Joker to turn around and walk up the stairs.

The motion brush settings for the image of Joaquin Phoenix's Joker, that will be used as input to Kling API (Workflow 2)

And here are the output videos of the three workflows, with the prompt used shown in the description.

A GIF of the comparison between Kling 1.0, Kling 1.0 with motion brush, and Kling 1.5 of Joaquin Phoenix's Joker walking up the stairs
A comparison of the three workflows of the prompt: "The camera slowly zooms out as Joker turns around, back facing the camera as he walks up the stairs."

As discussed in the evaluation framework section, the following is our analysis:

Control-Video Alignment

All three videos have the Joker walking up the stairs with his back facing the camera. But none show the slow zoom-out specified in the prompt.

Motion Effects

All three versions have smooth, realistic motion that is consistent with real-world physics, without any abrupt movements.

Temporal Consistency

The videos are free of flickering, abrupt transitions, or artifacts, maintaining high frame-to-frame consistency, although the Joker seems to turn around a bit more slowly than usual in the first video.

Video Quality

While Kling 1.5 provides 1080p, the other two are limited to 720p. None of the videos above show any signs of blurring.


Overall, there doesn't seem to be much difference between the three workflows in this example across the four criteria.

Example 2: NFL Wallpaper

The NFL season just kicked off, and we’re having so much fun watching it that we decided to create animated NFL wallpapers for this example.

Below is an image of a Dallas Cowboys wallpaper generated using Midjourney API, which we'll use with a prompt to generate the videos for the three workflows in this example.

PNG image of a Dallas Cowboys player holding a football against a blue and black background, generated by Midjourney API
Prompt: "A Dallas Cowboys NFL player in a blue background with paint effects moving, phone wallpaper, inspiring."

The screenshot below shows the motion brush settings we used for Workflow 2 (the Kling 1.0 model with Motion Brush). We've used the static brush to keep the NFL player in the center stationary.

The motion brush settings for the image of the Dallas Cowboys Wallpaper, that will be used as input into Kling API (Workflow 2)

And here are the video outputs for the three workflows.

A GIF of the comparison between Kling 1.0, Kling 1.0 with motion brush, and Kling 1.5 of an animated NFL Wallpaper
A comparison of the three workflows of the prompt: "Cinematic loop of an NFL player in focus, stationary in vibrant blue background. Swirling effects move dynamically behind him. Camera is stationary and static."

As discussed in the evaluation framework section, the following is our analysis:

Control-Video Alignment

Workflow 2 (Kling 1.0 w/ Motion Brush) delivers the best Control-Video Alignment, with the NFL player staying completely still and the background creating the most convincing loop. Workflow 1 (Kling 1.0 w/o Motion Brush) also performed well. But if you look closely, the NFL player has slight movements, and the background loop isn’t as smooth as in Workflow 2. Meanwhile, Workflow 3 (Kling 1.5 w/o Motion Brush) has poor alignment, as the NFL player’s visible movement prevents a seamless loop when used as a wallpaper.

Motion Effects

All three versions have realistic and smooth motion, with no unnatural movements visible.

Temporal Consistency

None of the three videos show flickering, abrupt transitions, or artifacts, maintaining high frame-to-frame coherence.

Video Quality

Kling 1.5 outputs in 1080p, unlike the other two, which are restricted to 720p. None of the videos display any blurring.

Overall, for creating animated wallpapers, using motion brush seems like the clear choice since it lets you control which parts remain static, resulting in a more seamless loop.

Example 3: Sonic the Hedgehog 3

As Sonic 3 races toward theaters, we picked a scene from the original movie as an example. Being longtime Sonic fans, we felt this was the perfect time to showcase him!

Below is an image that we found of Sonic, which we'll use with a prompt to generate the videos for the three workflows in this example.

The image of Sonic the Hedgehog that will be used as the input to Kling API

The screenshot below shows the motion brush settings that we used for Workflow 2 (the Kling 1.0 model with Motion Brush). As the green path illustrates, we intend Sonic to walk offscreen to the right.

The motion brush settings for the image of Sonic, which will be used as input into Kling API (Workflow 2)

And here are the output videos for the three workflows.

A GIF of the comparison between Kling 1.0, Kling 1.0 with motion brush, and Kling 1.5 of Sonic walking offscreen to the right.
A comparison of the three workflows of the prompt: "Sonic walks offscreen to the right"

As discussed in the evaluation framework section, the following is our analysis:

Control-Video Alignment

Both Workflow 1 (Kling 1.0 w/o Motion Brush) and Workflow 2 (Kling 1.0 with Motion Brush) show strong Control-Video Alignment, as Sonic walks offscreen to the right in both cases. In contrast, Workflow 3 (Kling 1.5 w/o Motion Brush) shows the lowest level of Control-Video Alignment, with Sonic disappearing offscreen instead of walking off.

Motion Effects

Workflow 2 (Kling 1.0 with Motion Brush) performs best in this criterion, as Sonic walks offscreen naturally. In Workflow 3 (Kling 1.5 w/o Motion Brush), Sonic’s movements are natural, but he disappears rather than walking away. However, in Workflow 1 (Kling 1.0 w/o Motion Brush), Sonic’s legs shorten, and his body morphs unnaturally before walking offscreen.

Temporal Consistency

Workflows 1 and 2 maintain strong temporal consistency without any abrupt disruptions. However, Workflow 3 contains a sudden transition where Sonic disappears, and a red car takes his place.

Video Quality

Kling 1.5 provides 1080p resolution, while the other two are restricted to 720p. All three videos are free from blurring.

In this example, Workflow 2 (Kling 1.0 with Motion Brush) has the best output: Sonic’s movements remain fluid and natural, avoiding the distortion and disappearance seen in the other workflows.

Conclusion

Based on the three examples above, it's clear that using Motion Brush with Kling 1.0 offers the best control over image elements, outperforming the other options in Control-Video Alignment, Motion Effects, and Temporal Consistency, while Kling 1.5 still holds a slight advantage in video quality. This isn't always the case, though: as the first example shows, the difference in output quality can sometimes be minimal.

We are excited to see how Kling's Motion Brush tool will evolve in the future. Even in its initial version, this tool provides powerful, precise control over image elements for image-to-video AI generation. We can't wait to see how it will turn out once the motion brush tool is added to Kling 1.5!

We hope that you found our comparison useful!

And if you are interested, check out our collection of generative AI APIs from PiAPI!

