Kling just dropped their first native audio model, VIDEO 2.6, and it's insane.
Previously, Kling's video models could only generate "silent visuals".
But now... native audio, insane realism, stronger understanding, and full audio control.
More examples below.
This update introduces a groundbreaking "Native Audio" capability. The model completely transforms the traditional AI video workflow of "first generating silent visuals, then manually adding voiceovers and sound effects."
By deeply aligning the semantics of sounds and dynamic visuals from the physical world, VIDEO 2.6 enables the end-to-end generation of complete videos in a single go...
1. Text-to-Audio-Visual
Simply input text to generate a video that includes voice, sound effects, and ambient sounds.
Prompt: A young Asian woman, casually dressed, sitting on a sofa in a cozy living room, softly saying: “I have a secret, Kling 2.6 is coming.”