VASA-1 can take a single image and one audio clip and turn them into a video of that person talking. It's not just the lips moving to match the audio; it's the entire face. The head movements ...