New AI tool can generate videos from text inputs, and it's cool and scary

Things are about to get very weird very fast.

Mihai Andrei
October 3, 2022 @ 3:09 pm

In recent weeks, image-generating AIs have bloomed and have shown capability beyond anything we could have expected a few years ago. Now, algorithms are ready to take things to the next level and start producing videos — and a new AI seems capable of doing just that.

This video was generated by an algorithm.

Imagine “a dog wearing a Superhero outfit with red cape flying through the sky.” That’s all the text input you need to produce the clip above. Meta’s dryly-named Make-A-Video AI can generate short videos from only text, and while the effect is still rather crude, it’s definitely a remarkable achievement.

Make-A-Video is not available to the public yet (Meta says it will launch it officially in November), but it seems to work just like the image-generating AIs: you add in a text prompt, make it as descriptive as you wish, and then wait for the video.

The technology behind Make-A-Video builds on existing work used in text-to-image synthesis. In fact, just a couple of months ago, Meta announced its own text-to-image AI model called Make-A-Scene.

Producing videos instead of images is much more challenging, though. From an AI engine’s perspective, a video is just a series of hundreds or thousands of images, which means, for starters, that you need to train your engine with much more data. Large-scale video datasets that can be used for training are also much scarcer than image datasets. This means that, for the near future at least, video AIs will likely be restricted to big companies with a lot of resources.

Nevertheless, Meta’s AI already seems pretty competent. The company has showcased videos made in several styles, such as Surreal, Realistic, or Stylized.

“A young couple walking in heavy rain” — Realistic.

There’s plenty of room for improvement, but the engine already seems capable of incorporating different video angles and styles. The videos don’t exactly seem realistic, but they’re not that far off either.

“Horse drinking water” — Realistic.

It’s still early days and the videos are unidimensional: the subjects are doing one thing. Doing a sequence of things (and transitioning from one thing to the other) will undoubtedly be a major challenge, but given how fast the field is progressing, it’s easy to imagine that realistic videos are not far off.

Which raises an important question: are we nearing the point of realistic, convincing deepfakes?

Great progress, great concerns

It’s a flourishing time for visual-generating AIs. In the last month alone, AI startup Stability.AI launched Stable Diffusion, an open-source text-to-image system that became immensely popular (its Discord channel has over 2 million users, making it the largest on the platform), and DALL-E, the first “new age” image-generating AI, became publicly available.

But while these algorithms filter out offensive or potentially damaging prompts, the possibility of using AI-generated images (and videos) for disinformation and other nefarious purposes remains. Image-generating AIs are already at that level (or very close to it), and existing safeguards may not hold back the flood for long.

This image was created by an AI (DALL-E).

Meta also acknowledged the hazards of creating photorealistic videos on demand. The company says it wants to counteract this by adding a watermark to “help ensure viewers know the video was generated with AI and is not a captured video.”

“We want to be thoughtful about how we build new generative AI systems like this. Make-A-Video uses publicly available datasets, which adds an extra level of transparency to the research. We are openly sharing this generative AI research and results with the community for their feedback, and will continue to use our responsible AI framework to refine and evolve our approach to this emerging technology,” Meta said in a blog post.

But a watermark does little, if anything: if you can build an AI to generate videos, then it won’t be too much of a problem to make one that removes the watermark.

Having the ability to generate videos is exciting for a number of reasons and no doubt, it’s going to get much better soon. Deepfakes are just around the corner, however. Perhaps it’s time to start thinking about safeguards.

The company also described their work in a non-peer-reviewed paper published today.
