Movie Gen: Changing the look of a video to show pompoms, a dinosaur costume, or a desert backdrop
Nearly two years after powerful AI image and video generators hit the mainstream, AI companies have pushed the technology further: over just the last six months, major tech companies like Google and OpenAI have been working on similar tools, along with smaller startups. OpenAI’s Sora, first announced in February, still hasn’t launched publicly; this week, a co-lead working on the video generator left the company for Google.
Movie Gen can also be used to edit existing footage: changing its look, adding things that weren’t there before, and altering styles and transitions. In one example Meta shared, a video of a runner is altered so that he is holding pompoms; in another, the background is swapped for a desert; in a third, the runner is wearing a dinosaur costume. All of these edits are made with text prompts.
Meta’s chief product officer, Chris Cox, wrote on Threads that the company “[isn’t] ready to release this as a product anytime soon,” as it’s still expensive and generation time is too long.
What Movie Gen Was Trained On, and What to Expect From AI Generative Tools
What was Movie Gen trained on? The specifics aren’t clear in Meta’s announcement post: “We’ve trained these models on a combination of licensed and publicly available data sets.” The sources of training data and what’s fair to scrape from the web remain a contentious issue for generative AI tools, and it’s rarely public knowledge what text, video, or audio clips were used to create any of the major models.
Creatives like filmmakers, photographers, artists, writers, and actors also worry about how AI generators will affect their livelihoods, and AI has been a central issue in several strikes, including last year’s historic joint Hollywood strikes by the Screen Actors Guild – American Federation of Television and Radio Artists (SAG-AFTRA) and the Writers Guild of America (WGA).
To demonstrate Movie Gen’s capabilities, the company shared multiple 10-second clips generated with the tool, including a Moo Deng-esque baby hippo swimming around. Movie Gen is not yet available for use; its reveal comes on the heels of Meta Connect, the company’s event showcasing new hardware and the latest version of its large language model.
Movie Gen can also generate audio alongside the videos. In the sample clips, an AI-generated man stands near a waterfall with audible splashes and the sound of a symphony; a sports car’s engine roars and its tires screech as it zips around a track; and a snake slides along the jungle floor.
Meta shared some further details about Movie Gen in a research paper released Friday. Movie Gen Video consists of 30 billion parameters, while Movie Gen Audio consists of 13 billion parameters. (For comparison, the largest variant of Llama 3.1 has 405 billion parameters; parameter count roughly corresponds to how capable a model is.) Movie Gen can produce high-definition videos up to 16 seconds long, and Meta claims it outperforms competing models in overall video quality.
Earlier this year, Mark Zuckerberg showed off Meta AI’s Imagine Me feature, which lets users insert their likeness into multiple scenarios and post the resulting images on Threads. Movie Gen offers a video version of the feature; think of it as ElfYourself on steroids.
Considering Meta’s legacy as a social media company, it’s possible that tools powered by Movie Gen will eventually start popping up inside Facebook, Instagram, and WhatsApp. In September, Google shared plans to make aspects of its Veo video model available to creators inside YouTube Shorts.
While larger tech companies are still holding off on fully releasing their video models to the public, you can experiment with AI video tools right now from smaller, up-and-coming startups like Runway and Pika. If you’ve ever been curious what it would be like to be crushed by a mechanical press or melted into a puddle, give Pikaffects a try.