A deep dive into a massive video model #AIModel

Beyond Language: Inside a Hundred-Trillion-Token Video Model

In this episode of the AI + a16z podcast, Luma Chief Scientist Jiaming Song discusses his career in video models and the release of Luma’s Dream Machine 3D video model with a16z General Partner Anjney Midha. The model demonstrates reasoning capabilities, which the episode attributes to training on a large volume of high-quality video data. Jiaming explains the “bitter lesson” of training generative models: using more compute matters more than building in priors. He describes the shift toward deep learning features in language and vision tasks and stresses the limitations of language data compared with visual data. Scaling up data for language models is challenging, he argues, because sources of high-quality language data are limited. He suggests that language itself acts as a prior when set against the richer data signals available from the physical world. The discussion also covers the future of multimodal models and the potential for using more compute to enhance AI capabilities.

Source link: https://a16z.com/podcast/beyond-language-inside-a-hundred-trillion-token-video-model/
