
Meta open-sources new language models built for multi-token prediction. #NLP

Meta Platforms Inc. has released four open-source language models that implement a machine learning approach called multi-token prediction. Instead of generating one token at a time, these models generate four, an approach aimed at making large language models both faster and more accurate. The models are designed for code generation tasks, each has 7 billion parameters, and they were trained on large datasets of code samples. Meta also developed a fifth model with 13 billion parameters. Architecturally, each model consists of a shared trunk that performs the initial computations and four output heads that each generate one token, so a single forward pass produces four tokens at once.
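The trunk-plus-heads structure can be sketched in a few lines. The sketch below is illustrative only: it stands in for the transformer trunk and heads with simple linear layers, and all names, sizes, and the `tanh` activation are assumptions, not details of Meta's models.

```python
import numpy as np

rng = np.random.default_rng(0)
D, V, N_HEADS = 16, 100, 4  # hidden size, vocab size, head count (illustrative)

# Shared trunk: stand-in for the transformer body (here a single linear layer).
W_trunk = rng.normal(size=(D, D))
# One output projection per head; head i predicts the token at position t + i + 1.
W_heads = [rng.normal(size=(D, V)) for _ in range(N_HEADS)]

def predict_next_tokens(x):
    """x: (D,) embedding of the current context -> four predicted token ids."""
    h = np.tanh(W_trunk @ x)          # trunk runs once per decoding step
    return [int(np.argmax(h @ W)) for W in W_heads]  # each head emits one token

tokens = predict_next_tokens(rng.normal(size=D))
assert len(tokens) == N_HEADS  # four tokens from a single forward pass
```

The key property the sketch preserves is that the expensive trunk computation happens once per step and is shared by all four heads, which is what makes emitting four tokens per pass cheap.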

The company’s researchers believe that multi-token prediction may improve code quality by mitigating the limitations of traditional teacher-forcing training methods. In Meta’s benchmark tests, the new models outperformed conventional one-token-at-a-time models in both accuracy and speed: they scored 17% and 12% better on coding tasks and generated output three times faster.
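The article does not explain where the threefold speedup comes from, but a rough accounting makes it plausible: if each forward pass can emit up to four tokens and only some of the extra tokens survive downstream checks, the number of passes per output drops accordingly. The acceptance rate below is a number chosen purely to illustrate the arithmetic, not a figure from Meta.

```python
# Illustrative accounting only, not Meta's measurement: each pass emits
# 1 guaranteed token plus (k - 1) extra tokens, of which a fraction
# `accept` is assumed to be usable.

def passes_needed(n_tokens, k, accept):
    tokens_per_pass = 1 + accept * (k - 1)  # expected tokens kept per pass
    return n_tokens / tokens_per_pass

baseline = passes_needed(1000, 1, 0.0)   # one token per forward pass
multi = passes_needed(1000, 4, 0.67)     # four heads, ~2/3 of extras kept
print(baseline / multi)                  # roughly 3x fewer forward passes
```

Under these assumed numbers, the multi-head decoder needs about a third as many forward passes for the same output length, matching the order of the reported speedup.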

Overall, Meta’s new language models represent a significant advancement in machine learning for code generation tasks. The company’s research suggests that the multi-token prediction approach may offer improvements over traditional training methods, leading to higher-quality code generation.

Source link: https://siliconangle.com/2024/07/04/meta-open-sources-new-multi-token-prediction-language-models/
