
Meta open-sources pre-trained code generation models for 'multi-token prediction'

Meta has introduced an approach called 'multi-token prediction' for large language models (LLMs) that predicts several tokens in a single step, as opposed to the traditional method of predicting one token at a time. The approach aims to improve both the performance and the training efficiency of LLMs. On July 4, Meta released four pre-trained models built with multi-token prediction on the AI development platform Hugging Face, all focused on code generation tasks. Each model has 7 billion parameters and outputs four tokens at a time.
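The core idea can be illustrated with a toy sketch. The following is an assumption-laden simplification, not Meta's actual architecture: it models a shared trunk producing one hidden state, with four independent output heads that each predict one of the next four tokens in the same forward pass (all names, shapes, and the greedy decoding are illustrative).

```python
import numpy as np

# Toy sketch of multi-token prediction (hypothetical shapes, NOT Meta's code):
# a shared trunk yields one hidden state, and n_future independent output
# heads each predict one of the next n_future tokens from that state.

rng = np.random.default_rng(0)

d_model, vocab_size, n_future = 16, 32, 4  # toy sizes, not the 7B models

# One unembedding matrix per future position (assumed independent heads).
heads = [rng.normal(size=(d_model, vocab_size)) for _ in range(n_future)]

def predict_next_tokens(hidden_state):
    """Greedy multi-token prediction: return n_future token ids at once."""
    return [int(np.argmax(hidden_state @ W)) for W in heads]

hidden = rng.normal(size=d_model)      # stand-in for the trunk's hidden state
tokens = predict_next_tokens(hidden)   # four tokens from one forward pass
print(len(tokens))  # 4
```

Because all four tokens come from a single trunk pass rather than four sequential passes, decoding can be substantially faster, which is consistent with the speedup reported below.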

Testing on coding benchmarks showed that the multi-token prediction models outperformed conventional LLMs by 17% on MBPP and 12% on HumanEval, while generating output up to three times faster. The approach has the potential to narrow the gap between human and AI language capability, but it also raises concerns about misuse, such as AI-generated misinformation and cyber attacks.

Releasing these advanced AI tools as open source cuts both ways: it makes LLM improvements broadly accessible, but it also lowers the barrier to misuse. Overall, Meta's introduction of multi-token prediction represents a significant advance for large language models and AI development.


Source link: https://gigazine.net/gsc_news/en/20240705-meta-open-sources-multi-token-prediction/
