Meta’s earlier large language models, Llama and Llama 2, generated interest because they were released under comparatively open licenses (research-only for the original Llama, a more permissive community license for Llama 2) that allowed fine-tuning and adaptation. The newly released Llama 3 is Meta’s most capable model yet, available in two sizes: 8B and 70B parameters. It keeps a decoder-only transformer architecture, doubles the context window to 8,192 tokens, and is trained on a larger, more diverse dataset.
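A decoder-only transformer generates text autoregressively: each position may attend only to itself and earlier positions, enforced by a causal attention mask. A minimal sketch of that mask in plain NumPy (illustrative only, not Meta's implementation):

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """Boolean mask where entry (i, j) is True iff position i may attend to position j."""
    # Lower-triangular: row i is True for columns 0..i, False beyond.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

mask = causal_mask(4)
# Position 0 sees only itself; position 3 sees all four positions.
```

In a real model this mask (at sequence length up to 8,192 for Llama 3) is applied to the attention scores before the softmax, so no token can condition on future tokens.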
Llama 3 has been tuned for instruction following, reasoning, and code generation, with improved diversity in its responses. It was pretrained on 15 trillion tokens of publicly available data, including a substantial share of non-English text. Safety tooling accompanies the release, including content-moderation and code-filtering tools. Meta reports that Llama 3 outperforms comparably sized open models across a range of use cases and that it has been extensively tested for responsible development and deployment.
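The instruction-tuned variants expect a specific chat prompt layout with special header tokens. The sketch below assembles a single-turn prompt; the token strings follow the chat format Meta documents for Llama 3 Instruct, but verify them against the official model card before depending on the exact values:

```python
def llama3_chat_prompt(system: str, user: str) -> str:
    """Build a single-turn Llama 3 Instruct prompt.

    Special tokens are taken from Meta's published chat format for
    Llama 3 Instruct models; treat them as illustrative.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama3_chat_prompt("You are a helpful assistant.", "What is Llama 3?")
```

In practice you would let the tokenizer's built-in chat template (e.g. `apply_chat_template` in Hugging Face Transformers) construct this string rather than hand-rolling it.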
The evolution of open models like Llama is helping to establish a healthy ecosystem of open-source alternatives to proprietary large language models, enabling innovation across applications. The data used to train Llama 3 was carefully filtered for quality and diversity, and Meta’s instruction-tuning approach combines several techniques to improve model quality. Overall, Llama 3 represents a significant advance in open-source language models, paving the way for stronger multilingual applications and responsible use of AI technology.
Source link: https://thenewstack.io/llama-3-how-metas-new-open-llm-compares-to-llama-1-and-2/