
Allen Institute for AI Releases Tulu 2.5 Suite on Hugging Face: Advanced AI Models Trained with DPO and PPO, Featuring Reward and Value Models

The Allen Institute for AI has released the Tulu 2.5 suite, a collection of models trained with Direct Preference Optimization (DPO) and Proximal Policy Optimization (PPO). The models aim to improve language model performance in text generation, instruction following, and reasoning. Alongside the policy models, the suite includes reward and value models trained on diverse preference datasets, with notable variants trained on UltraFeedback, Chatbot Arena, StackExchange, Nectar, HH-RLHF, and HelpSteer data.

The suite leverages preference data together with DPO and PPO to optimize language model capabilities. Evaluated across a range of benchmarks, the models show superior performance in reasoning, coding, and safety. Key improvements include stronger instruction following and truthfulness, scalability with reward models of up to 70 billion parameters, and the use of synthetic preference data such as UltraFeedback. The Tulu 2.5 suite represents a significant advance in preference-based learning for language models, setting a new benchmark for model performance and reliability. Future work will focus on optimizing individual components for further gains and expanding the suite with more diverse datasets.
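For readers unfamiliar with DPO, the minimal sketch below shows how the preference loss at the heart of the method can be computed for a batch of chosen/rejected response pairs. This is an illustrative PyTorch example, not code from the Tulu 2.5 release; the function name, tensor arguments, and beta value are assumptions made for the demonstration.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Illustrative DPO loss for a batch of preference pairs.

    Each argument is a 1-D tensor of summed log-probabilities that the
    trainable policy (or the frozen reference model) assigns to the
    chosen or rejected response for each prompt in the batch.
    """
    # Log-ratio of policy vs. reference for the preferred and rejected responses
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps

    # DPO widens the margin between the two log-ratios, scaled by beta,
    # and penalizes pairs where the rejected response is still favored.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```

In practice, the summed log-probabilities come from scoring each chosen and rejected response with the current policy and a frozen reference model, and beta controls how far the policy is allowed to drift from that reference while fitting the preference data.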


Source link: https://www.marktechpost.com/2024/06/16/allen-institute-for-ai-releases-tulu-2-5-suite-on-hugging-face-advanced-ai-models-trained-with-dpo-and-ppo-featuring-reward-and-value-models/?amp

