#AI enhances LLMs with active inheritance for optimal performance #CohereAI

Cohere for AI Enhances Large Language Models (LLMs) with Active Inheritance: Steering Synthetic Data Generation for Optimal Performance and Reduced Bias

Synthetic data generation lets machine learning researchers build large training datasets when real-world data is limited. The challenge is that models trained on synthetic data can pick up the biases and other attributes that the data carries. Existing optimization methods, including data augmentation, pseudo-labeling, data weighting, data pruning, and curriculum learning, are limited in their ability to introduce new desirable attributes.

The researchers propose a new concept called “active inheritance,” which guides synthetic data generation toward specific objectives such as high lexical diversity and low toxicity. The method selects proxy labels for the desired attributes, generates multiple samples for each prompt, and keeps the sample that maximizes the desired attribute. This approach, known as targeted sampling, has shown promising results, with models demonstrating significant improvements in length and linguistic diversity, along with reduced toxicity.
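The selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: `type_token_ratio` is one simple stand-in for a lexical diversity proxy label, `toy_generate` is a hypothetical placeholder for an LLM sampler, and the default of four samples per prompt is arbitrary.

```python
from typing import Callable, Iterable, List, Tuple


def type_token_ratio(text: str) -> float:
    """Assumed proxy label for lexical diversity: unique words / total words."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def targeted_sample(
    prompt: str,
    generate: Callable[[str, int], List[str]],
    score: Callable[[str], float],
    k: int = 4,
) -> str:
    """Draw k candidate completions for one prompt and keep the best-scoring one."""
    candidates = generate(prompt, k)
    return max(candidates, key=score)


def build_dataset(
    prompts: Iterable[str],
    generate: Callable[[str, int], List[str]],
    score: Callable[[str], float],
    k: int = 4,
) -> List[Tuple[str, str]]:
    """Pair every prompt with its selected completion to form a training set."""
    return [(p, targeted_sample(p, generate, score, k)) for p in prompts]


def toy_generate(prompt: str, k: int) -> List[str]:
    """Hypothetical stand-in for an LLM sampler: returns canned completions."""
    canned = [
        "the cat sat on the mat the cat sat",
        "a quick brown fox jumps over lazy dogs",
        "good good good good",
        "rain falls and rain falls again",
    ]
    return canned[:k]


if __name__ == "__main__":
    for prompt, completion in build_dataset(["Describe a scene."], toy_generate, type_token_ratio):
        print(prompt, "->", completion)
```

To steer toward an attribute that should be minimized, such as toxicity, the same function can be given a negated score, for example `score=lambda text: -toxicity(text)`, where `toxicity` is whatever classifier is available (again an assumption, not something the source specifies).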

The study also explored passive inheritance, where models inherit properties from synthetic data without explicit guidance, revealing sensitivity to the properties of the training data. This sensitivity raises concerns about unintended biases and attributes introduced into the models. The research emphasizes the importance of carefully curating synthetic data to avoid undesirable outcomes.

Overall, the research highlights how strongly synthetic data shapes large language models and introduces active inheritance as a method to steer synthetic data generation toward desirable characteristics. By enhancing specific attributes, the approach helps make models trained on synthetic data both effective and safe. Active inheritance represents a promising avenue for optimizing machine learning models and improving AI systems.

Source link: https://www.marktechpost.com/2024/07/03/cohere-for-ai-enhances-large-language-models-llms-with-active-inheritance-steering-synthetic-data-generation-for-optimal-performance-and-reduced-bias/?amp
