AI is now training AI to make it smarter

OpenAI has developed an AI assistant called CriticGPT to help trainers refine the GPT-4 model by spotting coding errors that humans might miss. After initial training, the model undergoes Reinforcement Learning from Human Feedback (RLHF), in which human trainers rate the system's responses so that it learns to answer more accurately. However, as the system improves, it can outpace the trainers’ expertise, making errors harder to identify.

Last year, OpenAI faced criticism for outsourcing training work to Kenyan workers at low pay. To address the challenge of refining code generation capabilities, CriticGPT was introduced to catch errors in ChatGPT’s code output. The company found that people reviewing code with CriticGPT’s help outperformed those working without it 60% of the time.

The accompanying paper, “LLM Critics Help Catch LLM Bugs,” reports that LLMs (large language models) catch more bugs than qualified humans in code review, with model critiques preferred over human critiques 80% of the time. When humans worked together with CriticGPT, the critiques contained fewer hallucinated problems than when CriticGPT worked alone, although the hallucination rate was still higher than when a human worked independently.
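
For readers wondering how a figure such as "preferred 80% of the time" is typically derived, the sketch below shows one simple way to compute a preference rate from pairwise comparisons. It is an illustration only, not the method used in the paper: the function name `preference_rate`, the label values, and the sample judgments are all hypothetical, and the choice to exclude ties is an assumption.

```python
from collections import Counter


def preference_rate(judgments, candidate="model"):
    """Return the share of decided pairwise comparisons won by `candidate`.

    `judgments` is a list of labels, one per comparison, naming which
    critique the reviewer preferred. Ties are left out of the denominator
    (an assumption made for this sketch).
    """
    counts = Counter(judgments)
    decided = sum(n for label, n in counts.items() if label != "tie")
    if decided == 0:
        raise ValueError("no decided comparisons")
    return counts[candidate] / decided


# Hypothetical reviewer judgments, invented purely for illustration.
judgments = ["model", "model", "human", "model", "tie", "model"]
rate = preference_rate(judgments)
print(f"Model critique preferred in {rate:.0%} of decided comparisons")
```

With real data, each label would come from a reviewer comparing a model-written critique against a human-written critique of the same piece of code.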

Overall, CriticGPT plays a crucial role in improving the accuracy and effectiveness of the GPT-4 model by assisting trainers in identifying and correcting coding errors that may be overlooked during the refinement process.

Source link: https://ca.news.yahoo.com/ai-now-being-trained-ai-181926045.html