OpenAI introduces CriticGPT, an AI model that identifies errors in ChatGPT-generated code, to strengthen the RLHF pipeline used by AI trainers.
OpenAI has introduced CriticGPT, a new AI model built on GPT-4 and designed to identify errors in code generated by ChatGPT. The tool is meant to strengthen alignment through Reinforcement Learning from Human Feedback (RLHF), with the goal of making the outputs of large language models more accurate. CriticGPT was trained on a dataset of code samples with intentionally inserted bugs, and it assists human AI reviewers in checking code generated by ChatGPT. The model shows a promising ability to analyze code and flag errors, helping human reviewers spot AI "hallucinations" that they might otherwise miss. OpenAI plans to integrate CriticGPT into its RLHF labelling pipeline to give AI trainers better tools for evaluating complex AI outputs.
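To make the training-data idea concrete, here is a minimal sketch of what a code sample with an intentionally inserted bug might look like. The `TamperedSample` record, the `insert_bug` helper, and the `mean` example function are all hypothetical illustrations; they are assumptions for this sketch and not OpenAI's actual data format or tooling.

```python
from dataclasses import dataclass


@dataclass
class TamperedSample:
    """Hypothetical training record: correct code, buggy code, and a bug note."""
    original: str
    tampered: str
    bug_note: str


def insert_bug(source: str, old: str, new: str, bug_note: str) -> TamperedSample:
    """Swap one snippet for a faulty one and record what was changed."""
    if source.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the snippet to replace")
    return TamperedSample(source, source.replace(old, new), bug_note)


# A small, correct function used as the starting point (illustrative only).
CORRECT = (
    "def mean(values):\n"
    "    return sum(values) / len(values)\n"
)

sample = insert_bug(
    CORRECT,
    old="len(values)",
    new="(len(values) - 1)",
    bug_note="Divides by len(values) - 1 instead of len(values).",
)

# Run both versions to confirm the inserted bug actually changes behaviour.
good_ns, bad_ns = {}, {}
exec(sample.original, good_ns)
exec(sample.tampered, bad_ns)
print(good_ns["mean"]([2, 4, 6]))  # 4.0
print(bad_ns["mean"]([2, 4, 6]))   # 6.0
```

Pairing the tampered code with a note describing the bug gives a reviewer, human or model, a known answer to check a critique against. That is the general appeal of inserting bugs deliberately rather than waiting for them to occur naturally.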