
Brief Review — AlphaCode 2 Technical Report | by Sik-Ho Tsang | Mar, 2024

AlphaCode 2 is a code generation system that uses Gemini Pro as its foundation model, differentiating it from the original AlphaCode. The model undergoes two rounds of fine-tuning with the GOLD objective on an updated version of the CodeContests dataset, which contains roughly 15 thousand problems and 30 million human code samples.
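The report does not include training code, but as a rough illustration, here is a minimal PyTorch sketch of the GOLD objective (Pang and He, 2021): it re-weights the standard cross-entropy loss by the model's own probability of each reference token, so unlikely demonstrations contribute little gradient. The weight floor below is a common stabilization and an assumption, not a detail from the report.

```python
# Minimal sketch of the GOLD objective, assuming a standard next-token
# language-modeling setup. Not the actual AlphaCode 2 training code;
# the 0.1 weight floor is an assumed stabilization.
import torch
import torch.nn.functional as F

def gold_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq, vocab); labels: (batch, seq) reference token ids."""
    log_probs = F.log_softmax(logits, dim=-1)
    token_logp = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    # GOLD's importance weight: the model's current probability of the
    # reference token, detached so the weight itself receives no gradient.
    weights = token_logp.detach().exp().clamp(min=0.1)
    # Unlike plain MLE, tokens the model finds unlikely are down-weighted.
    return -(weights * token_logp).mean()
```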

Sampling in AlphaCode 2 generates up to a million code samples per problem, with a randomized temperature parameter for each sample to promote diversity. Only C++ samples are used, as they were found to be of higher quality than Python samples.
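Schematically, this stage might look like the sketch below; the temperature range and the `sample_fn` callable are illustrative assumptions, since the report does not specify them.

```python
# Illustrative sketch of massive sampling with per-sample randomized
# temperature. `sample_fn` stands in for a call to the fine-tuned model;
# the temperature range is an assumption, not taken from the report.
import random
from typing import Callable

def sample_candidates(problem: str,
                      sample_fn: Callable[[str, float], str],
                      n_samples: int = 1_000_000) -> list[str]:
    """sample_fn(problem_statement, temperature) -> one C++ source string."""
    candidates = []
    for _ in range(n_samples):
        temperature = random.uniform(0.1, 1.0)  # assumed range
        candidates.append(sample_fn(problem, temperature))
    return candidates
```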

The generated code samples are then filtered by executing them on the problem's test inputs, discarding any that fail to compile or do not produce the expected output. On average, this filtering eliminates about 95% of the samples.
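In spirit, the filter amounts to something like the following; the compiler flags, time limit, and file layout are assumptions rather than details from the report.

```python
# Sketch of the filtering stage: compile a C++ sample and run it on the
# problem's example tests, keeping it only if every output matches.
# Compiler flags and the 5-second time limit are illustrative assumptions.
import os
import subprocess
import tempfile

def passes_examples(cpp_source: str, tests: list[tuple[str, str]]) -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sol.cpp")
        binary = os.path.join(tmp, "sol")
        with open(src, "w") as f:
            f.write(cpp_source)
        # Samples that fail to compile are discarded outright.
        compile_run = subprocess.run(["g++", "-O2", "-o", binary, src],
                                     capture_output=True)
        if compile_run.returncode != 0:
            return False
        for stdin_text, expected in tests:
            try:
                run = subprocess.run([binary], input=stdin_text, text=True,
                                     capture_output=True, timeout=5)
            except subprocess.TimeoutExpired:
                return False
            if run.returncode != 0 or run.stdout.strip() != expected.strip():
                return False
    return True
```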

After filtering, an average of 50 thousand candidates per problem remain. To narrow these down, a separate model is trained to generate new test inputs for each problem; the surviving candidates are executed on these inputs and grouped into clusters according to the outputs they produce, so that behaviorally similar programs end up together. The 10 largest clusters are retained, and a single sample from each cluster is submitted to the online judge for evaluation.
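The grouping step can be pictured as keying each candidate by its output signature; in the sketch below, `run_on_inputs` is a placeholder for an execution harness like the one above, and the data layout is an assumption.

```python
# Sketch of clustering by behavior: candidates whose outputs agree on all
# generated test inputs land in the same cluster. `run_on_inputs` is a
# placeholder for an execution harness.
from collections import defaultdict
from typing import Callable

def cluster_candidates(candidates: list[str],
                       generated_inputs: list[str],
                       run_on_inputs: Callable[[str, list[str]], tuple[str, ...]],
                       keep: int = 10) -> list[list[str]]:
    clusters: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for code in candidates:
        signature = run_on_inputs(code, generated_inputs)  # tuple of outputs
        clusters[signature].append(code)
    # Agreement across many inputs serves as a proxy for semantic
    # equivalence; keep only the largest clusters.
    return sorted(clusters.values(), key=len, reverse=True)[:keep]
```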

Lastly, a second Gemini Pro model is fine-tuned to assign each code sample an estimated correctness score between 0 and 1. This scoring model is applied to every sample in the retained clusters, and the best-scoring sample from each cluster is selected, yielding the final list of 10 submissions.
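Given such a scorer, the final selection reduces to a per-cluster argmax, as in this sketch; `score_fn` stands in for the fine-tuned scoring model.

```python
# Sketch of the final selection: take the highest-scoring sample from each
# retained cluster. `score_fn` is a placeholder for the fine-tuned
# Gemini Pro scoring model described above.
from typing import Callable

def select_submissions(clusters: list[list[str]],
                       score_fn: Callable[[str], float]) -> list[str]:
    return [max(cluster, key=score_fn) for cluster in clusters]
```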


Source link: https://sh-tsang.medium.com/brief-review-alphacode-2-technical-report-b460dcbca202?source=rss——llm-5

