banksy23 / code-eval Goto Github PK
View Code? Open in Web Editor NEWEvaluate the code generation ability of Code LLMs. Supports mainstream benchmarks such as HumanEval and MBPP. Define chat templates and pre/post-processing code in interface form to facilitate customizing the logic for extracting valid code snippets for testing for one's own model.