SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
Published in COLM 2025
Benchmarking agentic LLM systems on reproducing algorithms described in research papers.
Recommended citation: Yanzheng Xiang, Hanqi Yan, Shuyin Ouyang, Lin Gui, Yulan He. 2025. "SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers." In COLM 2025.
