I presented our latest work on SciReplicate-Bench and shared methods for building agentic LLM systems that can reliably reproduce code from scientific publications. The talk covered benchmarking strategies, memory management, and tooling considerations for research automation.