Diego Caples dCaples

Backstory. Skipped college after a Google engineer sponsored a $100k independent research grant; relocated to SF to do AI research full-time.

Head of Research, AGI, Inc. Trained a computer-use agent ranked #1 on OSWorld & AndroidWorld; the first agent to reach superhuman performance on both. [OSWorld benchmark] · [AndroidWorld leaderboard]
Created RealEvals.xyz. Realistic mini-internet for training & evaluating web agents (benchmark + RL envs). Used internally by leading AI labs to train their models.
MATS — Mechanistic Interpretability Fellow. Produced mech-interp analysis techniques used internally at OpenAI. [Scaling Sparse Feature Circuit Finding]
Winner — Anthropic × Pear. SHIELD: RL-trained coding agent that finds & fixes vulnerabilities; trained Llama to exceed GPT-4 on a public vuln-finding/fixing dataset.
Winner — Anthropic × AGI House. MCP auto-trainer: trains any model to use any MCP server. Scrapes MCP servers and autogenerates tasks, then uses GRPO with an LLM-as-judge reward to teach small open-source LLMs to use them.
Winner — WeaveHacks 2 ($12k). Daydreamer: a novel way to turn YouTube-scale video data directly into robot policies, aimed at solving robotics’ data problem. Runs a video-diffusion dream → pose → execute loop and self-trains with VLM feedback. [Devpost]
