@LiorOnAI
Imbue just open-sourced Evolver. A tool that uses LLMs to automatically optimize code and prompts. They hit 95% on ARC-AGI-2 benchmarks. That's GPT-5.2-level performance from an open model. Evolver works like natural selection for code. You give it three things: 1. Starting code or prompt 2. A way to score results 3. An LLM that suggests improvements Then it runs in a loop. It picks high-scoring solutions. Mutates them. Tests the mutations. Keeps what works. The key difference from random mutation: LLMs propose targeted fixes. When a solution fails on specific inputs, the LLM sees those failures. It suggests changes to fix them. Most suggestions don't help. But some do. Those survivors become parents for the next generation. Evolver adds smart optimizations: > Batch mutations: fix multiple failures at once > Learning logs: share discoveries across branches > Post-mutation filters: skip bad mutations before scoring The verification step alone cuts costs 10x. This works on any problem where LLMs can read the code and you can score the output. You can now auto-optimize: - Agentic workflows - Prompt templates - Code performance - Reasoning chains No gradient descent needed. No differentiable functions required.