@emollick
Here’s an independent domain extension of METR’s famous time-horizon analysis, applying it to offensive cybersecurity with real human expert timing data Similar to METR: 5.7 months doubling time. Frontier models now succeed 50% of the time at tasks that take human experts 10.5h. https://t.co/7qzxaZUe96