@omarsar0
@karpathy You reach a point where you do need to have good eval workflows to assess changes and improvements in how agents perform work and interact. That eval doesn't exist. It comes through taste and experimentation. Fun to be a researcher in these times. But also great to be a builder.