@dair_ai
GitHub already has millions of repos full of procedural knowledge. The work introduces a framework for extracting agent skills directly from open-source repos. The pipeline analyzes repo structure, identifies procedural knowledge through dense retrieval, and translates it into standardized SKILL.md format with a progressive disclosure architecture so agents can discover thousands of skills without context window degradation. Manually authoring agent skills doesn't scale. Automated extraction achieved 40% gains in knowledge transfer efficiency while matching human-crafted quality. Still early on this, and there is more work needed for self-discovered and self-improving skills to work well at scale. As the agent skill ecosystem grows, mining existing repos could unlock scalable capability acquisition without having to retrain models. Paper: https://t.co/MAt8Goetcr Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c