🐦 Twitter Post Details

@omarsar0

LLM-based Agents for Automated Bug Fixing

Analyzes 7 leading LLM-based bug fixing systems on the SWE-bench Lite benchmark, finding MarsCode Agent (developed by ByteDance) achieved the highest success rate at 39.33%.

Reveals that for error localization line-level fault localization accuracy is more critical than file-level accuracy, and bug reproduction capabilities significantly impact fixing success.

Shows that 24/168 resolved issues could only be solved using reproduction techniques, though reproduction sometimes misled LLMs when issue descriptions were already clear.

Concludes that improvements are needed in both LLM reasoning capabilities and Agent workflow design to enhance automated bug fixing effectiveness.

This paper highlights the challenging nature of some domains, like code, and the opportunities to innovate further in agentic workflow design.
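To put the two figures quoted in the post on a common scale, here is a minimal arithmetic sketch. The benchmark size of 300 issues is an assumption (the post does not state it), and the variable and function names are illustrative only.

```python
# Figures quoted in the post.
top_success_rate = 0.3933   # highest reported success rate (MarsCode Agent)
reproduction_only = 24      # resolved issues that needed reproduction techniques
resolved_total = 168        # resolved issues the post counts against

# Assumption, not stated in the post: SWE-bench Lite contains 300 issues.
ASSUMED_BENCHMARK_SIZE = 300


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Number of issues the top success rate would correspond to under the assumption.
implied_resolved = round(top_success_rate * ASSUMED_BENCHMARK_SIZE)

# Share of resolved issues that depended on reproduction.
reproduction_share = share(reproduction_only, resolved_total)

print(f"Implied issues fixed by the top system: {implied_resolved}")
print(f"Resolved issues needing reproduction: {reproduction_share:.1f}%")
```

Under that assumed benchmark size, 39.33% corresponds to 118 issues, and 24 of 168 is roughly one resolved issue in seven.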

🔧 Raw API Response

{
  "user": {
    "created_at": "2015-09-04T12:59:26.000Z",
    "default_profile_image": false,
    "description": "Building with AI Agents @dair_ai • Prev: Meta AI, Elastic, Galactica LLM, PhD • I also teach how to build with LLMs, RAG & AI Agents ⬇️",
    "fast_followers_count": 0,
    "favourites_count": 27933,
    "followers_count": 216713,
    "friends_count": 532,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 3688,
    "location": "",
    "media_count": 2656,
    "name": "elvis",
    "normal_followers_count": 216713,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
    "screen_name": "omarsar0",
    "statuses_count": 12439,
    "translator_type": "regular",
    "url": "https://t.co/JBU5beHQNs",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "3448284313"
  },
  "id": "1859964808789135668",
  "conversation_id": "1859964808789135668",
  "full_text": "LLM-based Agents for Automated Bug Fixing\n\nAnalyzes 7 leading LLM-based bug fixing systems on the SWE-bench Lite benchmark, finding MarsCode Agent (developed by ByteDance) achieved the highest success rate at 39.33%.\n\nReveals that for error localization line-level fault localization accuracy is more critical than file-level accuracy, and bug reproduction capabilities significantly impact fixing success.\n\nShows that 24/168 resolved issues could only be solved using reproduction techniques, though reproduction sometimes misled LLMs when issue descriptions were already clear.\n\nConcludes that improvements are needed in both LLM reasoning capabilities and Agent workflow design to enhance automated bug fixing effectiveness.\n\nThis paper highlights the challenging nature of some domains, like code, and the opportunities to innovate further in agentic workflow design.",
  "reply_count": 5,
  "retweet_count": 48,
  "favorite_count": 189,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/Gc_sIwRakAAvqjL.png",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/omarsar0/status/1859964808789135668",
  "created_at": "2024-11-22T14:19:01.000Z",
  "#sort_index": "1859964808789135668",
  "view_count": 17154,
  "quote_count": 2,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/omarsar0/status/1859964808789135668"
}