@_philschmid
GPT-4 for coding at home! Qwen 2.5 Coder 7B matches @OpenAI GPT-4 0613 and outperforms open LLMs < 33B, including @BigCodeProject StarCoder, @MistralAI Codestral, or Deepseek, and is released under Apache 2.0. 🤯

Details:
🚀 Three model sizes: 1.5B, 7B, and 32B (coming soon), with up to 128K tokens of context using YaRN
📚 Pre-trained on 5.5 trillion tokens, post-trained on tens of millions of examples (no details on # of tokens)
⚖️ A 7:2:1 ratio of public code data, synthetic data, and text data outperformed other mixes, even those with a higher code proportion
✅ Built scalable synthetic data generation using LLM scorers, checklist-based scoring, and a sandbox for code verification to filter out low-quality data (sketch below)
🌐 Trained on 92+ programming languages and incorporated multilingual code instruction data
📏 To improve long context, created instruction pairs in FIM (fill-in-the-middle) format using ASTs (sketch below)
🎯 Adopted a two-stage post-training process: diverse, lower-quality data (tens of millions of examples) for broad learning, followed by high-quality data selected via rejection sampling for refinement (millions; sketch below)
🧹 Performed decontamination on all datasets (pre- & post-training) using a 10-gram overlap method to ensure benchmark integrity (sketch below)
🏆 7B outperforms other open code LLMs < 40B, including Mistral Codestral or Deepseek
🥇 7B matches OpenAI GPT-4 0613 on various benchmarks
🤗 Released under Apache 2.0 and available on @huggingface (usage snippet below)

Models: https://t.co/esgNKlOxAt
Paper: https://t.co/Vv6rS14QJA
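Rough sketch of the sandbox verification idea for filtering synthetic data: run each generated sample plus its checks in an isolated subprocess and keep only what executes cleanly. The team's actual pipeline, scorer models, and checklist criteria aren't public; `passes_sandbox` and the timeout value are my own illustration.

```python
import os
import subprocess
import sys
import tempfile

def passes_sandbox(code: str, tests: str, timeout: float = 5.0) -> bool:
    """Run a synthetic code sample plus its tests in a fresh subprocess.

    Returns True only if the program exits cleanly within the timeout.
    A production sandbox would also restrict filesystem, network, and
    memory access; a subprocess with a timeout is just the minimal idea.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=timeout
        )
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.remove(path)

# Keep only samples whose code actually executes and passes its checks.
samples = [
    {"code": "def add(a, b):\n    return a + b", "tests": "assert add(1, 2) == 3"},
    {"code": "def add(a, b):\n    return a - b", "tests": "assert add(1, 2) == 3"},
]
clean = [s for s in samples if passes_sandbox(s["code"], s["tests"])]
print(f"kept {len(clean)} of {len(samples)}")  # kept 1 of 2
```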
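And here's roughly what building FIM pairs with an AST could look like: parse a file, mask out a syntactically complete span (a function body here), and emit it with the FIM sentinel tokens from the Qwen2.5-Coder tokenizer. How the paper actually picks spans isn't described in the post, so masking the first function body is just one plausible heuristic.

```python
import ast

def make_fim_example(source: str) -> str:
    """Turn a source file into a fill-in-the-middle training string by
    masking the body of the first function found via the AST."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # The body spans from its first statement to the function's last line.
            start = node.body[0].lineno - 1
            end = node.end_lineno
            prefix = "".join(lines[:start])
            middle = "".join(lines[start:end])
            suffix = "".join(lines[end:])
            return (
                f"<|fim_prefix|>{prefix}"
                f"<|fim_suffix|>{suffix}"
                f"<|fim_middle|>{middle}"
            )
    raise ValueError("no function found")

src = "import math\n\ndef area(r):\n    return math.pi * r ** 2\n\nprint(area(2))\n"
print(make_fim_example(src))
```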
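The rejection-sampling half of stage two, in outline: draw several candidate responses per prompt, score them, and keep only a candidate that clears a quality bar, otherwise drop the prompt. A hypothetical sketch; `generate`, `score`, and the threshold stand in for whatever model, judge, and cutoff the team actually used.

```python
from typing import Callable, Optional

def rejection_sample(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 8,
    threshold: float = 0.9,
) -> Optional[str]:
    """Draw n candidates and keep the best one only if it clears the bar.

    Returning None drops the prompt from the fine-tuning set entirely,
    which is how rejection sampling filters toward high-quality data.
    """
    candidates = [generate(prompt) for _ in range(n)]
    best = max(candidates, key=lambda c: score(prompt, c))
    return best if score(prompt, best) >= threshold else None

# Dummy stand-ins just to show the control flow.
kept = rejection_sample(
    "Write a Python add function.",
    generate=lambda p: "def add(a, b):\n    return a + b",
    score=lambda p, c: 1.0 if "return" in c else 0.0,
)
print(kept is not None)  # True
```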
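The 10-gram decontamination is simple to sketch: collect every 10-gram from the benchmark sets and drop any training sample that shares one. Tokenization details (word- vs. character-level, normalization) aren't given in the post, so the lowercase whitespace split below is an assumption.

```python
def ngrams(text: str, n: int = 10) -> set[tuple[str, ...]]:
    """Word-level n-grams after simple lowercase/whitespace normalization."""
    toks = text.lower().split()
    return {tuple(toks[i : i + n]) for i in range(len(toks) - n + 1)}

def decontaminate(train: list[str], benchmarks: list[str], n: int = 10) -> list[str]:
    """Drop training samples sharing any n-gram with any benchmark sample."""
    contaminated: set[tuple[str, ...]] = set()
    for b in benchmarks:
        contaminated |= ngrams(b, n)
    return [t for t in train if not (ngrams(t, n) & contaminated)]

bench = ["def fib(n): return n if n < 2 else fib(n - 1) + fib(n - 2) # classic"]
train = [
    "def fib(n): return n if n < 2 else fib(n - 1) + fib(n - 2) # classic solution",
    "def add(a, b): return a + b",
]
print(len(decontaminate(train, bench)))  # 1: the leaked fib sample is removed
```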
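Want to try it? A standard transformers chat call should work; I'm assuming the Hub repo id is Qwen/Qwen2.5-Coder-7B-Instruct.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Write a quicksort in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```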