🐦 Twitter Post Details

@teortaxesTex

People are so used to models trained for Huggingface Leaderboard they're in disbelief upon seeing a production-grade one. Maybe they shouldn't. Smol Qwens are samples of Tongyi Qianwen, not proofs of concept; to Alibaba, they're kind of like what T5-XXL is to Google.

Alibaba is a world-class corporation actually trying, no BS, to capture the vast Chinese LLM assistant market. That's why this report talks so much about practical aspects and objectives they pursued, not only muh MMLU/HEval scores (and even with HEval, they go for the state-of-the-art HumanEvalPack benchmark). This paper, incomplete though it may be (it's particularly secretive about the dataset, understandably evoking extra suspicion), is a treasure trove of insight into almost-frontier proprietary LLMs. This is something like what we should've expected to see if @karpathy got his way and OpenAI published that small open-source model to teach the community a little share of their tricks.
https://t.co/sQREvJ6dzz

In the realm of LLaMA finetunes, @gigaml 's X1-Large and probably @XLangAI Lemur are comparable, but we know so much less about them. X1 is genuinely superior to LLama2-70B across the board, which is more than I can say for all the fancy imitative finetunes. As @iamgingertrash would probably argue, this is the difference in incentives.

🔧 Raw API Response

{
  "user": {
    "created_at": "2010-09-18T13:32:22.000Z",
    "default_profile_image": false,
    "description": "Ours is the age of unaligned utilitarians. Other problems are relatively unimportant, but sometimes I tweet about them anyway.\n(кто/кого)",
    "fast_followers_count": 0,
    "favourites_count": 17952,
    "followers_count": 2054,
    "friends_count": 759,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 36,
    "location": "",
    "media_count": 1344,
    "name": "Teortaxes",
    "normal_followers_count": 2054,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/192201556/1682742940",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1652169745037242368/KRPTShbG_normal.jpg",
    "screen_name": "teortaxesTex",
    "statuses_count": 14041,
    "translator_type": "none",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "192201556"
  },
  "id": "1706431894970077364",
  "conversation_id": "1706431894970077364",
  "full_text": "People are so used to models trained for Huggingface Leaderboard they're in disbelief upon seeing a production-grade one. Maybe they shouldn't. Smol Qwens are samples of Tongyi Qianwen, not proofs of concept; to Alibaba, they're kind of like what T5-XXL is to Google.\n\nAlibaba is a world-class corporation actually trying, no BS, to capture the vast Chinese LLM assistant market. That's why this report talks so much about practical aspects and objectives they pursued, not only muh MMLU/HEval scores (and even with HEval, they go for the state-of-the-art HumanEvalPack benchmark). This paper, incomplete though it may be (it's particularly secretive about the dataset, understandably evoking extra suspicion), is a treasure trove of insight into almost-frontier proprietary LLMs. This is something like what we should've expected to see if @karpathy got his way and OpenAI published that small open-source model to teach the community a little share of their tricks.\nhttps://t.co/sQREvJ6dzz\n\nIn the realm of LLaMA finetunes, @gigaml 's X1-Large and probably @XLangAI  Lemur are comparable, but we know so much less about them. X1 is genuinely superior to LLama2-70B across the board, which is more than I can say for all the fancy imitative finetunes. As @iamgingertrash would probably argue, this is the difference in incentives.",
  "reply_count": 3,
  "retweet_count": 26,
  "favorite_count": 270,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F65yBSaWwAApyhg.jpg",
      "type": "photo"
    },
    {
      "media_url": "https://pbs.twimg.com/media/F65zz18WcAAd73m.png",
      "type": "photo"
    },
    {
      "media_url": "https://pbs.twimg.com/media/F650paCWQAAFfk9.jpg",
      "type": "photo"
    },
    {
      "media_url": "https://pbs.twimg.com/media/F650-cIWoAAd2Zs.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/teortaxesTex/status/1706431894970077364",
  "created_at": "2023-09-25T22:14:02.000Z",
  "#sort_index": "1706431894970077364",
  "view_count": 57928,
  "quote_count": 0,
  "is_quote_tweet": true,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "quoted_tweet": {
    "user": {
      "created_at": "2018-06-23T12:21:23.000Z",
      "default_profile_image": false,
      "description": "Content Intelligence & Cyberpsychology \n\n@knutjaegersberg@sigmoid.social\n\nhttps://t.co/xnBUK02PMq",
      "fast_followers_count": 0,
      "favourites_count": 70786,
      "followers_count": 4350,
      "friends_count": 3996,
      "has_custom_timelines": true,
      "is_translator": false,
      "listed_count": 85,
      "location": "Gronau, Germany",
      "media_count": 2598,
      "name": "Knut Jägersberg",
      "normal_followers_count": 4350,
      "possibly_sensitive": false,
      "profile_banner_url": "https://pbs.twimg.com/profile_banners/1010498049058201600/1688786256",
      "profile_image_url_https": "https://pbs.twimg.com/profile_images/1416049148722360320/_fKe7C6c_normal.jpg",
      "screen_name": "JagersbergKnut",
      "statuses_count": 54013,
      "translator_type": "none",
      "url": "https://t.co/ts02Vdnj6z",
      "verified": false,
      "withheld_in_countries": [],
      "id_str": "1010498049058201600"
    },
    "id": "1706309414976700423",
    "conversation_id": "1706309414976700423",
    "full_text": "Qwen-14B beats larger models in benchmarks, LLM community wonders how\n\n@bimedotcom @TheAIObserverX @Khulood_Almani @debashis_dutta @sonu_monika @theomitsa @BetaMoroney @Analytics_699 @Shi4Tech @FmFrancoise @enilev @sallyeaves @IanLJones98\n\nhttps://t.co/2oEInWaIiB https://t.co/SAhcohLzJc",
    "reply_count": 9,
    "retweet_count": 11,
    "favorite_count": 62,
    "hashtags": [],
    "symbols": [],
    "user_mentions": [
      {
        "id_str": "1546164530",
        "name": "BusinessIntelligence",
        "screen_name": "bimedotcom",
        "profile": "https://twitter.com/bimedotcom"
      },
      {
        "id_str": "1186144637012074497",
        "name": "Nat 🇬🇪 (Inactive)",
        "screen_name": "TheAIObserverX",
        "profile": "https://twitter.com/TheAIObserverX"
      },
      {
        "id_str": "1403861754808049666",
        "name": "Dr. Khulood Almani | د.خلود المانع",
        "screen_name": "Khulood_Almani",
        "profile": "https://twitter.com/Khulood_Almani"
      },
      {
        "id_str": "92764064",
        "name": "Dr. Debashis Dutta PhD PMP",
        "screen_name": "debashis_dutta",
        "profile": "https://twitter.com/debashis_dutta"
      },
      {
        "id_str": "1025638818068590592",
        "name": "Dr. Monika Sonu I Founder Healthinnovationtoolbox",
        "screen_name": "sonu_monika",
        "profile": "https://twitter.com/sonu_monika"
      },
      {
        "id_str": "914274290081689601",
        "name": "Dr. Theophano Mitsa ☦️🇬🇷🇺🇸",
        "screen_name": "theomitsa",
        "profile": "https://twitter.com/theomitsa"
      },
      {
        "id_str": "1175377309676781568",
        "name": "Tony Moroney",
        "screen_name": "BetaMoroney",
        "profile": "https://twitter.com/BetaMoroney"
      },
      {
        "id_str": "1155470031225884675",
        "name": "Mack #Tech4Good",
        "screen_name": "Analytics_699",
        "profile": "https://twitter.com/Analytics_699"
      },
      {
        "id_str": "16476911",
        "name": "💙 #TechForGood 💙",
        "screen_name": "Shi4Tech",
        "profile": "https://twitter.com/Shi4Tech"
      },
      {
        "id_str": "3229980963",
        "name": "Françoise Morvan",
        "screen_name": "FmFrancoise",
        "profile": "https://twitter.com/FmFrancoise"
      },
      {
        "id_str": "130344472",
        "name": "Eveline Ruehlin",
        "screen_name": "enilev",
        "profile": "https://twitter.com/enilev"
      },
      {
        "id_str": "3131243261",
        "name": "Prof. Sally Eaves",
        "screen_name": "sallyeaves",
        "profile": "https://twitter.com/sallyeaves"
      },
      {
        "id_str": "55230288",
        "name": "Ian Jones",
        "screen_name": "IanLJones98",
        "profile": "https://twitter.com/IanLJones98"
      }
    ],
    "urls": [
      {
        "url": "https://t.co/2oEInWaIiB",
        "expanded_url": "https://cevalbenchmark.com/static/leaderboard.html",
        "display_url": "cevalbenchmark.com/static/leaderb…"
      }
    ],
    "media": [
      {
        "media_url": "https://pbs.twimg.com/media/F64HOfMWEAEHIPZ.jpg",
        "type": "photo"
      }
    ],
    "url": "https://twitter.com/JagersbergKnut/status/1706309414976700423",
    "created_at": "2023-09-25T14:07:20.000Z",
    "#sort_index": "1706431894970077400",
    "view_count": 65279,
    "quote_count": 2,
    "is_quote_tweet": false,
    "is_retweet": false,
    "is_pinned": false,
    "is_truncated": false
  },
  "startUrl": "https://twitter.com/teortaxestex/status/1706431894970077364"
}
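The raw payload above follows a recursive schema: a quote tweet nests the quoted tweet under `quoted_tweet` with the same field layout. A minimal Python sketch of how such a payload could be summarized — the field names (`user`, `full_text`, `favorite_count`, `media`, `quoted_tweet`, etc.) are taken from the response above, but `summarize_tweet` itself is a hypothetical helper, not part of any official API client:

```python
import json

def summarize_tweet(tweet: dict) -> dict:
    """Extract the key fields from an enriched tweet payload like the one above."""
    summary = {
        "author": tweet["user"]["screen_name"],
        "url": tweet["url"],
        "created_at": tweet["created_at"],
        "text": tweet["full_text"],
        "likes": tweet["favorite_count"],
        "retweets": tweet["retweet_count"],
        # Media entries each carry a media_url and a type (e.g. "photo").
        "media_urls": [m["media_url"] for m in tweet.get("media", [])],
    }
    # Quote tweets nest the quoted payload under the same schema, so recurse.
    if tweet.get("is_quote_tweet") and "quoted_tweet" in tweet:
        summary["quoted"] = summarize_tweet(tweet["quoted_tweet"])
    return summary

# Example with a minimal payload following the schema above:
payload = {
    "user": {"screen_name": "teortaxesTex"},
    "url": "https://twitter.com/teortaxesTex/status/1706431894970077364",
    "created_at": "2023-09-25T22:14:02.000Z",
    "full_text": "People are so used to models trained for Huggingface Leaderboard...",
    "favorite_count": 270,
    "retweet_count": 26,
    "media": [
        {"media_url": "https://pbs.twimg.com/media/F65yBSaWwAApyhg.jpg", "type": "photo"}
    ],
    "is_quote_tweet": False,
}
print(json.dumps(summarize_tweet(payload), indent=2))
```

In practice the full payload would be loaded with `json.load` from the saved response; the recursion handles arbitrarily nested quote chains without special-casing.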