🐦 Twitter Post Details

@ArtificialAnlys

Wait - is the new GPT-4o a smaller and less intelligent model?

We have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o.

GPT-4o (Nov) vs GPT-4o (Aug):
➤ Artificial Analysis Quality Index decrease from 77 to 71 (now equal to GPT-4o mini)
➤ GPQA Diamond decrease from 51% to 39%, MATH decrease from 78% to 69%
➤ Speed increase from ~80 output tokens/s to ~180 tokens/s
➤ No pricing change

Our Output Speed benchmarks are currently measuring ~180 output tokens/s for the Nov 20th model, while the August model shows ~80 tokens/s. We have generally observed significantly faster speeds on launch day for OpenAI models (likely due to OpenAI provisioning capacity ahead of adoption), but previously have not seen a 2x speed difference.

Based on this data, we conclude that it is likely that OpenAI’s Nov 20th GPT-4o model is a smaller model than the August release.

Given that OpenAI has not cut prices for the Nov 20th version, we recommend that developers do not shift workloads away from the August version without careful testing.
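The figures quoted in the post can be put on a common scale by computing each change relative to the August value. The sketch below is only an illustration of that arithmetic: the dictionary layout and the names `reported`, `relative_change`, `changes`, and `speed_ratio` are assumptions made here, not anything published by Artificial Analysis, and the speed values are the approximate (~) numbers from the post.

```python
# Figures quoted in the post, as (August value, November value).
# The speed numbers are approximate ("~80" and "~180" in the post).
reported = {
    "Quality Index": (77, 71),
    "GPQA Diamond (%)": (51, 39),
    "MATH (%)": (78, 69),
    "Output speed (tokens/s, approx.)": (80, 180),
}


def relative_change(before, after):
    """Return the change from before to after as a fraction of before."""
    return (after - before) / before


# Relative change per metric (not percentage points).
changes = {name: relative_change(aug, nov) for name, (aug, nov) in reported.items()}

# Ratio of November to August output speed.
speed_ratio = 180 / 80

for name, (aug, nov) in reported.items():
    print(f"{name}: {aug} -> {nov} ({changes[name]:+.1%})")
print(f"Speed ratio (Nov / Aug): {speed_ratio:.2f}x")
```

Under these inputs the speed ratio comes out at 2.25x, consistent with the post's description of a difference of more than 2x. The benchmark rows are relative changes, so the GPQA Diamond drop of 12 percentage points shows up as roughly a 23.5% relative decrease.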

🔧 Raw API Response

{
  "user": {
    "created_at": "2024-01-06T04:21:21.000Z",
    "default_profile_image": false,
    "description": "Independent analysis of AI models and hosting providers - choose the best model and API provider for your use-case",
    "fast_followers_count": 0,
    "favourites_count": 1114,
    "followers_count": 18320,
    "friends_count": 467,
    "has_custom_timelines": false,
    "is_translator": false,
    "listed_count": 348,
    "location": "San Francisco",
    "media_count": 305,
    "name": "Artificial Analysis",
    "normal_followers_count": 18320,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1743487864934162432/1704519394",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1810946341511766016/3mg9KIaQ_normal.jpg",
    "screen_name": "ArtificialAnlys",
    "statuses_count": 611,
    "translator_type": "none",
    "url": "https://t.co/hEm5Kv0ktE",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "1743487864934162432"
  },
  "id": "1859614633654616310",
  "conversation_id": "1859614633654616310",
  "full_text": "Wait - is the new GPT-4o a smaller and less intelligent model?\n\nWe have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o.\n\nGPT-4o (Nov) vs GPT-4o (Aug):\n➤ Artificial Analysis Quality Index decrease from 77 to 71 (now equal to GPT-4o mini)\n➤ GPQA Diamond decrease from 51% to 39%, MATH decrease from 78% to 69%\n➤ Speed increase from ~80 output tokens/s to ~180 tokens/s\n➤ No pricing change\n\nOur Output Speed benchmarks are currently measuring ~180 output tokens/s for the Nov 20th model, while the August model shows ~80 tokens/s. We have generally observed significantly faster speeds on launch day for OpenAI models (likely due to OpenAI provisioning capacity ahead of adoption), but previously have not seen a 2x speed difference.\n\nBased on this data, we conclude that it is likely that OpenAI’s Nov 20th GPT-4o model is a smaller model than the August release.\n\nGiven that OpenAI has not cut prices for the Nov 20th version, we recommend that developers do not shift workloads away from the August version without careful testing.",
  "reply_count": 51,
  "retweet_count": 115,
  "favorite_count": 912,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/Gc6s3ygbUAA5Se1.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/ArtificialAnlys/status/1859614633654616310",
  "created_at": "2024-11-21T15:07:33.000Z",
  "#sort_index": "1859614633654616310",
  "view_count": 184121,
  "quote_count": 67,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/artificialanlys/status/1859614633654616310"
}