🐦 Twitter Post Details

Viewing enriched Twitter post

@nearcyan

Successfully got Claude to order me lunch all by himself! Notes after 8 hours of using the new model: • Anthropic really does not want you to do this - anything involving logging into accounts and especially making purchases is RLHF'd away more intensely than usual. In fact my agents worked better on the previous model (not because the model was better, but because it cared much less when I wanted it to purchase items). I'm likely the first non-Anthropic employee to have had Sonnet-3.5 (new) autonomously purchase me food due to the difficulty. These posttraining changes have many interesting effects on the model in other areas. • If you use their demo repository you will hit rate limits very quickly. Even on a tier 2 or 3 API account I'd hit >2.5M tokens in ~15 minutes of agent usage. This is primarily due to a large amount of images in the context window. • Anthropic's demo worked instantly for me (which is impressive!), but re-implementing proper tool usage independently is cumbersome and there's few examples and only one (longer) page of documentation. • I don't think Anthropic intends for this to actually be used yet. The likely reasons for the release are a combination of competitive factors, financial factors, red-teaming factors, and a few others. • Although the restrictions can be frustrating, one has to keep in mind the scale that these companies operate at to garner sympathy; If they release a web agent that just does things it could easily delete all of your files, charge thousands to your credit card, tweet your passwords, etc. • A litigious milieu is the enemy of personal autonomy and freedom.

View on Twitter

📊 Media Metadata

{
  "score": 0.91,
  "scored_at": "2025-08-09T13:46:07.549461",
  "import_source": "network_archive_import",
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1848875226043703762/media_0.jpg?",
      "filename": "media_0.jpg"
    },
    {
      "id": "",
      "type": "photo",
      "url": null,
      "media_url": "https://pbs.twimg.com/media/GaiFpqzbEAUjL1S.jpg",
      "media_url_https": null,
      "display_url": null,
      "expanded_url": null
    }
  ],
  "reprocessed_at": "2025-08-12T15:25:13.368500",
  "reprocessed_reason": "missing_media_array",
  "original_structure": "had_both"
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2019-05-14T04:46:57.000Z",
    "default_profile_image": false,
    "description": "chief claude connoisseur",
    "fast_followers_count": 0,
    "favourites_count": 28218,
    "followers_count": 64226,
    "friends_count": 952,
    "has_custom_timelines": false,
    "is_translator": false,
    "listed_count": 875,
    "location": "San Francisco, CA",
    "media_count": 1359,
    "name": "near",
    "normal_followers_count": 64226,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1128159740599656448/1730934687",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1859474520891076608/GyR06V4v_normal.png",
    "screen_name": "nearcyan",
    "statuses_count": 10629,
    "translator_type": "none",
    "url": "https://t.co/IdaJwZJCXm",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "1128159740599656448"
  },
  "id": "1848875226043703762",
  "conversation_id": "1848875226043703762",
  "full_text": "Successfully got Claude to order me lunch all by himself!\n\nNotes after 8 hours of using the new model:\n\n• Anthropic really does not want you to do this - anything involving logging into accounts and especially making purchases is RLHF'd away more intensely than usual. In fact my agents worked better on the previous model (not because the model was better, but because it cared much less when I wanted it to purchase items). I'm likely the first non-Anthropic employee to have had Sonnet-3.5 (new) autonomously purchase me food due to the difficulty. These posttraining changes have many interesting effects on the model in other areas.\n\n• If you use their demo repository you will hit rate limits very quickly. Even on a tier 2 or 3 API account I'd hit >2.5M tokens in ~15 minutes of agent usage. This is primarily due to a large amount of images in the context window.\n\n• Anthropic's demo worked instantly for me (which is impressive!), but re-implementing proper tool usage independently is cumbersome and there's few examples and only one (longer) page of documentation.\n\n• I don't think Anthropic intends for this to actually be used yet. The likely reasons for the release are a combination of competitive factors, financial factors, red-teaming factors, and a few others.\n\n• Although the restrictions can be frustrating, one has to keep in mind the scale that these companies operate at to garner sympathy; If they release a web agent that just does things it could easily delete all of your files, charge thousands to your credit card, tweet your passwords, etc.\n\n• A litigious milieu is the enemy of personal autonomy and freedom.",
  "reply_count": 54,
  "retweet_count": 119,
  "favorite_count": 1902,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GaiFpqzbEAUjL1S.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/nearcyan/status/1848875226043703762",
  "created_at": "2024-10-22T23:52:58.000Z",
  "#sort_index": "1848875226043703762",
  "view_count": 195781,
  "quote_count": 27,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/nearcyan/status/1848875226043703762"
}