@AnthropicAI
We also systematically show that the features we find are more interpretable than the neurons, using both a blinded human evaluator and a large language model (autointerpretability). https://t.co/XQvzENHMrp https://t.co/dawkxhAvix
{
"media": [
{
"type": "photo",
"url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1709986966220030382/media_0.png?",
"filename": "media_0.png"
},
{
"type": "photo",
"url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1709986966220030382/media_1.png?",
"filename": "media_1.png"
}
],
"processed_at": "2025-12-19T20:19:50.162428",
"pipeline_version": "2.0"
} {
"type": "tweet",
"id": "1709986966220030382",
"url": "https://x.com/AnthropicAI/status/1709986966220030382",
"twitterUrl": "https://twitter.com/AnthropicAI/status/1709986966220030382",
"text": "We also systematically show that the features we find are more interpretable than the neurons, using both a blinded human evaluator and a large language model (autointerpretability).\n\n๐ https://t.co/XQvzENHMrp https://t.co/dawkxhAvix",
"source": "Twitter for iPhone",
"retweetCount": 8,
"replyCount": 2,
"likeCount": 260,
"quoteCount": 1,
"viewCount": 39714,
"createdAt": "Thu Oct 05 17:40:37 +0000 2023",
"lang": "en",
"bookmarkCount": 8,
"isReply": true,
"inReplyToId": "1709986963615269284",
"conversationId": "1709986949711200722",
"displayTextRange": [
0,
209
],
"inReplyToUserId": null,
"inReplyToUsername": null,
"author": {
"type": "user",
"userName": "AnthropicAI",
"url": "https://x.com/AnthropicAI",
"twitterUrl": "https://twitter.com/AnthropicAI",
"id": "1353836358901501952",
"name": "Anthropic",
"isVerified": false,
"isBlueVerified": true,
"verifiedType": null,
"profilePicture": "https://pbs.twimg.com/profile_images/1798110641414443008/XP8gyBaY_normal.jpg",
"coverPicture": "https://pbs.twimg.com/profile_banners/1353836358901501952/1719228429",
"description": "We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.",
"location": "",
"followers": 716132,
"following": 35,
"status": "",
"canDm": false,
"canMediaTag": true,
"createdAt": "Mon Jan 25 22:45:28 +0000 2021",
"entities": {
"description": {
"urls": [
{
"display_url": "claude.ai",
"expanded_url": "https://claude.ai",
"url": "https://t.co/FhDI3KQh0n",
"indices": [
141,
164
]
}
]
},
"url": {
"urls": [
{
"display_url": "anthropic.com",
"expanded_url": "https://anthropic.com",
"url": "https://t.co/w94SABjAXZ",
"indices": [
0,
23
]
}
]
}
},
"fastFollowersCount": 0,
"favouritesCount": 1480,
"hasCustomTimelines": false,
"isTranslator": false,
"mediaCount": 535,
"statusesCount": 1299,
"withheldInCountries": [],
"affiliatesHighlightedLabel": {},
"possiblySensitive": false,
"pinnedTweetIds": [
"2001686747185394148"
],
"profile_bio": {
"description": "We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n."
},
"isAutomated": false,
"automatedBy": null
},
"extendedEntities": {
"media": [
{
"display_url": "pic.x.com/dawkxhAvix",
"expanded_url": "https://x.com/AnthropicAI/status/1709986966220030382/photo/1",
"ext_alt_text": "Two histogram plots, each showing a purple histogram (labeled Features) and a teal histogram (labeled Neurons). The first plot is titled \"Manual Interpretability\", with x-axis \"Rubric Value\". The purple Features histogram is mostly on the right, over the Rubric values of 10 to 14. The teal Neuron histogram has a tall spike on the left at Rubric value of 0, and a flat short distribution from 2 to 14 for the rest. The second plot is titled \"Automated Interpretability - Activation\", with x-axis \"Spearman Correlation\". The purple Features histogram has a flat plateau from a correlation of 1 to 0.6, then slowly tapers down to the left to a correlation of 0. The teal Neurons plot has a tall spike at a correlation of 0, and then a bell curve distribution centered around 0.3.",
"id_str": "1709962666951712768",
"indices": [
210,
233
],
"media_key": "3_1709962666951712768",
"media_url_https": "https://pbs.twimg.com/media/F7sB_F7WIAAy-Ez.png",
"type": "photo",
"url": "https://t.co/dawkxhAvix",
"ext_media_availability": {
"status": "Available"
},
"features": {
"large": {
"faces": []
},
"medium": {
"faces": []
},
"small": {
"faces": []
},
"orig": {
"faces": []
}
},
"sizes": {
"large": {
"h": 1332,
"w": 1551,
"resize": "fit"
},
"medium": {
"h": 1031,
"w": 1200,
"resize": "fit"
},
"small": {
"h": 584,
"w": 680,
"resize": "fit"
},
"thumb": {
"h": 150,
"w": 150,
"resize": "crop"
}
},
"original_info": {
"height": 1332,
"width": 1551,
"focus_rects": [
{
"x": 0,
"y": 302,
"w": 1551,
"h": 869
},
{
"x": 70,
"y": 0,
"w": 1332,
"h": 1332
},
{
"x": 152,
"y": 0,
"w": 1168,
"h": 1332
},
{
"x": 403,
"y": 0,
"w": 666,
"h": 1332
},
{
"x": 0,
"y": 0,
"w": 1551,
"h": 1332
}
]
},
"media_results": {
"result": {
"media_key": "3_1709962666951712768"
}
}
}
]
},
"card": null,
"place": {},
"entities": {
"hashtags": [],
"media": [
{
"display_url": "pic.x.com/dawkxhAvix",
"expanded_url": "https://x.com/AnthropicAI/status/1709986966220030382/photo/1",
"ext_alt_text": "Two histogram plots, each showing a purple histogram (labeled Features) and a teal histogram (labeled Neurons). The first plot is titled \"Manual Interpretability\", with x-axis \"Rubric Value\". The purple Features histogram is mostly on the right, over the Rubric values of 10 to 14. The teal Neuron histogram has a tall spike on the left at Rubric value of 0, and a flat short distribution from 2 to 14 for the rest. The second plot is titled \"Automated Interpretability - Activation\", with x-axis \"Spearman Correlation\". The purple Features histogram has a flat plateau from a correlation of 1 to 0.6, then slowly tapers down to the left to a correlation of 0. The teal Neurons plot has a tall spike at a correlation of 0, and then a bell curve distribution centered around 0.3.",
"id_str": "1709962666951712768",
"indices": [
210,
233
],
"media_key": "3_1709962666951712768",
"media_url_https": "https://pbs.twimg.com/media/F7sB_F7WIAAy-Ez.png",
"type": "photo",
"url": "https://t.co/dawkxhAvix",
"ext_media_availability": {
"status": "Available"
},
"features": {
"large": {
"faces": []
},
"medium": {
"faces": []
},
"small": {
"faces": []
},
"orig": {
"faces": []
}
},
"sizes": {
"large": {
"h": 1332,
"w": 1551,
"resize": "fit"
},
"medium": {
"h": 1031,
"w": 1200,
"resize": "fit"
},
"small": {
"h": 584,
"w": 680,
"resize": "fit"
},
"thumb": {
"h": 150,
"w": 150,
"resize": "crop"
}
},
"original_info": {
"height": 1332,
"width": 1551,
"focus_rects": [
{
"x": 0,
"y": 302,
"w": 1551,
"h": 869
},
{
"x": 70,
"y": 0,
"w": 1332,
"h": 1332
},
{
"x": 152,
"y": 0,
"w": 1168,
"h": 1332
},
{
"x": 403,
"y": 0,
"w": 666,
"h": 1332
},
{
"x": 0,
"y": 0,
"w": 1551,
"h": 1332
}
]
},
"media_results": {
"result": {
"media_key": "3_1709962666951712768"
}
}
}
],
"symbols": [],
"timestamps": [],
"urls": [
{
"display_url": "transformer-circuits.pub/2023/monoseman…",
"expanded_url": "https://transformer-circuits.pub/2023/monosemantic-features/index.html",
"url": "https://t.co/XQvzENHMrp",
"indices": [
186,
209
]
}
],
"user_mentions": []
},
"quoted_tweet": null,
"retweeted_tweet": null,
"article": null
}