🐦 Twitter Post Details

@iScienceLuvr

MolmoAct: Action Reasoning Models that can Reason in Space

"Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of vision-language-action models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1"

📊 Media Metadata

{
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1955239742917972220/media_0.jpg?",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-08-14T07:32:26.831010",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1955239742917972220",
  "url": "https://x.com/iScienceLuvr/status/1955239742917972220",
  "twitterUrl": "https://twitter.com/iScienceLuvr/status/1955239742917972220",
  "text": "MolmoAct: Action Reasoning Models that can Reason in Space\n\n\"Reasoning is central to purposeful action, yet most robotic foundation  models map perception and instructions directly to control, which limits  adaptability, generalization, and semantic grounding. We introduce  Action Reasoning Models (ARMs), a class of vision-language-action models  that integrate perception, planning, and control through a structured  three-stage pipeline. Our model, MolmoAct, encodes observations and  instructions into depth-aware perception tokens, generates mid-level  spatial plans as editable trajectory traces, and predicts precise  low-level actions, enabling explainable and steerable behavior.  MolmoAct-7B-D achieves strong performance across simulation and  real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual  Matching tasks, surpassing closed-source Pi-0 and GR00T N1\"",
  "source": "Twitter for iPhone",
  "retweetCount": 18,
  "replyCount": 3,
  "likeCount": 162,
  "quoteCount": 0,
  "viewCount": 12635,
  "createdAt": "Tue Aug 12 12:07:35 +0000 2025",
  "lang": "en",
  "bookmarkCount": 101,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1955239742917972220",
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "iScienceLuvr",
    "url": "https://x.com/iScienceLuvr",
    "twitterUrl": "https://twitter.com/iScienceLuvr",
    "id": "441465751",
    "name": "Tanishq Mathew Abraham, Ph.D.",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1913710019729821696/Qge4zx6u_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/441465751/1738204246",
    "description": "",
    "location": "",
    "followers": 79909,
    "following": 1237,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Tue Dec 20 03:45:50 +0000 2011",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 104613,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 2422,
    "statusesCount": 17748,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1952221233648718307"
    ],
    "profile_bio": {
      "description": "CEO @SophontAI |\nPhD at 19 (2023) |\nFounder, ex CEO @MedARC_AI |\nex Research Director Stability AI | \nBiomed. engineer @ 14 |\nTEDx talk➡https://t.co/xPxwKTq6Qb",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "bit.ly/3tpAuan",
              "expanded_url": "https://bit.ly/3tpAuan",
              "indices": [
                136,
                159
              ],
              "url": "https://t.co/xPxwKTq6Qb"
            }
          ],
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                4,
                14
              ],
              "name": "",
              "screen_name": "SophontAI"
            },
            {
              "id_str": "0",
              "indices": [
                52,
                62
              ],
              "name": "",
              "screen_name": "MedARC_AI"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "sophontai.com",
              "expanded_url": "https://sophontai.com",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/uQ936JTZf1"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "allow_download_status": {
          "allow_download": true
        },
        "display_url": "pic.twitter.com/c303x9SHYg",
        "expanded_url": "https://twitter.com/iScienceLuvr/status/1955239742917972220/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {},
          "orig": {}
        },
        "id_str": "1955239513909075968",
        "indices": [
          275,
          298
        ],
        "media_key": "3_1955239513909075968",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARsiZ/qFW6AACgACGyJoL9daEPwAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABGyJn+oVboAAKAAIbImgv11oQ/AAA",
            "media_key": "3_1955239513909075968"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/GyJn-oVboAAFyTD.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 778,
              "w": 1390,
              "x": 0,
              "y": 0
            },
            {
              "h": 1390,
              "w": 1390,
              "x": 0,
              "y": 0
            },
            {
              "h": 1585,
              "w": 1390,
              "x": 0,
              "y": 0
            },
            {
              "h": 1811,
              "w": 906,
              "x": 0,
              "y": 0
            },
            {
              "h": 1811,
              "w": 1390,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1811,
          "width": 1390
        },
        "sizes": {
          "large": {
            "h": 1811,
            "w": 1390
          }
        },
        "type": "photo",
        "url": "https://t.co/c303x9SHYg"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {},
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "article": null
}
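
The raw response above stores `createdAt` in the classic Twitter timestamp layout and the engagement counters as plain integers. A minimal Python sketch of reading those fields could look like the following. The helper names `parse_created_at` and `engagement_rate` are hypothetical, and the "engagement rate" formula is an illustrative assumption, not a metric defined by the API; the sample values are copied from the response above.

```python
from datetime import datetime

# Layout of the "createdAt" field, e.g. "Tue Aug 12 12:07:35 +0000 2025".
# Note: %a and %b are locale-dependent; this assumes the default C/English locale.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value: str) -> datetime:
    """Parse a Twitter-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


def engagement_rate(tweet: dict) -> float:
    """Hypothetical metric: (likes + retweets + replies + quotes) / views."""
    keys = ("likeCount", "retweetCount", "replyCount", "quoteCount")
    interactions = sum(tweet.get(key, 0) for key in keys)
    views = tweet.get("viewCount", 0)
    # Guard against a missing or zero view count.
    return interactions / views if views else 0.0


# Subset of the fields shown in the raw API response above.
sample = {
    "createdAt": "Tue Aug 12 12:07:35 +0000 2025",
    "retweetCount": 18,
    "replyCount": 3,
    "likeCount": 162,
    "quoteCount": 0,
    "viewCount": 12635,
}

created = parse_created_at(sample["createdAt"])
rate = engagement_rate(sample)
print(created.isoformat(), f"{rate:.4f}")
```

The `processed_at` field in the media metadata uses a different, ISO 8601 style layout with no UTC offset, so it would need `datetime.fromisoformat` rather than the format string above.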