🐦 Twitter Post Details

@iScienceLuvr

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning

"We introduce GFPO (Group Filtered Policy Optimization), which curbs this length explosion by sampling larger groups per problem during training and filtering responses to train on based on two key metrics: (1) response length and (2) token efficiency: reward per token ratio. By sampling more at training time, we teach models to think less at inference time. On the Phi-4-reasoning model, GFPO cuts GRPO's length inflation by 46-71% across challenging STEM and coding benchmarks (AIME 24/25, GPQA, Omni-MATH, LiveCodeBench) while maintaining accuracy. Optimizing for reward per token further increases reductions in length inflation to 71-85%. We also propose Adaptive Difficulty GFPO, which dynamically allocates more training resources to harder problems based on real-time difficulty estimates, improving the balance between computational efficiency and accuracy especially on difficult questions. GFPO demonstrates that increased training-time compute directly translates to reduced test-time compute--a simple yet effective trade-off for efficient reasoning."
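
The filtering step the abstract describes (sample a larger group, then keep only the responses that score best on length or on reward per token) can be sketched roughly as below. This is an illustrative sketch, not the authors' implementation. The function name `gfpo_advantages`, its parameters, the `eps` constant, and the choice to normalize rewards within the retained subset while giving filtered-out responses zero advantage are all assumptions made for the example.

```python
from statistics import mean, pstdev


def gfpo_advantages(rewards, lengths, k, metric="length", eps=1e-6):
    """Hypothetical helper: one advantage per sampled response.

    rewards: scalar reward for each response in the sampled group
    lengths: token count of each response
    k:       number of responses retained for the policy update
    metric:  "length" (keep the shortest) or
             "token_efficiency" (keep the highest reward per token)
    """
    n = len(rewards)
    if n != len(lengths):
        raise ValueError("rewards and lengths must have the same size")
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the group size")

    if metric == "length":
        # Shorter responses rank first.
        order = sorted(range(n), key=lambda i: lengths[i])
    elif metric == "token_efficiency":
        # Higher reward-per-token ranks first; guard against zero length.
        order = sorted(range(n), key=lambda i: -rewards[i] / max(lengths[i], 1))
    else:
        raise ValueError(f"unknown metric: {metric}")

    kept = order[:k]
    kept_rewards = [rewards[i] for i in kept]

    # Assumption: statistics come from the retained subset only.
    mu = mean(kept_rewards)
    sigma = pstdev(kept_rewards)

    # Assumption: filtered-out responses contribute no gradient signal.
    advantages = [0.0] * n
    for i in kept:
        advantages[i] = (rewards[i] - mu) / (sigma + eps)
    return advantages


# Toy group of four responses: two correct (reward 1), two incorrect (reward 0).
rewards = [1.0, 0.0, 1.0, 0.0]
lengths = [100, 50, 400, 300]
adv_by_length = gfpo_advantages(rewards, lengths, k=2, metric="length")
adv_by_efficiency = gfpo_advantages(rewards, lengths, k=3, metric="token_efficiency")
```

In this toy group, filtering by length keeps the two shortest responses regardless of correctness, whereas filtering by reward per token favors responses that are both correct and short. Under these assumptions, the long responses that are dropped receive zero advantage and so are neither reinforced nor penalized.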

📊 Media Metadata

{
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1955955524790575212/media_0.jpg?",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-08-14T16:21:52.971817",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1955955524790575212",
  "url": "https://x.com/iScienceLuvr/status/1955955524790575212",
  "twitterUrl": "https://twitter.com/iScienceLuvr/status/1955955524790575212",
  "text": "Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning\n\n\"We introduce GFPO (Group Filtered Policy Optimization), which curbs  this length explosion by sampling larger groups per problem during  training and filtering responses to train on based on two key metrics:  (1) response length and (2) token efficiency: reward per token ratio. By  sampling more at training time, we teach models to think less at  inference time. On the Phi-4-reasoning model, GFPO cuts GRPO's length  inflation by 46-71% across challenging STEM and coding benchmarks (AIME  24/25, GPQA, Omni-MATH, LiveCodeBench) while maintaining accuracy.  Optimizing for reward per token further increases reductions in length  inflation to 71-85%. We also propose Adaptive Difficulty GFPO, which  dynamically allocates more training resources to harder problems based  on real-time difficulty estimates, improving the balance between  computational efficiency and accuracy especially on difficult questions.  GFPO demonstrates that increased training-time compute directly  translates to reduced test-time compute--a simple yet effective  trade-off for efficient reasoning.\"",
  "source": "Twitter for iPhone",
  "retweetCount": 14,
  "replyCount": 2,
  "likeCount": 94,
  "quoteCount": 1,
  "viewCount": 6570,
  "createdAt": "Thu Aug 14 11:31:51 +0000 2025",
  "lang": "en",
  "bookmarkCount": 63,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1955955524790575212",
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "iScienceLuvr",
    "url": "https://x.com/iScienceLuvr",
    "twitterUrl": "https://twitter.com/iScienceLuvr",
    "id": "441465751",
    "name": "Tanishq Mathew Abraham, Ph.D.",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1913710019729821696/Qge4zx6u_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/441465751/1738204246",
    "description": "",
    "location": "",
    "followers": 79929,
    "following": 1237,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Tue Dec 20 03:45:50 +0000 2011",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 104625,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 2427,
    "statusesCount": 17759,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1952221233648718307"
    ],
    "profile_bio": {
      "description": "CEO @SophontAI |\nPhD at 19 (2023) |\nFounder, ex CEO @MedARC_AI |\nex Research Director Stability AI | \nBiomed. engineer @ 14 |\nTEDx talk➡https://t.co/xPxwKTq6Qb",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "bit.ly/3tpAuan",
              "expanded_url": "https://bit.ly/3tpAuan",
              "indices": [
                136,
                159
              ],
              "url": "https://t.co/xPxwKTq6Qb"
            }
          ],
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                4,
                14
              ],
              "name": "",
              "screen_name": "SophontAI"
            },
            {
              "id_str": "0",
              "indices": [
                52,
                62
              ],
              "name": "",
              "screen_name": "MedARC_AI"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "sophontai.com",
              "expanded_url": "https://sophontai.com",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/uQ936JTZf1"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "allow_download_status": {
          "allow_download": true
        },
        "display_url": "pic.twitter.com/aL37gt5BeF",
        "expanded_url": "https://twitter.com/iScienceLuvr/status/1955955524790575212/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {},
          "orig": {}
        },
        "id_str": "1955955351112867840",
        "indices": [
          282,
          305
        ],
        "media_key": "3_1955955351112867840",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARsk8wdbm2AACgACGyTzL8ua4GwAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABGyTzB1ubYAAKAAIbJPMvy5rgbAAA",
            "media_key": "3_1955955351112867840"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/GyTzB1ubYAAxbog.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 786,
              "w": 1404,
              "x": 0,
              "y": 0
            },
            {
              "h": 1404,
              "w": 1404,
              "x": 0,
              "y": 0
            },
            {
              "h": 1601,
              "w": 1404,
              "x": 0,
              "y": 0
            },
            {
              "h": 1827,
              "w": 914,
              "x": 0,
              "y": 0
            },
            {
              "h": 1827,
              "w": 1404,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1827,
          "width": 1404
        },
        "sizes": {
          "large": {
            "h": 1827,
            "w": 1404
          }
        },
        "type": "photo",
        "url": "https://t.co/aL37gt5BeF"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {},
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "article": null
}