🐦 Twitter Post Details

@HelloSurgeAI

What is SFT data, and what role does it play in state-of-the-art LLMs?

Supervised finetuning (SFT), in the context of RLHF, means further tuning an initial language model on demonstration data.

At Surge AI, we provide SFT data that top LLM teams use to finetune their models. Here is what we have observed:

SFT data typically involves collecting demonstrations: prompts plus in-depth responses written by human annotators, showing how the model should respond to each prompt. Concretely, you take a set of prompts and obtain a human-written response for each.

The SFT training dataset consists of <prompt, ideal generation> pairs used to finetune the pre-trained LLM to produce human-like responses. If you are aligning an LLM-powered dialogue system, you need to collect dialogue-style instruction/response data for that use case. Similarly, as shown in the figure, if you want high-quality code generation, you can include coding instructions with written responses in the SFT data.

This yields the first important component for training an RLHF LLM, also referred to as the supervised policy. But why go through all this when training RLHF LLMs? The core idea of SFT is to provide a high-quality initialization for the RLHF process, and it is widely used by the most advanced closed and open-source LLMs.

To make this work you need a lot of demonstration data, and the challenge is collecting high-quality, diverse demonstrations at scale. Because SFT data is written by many different annotators, it can carry noise: response quality and style vary from annotator to annotator. Controlling for this is key.

According to reported results, you will need thousands of examples to tune a high-quality LLM. SFT data improves targeted capabilities, letting you steer the LLM toward your needs.

We can help with your SFT data needs! If you need help collecting high-quality preference or SFT data, reach out to our team here: https://t.co/OSm4aHIOP6
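
To make the <prompt, ideal generation> format concrete, here is a minimal SFT sketch in Python using the Hugging Face transformers library. The model name ("gpt2"), the example pairs, and the hyperparameters are illustrative placeholders rather than Surge AI's actual setup; a real run would use a larger model and thousands of curated demonstrations.

# Minimal sketch of supervised finetuning (SFT) on <prompt, ideal generation> pairs.
# Model, data, and hyperparameters are placeholders, not Surge AI's production setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Demonstration data: prompts paired with human-written ideal responses.
sft_pairs = [
    {"prompt": "Explain what a hash map is in one paragraph.",
     "response": "A hash map stores key-value pairs and uses a hash function to ..."},
    {"prompt": "Write a Python function that reverses a string.",
     "response": "def reverse_string(s):\n    return s[::-1]"},
]

model_name = "gpt2"  # placeholder; any causal LM follows the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

def encode(pair):
    # Concatenate prompt and response into one training sequence; the model
    # learns to continue the prompt with the demonstrated response.
    text = pair["prompt"] + "\n" + pair["response"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=512, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(3):
    for pair in sft_pairs:
        batch = encode(pair)
        # Standard causal-LM objective: labels are the input ids, shifted
        # internally by the model to predict the next token.
        out = model(input_ids=batch["input_ids"],
                    attention_mask=batch["attention_mask"],
                    labels=batch["input_ids"])
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

In practice, teams batch and pack many such pairs per step and often mask the loss on the prompt tokens so the model is only trained on the response; the loop above keeps things minimal to show the shape of the data and the objective.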

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1691805661317795965/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1691805661317795965/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "downloaded_to_supabase": true,
  "processed_at": "2025-08-14T07:00:00Z"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1691805661317795965",
  "url": "https://x.com/HelloSurgeAI/status/1691805661317795965",
  "twitterUrl": "https://twitter.com/HelloSurgeAI/status/1691805661317795965",
  "text": "What is SFT data and what role does it play in state-of-the-art LLMs?\n\nSupervised finetuning (SFT) in the context of RLHF deals with further tuning an initial language model using demonstration data.\n\nAt Surge AI, we provide SFT data for top LLM teams to finetune their LLMs. Here is what we have observed:\n\nSFT data typically involves collecting demonstration data including prompts and in-depth responses written by human annotators demonstrating how the model should respond to the prompt. Specifically, You take a set of commands and obtain human-written responses for each.\n\nThe SFT training dataset consists of <prompt, ideal generation> pairs used to finetune the pre-trained LLM to output human-like responses. So let’s say you are aligning an LLM-powered dialogue system then you need to collect dialogue-style instructions/responses data that cater to that use case. Similarly, as shown in the figure, if you want high-quality code generation capabilities you can also provide instruction + written responses as part of the SFT data.\n\nThis leads to the first important component, also referred to as supervised policy, for training an RLHF LLM. But why go through all this process when training RLHF LLMs? The core idea of SFT is to provide a high-quality initialization for the RLHF process. It’s widely applied by some of the most advanced closed and open-sourced LLMs.\n\nTo make this work, you need to collect lots of demonstration data but the challenge is collecting high-quality and diverse demonstration data at scale. SFT data can be written by different annotators and can incorporate a lot of noise as response quality and style can vary from annotator to annotator. Controlling for this is key.\n\nAccording to reported insights, you will need to collect thousands of examples to ensure you are tuning a high-quality LLM. SFT data helps to improve target areas that allow steering the LLM better to your needs.\n\nWe can help with your SFT data needs! If you need help with collecting high-quality preference or SFT data, reach out to our team here: https://t.co/OSm4aHIOP6",
  "source": "Twitter for iPhone",
  "retweetCount": 2,
  "replyCount": 0,
  "likeCount": 32,
  "quoteCount": 0,
  "viewCount": 4410,
  "createdAt": "Wed Aug 16 13:34:36 +0000 2023",
  "lang": "en",
  "bookmarkCount": 12,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1691805661317795965",
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "HelloSurgeAI",
    "url": "https://x.com/HelloSurgeAI",
    "twitterUrl": "https://twitter.com/HelloSurgeAI",
    "id": "1267866160894222343",
    "name": "Surge AI",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1942208543618220032/tBPg9A4s_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/1267866160894222343/1751563043",
    "description": "",
    "location": "",
    "followers": 5395,
    "following": 142,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Tue Jun 02 17:10:41 +0000 2020",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 249,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 183,
    "statusesCount": 613,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1681343766123143168"
    ],
    "profile_bio": {
      "description": "Human data for AGI. Our mission: to raise AGI with the richness of human intelligence — curious, witty, imaginative, and full of unexpected brilliance.",
      "entities": {
        "description": {},
        "url": {
          "urls": [
            {
              "display_url": "surgehq.ai",
              "expanded_url": "https://www.surgehq.ai",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/6bGF7OxrIX"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/tDbYXItL7d",
        "expanded_url": "https://twitter.com/HelloSurgeAI/status/1691805661317795965/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 67,
                "w": 67,
                "x": 32,
                "y": 487
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 67,
                "w": 67,
                "x": 32,
                "y": 487
              }
            ]
          }
        },
        "id_str": "1691797209107333120",
        "indices": [
          282,
          305
        ],
        "media_key": "3_1691797209107333120",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARd6eJkg1wAACgACF3qASQ/XkH0AAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABF3p4mSDXAAAKAAIXeoBJD9eQfQAA",
            "media_key": "3_1691797209107333120"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/F3p4mSDXAAABiQk.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 650,
              "w": 1160,
              "x": 0,
              "y": 0
            },
            {
              "h": 712,
              "w": 712,
              "x": 79,
              "y": 0
            },
            {
              "h": 712,
              "w": 625,
              "x": 123,
              "y": 0
            },
            {
              "h": 712,
              "w": 356,
              "x": 257,
              "y": 0
            },
            {
              "h": 712,
              "w": 1160,
              "x": 0,
              "y": 0
            }
          ],
          "height": 712,
          "width": 1160
        },
        "sizes": {
          "large": {
            "h": 712,
            "w": 1160
          }
        },
        "type": "photo",
        "url": "https://t.co/tDbYXItL7d"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "surgehq.ai/rlhf",
        "expanded_url": "https://surgehq.ai/rlhf",
        "indices": [
          2066,
          2089
        ],
        "url": "https://t.co/OSm4aHIOP6"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "article": null
}