Your curated collection of saved posts and media
What is SFT data and what role does it play in state-of-the-art LLMs? Supervised finetuning (SFT), in the context of RLHF, means further tuning an initial language model on demonstration data. At Surge AI, we provide SFT data for top LLM teams to finetune their LLMs. Here is what we have observed: SFT data typically consists of prompts paired with in-depth responses written by human annotators, demonstrating how the model should respond to each prompt. Specifically, you take a set of commands and obtain a human-written response for each. The SFT training dataset consists of <prompt, ideal generation> pairs used to finetune the pre-trained LLM to output human-like responses. So let's say you are aligning an LLM-powered dialogue system: you then need to collect dialogue-style instruction/response data that caters to that use case. Similarly, as shown in the figure, if you want high-quality code generation capabilities, you can provide instructions plus written responses as part of the SFT data. This yields the first important component for training an RLHF LLM, also referred to as the supervised policy. But why go through all this when training RLHF LLMs? The core idea of SFT is to provide a high-quality initialization for the RLHF process. It's widely applied by some of the most advanced closed and open-source LLMs. To make this work, you need to collect lots of demonstration data, and the challenge is collecting high-quality, diverse demonstration data at scale. SFT data is written by different annotators and can carry a lot of noise, since response quality and style vary from annotator to annotator. Controlling for this is key. According to reported insights, you will need to collect thousands of examples to tune a high-quality LLM. SFT data helps improve target areas so you can steer the LLM toward your needs. We can help with your SFT data needs!
If you need help with collecting high-quality preference or SFT data, reach out to our team here: https://t.co/OSm4aHIOP6
Making LLMs reliable is a tough task, and it is where a lot of LLM research and development work is focused. Let's take a look at how LLMs are made reliable: At Surge AI, we work with the top AI companies to improve LLM reliability. This effort is essential to enable wider applicability, including in higher-stakes domains. Reliability is not only about getting models to output what users want in terms of specifics and quality but also about ensuring that the model produces no unwanted output (e.g., toxic content). A lot of the current efforts to increase reliability rely on ad-hoc approaches and prompt engineering. More recently, there have been efforts to develop a more systematic framework for improving reliability while training models. This has led to a lot of interest in red teaming. Red teaming deals with identifying risks in LLMs through adversarial prompting. It has been applied not only to general-purpose LLMs like Claude and ChatGPT but also to more recent code LLMs like Code Llama. The challenge with red teaming is that, if not done right, it can lead to LLMs over-refusing, which makes for a bad user experience. In addition, red teaming requires deep expertise in working with LLMs. We deeply believe that comprehensive red teaming is critical to making LLMs safer, more useful, and more reliable. But you don't need to hear this from us. Many large LLM companies have also publicly expressed huge interest in red teaming. If you are looking for deep expertise in training LLMs and red teaming, reach out to learn how our world-class team can help: https://t.co/iFBrffKYKT
Congrats Llama 3! Frontier LLM developers know the only human data that brings them to the top: https://t.co/3GcEWMpugq
Olympics? Forget it. There's a more exciting race going on. Congrats to all our friends at Google! https://t.co/amLWRxCIjZ
Small, focused teams can achieve incredible things. Very proud of what we've built! https://t.co/f1lIkCYOkG https://t.co/7h8jmN8Ckw
Nice words from @timoreilly! https://t.co/U6TRiV2TND
open source platform for logs, metrics, and traces https://t.co/TCvn2wYXSe
Open source alternative to AWS. https://t.co/bsMbIzSOZ8
Well, there just goes a whole lotta votes… https://t.co/L9NfzathuG
Wow. Ok. https://t.co/zgO725MAba
Drawing day - Get your tickets! Go to https://t.co/BbJjbIA21S for a chance at #HALF of a projected $14,500. Drawing tonight at 6. #ShirleysWay Lic#Org2527 Watch LIVE on our Joker's Wild LIVE FB page https://t.co/mVbUXWDQXG
2 days left to get tickets at https://t.co/BAqsFBIKgV. You could win #HALF of a projected $24,000 in prize money. #ShirleysWay Lic#Org2527 Watch LIVE on our Wheel of Cash LIVE FB page https://t.co/zC7JWM1Rnk
1 day left to get tickets at https://t.co/BAqsFBIKgV. You could win #HALF of a projected $24,000 in prize money. #ShirleysWay Lic#Org2527 Watch LIVE on our Wheel of Cash LIVE FB page https://t.co/3lvN7xSJ6J
Drawing day! Get your tickets at https://t.co/BAqsFBIKgV. You could win #HALF of a projected $24,000 in prize money. Drawing tonight at 5! #ShirleysWay Lic#Org2527 Watch LIVE on our Wheel of Cash LIVE FB page https://t.co/u5ockA9L1i
Drawing Day - Get your tickets online at https://t.co/Fff8vgqRdJ . You could win #HALF of a projected $52,000. Drawing tonight at 8:30! Lic#Org2527 We will be at South End BBQ Watch LIVE on our Queen of Hearts LIVE FB page https://t.co/paY62Ss678
Drawing Day - Get your tickets online at https://t.co/Fff8vgqRdJ . You could win #HALF of a projected $52,000. Drawing tonight at 8:30! Lic#Org2527 We will be at South End BBQ Watch LIVE on our Queen of Hearts LIVE FB page https://t.co/YNJiBVIaeN
Drawing day - Get your tickets! Go to https://t.co/BbJjbIA21S for a chance at #HALF of a projected $16,000. Drawing tonight at 6. #ShirleysWay Lic#Org2527 Watch LIVE on our Joker's Wild LIVE FB page https://t.co/7ULQSaX4Ai
Nader Summer camp: I corralled the kids from bring your child to work day and taught them how to vibe code on Replit! They designed and built a video game while their parents had "meetings" https://t.co/lqeaKmhZcc
All in a day's work https://t.co/sWURFlRz8P
Great week for AI. Couple of OSS models, one that can run on edge, and GPT 5. Gonna be a busy weekend https://t.co/shimU9HMTC
I found the most dedicated employee in the NVIDIA parking garage https://t.co/0p2Q96qmDe
I'm 100% in favor of purpose-built robots. While AI could enable humanoids that do lots of things imperfectly, specialized form factors are the only path to a sustainable robotics business. Thanks to everyone who turned out to see Dusty at the Cerebral Valley AI Summit. https://t.co/fM8Hn7sgPw
Crypto unlocks vocation - aka doing what you like and are good at: @BAXUSco https://t.co/MrXoaUuJlD https://t.co/zyP6q2RKqj
and if you want to get a bit 'meta': https://t.co/0JamV4pIMJ
History, not as a litany of facts, but as a window from the past into the present and plausible futures. Braudel and the Annales school are where it's at. Related / an intro for folks: https://t.co/J2LcKazs5K
This is amazing. I'm learning to accept that I fundamentally just don't care that much about battles, diplomatic treaties etc. But if I can read about birth rates or why people started drinking coffee or whatever I'm transfixed https://t.co/hEhJEBbdLY
It happens in better families https://t.co/DMUqPXSnel
Anthropic Tool Caching in AI SDK v5 https://t.co/G3eaQpncmL
New in-product feedback page. Simple, clean, effective. Goes directly to a Slack channel we read. https://t.co/JDzDPWkGFz
Productive couple of days https://t.co/Ox5DRUAdkh

This agent has been actively managing my Base AI portfolio. Forgot about it. Checked in today. It's up 19%. Good stimmy https://t.co/zPJOtzt0gO
Rebalanced: ETH remains the largest position. DEGEN fully exited. Increased GIZA, KTA, VIRTUAL on strong price trends. Trimmed weaker tokens. USDbC set at 20% for capital defense. Monitoring for further momentum among top ranked tokens.
It's on August 16, my first boxing fight. How to create a market on @Polymarket? https://t.co/ACJ5AhT04j
Breaking: GPT 5 still can't draw proper charts. Accenture consultants safe for a few more months https://t.co/tJWGjRcGI2