🐦 Twitter Post Details

@kwindla

Tiny SOTA model release today: v3 of the Smart Turn semantic VAD model.

Smart Turn is a native audio, open source, open data, open training code model for detecting whether a human has stopped speaking and expects a voice agent to respond.

The model now runs in <60ms on most cloud vCPUs, faster than that on your local CPU, and in <10ms on GPU. Running on CPU makes it essentially free to use this in a voice AI agent.

23 languages, and you can contribute data or data labeling to add a language or improve the model performance in any of the existing languages. This model is a community effort.

We think that for what we built it for, this model benchmarks better than any other model that's currently available. But if you're interested in turn detection, you should also check out excellent recent work from the @krispHQ and @ultravox_dot_ai teams, which have released models that are very good, make somewhat different trade-offs compared to the Smart Turn model, and do better than Smart Turn relative to their respective goals.

Super-fun things happening all the time these days in voice AI! Anybody can use the Smart Turn model in any deployment. It has no license restrictions and is completely open source. It's bundled into the upcoming @pipecat_ai 0.0.85 release. And, of course, it's available on the Pipecat Cloud voice agent hosting platform.

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1966359269080707363/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1966359269080707363/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-09-18T13:52:33.531798",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1966359269080707363",
  "url": "https://x.com/kwindla/status/1966359269080707363",
  "twitterUrl": "https://twitter.com/kwindla/status/1966359269080707363",
  "text": "Tiny SOTA model release today: v3 of the Smart Turn semantic VAD model.\n\nSmart Turn is a native audio, open source, open data, open training code model for detecting whether a human has stopped speaking and expects a voice agent to respond.\n\nThe model now runs in <60ms on most cloud vCPUs, faster than that on your local CPU, and in <10ms on GPU. Running on CPU makes it essentially free to use this in a voice AI agent.\n\n23 languages, and you can contribute data or data labeling to add a language or improve the model performance in any of the existing language. This model is a community effort.\n\nWe think that for what we built it for, this model benchmarks better than any other model that's currently available. But if you're interested in turn detection, you should also check out excellent recent work from the @krispHQ and @ultravox_dot_ai, teams, which have released models that are very good, make somewhat different trade-offs compared to the Smart Turn model, and do better than Smart Turn relative to their respective goals.\n\nSuper-fun things happening all the time these days in voice AI!  Anybody can use the Smart Turn model in any deployment. It has no license restrictions and is completely open source. It's bundled into the upcoming\n@pipecat_ai 0.0.85 release. And, of course, it's available on the Pipecat Cloud voice agent hosting platform.",
  "source": "Twitter for iPhone",
  "retweetCount": 22,
  "replyCount": 11,
  "likeCount": 151,
  "quoteCount": 2,
  "viewCount": 7401,
  "createdAt": "Fri Sep 12 04:32:37 +0000 2025",
  "lang": "en",
  "bookmarkCount": 119,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1966359269080707363",
  "displayTextRange": [
    0,
    281
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "kwindla",
    "url": "https://x.com/kwindla",
    "twitterUrl": "https://twitter.com/kwindla",
    "id": "16375739",
    "name": "kwindla",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1790772534914551808/YpwkVUIl_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/16375739/1502226088",
    "description": "",
    "location": "San Francisco, CA",
    "followers": 10761,
    "following": 3834,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Sat Sep 20 07:14:14 +0000 2008",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 7283,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 1187,
    "statusesCount": 5423,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {
      "label": {
        "badge": {
          "url": "https://pbs.twimg.com/profile_images/1855992730360713216/_x37w4M7_bigger.jpg"
        },
        "description": "Daily",
        "url": {
          "url": "https://twitter.com/trydaily",
          "url_type": "DeepLink"
        },
        "user_label_type": "BusinessLabel",
        "user_label_display_type": "Badge"
      }
    },
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1901762329467035952"
    ],
    "profile_bio": {
      "description": "Infrastructure and developer tools for real-time voice, video, and AI. @trydaily // ᓚᘏᗢ // @pipecat_ai",
      "entities": {
        "description": {
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                71,
                80
              ],
              "name": "",
              "screen_name": "trydaily"
            },
            {
              "id_str": "0",
              "indices": [
                91,
                102
              ],
              "name": "",
              "screen_name": "pipecat_ai"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "machine-theory.com",
              "expanded_url": "https://machine-theory.com/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/plyseTkcW0"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/jTBtF3oqpl",
        "expanded_url": "https://twitter.com/kwindla/status/1966359269080707363/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {},
          "orig": {}
        },
        "id_str": "1966359055368335364",
        "indices": [
          282,
          305
        ],
        "media_key": "3_1966359055368335364",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARtJ6SSEGjAECgACG0npVkZaMSMAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABG0npJIQaMAQKAAIbSelWRloxIwAA",
            "media_key": "3_1966359055368335364"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/G0npJIQaMAQMq_7.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 885,
              "w": 1580,
              "x": 0,
              "y": 0
            },
            {
              "h": 1300,
              "w": 1300,
              "x": 0,
              "y": 0
            },
            {
              "h": 1300,
              "w": 1140,
              "x": 0,
              "y": 0
            },
            {
              "h": 1300,
              "w": 650,
              "x": 0,
              "y": 0
            },
            {
              "h": 1300,
              "w": 1580,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1300,
          "width": 1580
        },
        "sizes": {
          "large": {
            "h": 1300,
            "w": 1580
          }
        },
        "type": "photo",
        "url": "https://t.co/jTBtF3oqpl"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "user_mentions": [
      {
        "id_str": "986981778903764994",
        "indices": [
          820,
          828
        ],
        "name": "Krisp",
        "screen_name": "krispHQ"
      },
      {
        "id_str": "1567287105747034114",
        "indices": [
          833,
          849
        ],
        "name": "Ultravox AI",
        "screen_name": "ultravox_dot_ai"
      },
      {
        "id_str": "1789730809940938752",
        "indices": [
          1255,
          1266
        ],
        "name": "Pipecat AI",
        "screen_name": "pipecat_ai"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}
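
For readers working with the raw response above, here is a minimal Python sketch of how a few of its fields might be consumed. It parses the `createdAt` timestamp and computes a rough engagement rate from the counters. The field names and values are copied from the response. The helper names (`parse_created_at`, `engagement_rate`) and the definition of engagement rate (retweets, replies, likes, and quotes divided by views, bookmarks excluded) are assumptions made for illustration, not part of any API.

```python
from datetime import datetime

# Field values copied from the raw API response above.
tweet = {
    "createdAt": "Fri Sep 12 04:32:37 +0000 2025",
    "retweetCount": 22,
    "replyCount": 11,
    "likeCount": 151,
    "quoteCount": 2,
    "viewCount": 7401,
}

# Format string matching the createdAt layout shown in the response.
# %a and %b are locale-dependent; this assumes English day/month names.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value: str) -> datetime:
    """Hypothetical helper: turn the createdAt string into an aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


def engagement_rate(t: dict) -> float:
    """Hypothetical metric: public interactions per view (bookmarks excluded)."""
    interactions = (
        t["retweetCount"] + t["replyCount"] + t["likeCount"] + t["quoteCount"]
    )
    views = t["viewCount"]
    # Guard against a zero view count to avoid division by zero.
    return interactions / views if views else 0.0


created = parse_created_at(tweet["createdAt"])
rate = engagement_rate(tweet)

print(created.isoformat())
print(f"{rate:.2%}")
```

Because the timestamp carries an explicit `+0000` offset, the parsed value is timezone-aware and can be compared directly with other aware datetimes, such as the pipeline's `processed_at` value once a timezone is attached to it.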