@rohanpaul_ai
🇨🇳 Another great Chinese model: OmniHuman-1.5 from ByteDance turns 1 image plus a voice track into expressive avatar video by pairing a System 2 style planner with a System 1 style Diffusion Transformer. It produces coherent motion for over 1 minute, with moving camera and multi-character scenes.

Most avatar models move to the beat of the audio but miss the meaning, so gestures feel generic and emotions feel shallow.

The fix here is a Multimodal LLM planner that listens to the speech and drafts a structured plan describing intent, emotions, beats, and high-level actions, which gives the motion engine clear semantic targets instead of only rhythm.

The motion engine is a Multimodal Diffusion Transformer that fuses the plan with the audio, the single reference image, and optional text prompts, then synthesizes continuous body, face, and head motion that matches both the words and the tone.

A key trick is the Pseudo Last Frame, a synthetic target that summarizes the next expected state, which stabilizes fusion across modalities and keeps motion consistent over long spans.

From just 1 image and speech, the system outputs speaking avatars with synchronized lips, context-aware gestures, and continuous camera movement, and it also supports multi-character interactions without manual choreography.

Reported results show strong lip-sync accuracy, high video quality, natural motion, and close adherence to text prompts, and the same setup works on nonhuman characters too.
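To make the planner stage concrete, here is a purely illustrative sketch of the kind of structured plan the MLLM might draft from the speech. The field names, values, and schema below are my assumptions, not ByteDance's actual format; the point is that the motion engine gets semantic targets (intent, emotion, beats, actions) rather than raw rhythm alone.

# Hypothetical planner output, illustration only
plan = {
    "intent": "persuade the audience, confident tone",
    "emotion_curve": [
        {"t": "0-8s",  "emotion": "calm",         "intensity": 0.3},
        {"t": "8-20s", "emotion": "enthusiastic", "intensity": 0.7},
    ],
    "beats": [2.1, 4.8, 7.5, 11.2],   # audio beat timestamps in seconds
    "actions": [
        {"t": "3s",  "action": "open palm gesture toward camera"},
        {"t": "12s", "action": "lean forward, raise eyebrows"},
    ],
    "camera": "slow push-in, then orbit left",
}

The Diffusion Transformer would then condition on a plan like this together with the audio, the reference image, and any text prompt.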
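And a minimal PyTorch-style sketch of the fusion idea, assuming the conditioning signals (plan tokens, audio features, reference-image embedding, pseudo last frame) are already embedded to a shared width. The single cross-attention block, the layer sizes, and treating the pseudo-last-frame as just one more conditioning token are my simplifications, not the actual MMDiT architecture.

import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """One transformer block: noisy motion latents attend to all conditioning tokens."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.n1, self.n2, self.n3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, motion, cond):
        # motion: (B, T, dim) noisy motion latents; cond: (B, C, dim) conditioning tokens
        q = self.n1(motion)
        x = motion + self.self_attn(q, q, q)[0]
        x = x + self.cross_attn(self.n2(x), cond, cond)[0]   # fuse plan, audio, image, pseudo last frame
        return x + self.mlp(self.n3(x))

B, T, dim = 1, 120, 256                       # 120 motion frames, illustrative width
plan_tokens = torch.randn(B, 16, dim)         # embedded structured plan from the MLLM planner
audio = torch.randn(B, 120, dim)              # per-frame audio features
ref_image = torch.randn(B, 1, dim)            # single reference image embedding
pseudo_last = torch.randn(B, 1, dim)          # synthetic token summarizing the expected end state

cond = torch.cat([plan_tokens, audio, ref_image, pseudo_last], dim=1)
noisy_motion = torch.randn(B, T, dim)

block = FusionBlock(dim)
denoised = block(noisy_motion, cond)
print(denoised.shape)  # torch.Size([1, 120, 256])

In this toy version the pseudo-last-frame token simply sits in the conditioning set so every denoising step sees a summary of where the clip should end up, which is the stabilizing role the post describes.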