🐦 Twitter Post Details


@_akhaliq

Presto!

Distilling Steps and Layers for Accelerating Music Generation

Despite advances in diffusion-based text-to-music (TTM) methods, efficient, high-quality generation remains a challenge. We introduce Presto!, an approach to inference acceleration for score-based diffusion transformers via reducing both sampling steps and cost per step. To reduce steps, we develop a new score-based distribution matching distillation (DMD) method for the EDM-family of diffusion models, the first GAN-based distillation method for TTM. To reduce the cost per step, we develop a simple, but powerful improvement to a recent layer distillation method that improves learning via better preserving hidden state variance. Finally, we combine our step and layer distillation methods together for a dual-faceted approach. We evaluate our step and layer distillation methods independently and show each yield best-in-class performance. Our combined distillation method can generate high-quality outputs with improved diversity, accelerating our base model by 10-18x (230/435ms latency for 32 second mono/stereo 44.1kHz, 15x faster than comparable SOTA) -- the fastest high-quality TTM to our knowledge.
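A quick sanity check on the latency figures quoted above: 230 ms (mono) and 435 ms (stereo) to generate 32 seconds of 44.1 kHz audio imply the real-time factors sketched below. The helper name and constants are illustrative, not from the paper; only the latency and duration numbers come from the abstract.

```python
# Worked check of the abstract's latency claims: 230/435 ms to generate
# 32 seconds of mono/stereo 44.1 kHz audio.
AUDIO_SECONDS = 32.0
LATENCY_MONO_S = 0.230    # 230 ms, mono
LATENCY_STEREO_S = 0.435  # 435 ms, stereo

def realtime_factor(audio_seconds: float, latency_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / latency_seconds

rtf_mono = realtime_factor(AUDIO_SECONDS, LATENCY_MONO_S)      # ~139x real time
rtf_stereo = realtime_factor(AUDIO_SECONDS, LATENCY_STEREO_S)  # ~74x real time
print(f"mono: {rtf_mono:.0f}x real time, stereo: {rtf_stereo:.0f}x real time")
```

So even the slower stereo path generates audio roughly 74 times faster than playback, consistent with the "fastest high-quality TTM" claim being about wall-clock latency rather than throughput alone.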

🔧 Raw API Response

{
  "user": {
    "created_at": "2014-04-27T00:20:12.000Z",
    "default_profile_image": false,
    "description": "AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗)\n\ndm for promo, submit papers here: https://t.co/UzmYN5XOCi",
    "fast_followers_count": 0,
    "favourites_count": 34384,
    "followers_count": 357492,
    "friends_count": 2858,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 4274,
    "location": "",
    "media_count": 17112,
    "name": "AK",
    "normal_followers_count": 357492,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/2465283662/1610997549",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1451191636810092553/kpM5Fe12_normal.jpg",
    "screen_name": "_akhaliq",
    "statuses_count": 37824,
    "translator_type": "none",
    "url": "https://t.co/q2Qoey80Gx",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "2465283662"
  },
  "id": "1843501455606518125",
  "conversation_id": "1843501455606518125",
  "full_text": "Presto!\n\nDistilling Steps and Layers for Accelerating Music Generation\n\nDespite advances in diffusion-based text-to-music (TTM) methods, efficient, high-quality generation remains a challenge. We introduce Presto!, an approach to inference acceleration for score-based diffusion transformers via reducing both sampling steps and cost per step. To reduce steps, we develop a new score-based distribution matching distillation (DMD) method for the EDM-family of diffusion models, the first GAN-based distillation method for TTM. To reduce the cost per step, we develop a simple, but powerful improvement to a recent layer distillation method that improves learning via better preserving hidden state variance. Finally, we combine our step and layer distillation methods together for a dual-faceted approach. We evaluate our step and layer distillation methods independently and show each yield best-in-class performance. Our combined distillation method can generate high-quality outputs with improved diversity, accelerating our base model by 10-18x (230/435ms latency for 32 second mono/stereo 44.1kHz, 15x faster than comparable SOTA) -- the fastest high-quality TTM to our knowledge.",
  "reply_count": 7,
  "retweet_count": 29,
  "favorite_count": 127,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/ext_tw_video_thumb/1843501417018892288/pu/img/zhJJayNS6QcDKSvH.jpg",
      "type": "video",
      "video_url": "https://video.twimg.com/ext_tw_video/1843501417018892288/pu/vid/avc1/852x480/FQ6032Q_2cCzgkDX.mp4?tag=12"
    }
  ],
  "url": "https://twitter.com/_akhaliq/status/1843501455606518125",
  "created_at": "2024-10-08T03:59:32.000Z",
  "#sort_index": "1843501455606518125",
  "view_count": 30123,
  "quote_count": 2,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/_akhaliq/status/1843501455606518125"
}
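For working with a response like the one above, a minimal sketch of extracting the display fields follows. The `payload` literal is a hand-trimmed excerpt of the JSON shown (not the full response), and the field names used are exactly those that appear in the raw payload.

```python
import json

# Trimmed excerpt of the raw API response above; a real consumer would
# json.loads() the full payload instead of this inline literal.
payload = json.loads("""
{
  "user": {"name": "AK", "screen_name": "_akhaliq", "followers_count": 357492},
  "id": "1843501455606518125",
  "full_text": "Presto!\\n\\nDistilling Steps and Layers for Accelerating Music Generation",
  "favorite_count": 127,
  "retweet_count": 29,
  "media": [{"type": "video"}],
  "url": "https://twitter.com/_akhaliq/status/1843501455606518125"
}
""")

# Pull out the fields an enriched-post viewer would render.
author = f"@{payload['user']['screen_name']} ({payload['user']['name']})"
headline = payload["full_text"].split("\n\n", 1)[0]  # first paragraph of the tweet
stats = f"{payload['favorite_count']} likes, {payload['retweet_count']} retweets"
has_video = any(m["type"] == "video" for m in payload.get("media", []))

print(author)    # @_akhaliq (AK)
print(headline)  # Presto!
print(stats)     # 127 likes, 29 retweets
```

Note that `full_text` preserves the tweet's `\n\n` paragraph breaks, which is why splitting on them recovers the one-word headline.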