🐦 Twitter Post Details

@_akhaliq

QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

paper page: https://t.co/RKBzkiuDhs

Recent years have witnessed the rapid development of large language models (LLMs). Despite their strong ability in many language-understanding tasks, the heavy computational burden largely restricts the application of LLMs, especially when one needs to deploy them onto edge devices. In this paper, we propose a quantization-aware low-rank adaptation (QA-LoRA) algorithm. The motivation lies in the imbalanced degrees of freedom of quantization and adaptation, and the solution is to use group-wise operators which increase the degrees of freedom of quantization while decreasing those of adaptation. QA-LoRA is easily implemented with a few lines of code, and it equips the original LoRA with two-fold abilities: (i) during fine-tuning, the LLM's weights are quantized (e.g., into INT4) to reduce time and memory usage; (ii) after fine-tuning, the LLM and auxiliary weights are naturally integrated into a quantized model without loss of accuracy. We apply QA-LoRA to the LLaMA and LLaMA2 model families and validate its effectiveness on different fine-tuning datasets and downstream scenarios.
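
The group-wise trick described in the abstract can be sketched in a few lines. Below is a minimal PyTorch illustration, not the authors' implementation: the class name, the fake (simulated) INT4 quantization, and all defaults are assumptions for exposition. The idea it demonstrates is the one the abstract names: the frozen base weight gets one scale/zero point per group of input channels (more quantization freedom), while the LoRA input is average-pooled over those same groups (less adaptation freedom), so the adapter's per-channel contribution is constant within each group and can be folded into the per-group zero points after fine-tuning.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QALoRALinearSketch(nn.Module):
    """Illustrative sketch of the QA-LoRA idea (hypothetical names/API).

    - Base weight: frozen, quantized group-wise along the input dimension
      (simulated INT4 here; a real setup would store packed integers).
    - Adapter: LoRA factor A consumes one average-pooled value per input
      group, so the adapter can later merge into the per-group zero points.
    """

    def __init__(self, in_features, out_features, rank=16, alpha=32, group_size=32):
        super().__init__()
        assert in_features % group_size == 0
        self.group_size = group_size
        self.n_groups = in_features // group_size
        self.scaling = alpha / rank

        # Simulated group-wise INT4 quantization of a random frozen weight.
        w = torch.randn(out_features, in_features)
        wg = w.view(out_features, self.n_groups, group_size)
        lo = wg.amin(-1, keepdim=True)
        hi = wg.amax(-1, keepdim=True)
        scale = (hi - lo).clamp(min=1e-8) / 15.0            # 4 bits -> 16 levels
        q = torch.round((wg - lo) / scale).clamp(0, 15)     # stored integer codes
        self.register_buffer("q", q)                        # (out, groups, gs)
        self.register_buffer("scale", scale)                # (out, groups, 1)
        self.register_buffer("zero", lo)                    # (out, groups, 1)

        # LoRA factors: A sees n_groups pooled inputs instead of in_features.
        self.A = nn.Parameter(torch.randn(self.n_groups, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, out_features))

    def dequant_weight(self):
        wg = self.q * self.scale + self.zero
        return wg.reshape(wg.shape[0], -1)                  # (out, in)

    def forward(self, x):
        base = F.linear(x, self.dequant_weight())
        # Average-pool the input over the same groups used for quantization.
        pooled = x.reshape(*x.shape[:-1], self.n_groups, self.group_size).mean(-1)
        return base + (pooled @ self.A @ self.B) * self.scaling

    @torch.no_grad()
    def merge_into_zeros(self):
        """Fold the adapter into the per-group zero points after fine-tuning.

        Pooling makes the adapter's contribution constant within each group,
        so it is exactly a shift of the dequantized zero point.
        """
        delta = (self.A @ self.B).t() * (self.scaling / self.group_size)
        self.zero += delta.unsqueeze(-1)                    # (out, groups, 1)
        self.A.zero_()
        self.B.zero_()
```

Under these simulated-quantization assumptions, calling `merge_into_zeros()` leaves the forward pass numerically unchanged while zeroing out the auxiliary LoRA weights, which mirrors the abstract's claim that the LLM and auxiliary weights are integrated into a quantized model without loss of accuracy.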

🔧 Raw API Response

{
  "user": {
    "created_at": "2014-04-27T00:20:12.000Z",
    "default_profile_image": false,
    "description": "AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗)\n\ndm for promo",
    "fast_followers_count": 0,
    "favourites_count": 26631,
    "followers_count": 237610,
    "friends_count": 1888,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 3170,
    "location": "subscribe → ",
    "media_count": 13881,
    "name": "AK",
    "normal_followers_count": 237610,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/2465283662/1610997549",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1451191636810092553/kpM5Fe12_normal.jpg",
    "screen_name": "_akhaliq",
    "statuses_count": 21880,
    "translator_type": "none",
    "url": "https://t.co/TbGnXZJwEc",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "2465283662"
  },
  "id": "1706863594917269514",
  "conversation_id": "1706863594917269514",
  "full_text": "QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models\n\npaper page: https://t.co/RKBzkiuDhs\n\nRecently years have witnessed a rapid development of large language models (LLMs). Despite the strong ability in many language-understanding tasks, the heavy computational burden largely restricts the application of LLMs especially when one needs to deploy them onto edge devices. In this paper, we propose a quantization-aware low-rank adaptation (QA-LoRA) algorithm. The motivation lies in the imbalanced degrees of freedom of quantization and adaptation, and the solution is to use group-wise operators which increase the degree of freedom of quantization meanwhile decreasing that of adaptation. QA-LoRA is easily implemented with a few lines of code, and it equips the original LoRA with two-fold abilities: (i) during fine-tuning, the LLM's weights are quantized (e.g., into INT4) to reduce time and memory usage; (ii) after fine-tuning, the LLM and auxiliary weights are naturally integrated into a quantized model without loss of accuracy. We apply QA-LoRA to the LLaMA and LLaMA2 model families and validate its effectiveness in different fine-tuning datasets and downstream scenarios.",
  "reply_count": 1,
  "retweet_count": 70,
  "favorite_count": 292,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [
    {
      "url": "https://t.co/eTdKYoioQA",
      "expanded_url": "https://huggingface.co/papers/2309.14717",
      "display_url": "huggingface.co/papers/2309.14…"
    }
  ],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F6__YKEWkAA-gqa.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/_akhaliq/status/1706863594917269514",
  "created_at": "2023-09-27T02:49:27.000Z",
  "#sort_index": "1706863594917269514",
  "view_count": 46191,
  "quote_count": 4,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://twitter.com/_akhaliq/status/1706863594917269514"
}