@omarsar0
Introducing Colossal-LLaMA-2, a huge release by Colossal-AI. They present an open-source, commercially usable domain-specific LLM solution for building your own large-scale models at a much lower cost. It uses only about 0.0085 trillion (8.5 billion) tokens of data, 15 hours of training, and a training cost of a few hundred dollars. This strategy produced a Chinese LLaMA-2 model that outperforms competitors across multiple evaluation benchmarks.

Lots of new improvements in this release, including:

- vocabulary expansion and model initialization to extend to Chinese while preserving English language capabilities
- a complete data cleaning system and toolkit for selecting higher-quality data used to train the models
- a multi-stage, hierarchical continual pre-training scheme: 1) large-scale pre-training, 2) a Chinese knowledge injection stage, and 3) a relevant knowledge replay stage; this approach ensures the model progresses equally in both Chinese and English abilities
- bucket training to ensure a balanced distribution of data

Personally, the most interesting bit of this release is the focus on, and possibility of, training lightweight domain-specific LLMs in a cost-effective way. This will unlock the ability to fine-tune these foundation models for all kinds of applications that meet specific business needs.

Check out the blog here: https://t.co/D5U2dBjcIx

ColossalAI repo: https://t.co/jatXbyQyby
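To make the vocabulary-expansion idea concrete, here is a minimal sketch in plain Python: new (e.g. Chinese) tokens are appended to an existing vocabulary, and each new embedding row is initialized to the mean of the existing rows — a common heuristic for extending a pretrained model without disturbing what it already knows. The function name and the mean-initialization choice are my assumptions for illustration, not Colossal-AI's actual implementation.

```python
# Illustrative sketch of vocabulary expansion, NOT Colossal-AI's code.
# New tokens get embedding rows initialized from the mean of existing
# rows (a common heuristic; the release may use a different scheme).

def expand_vocab(vocab, embeddings, new_tokens):
    """Append new tokens; init each new row to the mean of existing rows."""
    dim = len(embeddings[0])
    n = len(embeddings)
    mean_row = [sum(row[d] for row in embeddings) / n for d in range(dim)]
    vocab = dict(vocab)                       # copy, don't mutate inputs
    embeddings = [row[:] for row in embeddings]
    for tok in new_tokens:
        if tok not in vocab:                  # skip tokens already present
            vocab[tok] = len(embeddings)
            embeddings.append(mean_row[:])
    return vocab, embeddings

# Tiny example: a 3-token English vocab extended with two Chinese tokens.
vocab = {"hello": 0, "world": 1, "!": 2}
emb = [[1.0, 2.0], [3.0, 4.0], [2.0, 0.0]]
vocab, emb = expand_vocab(vocab, emb, ["你好", "世界"])
print(len(emb))            # 5 rows now
print(emb[vocab["你好"]])   # [2.0, 2.0], the mean of the original rows
```

In a real stack this corresponds to growing the tokenizer and resizing the model's input/output embedding matrices before continual pre-training.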
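The bucket-training point can also be sketched: one plausible reading is that samples from different data sources (buckets) are interleaved round-robin, so every slice of the training stream sees a balanced mix rather than long runs from a single source. This is an assumption about what "bucket training" means here, not the release's actual scheduler.

```python
# Illustrative sketch of bucket-style interleaving, NOT Colossal-AI's
# scheduler: draw one sample from each non-empty bucket per round so
# the training stream stays balanced across data sources.

def bucket_interleave(buckets):
    """Yield samples round-robin across buckets until all are exhausted."""
    iters = [iter(b) for b in buckets]
    while iters:
        alive = []
        for it in iters:
            try:
                yield next(it)
                alive.append(it)      # bucket still has samples left
            except StopIteration:
                pass                  # drop exhausted buckets
        iters = alive

english = ["en_0", "en_1", "en_2"]
chinese = ["zh_0", "zh_1", "zh_2"]
stream = list(bucket_interleave([english, chinese]))
print(stream)  # ['en_0', 'zh_0', 'en_1', 'zh_1', 'en_2', 'zh_2']
```

With unequal bucket sizes, smaller buckets simply run dry and the stream continues from the rest, so no single source ever dominates a contiguous stretch.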