@omarsar0
I already use LLMs for many things like coding, researching, and writing. But one of the most common and time-consuming tasks for me today is reviewing content and code. Regardless of whether the content or code is generated by me or by an LLM, it still goes through a thorough review.

Given the difficulties LLMs have with knowledge-intensive tasks, and their knowledge gaps, I wonder whether there is still a way to automate and scale reviewing efforts. Of all the tasks I perform day to day, this is the one I am least confident LLMs can do well.

For instance, it might be interesting to use RAG or LLM-powered agents (specifically, multiple agents with a human in the loop) to steer a comprehensive review process. RLAIF might also be an interesting approach to draw inspiration from. I haven't seen convincing work that tackles reviewing as a standalone problem, but it could actually be a compelling application of LLMs. I think reviewing is the type of task that will require the best of the components we have today, including a lot of personalization.

I have also managed to build some very effective LLM-powered evaluation systems using prompt engineering. There is a lot we can learn from building better evaluation systems that can transfer to automated reviewing systems.

More to come on this. Stay tuned!
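To make the multi-agent idea a bit more concrete, here is a minimal sketch of one possible shape for it. Everything here is an assumption for illustration: the `call_llm` helper is a stub standing in for any chat-completion API, and the reviewer roles are made up. A real system would add retrieval, personalization, and a human triage step before any findings are acted on.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM API call here. Stubbed so the
    # sketch runs without network access.
    return "Issue: unclear naming. Severity: minor."

# Hypothetical reviewer prompt template (not from any real system).
REVIEWER_PROMPT = (
    "You are a {role} reviewer. Review the draft below and list "
    "concrete issues, each with a severity (minor/major):\n\n{draft}"
)

def multi_agent_review(draft: str, roles=("correctness", "clarity", "style")):
    """Each agent reviews from one perspective; findings are collected
    for a human to triage before any change is applied."""
    findings = []
    for role in roles:
        prompt = REVIEWER_PROMPT.format(role=role, draft=draft)
        findings.append((role, call_llm(prompt)))
    return findings

findings = multi_agent_review("def add(a, b): return a + b")
for role, note in findings:
    print(f"[{role}] {note}")
```

The design choice worth noting is that each agent sees the draft through a narrow role, which tends to surface complementary issues, while the human in the loop remains the final gate.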