@omarsar0
I already use LLMs for many things like coding, researching, and writing. But one of the most common and time-consuming tasks for me today is reviewing content and code. Regardless of whether the content or code is generated by me or by an LLM, it still goes through a thorough review.

Given the difficulties LLMs have with knowledge-intensive tasks, and their knowledge gaps, I wonder whether there is still a way to automate and scale reviewing efforts. Of all the tasks I perform day to day, this is the one I am least confident LLMs can do well.

For instance, it might be interesting to use RAG or LLM-powered agents (specifically, multiple agents with a human in the loop) to steer a comprehensive review process. RLAIF might also be an interesting approach to draw inspiration from. I haven't seen convincing work that tackles reviewing as a standalone problem, but it could actually be a compelling application of LLMs. I think reviewing is the type of task that will require the best of the components we have today, including a lot of personalization.

I have also managed to build some very effective LLM-powered evaluation systems using prompt engineering. There is a lot we can learn from building better evaluation systems that can transfer to automated reviewing systems.

More to come on this. Stay tuned!
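To make the multi-agent idea a bit more concrete, here is a minimal sketch of one possible shape for it. Everything here is an assumption for illustration: the `call_llm` helper is a stub standing in for any chat-completion API, and the reviewer roles are made up. A real system would add retrieval, personalization, and a human triage step before any findings are acted on.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real LLM API call here. Stubbed so the
    # sketch runs without network access.
    return "Issue: unclear naming. Severity: minor."

# Hypothetical reviewer prompt template (not from any real system).
REVIEWER_PROMPT = (
    "You are a {role} reviewer. Review the draft below and list "
    "concrete issues, each with a severity (minor/major):\n\n{draft}"
)

def multi_agent_review(draft: str, roles=("correctness", "clarity", "style")):
    """Each agent reviews from one perspective; findings are collected
    for a human to triage before any change is applied."""
    findings = []
    for role in roles:
        prompt = REVIEWER_PROMPT.format(role=role, draft=draft)
        findings.append((role, call_llm(prompt)))
    return findings

findings = multi_agent_review("def add(a, b): return a + b")
for role, note in findings:
    print(f"[{role}] {note}")
```

The design choice worth noting is that each agent sees the draft through a narrow role, which tends to surface complementary issues, while the human in the loop remains the final gate.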