@omarsar0
Skeleton-of-Thought: LLMs can do parallel decoding. An interesting prompting strategy that first generates an answer skeleton and then performs parallel API calls to generate the content of each skeleton point. Reports quality improvements in addition to speed-ups of up to 2.39x.… https://t.co/SG6OmLdvUw https://t.co/B9pVGwpsFc
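The two-stage idea can be sketched in a few lines: one sequential call drafts the skeleton, then each point is expanded concurrently. This is a minimal sketch, not the paper's implementation; `call_llm` is a hypothetical stub standing in for a real chat-completion API.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call (assumption, not a real API).
    if prompt.startswith("Write a short numbered skeleton"):
        return "1. Define the problem\n2. Outline the method\n3. Summarize results"
    return "Expanded: " + prompt.splitlines()[-1]

def skeleton_of_thought(question: str) -> str:
    # Stage 1: a single sequential call produces a short outline (the skeleton).
    skeleton = call_llm(f"Write a short numbered skeleton for: {question}")
    points = [p for p in skeleton.splitlines() if p.strip()]
    # Stage 2: expand every skeleton point with parallel API calls.
    with ThreadPoolExecutor() as pool:
        bodies = list(pool.map(
            lambda pt: call_llm(f"Question: {question}\nExpand this point:\n{pt}"),
            points,
        ))
    return "\n\n".join(bodies)

print(skeleton_of_thought("How does parallel decoding speed up generation?"))
```

The speed-up comes from stage 2: the point expansions are independent requests, so wall-clock latency is roughly the skeleton call plus the slowest single expansion rather than the sum of all of them.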