@rasbt
Llama 2 is awesome; however, coding tasks were not its strong suit. (See HumanEval, a coding-related evaluation benchmark from the paper "Evaluating Large Language Models Trained on Code.") The new 34B CodeLlama model is twice as good as the original 70B Llama 2 model and closes the gap to (the much larger) GPT-4. Excited to read the 47-page CodeLlama paper (via https://t.co/ROW8qUcTXw) in more detail over the next couple of days.