@btibor91
OpenAI published "Why Language Models Hallucinate", explaining the root causes of AI hallucinations and proposing ways to reduce them

- Language models hallucinate because standard training and evaluation procedures reward guessing over acknowledging uncertainty - most evaluations grade models only on accuracy, so a model that guesses scores higher than one that says "I don't know", and it learns to guess rather than be honest about uncertainty

- Hallucinations originate during pretraining, where models learn by predicting the next word in huge amounts of text with no "true/false" labels attached to individual statements - this makes it hard to distinguish valid statements from invalid ones, especially for arbitrary low-frequency facts (like a pet's birthday) that cannot be predicted from patterns alone and therefore lead to hallucinations

- The researchers conclude that accuracy-based evals need updated scoring that discourages guessing - if the main scoreboards keep rewarding lucky guesses, models will keep learning to guess - and that hallucinations are not inevitable, because language models can abstain when uncertain
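A minimal sketch (not from the post) of the scoring argument: under accuracy-only grading, any guess with a nonzero chance of being right has a higher expected score than abstaining, while a rule that penalizes wrong answers makes abstention optimal below a confidence threshold. The penalty value and function names here are illustrative assumptions, not OpenAI's proposal.

```python
# Illustrative sketch: expected scores under two grading schemes, showing why
# accuracy-only evals push models toward guessing instead of abstaining.

def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score for answering when the model is correct with probability
    p_correct. Correct answers score 1, wrong answers score -wrong_penalty;
    abstaining ("I don't know") always scores 0."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

for p in (0.1, 0.3, 0.5, 0.9):
    acc_only = expected_score(p, wrong_penalty=0.0)   # accuracy-only grading
    penalized = expected_score(p, wrong_penalty=1.0)  # assumed penalty: wrong answers cost a point
    print(f"p={p:.1f}  accuracy-only: {acc_only:+.2f} (always beats abstain=0.00)  "
          f"penalized: {penalized:+.2f} -> {'guess' if penalized > 0 else 'abstain'}")
```

With accuracy-only grading, guessing beats abstaining at every confidence level, so a model optimized against that scoreboard keeps guessing; with a wrong-answer penalty, abstaining wins whenever confidence falls below the break-even threshold (50% in this toy setup).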