Stop Scoring. Start Ranking.
If you’ve asked an LLM to rate a document on a scale of 1 to 10, you might have noticed something strange. The scores cluster together, most responses land between 3 and 7, and the sorted list you end up with is nearly useless. LLMs