Chain-of-thought (CoT) reasoning can dramatically improve LLM performance.
Q: But what *type* of reasoning do LLMs use when performing CoT? Is it genuine reasoning, or is it driven by shallow heuristics like memorization?
A: Both!
🔗 1/n arxiv.org/abs/2407.01687
@RTomMcCoy @cocosci_lab We test LLMs on decoding shift ciphers: simple ciphers in which each letter is shifted forward a fixed distance in the alphabet. E.g., DOG shifted by 1 is EPH.
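In code, a shift cipher is just modular arithmetic over the alphabet. Here's a minimal sketch (my own illustration, not the paper's implementation):

```python
def shift(text: str, k: int) -> str:
    """Shift each letter forward k positions in the alphabet, wrapping around."""
    return "".join(
        chr((ord(c) - ord("A") + k) % 26 + ord("A")) if c.isalpha() else c
        for c in text.upper()
    )

print(shift("DOG", 1))   # EPH (encoding)
print(shift("EPH", -1))  # DOG (decoding = shifting back)
```

Decoding with shift k is just encoding with shift -k, so the same one-liner covers both directions.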
Why shift ciphers? They let us disentangle reasoning from heuristics! (see quoted thread)