#ACL2023NLP
Do causal language models (CLMs) yield representations with good isotropy and discrimination?
The answer is not always! To address the issue, our ACL2023 paper (arxiv.org/pdf/2210.01185…) proposes ContraCLM.
Joint work with @DejiaoZhang and @nihalj_
We show that representations from CodeGen models (350M to 16B) pretrained on source code, as well as from text-based CLMs smaller than GPT2-Large, suffer from anisotropy and poor discrimination.
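Anisotropy is commonly quantified as the average pairwise cosine similarity between token representations: values near 1 mean the embeddings collapse into a narrow cone of the space. Below is a minimal sketch of that measurement, assuming a HuggingFace checkpoint; the model name and sentences are illustrative choices, not the paper's exact setup.

```python
# Sketch: estimate anisotropy of a CLM's representations as the average
# pairwise cosine similarity between token hidden states (assumed metric).
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "gpt2"  # illustrative causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

sentences = [
    "def add(a, b): return a + b",
    "The quick brown fox jumps over the lazy dog.",
    "Contrastive learning encourages more uniform representations.",
]

with torch.no_grad():
    reps = []
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state.squeeze(0)  # (seq_len, dim)
        reps.append(hidden)
    all_tokens = torch.cat(reps, dim=0)  # pool token vectors from all sentences

# Cosine similarity between every pair of token representations.
normed = torch.nn.functional.normalize(all_tokens, dim=-1)
sim = normed @ normed.T
n = sim.size(0)
off_diag = sim[~torch.eye(n, dtype=torch.bool)]  # drop self-similarities
print(f"Avg pairwise cosine similarity (closer to 1 = more anisotropic): "
      f"{off_diag.mean().item():.3f}")
```

A low average similarity (close to 0) would indicate a more isotropic, better-spread representation space; the thread's claim is that many CLMs score far from that.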