1/ GLM-130B outperforms OpenAI's GPT-3 175B and Google's PaLM 540B on key benchmarks.
AND it's open-sourced, which means you can run this model on your own machine, for free.
2/ GLM-130B is instruction-finetuned, leverages Chinchilla scaling laws, and has bells and whistles like 4-bit quantization and bidirectional attention.
With 4-bit quantization, the model can run on a single 80 GB A100, or even on a consumer GPU rig.
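To see why 4-bit weights shrink the footprint so much, here's a minimal NumPy sketch of symmetric (absmax) 4-bit quantization. This is an illustration of the general technique, not GLM-130B's actual quantization kernel; the weight matrix is made up.

```python
import numpy as np

# Stand-in for one weight matrix of a large model (random, for illustration).
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)

def quantize_4bit(w):
    # Symmetric absmax quantization: scale each row so its largest
    # absolute value maps to 7, then round to integers in [-7, 7].
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original weights.
    return q.astype(np.float32) * scale

q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Rounding error is bounded by half a quantization step per row.
print(float(np.abs(w - w_hat).max()))
```

The back-of-the-envelope math: 130B parameters at 0.5 bytes each is roughly 65 GB of weights, which is why the model can squeeze into a single 80 GB card.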
3/ The paper, written by a team at Tsinghua University, is clearly written, and its methodology is well documented: