We selected the subset of datasets (the top 11 rows in Figure 2, below) from The Pile that we found to be of the highest relative quality. Then, following an approach similar to that used to generate Pile-CC, we downloaded and filtered two recent Common Crawl (CC) snapshots.
We based our evaluation setting on the open-source project lm-evaluation-harness and made task-specific changes as appropriate to align settings more closely with prior work. We evaluated MT-NLG in zero-, one-, and few-shot settings without searching for the optimal number of shots.
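To make the zero-, one-, and few-shot distinction concrete, here is a minimal sketch of how a k-shot prompt could be assembled. The `build_prompt` helper and the `Q:`/`A:` template are hypothetical choices for illustration only; they are not the lm-evaluation-harness API and not necessarily the format used for MT-NLG.

```python
from typing import List, Tuple


def build_prompt(examples: List[Tuple[str, str]], query: str, k: int) -> str:
    """Prepend the first k solved examples (the "shots") to an unsolved query.

    k = 0 gives a zero-shot prompt, k = 1 a one-shot prompt, and so on.
    The Q:/A: layout is an assumed template, not one taken from the source.
    """
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of examples")
    # Each shot is a fully worked question/answer pair.
    shots = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    # The final block leaves the answer open for the model to complete.
    shots.append(f"Q: {query}\nA:")
    return "\n\n".join(shots)


# Hypothetical demonstration data.
EXAMPLES = [("2 + 2", "4"), ("3 + 5", "8"), ("7 + 1", "8")]

zero_shot = build_prompt(EXAMPLES, "6 + 3", k=0)
one_shot = build_prompt(EXAMPLES, "6 + 3", k=1)
few_shot = build_prompt(EXAMPLES, "6 + 3", k=3)
```

Fixing k in advance, rather than tuning it per task, corresponds to evaluating "without searching for the optimal number of shots".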
We observed that the model can infer basic mathematical operations from context (sample 1), even when the symbols are badly obfuscated (sample 2). While we are far from claiming numeracy, the model seems to go beyond mere memorization for arithmetic.
VQGAN + CLIP "matte painting of a city built on top of a giant turtle walking slowly towards the viewer with clear blue skies and a lush green landscape | trending on artstation" + 3D photo inpainting
networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales.