typedfemale
a really exciting new account "advanced pytorch user" - @cHHillee alt: @typedalt
Jan 6, 2023
does causal attention annoy anyone else? you compute a whole matrix multiplication only to throw half of it away! isn't it also deceiving when making claims about how well transformers utilize GPUs?
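To put a number on "half": below is a minimal sketch of the naive approach, assuming single-head attention with no batching, written in plain Python so the wasted work is easy to count. The helper names `causal_scores_naive` and `discarded_fraction` are made up for illustration and are not from any library. Fused or block-sparse kernels may skip some or all of the masked region, so this only describes the compute-then-mask pattern the tweet is complaining about.

```python
import math

def causal_scores_naive(q, k):
    """Compute all n*n scaled dot products, then mask the future positions.

    q and k are lists of n vectors of length d. Returns the masked score
    matrix and the number of entries that were computed and then discarded.
    """
    n, d = len(q), len(q[0])
    scale = 1.0 / math.sqrt(d)
    # Full matrix product: every (i, j) pair is computed, including j > i.
    scores = [
        [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in range(n)]
        for i in range(n)
    ]
    discarded = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Position i must not attend to a later position j.
            scores[i][j] = -math.inf
            discarded += 1
    return scores, discarded

def discarded_fraction(n):
    """Share of the n*n score entries above the diagonal: (n - 1) / (2n)."""
    return (n - 1) / (2 * n)

# Toy inputs (arbitrary values, n = 8 positions, d = 4 dimensions).
q = [[0.1 * (i + j) for j in range(4)] for i in range(8)]
k = [[0.2 * (i - j) for j in range(4)] for i in range(8)]
scores, discarded = causal_scores_naive(q, k)
print(discarded, "of", 8 * 8, "entries discarded")
print(discarded_fraction(4096))
```

The discarded share is (n - 1) / (2n), so it is a bit under half for short sequences and approaches one half as the sequence grows. A utilization figure that counts the full n*n product as useful work would therefore overstate the useful share by up to roughly 2x for this one matmul.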