Eric Wallace
Researcher at OpenAI working to make language models more trustworthy, secure, and private.
Jan 31, 2023 9 tweets 3 min read
Models such as Stable Diffusion are trained on copyrighted, trademarked, private, and sensitive images.

Yet, our new paper shows that diffusion models memorize images from their training data and emit them at generation time.

Paper: arxiv.org/abs/2301.13188

👇 [1/9]

Diffusion models are trained to denoise images from the web. These images are often vulgar or malicious, and many are potentially risky to use (e.g., copyrighted).

Moreover, many ongoing projects apply diffusion models to private applications such as medical imagery. [2/9]
May 1, 2020 10 tweets 5 min read
We show that adversaries can attack production machine translation systems like Google Translate.

First, train a model to imitate API outputs. Then, transfer adversarial examples from the imitation model.

ericswallace.com/imitation
arxiv.org/abs/2004.15015

How to defend? 👇 [1/9]

Model "stealing" can be a goal in itself. It allows adversaries to launch their own competitor service or to avoid long-term API costs.

We "steal" black-box MT systems by querying them with monolingual sentences and training "imitation models" on system outputs. [2/9]
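The data-collection step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: `build_imitation_corpus` and `fake_translate` are hypothetical names, and `fake_translate` is only a stand-in for a real black-box MT API call. The resulting (source, output) pairs are what an imitation model would then be trained on.

```python
from typing import Callable, Iterable, List, Tuple


def build_imitation_corpus(
    translate: Callable[[str], str],
    monolingual_sentences: Iterable[str],
) -> List[Tuple[str, str]]:
    """Pair each monolingual sentence with the black-box system's output."""
    corpus: List[Tuple[str, str]] = []
    seen = set()
    for sentence in monolingual_sentences:
        sentence = sentence.strip()
        # Skip empty lines and repeats so each query is sent only once.
        if not sentence or sentence in seen:
            continue
        seen.add(sentence)
        corpus.append((sentence, translate(sentence)))
    return corpus


def fake_translate(text: str) -> str:
    # Hypothetical stand-in for a production MT API (assumption for this sketch).
    return text.upper()


pairs = build_imitation_corpus(
    fake_translate,
    ["hello world", "hello world", "   ", "good morning"],
)
```

Deduplicating before querying keeps the number of API calls down, which matters if the target system charges per request.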
Mar 5, 2020 6 tweets 5 min read
Not everyone can afford to train huge neural models. So, we typically *reduce* model size to train/test faster.

However, you should actually *increase* model size to speed up training and inference for transformers.
Why? [1/6] 👇

bair.berkeley.edu/blog/2020/03/0…
arxiv.org/abs/2002.11794

Using more compute often leads to higher accuracy. However, since large-scale training is expensive, the goal is typically to maximize accuracy under your budget.

For most people, the go-to strategy is to train small models because they run fast and use little memory. [2/6]
Sep 18, 2019 6 tweets 3 min read
Most NLP models treat numbers (e.g., “91”) in the same way as other tokens, i.e., they embed them as vectors. Is this a good representation for downstream numerical tasks such as DROP, math questions, etc.?

Yes! Pre-trained vectors (BERT, GloVe, ELMo) know numbers. [1/6]

We begin by testing QA models on questions that evaluate numerical reasoning (e.g., sorting, comparing, or summing numbers), taken from the DROP dataset. Standard models excel on these types of questions! [2/6]