And spies have used this to create a profile for a fake person: you can't reverse-image-search someone who never existed. apnews.com/bc2f19097a4c4f…
The speaker played two examples, one real and one fake, and then said "I swear to God I forget which is which."
Presenter showed a video of a Tom Cruise impersonator, with the actual Tom Cruise's face swapped in.
It's a real video of Zuckerberg, but with the mouth movements changed to match the fake audio.
Deepfakes are used for misinformation, which is getting easier to produce.
They're also used for evidence tampering.
You can train a machine-learning network to detect deepfakes. But a detector is already built into deepfake synthesis (the GAN's discriminator), so there's a fundamental arms-race problem: any detector you release can be used to train better fakes.
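The arms-race point can be illustrated with a toy sketch: treat the detector as a fixed, queryable scoring rule, and let an attacker nudge a single synthesized feature until the detector passes it. The feature (blink rate, echoing the blinking point from the Q&A) and the detector's rule are hypothetical, chosen only to make the loop concrete.

```python
# Toy illustration of the detector arms race: if attackers can query a
# detector, they can tune their fakes until the detector passes them.
def detector(blink_rate):
    # Hypothetical rule: flag videos with implausibly low blink rates.
    return "fake" if blink_rate < 0.2 else "real"

def tune_fake(blink_rate, step=0.05):
    # Attacker repeatedly queries the detector and nudges the
    # synthesized blink rate until the detector says "real".
    while detector(blink_rate) == "fake":
        blink_rate += step
    return blink_rate

rate = tune_fake(0.0)
print(detector(rate))  # the tuned fake now passes: prints "real"
```

This is why a publicly released detector is weaker than a private one: the attacker's tuning loop needs nothing but query access.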
For example [the speaker played several videos]: Obama tended to purse his lips between "Hi, everybody" and the rest of his speech.
Alec Baldwin's impersonations do the chin-puckering only when his mouth is open.
For lip-sync deepfakes the detection AUC is about 0.85.
If you do a majority vote over an entire video, sliced into 10-second segments, you get an AUC above 0.9 for all categories.
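The aggregation step above can be sketched in a few lines: score each 10-second segment independently, then take a majority vote over the whole video. The per-segment scores and the 0.5 decision threshold here are hypothetical placeholders for a real classifier's output.

```python
# Majority vote over per-segment detector scores (sketch; scores hypothetical).
def classify_video(segment_scores, threshold=0.5):
    """Label each 10-second segment as fake if its score passes the
    threshold, then majority-vote the segment labels for the video."""
    votes = [score >= threshold for score in segment_scores]
    return "fake" if sum(votes) > len(votes) / 2 else "real"

# Noisy per-segment scores from a hypothetical per-clip detector:
print(classify_video([0.9, 0.4, 0.8, 0.7, 0.6]))  # -> fake
print(classify_video([0.2, 0.6, 0.1, 0.3, 0.4]))  # -> real
```

Voting over many segments is why the video-level AUC beats the per-segment AUC: independent per-segment errors tend to cancel out.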
Fakes of Warren are very easy to detect because she's very expressive. "We love Elizabeth Warren" [laughter]
A: Well, deepfakes used to be detectable because they didn't blink. They blink now.
But we're looking at long temporal segments, which generative models don't model.
And sure, a very sophisticated model could defeat this. But "some knucklehead on Reddit" couldn't.
A: We're not making these models publicly available. We're publishing the paper, and we'll share our model with researchers, but we won't share it publicly.
A: I don't have a good answer there.
A: We'd like to get to 99.9%. A 1-in-1000 chance is good enough for me.
A: You can't just unleash this on journalists. We want to talk about training journalists.