Today we announced a new feature on Pixel 7/Pro and @GooglePhotos called "Unblur". It's the culmination of a year of intense work by our amazing teams. Here's a short thread about it
Last yr we brought two new editor functions to Google Photos: Denoise & Sharpen. These could improve the quality of most images that are mildly degraded. With Photo Unblur we raise the stakes in 2 ways:
First, we address blur & noise together w/ a single touch of a button.
2/n
Second, we're addressing much more challenging scenarios where degradations are not so mild. For any photo, new or old, captured on any camera, Photo Unblur identifies and removes significant motion blur, noise, compression artifacts, and mild out-of-focus blur.
3/n
Photo Unblur works to improve the quality of the 𝘄𝗵𝗼𝗹𝗲 photo. And if faces are present in the photo, we make additional, more specific, improvements to faces on top of the whole-image enhancement.
4/n
One of the fun things about Photo Unblur is that you can go back to your older pictures that may have been captured on legacy cameras or older mobile devices, or even scanned from film, and bring them back to life.
5/n
It's also fun to go way back in time (like the 70s and 80s!) and enhance some iconic images like these photos of pioneering computer scientist Margaret Hamilton and basketball legend Bill Russell.
6/n
Recovery from blur & noise is a complex & long-standing problem in computational imaging. With Photo Unblur, we're bringing a practical, easy-to-use solution to a challenging technical problem, right to the palm of your hand.
w/ @2ptmvd @navinsarmaphoto @sebarod & many others
n/n
Bonus: Once you have a picture enhanced with #PhotoUnblur, applying other effects on top can have an even more dramatic effect. For instance, here I've also blurred the background and tweaked some color and contrast.
Yesterday at the @madebygoogle event we launched "Pro Res Zoom" on the Pixel 10 Pro series. I wanted to share a little more detail, some examples, and use cases. The feature enables combined optical + digital zoom up to 100x magnification. It builds on our 5x optical tele camera.
1/n
Shooting at magnifications well above 30x requires that the 5x optical capture be adapted and optimized for such conditions, yielding a high-quality crop that's fed to our upscaler. The upscaler is a large enough model to understand some semantic context and try to minimize distortions.
2/n
Given the distances one might expect to shoot at such high magnification, it's difficult to get every single detail in the scene right. But we always aim to minimize unnatural distortions and stay true to the scene to the greatest extent possible.
The Receiver Operating Characteristic (ROC) got its name in WWII from radar, which was developed to detect enemy aircraft and ships.
I find it much more intuitive than precision/recall. ROC curves show the true positive rate vs the false positive rate, parametrized by a detection threshold.
1/n
ROC curves show the performance tradeoffs in a binary hypothesis test like this:
H₁: signal present
H₀: signal absent
Given a data vector x, we could in principle write the ROC directly in terms of x. But typically a test statistic T(x) is computed and compared to a threshold γ.
2/n
ROC curves derived from general likelihoods are always monotonically non-decreasing.
This is easy to see from the definitions: Pf(γ) = P(T > γ | H₀) and Pd(γ) = P(T > γ | H₁) both shrink as γ grows, and the slope of the ROC curve is dPd/dPf = p₁(γ)/p₀(γ) ≥ 0.
Pro-tip: if you see an ROC curve in a paper or talk that's not monotone, ask why.
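To make this concrete, here's a minimal Python sketch (my own toy example; the Gaussians, μ, and the threshold grid are arbitrary choices) that traces an ROC curve by sweeping γ and checks its monotonicity:

import numpy as np

# Toy binary hypothesis test with T(x) = x:
#   H0: x ~ N(0, 1)   (signal absent)
#   H1: x ~ N(mu, 1)  (signal present)
# Sweeping the threshold gamma traces out the ROC curve (Pf, Pd).
rng = np.random.default_rng(0)
mu, n = 1.5, 100_000
x_h0 = rng.normal(0.0, 1.0, n)   # samples under H0
x_h1 = rng.normal(mu, 1.0, n)    # samples under H1

gammas = np.linspace(-4.0, 6.0, 200)
pf = np.array([(x_h0 > g).mean() for g in gammas])  # false positive rate
pd = np.array([(x_h1 > g).mean() for g in gammas])  # detection (true positive) rate

# Lowering gamma can only add detections and false alarms, so Pf and Pd
# grow together: the ROC curve is monotone non-decreasing.
assert np.all(np.diff(pf[::-1]) >= 0) and np.all(np.diff(pd[::-1]) >= 0)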
The choice of nonlinear activation functions in neural networks can be tricky and important.
That's because iterating (i.e. repeatedly composing) even simple nonlinear functions can lead to unstable or chaotic behavior, with something as simple as a quadratic map.
1/n
Some activations are more well-behaved than others. Take ReLU for example:
r(x) = max{0,x}
its iterates are completely benign: r⁽ⁿ⁾(x) = r(x), so we don't have to worry.
Most other activations, like soft-plus, are less benign but still change gently with composition.
2/n
Soft-plus:
s(x) = log(eˣ + 1)
has a special property: its n-times self-composition is really simple:
s⁽ⁿ⁾(x) = log(eˣ + n)
With each iteration, s⁽ⁿ⁾(x) changes gently for all x.
This form is rare: most activations don't have nice closed-form iterates like this.
3/n
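A quick numerical check of both claims (a minimal sketch of my own; the test grid and n are arbitrary):

import numpy as np

def softplus(x):
    return np.log(np.exp(x) + 1.0)

def relu(x):
    return np.maximum(0.0, x)

def iterate(f, x, n):
    """Compose f with itself n times: f(f(...f(x)...))."""
    for _ in range(n):
        x = f(x)
    return x

x = np.linspace(-5.0, 5.0, 11)
n = 10

# ReLU is idempotent: r^(n)(x) = r(x) for all n >= 1.
assert np.allclose(iterate(relu, x, n), relu(x))

# Soft-plus has the closed-form iterate s^(n)(x) = log(e^x + n).
assert np.allclose(iterate(softplus, x, n), np.log(np.exp(x) + n))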
Tweedie's formula is super important in diffusion models & is also one of the cornerstones of empirical Bayes methods.
Given how easy it is to derive, it's surprising how recently it was discovered (the '50s). It was published a while later, after Tweedie wrote Robbins about it.
1/n
The MMSE denoiser is known to be the conditional mean f̂(y) = 𝔼(x|y). In this case, we can write the expression for this conditional mean explicitly:
𝔼(x|y) = ( ∫ x p(y|x) p(x) dx ) / ( ∫ p(y|x) p(x) dx )
2/n
Note that the normalizing term in the denominator is the marginal density of y: p(y) = ∫ p(y|x) p(x) dx. For Gaussian noise (y = x + e, e ~ 𝒩(0, σ²I)), differentiating log p(y) gives Tweedie's formula:
𝔼(x|y) = y + σ² ∇ log p(y)
That is, the MMSE denoiser is the noisy input plus a correction proportional to the score, the gradient of the log marginal density. This is exactly the link that makes the formula so central to diffusion models.
Images aren't arbitrary collections of pixels: they have complicated structure, even the small ones. That's why it's hard to generate images well. Let me give you an idea:
High-contrast 3×3 grayscale patches, represented as points in ℝ⁹, lie approximately on a 2-D manifold: the Klein bottle!
1/4
Images can be thought of as vectors in high dimensions. It has long been hypothesized that images live on low-dimensional manifolds (hence manifold learning). It's a reasonable assumption: images of the world are not arbitrary. The low-dimensional structure arises from physical constraints & laws.
2/4
But this doesn't mean the "low-dimensional" manifold has a simple or intuitive structure, even for tiny images. This classic paper by Gunnar Carlsson gives a lovely overview of the structure of data generally (and images in particular). Worthwhile reading.
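As a rough illustration, here's a sketch of the two-parameter family behind that model: unit-norm, mean-zero quadratics in a single direction. This is my own construction; the exact parametrization is an assumption for illustration, not code from the paper.

import numpy as np

def patch(theta, phi):
    """A 3x3 patch from the two-parameter (theta, phi) family: a mix of a
    linear gradient and a centered quadratic along direction theta."""
    xs, ys = np.meshgrid([-1, 0, 1], [-1, 0, 1])
    L = np.cos(theta) * xs + np.sin(theta) * ys      # linear ramp in direction theta
    q1 = L - L.mean()                                # centered linear term
    q2 = L**2 - (L**2).mean()                        # centered quadratic term
    p = (np.cos(phi) * q1 / np.linalg.norm(q1)
         + np.sin(phi) * q2 / np.linalg.norm(q2))
    return p / np.linalg.norm(p)

# Each sample is a point in R^9, yet the whole family sweeps out only a
# 2-D surface; identifying (theta, phi) with its flipped copy at theta + pi
# is what closes it up into a Klein bottle.
samples = np.stack([patch(t, p).ravel()
                    for t in np.linspace(0, np.pi, 20)
                    for p in np.linspace(0, 2 * np.pi, 20)])
print(samples.shape)  # (400, 9)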
The Kalman Filter was once a core topic in EECS curricula. Given its relevance to ML, RL, and Ctrl/Robotics, I'm surprised that most researchers don't know much about it, and end up rediscovering it. The Kalman Filter seems messy & complicated, but the intuition behind it is invaluable.
1/4
I once had to explain the Kalman Filter in layperson terms in a legal matter, with no maths. No problem, I thought. Yet despite being taught the subject by one of the greats (A.S. Willsky) & having taught it myself, I found this very difficult to do.
2/4
I was glad to find this little gem. It’s a 24-page writeup that is a great teaching tool, especially in introductory classes, and particularly at the undergraduate level.
The writeup seems to be out of print, but is still available (though expensive).
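For the core intuition, here's a toy 1-D sketch of my own (a random-walk state with arbitrary noise levels Q and R; kalman_1d is a hypothetical helper, not from the writeup), showing the predict/update cycle and the gain that blends model and measurement:

import numpy as np

# Model:    x_k = x_{k-1} + w,  w ~ N(0, Q)   (process noise)
# Measure:  y_k = x_k + v,      v ~ N(0, R)   (measurement noise)
def kalman_1d(ys, Q=1e-3, R=0.5, x0=0.0, P0=1.0):
    x, P = x0, P0              # state estimate and its variance
    estimates = []
    for y in ys:
        P = P + Q              # predict: random walk leaves x unchanged, uncertainty grows
        K = P / (P + R)        # gain in [0, 1]: how much to trust the measurement
        x = x + K * (y - x)    # update: move toward the measurement by a fraction K
        P = (1 - K) * P        # uncertainty shrinks after incorporating y
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(0, 0.03, 200))       # slowly drifting true state
ys = truth + rng.normal(0, np.sqrt(0.5), 200)     # noisy measurements
est = kalman_1d(ys)
# The filtered estimate tracks the truth with lower error than raw measurements.
print(np.mean((ys - truth)**2), np.mean((est - truth)**2))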