Peyman Milanfar
Aug 15 · 13 tweets
Did you ever take a photo & wish you'd zoomed in more or framed it better? When this happens, we usually just crop.

Now there's a better way: Zoom Enhance, a new feature my team just shipped on Pixel. Available in Google Photos under Tools, it enhances both zoomed & un-zoomed images.

1/n
Zoom Enhance is our first image-to-image diffusion model designed & optimized to run fully on-device. It allows you to crop or frame the shot you wanted, and enhance it, after capture. The input can be from any device, Pixel or not, old or new. Below are some examples & use cases.

2/n

Let's say you've zoomed to the max on your Pixel 8/9 Pro and got your shot, but you wish you could have gotten a little closer. Now you can zoom in more, and enhance.

3/n

A bridge too far to see the details? A simple crop may not give the quality you want. Zoom Enhance can come in handy.

4/n

If you've been to the Louvre you know how hard it is to get close to the most famous painting of all time.

Next time you could shoot with the best optical quality you have (5x in this case), then zoom in after the fact.

5/n

Maybe you're too far away to read a sign and can use a little help from Zoom Enhance.

6/n

Like most people, I have lots of nice shots that would be even nicer if I'd framed them better. Rather than just cropping, you can now frame the shot you wanted, after the fact, and without losing out on quality.

7/n

Is the subject small and the field of view large? Zoom Enhance can help to isolate and enhance the region of interest.

8/n

Sometimes there are one or more better shots hiding within the just-average shot you took. Compose your best shot and enhance.

9/n

There are a lot of gems hidden in older, lower-quality photos that you can now isolate and enhance, like this one from some 20 years ago.

10/n

Pictures you get on social media or on the web (or even your own older photos) may not always be high quality/resolution. If they're small enough (~1MP), you can enhance them with or without cropping.

11/12
So Zoom Enhance gives you the freedom to capture the details within your photos, allowing you to highlight specific elements and focus on what matters to you.

It's a first step toward powerful editing tools for consumer images, harnessing on-device diffusion models.

12/12

Bonus use case worth mentioning:

Using your favorite text-to-image generator, you typically get a result at ~1 MP resolution (the left image is 1280 × 720). If you want higher resolution, you can directly upscale on-device (right, 2048 × 1152) with Zoom Enhance.

13/12


More from @docmilanfar

Sep 5
Random matrices are very important in modern statistics and machine learning, not to mention physics.

A model about which much less is known is that of matrices sampled uniformly from the set of doubly stochastic matrices: Uniformly Distributed Stochastic Matrices.

A thread -

1/n
First, what are doubly stochastic matrices?
Non-negative matrices whose row & column sums equal 1.

The set of doubly stochastic matrices is also known as the Birkhoff polytope: an (n−1)²-dimensional convex polytope in ℝⁿˣⁿ whose extreme points are the permutation matrices.

2/n
The extreme points of the Birkhoff polytope (the permutations) are sparse matrices, but a typical matrix sampled from inside the polytope is, by contrast, very dense.

Since rows and columns are exchangeable, the entries of a sampled matrix have the same marginal distribution.

3/n
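To see what such a dense matrix looks like, here is a minimal sketch (assuming NumPy; the size n = 6 and the Sinkhorn construction are my own illustrative choices). Sinkhorn normalization of a random positive matrix yields *a* doubly stochastic matrix; it is not uniform sampling from the Birkhoff polytope, but it makes the row/column-sum property easy to check:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.random((n, n)) + 1e-6      # strictly positive starting matrix

# Sinkhorn iterations: alternately normalize rows and columns until
# the matrix is (numerically) doubly stochastic.
for _ in range(500):
    M /= M.sum(axis=1, keepdims=True)
    M /= M.sum(axis=0, keepdims=True)

print(np.allclose(M.sum(axis=0), 1.0), np.allclose(M.sum(axis=1), 1.0))  # True True
print(M.round(3))   # dense entries, unlike the sparse permutation extreme points
```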
Sep 1
The perpetually undervalued least-squares:

minₓ‖y−Ax‖²

can teach a lot about some complex ideas in modern machine learning including overfitting & double-descent.

Let's assume A is n-by-p, so we have n data points and p parameters.

1/10
If n ≥ p (the “under-fitting” or “over-determined” case) the solution is

x̃ = (AᵀA)⁻¹ Aᵀ y

But if n < p (the “over-fitting” or “under-determined” case), there are infinitely many solutions that give *zero* training error. We pick the minimum-norm (min ‖x‖²) solution:

x̃ = Aᵀ(AAᵀ)⁻¹ y

2/10
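A quick numerical check of both closed forms (a sketch assuming NumPy; the shapes 20×5 and 5×20 are arbitrary): in each case the formula matches the pseudoinverse solution, and in the under-determined case the fit has zero training error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-determined case: n >= p, unique least-squares solution
n, p = 20, 5
A, y = rng.standard_normal((n, p)), rng.standard_normal(n)
x_over = np.linalg.solve(A.T @ A, A.T @ y)           # (AᵀA)⁻¹ Aᵀ y
assert np.allclose(x_over, np.linalg.pinv(A) @ y)

# Under-determined case: n < p, minimum-norm solution among exact fits
n, p = 5, 20
A, y = rng.standard_normal((n, p)), rng.standard_normal(n)
x_under = A.T @ np.linalg.solve(A @ A.T, y)          # Aᵀ(AAᵀ)⁻¹ y
assert np.allclose(A @ x_under, y)                   # zero training error
assert np.allclose(x_under, np.linalg.pinv(A) @ y)   # and minimum norm
```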
In either case, the solution can be compactly written in terms of the SVD of A:

A = USVᵀ

where U & V are orthogonal matrices of size n×n & p×p, and S is n×p with k nonzero diagonal elements (the singular values σ₁, …, σₖ). Then

x̃ = ∑ᵢ₌₁ᵏ σᵢ⁻¹ vᵢ uᵢᵀ y

3/10
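The same minimum-norm solution via the SVD, matching the formula above (again just a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
A, y = rng.standard_normal((5, 20)), rng.standard_normal(5)

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U S Vᵀ, s holds the σᵢ
x_svd = Vt.T @ ((U.T @ y) / s)                     # x̃ = Σ σᵢ⁻¹ vᵢ (uᵢᵀ y)
assert np.allclose(x_svd, np.linalg.pinv(A) @ y)
```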
Aug 18
Two basic concepts are often conflated:

Sample Standard Deviation (SD) vs. Standard Error (SE)

Say you want to estimate m = 𝔼(x) from N independent samples xᵢ. A typical choice is the average, or "sample" mean, m̂.

But how stable is this estimate? That's what the Standard Error tells you:

1/6
Since m̂ is itself a random variable, we need to quantify the uncertainty around it too: this is what the Standard Error does.

The Standard Error is *not* the same as the spread of the samples - that's the Standard Deviation (SD) - but the two are closely related:

2/6
But this expression isn't practical, because we don't know √var(xᵢ) either; we're forced to estimate that too. Here, we typically just plug in the (sample) Standard Deviation for it. Therefore:

Standard Error ≈ Sample Standard Deviation / √N

3/6
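A small simulation illustrating the relation (assuming NumPy; the Gaussian data, N = 100, and 10,000 repeats are just illustrative numbers): the spread of the sample mean across repeated experiments closely matches SD/√N.

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 100, 10_000

samples = rng.normal(loc=2.0, scale=3.0, size=(trials, N))
means = samples.mean(axis=1)                # one sample mean per experiment

sd_hat = samples[0].std(ddof=1)             # sample SD from a single experiment
se_plug_in = sd_hat / np.sqrt(N)            # Standard Error ≈ sample SD / √N
se_empirical = means.std(ddof=1)            # actual spread of the sample mean

print(se_plug_in, se_empirical)             # both ≈ 3 / √100 = 0.3
```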
Aug 10
Image-to-image models have been called 'filters' since the early days of computer vision/imaging. But what does it mean to filter an image?

If we choose some set of weights and apply them to the input image, what loss/objective function does this process optimize (if any)?

1/7
Such filters can often be written as matrix-vector operations. Think of z, y, and the corresponding weights as vectors and you have a tidy expression relating (all) output pixels to (all) input pixels. If the filter is local (has a small footprint), most weights will be zero.

2/7
We can think of the filter z = Wy as one step in an iterative process (a diffusion, if you like) involving repeated applications of W: a steepest-descent step with unit step size on some yet-to-be-determined loss f(z). We can identify the gradient of this implicit loss easily.

3/7
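To make the implicit-loss view concrete, here is a small sketch (assuming NumPy; the 1-D signal and the 3-tap box filter are my own illustrative choices, not the thread's figures). It writes a local averaging filter as a matrix W and checks that z = Wy equals one unit-step steepest-descent step on the quadratic loss f(z) = ½ zᵀ(I − W)z, which is valid here because W is symmetric:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
y = np.sin(np.linspace(0, 4 * np.pi, n)) + 0.3 * rng.standard_normal(n)  # noisy 1-D "image"

# Local, symmetric 3-tap averaging filter written out as an n×n matrix W
# (kernel [1, 1, 1] / 3, simply truncated at the boundaries).
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0 / 3.0

z_filtered = W @ y                    # filtering: z = W y
grad_at_y = (np.eye(n) - W) @ y       # ∇f(y) for f(z) = ½ zᵀ(I − W)z, since W = Wᵀ
z_one_step = y - 1.0 * grad_at_y      # one steepest-descent step, unit step size
assert np.allclose(z_filtered, z_one_step)

# Repeated application of W (the "diffusion"): each step smooths the signal further.
z = y.copy()
for _ in range(10):
    z = W @ z
```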
Jul 21
Images aren't arbitrary collections of pixels; they have complicated structure, even small ones. That's why it's hard to generate images well. Let me give you an idea:

3×3 gray images represented as points in ℝ⁹ lie approximately on a 2-D manifold: the Klein bottle!

1/3
Images can be thought of as vectors in a high-dimensional space. It's long been hypothesized that images live on low-dimensional manifolds (hence manifold learning). It's a reasonable assumption: images of the world are not arbitrary. The low-dimensional structure arises from physical constraints & laws.

2/3
But this doesn't mean the "low-dimensional" manifold has a simple or intuitive structure, even for tiny images. This classic paper by Gunnar Carlsson gives a lovely overview of the structure of data generally (and images in particular). Worthwhile reading.

3/3
Apr 3
We often assume bigger generative models are better. But when practical image generation is limited by a compute budget, is this still true? The answer is no.

By looking at latent diffusion models across different scales, our paper sheds light on the quality vs. model-size tradeoffs.

1/5
We trained a range of text-to-image LDMs & observed a notable trend: when constrained by a compute budget, smaller models frequently outperform their larger siblings in image quality. For example, the sampling result of a 223M-parameter model can be better than that of a model 4× larger.

2/5
Smaller models may never reach the quality levels that large models can. Yet, when operating under an inference budget, points reachable by both models may be reached more efficiently with smaller ones. We study the tradeoff between model size, compute, quality, & downstream tasks.

3/5
