Bart Trzynadlowski
Jun 21 · 23 tweets · 7 min read
Apple #VisionPro eye-tracking mega-thread! I'm seeing some concern about eye fatigue. @EricPresidentVR brought up this legitimate worry. I think there is some misunderstanding of how AVP differs from other implementations. I'll also dive into some interesting eye UX work. (1/)
Eye gaze is tricky to use as an input because it is so fundamental to our bodies that we don't feel like we actively control it, and being made aware of it feels creepy and fatiguing. Doubly so because you literally cannot look away. (2/)
It suffers from the "Midas touch" problem, named after the mythical King Midas, who was granted his wish that anything he touch turn to gold. Midas regretted it when he reached out to comfort his distraught daughter one day, turning her to lifeless gold. Oopsies! (3/) Image
Apple sidesteps this problem in a clever way. You're almost never directly made aware of your gaze (although some elements apparently can react lightly) until you perform the pinch gesture. Done right, this should feel almost telepathic. Think about GUIs you use today... (4/)
Touch interfaces or mouse-driven ones. They give no tactile feedback -- you *must* look where you are touching, pointing, or clicking, and you do so involuntarily. As long as AVP can track gaze accurately and quickly enough, you should not expend any conscious effort "staring" at things. (5/)
AVP's eye tracking setup is quite different from Meta's Quest Pro. Karl Guttag, as usual, has an excellent overview. On AVP, the IR illuminators and cameras go *through* the optics, allowing them to be placed further back and giving them a better view of your whole eye. (6/)
On Quest Pro, they are embedded on a ring outside of the optics, giving them a more indirect look at the eye. (7/)
Publicly, it has been disclosed that Apple acquired SMI, an eye tracking vendor that was founded way back in 1991! Whether or not their personnel were involved in this project, the company can certainly draw upon deep expertise. (8/)
Bottom line about the AVP gaze interface: if the eye tracking implementation is good enough, you shouldn't have to *do* anything consciously. Just express intent by pinching. You'll already be looking at the UI element you want to interact with naturally, without thinking! (9/)
Another interesting fact mentioned during the unveil is that apps *do not* have access to eye gaze. Pinch is a system-level gesture and only then does an app get information about where the user was looking when the pinch was detected. This is interesting for two reasons... (10/)
1) Privacy is the stated reason. Personally, I'm not too concerned about privacy and would prefer having access to gaze vectors. But it's a valid point. Even without camera access, knowing exactly what you are looking at on e.g. a shopping or social app is ripe for abuse. (11/)
2) Blocking access to direct gaze also prevents apps from implementing awful UX and annoying users with it! @Alientrap did a great demo of drawing with your eyes, which he knew would feel terrible. Can't even do this on AVP! (12/)
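To make the point from (10) concrete, here's a minimal sketch of what that model looks like to a developer, assuming visionOS's SwiftUI + RealityKit gesture APIs (RealityView, SpatialTapGesture, targetedToAnyEntity). The app never sees a gaze vector; it only learns which entity was targeted when the system resolves a look-and-pinch into an ordinary tap:

```swift
import SwiftUI
import RealityKit

// Sketch only: the app never receives raw gaze data. It learns which
// entity the user was looking at only when the system-level pinch fires,
// delivered as a normal spatial tap gesture.
struct GazePinchDemo: View {
    var body: some View {
        RealityView { content in
            // A simple sphere the user can look at and pinch.
            let sphere = ModelEntity(mesh: .generateSphere(radius: 0.1))
            sphere.name = "sphere"
            sphere.components.set(InputTargetComponent())
            sphere.generateCollisionShapes(recursive: true)
            content.add(sphere)
        }
        .gesture(
            SpatialTapGesture()
                .targetedToAnyEntity()
                .onEnded { value in
                    // Only now does the app learn what the gaze targeted.
                    print("Pinched on:", value.entity.name)
                }
        )
    }
}
```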
Are there reasons to use gaze directly? Yes, there are circumstances where it might make sense (people with disabilities, specialized interfaces, etc.). There is a body of research and some interesting public demos on actively using eye gaze in a reasonably comfortable way. (13/)
Here's a recent paper comparing three control options: dwell, pursuits, and gestures. The videos are a great introduction to the topic. dl.acm.org/doi/10.1145/35… (14/)
Dwell is what you expect: staring at something for a fixed amount of time. It's as annoying and slow as it sounds. Pursuit is interesting: you follow a moving target with your eye to indicate intent. Here's a paper from 2013 describing it: perceptualui.org/publications/v… (15/)
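For the curious, a rough sketch of what dwell selection boils down to (nothing here is a shipping API; assume some eye tracker tells you which target, if any, is being fixated each frame):

```swift
import Foundation

// Hypothetical dwell selector: a target is chosen only after the gaze
// has stayed on it continuously for `dwellTime` seconds.
final class DwellSelector {
    private let dwellTime: TimeInterval
    private var currentTarget: String?
    private var fixationStart: TimeInterval = 0

    init(dwellTime: TimeInterval = 0.8) {
        self.dwellTime = dwellTime
    }

    /// Call once per frame with the currently fixated target (or nil).
    /// Returns a target id the moment its dwell completes.
    func update(gazedTarget: String?, now: TimeInterval) -> String? {
        if gazedTarget != currentTarget {
            // Gaze moved to a different target (or off-target): restart the clock.
            currentTarget = gazedTarget
            fixationStart = now
            return nil
        }
        guard let target = currentTarget, now - fixationStart >= dwellTime else {
            return nil
        }
        // Reset so the same target must accumulate a fresh dwell to fire again.
        currentTarget = nil
        return target
    }
}
```

That fixed timer is exactly why dwell feels slow: every selection costs you the full dwell duration of staring, whether you decided instantly or not.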
One type of gesture is "enter-and-leave", shown in this figure. Your eye enters a region, may optionally linger for a time, and then may enter, perhaps with a constrained direction. From this great paper: sciencedirect.com/science/articl… (16/) Image
You can imagine these techniques being combined in interesting ways. In 2017 I saw a really fascinating demo from a Texas company called "Quantum Interface". Their interface worked for touch, "head gaze" (head direction only), and eye gaze. (17/)
Here's an absolutely garbage-quality video I snagged from their old Twitter account @qimotions. Notice how it works: you aim with your head at the target, which unfolds some options, and each level of options requires you to change direction a little to hit them. (18/)
This forced change in direction prevents you from accidentally moving through a target and selecting the next one. They even had an interesting demo for HIPAA-compliant interfaces requiring double confirmation... (19/)
You'd select an option by gaze, and a red X and a green check would pop out on either side. If you selected the check on the right, a second check would pop up on the left, forcing you to intentionally change direction to confirm securely. (20/)
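The core of that idea is simple enough to sketch (hypothetical code, not theirs): place the second confirmation on the opposite side of the one just selected, so confirming always requires a deliberate reversal rather than drifting onward.

```swift
import CoreGraphics

// Hypothetical sketch: choose where the second confirm target appears,
// opposite the side of the confirm that was just selected, so the user
// must reverse direction to complete the double confirmation.
func secondConfirmPosition(previousSelection: CGPoint,
                           menuCenter: CGPoint,
                           offset: CGFloat = 120) -> CGPoint {
    // If the first confirm was to the right of center, put the second on
    // the left (and vice versa).
    let wasRightOfCenter = previousSelection.x > menuCenter.x
    let dx: CGFloat = wasRightOfCenter ? -offset : offset
    return CGPoint(x: menuCenter.x + dx, y: menuCenter.y)
}
```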
Lastly, another good resource on eye gaze from Microsoft, which implements it in HoloLens 2: learn.microsoft.com/en-us/windows/… (21/)
I hope this was informative. I think Apple really thought deeply here and has implemented eye gaze in a natural, minimalistic way that, with adequate hardware and software support, should feel subconscious and *not* active. (22/22)
DISCLAIMER ADDENDUM: This and other threads are me speaking as an unaffiliated and independent AR developer, citing public info only.

