Jukka Ursin
Jun 6 · 44 tweets · 8 min read
So, @Apple’s new #VisionPro headset is out, and it looks like a cool product. I happened to work on it five years ago, so here’s a number of thoughts on the gadget. This is long and rambling, and all of my first-hand information is five years old, so make of it what you will.1/?
I worked on the Experience Prototyping team, which built the first POCs of the experiences for which you’d use the final product. Back in 2018 we mostly used Windows computers, Unity, and a mix of off-the-shelf headsets and in-house prototypes. These things take time to build.2/?
One thing that Apple consistently gets right is – well, they get things right. They build things that people can use for something. Apple asks “what do we want to make possible” and only then “what do we need to build to make it happen”. Use case is first, tech is incidental.3/?
Back in 2018, much of the design work was around human factors and levels of immersion. There’s a smooth spectrum from the real world through various kinds of AR & MR to fully immersive VR. The user is in two worlds at the same time, and the headset needs to take that into account.4/?
The device is used in the real world and in the context of real people. You need to bring the real world to the user, and also the user to the real world. The eye passthrough – so that the visor shows the user’s eyes when they’re not fully immersed – is a big thing.5/?
The example in the WWDC keynote where the dad’s in the headset and his kid wants toast is great – life happens, and you need to be able to take your physical surroundings into account. An opaque visor between the user and everyone else is an interaction barrier.6/?
The tech in the headset is impressive. In 2018, the computing unit was a big 19-inch rackmount server case full of next-gen silicon, and it’s impressive that all of that has now been minified and crammed into a headset.7/?
The external pocket battery is a bit of a design wart. But battery tech is not good enough yet, and not wearing the heavy battery on the head is a design tradeoff that makes sense, until batteries catch up.8/?
The reason for needing a beefy battery is that there’s a lot of really complicated code needed to make the experience magical. Many problems in making really good MR with seamless real-world integration are fairly easy to solve passably, but really hard to solve well.9/?
For instance, suppose you’re having a conversation with someone and you wanna simulate this in the headset. Let’s talk about audio. If they’re sitting on the other side of the table and a bit to the right from you, there's an easy way to get started.10/?
You can simply make the audio a bit louder in the right earbud. Cool, it already sounds a bit like it’s coming from the right. But it sounds like you’re wearing headphones or having a phone call – it doesn’t sound like the person is actually sitting there at the same table.11/?
There’s a bunch of easy improvements. There’s all sorts of phase-related tricks – it takes a bit longer for sound to reach the left ear, so we can delay the left signal a bit. This acts as another cue that also tells the brain a bit about the distance of the sound source.12/?
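To make the directional cues concrete, here’s a minimal sketch of the two easy tricks above – interaural level difference (louder in the near ear) and interaural time difference (delayed in the far ear). The constants and the simple constant-power pan law are illustrative assumptions, not anything from Apple’s actual pipeline:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, at room temperature
SAMPLE_RATE = 48_000     # Hz
EAR_DISTANCE = 0.18      # m, rough distance between the ears

def pan_cues(azimuth_deg):
    """Crude interaural cues for a source at the given azimuth
    (0 = straight ahead, positive = to the listener's right).
    Returns (left_gain, right_gain, left_delay_samples): the far
    ear gets a slightly quieter, slightly later copy of the signal.
    """
    az = math.radians(azimuth_deg)
    # Interaural level difference via a simple constant-power pan law.
    right_gain = math.cos(math.pi / 4 * (1 - math.sin(az)))
    left_gain = math.cos(math.pi / 4 * (1 + math.sin(az)))
    # Interaural time difference: extra path length to the far ear.
    itd_seconds = (EAR_DISTANCE / SPEED_OF_SOUND) * math.sin(az)
    left_delay_samples = max(0, round(itd_seconds * SAMPLE_RATE))
    return left_gain, right_gain, left_delay_samples
```

For a source 30° to the right this yields a louder right channel and a left-ear delay of about 13 samples (~0.27 ms) – audible as direction, but, as the thread says, still nothing like being in the room.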
But it still sounds unnatural, because the sound is played in the headphones, not in the room. In the real world, sound bounces around the room, echoing in a myriad subtle ways. The hard flat wall to the right causes bright echoes, the hallway a long, dark reverberation.13/?
This is very subtle, but critical to blending in. This is the audio equivalent of photorealistic rendering – if you don’t do this, it won’t sound right. And if it doesn’t sound right, the illusion won’t trick your brain, and it won’t feel right.14/?
If it’s not perfect, it’ll sound like it comes from roughly the right direction, but there’s a disconnect between the virtual sound and the real sound. That disconnect creates a break between the real world and the virtual world.15/?
Conversely, if you can simulate all the relevant senses accurately enough that the difference to the real world is imperceptible – well, the difference to reality is imperceptible.16/?
So what does it take to get audio right? Apple calls it audio ray tracing: Model the acoustic behavior of the entire room, all the surfaces and how they reflect different audio frequencies. Then, apply that model to the virtual sound that’s situated in the correct location.17/?
This is a lot of math. Like, really a lot. Modeling this precisely in a naïve way quickly gets infeasible even on performant desktop hardware. But if you don't do this, you won't get photorealistic perfection, it won't feel seamless, there'll always be a disconnect.18/?
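One way to picture why this is “a lot of math”: a room’s acoustics can be captured as an impulse response, and situating a sound in the room then amounts to convolving the dry signal with it. A naive direct convolution, sketched below, makes the cost obvious. This is a textbook illustration, not Apple’s actual audio ray tracing:

```python
def convolve(dry, impulse_response):
    """Apply a room's impulse response to a dry signal by direct
    convolution. Cost is len(dry) * len(impulse_response)
    multiply-adds: for a 48 kHz signal and a 2-second reverb tail,
    that is ~96,000 multiply-adds per output sample, which is why
    real systems use FFT-based (fast) convolution instead.
    """
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

# A toy impulse response: direct sound plus one quieter echo
# three samples later.
ir = [1.0, 0.0, 0.0, 0.4]
wet = convolve([1.0, 0.5], ir)
# wet = [1.0, 0.5, 0.0, 0.4, 0.2]
```

And this is just one static source in one static room – the real problem also has to track moving sources, a moving head, and frequency-dependent surfaces, all in real time.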
On the visual side, when the user moves their head, the virtual object needs to stay put with respect to the physical world. If the other speaker always slides a bit to the left when you turn your head right, there's a disconnect between the real world and the virtual world.19/?
If positioning is not 100%, the experience does not integrate, it’s not seamless. There’s a cool visual thingy that sort of looks like it’s in a 3d space – but it’s not the 3d space of the real world, it’s a virtual world separate from the real world.20/?
Again this is something that’s kind of easy to get close enough that you get an idea, but to do it realistically is really hard. The system is nonlinear, and measuring the location of the device in the physical space without external tracking beacons is far from trivial.21/?
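The “staying put” requirement boils down to applying the inverse of the head pose to every world-anchored object, every frame. A toy 2D version (hypothetical names, yaw-only rotation) shows the idea – and why any tracking error translates directly into visible slide:

```python
import math

def world_to_view(point, head_pos, head_yaw):
    """Transform a world-space 2D point into view space by applying
    the inverse of the head pose (translation, then yaw rotation).
    With exact tracking, a virtual object stays glued to its world
    position however the head moves; any error in head_pos or
    head_yaw shows up as the object sliding against the real world.
    """
    dx = point[0] - head_pos[0]
    dy = point[1] - head_pos[1]
    c, s = math.cos(-head_yaw), math.sin(-head_yaw)
    return (c * dx - s * dy, s * dx + c * dy)

# An object 2 m straight ahead of a head at the origin:
straight = world_to_view((0.0, 2.0), (0.0, 0.0), 0.0)
# Turn the head 90° to the left: the object ends up off to the
# side in view space – i.e. it stayed put in the world.
turned = world_to_view((0.0, 2.0), (0.0, 0.0), math.pi / 2)
```

The hard part, of course, isn’t this transform – it’s estimating `head_pos` and `head_yaw` (in full 6DoF) from cameras and IMUs, with no external beacons, accurately enough that the slide is imperceptible.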
Then, if you render a virtual object behind a real object, you need to figure out where precisely the real object is, and clip accordingly. Again, this has to be 100% precise for it to feel magical – the brain is super sensitive to things like this.22/?
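At its core, that clipping is a per-pixel depth comparison between the virtual scene and a reconstructed depth map of the real scene. A toy sketch, with flat lists standing in for per-pixel buffers and all names hypothetical:

```python
def composite(virtual_depth, real_depth, virtual_color, passthrough_color):
    """Per-pixel occlusion: keep the virtual pixel only where the
    virtual surface is nearer to the eye than the reconstructed
    real surface; otherwise show the camera passthrough. Any error
    in real_depth mis-clips the boundary - and the brain notices.
    """
    return [vc if vd < rd else pc
            for vd, rd, vc, pc in zip(virtual_depth, real_depth,
                                      virtual_color, passthrough_color)]

# A virtual object at 1 m and at 3 m, with a real table at 2 m:
row = composite([1.0, 3.0], [2.0, 2.0],
                ["virtual", "virtual"], ["real", "real"])
# -> ["virtual", "real"]: the far half of the object is hidden
#    behind the table.
```

The comparison itself is trivial; the hard part is producing a `real_depth` map that is accurate to the pixel, every frame, from noisy sensor data.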
Objects cast shadows, and again, that effect has to be figured out. Just working out which directions what kinds of light come from is an interesting problem in its own right – and figuring out the effects of blocking that light is yet another vast problem area.23/?
Working on the XP team was a great example of how design and development interact. There’s a high-level idea: a magically integrated, digitally enhanced reality, where the digital and the physical seamlessly coexist – not side-by-side, but as an integrated whole.24/?
Making all this happen takes a lot of time and development effort. A lot of tech has to be developed, but it also has to be the right kind of tech. Consequently, the design process cuts across all levels of the development process and guides the entire project.25/?
The high-level vision is translated into concept visuals, sketches, demo videos and the like, and these design artefacts are then translated into interactive experiences that you can try out. Getting an idea of what actually using the product will feel like five years down the road is invaluable.26/?
To make magical, intuitive interfaces happen, you just need to try things out until you find what works. There’s no way to ultimately validate a design other than to build it and see how it really feels.27/?
And when the tech takes half a decade for a company like Apple to build, you really wanna know that you’re building the right tech. The tech is just a means to an end – the user experience comes first.28/?
To that end, the XP team produced prototypes of various fidelity. One experiment in the team used ordinary sunglasses, LEDs and colored gels to simulate fancy transitions. But it was good enough to get an idea of what kinds of things are possible, and how they would feel.29/?
I worked on the digital crown that moves the user between reality and various depths of immersion to virtual worlds. I experimented with expanding a film scene from a TV screen to the real-world room, so that e.g. the plants from the scene grow out from the TV into the room.30/?
Go deeper, and eventually you’re inside the 3d movie scene, in a full-immersive VR mode, cut off from the rest of the physical reality. You don’t see or hear the room around you any more. It’s cool to see that this concept was kept throughout.31/?
At that time, the hardware didn’t exist. I used a bunch of headsets during my time. Some had a noisy, distorted video-based passthrough and a pixelated display, but this was a great unit for exploring certain kinds of questions.32/?
For instance, the question “do we bring the virtual world into the real world, or the other way around?” is something that you just have to experiment with. Be situated in a room, try out different transitions, and actually see how they feel.33/?
Would animating radial size work? Does it need to stick to the real world? What if I’m in full VR and somebody wants to talk to me – how do I bring them in? Or do I punch a hole in the VR world instead?34/?
These are the kinds of questions that cannot be solved in advance by Thinking About It Real Hard. The only way to know is to try it out. For this prototype, it probably doesn’t matter if the fidelity is not quite there.35/?
I also built an experience using a standard HTC Vive, where the transition was done between a VR room that represents a real room, and another VR room that represents a VR room. This made it possible to concentrate on different aspects of the experience of the transition.36/?
At that time, when the tech didn’t exist and the overlaid AR/VR layers drifted with respect to the real world, the only way to simulate what it will feel like when it eventually does stay put was to simulate the base reality layer, too.37/?
And answering these kinds of questions (or rather, just asking the right questions and talking about them) is very important for development – as said, there’s a lot of really really hard problems to solve for a product like this, and solving the right problems is the key.38/?
Perhaps during the prototyping phase you come to the conclusion that the experience is not compelling enough to warrant actually developing it – then just drop it. The best kind of code is code that doesn’t need to be written.39/?
Code that doesn't exist will have zero bugs, zero vulnerabilities, and need zero maintenance. All that saved dev effort can be used for things that do matter, such as smooth scrolling with a pleasant snapback.40/?
The price point, $3,499, falls in the middle of the $2k–5k bracket that was discussed back then. I’m impressed that the price got pushed even this low, given the tech, but even then it’s really expensive. Still, $5k would definitely have been more of a pro-only offering.41/?
It’s interesting that there’s a lot of 3rd-party developer stuff, collabs with companies like Zoom or Microsoft. Much less of a walled garden approach than I was expecting from Apple, and this’ll make market entry that much easier. Stuff you already use likely works already.42/?
It’s cool that they’re calling it a “spatial computer”, which is a significant break from the “tech first” way of calling it a “headset”. It’s not about what it is, but what you do with it: You can do all kinds of computing tasks with it, but in a new kind of spatial way.43/?
So, anyhow, it’s cool that they finally got it out, it’s exciting to see some of the things that I worked on half a decade ago finally out in the real world. I think it's a major step forward for integrating the physical and the digital worlds in truly compelling ways.44/44
