Up early, so I drew User Flow 001 for the cloud soundstage project, which involved setting up the template and library. I pick simple scenarios for UF001 since it involves a fair amount of this sort of foundation work. This UF is "Trevor creates a Showrunner Account for Kelly".
Digging into UF002, "Kelly creates a show, season, and episode." Design research shows variation in how cultures name and frame these ideas (e.g. "season" in the US and "series" in the UK) and a whole range of words for shorter and longer runs of shows or shows with a theme.
More user flows:
🌸 003 Kelly manages cast and crew invites
🌸 004 June creates a crew account
🌸 005 June checks invitations and assignments
🌸 006 Kelly manages the script
🌸 007 Kelly directs the set <-- 🎉 contains the 1st control room station design. 🎉
Ok, I've started user flow 008: "June runs video" in which June uses the web equivalent of one of those many-button, many-knob control surfaces (e.g. the ATEM video switcher in this picture) and follows spoken directions from Kelly.
There will be keyboard controls, but there also needs to be a way to link a MIDI control surface to the major functions.
While traveling I worked on User Flow 008: June Runs a Camera. It's good fun figuring out what the system can do that physical sets and cameras can't.
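One way to picture that link is a single binding table that routes both keyboard and MIDI input to the same named station functions. This is only a rough Python sketch: the action names, key choices, and MIDI note numbers are all made up for illustration, and no real MIDI library is involved.

```python
# Hypothetical bindings: (input source, code) -> named station function.
# Key names and MIDI note numbers are placeholders, not part of any real design.
BINDINGS = {
    ("key", "1"): "cut_to_camera_1",
    ("key", "2"): "cut_to_camera_2",
    ("key", " "): "take_preview",
    ("midi_note", 36): "cut_to_camera_1",
    ("midi_note", 37): "cut_to_camera_2",
    ("midi_note", 48): "take_preview",
}


def resolve_action(source, code):
    """Return the station function bound to an input event, or None if unbound."""
    return BINDINGS.get((source, code))


def bindings_for(action):
    """List every input that triggers an action, e.g. to show in a help overlay."""
    return sorted(
        (source, str(code))
        for (source, code), name in BINDINGS.items()
        if name == action
    )
```

Keeping the functions named and the inputs in a table means a control surface can be swapped without touching the station logic.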
For UF010 June will actually crew on set (herding guests backstage) so I greyboxed a small set in order to quickly generate scenes from any point of view. I'm a firm believer in greyboxing to convey an idea and connote that it's not a complete design. [Images: Top view of a 3D modeling program showing a cloud-hosted talk show set. Simple render view of a cloud-hosted talk show set. Simple digital render view of a blocky talk show host.]
(I'm tempted to write a script with @amyleighmorgan and then animate these little blockheads but I'm trying to stay focused on the design)
These are the remaining user flow topics:
🌸 fee payment
🌸 pre-episode clip and key management
🌸 pre-episode avatar management
🌸 pre-episode set management
🌸 crew training
🌸 4 more crew stations
🌸 audience invitations
🌸 technical difficulty
🌸 password & email management
In "UF016: June Runs Lighting" it is interesting to redesign the workflows for a live show with cloud sets. The lighting engineer can preview and change lighting scenes on the live set without disturbing the lights rendered on the live feed. Non-physical sets are 🤯
The lighting engineer could create one lighting mood for the video renderer, a different one for the live audience, and yet another for cast and crew. I'm *not* saying this is a good idea, just that the design space is inherently different than for physical sets.
Because each person on a cloud soundstage is rendering locally in their VR headset, personal accessibility options like raising the intensity or contrast become possible. People with different color perception could change the render to compensate.
UF016 needs review by a domain expert so I'm going to put it on the back burner until I can schedule that. I'm waffling on how to balance flexibility and baked-in assumptions when creating the set assets. [Image: A rough draft of a user flow design document showing each step the user takes, like choosing buttons or moving a slider control.]
On one extreme is a general purpose lighting console like the grandMA3 with a huge possibility space, training cost, and frankly way too much capability for talk show production. (if you're a control design nerd it's good fun anyway)
The other extreme is to pre-configure all of the lighting into a few scenes that the crew member can transition between by pressing a button. (This is a quick illustration, not an actual design.) [Image: A small set of buttons and a video window.]
The middle path will include tactics like great defaults and progressively revealing complexity on an as-needed basis so that users can quickly get started making high quality shows and then dive into the weeds as their confidence grows.
Designs for #widerweb apps like this cloud soundstage are somewhat unique because they responsively shift between flat, portal, and immersive displays and use page, overlay, and spatial control types. Designers choose one or more types for each control.
A trivial example: a control that toggles visibility of a prop could be a flat button in a web page, a 2-position slider overlaid on a portal display, and a 3D switch in an immersive display. Alternatively, it could be a flat button in all three display modes.
The position of @PotassiumES is that the designer must decide the appropriate control type for each display mode for each use of a control. The visibility toggle I mentioned above has different usage patterns (and thus control types) than something like a microphone mute toggle.
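Those per-mode decisions can be written down as a small lookup table. This is a minimal Python sketch, not the @PotassiumES API: the enum names, the `CONTROL_CHOICES` table, and the mic mute choices are all assumptions made up for illustration.

```python
from enum import Enum


class DisplayMode(Enum):
    FLAT = "flat"
    PORTAL = "portal"
    IMMERSIVE = "immersive"


class ControlType(Enum):
    PAGE = "page"
    OVERLAY = "overlay"
    SPATIAL = "spatial"


# Hypothetical design decisions: one control type per display mode per control.
CONTROL_CHOICES = {
    "prop_visibility": {
        DisplayMode.FLAT: ControlType.PAGE,          # flat button in a web page
        DisplayMode.PORTAL: ControlType.OVERLAY,     # 2-position slider over the portal view
        DisplayMode.IMMERSIVE: ControlType.SPATIAL,  # 3D switch
    },
    "mic_mute": {
        # Made-up choice: the same page-style button in every mode.
        DisplayMode.FLAT: ControlType.PAGE,
        DisplayMode.PORTAL: ControlType.PAGE,
        DisplayMode.IMMERSIVE: ControlType.PAGE,
    },
}


def control_type_for(control, mode):
    """Look up the designer's choice; a missing entry raises KeyError on purpose."""
    return CONTROL_CHOICES[control][mode]


def missing_decisions(choices):
    """List (control, mode) pairs that the designer has not decided yet."""
    return [
        (name, mode)
        for name, per_mode in choices.items()
        for mode in DisplayMode
        if mode not in per_mode
    ]
```

A check like `missing_decisions` makes an undecided mode show up as a design gap instead of a silent default.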
I'm starting to think and have conversations about the service side of the soundstage product so I whipped up a high-level block diagram that illustrates the mental frame I'm using as well as what network protocols are used to communicate between the blocks. [Image: A block diagram showing the software components of a cloud service, including video mixing, simulating the sets, and providing a web-browser-based user interface.]
A few of the major components could be stand-alone products so it's interesting to look at current offerings and consider which blocks they chose and how they're selling them. This realization helps set expectations of scale and cost for R&D and offers insight into how to sell.
Ok, I took a first pass at UF017 that shows the cueing station where June (a crew persona) runs the cast's teleprompters as well as the audience prompts (boo, laugh, quiet, etc). [Image: A user flow design document showing each step that June takes to do her job.]
As a reminder: User flows are not visual specs. Their purpose is to work through each step that a user takes to achieve their goals. They shake out any missing pieces and misunderstandings to avoid development dead ends and wasted work.
I'm most interested in how they handle lighting and camera motion on a hybrid CGI / physical set.
It could be fun to produce a morning talk show that uses NASA data to inform chit-chat about goings-on around the Sol system.
Ok, diving into UF018: June manages avatars. It encompasses wardrobe, hair and makeup, and in a weird way a personal trainer. When a guest is booked they need to be happy with their avatar in order to have a good show.
When in the middle of a design I often rough-cut sizing guides in foamcore so that I can more clearly imagine a control or interaction with accurate physical context. I also print out controls and tape them to the sizing guides so that I can physically step through a user flow. [Images: A rectangle of black foamcore, roughly 12 inches by 9 inches. A paper printout of a rough video switcher design that is backed by the black foamcore.]
I use physical props along with VR to iterate through control and interaction designs. Here's a video from a couple of years ago when I was using a @Spaciblo space to find comfy distances for a very early version of Firefox Reality browser controls.
At some point I might get/make a vacuformer so that I can form UX props with sheet styrene as demonstrated in this Tested one day build.
UF018 (June Manages Avatars) turned into a long story. June coordinates with a guest and her stylist while creating the avatar, which includes several video calls and a meeting in a VR fitting room.
UF019 (June Manages Clips and Keys) was pretty quick because the assumption is that June will edit the source files in a different app. [Image: A user flow diagram, showing each step of uploading and editing a clip.]
A bit of context: Here are the user flows I've completed (the ones with numbers) and the user flows that remain to do before I switch to the next phase. [Images: A list of text showing the name of each user flow that has a v1. A list of text showing the name of each user flow that remains in this phase.]
By "completed" I mean that I've drawn up a first version. There will be additional versions in later phases of the process.
Ok, UF020 (June manages stage marks) follows a crew member as she sets up a new teleport destination next to the audience so that Allison (the host) can warm up the crowd.
Aaaaaaaand, UF021 (June manages props) is roughed out so that I can (maybe) relax a bit over the long weekend. For some reason I modeled a goofy tokamak? I don't know.
There are so many interesting directions we can go with digital actors (aka avatars).
I got the urge to write an executive summary for the cloud soundstage product. (🤷🏻‍♀️) I sort of love reframing concepts from niche domains like CG and XR so that they can slip easily into the context of people working in different dimensions of business.
For example, people in film or television production understand the phrase "digital actors" more easily than "avatars". Similarly, a "fully digital set" makes more sense than a "virtual reality space".
But a "fully digital set" could also mean a physical space where all of the cabling is optical fiber instead of analog so I have to further clarify that aspect.
I ran a bunch of numbers this weekend (because, vacation?) and on an hourly basis the largest cost for a video production service like a digital soundstage in the cloud is the GPU-enabled server instances. There are a variety of a/v streams coming in, being mixed, and going out.
(note, these are costs for the service itself, not the production costs of a show like payroll, booking guests, etc)
Assumptions for a show w/30 audience members:
🌸 46 mics
🌸 36 avatars
🌸 46 audio monitors
🌸 20 video monitors
🌸 36 on-set video monitors
🌸 36 scene replication channels
🌸 6 RTMP broadcasts
🌸 11 crew control streams
Using prices for AWS's Oregon location, the bandwidth and load-balancing cost of the outgoing streams is about $20 per hour per live set. It's a little less if it's just cast and crew (e.g. a closed rehearsal, set construction, etc) but not as much as you might think.
The cost to store everything that you'd need to re-edit and/or re-shoot an episode after it airs (incoming streams like mics and avatar motion, rendered camera streams, set replication stream, etc):
🌸 $1.3915/month for S3
🌸 $12.0175/month for EBS
🌸 $18.975/month for EFS
Storing the various video clips, overlay keys, images, and PDFs used as source material for a talk show is pennies per month on all three storage tiers (S3, EBS, and EFS).
And all of the above costs are almost lost in the noise compared to GPU instances, which run from $3 to $31 per hour. My seriously wild-ass guess (SWAG) is that a show like I'm targeting will need ~five separate GPU instances for intake, simulation, rendering, mixing, and outlet.
Depending on the type of GPU instance each role demands, costs range from ~$15 to $155 per hour. A daily show set that is open 16 hours each day costs up to ~$2500 per day just for the GPU instances!
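The arithmetic behind that range is easy to check. Here's a minimal Python sketch that assumes one GPU instance per role and prices every instance at either the low or the high end of the quoted hourly range; the constant and function names are mine, and the real mix of instance types is still a SWAG.

```python
GPU_HOURLY_LOW = 3.0    # USD per hour, low end of the GPU instance prices quoted above
GPU_HOURLY_HIGH = 31.0  # USD per hour, high end of the GPU instance prices quoted above
ROLES = ["intake", "simulation", "rendering", "mixing", "outlet"]  # one instance each (assumption)
OPEN_HOURS_PER_DAY = 16


def gpu_cost_range(instances, hours=1.0):
    """Return (low, high) GPU cost in USD if every instance is the cheapest or the priciest type."""
    low = instances * GPU_HOURLY_LOW * hours
    high = instances * GPU_HOURLY_HIGH * hours
    return low, high


hourly = gpu_cost_range(len(ROLES))                     # (15.0, 155.0)
daily = gpu_cost_range(len(ROLES), OPEN_HOURS_PER_DAY)  # (240.0, 2480.0)
```

The daily high end of $2480 is where the "~$2500" figure comes from.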
Obviously it will take a few experiments to nail down that SWAG and get trustworthy numbers but it's still handy to know the range and how to put a dollar value on more or less efficient tech like simulators, renderers, and transcoders.
With numbers like this one could run a cost/benefit analysis on building and maintaining two renderers (one for browsers, one for servers) against the "cost of goods sold" (COGS) which is lingo for how much the business pays to deliver the service.
Ok, UF022 (Sandra [the director] runs rehearsal) was a short one but there are changes in there that I'll need to roll back into previous user flows. There will be a big normalization pass once I've worked on the six (!) remaining user flows before I call this phase "done".
I just realized that it has been exactly one month since I started drawing up user flows for this product. For a couple of weeks it has felt like it's taking too long, in part because I work on it in the part of my 1/3 not-client-work hours that aren't for W3C, AR hardware, etc.
I've enjoyed the stuffing out of the work and I am happy to have it in hand while poking at various options for what comes next for the product.
Another use for fully computer graphics sets and digital actors: corporate talk shows. When fully-remote orgs are the norm and travel is curtailed this is the tool for a variety of formats ranging from fireside chat to Jobsian unveiling of new products.
I'm designing for a late night talk show format but I'm also interested in its more serious sister, Sunday morning political shows.
This thread has turned into a nice history of the product design process so here's a handy diagram showing the main ideas. [Image: A diagram showing that cast and crew work in VR headsets and on web pages, that the service has fully digital sets and actors, and that the video is streamed live to social media for audience members in VR headsets or in their web browsers.]
Ok, moving on to User Flow 23, Kelly Manages Guest and Audience Invitations. Eventually invitations could be sent via text or other messaging channels (via Twilio et al) but to start it's email all the way.
Whoops, as soon as I tweeted I noticed an error in UF023-004 and UF023-005. There shouldn't be an accepted/not-accepted count on the Guest list until at least one invitation has been sent.
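The corrected rule is small enough to sketch in Python. The invitation records and their `sent` and `accepted` fields are a hypothetical data shape, not the actual design.

```python
def guest_list_counts(invitations):
    """Return accepted/not-accepted counts, or None when nothing has been sent yet."""
    sent = [inv for inv in invitations if inv["sent"]]
    if not sent:
        return None  # no invitations sent, so the Guest list shows no count at all
    accepted = sum(1 for inv in sent if inv["accepted"])
    return {"accepted": accepted, "not_accepted": len(sent) - accepted}
```

Returning `None` instead of zeros keeps "nobody has been invited" visibly different from "nobody has accepted".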
This is a good time to talk about the ID system of user flows. Each user flow document is referred to by UF### where ### is its number. So user flow 23 is "UF023". Each step in the flow has a number. So, the third step in UF023 is "UF023-003". I say it like "u f 23 step 3".
It seems overly prescriptive but you would not believe how this simple system makes it easy to talk about the designs. In addition, in the explanatory text under each step it's easy to write things like "Kelly clicks 'delete' and is taken to UF012-003". Fast and neat!
Oh, and when it's time to start breaking designs into developer tasks it's very convenient to be able to refer to exact steps in both the task description and in follow-up discussions between the designers and developers. Time and frustration saved!
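The ID scheme is regular enough to format and parse mechanically, which is handy for linking task descriptions back to steps. A minimal Python sketch follows; the function names are made up for illustration.

```python
import re

# "UF" + three-digit flow number, optionally "-" + three-digit step number.
ID_PATTERN = re.compile(r"UF(\d{3})(?:-(\d{3}))?")


def format_id(flow, step=None):
    """Build an ID like 'UF023' or 'UF023-003'."""
    flow_id = f"UF{flow:03d}"
    return flow_id if step is None else f"{flow_id}-{step:03d}"


def parse_id(text):
    """Split an ID into (flow, step); step is None for a whole-flow ID."""
    match = ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a user flow ID: {text!r}")
    flow, step = match.groups()
    return int(flow), (int(step) if step is not None else None)


def spoken(text):
    """Render an ID the way it is said aloud."""
    flow, step = parse_id(text)
    return f"u f {flow}" if step is None else f"u f {flow} step {step}"
```

For example, `format_id(23, 3)` gives "UF023-003" and `spoken("UF023-003")` gives "u f 23 step 3".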
I woke up early this morning and finished a first draft of User Flow 24: Sandra Handles Technical Difficulties. There are many moving pieces that can fail ranging from dead batteries in a VR headset to mysterious network errors. Production teams need to know what's happening.
Ok, a slow week for this project due to travel and other work but UF025 has landed! In this user flow a service moderator, Tavi, monitors a show that violates the terms of service.
This design assumes a "high-touch" service where there are few shows happening at any time and the pool of moderators can be pretty small. The alternative is a reaction-based system where moderators show up only when there are complaints or programmatically detected violations.
I continue to think about how this and other media production services could and should fit into the wider context of our noosphere.
There are only two more user flows to draw up and then I'm done with phase 1 of this design process! I'm struggling with impatience and it's hard to avoid rushing through these last designs.
Phase 1: ✅
For phase 2 of the design process the first thing I do is to break down the major views in the web app, give them each a URL, and then list each user flow step that shows each view. With this I will revisit all steps showing each view to tighten them up and make them consistent.
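That breakdown is essentially an inverted index from view URL to the steps that show it. A minimal Python sketch is below; the URLs and the step-to-view pairings are made-up placeholders, not the real site map.

```python
from collections import defaultdict

# Hypothetical data: (step ID, URL of the view that step shows).
STEP_VIEWS = [
    ("UF002-001", "/shows/"),
    ("UF002-002", "/shows/new/"),
    ("UF023-004", "/shows/guests/"),
    ("UF023-005", "/shows/guests/"),
    ("UF003-001", "/shows/"),
]


def steps_by_view(step_views):
    """Group step IDs under the URL of the view they show, sorted for stable review."""
    index = defaultdict(list)
    for step_id, url in step_views:
        index[url].append(step_id)
    return {url: sorted(steps) for url, steps in sorted(index.items())}
```

Walking the result one URL at a time gives the list of steps to revisit together when tightening up each view.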
I just want to say that if you want to watch talk shows produced by small teams then Amazon Prime Video is a goldmine.
Tiny Tiny Talk Show: amazon.com/Lauren-Lapkus-…
The Goodnight Show: amazon.com/Ladies-and-Ger…
Designs inevitably change as decisions are made so after all of the user flows are drawn another pass is needed to make the early and late user flows consistent with each other. I just finished that pass.
Phase 2: ✔️
And with that I've finished what I originally set out to do: I put together provisional personas, wrote up a scenario list, laid out the brand basis and voice, and then drew up user flows that cover the experience of making a talk show on a cloud soundstage. Yay!
I've started putting together a visual guide for the soundstage web app. I'm using the v2 US Web Design System (designsystem.digital.gov) as a basis as it's a nice balance of visual design and accessibility practices. I'll probably tweak the colors once I've laid out a few pages.
I'm waiting on a US trademark service response before I know whether the name that I want to use is available. Hopefully I'll know by the end of the month.
How many of us want to sit on the sofa? How many of us want to sit in the chair? [Image: A crudely rendered computer generated talk show set with a chair for the host and a sofa for the guests.]
To clarify, the above image is a render from my modeling practice in Blender, not from code for this project.
Thanks to YouTube tutorials, I now know how to make simple animations in Blender. I will try to not abuse this knowledge.
I actually have something for #screenshotsaturday! I'm mostly just playing with layout and lighting to get a feel for this level of complexity of models and materials. I have a specific digital actor (aka avatar) aesthetic in mind and it's important that the set match. [Images: A computer generated talk show set. The host desk of a computer generated talk show set. The guest interview portion of a computer generated talk show set. The monologue curtain of a computer generated talk show set.]
I'm slowly learning how to build CG sets for video production. Especially on streaming platforms, a lot of the small details will disappear in the video render so simple shape blocking of the camera shots is key.
Here's an overhead view of the new CG soundstage. It's designed for a talk show with two cast members (host and announcer), two or three guests, and up to 30 on-set audience members. [Image: Screenshot of the Blender graphics program that shows a top-down view of the entire soundstage, including the backroom, stage, and audience seating.]
The back-stage room on the top of the image has a place for guests to sit when they're not on-stage and there's a camera back there so the host can welcome them to the show before they come out. The curved feature on the left about 1/3 down from the top is the monologue curtain.
Now that I've roughed out a soundstage my thoughts have turned to the workflows, visual design, and usage patterns of digital actors. VR folks call them "avatars" but that's a bit of cultural appropriation that I'm happy to leave behind.
One scenario that I think about is the (not real but should be) "Mindy Kaling Show" with guests Oprah Winfrey, Deepak Chopra, and a musical performance by Maya Rudolph. [Images: Mindy Kaling. Oprah and Deepak. Maya Rudolph.]
On the far end of the spectrum from abstract to realistic we would use a capture stage to create a highly accurate model of each person. In tightly controlled environments like a talk show set we might be able to render people without triggering uncanny valley unease.
On the other end we'd hire a technical artist to create symbolic systems that represent each person's brand. For example, in an early prototype of a social VR space (@Spaciblo) I once was an ancient sculpture who interviewed a baby pterodactyl.
While it's tempting to design a system for digital actors that is so flexible that Kelly (our showrunner persona) can choose any point along the spectrum from abstract to realistic, that's like designing a pair of shoes for the beach and the red carpet.
Teams at Facebook have taken a couple of runs at the middle of the abstract <-> realistic spectrum. They have different constraints because their digital actors are configured by people with no training. Their configurators balance flexibility with choice fatigue.
Social spaces like Hubs, AltSpace, and High Fidelity each provide three ways to create digital actors:
🌸 Configurators
🌸 Catalogs
🌸 External editor workflows
Here's a video from High Fidelity demonstrating creation and import using Adobe Fuse.
Here's Adobe's configurator. It's designed to require little or no training and links to Mixamo (mixamo.com/#/) which is a skeleton rigging system that markets itself as a "No 3D knowledge required" tool.
My main design criteria for a digital actor workflow are:
🌸 Cast and guests feel well represented
🌸 Audience members feel at ease and connected to people on stage
🌸 Technical artists spend no more than four hours per person per episode using industry standard tools
There also are a variety of technical criteria around rendering and load times as well as concerns about development and maintenance costs of the software. While those are obviously relevant I find it's helpful to pause those thoughts until farther along in the design process.
User Flow 18 steps 2 through 5 quickly gloss over the process of downloading body and wardrobe templates, using Blender to make them match a specific person, and then uploading the results into the system.
Ok, I spent half of today looking at various digital actor creation workflows. The state of the art is pretty much configurators with the escape valve option to use futzy processes in third party editors with variably effective export modules.
We're also still digging our way out of the complexity of the non-standard storage and data transfer formats created during the past two decades.
For this project, since it's an alternative to an existing show format, I'm starting with the assumption that cast and guests will "look like" their material selves. I don't expect shows to have the resources to capture and rig art assets that are photorealistic yet not uncanny.
So, for this design for this soundstage for this show format there will be a set of Blender templates. They'll hold wardrobe and a variety of bodies and skins created with modification in mind and tuned for clean export to a browser rendering engine like Three.js or Babylon.js.
The goal is for a technical artist to spend between one and four hours putting together a digital actor (body + wardrobe) for each of the five people on stage (two cast members, three guests). So, that's between five and twenty hours for each episode.
The main variable for that time estimate is not technical complexity but how the people being twinned feel about the artist's work and the constraints of the rendering engine.
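The five-to-twenty-hour range is just headcount times the per-actor range. Here it is as a tiny Python sketch using the numbers from the tweets above; the names are mine.

```python
PEOPLE_ON_STAGE = {"cast": 2, "guests": 3}  # from the soundstage design above
HOURS_PER_ACTOR = (1, 4)                    # target range for body + wardrobe


def artist_hours_per_episode(people, hours_per_actor):
    """Return (low, high) technical artist hours for one episode."""
    headcount = sum(people.values())
    low, high = hours_per_actor
    return headcount * low, headcount * high
```

Changing the guest count or the per-actor target shows quickly how the per-episode budget moves.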
I'm having fun learning how to model and rig in Blender. If you can believe it, this goofy image represents quite a bit of improvement in my ability to create an outfit from scratch. I usually work with artists who know how to do all of this but I like to learn the ropes. [Images: Rough 3D model of a person. 3D model of a person's top and pants. 3D model of a person with a visible animation armature.]
I tend to pick up a lot of basic skills so that I can better understand the people with whom I work. I'll probably never be a great modeler, rigger, or animator but now I can have a clear conversation with those folks and as a result our work will be better.
My attention has turned to R&D. A few years ago I built a platform for shared WebVR spaces, @Spaciblo, and one path would be to cherry-pick parts from that project (see diagram) and use them with @PotassiumES to jump-start the soundstage code-base. [Images: Block diagram of major technical components in the Spaciblo service. Block diagram of major technical components in the Soundstage service.]
The skills involved with figuring out how to build and then building this product include:
🌸 UX invention and design across three display types (flat, portal, immersive), three control types (page, overlay, spatial), and many new input types.
🌸 Interface invention and design across the same display modes and control types, including visual, auditory, animation, and accessibility concerns
🌸 So much product management; just a boatload of coordination, empathy, and dynamic planning due to changes in the environment and the raw materials like headsets, GPUs, cloud instances, networks, etc.
Many engineering niches:
🌸 Real-time rendering
🌸 Simulation of various sorts
🌸 Replication and various net trickery
🌸 Video de/encoding and streaming
🌸 Web UI
🌸 Ops & SRE
🌸 Media storage
🌸 Spatial media tools
🌸 On-set lighting
🌸 On-set sound
Digital equivalents to traditional actor prep:
🌸 Body modeling, rigging, and animation
🌸 Wardrobe
🌸 Hair
🌸 Make-up
🌸 Moderation and other "live team" considerations
🌸 Ethical consideration of creating and running a new media production platform
Looking at @Spaciblo, we'd need to rewrite much of it to make it a simulation and replication system for a CG soundstage. It was the right tool for a different goal: to be a sort of WordPress for shared WebVR spaces. My use of it to make video was an afterthought.
Looking at @PotassiumES, it needs a few improvements to performance, accessibility, and internationalization but it's the only entry in the race for tri-modal display support (flat, portal, and immersive) and the type of input->action mapping required by this design and new hardware.
I'm glad to see Industrial Light and Magic talk publicly about their virtual production product, Stagecraft. Together with the press about "The Lion King", Stagecraft is going to validate the idea of a CG soundstage in the minds of many more people.
The cloud soundstage design in this thread is aimed at small distributed teams making live shows while Stagecraft is meant for larger colocated teams making films. So, they share similar underlying concepts of CG sets and digital actors but for different markets and products.
No, you're practicing the art of modeling power suits!
The armpits are still a bit wonky, but this outfit is better than the last one and that's what I'm going for.
Here are a few tips about facial animation that I picked up from various systems:
🌸 Blink rate increases when moving due to air flow
🌸 Exaggerate user's head movements when rendering
🌸 Raise eyebrows and widen eyes on sudden changes in voice pitch or head movement
🌸 Dilate and constrict pupils in changing light (people notice!)
🌸 Snap to mutual eye contact (this one I'd avoid, but apparently it feels good to some people)
The ritualistic and ceremonial forms of late night talk shows are pleasing to me for the same reason that this TikTok account's daily dog greetings are pleasing: the combination of expectation and novelty in a safe and passive format.
I know down to the minute when James Corden is going to end his monologue, go to commercial break, run a guest package, etc. I know that Ty is going to greet those puppers and offer them a treat. In each show there's room for the energetic bounce of slightly foiled expectations.
Maybe James Corden has allergies and doesn't get a treat... wait, I'm mixing them up.
The soundstage is a somewhat unique product because it needs automated actors for stand-in audience members, manually puppeted actors so that people can appear on stage using a flat device, and full body actors worn by people in VR.
Let it begin. [Image: A terminal window showing a text command to create a new application directory using a Rust programming language tool, Cargo.]
Animating mouth movement from audio is coming along nicely at Max Planck Gesellschaft. I haven't seen this quality with live audio so I'm not sure how it would feel on set compared to one of the current computer vision recognizers.
The project site is here: voca.is.tue.mpg.de
Facebook has interesting work on animating digital actors' faces using cameras on their HMDs.
I spent yesterday afternoon thinking and researching a technical architecture for soundstage simulation, replication, and rendering. It's tricky to balance immediate needs and potential future capabilities when tech like expression tracking is on the horizon.
I suspect that Netflix's "talk show problem" could be addressed by a new model of production based around fully distributed production teams and a global pool of guests. nytimes.com/2019/07/02/bus…
Last night to spin down before sleep I wrote a list of the rough order of development of the first milestone: a basic CG soundstage on a server that can be accessed via the flat and immersive web. So, no video tools but a way for folks to "feel" how it will be.
The value of a web-hosted soundstage is reduced (though not entirely eliminated) if our long-haul network speeds degrade and/or bandwidth costs increase. It changes the underlying economics of video (live and canned) and it challenges remote teamwork.
One of the paths I'm researching is how to blend 3D modeled bodies with ~2D animated faces, so I'm watching a fair number of artists on YouTube talk through their animation process.
My goal is for it to take a technical artist less than four hours per cast member or on-set guest to create a digital body with wardrobe. That's an unrealistic goal for a fully modeled realistic body and face that crosses the uncanny valley, even if we could render it in realtime.
I'm focused on learning how to put together workflows for technical artists to create less realistic but still expressive and representative 3D bodies with ~2D faces. Facebook tried a few variations of this idea but they were constrained by making them end-user-configurable.
During this design research I've kept a running list of the illustration techniques and underlying concepts that are running through artists' heads while they draw. I hope to pull out the essential capabilities and to create a workflow to model, draw and rig a digital body.
The other ~1/2 of this problem is how to make the digital body a good representation of how the wearer gestures, talks, and makes facial expressions. There is a fine balance between over-animated, artificial rendering and static, stiff rendering due to lack of information.
I'm thinking through how to fuse 2D illustration and animation techniques with programmatic control of a face based on a microphone stream and tracker data from VR rigs. I think there will be a fair amount of swapping out drawn shapes and morphing the meshes on which they sit.
There will be a lot of horribly disfigured faces during the development process and I'm not looking forward to that part. The rest should be fun, though. The WIP videos should be a hoot.
They're creating a lovely fusion of 3D models with illustrated faces over on @BillieBustUp. I'm aiming for facial feature shapes closer to realistic but using many of the same line and color techniques.
Both @KatieBlueprint and @LukeCutts3D post wonderful WIPs of their original characters. I find them to be a delight.
Blender artists animate by snapping between UV positions of a texture using an armature. I'm going to set up a similar workflow where the artist draws mouth, eyes, and brows on a grid and drops it into a 3D body template for glTF export.
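The grid math behind that snapping is simple. Below is a minimal Python sketch that assumes the character sheet is an evenly divided grid, cells are counted left to right and top to bottom, and UV (0, 0) is the top-left corner of the image (the glTF convention; Blender's own UV editor puts the origin at the bottom-left). The function names and the 4x4 grid in the usage note are hypothetical.

```python
def cell_uv_rect(index, columns, rows):
    """Return (u_min, v_min, u_max, v_max) for one cell of the character sheet."""
    if not 0 <= index < columns * rows:
        raise IndexError(f"cell {index} is outside a {columns}x{rows} grid")
    row, col = divmod(index, columns)
    width, height = 1.0 / columns, 1.0 / rows
    return (col * width, row * height, (col + 1) * width, (row + 1) * height)


def snap_offset(from_index, to_index, columns, rows):
    """UV translation that moves a face patch from one cell to another in a single jump."""
    u0, v0, _, _ = cell_uv_rect(from_index, columns, rows)
    u1, v1, _, _ = cell_uv_rect(to_index, columns, rows)
    return (u1 - u0, v1 - v0)
```

On a 4x4 sheet, `cell_uv_rect(5, 4, 4)` is `(0.25, 0.25, 0.5, 0.5)`, and snapping from cell 0 to cell 5 is an offset of `(0.25, 0.25)` with no in-between values.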
Character sheets like these are simultaneously fun and odd. So much of our 🧠 is devoted to recognizing and normalizing faces that a few lines and shapes turn into a story of who we're seeing and how they feel. [Image: A series of basic illustrations of a face making a variety of expressions.]
Disney just dropped a neat technique for jaw tracking without markers, basically by training with markers and then smart retargeting of the model. I expect to see new HMDs and aftermarket mods that point cameras at mouths.
A day late for #ScreenshotSaturday but here's a WIP of me playing around with mapping a character sheet like I tweeted above onto a 3D model. The idea is to give illustrators a way to draw up a face and then see it in motion on a digital body.
I'm spinning up a domain (TransmutableSoundstage.com) and a Twitter account (@TSoundstage) for this project. There's not much going on there yet but there's a list signup form on the site for folks who want to hear when I open up early access.
@TSoundstage It's nice to go back to @PotassiumES development. I'm making it lighter-weight for situations where only flat mode is used so that I can use it all over Soundstage.
@TSoundstage @PotassiumES I have my local @PotassiumES updated to the latest Three.js and switched over to using its ES modules instead of external scripts. The lighting is now wonky so I need to figure out what's causing that.
@TSoundstage @PotassiumES At some point I need to write a new input library using the lessons I learned from writing Mozilla's action-input (which nobody uses) to make it both lighter-weight and more capable. I don't think that point is now, plus I'm impatient to write the set simulator and replication.
@TSoundstage @PotassiumES The current tech stack:
🌸Rocket (rocket.rs)
🌸Tera templates for splash, login, etc
🌸@PotassiumES for authed-app pages
Not figured out:
🌸Rust sim and replication (maybe specs ECS)
🌸AV intake & outlet
🌸Control messaging
🌸Structured data and blob persistence
@TSoundstage @PotassiumES There's also a big question mark about which bits will be on both sides of the wire as WebAssembly wadges. I need to make progress on the sim and replication design to know exactly what runs where and how the wasm bits will share data with the JS bits.
@TSoundstage @PotassiumES Now that Mozilla has a plan for re-enabling SharedArrayBuffers I can use them without locking the app into one browser engine.
@TSoundstage @PotassiumES The work that @_alexeykalinin does on their procedural character generator is similar in concept to where I'd like to eventually get within Blender. A technical artist could quickly generate a rigged base mesh and then tweak for an individual person.
Now that I've moved into a coding phase I don't have much interesting to share other than I feel like I'm wading through molasses while I learn Rust and Rocket. Everything seems to take 10x longer than it would if I were using a language and framework that I already know.
I know #screenshotsaturday is mostly for graphics and games but I'm going to share my ugly WIP site pages because they represent a fair amount of foundational work figuring out nice build and test systems, grokking Rocket.rs, writing some command line tools, etc.
I'm also reviewing the user flow design documents (remember those? 😺) and documenting the various data structures and information paths that need to exist in the implementation. I feel like the design docs just saved me hundreds of hours of frustrating development!
I've also read other people's code to see how they've approached video switching and sound mixing. It's a big space and I could easily spend the next five years scratch-building a feature-complete A/V solution. 🙀 Instead I'm going to use open libraries as much as possible.
I now have data models for services and clients based on their implied existence in the user flows. For example, if a crew member moves a camera then that info has to originate in a client, propagate to several services (sim, renderer, sound mixer, etc) and then go to VR clients.
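The fan-out step in that camera example could look something like the sketch below. It uses only the standard library's channels, and every name in it (`ControlMessage`, `Hub`, the subscriber names) is hypothetical; the real data models and transport are not described in this thread.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical control message; the real data model is not shown in the thread.
#[derive(Debug, Clone, PartialEq)]
enum ControlMessage {
    MoveCamera { camera_id: u32, position: [f32; 3] },
}

/// Fans one message out to every subscribed service.
struct Hub {
    subscribers: Vec<(String, Sender<ControlMessage>)>,
}

impl Hub {
    fn new() -> Self {
        Hub { subscribers: Vec::new() }
    }

    /// Registers a service by name and hands back its inbox.
    fn subscribe(&mut self, name: &str) -> Receiver<ControlMessage> {
        let (tx, rx) = channel();
        self.subscribers.push((name.to_string(), tx));
        rx
    }

    /// Sends a copy of the message to each service and returns the
    /// names of the services whose inboxes are still open.
    fn publish(&self, msg: &ControlMessage) -> Vec<String> {
        self.subscribers
            .iter()
            .filter(|(_, tx)| tx.send(msg.clone()).is_ok())
            .map(|(name, _)| name.clone())
            .collect()
    }
}

fn main() {
    let mut hub = Hub::new();
    let sim = hub.subscribe("sim");
    let renderer = hub.subscribe("renderer");
    let mixer = hub.subscribe("sound mixer");

    let msg = ControlMessage::MoveCamera { camera_id: 1, position: [0.0, 1.6, 3.0] };
    let delivered = hub.publish(&msg);
    println!("delivered to {:?}", delivered);

    for inbox in [&sim, &renderer, &mixer] {
        println!("{:?}", inbox.try_recv());
    }
}
```

The second hop, from services out to VR clients, would follow the same pattern with the services acting as publishers.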
There are interesting aspects of purpose-built XR spaces like a talk show set when compared to general purpose spaces like High Fidelity or even different-purpose spaces like Hubs.
There is a common core of tech (simulation, replication, rendering, ...) but so much changes with people's goals, roles, & relationships that I have trouble seeing how the scale, complexity, and specifics of these shows could work without baking them into the space itself.
I've worked on or been an involved creator in ~5 general-purpose networked spaces (some flat and some XR) and I've taste tested roughly the same number. In each case those involved struggled to build usable apps and environments for all but the simplest scenarios.
I'm hoping we'll eventually make the core tech for the wider web available as open modules that allow us to rapidly prototype a variety of purpose-built spaces. Perhaps eventually we'll know enough to build layers on top of those to produce excellent general-purpose spaces.
I was thinking more about the experience of being an audience member and how holding two tracked controllers prevents people from clapping in the normal way. It would sound odd for a talk show audience to vocalize but not clap. So, how to solve this?
One approach is to ask people to put down the controllers when they want to clap and hope that their headset mics pick up the sound. The upside is it's a pretty natural movement. The downside is that their controllers (used to position their digital hands) are static.
Another path is to detect clap-like motion and generate a clap-like noise for them; louder for faster movement, in time with their motion, etc. I tried this and it feels deeply unsatisfying (no hand-to-hand smack) and also like I'm going to break my controllers by going too far.
I'm going to try detecting a single-handed shaking movement (like rattling a tambourine) and see if I can map those to satisfying clap noises. Suggestions are welcome!
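One way to frame the shake detection is to count direction reversals in a controller's velocity along one axis and scale the clap volume by the peak speed. This is a rough sketch under assumptions: the function names, the sample values, and the thresholds are invented for illustration and would need tuning against real tracker data.

```rust
/// Counts direction reversals in controller velocity samples along one
/// axis (meters per second), ignoring samples slower than `min_speed`.
fn count_reversals(velocities: &[f32], min_speed: f32) -> usize {
    let mut reversals = 0;
    let mut last_sign = 0.0_f32;
    for &v in velocities {
        if !v.is_finite() || v.abs() < min_speed {
            continue; // skip bad samples and small jitter
        }
        let sign = v.signum();
        if last_sign != 0.0 && sign != last_sign {
            reversals += 1;
        }
        last_sign = sign;
    }
    reversals
}

/// Treats the window as a shake when it holds enough reversals.
fn is_shaking(velocities: &[f32], min_speed: f32, min_reversals: usize) -> bool {
    count_reversals(velocities, min_speed) >= min_reversals
}

/// Maps the fastest sample to a clap volume in 0.0..=1.0,
/// so faster shaking produces louder clapping.
fn clap_volume(velocities: &[f32], max_speed: f32) -> f32 {
    if max_speed <= 0.0 {
        return 0.0;
    }
    let peak = velocities
        .iter()
        .filter(|v| v.is_finite())
        .fold(0.0_f32, |m, v| m.max(v.abs()));
    (peak / max_speed).clamp(0.0, 1.0)
}

fn main() {
    // Hypothetical velocity samples for one short window of tracking data.
    let window = [1.0, -1.2, 0.05, 1.1, -0.9];
    if is_shaking(&window, 0.5, 3) {
        println!("clap at volume {:.2}", clap_volume(&window, 2.4));
    }
}
```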
This week I've worked mostly on the messaging channels between browsers and the service, making them resilient to various network problems and preparing for the back end modules like simulators, audio mixers, and video switches to send state updates and receive control messages.
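The thread doesn't say which network problems are handled or how. One common ingredient of a resilient channel is reconnecting with capped exponential backoff, sketched here as an assumption; `reconnect_delay` and the durations used are illustrative, not taken from the project.

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (starting at 0):
/// the base delay doubles each attempt and is capped at `max`.
fn reconnect_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}

fn main() {
    let base = Duration::from_millis(500);
    let max = Duration::from_secs(30);
    for attempt in 0..8 {
        println!("attempt {} waits {:?}", attempt, reconnect_delay(attempt, base, max));
    }
}
```

In practice a little random jitter is usually added to each delay so that many clients don't all reconnect at the same instant after an outage.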
Oh, @heymark mentioned ASL applause and it's a lot like what I was thinking about as the "tambourine rattling motion" but with a controller in each hand and generating some appreciative clapping noises.
@heymark It's interesting to see television talk show brands like @TeamCoco expand into podcasts. It will be more interesting to see the scrappier podcasting teams produce talk show video streams using tools like @TSoundstage.
@heymark @TeamCoco @TSoundstage In my continuing attempt to be the most boring #screenshotsaturday creator, here's what I see when I'm working on Transmutable Soundstage. I'm going to be relieved when I am far enough along to actually push pixels.
@heymark @TeamCoco @TSoundstage It's going to be an interesting decade for filmmakers and showrunners as they gain the ability to pull together global, fully-remote teams for more effective collaboration than is possible with LA traffic. 🚕🚕🚕
Ok, I've spent enough time down in foundational code that it's time to prep some test art!
🌸 A talk show set
🌸 A few props
🌸 Digital bodies for myself and a few others
🌸 Placeholder digital bodies for live audience members
Progress has been a bit slow in recent weeks while I was traveling and recovering from traveling but the message passing flow between browsers, public-facing services, and internal services is far enough along to start roughing out the simulation / replication / rendering bits.
I'm working toward the basic functionality of:
🌸Hosting a few people on set via WebXR
🌸Temporary in-browser camera views into the set that can be recorded & streamed with OBS until the back-end renderer and video switcher are implemented
Well, I ended up framing out the simulator code instead of diving back into Blender to make test art. I'm quite impatient for Transmutable Soundstage sets to be visible and inhabitable places.
I'd love to eventually incorporate this post production pattern (edit text to edit an A/V track) into @TSoundstage. It's tricky with multi-camera+mic streams, though.
@TSoundstage Today I laid in all of the DB tables by using the data models I derived from the user flows. Waterfall design gets a bad rap and often isn't a good fit for an org's culture, but when it works it feels ✨very nice✨.
@TSoundstage I'm still plugging away at the persistence layer and business logic. It's a bit tedious but writing and testing here will save a ton of headaches down the road when someone (probably me) attempts to put the system into an invalid state and is immediately rejected.
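As an illustration of rejecting an invalid state immediately, here is a minimal sketch of a guarded state change. The `EpisodeStatus` values and the allowed transitions are entirely hypothetical; the thread doesn't describe the actual business rules.

```rust
/// Hypothetical lifecycle for an episode; not the project's real model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EpisodeStatus {
    Draft,
    Scheduled,
    Live,
    Archived,
}

/// Error returned when a requested change would break the rules.
#[derive(Debug, PartialEq, Eq)]
struct InvalidTransition {
    from: EpisodeStatus,
    to: EpisodeStatus,
}

/// Returns the new status if the change is allowed, or an error
/// describing the rejected change. The allowed pairs are assumptions.
fn transition(
    from: EpisodeStatus,
    to: EpisodeStatus,
) -> Result<EpisodeStatus, InvalidTransition> {
    use EpisodeStatus::*;
    match (from, to) {
        (Draft, Scheduled)
        | (Scheduled, Draft)
        | (Scheduled, Live)
        | (Live, Archived) => Ok(to),
        _ => Err(InvalidTransition { from, to }),
    }
}

fn main() {
    // A scheduled episode can go live, but a draft cannot.
    println!("{:?}", transition(EpisodeStatus::Scheduled, EpisodeStatus::Live));
    println!("{:?}", transition(EpisodeStatus::Draft, EpisodeStatus::Live));
}
```

Keeping checks like this in the model layer means a web UI or a command line tool gets the same rejection without duplicating the rule.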
@TSoundstage Ok! All of the Create Read Update Delete (CRUD) and business logic is in the models and persistence layers, nicely tested. *whew* Now I get to jump up to web UIs. Actual pixels will move!
@TSoundstage I made some pixels move. I feel better.
@TSoundstage While KS is on fire there's another crowdfunder that's doing really interesting (and AFAICT positive) work in the community of filmmakers and showrunners. As I start developing shows for @TSoundstage it's interesting to watch @SeedAndSpark.
@TSoundstage @seedandspark Another ~uninteresting #ScreenshotSaturday but pieces and parts are poking above the waterline. Now that the business and persistence layers are in place I can frame out navigation on the site enough to be able to go to a show's page and enter the set. Also, command line tools.
@TSoundstage @seedandspark I'm spending next weekend at the @OfficeNomads "Finish Up Weekend" where I'll prep art assets for the initial talk show set. In the meantime I'm returning to the simulator and replication code with the goal of enabling remote renderers to load a set and react to changes.
I'm back in the guts of the soundstage simulation and replication code. Yesterday was the first time that the browser client received state updates, so a bit of a small milestone.