Natural language interfaces have truly arrived. Here's ChatARKit: an open source demo using #chatgpt to create experiences in #arkit. How does it work? Read on. (1/)
JavaScriptCore is used to create a JavaScript environment. User prompts are wrapped in additional descriptive text that informs ChatGPT of which objects and functions are available to use. The code it produces is then executed directly. You'll find this in Engine.swift. (2/)
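A minimal sketch of that idea in plain JavaScript (names and prompt text are hypothetical; the real wrapping and execution live in Engine.swift, which evaluates the generated code in a JavaScriptCore context):

```javascript
// Describe the scripting API to ChatGPT so it emits calls we can execute.
// (Hypothetical prompt text -- the actual wrapper is in Engine.swift.)
const apiDescription = `
You write JavaScript. Available functions:
  createEntity(name)  -- downloads a 3D model by name and places it
  getPlanes()         -- returns an array of detected planes
Respond with code only.`;

function wrapPrompt(userPrompt) {
  return `${apiDescription}\n\nUser request: "${userPrompt}"`;
}

// The code ChatGPT returns is then executed against the exposed functions.
const created = [];
function createEntity(name) { created.push(name); }

const generatedCode = 'createEntity("tree frog");'; // what ChatGPT might return
eval(generatedCode);
console.log(created[0]); // "tree frog"
```

In the real app the exposed functions are Swift closures registered with the JSContext, so the generated script drives native ARKit code.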
3D assets are imported from Sketchfab. When I say "place a tree frog...", ChatGPT generates: createEntity("tree frog"). Engine.swift implements this and instantiates a SketchfabEntity that searches Sketchfab for "tree frog" and downloads the first model it finds. (3/)
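The search-and-take-first flow can be sketched like this (everything here is a stub: the real SketchfabEntity lives in Engine.swift and calls Sketchfab's actual search and download endpoints):

```javascript
// Stand-in for a Sketchfab search: returns candidate model descriptors.
// Hypothetical results -- the real call hits Sketchfab's REST API.
function searchSketchfab(query) {
  return [
    { name: "Tree Frog", uid: "abc123" },
    { name: "Frog (low poly)", uid: "def456" },
  ];
}

function createEntity(query) {
  const results = searchSketchfab(query);
  if (results.length === 0) return null;
  // Take the first hit -- simple, but quality depends on search ranking.
  return { model: results[0], downloaded: true };
}

const entity = createEntity("tree frog");
console.log(entity.model.uid); // "abc123"
```

Taking the first result keeps the interaction hands-free, at the cost of occasionally getting an odd model for ambiguous queries.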
Given a function like "getPlanes()" that returns an array of planes, ChatGPT is often smart enough to figure out how to find the "nearest plane" or "floor" but sometimes produces wild nonsense. To keep it on the rails, it helps to supply functions like "getNearestPlane()". (4/)
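A sketch of what such a helper might look like (the plane format and camera-position parameter are assumptions, not the actual ChatARKit API): doing this deterministic geometry in the exposed API, instead of trusting ChatGPT to derive it, is what keeps the generated code on the rails.

```javascript
// Squared distance between two 3D points (no sqrt needed for comparison).
function distanceSquared(a, b) {
  const dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return dx * dx + dy * dy + dz * dz;
}

// Return the plane whose center is closest to the camera.
function getNearestPlane(planes, cameraPosition) {
  let nearest = null;
  let best = Infinity;
  for (const plane of planes) {
    const d = distanceSquared(plane.center, cameraPosition);
    if (d < best) { best = d; nearest = plane; }
  }
  return nearest;
}

const planes = [
  { id: "floor", center: { x: 0, y: -1.4, z: 0 } },
  { id: "table", center: { x: 0, y: -0.2, z: -0.5 } },
];
const camera = { x: 0, y: 0, z: 0 };
console.log(getNearestPlane(planes, camera).id); // "table"
```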
I use @ggerganov's excellent implementation of OpenAI Whisper for state-of-the-art speech recognition running *on-device*. My repo has a Swift wrapper for it and an example of how to use AVAudioCapture. (5/)