To make Apps with Magical User Experiences, you need to get all the performance possible from the hardware.
For on-device ML, you can achieve that with TFLite delegates.
They give you access to the power of hardware acceleration.
1/6🧵
Your phone's CPU is usually very fast, but as a multi-purpose processor it's not optimized for the heavy math that ML needs.
Like their big brothers (servers 🤓), phones also have specialized chips that are better suited to ML, the most popular being GPUs.
2/6🧵
Another popular accelerator is the Qualcomm Hexagon DSP, which has shown a 75% reduction in power consumption.
On the Apple side, you can use the Core ML delegate to access the Neural Engine on newer i[Phones|Pads], which can give a huge boost in performance!
3/6🧵
On Android, you can also use the NNAPI delegate on newer OS versions (API level 27 and above). It gives you access to the phone's GPU, DSP or NPU, depending on what's available.
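On a phone you'd normally enable these delegates through the Android or iOS APIs, but the "use the accelerator if it's there, otherwise stay on the CPU" pattern can be sketched with the TFLite Python API. This is a minimal sketch, not the official recipe: `first_working_delegate` and `make_interpreter` are hypothetical helper names, and the delegate library paths you pass in are assumptions that depend on your platform.

```python
from typing import Callable, Optional, Sequence


def first_working_delegate(loaders: Sequence[Callable[[], object]]) -> Optional[object]:
    """Return the first delegate that loads, or None to stay on the CPU."""
    for load in loaders:
        try:
            return load()
        except (ValueError, RuntimeError, OSError):
            # load_delegate documents ValueError (delegate failed to load) and
            # RuntimeError (unsupported platform); OSError covers a missing
            # shared library. In all cases, move on to the next candidate.
            continue
    return None


def make_interpreter(model_path: str, delegate_libraries: Sequence[str]):
    """Build a TFLite interpreter, preferring an accelerator when one loads."""
    import tensorflow as tf  # lazy import: the helper above has no dependencies

    loaders = [
        # Bind each library name now; the actual load happens inside the helper.
        (lambda lib=lib: tf.lite.experimental.load_delegate(lib))
        for lib in delegate_libraries
    ]
    delegate = first_working_delegate(loaders)
    delegates = [delegate] if delegate is not None else None
    return tf.lite.Interpreter(model_path=model_path, experimental_delegates=delegates)
```

Calling something like `make_interpreter("model.tflite", ["libexample_delegate.so"])` (both names are placeholders) would try the delegate first and quietly fall back to plain CPU execution if it can't be loaded.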
When you have your TensorFlow Model and want to use it on a mobile device, you'll need to convert it to the TFLite format.
This process can be done in two ways:
- Using the Python API
- Using a command line tool
Let's look into some more details…
1/6🧵
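The two conversion routes mentioned above could look roughly like this. It's a minimal sketch that assumes your model is stored as a SavedModel; the function names and the paths in the usage note are hypothetical. The first function uses `tf.lite.TFLiteConverter`, the second only builds the matching `tflite_convert` command so you can see the flags side by side.

```python
from typing import List


def convert_with_python_api(saved_model_dir: str, output_path: str) -> None:
    """Convert a SavedModel to a .tflite file with the Python API."""
    import tensorflow as tf  # lazy import: only needed when actually converting

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()  # the serialized model, as bytes
    with open(output_path, "wb") as f:
        f.write(tflite_model)


def build_cli_command(saved_model_dir: str, output_path: str) -> List[str]:
    """Build the equivalent tflite_convert command line as an argument list."""
    return [
        "tflite_convert",
        f"--saved_model_dir={saved_model_dir}",
        f"--output_file={output_path}",
    ]
```

For example, `convert_with_python_api("my_saved_model", "model.tflite")` writes the file directly, while `subprocess.run(build_cli_command("my_saved_model", "model.tflite"), check=True)` would shell out to the command line tool instead.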
Why do we need to convert?
TFLite is an optimized format (FlatBuffers) for faster loading.
To keep the framework lite and fast, all of its operations are optimized for mobile execution, but not all TF operations are available.
Usually we imagine Machine Learning models running on expensive servers with lots of memory and resources
Running ML models on the edge, like on phones and microcontrollers, is a change that is enabling completely new types of apps
Let's find out why that matters
[2min]
1/6🧵
Why would you want to run an ML model on a phone?
Lower latency: if you want inference in real time, running a model over a cloud API is not going to give a good user experience.
Running locally is much better and potentially much faster ⚡️
2/6🧵
When running ML models on-device, your app will be able to keep working even without network connectivity.
For example, if your app translates text, it will still work in another country, when you need it most, and it won't spend your hard-earned money on roaming fees!