
Using realtime mode on muna.ai
Making Predictions in Realtime
Before calling muna.predictions.create on every frame, you must first ensure that the predictor has been preloaded on the current device.
Preloading the Predictor
To preload a predictor, make a prediction and pass in empty inputs:

Making Realtime Predictions
After preloading the predictor, you can then make predictions in realtime using your app's update loop, or other similar mechanisms.
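The preload-then-predict flow above can be sketched as follows. Since this sketch must run standalone, StubPredictions stands in for the real muna.predictions client, and the predictor tag and input names are hypothetical; the point is the pattern: one empty-input call to load, then cheap per-frame calls.

```python
# Self-contained sketch of the preload-then-predict pattern. StubPredictions
# is a stand-in for the real muna.predictions client; "@example/predictor"
# and the input names are hypothetical.

class StubPredictions:
    """Mimics the shape of a predictions client: the first create() call for
    a tag loads the predictor, and later calls reuse the loaded instance."""
    def __init__(self):
        self.loaded = set()

    def create(self, tag, inputs):
        if tag not in self.loaded:
            self.loaded.add(tag)     # the expensive load happens once
        if not inputs:
            return None              # empty inputs: preload only, no inference
        return {"tag": tag, "results": [sum(inputs.values())]}  # dummy result

predictions = StubPredictions()

# 1. Preload with empty inputs, e.g. at app startup.
predictions.create(tag="@example/predictor", inputs={})

# 2. Realtime loop: the predictor is already resident, so each call is fast.
frames = [{"x": i} for i in range(3)]   # placeholder frame source
results = [
    predictions.create(tag="@example/predictor", inputs=frame)["results"][0]
    for frame in frames
]
print(results)  # [0, 1, 2]
```

The key design point is that the load cost is paid once, outside the update loop, so per-frame calls never stall on initialization.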
Performance Considerations
Muna automatically optimizes the runtime performance of predictors on a given device by leveraging aggregated performance data. While this means that developers have little control over performance, there are several ways to ensure a smooth user experience in your application:
Finding Similar Predictors
We are always working to find and bring newer and faster predictors to Muna.
To this end, we are working on adding a ‘Performance’ tab to predictors on the Muna explore page.
This tab will provide granular performance statistics collected from the millions of devices that use that
predictor.
Overriding the Acceleration Type
When loading a predictor, our platform informs the Muna client about the best hardware primitive to use for accelerating predictions:

| Acceleration | Notes |
|---|---|
| cpu | Use the CPU to accelerate predictions. This is always enabled. |
| gpu | Use the GPU to accelerate predictions. |
| npu | Use the neural processor to accelerate predictions. |

Muna currently does not support multi-GPU acceleration. This is planned for the future.

Some of our client SDKs allow you to override the acceleration used to power predictions. You can opt to use multiple acceleration types using a bitwise-OR: acceleration: Acceleration.GPU | Acceleration.NPU.

The prediction acceleration only applies when preloading a predictor. Once a predictor has been loaded, the acceleration is ignored.
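The bitwise-OR combination described above can be illustrated with a flag enum. The Acceleration type below is a hypothetical stand-in modeled on the table, not the client SDK's actual type; consult your Muna client for the real definition.

```python
# Hedged sketch of combining acceleration types with bitwise-OR.
# The Acceleration enum here is a hypothetical stand-in mirroring the table
# above (cpu/gpu/npu); the real type lives in your Muna client SDK.
from enum import IntFlag, auto

class Acceleration(IntFlag):
    CPU = auto()   # always enabled by the runtime
    GPU = auto()
    NPU = auto()

# Request both GPU and NPU acceleration when preloading a predictor.
acceleration = Acceleration.GPU | Acceleration.NPU

print(Acceleration.GPU in acceleration)  # True
print(Acceleration.CPU in acceleration)  # False
```

Flag enums are the natural fit here: each member occupies one bit, so a single integer can carry any combination of acceleration types.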
Specifying the Acceleration Device
First, a warning: never do this unless you know exactly what you are doing. With that out of the way, some Muna clients allow you to specify the acceleration device used to make predictions.
Our clients expose this field as an untyped integer or pointer. The underlying type depends on the current operating system:
| OS | Device type | Notes |
|---|---|---|
| Android | - | Currently unsupported. |
| iOS | id<MTLDevice> | Metal device. |
| Linux | int* | CUDA device ID pointer. |
| macOS | id<MTLDevice> | Metal device. |
| visionOS | id<MTLDevice> | Metal device. |
| Web | GPUDevice | WebGPU device. |
| Windows | ID3D12Device* | DirectX 12 device. |
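On Linux, per the table above, the device is a pointer to a CUDA device ID. A ctypes sketch of constructing such a pointer is shown below; exactly how the client accepts this value is an assumption, so treat it as an illustration of the int* shape rather than a verified call sequence.

```python
# Sketch: building an int* holding a CUDA device ID, matching the Linux row
# in the table above. How this pointer is handed to the Muna client is an
# assumption; this only shows the underlying type.
import ctypes

device_id = ctypes.c_int(0)           # CUDA device 0
device = ctypes.pointer(device_id)    # an int* as the table describes

print(device.contents.value)  # 0
```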
The prediction device only applies when preloading a predictor. Once a predictor has been loaded, the device is ignored.
Concurrency with Threading
If your development environment exposes a threading model, it is often beneficial to maintain a dedicated thread to make predictions.

Furthermore, you might benefit from making predictions at a lower rate than realtime. While this approach does not directly improve performance, it could alleviate system pressure, thereby enhancing the interactivity of your application.
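Both ideas above, a dedicated prediction thread and a reduced prediction rate, can be sketched together. The predict function here is a placeholder for a real muna.predictions.create call; the single-slot queue decouples the update loop from inference.

```python
# Self-contained sketch of a dedicated prediction thread with a throttled
# submission rate. predict() is a placeholder for a real
# muna.predictions.create call.
import queue
import threading

def predict(frame):
    # Placeholder for muna.predictions.create(tag=..., inputs={...})
    return frame * 2

frames = queue.Queue(maxsize=1)   # small queue: never let stale frames pile up
results = []

def worker():
    # Dedicated prediction thread: drains the queue until a sentinel arrives.
    while True:
        frame = frames.get()
        if frame is None:          # sentinel: shut the thread down
            break
        results.append(predict(frame))

thread = threading.Thread(target=worker, daemon=True)
thread.start()

# Update loop: submit every other frame (half the realtime rate) to reduce
# system pressure, as suggested above.
for frame_index in range(6):
    if frame_index % 2 == 0:
        frames.put(frame_index)

frames.put(None)                   # stop the worker
thread.join()
print(results)  # [0, 4, 8]
```

A bounded queue is a deliberate choice: if inference falls behind, the update loop blocks (or can drop frames) instead of accumulating an ever-growing backlog of stale inputs.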