
Using realtime mode on muna.ai
Making Predictions in Realtime
Before calling `muna.predictions.create` on every frame, you must first ensure that the predictor has been preloaded on the current device.

Failing to preload a predictor before using it in realtime will result in your Muna client making network requests on every frame in an attempt to load the predictor for the first time. This will lead to your app hanging and crashing.

Preloading the Predictor
To preload a predictor, make a prediction and pass in empty inputs. This works by forcing the Muna client to fetch and initialize the predictor. The empty inputs
will cause the prediction to fail due to missing inputs, but the error can be safely ignored.

Making Realtime Predictions
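Concretely, the preload-then-predict pattern described above might look like the following sketch. It uses a stand-in client that only mimics the `muna.predictions.create(...)` call shape; the predictor tag and input names are made up.

```python
# Sketch of the preload-then-predict pattern using a stand-in client that only
# mimics the call shape `muna.predictions.create(...)`; the predictor tag and
# input names below are hypothetical.

class StubPredictor:
    """Stand-in for a loaded predictor."""
    def run(self, inputs):
        if not inputs:
            raise ValueError("missing inputs")
        return {"sum": sum(inputs.values())}

class StubPredictions:
    """Mimics lazy loading: the first call per tag loads the predictor."""
    def __init__(self):
        self._cache = {}

    def create(self, tag, inputs):
        # In the real client, a cache miss is what triggers network requests;
        # subsequent calls reuse the already-loaded predictor.
        predictor = self._cache.setdefault(tag, StubPredictor())
        return predictor.run(inputs)

class StubMunaClient:
    def __init__(self):
        self.predictions = StubPredictions()

muna = StubMunaClient()
TAG = "@example/realtime-predictor"  # hypothetical tag

# 1. Preload once: empty inputs make the prediction fail, which we ignore.
try:
    muna.predictions.create(tag=TAG, inputs={})
except ValueError:
    pass  # expected failure; the predictor is now loaded and cached

# 2. Realtime: call the now-cached predictor from the app's update loop.
frames = [{"x": float(i)} for i in range(3)]
results = [muna.predictions.create(tag=TAG, inputs=f) for f in frames]
```

The key point is that the expensive load happens exactly once, before the update loop starts.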
After preloading the predictor, you can make predictions in realtime using your app's update loop, or a similar mechanism.

Performance Considerations
Muna automatically optimizes the runtime performance of predictors on a given device by leveraging aggregated performance data. While this means that developers have little control over performance, there are several ways to ensure a smooth user experience in your application:

Finding Similar Predictors
We are always working to find and bring newer, faster predictors to Muna. To this end, we are working on a 'Performance' tab for predictors on the Muna explore page. This tab will provide granular performance statistics collected from the millions of devices that use that predictor. If you would like us to bring an open-source AI model or function to Muna, let us know.
Overriding the Acceleration Type
When loading a predictor, our platform informs the Muna client about the best hardware primitive to use for accelerating predictions:

Acceleration | Notes |
---|---|
cpu | Use the CPU to accelerate predictions. This is always enabled. |
gpu | Use the GPU to accelerate predictions. |
npu | Use the neural processor to accelerate predictions. |

Muna currently does not support multi-GPU acceleration. This is planned for the future.

Some of our client SDKs allow you to override the `acceleration` used to power predictions. You can opt to use multiple acceleration types using a bitwise-OR: `acceleration: Acceleration.GPU | Acceleration.NPU`.

The prediction `acceleration` only applies when preloading a predictor. Once a predictor has been loaded, the acceleration is ignored.

The prediction `acceleration` is merely a hint, which the Muna client will try its best to honor. Setting an acceleration does not guarantee that all, or any, operations in the prediction function will actually use that acceleration type.
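The bitwise-OR of acceleration types can be sketched with a stand-in flags enum; the real SDK's `Acceleration` type and the exact parameter spelling may differ.

```python
# Stand-in sketch of combining acceleration flags with bitwise-OR; the real
# Muna SDK's Acceleration type may be defined differently.
from enum import IntFlag, auto

class Acceleration(IntFlag):
    CPU = auto()
    GPU = auto()
    NPU = auto()

# Hint the client to use the GPU and the neural processor.
acceleration = Acceleration.GPU | Acceleration.NPU

assert Acceleration.GPU in acceleration
assert Acceleration.NPU in acceleration
assert Acceleration.CPU not in acceleration
```

Because the members are flags rather than mutually exclusive values, a single `acceleration` argument can carry several hints at once.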
Specifying the Acceleration Device
First: you should absolutely never do this unless you know exactly what you're doing. With that out of the way, some Muna clients allow you to specify the acceleration device used to make predictions.
Our clients expose this field as an untyped integer or pointer. The underlying type depends on the current operating system:
OS | Device type | Notes |
---|---|---|
Android | - | Currently unsupported. |
iOS | id<MTLDevice> | Metal device. |
Linux | int* | CUDA device ID pointer. |
macOS | id<MTLDevice> | Metal device. |
visionOS | id<MTLDevice> | Metal device. |
Web | GPUDevice | WebGPU device. |
Windows | ID3D12Device* | DirectX 12 device. |
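As an illustration of the Linux row above only — the stand-in client and the `device` parameter name here are assumptions — a CUDA device ID pointer could be produced and passed like this:

```python
# Illustration of handing the client an untyped device pointer, per the table
# above (Linux row: int* CUDA device ID). The client and the `device`
# parameter name are stand-ins, not the real SDK.
import ctypes

device_id = ctypes.c_int(0)                # CUDA device 0
device_ptr = ctypes.addressof(device_id)   # raw pointer as a plain integer

class StubMunaClient:
    """Stand-in client that just records the device hint it was given."""
    def create_prediction(self, tag, inputs, device=None):
        # A real client would reinterpret `device` based on the current OS.
        return {"device": device}

muna = StubMunaClient()
prediction = muna.create_prediction("@example/predictor", inputs={}, device=device_ptr)
```

Note that `device_id` must stay alive for as long as the client may dereference the pointer.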
The prediction `device` only applies when preloading a predictor. Once a predictor has been loaded, the device is ignored.

The prediction `device` is merely a hint, which the Muna client will try its best to honor. Setting a device does not guarantee that all, or any, operations in the prediction function will actually use that acceleration device.
Concurrency with Threading
If your development environment exposes a threading model, it is often beneficial to maintain a dedicated thread
to make predictions.

Furthermore, you might benefit from making predictions at a lower rate than realtime. While this
approach does not directly improve performance, it could alleviate system pressure, thereby enhancing the
interactivity of your application.
Muna clients are not thread-safe. Never use a single
Muna
client across multiple threads.
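Because clients are not thread-safe, a dedicated prediction thread should own its client exclusively. A sketch of this pattern with a stand-in client (the real Muna client API is not shown):

```python
# Dedicated prediction thread owning its own (stand-in) client. Work arrives
# on an inbox queue from the app's update loop; results leave via an outbox.
import queue
import threading

class StubMunaClient:
    """Stand-in for a Muna client; confined to the worker thread."""
    def create_prediction(self, inputs):
        return {"result": inputs["x"] * 2}

def prediction_worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    muna = StubMunaClient()  # created on, and never shared outside, this thread
    while True:
        inputs = inbox.get()
        if inputs is None:  # sentinel: shut the worker down
            break
        outbox.put(muna.create_prediction(inputs))

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=prediction_worker, args=(inbox, outbox))
worker.start()

# The app's update loop submits frames; it may also drop frames to submit at
# a lower rate than realtime, easing system pressure.
for frame in range(3):
    inbox.put({"x": frame})

results = [outbox.get() for _ in range(3)]
inbox.put(None)  # ask the worker to exit
worker.join()
```

Creating the client inside the worker function guarantees that a single client is never used across multiple threads.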