One advantage of remote predictions is access to orders of magnitude more compute than your local device provides. Muna supports specifying a `RemoteAcceleration` when creating remote predictions:
```js
// Create a remote prediction on an Nvidia A100 GPU
const prediction = await muna.beta.predictions.remote.create({
  tag: "@meta/llama-3.1-70b",
  inputs: { ... },
  acceleration: "remote_a100"
});
```
Below are the currently supported `RemoteAcceleration` values:

| Remote Acceleration | Notes |
| --- | --- |
| `remote_auto` | Automatically use the ideal remote acceleration. |
| `remote_cpu` | Predictions are run on AMD CPU servers. |
| `remote_a40` | Predictions are run on an Nvidia A40 GPU. |
| `remote_a100` | Predictions are run on an Nvidia A100 GPU. |
Remote predictions are priced based on the remote acceleration, per second of prediction time (i.e. `prediction.latency`). See our pricing page for more information.
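To make the billing model concrete, here is a minimal sketch of how a per-second cost estimate could be computed from a prediction's latency. The rates below are hypothetical placeholders (the pricing page has the actual figures), and this sketch assumes `prediction.latency` is reported in milliseconds:

```typescript
// Hypothetical per-second rates in USD, keyed by remote acceleration.
// These numbers are illustrative only; consult the pricing page for real rates.
const RATES_USD_PER_SECOND: Record<string, number> = {
  remote_cpu: 0.0001,
  remote_a40: 0.001,
  remote_a100: 0.002,
};

// Estimate the cost of a single prediction from its latency (milliseconds)
// and the remote acceleration it ran on.
function estimateCostUsd(latencyMs: number, acceleration: string): number {
  const rate = RATES_USD_PER_SECOND[acceleration];
  if (rate === undefined) {
    throw new Error(`No rate known for acceleration: ${acceleration}`);
  }
  // Convert latency to seconds, then multiply by the per-second rate.
  return (latencyMs / 1000) * rate;
}
```

For example, a prediction that took 2 seconds on `remote_a100` would be billed for 2 seconds at the A100 rate.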
If you want to self-host the remote acceleration servers in your VPC or on-premises, reach out to us.