Run open-source model inference with an OpenAI-compatible client, and choose where each request runs: on H100s, B200s, or on the local device.
```typescript
import { Muna } from "muna";

// Create an OpenAI client
const openai = new Muna({ accessKey: "..." }).beta.openai;

// Create a chat completion
const completion = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }]
});

// Print the result
console.log(completion.choices[0]);
```
Our OpenAI-style client lives in `muna.beta.openai` and has the same interface as the official OpenAI client, which lets you migrate in two lines of code.
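As a minimal sketch of that migration (assuming you are coming from the official `openai` npm package), only the import and the client construction change; the rest of your code, including every `chat.completions.create` call, stays the same:

```typescript
// Before: the official OpenAI client
// import OpenAI from "openai";
// const openai = new OpenAI({ apiKey: "..." });

// After: Muna's OpenAI-compatible client
import { Muna } from "muna";
const openai = new Muna({ accessKey: "..." }).beta.openai;
```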
The first time you run the code above might take a few minutes, because we have to download the (rather large) model weights. Subsequent runs should take a few seconds.
Muna's central feature is the ability to choose where inference runs on each request. Let's run the same model on a datacenter GPU:
```typescript
// Create a chat completion with a datacenter GPU
const completion = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }],
  acceleration: "remote_a100"
});
```
The first time you run the code above might take a few minutes, because we have to spin up a container on the cloud GPUs. Subsequent runs take only a second.
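Putting the two together, a single script can route the same request to different destinations. The sketch below assumes that omitting `acceleration` runs the model on the local device, as in the first example; identifiers for other datacenter GPUs (such as H100s or B200s) are not shown in this section and are assumed to follow the same pattern as `remote_a100`, so confirm them against the acceleration reference before relying on them:

```typescript
// On-device inference: no `acceleration` specified, as in the first example
const onDevice = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }]
});

// Datacenter GPU inference: the same request, routed with `acceleration`
const onCloud = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }],
  acceleration: "remote_a100"
});

console.log(onDevice.choices[0]);
console.log(onCloud.choices[0]);
```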
This works because of Muna's compiler platform, which transpiles a Python function into portable C++ that is compiled to run natively on server, desktop, mobile, and web.