```typescript
import { Muna } from "muna"

// 🔥 Create an OpenAI client
const openai = new Muna({ accessKey: "..." }).beta.openai;

// 🔥 Create a chat completion
const completion = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }]
});

// 🖨 Print the result
console.log(completion.choices[0]);
```
Our OpenAI-compatible client at `muna.beta.openai` has the same interface as the official OpenAI client. This allows you to migrate in two lines of code.
The first time you run the code above might take a few minutes, because we have to download the (rather large) model weights. Subsequent runs should take a few seconds.
Muna's central feature is the ability to choose where inference runs on each request. Let's run the same model on a datacenter GPU:
```typescript
// 🔥 Create a chat completion with a datacenter GPU
const completion = await openai.chat.completions.create({
  model: "@openai/gpt-oss-20b",
  messages: [{ role: "user", content: "What is the capital of France?" }],
  acceleration: "remote_a100"
});
```
The first time you run the code above might take a few minutes, because we have to spin up a container on the cloud GPUs. Subsequent runs take only a second.
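Because `muna.beta.openai` mirrors the official OpenAI interface, responses follow the familiar OpenAI chat completion shape, so any response-handling code you already have carries over unchanged. A minimal sketch, using an illustrative hard-coded object in place of a real API result:

```typescript
// Illustrative response object mimicking the OpenAI chat completion shape;
// a real call to openai.chat.completions.create(...) returns this structure.
const completion = {
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "The capital of France is Paris." },
      finish_reason: "stop"
    }
  ]
};

// Extract just the assistant's reply text, exactly as with the official client
const reply = completion.choices[0].message.content;
console.log(reply); // prints "The capital of France is Paris."
```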