// CONNECTION
| Base URL | https://api.phantom.codes/v1 |
| Tor (.onion) | http://jzqvbfmrlt5ye467joz75dg6xdurc6bniozautqlil5b3tbf777zmsid.onion/v1 |
| Auth header | Authorization: Bearer sk-... |
| Get a key | Buy credit at phantom.codes. Pay XMR, receive sk-... once. |
| Compatibility | Drop-in OpenAI replacement. Point any OpenAI SDK at the Base URL above. |
// 01
Quick start
▸ curl
$ curl https://api.phantom.codes/v1/chat/completions \
-H "Authorization: Bearer YOUR_PHANTOM_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "phala/qwen-2.5-7b-instruct",
"messages": [{"role":"user","content":"Hello"}]
}'
▸ Python (openai client)
from openai import OpenAI
client = OpenAI(
api_key="YOUR_PHANTOM_KEY",
base_url="https://api.phantom.codes/v1",
)
r = client.chat.completions.create(
model="phala/kimi-k2.6",
messages=[{"role": "user", "content": "Hello"}],
)
print(r.choices[0].message.content)
▸ Node (openai client)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "YOUR_PHANTOM_KEY",
baseURL: "https://api.phantom.codes/v1",
});
const r = await client.chat.completions.create({
model: "phala/gpt-oss-120b",
messages: [{ role: "user", content: "Hello" }],
});
console.log(r.choices[0].message.content);
// 02
Don't have XMR?
Phantom takes Monero only. There are four paths; pick by what you start with and how much friction you'll absorb.
▸ Get a wallet first
- Cake Wallet (iOS/Android). Easiest. Built-in swap. Tor option.
- Feather Wallet (desktop). Lightweight, Tor-first. Recommended for Phantom users.
- Monero GUI (Windows/Mac/Linux). Official, full node. Power users.
- Monerujo (Android). Mobile alternative to Cake.
▸ Path A. Have crypto already (BTC, USDT, ETH, etc.)
Swap to XMR. No account, no KYC, no signup.
- Trocador. Aggregator over 10+ providers. Best rate auto-routed. KYC + log rating per provider. Default recommendation.
- FixedFloat. Direct. 0.5% floating / 1% fixed.
- MajesticBank. No-KYC, often best on small swaps. Onion mirror.
- eXch. A-rated no-KYC. Tor by default.
Privacy on the source coin is your problem. BTC swaps leak the BTC side. Use Tor and a non-KYC source if your threat model requires it.
▸ Path B. Have a gift card (Amazon, Steam, Visa, etc.)
Convert your card to XMR on a P2P market. Phantom doesn't take gift cards directly. Too much fraud risk; we'd compromise our brand.
- Easy mode: RoboSats. Trade gift card → BTC (Lightning) over Tor browser. No install. Then use Path A above (Trocador) to swap BTC → XMR.
- Maximalist mode: RetoSwap. Trade gift card → XMR directly. Open source software you run locally. No third-party swap in between.
Both use escrow. Both handle fraud disputes. Phantom never sees your card.
▸ Path C. Have only fiat, no crypto yet
Buy XMR with cash, card, or bank transfer.
- Cake Wallet built-in buy. Card or Apple Pay via routed providers. KYC by the buy provider, not Phantom. Fastest for normies.
- RetoSwap. P2P with cash by mail, SEPA, ACH, in-person. No KYC. Slower but ID-free.
- Kraken. KYC heavy, well-known. Buy XMR, withdraw to your wallet.
If you use a centralized exchange, withdraw immediately. Never leave XMR sitting on an exchange. They can freeze, lose, or be subpoenaed.
▸ Path D. Mine it
Slow, but the only path with zero counterparty. CPU-mineable. Run Monero GUI or join a pool with Monerujo.
// FAIR WARNING
These are third-party services. Their privacy policies are theirs, not ours. Their domains may change. Verify before sending funds. We list them because they're the best options we know of today; we cannot vouch for them tomorrow. No affiliate links here.
First time? Cake Wallet + Path A (in-app swap) gets you to Phantom in ~10 minutes from zero.
// 03
Endpoints
| METHOD | PATH | AUTH | PURPOSE |
POST | /v1/chat/completions | Bearer | OpenAI-compatible chat. Streams. |
POST | /v1/embeddings | Bearer | OpenAI-compatible embeddings. |
POST | /v1/images/generations | Bearer | Image generation (DALL-E, Stable Diffusion, Recraft). |
GET | /v1/models | none | List models + pricing (markup included). |
GET | /v1/bundles | none | List credit bundles. |
POST | /v1/purchase | none | Body: {"bundle":"small"} OR {"amount_usd":7.5}. |
GET | /v1/purchase/{id}/status | none | Poll payment. Returns plaintext key once on completion. |
GET | /v1/purchase/{id}/qr.svg | none | QR code (SVG) for the payment URI. |
GET | /v1/key/balance | Bearer | Remaining credit + expiry. |
POST | /v1/key/rotate | Bearer | Issue new key, carry credit, deactivate old. |
GET | /v1/signature/{id} | Bearer | Per-response signature (step 2 of verification). |
GET | /v1/inference-attest | Bearer | Phala TDX + NVIDIA CC attestation report (step 3). |
GET | /health | none | Liveness. |
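The AUTH column above can be mirrored in a small request builder. The following is a minimal sketch using only the Python standard library; `build_request` and `send` are hypothetical helper names, and the shape of each JSON response is not specified here, so inspect a real reply before relying on any field.

```python
import json
import urllib.request

BASE_URL = "https://api.phantom.codes/v1"

def build_request(method, path, api_key=None, body=None):
    """Build (but do not send) a request for a Phantom endpoint.

    path is relative to BASE_URL, e.g. "/models" or "/key/balance".
    Pass api_key only for endpoints whose AUTH column says Bearer.
    """
    headers = {}
    data = None
    if api_key is not None:
        headers["Authorization"] = f"Bearer {api_key}"
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )

def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)

# Usage (network calls, so left commented out):
# models = send(build_request("GET", "/models"))
# balance = send(build_request("GET", "/key/balance", api_key="sk-your-phantom-key"))
```

Keeping the key out of unauthenticated calls such as /v1/models and /v1/purchase avoids sending it where it is not needed.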
// 04
Purchase flow
Send POST /v1/purchase with either a bundle name or a custom USD amount.
$ curl -X POST https://api.phantom.codes/v1/purchase \
-H "Content-Type: application/json" \
-d '{"amount_usd": 25}'
You receive:
{
"payment_id": "abc...",
"xmr_address": "75z...",
"xmr_amount": "0.0645...",
"bundle": "custom",
"credit_usd": 25,
"expires_at": "2026-05-19T06:00:00+00:00"
}
Send XMR. Poll /v1/purchase/{id}/status. States:
pending → confirming → ready → completed
Key drops in response body when status flips to completed. Shown ONCE. Recovery is not possible.
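The flow above can be sketched as a request-body builder plus a polling loop. This is an illustration under assumptions: the `status` and `api_key` field names in the status response are guesses (only the rotate response is shown with `api_key`), and `purchase_body` and `wait_for_key` are hypothetical helpers. The fetcher is injected so the loop can be exercised without the network.

```python
import time

# Status values come from the flow above. Treating "completed" as the only
# terminal state is an assumption; no failure or expiry state is documented.
STATES = ("pending", "confirming", "ready", "completed")

def purchase_body(bundle=None, amount_usd=None):
    """Return the JSON body for POST /v1/purchase.

    Exactly one of bundle or amount_usd must be given.
    """
    if (bundle is None) == (amount_usd is None):
        raise ValueError("pass exactly one of bundle or amount_usd")
    if bundle is not None:
        return {"bundle": bundle}
    if amount_usd <= 0:
        raise ValueError("amount_usd must be positive")
    return {"amount_usd": amount_usd}

def wait_for_key(fetch_status, payment_id, interval_s=30, max_polls=120,
                 sleep=time.sleep):
    """Poll until the purchase completes, then return the API key.

    fetch_status is any callable that takes a payment id and returns the
    decoded JSON from GET /v1/purchase/{id}/status.
    """
    for _ in range(max_polls):
        payload = fetch_status(payment_id)
        status = payload.get("status")  # assumed field name
        if status not in STATES:
            raise RuntimeError(f"unexpected status: {status!r}")
        if status == "completed":
            # The key is shown once, so return it immediately for the
            # caller to store.
            return payload["api_key"]  # assumed field name
        sleep(interval_s)
    raise TimeoutError(f"purchase {payment_id} not completed after {max_polls} polls")
```

A 30-second interval stays well under the 60/minute limit on the status endpoint.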
// 05
Key rotation
If you suspect your key is compromised, rotate. Credit + expiry transfer to the new key. Old key deactivates.
$ curl -X POST https://api.phantom.codes/v1/key/rotate \
-H "Authorization: Bearer OLD_KEY"
# → {"api_key": "sk-new...", "shown_once": true}
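Because the new key is shown once and the old one stops working, it helps to persist the new key before doing anything else. A minimal sketch, assuming the response shape shown above; `extract_new_key` and `save_key` are hypothetical helpers and the file path is up to you.

```python
import os

def extract_new_key(payload):
    """Pull the replacement key out of a decoded /v1/key/rotate response."""
    new_key = payload["api_key"]
    if not new_key.startswith("sk-"):
        raise ValueError("response did not contain an sk- key")
    return new_key

def save_key(path, key):
    """Write the key to path.

    Mode 0o600 (owner read/write only) applies when the file is created;
    an existing file keeps its current permissions.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(key + "\n")

# Usage, given `payload` decoded from the rotate response:
# save_key(os.path.expanduser("~/.phantom_key"), extract_new_key(payload))
```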
// 06
Streaming
Pass "stream": true. SSE-compatible. The final chunk includes a usage block used for billing. If you abort mid-stream, you are still billed for tokens already generated.
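To make the streaming shape concrete, here is a small parser for the SSE lines. It is a sketch that assumes the stream follows OpenAI's chunk format (`data:` lines, `choices[].delta.content`, and a `[DONE]` sentinel), which the compatibility claim suggests but this page does not spell out; `consume_sse` is a hypothetical helper.

```python
import json

def consume_sse(lines):
    """Collect streamed text and the final usage block from SSE lines.

    lines is any iterable of decoded text lines from a chat completion
    requested with "stream": true.
    """
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

If the stream is cut off before the final chunk, `usage` comes back as `None`, so a client that tracks spend should fall back to /v1/key/balance in that case.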
// 07
Vision
Only phala/qwen3-vl-30b-a3b-instruct accepts images. Pass content as an array with image_url parts:
{
"model": "phala/qwen3-vl-30b-a3b-instruct",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
]
}]
}
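The same body can be assembled in Python and passed to the OpenAI client. This is a minimal sketch; `vision_request` is a hypothetical helper, and sending more than one image per message is an assumption that this page does not confirm.

```python
VISION_MODEL = "phala/qwen3-vl-30b-a3b-instruct"

def vision_request(prompt, image_urls, model=VISION_MODEL):
    """Build a chat completion body with one text part and image_url parts."""
    parts = [{"type": "text", "text": prompt}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return {"model": model, "messages": [{"role": "user", "content": parts}]}

# Usage with the openai client from the quick start (network call):
# body = vision_request("What's in this image?", ["https://example.com/cat.jpg"])
# r = client.chat.completions.create(**body)
# print(r.choices[0].message.content)
```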
// 08
Image generation
OpenAI-compatible image API. Flat-rate billing per image. No tokens. Generated URLs from upstream expire after ~1 hour, so download the images promptly.
▸ curl
$ curl https://api.phantom.codes/v1/images/generations \
-H "Authorization: Bearer YOUR_PHANTOM_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "stability/stable-diffusion-3-5-large",
"prompt": "A misty cyberpunk alley at dawn, anamorphic lens, 35mm film",
"n": 1,
"size": "1024x1024",
"quality": "standard"
}'
▸ Python (openai client)
from openai import OpenAI
client = OpenAI(api_key="YOUR_PHANTOM_KEY", base_url="https://api.phantom.codes/v1")
img = client.images.generate(
model="stability/stable-diffusion-3-5-large",
prompt="A serene mountain at sunset with aurora borealis",
n=1,
size="1024x1024",
quality="standard",
)
print(img.data[0].url)
▸ Parameters
| FIELD | TYPE | NOTES |
model | string | One of the image models in /v1/models (kind: image). |
prompt | string | Required. Max 4000 chars. |
n | integer | Images to generate. 1–10. Charged per image. |
size | string | 256x256, 512x512, 1024x1024, 1536x1536, 2048x2048. Bounded by model's max_size. |
quality | string | standard or hd. Affects price. |
response_format | string | url (default) or b64_json. |
style | string | Model-specific (e.g. vivid / natural for DALL-E). |
negative_prompt | string | Model-specific. Stability + Recraft honor this. |
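Since every image is charged, a client-side check against the table above can catch bad requests before they are sent. The sketch below is illustrative: `check_image_request` is a hypothetical helper, the defaults for `n` and `quality` are assumptions borrowed from OpenAI's API, and the per-model `max_size` bound is not checked.

```python
SIZES = {"256x256", "512x512", "1024x1024", "1536x1536", "2048x2048"}
QUALITIES = {"standard", "hd"}
FORMATS = {"url", "b64_json"}

def check_image_request(body):
    """Return a list of problems with an image request body (empty if none)."""
    problems = []
    prompt = body.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 4000:
        problems.append("prompt exceeds 4000 characters")
    n = body.get("n", 1)  # assumed default
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 10:
        problems.append("n must be an integer from 1 to 10")
    size = body.get("size")
    if size is not None and size not in SIZES:
        problems.append(f"size {size} is not a listed size")
    if body.get("quality", "standard") not in QUALITIES:  # assumed default
        problems.append("quality must be standard or hd")
    if body.get("response_format", "url") not in FORMATS:
        problems.append("response_format must be url or b64_json")
    return problems
```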
▸ Pricing
Flat per image at the requested quality. Marked up over the vendor's wholesale rate. Live numbers come from GET /v1/models under image_pricing_usd_user. Indicative:
| MODEL | STANDARD | HD |
segmind/sd3-turbo | ~$0.02 | ~$0.03 |
openai/dall-e-3 | ~$0.06 | ~$0.12 |
stability/stable-diffusion-3-5-medium | ~$0.05 | ~$0.09 |
stability/stable-diffusion-3-5-large | ~$0.06 | ~$0.12 |
stability/stable-diffusion-ultra | ~$0.12 | ~$0.18 |
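As a worked example of flat per-image billing, the sketch below multiplies the indicative prices above by the image count. The numbers are copied from the table and are not live; real prices come from GET /v1/models. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Indicative USD prices from the table above (approximate, not live data).
INDICATIVE_PRICES = {
    "segmind/sd3-turbo": {"standard": Decimal("0.02"), "hd": Decimal("0.03")},
    "openai/dall-e-3": {"standard": Decimal("0.06"), "hd": Decimal("0.12")},
    "stability/stable-diffusion-3-5-medium": {"standard": Decimal("0.05"), "hd": Decimal("0.09")},
    "stability/stable-diffusion-3-5-large": {"standard": Decimal("0.06"), "hd": Decimal("0.12")},
    "stability/stable-diffusion-ultra": {"standard": Decimal("0.12"), "hd": Decimal("0.18")},
}

def estimate_cost(model, n=1, quality="standard", prices=INDICATIVE_PRICES):
    """Approximate USD cost of n images at the given quality."""
    return prices[model][quality] * n
```

For instance, three hd images from openai/dall-e-3 come to roughly 3 × $0.12 = $0.36. `Decimal` is used so that small currency amounts do not pick up floating-point rounding noise.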
// PRIVACY NOTE
All current image models are tier PROXY: the gateway runs in TDX, but the model itself runs on the vendor's normal infrastructure (Stability, OpenAI, Recraft, Segmind). The vendor sees your prompt content. Phantom hides only your identity. Don't generate images of anything you'd be unhappy publishing.
// 09
Attestation
Phala provides two cryptographic layers, exposed through three endpoints:
- Per-response signature: ECDSA signature bound to a specific chat completion id.
- Per-model attestation report: Intel TDX (CPU) quote + NVIDIA Confidential Computing (GPU) attestation, tied to the signing address that signed your response.
▸ Three-step verification
Bind each individual inference to verified TEE hardware:
Step 1. Make any chat completion. Capture response.id.
$ curl https://api.phantom.codes/v1/chat/completions \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"phala/kimi-k2.6","messages":[{"role":"user","content":"hi"}]}'
# → { "id": "chatcmpl-abc123...", ... }
Step 2. Fetch the signature for that specific response id.
$ curl "https://api.phantom.codes/v1/signature/chatcmpl-abc123...?model=phala/kimi-k2.6" \
-H "Authorization: Bearer YOUR_KEY"
# → { "text": "...", "signature": "0x...", "signing_address": "0x56d0...8b" }
Step 3. Fetch the attestation report. Bind it to the same signing_address and a fresh nonce of your choosing.
$ NONCE=$(openssl rand -hex 32)
$ curl "https://api.phantom.codes/v1/inference-attest?model=phala/kimi-k2.6&nonce=$NONCE&signing_address=0x56d0...8b" \
-H "Authorization: Bearer YOUR_KEY"
# → { intel_quote, nvidia_payload, signing_address, nonce, event_log, ... }
▸ Offline verification
Verify the Intel TDX quote (CPU):
POST https://cloud-api.phala.com/api/v1/attestations/verify # with intel_quote
Verify the NVIDIA Confidential Computing payload (GPU):
POST https://nras.attestation.nvidia.com/v3/attest/gpu # with nvidia_payload
Verify the signature locally: recover the public key from signature over the canonical text field, confirm it matches signing_address. The attestation report binds signing_address to the genuine TEE that produced it. Together: response → signature → TEE quote chain.
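Before running the cryptographic checks, it is worth confirming that the report actually echoes your nonce and the address from step 2. The sketch below does only that field comparison and no signature or quote verification. It assumes the report returns `nonce` and `signing_address` as hex strings, as the step 3 output suggests; `check_binding` is a hypothetical helper.

```python
import secrets

def new_nonce():
    """Fresh 32-byte nonce as hex, equivalent to `openssl rand -hex 32`."""
    return secrets.token_hex(32)

def _norm_hex(value):
    """Lowercase and drop an optional 0x prefix so hex strings compare equal."""
    value = value.lower()
    return value[2:] if value.startswith("0x") else value

def check_binding(signature_resp, report, nonce):
    """Return a list of mismatches between steps 2 and 3 (empty if none).

    signature_resp is the decoded /v1/signature/{id} response, report the
    decoded /v1/inference-attest response, nonce the value you sent.
    """
    problems = []
    if _norm_hex(report.get("nonce", "")) != _norm_hex(nonce):
        problems.append("nonce mismatch: report may be stale or replayed")
    sig_addr = signature_resp.get("signing_address")
    rep_addr = report.get("signing_address")
    if not sig_addr or not rep_addr or _norm_hex(sig_addr) != _norm_hex(rep_addr):
        problems.append("signing_address mismatch between signature and report")
    return problems
```

The address comparison ignores case because the same hex address can be written with different capitalization.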
// 10
Paste-to-agent
Copy any block below into your AI agent's config. Replace sk-your-phantom-key with the key you get after purchase.
▸ Shell env (works with most agent SDKs)
LangChain, LlamaIndex, Vercel AI SDK, Pydantic-AI, llm, mods, shell-gpt, aichat, chatblade, tgpt — anything that respects OpenAI env.
export OPENAI_BASE_URL=https://api.phantom.codes/v1
export OPENAI_API_KEY=sk-your-phantom-key
▸ Claude Code / Cursor / Cline (OpenAI-compatible providers)
In each tool's settings UI pick "OpenAI Compatible" and paste:
base_url: https://api.phantom.codes/v1
api_key: sk-your-phantom-key
model: phala/kimi-k2.6
▸ opencode (sst) — ~/.config/opencode/opencode.json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"phantom": {
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "https://api.phantom.codes/v1",
"apiKey": "{env:PHANTOM_API_KEY}"
},
"models": {
"phala/kimi-k2.6": {"name": "Kimi K2.6"},
"phala/glm-5.1": {"name": "GLM 5.1"},
"phala/gpt-oss-120b": {"name": "GPT-OSS 120B"},
"phala/glm-4.7-flash": {"name": "GLM 4.7 Flash"}
}
}
}
}
▸ Continue.dev — ~/.continue/config.json
{
"models": [{
"title": "Phantom Kimi",
"provider": "openai",
"model": "phala/kimi-k2.6",
"apiKey": "sk-your-phantom-key",
"apiBase": "https://api.phantom.codes/v1"
}]
}
▸ Aider
aider --openai-api-base https://api.phantom.codes/v1 \
--openai-api-key sk-your-phantom-key \
--model openai/phala/kimi-k2.6
▸ OpenHands
export LLM_API_KEY=sk-your-phantom-key
export LLM_BASE_URL=https://api.phantom.codes/v1
export LLM_MODEL=openai/phala/kimi-k2.6
docker run -it -e LLM_API_KEY -e LLM_BASE_URL -e LLM_MODEL ghcr.io/all-hands-ai/openhands:main
▸ Drop-in for an LLM prompt
Paste this into your LLM and tell it "configure my project to use phantom":
Use phantom (phantom.codes) as my LLM provider.
It is OpenAI-API-compatible. Set:
base_url = https://api.phantom.codes/v1
api_key = sk-your-phantom-key (provided separately)
Pick a model from: https://api.phantom.codes/v1/models
Defaults that work well:
phala/kimi-k2.6 - long-horizon coding agent
phala/glm-5.1 - premium reasoning + coding
phala/gpt-oss-120b - balanced default
phala/glm-4.7-flash - fast agentic coding, 202K ctx
For image generation use POST /v1/images/generations with model
stability/stable-diffusion-3-5-large or openai/dall-e-3.
For embeddings use POST /v1/embeddings with model
qwen/qwen3-embedding-8b.
Wire the SDK exactly like OpenAI (same shape, same auth header).
// 11
Integrations
Phantom is OpenAI-compatible. Works with anything that takes baseUrl + apiKey.
▸ Any OpenAI-compatible CLI
Two env vars in your shell config (~/.zshrc or ~/.bashrc):
$ export OPENAI_BASE_URL=https://api.phantom.codes/v1
$ export OPENAI_API_KEY=sk-your-phantom-key
Works with: aichat, llm (simonw), mods (charm), chatblade, shell-gpt, tgpt, ollama (via --openai compat), chatgpt-cli, plus any agent SDK that reads OPENAI_BASE_URL (LangChain, LlamaIndex, Vercel AI SDK, Pydantic-AI). Set the model name in each tool's config to one from /v1/models, e.g. phala/kimi-k2.6.
Wrappers around Moonshot's Kimi or other proprietary CLIs that hardcode their vendor URL won't work. They bypass your env. Use any of the above generic CLIs and select the phala/kimi-k2.6 model instead.
▸ opencode (sst)
~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"phantom": {
"npm": "@ai-sdk/openai-compatible",
"options": {
"baseURL": "https://api.phantom.codes/v1",
"apiKey": "{env:PHANTOM_API_KEY}"
},
"models": {
"phala/kimi-k2.6": {"name": "Kimi K2.6"},
"phala/glm-5.1": {"name": "GLM 5.1"},
"phala/gpt-oss-120b": {"name": "GPT-OSS 120B"},
"phala/glm-4.7-flash":{"name": "GLM 4.7 Flash"}
}
}
}
}
▸ OpenHands
$ export LLM_API_KEY=sk-your-key
$ export LLM_BASE_URL=https://api.phantom.codes/v1
$ export LLM_MODEL=openai/phala/kimi-k2.6
$ docker run -it -e LLM_API_KEY -e LLM_BASE_URL -e LLM_MODEL ghcr.io/all-hands-ai/openhands:main
▸ Aider
$ aider --openai-api-base https://api.phantom.codes/v1 \
--openai-api-key sk-your-key \
--model openai/phala/kimi-k2.6
▸ Cline (VS Code)
Settings → API Provider: OpenAI Compatible
Base URL: https://api.phantom.codes/v1
API Key: your sk-...
Model: phala/kimi-k2.6 (or any from /v1/models).
▸ Continue.dev
~/.continue/config.json:
{
"models": [{
"title": "Phantom Kimi",
"provider": "openai",
"model": "phala/kimi-k2.6",
"apiKey": "sk-your-key",
"apiBase": "https://api.phantom.codes/v1"
}]
}
▸ OpenClaw
~/.openclaw/openclaw.json:
{
models: {
mode: "merge",
providers: {
"phantom": {
baseUrl: "https://api.phantom.codes/v1",
apiKey: "${PHANTOM_API_KEY}",
api: "openai-completions",
models: [
{ id: "phala/kimi-k2.6", contextWindow: 262144, maxTokens: 32000 },
{ id: "phala/glm-5.1", contextWindow: 202752, maxTokens: 32000 },
{ id: "phala/gpt-oss-120b", contextWindow: 131072, maxTokens: 32000 },
{ id: "phala/glm-4.7-flash", contextWindow: 202752, maxTokens: 32000 }
]
}
}
}
}
▸ OpenWebUI
Settings → Connections → OpenAI API
URL: https://api.phantom.codes/v1
Key: your sk-...
// FUNCTION-CALLING NOTE
Agentic tools (opencode, Cline, OpenHands) lean on tool/function calling. Pick a capable model:
- Strong: phala/kimi-k2.6, phala/glm-5.1
- Decent: phala/glm-4.7-flash, phala/gpt-oss-120b
- Chat-only: phala/qwen-2.5-7b-instruct, phala/uncensored-24b
// 12
Rate limits
| ENDPOINT | LIMIT | KEYED BY |
/v1/chat/completions | 60/minute | API key hash |
/v1/purchase | 10/minute | IP hash (never logged) |
/v1/purchase/{id}/status | 60/minute | IP hash |
/v1/key/rotate | 5/hour | API key hash |
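A client can stay under these limits by pacing itself. The sketch below is a simple sliding-window limiter; how the server counts its window (sliding or fixed) is not documented here, so treat this as a conservative approximation. `SlidingWindowLimiter` is a hypothetical helper.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Client-side limiter: at most `limit` calls per `window_s` seconds."""

    def __init__(self, limit=60, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop calls that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        return self.window_s - (now - self.calls[0])

    def record(self):
        """Note that a call is being made now."""
        self.calls.append(self.clock())

    def acquire(self, sleep=time.sleep):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()

# Usage: call chat_limiter.acquire() before each /v1/chat/completions request.
chat_limiter = SlidingWindowLimiter(limit=60, window_s=60.0)
```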
// 13
FAQ
▸ I lost my key. Recovery?
None. Keys are bearer. We store only the SHA-256 hash. Lose it, credit is gone. By design.
▸ Refunds?
Crypto refunds need a return address we don't store. Buy small first.
▸ Why no /v1/usage history endpoint?
We log token counts + key hash, not content. Exposing usage history as an endpoint lets attackers enumerate when a key is active. /v1/key/balance gives you the only thing you need: remaining credit.
▸ What models do you support?
See /v1/models. Chat models are TEE-attested on Phala; image models are tier PROXY and run on the vendor's infrastructure. Proprietary chat models (Claude, GPT-4, Gemini, Grok) are not exposed. They don't run in TEEs and they log queries.
▸ Can you see my prompts?
While a request is in flight, the proxy holds the prompt in RAM. We don't log it, and it's not persisted. The hosting provider could in principle dump memory. See trust boundaries.
▸ Tor?
The API runs as a Tor hidden service. See the .onion address in the connection callout at the top of this page. Same API, same keys, no clearnet exit, no DNS lookup. You can also route any client through Tor SOCKS at the clearnet URL.
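Both options can be sketched with the third-party `requests` library and its SOCKS extra (`pip install "requests[socks]"`). This assumes a local Tor daemon listening on its default SOCKS port 9050 (Tor Browser uses 9150 instead); `tor_proxies` and `get_models_over_tor` are hypothetical helpers.

```python
ONION_BASE = "http://jzqvbfmrlt5ye467joz75dg6xdurc6bniozautqlil5b3tbf777zmsid.onion/v1"
CLEARNET_BASE = "https://api.phantom.codes/v1"

def tor_proxies(host="127.0.0.1", port=9050):
    """Proxy mapping for requests.

    socks5h (not socks5) makes the proxy resolve hostnames, which is what
    lets .onion names work and keeps DNS lookups off the local network.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}

def get_models_over_tor(use_onion=True, timeout=60):
    """Fetch /v1/models through Tor, via the onion service or the clearnet URL."""
    import requests  # imported here so the helpers above work without it

    base = ONION_BASE if use_onion else CLEARNET_BASE
    resp = requests.get(f"{base}/models", proxies=tor_proxies(), timeout=timeout)
    resp.raise_for_status()
    return resp.json()
```

Tor circuits add latency, so a longer timeout than usual is a reasonable default.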
▸ How long does payment take to confirm?
Monero blocks land roughly every 2 minutes. Required confirmations scale with bundle size:
- small ($10): 2 confirmations (~4 min)
- medium ($50): 4 confirmations (~8 min)
- large ($200): 6 confirmations (~12 min)
- whale ($500+): 10 confirmations (~20 min)
Custom amounts use the same tier rule, derived from credit size. After confirmation, the key drops on first status poll.
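The arithmetic behind those estimates is confirmations × ~2 minutes. A small sketch for the named bundles follows; the exact thresholds for custom amounts are not stated above, so they are deliberately left out. `estimated_wait_minutes` is a hypothetical helper.

```python
BLOCK_MINUTES = 2  # "roughly every 2 minutes", so results are estimates
CONFIRMATIONS = {"small": 2, "medium": 4, "large": 6, "whale": 10}

def estimated_wait_minutes(bundle, confirmations_seen=0):
    """Approximate minutes left until the required confirmations are reached."""
    needed = CONFIRMATIONS[bundle]
    remaining = max(needed - confirmations_seen, 0)
    return remaining * BLOCK_MINUTES
```

For example, a large bundle with 4 of its 6 confirmations already seen has about 2 × 2 = 4 minutes left.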
▸ Why Monero only? Why not BTC / ETH / USDC?
Bitcoin and Ethereum transactions are public. Sending us BTC reveals to anyone with the txid that wallet X paid Phantom on day Y. Monero hides sender, receiver, and amount on-chain. Stablecoins (USDC/USDT) have a freeze function that gives the issuer the power to seize. Phantom's premise is operator privacy + customer privacy, so we use the only payment rail that delivers both.
▸ Is this legal?
Phantom forwards inference requests to TEE-attested models. The models are open-weight. Monero is legal in nearly all jurisdictions. It is banned or restricted in a handful of places (China; some EU exchanges restrict it), but possession is not criminalized. Your use of the inference output is your responsibility per the AUP. Phantom does not advise on legality in your jurisdiction.
▸ How does this compare to OpenAI / Anthropic / Phala direct?
| vendor | account | payment | logs prompts | TEE |
| OpenAI / Anthropic / Google | required + KYC | credit card | yes (30d default) | no |
| Phala / Redpill direct | email account | card / crypto | no | yes |
| Phantom | none | XMR only | no | yes |
Phantom adds a ~30% markup vs Phala direct in exchange for the no-account, no-card, no-IP-log layer. If you don't need anonymity, use Phala direct.
▸ What happens when my credit hits $0?
The API returns HTTP 402 "insufficient credit or expired key". Either buy a new bundle to get a fresh key, or keep the same key by purchasing more credit (the key ID stays the same). Behavior for requests pending mid-stream may differ, so the safest option is to rotate. Credit left at expiration is forfeited.
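Clients can surface the 402 case explicitly instead of treating it as a generic failure. A minimal sketch for code that calls the API through `urllib`; `OutOfCredit` and `with_credit_check` are hypothetical names.

```python
import urllib.error

class OutOfCredit(Exception):
    """Raised when the API answers 402 (credit exhausted or key expired)."""

def with_credit_check(send):
    """Run send() and turn HTTP 402 into OutOfCredit; re-raise anything else.

    send is any zero-argument callable that performs the request.
    """
    try:
        return send()
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise OutOfCredit("insufficient credit or expired key") from err
        raise
```

Catching `OutOfCredit` at the top of an agent loop gives one place to pause work and prompt for a top-up.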
▸ Encrypted contact / abuse reports?
Encrypted only. PGP public key at /pgp.txt (fingerprint 09654A79076956E6042D11946296DEC4E954FC76). We don't run a plaintext support channel.