API · v1

neonproxy.space documentation

Every frontier model, one endpoint.

Drop-in OpenAI-compatible gateway routing to Anthropic, OpenAI and Zhipu upstreams. Pay per token. No subscription, no rate-limit theatrics.

1. Get an API key

2. Drop it in

Point any OpenAI SDK at neonproxy.space/v1

3. Ship

Stream responses, swap models, watch usage live

Create an account

Go to https://neonproxy.space/sign-up
Pick a username and password.
Verify your email (if email verification is enabled by the operator).
Log in at https://neonproxy.space/sign-in — you'll land on the dashboard.

Get an API key

Open Keys in the sidebar (/console/keys).
Click New key.
Optionally cap the per-key spend (e.g. $200 / month) and restrict it to specific models.
Copy the key (sk-...). It is shown only once.

Store it in your environment:

export NEONPROXY_KEY=sk-...
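
A minimal sketch of reading that variable at startup, so a missing key fails early with a readable message rather than as an authentication error on the first request. The helper name load_key and the sk- prefix check are illustrative assumptions based on the key format shown above; they are not part of the gateway.

```python
import os
from typing import Mapping, Optional


def load_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the gateway key from NEONPROXY_KEY or raise a readable error."""
    env = os.environ if env is None else env
    key = env.get("NEONPROXY_KEY", "").strip()
    if not key:
        raise RuntimeError("NEONPROXY_KEY is not set; export it before starting the app")
    # Assumption: keys start with "sk-", as in the example above.
    if not key.startswith("sk-"):
        raise RuntimeError("NEONPROXY_KEY does not look like a gateway key (expected sk-...)")
    return key
```

Calling load_key() with no arguments reads the real process environment; passing a dict makes the helper easy to exercise in tests.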

Drop it into your code

The gateway speaks the OpenAI wire format out of the box — point any OpenAI-compatible SDK at it and your existing code keeps working.

Node / TypeScript

import OpenAI from 'openai'

const ai = new OpenAI({
  baseURL: 'https://neonproxy.space/v1',
  apiKey: process.env.NEONPROXY_KEY,
})

const res = await ai.chat.completions.create({
  model: 'claude-opus-4.8',
  messages: [{ role: 'user', content: 'Explain rate limits in one sentence.' }],
})

console.log(res.choices[0].message.content)

Python

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://neonproxy.space/v1",
    api_key=os.environ["NEONPROXY_KEY"],
)

res = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Hi"}],
)
print(res.choices[0].message.content)

curl

curl https://neonproxy.space/v1/chat/completions \
  -H "Authorization: Bearer $NEONPROXY_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Hi"}]
  }'
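
For environments without an SDK, the same request can be sketched with only the Python standard library. The function name build_chat_request is hypothetical; the URL, headers and body mirror the curl call above. The request is built but not sent, so the sketch can be inspected offline.

```python
import json
import urllib.request

BASE_URL = "https://neonproxy.space/v1"  # base URL from the examples above


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST that the curl example issues."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_chat_request(key, "gpt-5.5", "Hi")) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```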

Streaming

Every model supports streamed responses. Add "stream": true:

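// Reuses the `ai` client from the Node / TypeScript example above.
const stream = await ai.chat.completions.create({
  model: 'claude-opus-4.7',
  messages: [{ role: 'user', content: 'Explain rate limits in one sentence.' }],
  stream: true,
})

// Print each text delta as it arrives; some chunks carry no content.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '')
}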
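
If you consume the stream without an SDK, the OpenAI wire format delivers server-sent events: one "data: {json}" line per chunk, ending with "data: [DONE]". The sketch below assumes the gateway forwards that framing unchanged; collect_stream_text is a hypothetical helper name.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the delta.content pieces from OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI wire format
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if choices:
            # The first chunk often carries only the role, with no content.
            parts.append(choices[0].get("delta", {}).get("content") or "")
    return "".join(parts)
```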

Anthropic / Gemini wire formats

If your code uses the Anthropic SDK directly, point its baseURL at the gateway root (the SDK appends the /v1/messages path itself):

import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic({
  baseURL: 'https://neonproxy.space',
  apiKey: process.env.NEONPROXY_KEY,
})

const msg = await client.messages.create({
  model: 'claude-opus-4.8',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hi' }],
})

Gemini-style endpoints work too:

POST https://neonproxy.space/v1beta/models/gemini-2.5-pro:generateContent
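
A hedged sketch of the request body for that endpoint, following the public Gemini generateContent format (a contents list of role plus parts). The helper name build_generate_content is hypothetical. How the gateway expects the key to be sent on this route is not stated here, so authentication is deliberately left out.

```python
import json
from typing import Tuple

GEMINI_BASE = "https://neonproxy.space/v1beta/models"  # path taken from the line above


def build_generate_content(model: str, prompt: str) -> Tuple[str, str]:
    """Return (url, json_body) for a single-turn generateContent call."""
    url = f"{GEMINI_BASE}/{model}:generateContent"
    # Gemini-style body: a list of turns, each with a role and text parts.
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return url, json.dumps(body)
```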

Models & pricing

All prices are in USD per 1M tokens. There is no subscription and no monthly minimum; you are billed for actual usage.

Model	Input	Output	Context
claude-opus-4.8	$0.40	$0.40	200K
claude-opus-4.7	$0.40	$0.40	200K
claude-sonnet-4.6	$0.20	$0.20	200K
gpt-5.5	$0.20	$0.20	128K
gpt-5.4	$0.20	$0.20	128K
glm-5.2	$0.10	$0.10	128K
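
To make the per-token arithmetic concrete, here is a small estimator built from the table above. The prices are copied from the table; the function name estimate_cost is hypothetical, and the result is an estimate, not a statement of how the gateway invoices.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "claude-opus-4.8": (Decimal("0.40"), Decimal("0.40")),
    "claude-opus-4.7": (Decimal("0.40"), Decimal("0.40")),
    "claude-sonnet-4.6": (Decimal("0.20"), Decimal("0.20")),
    "gpt-5.5": (Decimal("0.20"), Decimal("0.20")),
    "gpt-5.4": (Decimal("0.20"), Decimal("0.20")),
    "glm-5.2": (Decimal("0.10"), Decimal("0.10")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION
```

For example, 1,000,000 input tokens and 500,000 output tokens on claude-opus-4.8 come to 0.40 + 0.20 = 0.60 USD.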

More code examples

Tool use (function calling)

const res = await ai.chat.completions.create({
  model: 'claude-opus-4.8',
  messages: [{ role: 'user', content: 'Weather in Berlin?' }],
  tools: [{
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  }],
})

if (res.choices[0].message.tool_calls) {
  // Pass tool results back as a follow-up message
}
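
The follow-up step can be sketched as below, using the OpenAI wire format for tool results: one message with role "tool" per call, tied back by tool_call_id. The sketch works on dict-shaped tool calls as they appear in the raw JSON response. get_weather is a hypothetical stand-in and TOOLS is an illustrative registry, neither is provided by the gateway.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> dict:
    """Hypothetical stand-in; a real implementation would query a weather service."""
    return {"city": city, "forecast": "unknown"}


# Maps the tool names declared in the request to local functions.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def tool_result_messages(tool_calls: List[dict]) -> List[dict]:
    """Run each requested tool and build the role="tool" follow-up messages."""
    messages = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as an object.
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return messages
```

Append the assistant message that contained the tool calls, then these tool messages, to the conversation and call the chat completions endpoint again to get the final answer.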

Vision (multimodal input)

const res = await ai.chat.completions.create({
  model: 'claude-sonnet-4.6',
  messages: [{
    role: 'user',
    content: [
      { type: 'text', text: 'What is in this image?' },
      { type: 'image_url', image_url: { url: 'https://example.com/cat.jpg' } },
    ],
  }],
})
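
For a local file rather than a public URL, the OpenAI wire format also accepts a base64 data URL in the same image_url field. Whether a given model behind the gateway accepts data URLs is an assumption here, and image_part_from_bytes is a hypothetical helper name.

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Build an image_url content part that embeds the image as a data URL."""
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}
```

The returned dict slots into the content list in place of the URL-based image part.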

JSON-mode response

const res = await ai.chat.completions.create({
  model: 'gpt-5.5',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'system', content: 'Always respond with JSON.' },
    { role: 'user', content: 'List 3 fruits.' },
  ],
})

// content is typed as string | null, so fall back to an empty object.
const data = JSON.parse(res.choices[0].message.content ?? '{}')
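
JSON mode constrains the output, but a reply can still be empty or cut off (for instance when a token limit is reached), so parsing defensively is a reasonable habit. A minimal sketch; parse_json_reply is a hypothetical helper name.

```python
import json
from typing import Any, Optional


def parse_json_reply(content: Optional[str]) -> Any:
    """Parse a JSON-mode reply, raising ValueError with a readable message on failure."""
    if not content:
        raise ValueError("model returned no content")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        # Typically a truncated reply; surface it instead of failing obscurely later.
        raise ValueError(f"model reply was not valid JSON: {exc.msg}") from exc
```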

Embeddings (where supported)

const res = await ai.embeddings.create({
  model: 'text-embedding-3-large',
  input: 'hello world',
})

console.log(res.data[0].embedding)
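
Embedding vectors are usually compared with cosine similarity. The sketch below is generic vector math in pure Python and makes no assumption about the gateway beyond each embedding being a list of floats, as in the response above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```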

FAQ

What providers do you route to?

Anthropic for claude-*, OpenAI for gpt-*, Zhipu for glm-*. Region failover is handled transparently.

Is there a rate limit?

Per-key rate limits are configurable in the dashboard. The default is generous enough for most production traffic, so in practice you hit a ceiling only if you set a lower one yourself.

What about latency?

Routing adds ~5–15 ms on top of the upstream provider's latency. For streamed responses, the first token arrives at about the same wall-clock time as when calling the provider directly.
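
To check these figures against your own traffic, time-to-first-chunk can be measured around any streaming call. A minimal sketch: measure_stream is a hypothetical helper, and open_stream is any zero-argument callable that starts the request and returns an iterable of chunks.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def measure_stream(
    open_stream: Callable[[], Iterable],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, List]:
    """Return (seconds to first chunk, total seconds, chunks) for one stream."""
    start = clock()  # taken before the request is issued
    first: Optional[float] = None
    chunks = []
    for chunk in open_stream():
        if first is None:
            first = clock() - start
        chunks.append(chunk)
    total = clock() - start
    return first, total, chunks
```

With the OpenAI SDK this could be called as measure_stream(lambda: client.chat.completions.create(..., stream=True)). Running the same measurement against the provider directly gives the comparison baseline.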

Can I bring my own API keys (BYOK)?

Contact support — BYOK is supported for enterprise accounts.

Where can I see usage?

The dashboard at neonproxy.space/console/dashboard shows real-time spend, a per-key breakdown, and per-model token counts.

What if a model returns an error?

The gateway retries transient errors automatically. Persistent errors propagate to your client with the upstream's error code unchanged.
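
Since errors that reach your client have already survived the gateway's own retries, a client-side policy should be conservative. The sketch below retries a short list of status codes with exponential backoff and re-raises everything else immediately. UpstreamError, RETRYABLE and call_with_backoff are hypothetical names, and the choice of retryable statuses is an assumption, not gateway behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class UpstreamError(Exception):
    """Hypothetical error type carrying the HTTP status the client received."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned {status}")
        self.status = status


# Assumption: statuses that may succeed on a later attempt.
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable UpstreamErrors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except UpstreamError as err:
            is_last = attempt == attempts - 1
            if err.status not in RETRYABLE or is_last:
                raise  # non-retryable, or out of attempts
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
    raise AssertionError("unreachable")
```

The official OpenAI Python client also exposes a max_retries option, which may be enough on its own for many applications.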

Refunds / SLAs?

See the User Agreement.

Support

Last updated 2026-06-22. This site is operated under neonproxy.space terms.