Code sample library
Practical code covering common LLM scenarios. Copy and paste to get started.
Quick start
30-second Python quick start
Make your first request with the OpenAI SDK and our base_url.
View example
30-second Node.js / TypeScript quick start
Use the official OpenAI npm package to make requests in Node.js.
View example
Two-line switch from OpenAI to Nexevo
Change two lines of configuration: keep the OpenAI SDK and your existing code, and point only the endpoint at Nexevo.
View example
Get started with the official Python SDK
Replace the OpenAI-compatible client with the nexevo-ai package for a typed API plus direct access to Nexevo extensions (balance, feedback, model library).
View example
Get started with the official TypeScript SDK
@nexevo/sdk — typed client + 8 resources (chat/models/keys/billing/auth/conversations/orgs/feedback).
View example
Streaming & Tool Calling
Nexevo exclusive capabilities
Automatic fallback for multiple models
Pass a `models: [...]` list and each model is tried in turn until one succeeds: built-in fault tolerance.
View example
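The fallback happens server-side when you pass the `models` list, but the same idea can be sketched client-side. The model names and the `call_model` stub below are illustrative only:

```python
def call_with_fallback(models, call_model):
    """Try each model in order; return the first success."""
    errors = {}
    for model in models:
        try:
            return call_model(model)
        except Exception as exc:  # in real code, catch API errors only
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")

# Demo with a stub: the first "model" times out, the second succeeds.
def flaky(model):
    if model == "gpt-5":
        raise TimeoutError("upstream timeout")
    return f"answer from {model}"

print(call_with_fallback(["gpt-5", "claude-sonnet-4"], flaky))
# prints "answer from claude-sonnet-4"
```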
Use max_price to cap worst-case cost
Set an upper limit on unit price so runaway loops or untrusted inputs cannot burn through your quota.
View example
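A sketch of how the cap might be attached to a request. Treat the `max_price` field and its units as assumptions about the Nexevo API; the OpenAI SDK's `extra_body` parameter is a real escape hatch for passing provider-specific fields:

```python
def build_request(prompt: str, max_price: float) -> dict:
    """Build kwargs for client.chat.completions.create with a price cap.

    `max_price` units are whatever Nexevo defines (assumed here to be
    a unit price ceiling); the field name is illustrative.
    """
    return {
        "model": "gpt-4o-mini",  # illustrative model
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {"max_price": max_price},
    }

req = build_request("Summarize this log", max_price=2.0)
# then: client.chat.completions.create(**req)
```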
:fast / :cheap / :quality routing suffix
Append a suffix to the model name to give the router a hint, without restructuring the request body.
View example
RLHF feedback closed loop
Collect thumbs-up/down signals from user behavior and feed them back to the routing system automatically, so the self-learning algorithm can optimize future model selection.
View example
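A hypothetical sketch of the feedback signal's shape. The field names and the idea of tying feedback to a prior request id are assumptions for illustration, not the documented Nexevo schema:

```python
def feedback_payload(request_id: str, thumbs_up: bool, note: str = "") -> dict:
    """Shape a thumbs-up/down signal tied to a previous completion.

    Field names are illustrative; the real schema may differ.
    """
    return {
        "request_id": request_id,
        "rating": 1 if thumbs_up else -1,
        "note": note,
    }

payload = feedback_payload("req_123", thumbs_up=False, note="hallucinated")
# then POST it to the feedback endpoint with your HTTP client of choice
```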
Asynchronous concurrent batch processing (10x throughput)
Use AsyncNexevo with asyncio.gather to process batch requests concurrently; throughput is 10x+ higher than serial requests, which suits offline tasks such as data annotation and classification.
View example
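The concurrency pattern itself is plain asyncio and can be shown runnable with a stub coroutine standing in for AsyncNexevo; swap `classify` for real async API calls:

```python
import asyncio

async def classify(text: str) -> str:
    """Stub for an async model call; real code would hit the API here."""
    await asyncio.sleep(0.01)  # simulate network latency
    return "positive" if "good" in text else "negative"

async def run_batch(texts: list[str]) -> list[str]:
    # gather fires all requests concurrently, so total wall time is roughly
    # one request's latency instead of the sum of all of them.
    return await asyncio.gather(*(classify(t) for t in texts))

results = asyncio.run(run_batch(["good service", "slow reply", "good docs"]))
print(results)  # ['positive', 'negative', 'positive']
```

For real workloads, add a semaphore or chunking so a large batch does not exceed your rate limits.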
Image / video / 3D generation
Text-to-image (DALL-E 3 / Imagen 4 / FLUX)
Synchronous image generation — pick OpenAI / Google / Replicate; the response returns an image URL directly.
View example
Image editing (gpt-image-1 reference)
gpt-image-1 supports image-to-image — pass reference_image_b64 for style transfer / partial edits.
View example
Text-to-video (Sora 2 / Veo 3 / Wan 2.6)
Asynchronous video: submitting returns a job_id; poll it yourself or use the generate_and_wait helper. Supports Sora 2 / Veo 3 / Wan 2.6.
View example
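The submit-then-poll pattern can be sketched generically. Here `submit` and `get_status` are stand-ins for the real job endpoints, and the status names ("running", "succeeded", "failed") are assumptions:

```python
import time

def wait_for_video(submit, get_status, poll_seconds=0.0, timeout=10.0):
    """Submit a job, then poll its job_id until it finishes or times out."""
    job_id = submit()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status(job_id)
        if job["status"] == "succeeded":
            return job["video_url"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "generation failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish in {timeout}s")

# Demo with stubs: the job reports "running" twice, then "succeeded".
states = iter(["running", "running", "succeeded"])
url = wait_for_video(
    submit=lambda: "job_42",
    get_status=lambda jid: {"status": next(states),
                            "video_url": "https://example.com/clip.mp4"},
)
print(url)  # https://example.com/clip.mp4
```

In production, use a non-zero `poll_seconds` (ideally with backoff) to avoid hammering the status endpoint.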
Image-to-video with Runway Gen-4 + OSS
Runway Gen-4 requires a reference image — upload it to OSS first, then submit with reference_image_url.
View example
3D asset generation (Hunyuan 3D)
Direct Hunyuan 3D access (TC3-HMAC signing, ~30% cheaper than Replicate) — text or image in, GLB / OBJ / USDZ out.
View example
RAG / Retrieval
Agent / Automation
Agents quickstart
Sync + streaming, built-in tools.
View example
RAG Agent — drop in docs, get answers
Inject your docs and the agent automatically calls rag_search to answer.
View example
Multi-modal combo — research + math + image generation
One task chains web_search + python_exec + generate_image; the agent orchestrates steps automatically.
View example
Framework integration
Use LangChain + Nexevo to build RAG
Connect Nexevo to LangChain's ChatOpenAI to perform retrieval-augmented chat.
View example
Next.js + Vercel AI SDK
Use Nexevo as a model provider with the Vercel AI SDK (@ai-sdk) to build streaming UIs.
View example
LangChain integration (langchain-nexevo)
Drop-in replacement for ChatOpenAI — get smart routing, ELO duels, and cascade cost optimization in one import. Plus a LangGraph checkpoint saver bridged to /v1/conversations.
View example
Multi-turn dialogue with backend history persistence
Use the conversations API to replace client-side session management. All history is persisted in the Nexevo backend, with cross-device access, resume, and search.
View example