Nexevo Conductor / Smart Hub

AI Runtime—not just another router

One call = Routing + Caching + Cross-model Memory + Anti-hallucination + On-demand Agent

We don’t sell you models — freeing you from vendor lock-in. At its core: MCP and desktop applications. One-line config integration with Claude Desktop or Cursor. A neutral runtime layer that model vendors, by structure, cannot provide.

Start for free MCP Integration Documentation

OpenAI-Compatible · Main Entry Point: POST /v1/conductor/chat

Tokens saved

511,776,365

USD value saved

$1,275.92

Hallucinations caught

11,358

Infrastructure of Each Era

Both are neutral layers situated between applications and underlying services — such a layer has yet to emerge in the AI era.

Web

Cloudflare

Observability

Datadog

Frontend

Vercel

AI Runtime

Nexevo

One API, Six Capabilities

Each request automatically runs through the Conductor pipeline. Customer perception = one call = complete result. Internally = 9 visible decision steps; model vendors structurally lack 3 critical capabilities.

Local semantic cache

Save 30–50% tokens

pgvector single-vector coarse filtering + Voyage Rerank-2.5 fine ranking; match only if sim ≥ 0.95. Cross-provider universal, with dual isolation per user and per options. Pre-warm cron ensures new users hit on their first visit.

Cross-model memory auto-injection

Switch models without losing context

Automatically prepends Recall + differential injection (token -60%) when switching model_id. Family-specific format adaptation: Claude XML, GPT Markdown, Gemini fenced YAML, Llama plaintext — each model reads its most comfortable syntax.

Anti-hallucination verification

Intent Whitelist Trigger

Five high-risk intent categories—legal, medical, financial, security, and code_critical—are automatically cross-validated using a low-cost judge, conserving budget on simple chats while ensuring reliability for high-sensitivity scenarios. Fabricated responses can also trigger auto-retry with a stronger model to regenerate the answer.

Agent-on-demand

On-demand multi-step, not a standalone product

When `agent=auto-if-multi-step`, the system automatically determines: simple Q&A is answered directly; complex tasks (involving planning, tool invocation, or multi-step reasoning) are automatically routed to an agent sandbox for iterative execution. `max_cost_usd` enforces a hard cutoff to prevent runaway loops.

Per-call X-Ray trace

Transparent per call

X-Ray Badge displays: pipeline decisions / cache hits / memory injections / cost estimates / latency. Turning black-box into white-box—every request can be audited down to the model, decisions, and cost details.

Sticky session + break-even

Cost Intelligence Optimization

Lock model within the same session to avoid random jitter; calculate break-even before switching models—if attaching memory costs 1.3× more than staying on the current model, automatically recommend “sticky” mode. Decisions are visible, logs are auditable, and enforcement is optional in Q2.

Built-in Components

Conductor includes Quorum + Recall

Quorum · Anti-hallucination Verification

Conductor’s `verify=auto` uses Quorum’s dual-AI cross-verification mechanism to combat hallucination. It can also operate independently as an MCP widget, ideal for human in-depth review of critical decisions.

Recall · Cross-model memory storage

The memory automatically injected by Conductor when switching models is encapsulated in Recall. Recall can also operate independently as an MCP widget, enabling search and retrospective analysis for any AI conversation.

MCP / Desktop Application as the Core

Wherever you use AI, Conductor is there.

Not just another web playground. MCP provides a single tool (`conductor_ask`) that enables native integration of Claude Desktop, Cursor, or any MCP client—seamlessly embedding into their chat or editor experience.

Claude Desktop

1 tool: conductor_ask

Configuring the Nexevo MCP server → Claude automatically gains the conductor_ask tool
mode parameter: chat / save_memory / search_memory — internal automatic routing
Claude invocation works identically to its native tools—seamless switching.

Cursor / VSCode

Automatic Code Context Injection

MCP server listens for IDE selection / current file
conductor_ask mode=chat automatically includes code context
Automated code review / refactoring suggestions / bug detection integrated with Conductor

Nexevo Desktop App

Local AI Workspace

One-click switch between Claude / GPT-4o / Gemini; memory automatically carries over
X-Ray real-time display: cache hits + tokens saved + dollars saved
Recall Capsule: Desktop-level management + cross-conversation reuse

MCP Full Integration Documentation

We don’t sell you models.

We ensure you’re not locked in to any single provider.

Model vendors have a natural incentive to lock you into their own ecosystem—APIs, cache, memory, and agents are all under their control. The reality at the application layer is: Claude leads today, GPT-5 launches tomorrow, and Gemini surges ahead the day after. Nexevo acts on your behalf—cross-vendor, swappable, memory preserved—so model selection returns to “judging by performance,” not “weighing lock-in costs.”

Compare with Similar Products

Routing + Caching + Memory + Anti-Hallucination + Agent + X-Ray = One Product.

Capability	Conductor	OpenRouter	Portkey	Letta
OpenAI-Compatible API	✓	✓	✓	—
Multi-model routing	✓	✓	✓	—
Local semantic cache (cross-provider)	✓	—	—	—
Cross-model memory + format adaptation	✓ Exclusive	—	—	Partial
Pre-warm cluster first-visit hit	✓ Exclusive	—	—	—
Anti-hallucination verify (high-risk intent)	✓	—	guardrails	—
Per-call X-Ray trace UI	✓ Exclusive	—	Partial	—
MCP Native + Desktop Application	✓	—	—	Partial
Neutral / Not vendor-locked	✓	✓	✓	—

Compared against publicly available information from May 2026. Specific capabilities may evolve; refer to each vendor’s documentation for details.

Get Started with Conductor

Get started in 5 minutes. OpenAI SDK compatible = change just one line: base_url. Or connect MCP to Claude Desktop / Cursor.

Get started for free View Pricing MCP Documentation

Free quota for new users · OpenAI-compatible SDK · Documentation in Chinese and English