Roadmap
What we're working on next. See a gap? Tell us.
Released in the last 90 days.
OAuth sign-in + region-aware cookie consent (2026-04-30)
Continue with Google / GitHub on /login. Cookie consent bar shows only in GDPR / CCPA / PIPL / LGPD / nFADP / PoPIA regions; localStorage persists choice + dispatches event for 3rd-party SDKs.
Layer 3+ smart routing (2026-04-30)
ELO duels feed effective_catalog reasoning+creative dims (≥30 duels). Per-model snapshot history with admin sparkline. Feedback-sourced golden prompts: weekly cron samples high-quality user adoptions to evolve duel topics with real traffic.
Multi-process billing safety + 11 P1 hardening (2026-04-30)
Pg advisory lock for billing critical sections (workers > 1 safe). Stripe webhook persistent dedup. BYOK calculator now factors cache tokens. python_exec DoS caps. Per-email login lockout. DAG checkpoint status field. 4505 tests pass.
Generation gateway: 22 models / 8 providers
Image / video / 3D unified API. Sora 2 / Veo 3 / Imagen 4 / Wan 2.6 / Hunyuan 3D direct / Runway Gen-4 / OpenAI Images / Replicate — all real adapters.
OSS reference image pipeline
Multipart upload + signed URLs + per-tenant quota + GC for orphans. Sora 2 / Veo 3 / Runway video outputs auto-mirrored to OSS.
TS SDK v0.3 + Python SDK v0.2: generation resources
nexevo.images / videos / models3d / generation.{jobs, uploads}; Python sync + async parity. 64+33 tests pass.
Playground 4-tab + admin provider health
Chat / Image / Video / 3D tabs in /dashboard/playground. Admin manages all 8 provider creds + 7 health pings (incl. TC3-HMAC sign test).
Smart Routing v2
80 specialty tags + 8-signal weighted scoring + 4 hard filters + manual override (specialty/difficulty/price) + 5-tab admin UI + draft/publish weights gating.
Catalog scaled to 92 models
Covers OpenAI (GPT-5/5.5/Codex/o3/o3-pro) + Anthropic (Claude 4.6/4.7) + Gemini 2.5/3.1 Pro + DeepSeek V4 Preview + Qwen3.6/3-VL + Kimi K2.6 + GLM 5.1 + Ant Ling + Microsoft Phi-4.
Stripe top-up + Aliyun email + admin integrations
Customer/SetupIntent/off-session auto top-up loop + Aliyun direct mail (Singapore) + one-stop /admin/integrations console.
BYOK (bring your own API key)
Use your own upstream keys, get our routing + observability, we charge a flat 5% service fee. Independent byok module, scheduler priority BYOK > settings.
Tiered pricing (dual smart-routing tier)
nexevo/fast ($0.80/$2.00/M) + nexevo/balanced ($5/$20/M) flat tiers, plus Passthrough (upstream +5%) — three optional models.
Cascade cost optimization
Try cheap level first, return on confidence ≥ 0.7 to save tokens; heuristic signals geo-mean; v2 top-3 audit trail.
Public anonymous /chat
No-signup browser try-it entry, 5-model whitelist + 3-message free quota, $2 signup credit guidance.
10-language i18n + smart locale routing
zh-CN/en/zh-TW/es/fr/de/ja/ko/hi/ms first-class support + 5-tier smart middleware + deepMerge key-level fallback.
Public Leaderboard
Following HF Open LLM Leaderboard's content compounding pattern; privacy chain salt+SHA256+magnitude buckets; 3 public endpoints.
TypeScript + Python SDK
@nexevo/sdk + nexevo-ai official SDKs with 8 resources (chat/models/keys/billing/auth/conversations/orgs/feedback).
JSON-LD SEO + complete sitemap
Organization / SoftwareApplication / FAQPage / TechArticle / Breadcrumb schema + 10-locale hreflang + AI crawler whitelist.
Current sprint.
Production deployment (HK gateway + Shenzhen proxy)
HK ECS backend + co-located nginx+Next frontend + Aliyun RDS + Shenzhen ECS proxy_cn + Aliyun ESA. Code on GitHub, awaiting 6-stage ops rollout.
SDK npm/PyPI publish
Local build + tests pass, awaiting npm publish + pypi twine upload.
Reliability signal hooked to real monitoring data
monitoring per-backend success rate is wired to scoring v2; replacing the 0.85 placeholder once production accumulates >= 20 samples per model.
/models/[id] dynamic OG image + detail SEO
Per-model OG share cards (provider · context · score · price). Code ready; temporarily disabled by Next.js 15.5.15 catch-all OG regression — restore on next minor upgrade (5 min job).
In the next 4-8 weeks.
Self-hosted distill model v1
Qwen-2.5-32B + AWQ + LoRA + RunPod A100. Catalog placeholder + train_lora.py ready, awaiting data (SFT >= 5000 / DPO >= 1000).
Benchmark fetcher cron live
Chatbot Arena / HF Open LLM auto-fetch real scores to override capability mvp-estimates. Code ready, awaiting cron schedule.
Public Leaderboard launch (legal sign-off)
MVP code + 3 public endpoints + privacy chain (salt+SHA256+magnitude buckets) all ready. Hidden from nav/sitemap until legal signs off on competitive-disparagement risk + opt-out flow.
PIPL / data residency compliance
ICP filing + region:cn-only/eu-resident hard filters + legal-signed 4-state data_policy.
LangChain provider adapter (langchain-nexevo)
Native LangChain integration: from langchain_nexevo import NexevoChatModel. Plus LangGraph checkpoint bridge into our conversations table. Borrowing the LangChain ecosystem beats building yet another agent framework — TaaS Self-Healing v2 already covers most LangGraph use cases.
Evaluating; no commitment yet — starts only on a clear trigger signal.
GN8 China GPU self-hosting
Ascend + DeepSeek-V4 domestic chip adaptation, lower cost + fully domestic chain.
Voice / audio routing
Bytedance Seedup / OpenAI Realtime / Gemini Live API unified interface. **Trigger to start**: ≥3 enterprise inquiries with paid intent OR OpenAI Realtime priced ≥$0.20/min OR clear gap in domestic Voice market. Deferred 6+ months unless triggered — bidirectional WebSocket stack + per-second billing is a different beast from chat completion.
Hybrid difficulty rating with real data
Benchmark + Bandit reward 50/50 weighting is currently a placeholder — flip on once enough data accumulates.
Microsoft / Apple OAuth + SSO for enterprise
Beyond Google/GitHub — add Microsoft (Office 365 enterprises) and Apple (iOS users) sign-in; plus SAML SSO for B2B enterprise tier.