January 2026
Quick Search & Navigation Refresh
A major overhaul of the console’s command palette and navigation experience. The Quick Search dialog now supports multi-page flows, improved resource discovery, and an integrated theme switcher—all accessible via Cmd+K.
Features
- Unified create actions. All resource creation flows are now accessible directly from the Cmd+K quick search
- Enhanced Quick Search. Multi-page flow with improved resource search and theme switcher built in
- Duplicate-name validation. Prevents naming conflicts across projects, VPCs, security groups, filesystems, and images
- Batch instance creation. Create multiple instances at once with public IP toggle support
- Image snapshots. Create image snapshots directly from instance detail views
- Filesystem improvements. VPC multi-select in creation/editing with improved UI
- Navigation updates. New tabs across serverless details, usage & billing, Kubernetes details, fine-tuning overview, and project compute clusters
- Security groups. Add actions in instance view with rule side panel for create/edit
- Jobs enhancements. Parameters field on create with AI summary/diagnosis for failed jobs
- Dashboard quotas. Private dashboard now displays available quotas
Nscale CLI
- Device CLI login. New device authentication flow with refresh token and expires-at fields
December 2025
Instances & Networking MVP
The console now includes a complete instances and networking workflow. Create and manage instances with full security group support, storage side panels, and quota visibility.
Features
- Instances + networking MVP. Full instance actions and security group management
- Storage create side panel. Streamlined storage creation with improved details view
- Quota callout. Clear visibility into resource quotas
- Instance create polish. Refined instance creation flow and UI improvements
- Device status. Updated device status display
November 2025
Instances v2 & UI Refresh
A comprehensive refresh of the console UI alongside the new Instances v2 experience. New list and detail views, storage cards, usage visualisation, and a broad refresh of core components.
Features
- Instance list & detail views. New views with create image side panel
- Storage details cards. Improved storage details page with card layout
- Usage visualisation. New bar chart for usage metrics
- UI component refresh. Updated form components, drawers, card tables, and tooltips
- Instances v2. List pages for supporting resources
- API keys removed. API key management moved out of console
- Workbench polish. UI improvements to the Workbench experience
- Design system alignment. Button styling aligned to design system
October 2025
Dedicated Infrastructure & Workbench Upgrades
New dedicated infrastructure nodes addon and significant Workbench improvements, including model selection fixes, preset rules, and comparison-mode enhancements.
Features
- Dedicated infrastructure nodes. New addon for dedicated infrastructure
- Workbench upgrades. Model selection/validation fixes, preset rules, comparison-mode improvements, and license display
- Job status handling. Improved job completion status handling
- Workload pool IPs. Machine IP display and copy improvements
- Region validation. Region change validation fixes
- Image selector. Improved image selection experience
- Preset baseline. Now updates to current version automatically
July 2025
Serverless Fine-tuning

Features
- Two-step workflow. Pick any supported base model (Llama 3, Mistral 7B, DeepSeek, Qwen and more) and launch a job with your dataset – no cluster sizing, no Dockerfiles
- LoRA-powered efficiency. Default Low-Rank-Adaptation (LoRA) reduces GPU hours and cost
- Live metrics & easy monitoring. Poll one endpoint to track train_loss, eval_loss, perplexity
- Export or deploy instantly. One-click push to Hugging Face or direct download of a ready-to-serve artefact
- Serverless pricing. $2 minimum per job, billed by processed tokens; every new account still gets $5 free credit to experiment
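The "live metrics" bullet above can be sketched as a small polling loop. This is a hypothetical illustration only: the status URL shape, the `status` field, and the flat metric keys (`train_loss`, `eval_loss`, `perplexity`) are assumptions, not a documented schema — consult the API reference for the real response format.

```python
# Hypothetical sketch of polling a fine-tuning job's metrics endpoint.
# The URL, auth header, and response fields below are assumptions for
# illustration, not confirmed API details.
import json
import time
import urllib.request


def extract_metrics(job: dict) -> dict:
    """Pull the three headline training metrics out of a job-status payload."""
    return {k: job.get(k) for k in ("train_loss", "eval_loss", "perplexity")}


def poll_job(status_url: str, api_key: str, interval: float = 30.0) -> None:
    """Poll one status endpoint until the job leaves the (assumed) 'running' state."""
    req = urllib.request.Request(
        status_url, headers={"Authorization": f"Bearer {api_key}"}
    )
    while True:
        with urllib.request.urlopen(req) as resp:
            job = json.load(resp)
        print(extract_metrics(job))
        if job.get("status") != "running":
            break
        time.sleep(interval)
```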
March 2025
Serverless Inference

Send requests to https://inference.api.nscale.com/v1/* and get deterministic, low-latency responses from today’s best open-source models—all billed per token and delivered from data-sovereign, 100% renewable data-centres.
Features
- OpenAI-compatible endpoints. Drop-in support for Llama, Qwen, DeepSeek and other leading models makes migration a copy-paste job
- Pay-as-you-go billing. Prices are per 1 million tokens, covering both input and output tokens, for Chat, Multimodal, Language and Code models. Image model pricing is based on image size and step count
- 80% lower cost & 100% renewable. Our vertically-integrated stack slashes TCO versus hyperscalers while guaranteeing data privacy—requests are never logged or reused
- $5 free credits to get started. Every new account includes starter credits so you can ship to production in minutes
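Because the endpoints are OpenAI-compatible, calling them needs nothing beyond the standard library. The sketch below assumes the documented base URL; the model id and the `NSCALE_API_KEY` environment variable are placeholders for illustration, not confirmed values.

```python
# Minimal sketch of a chat completion against the OpenAI-compatible API,
# using only the standard library. Model id and API key are placeholders.
import json
import os
import urllib.request

BASE_URL = "https://inference.api.nscale.com/v1"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble a POST /chat/completions request in OpenAI's wire format."""
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request(
        "meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
        "Say hello in one word.",
        os.environ["NSCALE_API_KEY"],  # placeholder env var name
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
```

Migrating an existing OpenAI integration should then be a matter of swapping the base URL and key, per the bullet above.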
Under the hood
| Area | What it looks like | Why it matters |
|---|---|---|
| API surface | Drop‑in equivalents for GET /models, POST /chat/completions, POST /images with optional stream: true for SSE (text/event-stream). | Migrate from OpenAI by changing only the base URL and key. |
| Model library | Launch set covers Meta Llama‑4 Scout 17B, Qwen‑3 235B, Mixtral‑8×22B, DeepSeek‑R1 distills, SD‑XL 1.0 and more (text, code, vision). | Lets teams A/B models or mix modalities without provisioning extra infra. |
| Elastic runtime | “Zero rate limits, no cold starts.” Traffic is sharded over thousands of MI300X/MI250X/H100 GPUs, spun up on‑demand by our orchestration layer. | Bursty workloads stay < 200 ms tail latency without you over‑allocating GPUs. |
| Cost model | Tokens in, tokens out — billed per 1M tokens; images billed per megapixel. Every account starts with $5 free credit. | Fine‑grained, deterministic spend; easy to embed in metered SaaS. |
| Security / privacy | End‑to‑end TLS, org‑scoped API keys, full tenant isolation; we never log or train on user prompts or outputs. | Meets GDPR, HIPAA and most vendor‑assessment checklists out of the box. |
| Sustainability | All compute runs in hydro‑powered facilities; the vertical stack is 80% cheaper‑per‑token than hyperscalers. | Fewer carbon (and budget) emissions per request. |
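The API-surface row above mentions `stream: true` with SSE (`text/event-stream`) responses. A minimal client-side parser for that framing, assuming the stream follows OpenAI's chat delta format as the table indicates, might look like:

```python
# Minimal parser for SSE-framed streaming chat output (stream: true).
# Assumes OpenAI-style "data: {...}" lines ending with "data: [DONE]".
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from 'data: {...}' SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```

Feeding it the decoded lines of a streaming response body concatenates the partial tokens back into the full reply.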