January 2026

Quick Search & Navigation Refresh

A major overhaul of the console’s command palette and navigation experience. The Quick Search dialog now supports multi-page flows, improved resource discovery, and an integrated theme switcher—all accessible via Cmd+K.

Features

  • Unified create actions. All resource creation flows are now accessible directly from the Cmd+K quick search
  • Enhanced Quick Search. Multi-page flow with improved resource search and theme switcher built in
  • Duplicate-name validation. Prevents naming conflicts across projects, VPCs, security groups, filesystems, and images
  • Batch instance creation. Create multiple instances at once with public IP toggle support
  • Image snapshots. Create image snapshots directly from instance detail views
  • Filesystem improvements. VPC multi-select in creation/editing with improved UI
  • Navigation updates. New tabs across serverless details, usage & billing, Kubernetes details, fine-tuning overview, and project compute clusters
  • Security groups. Add actions in instance view with rule side panel for create/edit
  • Jobs enhancements. Parameters field on create with AI summary/diagnosis for failed jobs
  • Dashboard quotas. Private dashboard now displays available quotas

Nscale CLI

  • Device CLI login. New device authentication flow with refresh token and expires-at fields
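
The device-login flow above returns a refresh token and an expires-at value; a minimal sketch of how a client could use those two fields to decide when to refresh (the credentials file path, helper names, and any fields beyond those two are our assumptions, not the CLI's documented layout):

```python
import json
import time
from pathlib import Path

# Hypothetical credentials file; the real CLI's storage location may differ.
CREDS_PATH = Path.home() / ".nscale" / "credentials.json"

def load_credentials(path: Path = CREDS_PATH) -> dict:
    """Read stored credentials as JSON (layout assumed, not documented)."""
    return json.loads(path.read_text())

def needs_refresh(creds: dict, skew_seconds: int = 60) -> bool:
    """Return True when the access token is expired or about to expire.

    `expires_at` is assumed to be a Unix timestamp, per the device-login
    flow's expires-at field; the refresh token would then be exchanged
    for a new access token.
    """
    return time.time() >= creds["expires_at"] - skew_seconds

# Example: a token that expired an hour ago should be refreshed.
expired = {"refresh_token": "example", "expires_at": time.time() - 3600}
print(needs_refresh(expired))  # True
```

Checking against a small clock skew avoids presenting a token that would expire mid-request.
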

December 2025

Instances & Networking MVP

The console now includes a complete instances and networking workflow. Create and manage instances with full security group support, storage side panels, and quota visibility.

Features

  • Instances + networking MVP. Full instance actions and security group management
  • Storage create side panel. Streamlined storage creation with improved details view
  • Quota callout. Clear visibility into resource quotas
  • Instance create polish. Refined instance creation flow and UI improvements
  • Device status. Updated device status display

November 2025

Instances v2 & UI Refresh

A comprehensive refresh of the console UI alongside the new Instances v2 experience. New list and detail views, storage cards, usage visualisation, and a broad refresh of core components.

Features

  • Instance list & detail views. New views with create image side panel
  • Storage details cards. Improved storage details page with card layout
  • Usage visualisation. New bar chart for usage metrics
  • UI component refresh. Updated form components, drawers, card tables, and tooltips
  • Instances v2. List pages for supporting resources
  • API keys removed. API key management moved out of console
  • Workbench polish. UI improvements to the Workbench experience
  • Design system alignment. Button styling aligned to design system

October 2025

Dedicated Infrastructure & Workbench Upgrades

New dedicated infrastructure nodes addon and significant Workbench improvements including model selection fixes, preset rules, and comparison-mode enhancements.

Features

  • Dedicated infrastructure nodes. New addon for dedicated infrastructure
  • Workbench upgrades. Model selection/validation fixes, preset rules, comparison-mode improvements, and license display
  • Job status handling. Improved job completion status handling
  • Workload pool IPs. Machine IP display and copy improvements
  • Region validation. Region change validation fixes
  • Image selector. Improved image selection experience
  • Preset baseline. Now updates to current version automatically

July 2025

Serverless Fine-tuning

Introducing Serverless Fine-tuning: the fastest way to customise open-weight foundation models without touching infrastructure. Spin up a secure, pay-as-you-go training job with a single API call, watch metrics stream in real time, and download your tuned model or push it straight to Hugging Face.

Features

  • Two-step workflow. Pick any supported base model (Llama 3, Mistral 7B, DeepSeek, Qwen and more) and launch a job with your dataset – no cluster sizing, no Dockerfiles
  • LoRA-powered efficiency. Default Low-Rank Adaptation (LoRA) reduces GPU hours and cost
  • Live metrics & easy monitoring. Poll one endpoint to track train_loss, eval_loss, perplexity
  • Export or deploy instantly. One-click push to Hugging Face or direct download of a ready-to-serve artefact
  • Serverless pricing. $2 minimum per job, billed by processed tokens; every new account still gets $5 free credit to experiment
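
The $2 per-job minimum interacts with token-based billing as a simple floor; a minimal sketch (the per-million-token rate here is a made-up illustration, not a published price):

```python
def job_cost(processed_tokens: int, price_per_million: float) -> float:
    """Estimate the cost of one fine-tuning job.

    Billing is by processed tokens with a $2 minimum per job;
    `price_per_million` is illustrative only, check current pricing.
    """
    return max(2.0, processed_tokens / 1_000_000 * price_per_million)

print(job_cost(500_000, 1.50))     # 2.0  (small job, minimum applies)
print(job_cost(10_000_000, 1.50))  # 15.0 (token charge exceeds minimum)
```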

Quick start

# 1. List available base models
curl -H "Authorization: Bearer $NSCALE_API_TOKEN" \
     https://fine-tuning.api.nscale.com/api/v1/organizations/$ORG_ID/base-models

# 2. Launch a LoRA job
curl -X POST https://fine-tuning.api.nscale.com/api/v1/organizations/$ORG_ID/jobs \
     -H "Authorization: Bearer $NSCALE_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "support-bot-finetune",
           "base_model_id": "e5f6a7b8-c9d0-1234-efab-567890123456",
           "dataset": {"id":"<DATASET_ID>","prompt_column":"prompt","answer_column":"response"},
           "hyperparameters":{"n_epochs":3,"batch_size":4,"lora":{"enabled":true,"r":8,"alpha":16}}
         }'

# 3. Stream training metrics
curl -H "Authorization: Bearer $NSCALE_API_TOKEN" \
     https://fine-tuning.api.nscale.com/api/v1/organizations/$ORG_ID/jobs/$JOB_ID/metrics
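
The same launch request can be assembled and sent from Python with only the standard library; a sketch mirroring the curl calls above (the payload fields come straight from the example, while the helper names and defaults are ours):

```python
import json
import urllib.request

API = "https://fine-tuning.api.nscale.com/api/v1"

def build_job_payload(name: str, base_model_id: str, dataset_id: str,
                      prompt_column: str, answer_column: str,
                      n_epochs: int = 3, batch_size: int = 4,
                      lora_r: int = 8, lora_alpha: int = 16) -> dict:
    """Assemble the job body shown in the Quick start curl example."""
    return {
        "name": name,
        "base_model_id": base_model_id,
        "dataset": {"id": dataset_id,
                    "prompt_column": prompt_column,
                    "answer_column": answer_column},
        "hyperparameters": {"n_epochs": n_epochs,
                            "batch_size": batch_size,
                            "lora": {"enabled": True,
                                     "r": lora_r,
                                     "alpha": lora_alpha}},
    }

def launch_job(org_id: str, token: str, payload: dict) -> dict:
    """POST the payload to the jobs endpoint (not executed in this sketch)."""
    req = urllib.request.Request(
        f"{API}/organizations/{org_id}/jobs",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_job_payload("support-bot-finetune",
                            "e5f6a7b8-c9d0-1234-efab-567890123456",
                            "<DATASET_ID>", "prompt", "response")
print(payload["hyperparameters"]["lora"])  # {'enabled': True, 'r': 8, 'alpha': 16}
```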

March 2025

Serverless Inference

A fully managed, pay-per-request runtime that puts a pool of GPUs behind a single OpenAI-compatible endpoint. Instead of capacity planning, container images and infra dashboards, you call https://inference.api.nscale.com/v1/* and get deterministic, low-latency responses from today’s best open-source models, all billed per token and delivered from data-sovereign, 100% renewable data centres.

Features

  • OpenAI-compatible endpoints. Drop-in support for Llama, Qwen, DeepSeek and other leading models makes migration a copy-paste job
  • Pay-as-you-go billing. Chat, Multimodal, Language and Code models are priced per 1 million tokens, covering both input and output tokens; Image model pricing is based on image size and number of steps
  • 80% lower cost & 100% renewable. Our vertically-integrated stack slashes TCO versus hyperscalers while guaranteeing data privacy—requests are never logged or reused
  • $5 free credits to get started. Every new account includes starter credits so you can ship to production in minutes

Under the hood

  • API surface. Drop-in equivalents for GET /models, POST /chat/completions and POST /images, with optional stream: true for SSE (text/event-stream). Why it matters: migrate from OpenAI by changing only the base URL and key.
  • Model library. Launch set covers Meta Llama-4 Scout 17B, Qwen-3 235B, Mixtral-8×22B, DeepSeek-R1 distills, SD-XL 1.0 and more (text, code, vision). Why it matters: teams can A/B models or mix modalities without provisioning extra infra.
  • Elastic runtime. “Zero rate limits, no cold starts.” Traffic is sharded over thousands of MI300X/MI250X/H100 GPUs, spun up on demand by our orchestration layer. Why it matters: bursty workloads stay under 200 ms tail latency without over-allocating GPUs.
  • Cost model. Tokens in, tokens out, billed per 1M tokens; images billed per megapixel. Every account starts with $5 free credit. Why it matters: fine-grained, deterministic spend that is easy to embed in metered SaaS.
  • Security / privacy. End-to-end TLS, org-scoped API keys, full tenant isolation; we never log or train on user prompts or outputs. Why it matters: meets GDPR, HIPAA and most vendor-assessment checklists out of the box.
  • Sustainability. All compute runs in hydro-powered facilities; the vertical stack is 80% cheaper per token than hyperscalers. Why it matters: fewer carbon (and budget) emissions per request.

Quick start

curl -X POST \
  https://inference.api.nscale.com/v1/chat/completions \
  -H "Authorization: Bearer $NSCALE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-4-Scout-17B-Instruct",
        "messages": [{"role":"user","content":"Hello world"}],
        "stream": true
      }'