Overview
Welcome to the Nscale docs. Discover how to get started and unlock the full potential of Nscale.
Optimising AI performance with Nscale
We make it easy to integrate AI into your applications. Nscale is a high-performance compute platform designed to simplify AI workloads, from fine-tuning to inference. It provides on-demand AI services that enable you to fine-tune, evaluate, deploy, and run AI models at scale—without the complexity of managing infrastructure. Built on Nscale’s powerful compute technology, these services are accessible in a self-serve, prepaid model. Whether you’re optimising models for production or running large-scale inference, Nscale delivers the flexibility and performance needed to power AI-driven applications.
Quick links
Quickstart
Learn how to get started with Nscale
Serverless Inference
Get started with Serverless Inference
Contact Support
Get help from our support team
Get GPU Clusters
Slurm, Kubernetes and Bare metal clusters
Nscale’s capabilities:
-
Serverless inference – Run popular LLMs effortlessly with our API, without worrying about infrastructure management.
-
Fine-tuning (Coming soon) – Customise open-source models with fine-tuning and deploy them on a dedicated endpoint for high-performance inference.
-
Evaluation (Coming soon) – Evaluate model performance with tools designed to meet your specific requirements.
-
GPU clusters – Accelerate your workflow with cutting-edge infrastructure like NKS (Nscale Kubernetes Service), Slurm, or bare metal machines. Contact sales to learn more.
Quickstart
Create an account
Head to console.nscale.com and create an account
Add credit to your account
From the dashboard, add a minimum of $5 of credit to start using our service
Obtain your API key
Call your first endpoint
Call the inference endpoint with your API key
Head to the quickstart page for a more detailed view on how to get started with Nscale.
Get help
Contact Support
For technical issues or assistance with our platform.