Rate Limits
Rate limits define the maximum number of requests a user can make to Nscale’s serverless inference service within a given time frame.
Rate limits are applied to ensure efficient use of resources, maintain system stability, and provide fair access to all users. These limits may vary based on the type of model, your subscription plan, or specific API endpoints.
Purpose of rate limits
The implementation of rate limits serves several critical purposes:
- Protecting resources: Rate limits prevent resource exhaustion by ensuring that no single user or process monopolises system resources. This is especially important in serverless environments where scaling is automatic but not free.
- Ensuring fair access: By capping the number of requests per user or API key, rate limits ensure equitable access to services for all users.
- Preventing abuse: They act as a safeguard against malicious activities such as Distributed Denial of Service (DDoS) attacks or brute force attempts.
- Cost management: Rate limits help control operational costs by preventing runaway resource consumption due to bugs or heavy traffic spikes.
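To make the mechanics concrete, here is a minimal sketch of a token bucket, one common way request caps like those above are implemented. This is a generic illustration, not Nscale's implementation: each request spends one token, and tokens refill at a fixed rate up to a burst capacity.

```python
import time

class TokenBucket:
    """Generic token-bucket limiter: refills `rate` tokens/second, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic()

    def allow(self) -> bool:
        """Refill tokens for the elapsed time, then try to spend one."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A bucket allowing a sustained 5 requests/second with a burst capacity of 10:
bucket = TokenBucket(rate=5, capacity=10)
allowed = sum(bucket.allow() for _ in range(20))  # roughly the burst capacity passes at once
```

A burst of 20 back-to-back calls drains the bucket after about the first 10; later requests succeed again as tokens refill over time.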
How we enforce rate limits
Nscale does not enforce rate limits for serverless inference, so you can scale dynamically without artificial constraints. Your workload is limited only by your allocated resources, which keeps performance consistent even under high demand.
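Because the service does not throttle you, cost control becomes a client-side concern. One simple pattern is to cap the number of in-flight requests with a semaphore. This is a sketch under assumptions: `call_inference` is a hypothetical stand-in for your actual client call, and the limit of 8 is arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def call_inference(prompt: str) -> str:
    # Hypothetical placeholder; substitute your real inference client call here.
    return f"completion for: {prompt}"

# Cap concurrent requests at 8 so a bug or traffic spike cannot run up costs.
max_in_flight = threading.Semaphore(8)

def guarded_call(prompt: str) -> str:
    with max_in_flight:  # blocks until a slot is free
        return call_inference(prompt)

with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(guarded_call, [f"req-{i}" for i in range(100)]))
```

All 100 requests complete, but never more than 8 run against the service at once, giving you a predictable upper bound on concurrent spend.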