Aliases: inferencing, infer

Subcommands


list-models

Returns a list of all models available on the serverless platform.
nscale inferencing list-models [flags]

Flags

Flag                       Description
--json                     Emit the full JSON payload (mutually exclusive with -q)
-q, --query stringArray    jq filter for value extraction (see Query output with -q)

Example

nscale inferencing list-models --json
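
Because --json and -q cannot be combined, one option for post-processing the full payload is to pipe the --json output into a standalone jq. The following is a minimal sketch, not a documented workflow: it assumes nscale and jq are both on PATH and that --json writes the payload to stdout. The helper name have_cmd is hypothetical, and the identity filter '.' is used because the payload's field names are not described here.

```shell
#!/bin/sh
# Sketch: post-process the full payload outside the CLI.
# Assumptions: `nscale` and `jq` are on PATH, and --json writes to stdout.

# have_cmd NAME: succeed if NAME resolves to a command on PATH (hypothetical helper).
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd nscale && have_cmd jq; then
  # '.' is jq's identity filter: it pretty-prints the payload unchanged.
  nscale inferencing list-models --json | jq '.'
else
  echo "nscale or jq not found on PATH; skipping" >&2
fi
```

Replace '.' with a more specific jq filter once you have inspected the payload structure.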

list-endpoints

Returns a list of all model endpoints available for use by the specified organization.
nscale inferencing list-endpoints [flags]

Flags

Flag                       Description
--org string               Organization ID
--json                     Emit the full JSON payload (mutually exclusive with -q)
-q, --query stringArray    jq filter for value extraction (see Query output with -q)

Example

nscale inferencing list-endpoints --org <org-id>

chat

Send a chat completion request to the inference API using a configuration file. Supports both batch and interactive modes.
nscale inferencing chat [flags]

Flags

Flag                 Description
--config string      Path to a chat configuration file (JSON)
--messages string    Path to a JSON+LD file containing additional messages
--ui                 Launch interactive chat TUI

Reasoning content

When you use the interactive TUI (--ui) with a model that supports reasoning, the model’s thought process appears in a separate “Thought Process” bubble above the response. The reasoning streams live as the model works through the problem, and the final answer appears in the standard response bubble once reasoning is complete. This gives you visibility into how the model arrives at its answer without cluttering the final response.

Examples

# Send a chat completion using a config file
nscale inferencing chat --config chat-config.json

# Launch the interactive chat TUI
nscale inferencing chat --config chat-config.json --ui

# Include additional messages from a file
nscale inferencing chat --config chat-config.json --messages extra-messages.json
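
Since --config and --messages both take paths to JSON files, a malformed file is a likely cause of failure. The sketch below checks that each file parses as JSON before calling the chat command. The helper names is_json_file and chat_checked are hypothetical, and the check assumes either jq or python3 is installed. It only verifies JSON syntax; it does not validate the files against any schema, which is not described here.

```shell
#!/bin/sh
# is_json_file PATH: succeed if PATH exists and parses as JSON.
# Hypothetical helper; uses jq when available, else python3's json.tool.
is_json_file() {
  [ -f "$1" ] || return 1
  if command -v jq >/dev/null 2>&1; then
    jq . "$1" >/dev/null 2>&1
  else
    python3 -m json.tool "$1" >/dev/null 2>&1
  fi
}

# chat_checked CONFIG [MESSAGES]: validate the files, then run the chat command.
chat_checked() {
  is_json_file "$1" || { echo "not valid JSON: $1" >&2; return 1; }
  if [ -n "${2:-}" ]; then
    is_json_file "$2" || { echo "not valid JSON: $2" >&2; return 1; }
    nscale inferencing chat --config "$1" --messages "$2"
  else
    nscale inferencing chat --config "$1"
  fi
}
```

For example, chat_checked chat-config.json extra-messages.json runs the same command as the last example above, but stops early with an error message if either file is not valid JSON.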

Models

Learn about available models on the Nscale platform.

Chat Use Case

End-to-end guide for chat inferencing.