Overview

The Nscale Playground is a web interface that lets you experiment with and compare any AI model supported on Nscale. With the Playground, you can quickly switch between models, tweak inference parameters and, where model capabilities allow, upload images for multimodal tasks. It's a lightweight, self-serve environment for prototyping before you integrate a model via the API.

Playground capabilities

  • Model Comparison

    Test any Nscale-hosted AI model side by side to evaluate output quality and latency.

  • Parameter Tuning

    Adjust common inference settings—temperature, top-P, max tokens, presence penalty, frequency penalty—to see how each change affects results.

  • File Upload (Image-text-to-text)

    For models with vision capabilities, upload image files (JPG, PNG, or SVG, up to 2 MB) to generate text outputs.

  • Cost Estimates & Billing

    View estimated cost per request based on your prompt length and chosen model. Your usage is deducted from your prepaid Nscale balance instantly.
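
As a rough illustration of how a per-request estimate is put together, the sketch below multiplies token counts by per-token prices. The prices used are hypothetical; the real per-token prices for each model are shown in the Playground and on the model information page.

    # Hypothetical prices for illustration only; check each model's real
    # per-token prices in the Playground or on its information page.
    input_price_per_million = 0.20    # USD per 1M input tokens (hypothetical)
    output_price_per_million = 0.60   # USD per 1M output tokens (hypothetical)

    input_tokens = 1_200              # length of the prompt
    output_tokens = 400               # length of the generated response

    cost = (input_tokens / 1_000_000) * input_price_per_million \
         + (output_tokens / 1_000_000) * output_price_per_million
    print(f"Estimated cost: ${cost:.6f}")   # Estimated cost: $0.000480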

Getting started

  • Log In & Open Playground

    • Sign in to your Nscale account and click Playground in the sidebar.

  • Check Your Balance

    • Confirm you have enough credit (top up via Add Credit if needed).

  • Pick a Model & Set Parameters

    • Use the dropdown to choose one of Nscale’s models.

    • Adjust settings (temperature, top-P, max tokens, etc.) as needed.

  • Enter Prompt (and optionally upload an image)

    • Type your text into the input panel.

    • If the selected model accepts images, click Add to attach a file (JPG, PNG, or SVG, ≤ 2 MB).

  • Generate & Review

    • Click Generate to run inference, and experiment!
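
Once a prompt and settings produce the results you want, you can reproduce the same request against the API. The sketch below is illustrative only: it assumes an OpenAI-compatible chat completions endpoint, and the base URL and model ID are placeholders, so check the Nscale API documentation for the exact values.

    # Illustrative sketch, not official Nscale sample code. The base URL and
    # model ID below are placeholders; confirm both in the Nscale API docs.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://inference.api.nscale.com/v1",  # assumed endpoint
        api_key="YOUR_NSCALE_API_KEY",
    )

    response = client.chat.completions.create(
        model="example/model-id",   # placeholder: any Nscale-hosted chat model
        messages=[{"role": "user", "content": "Explain vector databases in three sentences."}],
        temperature=0.7,            # randomness
        top_p=0.9,                  # nucleus sampling threshold
        max_tokens=256,             # cap on generated tokens
    )
    print(response.choices[0].message.content)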

Parameters

Text-to-text & image-text-to-text models

Temperature controls randomness: lower values make outputs more deterministic, while higher values make them more creative.

Top P or nucleus sampling controls the diversity of text generation by setting a probability threshold. The model considers only the most likely tokens whose cumulative probability reaches this threshold. Lower values (e.g., 0.1) make outputs more focused and deterministic, while higher values (e.g., 0.9) allow for more creative and varied responses.

Max tokens sets the maximum number of tokens (words, punctuation, etc.) that the model can generate in a single response. When this limit is reached, the output will be cut off at that point, potentially mid-sentence.

Presence penalty is a number between -2.0 and 2.0. Positive values penalise new tokens based on whether they have already appeared in the conversation, encouraging exploration of new subjects and ideas. Negative values increase the likelihood of staying on familiar topics.

Frequency penalty is a number between -2.0 and 2.0. Positive values penalise new tokens based on how many times they have appeared in the conversation, helping to avoid repetitive language and encourage varied vocabulary. Negative values make repetition more likely.

For moderate repetition reduction, use penalty values between 0.1 and 1.0. To aggressively minimise repetition, increase values up to 2.0, though this may compromise output quality. Use negative values to encourage repetition when desired.
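
These parameters apply to image-text-to-text models as well as text-to-text models. As a rough sketch of how they come together outside the Playground, the example below sends an image alongside a text prompt and sets mild positive penalties. It assumes an OpenAI-style multimodal message format; the endpoint and model ID are placeholders to be checked against the Nscale API documentation.

    # Illustrative sketch, not official Nscale sample code. Endpoint, model ID
    # and message format are assumptions; check the Nscale API documentation.
    import base64
    from openai import OpenAI

    client = OpenAI(
        base_url="https://inference.api.nscale.com/v1",  # assumed endpoint
        api_key="YOUR_NSCALE_API_KEY",
    )

    # Attach a local image (JPG or PNG, kept under the 2 MB Playground limit).
    with open("chart.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="example/vision-model-id",   # placeholder: a model with vision capabilities
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the main trend in this chart."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
        temperature=0.3,
        max_tokens=300,
        presence_penalty=0.4,    # gently push towards new topics
        frequency_penalty=0.4,   # gently discourage repeated wording
    )
    print(response.choices[0].message.content)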

Image models

Width determines the width of the generated image, in pixels.

Height determines the height of the generated image, in pixels.
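
Outside the Playground, width and height would typically be passed as request parameters. The sketch below is illustrative only: the endpoint path and payload field names are assumptions rather than confirmed Nscale API fields, so verify them against the API reference before use.

    # Illustrative only: the endpoint path and payload field names below are
    # assumptions, not confirmed Nscale API fields; check the API reference.
    import requests

    resp = requests.post(
        "https://inference.api.nscale.com/v1/images/generations",  # assumed path
        headers={"Authorization": "Bearer YOUR_NSCALE_API_KEY"},
        json={
            "model": "example/image-model-id",   # placeholder model ID
            "prompt": "A watercolour painting of a lighthouse at dusk",
            "width": 1024,                       # pixels, as in the Playground
            "height": 768,                       # pixels, as in the Playground
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json())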

Billing

  • Ensure you have credit before running inference; each request deducts credit in proportion to the number of input/output tokens or pixels processed.
  • To add credit, go to Billing, choose an amount, and pay; your balance updates instantly.
  • You can track inference spending in the Billing section; Playground usage and API inference usage are aggregated.
  • Model prices are displayed in the Playground and on the model information page.
  • If your balance runs out, you’ll be prompted to top up before running another request.

Contact Support

Need assistance? Get help from our support team.