Generate
POST /generate
curl --request POST \
  --url https://api.example.com/generate \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "<string>",
  "model": "<string>",
  "max_tokens": 123,
  "temperature": 0.7
}
'
{
  "id": "<string>",
  "node": "<string>",
  "output": "<string>",
  "usage": {}
}
Send a prompt to an eligible worker node and receive generated output plus usage metadata.

Request body

prompt
string
required
Input text for inference.
model
string
Model identifier (for example tinyllama).
max_tokens
integer
Maximum output tokens. Default is 256.
temperature
number
Sampling temperature between 0.0 and 2.0. Default is 0.7.

Example request

{
  "prompt": "Explain quantum tunneling in one paragraph.",
  "model": "tinyllama",
  "max_tokens": 256,
  "temperature": 0.7
}
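The request body above can also be assembled programmatically. The sketch below builds the JSON body with the documented defaults (max_tokens 256, temperature 0.7) and posts it with Python's standard library; the endpoint URL comes from the curl example, while the helper names and the omission of authentication headers are illustrative assumptions, not part of the API.

```python
import json
import urllib.request

API_URL = "https://api.example.com/generate"  # from the curl example above


def build_generate_request(prompt, model=None, max_tokens=256, temperature=0.7):
    """Assemble the request body; only `prompt` is required.

    Defaults mirror the documented ones (max_tokens 256, temperature 0.7).
    """
    if not prompt:
        raise ValueError("prompt is required")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    body = {"prompt": prompt, "max_tokens": max_tokens, "temperature": temperature}
    if model is not None:
        body["model"] = model
    return body


def generate(prompt, **kwargs):
    """POST the body and return the decoded JSON response (auth not shown)."""
    data = json.dumps(build_generate_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Omitting `model` lets the service choose; the 400 error below covers malformed parameters that slip past client-side checks.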

Success response

id
string
Unique job ID for this generation request.
node
string
Worker node that executed the inference.
output
string
Generated completion text.
usage
object
Token and credit usage metadata for the request.

Response example

{
  "id": "job_abc123",
  "node": "node-9006",
  "usage": {
    "tokens": 142,
    "credits_deducted": 1.42
  },
  "output": "Quantum tunneling occurs when..."
}
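The documented success fields can be read out of the decoded response with plain dictionary access. A minimal sketch (the helper name is hypothetical, and the `usage` keys are taken from the example above rather than a documented schema, so they are accessed defensively):

```python
def summarize_response(resp):
    """Extract the documented success fields from a decoded /generate response."""
    usage = resp.get("usage", {})  # usage keys are not formally documented
    return {
        "id": resp["id"],            # unique job ID
        "node": resp["node"],        # worker node that ran the inference
        "output": resp["output"],    # generated completion text
        "tokens": usage.get("tokens"),
        "credits": usage.get("credits_deducted"),
    }


# The response example from this page:
example = {
    "id": "job_abc123",
    "node": "node-9006",
    "usage": {"tokens": 142, "credits_deducted": 1.42},
    "output": "Quantum tunneling occurs when...",
}
```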

Error responses

Status  Meaning                       Retry
400     Invalid request parameters.   No
401     Missing or invalid API key.   No
402     Insufficient credits.         No
429     Rate limited.                 Yes
503     No healthy nodes available.   Yes
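The Retry column above can drive client-side retry logic: only 429 and 503 are worth retrying, since the other statuses indicate a problem the client must fix first. A minimal sketch with exponential backoff; the function names and backoff parameters are illustrative choices, not part of the API.

```python
import time

RETRYABLE = {429, 503}  # per the error table: rate limited, no healthy nodes


def should_retry(status):
    """Return True only for statuses the error table marks as retryable."""
    return status in RETRYABLE


def call_with_retry(do_request, max_attempts=4, base_delay=0.5):
    """Call do_request() until success or a non-retryable failure.

    do_request must return (status, body); delays grow exponentially
    (base_delay, 2*base_delay, 4*base_delay, ...).
    """
    for attempt in range(max_attempts):
        status, body = do_request()
        if status < 400:
            return body
        if not should_retry(status) or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with status {status}")
        time.sleep(base_delay * (2 ** attempt))
```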
Last modified on February 21, 2026