The `scaleway` provider targets the OpenAI-compatible API exposed by Scaleway AI. It can be used with AI Kit agents and workflows.
Installation
Configuration
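A minimal sketch of one way to configure access, assuming the API key is supplied through an environment variable and requests go to Scaleway's OpenAI-compatible chat completions endpoint. The variable names `SCW_SECRET_KEY` and `SCW_BASE_URL`, the default base URL, and the helpers `resolveConfig` and `buildChatRequest` are assumptions for illustration; they are not the AI Kit provider's own API.

```typescript
// Hypothetical helper types; not part of AI Kit.
interface ScalewayConfig {
  apiKey: string;
  baseURL: string;
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumed default endpoint for Scaleway's OpenAI-compatible API.
const DEFAULT_BASE_URL = "https://api.scaleway.ai/v1";

// Reads settings from an env-like object (for example process.env) and fails
// fast when the key is missing, so a misconfiguration surfaces at startup.
function resolveConfig(env: Record<string, string | undefined>): ScalewayConfig {
  const apiKey = env["SCW_SECRET_KEY"]; // assumed variable name
  if (!apiKey) {
    throw new Error("SCW_SECRET_KEY is not set");
  }
  // Strip trailing slashes so path joining below stays predictable.
  const baseURL = (env["SCW_BASE_URL"] ?? DEFAULT_BASE_URL).replace(/\/+$/, "");
  return { apiKey, baseURL };
}

// Builds an OpenAI-style chat completion request without sending it.
function buildChatRequest(config: ScalewayConfig, model: string, prompt: string): ChatRequest {
  return {
    url: `${config.baseURL}/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${config.apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}
```

The returned `url`, `headers`, and `body` can be handed to any HTTP client. Keeping the builder free of network calls also makes it easy to unit test.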
Use with an agent
Available models
| Model | Description | Size |
|---|---|---|
| `gpt-oss-120b` | High-performing general-purpose model | 120B |
| `llama-3.3-70b-instruct` | Meta Llama 3.3, instruction-tuned | 70B |
| `llama-3.1-8b-instruct` | Compact Llama 3.1 | 8B |
| `mistral-small-3.2-24b-instruct-2506` | Latest Mistral Small | 24B |
| `mistral-nemo-instruct-2407` | Mistral Nemo, instruction-tuned | 12B |
| `qwen3-235b-a22b-instruct-2507` | Large Qwen 3 | 235B |
| `qwen3-coder-30b-a3b-instruct` | Qwen 3 specialised for code | 30B |
| `deepseek-r1-distill-llama-70b` | Distilled DeepSeek R1 | 70B |
| `gemma-3-27b-it` | Google Gemma 3, instruction-tuned | 27B |
| `voxtral-small-24b-2507` | Voxtral Small | 24B |
| `devstral-small-2505` | Devstral for development use cases | 24B |
| `pixtral-12b-2409` | Multimodal Pixtral | 12B |
Examples
Structured output
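A minimal sketch of requesting structured output, assuming the endpoint accepts the OpenAI-style `response_format` field with a JSON schema. The `Recipe` shape and the helpers `buildStructuredBody` and `parseRecipe` are hypothetical, and the body targets the raw chat completions API rather than any AI Kit helper.

```typescript
// Hypothetical target shape for the model's JSON reply.
interface Recipe {
  title: string;
  ingredients: string[];
}

// JSON Schema describing Recipe, sent alongside the prompt.
const recipeSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    ingredients: { type: "array", items: { type: "string" } },
  },
  required: ["title", "ingredients"],
  additionalProperties: false,
};

// Builds an OpenAI-style request body that asks for schema-constrained JSON.
function buildStructuredBody(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    response_format: {
      type: "json_schema",
      json_schema: { name: "recipe", schema: recipeSchema, strict: true },
    },
  };
}

// Validates the reply at runtime. The schema constrains generation, but
// checking again guards against truncated or malformed output.
function parseRecipe(raw: string): Recipe {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Reply is not a JSON object");
  }
  const { title, ingredients } = data as Record<string, unknown>;
  if (typeof title !== "string") {
    throw new Error("Missing or invalid title");
  }
  if (!Array.isArray(ingredients) || !ingredients.every((i) => typeof i === "string")) {
    throw new Error("Missing or invalid ingredients");
  }
  return { title, ingredients: ingredients as string[] };
}
```

Validating on the client side is a design choice here: it keeps downstream code typed even when a reply is cut short by an output token limit.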
Inside a workflow
Streaming
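A minimal sketch of consuming a streamed reply, assuming the endpoint emits OpenAI-style server-sent events: `data:` lines carrying JSON chunks with `choices[0].delta.content`, terminated by `data: [DONE]`. The `SseTextAccumulator` class is a hypothetical helper, not part of AI Kit. It only handles text deltas and buffers partial lines, because network chunks can split a line at any point.

```typescript
// Shape of the fields read from each streamed chunk (assumed OpenAI-style).
type StreamEvent = { choices?: { delta?: { content?: string | null } }[] };

class SseTextAccumulator {
  private buffer = "";
  text = "";
  done = false;

  // Feed decoded text as it arrives from the response body.
  push(chunk: string): void {
    if (this.done) return; // ignore anything after the end-of-stream sentinel
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line for later
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
      const payload = trimmed.slice("data:".length).trim();
      if (payload === "[DONE]") {
        this.done = true;
        return;
      }
      const event = JSON.parse(payload) as StreamEvent;
      this.text += event.choices?.[0]?.delta?.content ?? "";
    }
  }
}
```

In practice, each chunk read from the HTTP response would be decoded to a string and passed to `push`, with `text` rendered incrementally until `done` becomes true.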
Model selection tips
- Code generation: `qwen3-coder-30b-a3b-instruct`, `devstral-small-2505`.
- General-purpose tasks: `gpt-oss-120b`, `llama-3.3-70b-instruct`.
- Lightweight workloads: `llama-3.1-8b-instruct`, `mistral-nemo-instruct-2407`.
- Reasoning-heavy tasks: `deepseek-r1-distill-llama-70b`, `qwen3-235b-a22b-instruct-2507`.
- Multimodal scenarios: `pixtral-12b-2409`.
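The tips above can be encoded as a small lookup. This is a sketch only: the `Task` names, the `pickModel` helper, and the choice to treat the first listed model as the preferred one are assumptions for illustration.

```typescript
type Task = "code" | "general" | "lightweight" | "reasoning" | "multimodal";

// Candidate models per task, in the order they appear in the tips.
const MODELS_BY_TASK: Record<Task, readonly string[]> = {
  code: ["qwen3-coder-30b-a3b-instruct", "devstral-small-2505"],
  general: ["gpt-oss-120b", "llama-3.3-70b-instruct"],
  lightweight: ["llama-3.1-8b-instruct", "mistral-nemo-instruct-2407"],
  reasoning: ["deepseek-r1-distill-llama-70b", "qwen3-235b-a22b-instruct-2507"],
  multimodal: ["pixtral-12b-2409"],
};

// Returns the first candidate for the task that is not marked unavailable.
function pickModel(task: Task, unavailable: ReadonlySet<string> = new Set()): string {
  const choice = MODELS_BY_TASK[task].find((model) => !unavailable.has(model));
  if (choice === undefined) {
    throw new Error(`No available model for task "${task}"`);
  }
  return choice;
}
```

For example, `pickModel("code")` yields `qwen3-coder-30b-a3b-instruct`, and passing a set containing that name falls back to `devstral-small-2505`.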
Best practices
- Security – store API keys in a secret manager, never commit them.
- Error handling – wrap calls in `try/catch` and log failures.
- Cost control – set `maxOutputTokens` and monitor usage.
- Temperature – pick the right creativity level (`0.0-0.3` precise, `0.4-0.7` balanced, `0.8+` creative).
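A minimal sketch of how the error handling, cost control, and temperature practices might look in code. The helper names `buildSettings` and `safeCall`, the representative temperature picked inside each band, and the default cap of 512 output tokens are assumptions for illustration; the model call itself is passed in as a function, so no particular client API is implied.

```typescript
type Creativity = "precise" | "balanced" | "creative";

// One representative value inside each band listed above (assumed picks).
const TEMPERATURE: Record<Creativity, number> = {
  precise: 0.2,
  balanced: 0.5,
  creative: 0.9,
};

interface GenerationSettings {
  temperature: number;
  maxOutputTokens: number;
}

// Builds settings with an explicit cap on output tokens to bound cost.
function buildSettings(creativity: Creativity, maxOutputTokens = 512): GenerationSettings {
  if (!Number.isInteger(maxOutputTokens) || maxOutputTokens <= 0) {
    throw new RangeError("maxOutputTokens must be a positive integer");
  }
  return { temperature: TEMPERATURE[creativity], maxOutputTokens };
}

// Wraps any model call in try/catch, logs the failure, and returns a tagged
// result so callers handle errors explicitly instead of crashing.
async function safeCall<T>(
  label: string,
  call: () => Promise<T>,
  log: (message: string) => void = console.error,
): Promise<{ ok: true; value: T } | { ok: false; error: Error }> {
  try {
    return { ok: true, value: await call() };
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    log(`[${label}] model call failed: ${error.message}`);
    return { ok: false, error };
  }
}
```

Returning a tagged result rather than rethrowing is one option among several; it suits workflows where a failed step should be recorded and the run should continue.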