@ai_kit/core includes model-agnostic audio transcription support, compatible with any OpenAI-compatible endpoint (Scaleway Whisper large v3, OpenAI whisper-1, etc.).
Three public primitives
| Export | Role |
|---|---|
| createTranscriptionModel(config) | Creates a TranscriptionModelV3 provider |
| transcribe(options) | Standalone function: loads audio (path / URL / buffer), calls the model, returns the transcript |
| createTranscriptionTool(model, options?) | Returns an AI SDK tool() to attach directly to an Agent |
createTranscriptionModel
import { createTranscriptionModel } from "@ai_kit/core";

const whisperModel = createTranscriptionModel({
  modelId: "whisper-large-v3",
  apiKey: process.env.SCALEWAY_API_KEY!,
  baseURL: "https://api.scaleway.ai/v1",
  providerName: "scaleway", // optional, used in logs
});
Works with any OpenAI-compatible /audio/transcriptions endpoint that accepts response_format=verbose_json.
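For orientation, an OpenAI-style verbose_json response carries text, language, duration, and a segments array whose entries use start and end, both in seconds. The sketch below shows one way such a payload could be mapped onto the result shape documented on this page. The names VerboseJsonResponse, MappedTranscript, and mapVerboseJson are hypothetical, and this is an illustration under those assumptions rather than the library's actual implementation.

```typescript
// Hypothetical subset of an OpenAI-style verbose_json payload.
interface VerboseJsonSegment {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

interface VerboseJsonResponse {
  text: string;
  language?: string;
  duration?: number;
  segments?: VerboseJsonSegment[];
}

// Mirrors the fields of the result documented on this page.
interface MappedTranscript {
  text: string;
  segments: Array<{ text: string; startSecond: number; endSecond: number }>;
  language: string | undefined;
  durationInSeconds: number | undefined;
}

// Assumed mapping, for illustration only.
function mapVerboseJson(payload: VerboseJsonResponse): MappedTranscript {
  return {
    text: payload.text,
    // A payload without segments yields an empty array.
    segments: (payload.segments ?? []).map((s) => ({
      text: s.text,
      startSecond: s.start,
      endSecond: s.end,
    })),
    language: payload.language,
    durationInSeconds: payload.duration,
  };
}

// Sample payload (made-up data).
const samplePayload: VerboseJsonResponse = {
  text: "Bonjour. Comment allez-vous ?",
  language: "fr",
  duration: 3.2,
  segments: [
    { start: 0, end: 1.1, text: "Bonjour." },
    { start: 1.1, end: 3.2, text: "Comment allez-vous ?" },
  ],
};

const mappedTranscript = mapVerboseJson(samplePayload);
console.log(mappedTranscript.segments.length); // 2
```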
transcribe
import { transcribe } from "@ai_kit/core";

// From a file path
const result = await transcribe({
  model: whisperModel,
  audio: "/path/to/audio.wav",
  inputType: "path", // "path" | "url" | "buffer"; auto-detected if omitted
  language: "fr", // optional ISO-639-1 code
});

console.log(result.text);
// result.segments → [{ text, startSecond, endSecond }]
// result.language, result.durationInSeconds
audio accepts a file path, an http(s) URL, or a Buffer / Uint8Array. The inputType is auto-detected when omitted.
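To make the auto-detection rule concrete, here is a minimal sketch of how an input type could be inferred from the audio value. The helper detectInputType is hypothetical and the library's real detection logic may differ. The sketch assumes that any string not starting with http:// or https:// is a file path.

```typescript
type InputType = "path" | "url" | "buffer";

// Hypothetical helper: one plausible way to infer inputType from the value.
function detectInputType(audio: string | Uint8Array): InputType {
  // Node's Buffer extends Uint8Array, so this check covers both.
  if (audio instanceof Uint8Array) return "buffer";
  // Strings starting with http:// or https:// are treated as URLs.
  if (/^https?:\/\//i.test(audio)) return "url";
  // Any other string is assumed to be a file path.
  return "path";
}

console.log(detectInputType("/path/to/audio.wav")); // "path"
console.log(detectInputType("https://example.com/audio.mp3")); // "url"
console.log(detectInputType(new Uint8Array([0x52, 0x49]))); // "buffer"
```

Passing inputType explicitly remains the safer choice when a string could be ambiguous.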
Return value
interface TranscribeResult {
  text: string;
  segments: Array<{ text: string; startSecond: number; endSecond: number }>;
  language: string | undefined;
  durationInSeconds: number | undefined;
}
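As a usage illustration, the segments array can be rendered as timestamped lines. The sketch below relies only on the segment fields shown in the interface above. The helpers formatTimestamp and formatSegments are hypothetical and not part of @ai_kit/core.

```typescript
// Segment shape as documented in TranscribeResult.
interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

// Format seconds as mm:ss.cc (minutes, seconds, hundredths).
function formatTimestamp(totalSeconds: number): string {
  const hundredths = Math.round(totalSeconds * 100);
  const minutes = Math.floor(hundredths / 6000);
  const seconds = Math.floor((hundredths % 6000) / 100);
  const rest = hundredths % 100;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return `${pad(minutes)}:${pad(seconds)}.${pad(rest)}`;
}

// One line per segment: "[start - end] text".
function formatSegments(segments: Segment[]): string[] {
  return segments.map(
    (s) =>
      `[${formatTimestamp(s.startSecond)} - ${formatTimestamp(s.endSecond)}] ${s.text.trim()}`
  );
}

// Made-up sample data.
const timestampedLines = formatSegments([
  { text: " Bonjour.", startSecond: 0, endSecond: 1.1 },
  { text: " Comment allez-vous ?", startSecond: 1.1, endSecond: 3.2 },
]);

console.log(timestampedLines.join("\n"));
// [00:00.00 - 00:01.10] Bonjour.
// [00:01.10 - 00:03.20] Comment allez-vous ?
```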
createTranscriptionTool
import {
  Agent,
  createTranscriptionModel,
  createTranscriptionTool,
  scaleway,
} from "@ai_kit/core";

const whisperModel = createTranscriptionModel({
  modelId: "whisper-large-v3",
  apiKey: process.env.SCALEWAY_API_KEY!,
  baseURL: "https://api.scaleway.ai/v1",
});

const agent = new Agent({
  name: "medical-assistant",
  model: scaleway("gpt-oss-120b"),
  tools: {
    // The LLM can call this tool whenever it needs a transcript
    transcribeAudio: createTranscriptionTool(whisperModel, {
      description: "Transcribe a medical audio recording to text",
    }),
  },
});

const result = await agent.generate({
  prompt: "Transcribe this file: /recordings/consultation.mp3",
});
The tool schema exposed to the LLM has three fields: audio (a file path, a URL, or a base64 string), inputType, and language.
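To show what arguments matching this schema might look like, the sketch below declares a hypothetical TranscriptionToolInput type and a small runtime guard. Only the three field names come from the text above. The optionality of inputType and language, the two-letter language check, and the names TranscriptionToolInput and isTranscriptionToolInput are assumptions made for illustration.

```typescript
// Hypothetical argument shape, inferred from the field list above.
interface TranscriptionToolInput {
  audio: string; // file path, http(s) URL, or base64-encoded audio
  inputType?: string; // optional hint; allowed values are not listed here
  language?: string; // assumed to be an optional ISO-639-1 code, e.g. "fr"
}

// Runtime guard for arguments produced by the LLM (illustrative only).
function isTranscriptionToolInput(value: unknown): value is TranscriptionToolInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { [key: string]: unknown };
  const audio = v["audio"];
  const inputType = v["inputType"];
  const language = v["language"];
  // audio is required and must be a non-empty string.
  if (typeof audio !== "string" || audio.length === 0) return false;
  // inputType, when present, must be a string.
  if (inputType !== undefined && typeof inputType !== "string") return false;
  // language, when present, must look like a two-letter lowercase code.
  if (language !== undefined) {
    if (typeof language !== "string" || !/^[a-z]{2}$/.test(language)) return false;
  }
  return true;
}

// Arguments a model might emit for the prompt in the agent example.
const exampleArgs: unknown = {
  audio: "/recordings/consultation.mp3",
  inputType: "path",
};

console.log(isTranscriptionToolInput(exampleArgs)); // true
console.log(isTranscriptionToolInput({ language: "fr" })); // false (audio missing)
```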
Supported audio formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm (identical to OpenAI Whisper).
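The format list above can be turned into a quick pre-flight check before sending a file for transcription. This is a minimal sketch: hasSupportedExtension is a hypothetical helper, it inspects only the file extension (not the actual audio content), and the library may validate formats differently or not at all.

```typescript
// Extensions listed on this page.
const SUPPORTED_EXTENSIONS = [
  "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
];

// Hypothetical pre-flight check for a file path or URL.
function hasSupportedExtension(source: string): boolean {
  // Drop any query string or fragment, then take the text after the last dot.
  const clean = source.replace(/[?#].*$/, "");
  const dot = clean.lastIndexOf(".");
  if (dot === -1 || dot === clean.length - 1) return false;
  const ext = clean.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(ext) !== -1;
}

console.log(hasSupportedExtension("/recordings/consultation.mp3")); // true
console.log(hasSupportedExtension("https://example.com/a.WAV?token=1")); // true
console.log(hasSupportedExtension("notes.txt")); // false
```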