@ai_kit/core includes model-agnostic audio transcription support that works with any OpenAI-compatible endpoint (for example Scaleway Whisper large v3 or OpenAI whisper-1).

Three public primitives

Export | Role
createTranscriptionModel(config) | Creates a TranscriptionModelV3 provider
transcribe(options) | Standalone function: loads audio (path / URL / buffer), calls the model, returns the transcript
createTranscriptionTool(model, options?) | Returns an AI SDK tool() to attach directly to an Agent

createTranscriptionModel

import { createTranscriptionModel } from "@ai_kit/core";

const whisperModel = createTranscriptionModel({
  modelId: "whisper-large-v3",
  apiKey: process.env.SCALEWAY_API_KEY!,
  baseURL: "https://api.scaleway.ai/v1",
  providerName: "scaleway", // optional, used in logs
});
createTranscriptionModel supports any OpenAI-compatible /audio/transcriptions endpoint that accepts response_format=verbose_json.
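For readers unfamiliar with the endpoint contract, the following is a rough sketch of what an OpenAI-compatible transcription call looks like on the wire, and how a verbose_json response could be mapped onto the result shape documented on this page. It is not the library's actual implementation: buildTranscriptionForm, requestTranscription, and toTranscribeResult are hypothetical helper names, and only the form fields (file, model, response_format, language) and the response fields (text, language, duration, segments with start / end / text) come from the OpenAI-compatible API.

```typescript
// Illustrative sketch only: helper names are hypothetical, not part of @ai_kit/core.
interface VerboseJsonResponse {
  text: string;
  language?: string;
  duration?: number;
  segments?: Array<{ start: number; end: number; text: string }>;
}

interface TranscribeResult {
  text: string;
  segments: Array<{ text: string; startSecond: number; endSecond: number }>;
  language: string | undefined;
  durationInSeconds: number | undefined;
}

// Builds the multipart body expected by POST {baseURL}/audio/transcriptions.
function buildTranscriptionForm(
  audio: Blob,
  fileName: string,
  modelId: string,
  language?: string,
): FormData {
  const form = new FormData();
  form.append("file", audio, fileName);
  form.append("model", modelId);
  form.append("response_format", "verbose_json"); // asks for segments and duration
  if (language) form.append("language", language);
  return form;
}

// Sends the request with a Bearer token and returns the parsed JSON body.
async function requestTranscription(
  baseURL: string,
  apiKey: string,
  form: FormData,
): Promise<VerboseJsonResponse> {
  const response = await fetch(`${baseURL}/audio/transcriptions`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Transcription request failed with status ${response.status}`);
  }
  return (await response.json()) as VerboseJsonResponse;
}

// Maps verbose_json field names onto the result shape used on this page.
function toTranscribeResult(raw: VerboseJsonResponse): TranscribeResult {
  return {
    text: raw.text,
    segments: (raw.segments ?? []).map((s) => ({
      text: s.text,
      startSecond: s.start,
      endSecond: s.end,
    })),
    language: raw.language,
    durationInSeconds: raw.duration,
  };
}
```

An endpoint that ignores response_format would return no segments, which is why the mapping falls back to an empty array.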

transcribe

import { transcribe } from "@ai_kit/core";

// From a file path
const result = await transcribe({
  model: whisperModel,
  audio: "/path/to/audio.wav",
  inputType: "path",         // "path" | "url" | "buffer" — auto-detected if omitted
  language: "fr",            // optional ISO-639-1 code
});

console.log(result.text);
// result.segments → [{ text, startSecond, endSecond }]
// result.language, result.durationInSeconds
audio accepts a file path, an http(s) URL, or a Buffer / Uint8Array. The inputType is auto-detected when omitted.
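The page does not describe how auto-detection works, so the following is only a plausible sketch of the idea: binary input is treated as a buffer, strings starting with http:// or https:// as URLs, and every other string as a file path. detectInputType is a hypothetical name, and the real rules in @ai_kit/core may differ.

```typescript
type InputType = "path" | "url" | "buffer";

// Hypothetical detection rule, written for illustration only.
function detectInputType(audio: string | Uint8Array): InputType {
  // Node's Buffer is a Uint8Array subclass, so it also lands here.
  if (typeof audio !== "string") return "buffer";
  // Anything that looks like an http(s) URL is fetched; the rest is read from disk.
  return /^https?:\/\//i.test(audio) ? "url" : "path";
}
```

Passing inputType explicitly avoids any ambiguity, for instance when a relative file path happens to resemble a URL.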

Return value

interface TranscribeResult {
  text: string;
  segments: Array<{ text: string; startSecond: number; endSecond: number }>;
  language: string | undefined;
  durationInSeconds: number | undefined;
}
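As a usage illustration, the segments array can be turned into a timestamped transcript. The sketch below relies only on the segment fields shown in the interface; formatTimestamp and formatSegments are hypothetical helpers, not library exports.

```typescript
interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

// Renders a number of seconds as mm:ss, truncating fractional seconds.
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const mm = String(Math.floor(total / 60)).padStart(2, "0");
  const ss = String(total % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Produces one "[start - end] text" line per segment.
function formatSegments(segments: Segment[]): string {
  return segments
    .map(
      (s) =>
        `[${formatTimestamp(s.startSecond)} - ${formatTimestamp(s.endSecond)}] ${s.text.trim()}`,
    )
    .join("\n");
}
```

With a result from transcribe, this would be called as formatSegments(result.segments).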

createTranscriptionTool — attach to an Agent

import {
  Agent,
  createTranscriptionModel,
  createTranscriptionTool,
  scaleway,
} from "@ai_kit/core";

const whisperModel = createTranscriptionModel({
  modelId: "whisper-large-v3",
  apiKey: process.env.SCALEWAY_API_KEY!,
  baseURL: "https://api.scaleway.ai/v1",
});

const agent = new Agent({
  name: "medical-assistant",
  model: scaleway("gpt-oss-120b"),
  tools: {
    transcribeAudio: createTranscriptionTool(whisperModel, {
      description: "Transcribe a medical audio recording to text",
    }),
  },
});

const result = await agent.generate({
  prompt: "Transcribe this file: /recordings/consultation.mp3",
});
The tool schema exposed to the LLM has three fields: audio (a path, a URL, or a base64 string), inputType, and language.
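Because an LLM can only emit strings, binary audio has to reach the tool as base64. The sketch below shows one way such input could be normalized before transcription. The TranscriptionToolInput shape is inferred from the three fields listed above, the pairing of base64 with inputType "buffer" is an assumption, and resolveToolAudio is a hypothetical helper rather than something the library exports.

```typescript
// Assumed shape of the tool input, inferred from the documented field names.
interface TranscriptionToolInput {
  audio: string; // path, http(s) URL, or base64-encoded audio
  inputType?: "path" | "url" | "buffer";
  language?: string;
}

// Hypothetical helper: converts base64 input to bytes, passes paths and URLs through.
function resolveToolAudio(input: TranscriptionToolInput): string | Uint8Array {
  // Assumption: base64 payloads are flagged with inputType "buffer".
  if (input.inputType !== "buffer") return input.audio;
  const binary = atob(input.audio); // one character per decoded byte
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}
```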

Supported audio formats

flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm (identical to OpenAI Whisper).
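A caller may want to reject unsupported files before uploading them. The following is a minimal sketch of an extension check built from the list above; isSupportedAudioFormat is a hypothetical helper, and checking the extension says nothing about the actual file contents.

```typescript
const SUPPORTED_AUDIO_FORMATS = new Set([
  "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
]);

// Returns true when the file name or URL ends in a supported extension.
function isSupportedAudioFormat(fileNameOrUrl: string): boolean {
  // Drop any query string or fragment so URLs can be checked as well.
  const clean = fileNameOrUrl.replace(/[?#].*$/, "");
  const dot = clean.lastIndexOf(".");
  if (dot === -1) return false;
  return SUPPORTED_AUDIO_FORMATS.has(clean.slice(dot + 1).toLowerCase());
}
```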