Split text
Each returned chunk has index, start, end, content, type: "text", and optional metadata. chunkOverlap defines how many characters consecutive chunks share, which preserves context across chunk boundaries.
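To make the chunk fields and the effect of chunkOverlap concrete, here is a minimal sketch of a fixed-size character splitter. It is not the library's implementation: the `TextChunk` interface and the `splitTextSketch` function are hypothetical names, and treating `end` as an exclusive offset is an assumption.

```typescript
// Hypothetical chunk shape, modeled on the fields listed above.
interface TextChunk {
  index: number;
  start: number; // inclusive character offset (assumption)
  end: number; // exclusive character offset (assumption)
  content: string;
  type: "text";
  metadata?: Record<string, unknown>;
}

// Illustrative splitter: fixed-size windows that share `chunkOverlap` characters.
function splitTextSketch(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
  metadata?: Record<string, unknown>,
): TextChunk[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be in [0, chunkSize)");
  }

  const chunks: TextChunk[] = [];
  const step = chunkSize - chunkOverlap; // how far each window advances

  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({
      index: chunks.length,
      start,
      end,
      content: text.slice(start, end),
      type: "text",
      // Only attach metadata when the caller provided some.
      ...(metadata ? { metadata } : {}),
    });
    if (end === text.length) break; // last window reached the end of the text
  }
  return chunks;
}

// With chunkSize 4 and chunkOverlap 1, each chunk repeats the last
// character of the previous one.
const sample = splitTextSketch("abcdefghij", 4, 1, { source: "demo" });
console.log(sample.map((c) => c.content));
```

Each window advances by chunkSize minus chunkOverlap, so a larger overlap produces more chunks for the same input.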
Working with chunks
Split JSON
format accepts auto, preserve, or pretty. Returned chunks have type: "json" and inherit the provided metadata.
Use TChunkDocument
TChunkDocument manages the content type and merges document-level metadata with chunk-level metadata.
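The metadata merge can be pictured with a small sketch. This is not TChunkDocument's actual code: `mergeMetadata` is a hypothetical helper, and letting chunk-level keys win on conflict is an assumption that the text above does not confirm.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical merge: start from document-level metadata, then apply
// chunk-level metadata on top. Chunk-level precedence is an assumption.
function mergeMetadata(documentMeta: Metadata, chunkMeta: Metadata = {}): Metadata {
  return { ...documentMeta, ...chunkMeta };
}

const merged = mergeMetadata(
  { source: "guide.md", language: "en" },
  { section: "intro", language: "fr" },
);
console.log(merged);
```

Under this assumption, keys that appear only at the document level (such as `source`) are carried onto every chunk, while a chunk can still override a shared key.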
Tips
- Adjust chunkSize to fit your model or vector store limits.
- Keep chunkOverlap light (10–50) to retain context without blowing up size.
- Store metadata (source, version, language) to trace chunks and filter later on.