Loader
Before you can start indexing your documents, you need to load them into memory.
SimpleDirectoryReader
LlamaIndex.TS supports easy loading of files from folders using the SimpleDirectoryReader class.
It is a simple reader that reads all files from a directory and its subdirectories.
import { SimpleDirectoryReader } from "llamaindex/readers/SimpleDirectoryReader";
// or
// import { SimpleDirectoryReader } from 'llamaindex'
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData("../data");
documents.forEach((doc) => {
  console.log(`document (${doc.id_}):`, doc.getText());
});
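To make "a directory and its subdirectories" concrete, here is a minimal sketch of a recursive directory walk using Node's built-in fs and path modules. It is not the library's implementation, and listFilesRecursive is a hypothetical helper name used only for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (not part of LlamaIndex.TS): collect the paths of all
// regular files under `dir`, descending into subdirectories.
function listFilesRecursive(dir: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Recurse into the subdirectory and keep everything found there.
      files.push(...listFilesRecursive(fullPath));
    } else if (entry.isFile()) {
      files.push(fullPath);
    }
  }
  // Sort so the result does not depend on the file system's listing order.
  return files.sort();
}
```

A reader built on a walk like this would then hand each returned path to whichever file reader matches its extension.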
Currently, SimpleDirectoryReader supports reading .csv, .docx, .html, .md, and .pdf files, but support for other file types is planned.
You can also provide a defaultReader as a fallback for files with unsupported extensions, or pass additional readers in fileExtToReader to support more file types.
import type { BaseReader, Document, Metadata } from "llamaindex";
import {
  FILE_EXT_TO_READER,
  SimpleDirectoryReader,
  TextFileReader,
} from "llamaindex/readers/SimpleDirectoryReader";

class ZipReader implements BaseReader {
  loadData(...args: any[]): Promise<Document<Metadata>[]> {
    throw new Error("Implement me");
  }
}

const reader = new SimpleDirectoryReader();
const documents = await reader.loadData({
  directoryPath: "../data",
  defaultReader: new TextFileReader(),
  fileExtToReader: {
    ...FILE_EXT_TO_READER,
    zip: new ZipReader(),
  },
});

documents.forEach((doc) => {
  console.log(`document (${doc.id_}):`, doc.getText());
});
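The following sketch illustrates how an extension-to-reader map and a fallback reader can work together. It shows the general idea only and is not the library's actual lookup logic. SketchReader and pickReader are hypothetical names, and the case-insensitive extension matching is an assumption made for this example.

```typescript
// Hypothetical stand-in for a reader; not the LlamaIndex.TS BaseReader type.
interface SketchReader {
  name: string;
}

// Hypothetical helper: choose a reader by file extension, falling back to
// `defaultReader` when the extension is missing or has no entry in the map.
function pickReader(
  filePath: string,
  fileExtToReader: Record<string, SketchReader>,
  defaultReader?: SketchReader,
): SketchReader | undefined {
  // Take the last path segment, accepting both "/" and "\" as separators.
  const base = filePath.split(/[\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  // Assumption: extensions are matched case-insensitively. A leading dot
  // (as in ".gitignore") is not treated as an extension.
  const ext = dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
  // Only accept keys defined on the map itself, not inherited ones.
  const known = Object.prototype.hasOwnProperty.call(fileExtToReader, ext);
  return known ? fileExtToReader[ext] : defaultReader;
}

const sketchReaders: Record<string, SketchReader> = {
  md: { name: "markdown" },
  zip: { name: "zip" },
};
const sketchFallback: SketchReader = { name: "text" };

console.log(pickReader("../data/notes.md", sketchReaders, sketchFallback)?.name);
console.log(pickReader("../data/LICENSE", sketchReaders, sketchFallback)?.name);
```

With this shape, spreading an existing map and adding a zip entry only extends the lookup table, while the fallback reader catches everything the table does not cover.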