DuplicatesStrategy

Defined in: packages/core/src/ingestion/strategies/DuplicatesStrategy.ts:10

Handles duplicates by checking whether documents already exist in the vector store. Documents that already exist (matched by ref_doc_id) are skipped. Note: this strategy does NOT detect content changes. Use the UPSERTS strategy if you need to update changed documents.
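The skipping behaviour described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's implementation: `SketchNode`, `skipExistingDocuments`, and the `existingRefDocIds` set are hypothetical names, and the set stands in for whatever lookup the real strategy performs against the vector store.

```typescript
// Hypothetical, simplified node shape; the real BaseNode<Metadata> has more fields.
interface SketchNode {
  id: string;
  refDocId: string; // ID of the source document (the ref_doc_id)
  text: string;
}

// Keep only nodes whose source document is not already stored.
// `existingRefDocIds` is an assumption standing in for a vector store lookup.
function skipExistingDocuments(
  nodes: SketchNode[],
  existingRefDocIds: Set<string>,
): SketchNode[] {
  return nodes.filter((node) => !existingRefDocIds.has(node.refDocId));
}

const stored = new Set<string>(["doc-1"]);
const incoming: SketchNode[] = [
  // Same document ID as a stored one: skipped even though the text differs.
  { id: "n1", refDocId: "doc-1", text: "changed text, same document" },
  { id: "n2", refDocId: "doc-2", text: "new document" },
];

const toIngest = skipExistingDocuments(incoming, stored);
console.log(toIngest.map((n) => n.id)); // only "n2" remains
```

Because the comparison uses only the document ID, the edited text of `doc-1` is never ingested, which is the limitation the note above warns about.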

DuplicatesStrategy<Options>(nodes, options?): BaseNode<Metadata>[] | Promise<BaseNode<Metadata>[]>

Type parameter: Options extends Record<string, unknown>

nodes: BaseNode<Metadata>[]

options?: Options

Returns: BaseNode<Metadata>[] | Promise<BaseNode<Metadata>[]>

new DuplicatesStrategy(vectorStore): DuplicatesStrategy

Defined in: packages/core/src/ingestion/strategies/DuplicatesStrategy.ts:13

vectorStore: BaseVectorStore

Returns: DuplicatesStrategy

RollbackableTransformComponent.constructor

id: string

Defined in: packages/core/src/schema/type.ts:22

RollbackableTransformComponent.id

rollback(vectorStore, nodes): Promise<void>

Defined in: packages/core/src/ingestion/strategies/rollback.ts:9

Removes all nodes for documents that exist in the vector store. This is useful when generating embeddings fails and partially added documents need to be removed.

vectorStore: BaseVectorStore

nodes: BaseNode<Metadata>[]

Returns: Promise<void>

RollbackableTransformComponent.rollback
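As an illustration of the rollback idea, the sketch below removes every stored node that belongs to the documents referenced by the given nodes. It is a sketch under stated assumptions, not the code in rollback.ts: `RollbackNode`, `MemoryStore`, `rollbackSketch`, and `demo` are hypothetical names, and deleting by document ID is an assumed mechanism.

```typescript
// Hypothetical, simplified node shape for this sketch.
interface RollbackNode {
  id: string;
  refDocId: string; // ID of the source document (the ref_doc_id)
}

// Hypothetical in-memory store mapping a document ID to its stored node IDs.
class MemoryStore {
  private nodesByDoc = new Map<string, string[]>();

  add(node: RollbackNode): void {
    const ids = this.nodesByDoc.get(node.refDocId) ?? [];
    ids.push(node.id);
    this.nodesByDoc.set(node.refDocId, ids);
  }

  // Assumed operation: remove every node stored for one document.
  async delete(refDocId: string): Promise<void> {
    this.nodesByDoc.delete(refDocId);
  }

  count(): number {
    let total = 0;
    this.nodesByDoc.forEach((ids) => {
      total += ids.length;
    });
    return total;
  }
}

// Delete all nodes for each distinct document referenced by `nodes`.
async function rollbackSketch(
  store: MemoryStore,
  nodes: RollbackNode[],
): Promise<void> {
  const docIds = Array.from(new Set(nodes.map((n) => n.refDocId)));
  for (const docId of docIds) {
    await store.delete(docId);
  }
}

// Simulate a partially ingested document that has to be removed again.
async function demo(): Promise<number> {
  const store = new MemoryStore();
  store.add({ id: "a1", refDocId: "doc-1" });
  store.add({ id: "a2", refDocId: "doc-1" });
  store.add({ id: "b1", refDocId: "doc-2" });
  await rollbackSketch(store, [{ id: "a1", refDocId: "doc-1" }]);
  return store.count();
}

demo().then((remaining) => console.log(remaining)); // logs 1
```

Passing a single node of `doc-1` is enough to remove both of its stored nodes, while `doc-2` is left untouched.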