config
SelectionMode
ClusterTreeConfig
dataclass
ClusterTreeConfig(
    embedding_model: EmbeddingModelLike,
    summarization_model: SummarizationModelLike,
    document_splitter: DocumentSplitterLike,
    clustering_func: ClusteringFunctionLike = raptor_clustering,
    clustering_backend: ClusteringBackendLike | None = None,
    max_length_in_cluster: int = 3500,
    max_num_layers: int = 5,
)
Configuration for ClusterTreeBuilder.
Parameters:
- embedding_model (EmbeddingModelLike) – The embedding model to use.
- summarization_model (SummarizationModelLike) – The summarization model to use.
- document_splitter (DocumentSplitterLike) – The document splitter to use.
- clustering_func (ClusteringFunctionLike, default: raptor_clustering) – The clustering function to use.
- clustering_backend (ClusteringBackendLike | None, default: None) – The clustering backend to use.
- max_length_in_cluster (int, default: 3500) – The maximum length of a cluster.
- max_num_layers (int, default: 5) – The maximum number of layers.
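As a rough illustration of how the required fields and defaults above fit together, the sketch below mirrors the documented signature with a stand-in dataclass. `ClusterTreeConfigSketch` and the placeholder string values are hypothetical and are not the library's own objects; the real config expects model, splitter, and clustering objects, and the string `"raptor_clustering"` only stands in for the documented default function.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass
class ClusterTreeConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    embedding_model: Any
    summarization_model: Any
    document_splitter: Any
    clustering_func: Any = "raptor_clustering"  # placeholder for the real default function
    clustering_backend: Optional[Any] = None
    max_length_in_cluster: int = 3500
    max_num_layers: int = 5


# Only the three required fields must be supplied; the rest use their defaults.
base = ClusterTreeConfigSketch(
    embedding_model="my-embedder",
    summarization_model="my-summarizer",
    document_splitter="my-splitter",
)

# dataclasses.replace returns a copy with selected fields overridden.
shallow = replace(base, max_num_layers=2, max_length_in_cluster=2000)

print(base.max_num_layers, shallow.max_num_layers)
```

Copying with `replace` keeps the original config untouched, which can be convenient when comparing trees built with different depth limits.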
embedding_tokenizer
property
embedding_tokenizer: TokenizerLike
Returns:
- TokenizerLike – The tokenizer of the embedding model.
summarization_tokenizer
property
summarization_tokenizer: TokenizerLike
Returns:
- TokenizerLike – The tokenizer of the summarization model.
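One plausible way to read the two tokenizer properties is as thin accessors that hand back the tokenizer carried by each model. The sketch below assumes that each model exposes a `tokenizer` attribute; that attribute, along with `WhitespaceTokenizer`, `ToyModel`, and `ConfigSketch`, is hypothetical and not taken from the library.

```python
from dataclasses import dataclass
from typing import List


class WhitespaceTokenizer:
    """Toy tokenizer used only for this sketch."""

    def encode(self, text: str) -> List[str]:
        return text.split()


@dataclass
class ToyModel:
    """Hypothetical model object that carries its own tokenizer."""

    tokenizer: WhitespaceTokenizer


@dataclass
class ConfigSketch:
    embedding_model: ToyModel
    summarization_model: ToyModel

    @property
    def embedding_tokenizer(self) -> WhitespaceTokenizer:
        # Assumption: the model exposes its tokenizer as an attribute.
        return self.embedding_model.tokenizer

    @property
    def summarization_tokenizer(self) -> WhitespaceTokenizer:
        return self.summarization_model.tokenizer


config = ConfigSketch(
    embedding_model=ToyModel(WhitespaceTokenizer()),
    summarization_model=ToyModel(WhitespaceTokenizer()),
)

# Count tokens with the embedding model's tokenizer.
n_tokens = len(config.embedding_tokenizer.encode("a short example sentence"))
print(n_tokens)
```

Keeping the tokenizers as properties means callers never have to track which tokenizer belongs to which model.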
TreeRetrieverConfig
dataclass
TreeRetrieverConfig(
    embedding_model: EmbeddingModelLike,
    threshold: float = 0.5,
    top_k: int = 5,
    selection_mode: SelectionMode = SelectionMode.TOP_K,
    max_tokens: int = 3500,
)
Configuration for TreeRetriever.
Parameters:
- embedding_model (EmbeddingModelLike) – The embedding model to use.
- threshold (float, default: 0.5) – The threshold value used when the selection mode is threshold-based.
- top_k (int, default: 5) – The number of top results to return when the selection mode is top-k.
- selection_mode (SelectionMode, default: TOP_K) – The selection mode to use.
- max_tokens (int, default: 3500) – The maximum number of tokens to retrieve.
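To make the interplay of `threshold`, `top_k`, `selection_mode`, and `max_tokens` more concrete, here is a minimal sketch of how such settings could drive selection. It is not the library's implementation: `SelectionModeSketch`, `select`, and `trim_to_budget` are hypothetical names, the `THRESHOLD` member is assumed (only `TOP_K` appears in the documentation), the `>=` comparison and the stop-at-first-overflow rule are assumptions, and tokens are counted by whitespace splitting rather than by a real tokenizer.

```python
from enum import Enum
from typing import List, Tuple

Scored = Tuple[str, float]  # (node text, similarity score)


class SelectionModeSketch(Enum):
    TOP_K = "top_k"
    THRESHOLD = "threshold"  # assumed member name


def select(
    scored: List[Scored],
    mode: SelectionModeSketch,
    top_k: int = 5,
    threshold: float = 0.5,
) -> List[Scored]:
    """Rank by score, then keep either the best top_k or all items at/above threshold."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if mode is SelectionModeSketch.TOP_K:
        return ranked[:top_k]
    return [pair for pair in ranked if pair[1] >= threshold]


def trim_to_budget(selected: List[Scored], max_tokens: int = 3500) -> List[Scored]:
    """Keep items in order until adding the next one would exceed max_tokens."""
    kept: List[Scored] = []
    used = 0
    for text, score in selected:
        cost = len(text.split())  # whitespace token count, for illustration only
        if used + cost > max_tokens:
            break
        kept.append((text, score))
        used += cost
    return kept


scored = [
    ("alpha beta", 0.9),
    ("gamma", 0.4),
    ("delta epsilon zeta", 0.7),
    ("eta", 0.55),
]

top_two = select(scored, SelectionModeSketch.TOP_K, top_k=2)
above = select(scored, SelectionModeSketch.THRESHOLD, threshold=0.5)
trimmed = trim_to_budget(above, max_tokens=4)
print(top_two, above, trimmed)
```

In this sketch, top-k mode always returns a fixed number of items regardless of score, while threshold mode returns a variable number; the token budget is applied afterwards as a separate cap.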
tokenizer
property
tokenizer: TokenizerLike
Returns:
- TokenizerLike – The tokenizer of the embedding model.