From cb3b2af846f903c5edd1a3c8aacf7af8258c81e3 Mon Sep 17 00:00:00 2001
From: rbroc
Date: Wed, 20 Mar 2024 09:54:25 +0100
Subject: [PATCH] add reference to dynamic modeling in docs

---
 docs/clustering.md | 5 +++++
 docs/dynamic.md    | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/clustering.md b/docs/clustering.md
index a7c2a55..58df63f 100644
--- a/docs/clustering.md
+++ b/docs/clustering.md
@@ -188,6 +188,11 @@ top2vec = ClusteringTopicModel(
 Theoretically the model descriptions above should result in the same behaviour as the other two packages, but there might be minor changes in implementation.
 We do not intend to keep up with changes in Top2Vec's and BERTopic's internal implementation details indefinitely.
 
+### _(Optional)_ 5. Dynamic Modeling
+
+Clustering models are also capable of dynamic topic modeling. This happens by fitting a clustering model over the entire corpus, as we expect that there is only one semantic model generating the documents.
+To gain temporal representations for topics, the corpus is divided into equal, or arbitrarily chosen time slices, and then term importances are estimated using Soft-c-TF-IDF, c-TF-IDF, or distances from cluster centroid for each of the time slices separately. When distance from cluster centroids is used to estimate topic importances in dynamic modeling, cluster centroids are computed based on documents and terms present within a given time slice.
+
 ## Considerations
 
 ### Strengths
diff --git a/docs/dynamic.md b/docs/dynamic.md
index 9e90628..3120d95 100644
--- a/docs/dynamic.md
+++ b/docs/dynamic.md
@@ -28,7 +28,7 @@ Dynamic topic models in Turftopic have a unified interface.
 To fit a dynamic topic model you will need a corpus, that has been annotated with timestamps.
 The timestamps need to be Python `datetime` objects, but pandas `Timestamp` object are also supported.
 
-Models that have dynamic modeling capabilities have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
+Models that have dynamic modeling capabilities (currently, `GMM` and `ClusteringTopicModel`) have a `fit_transform_dynamic()` method, that fits the model on the corpus over time.
 
 ```python
 from datetime import datetime