Fit topic models
tinytopics.fit
fit_model(X, k, num_epochs=200, batch_size=16, base_lr=0.01, max_lr=0.05, T_0=20, T_mult=1, weight_decay=1e-05, device=None)
Fit a topic model using sum-to-one constrained neural Poisson NMF, optimized with AdamW and a cosine annealing with warm restarts learning rate scheduler.
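To make "sum-to-one constrained" concrete, here is a minimal plain-Python sketch of one common way to impose such a constraint: apply a softmax to each row of the unconstrained factor matrices. This is an assumption for illustration, not a description of how `NeuralPoissonNMF` is implemented internally. The names `raw_doc_topic`, `raw_topic_term`, `softmax`, and `matmul` are hypothetical.

```python
import math

def softmax(row):
    """Map unconstrained real values to non-negative weights that sum to one."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(A, B):
    """Plain nested-list matrix product A @ B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical unconstrained parameters (assumed shapes):
# 2 documents x 2 topics, and 2 topics x 3 terms.
raw_doc_topic = [[0.2, -1.0], [1.5, 0.3]]
raw_topic_term = [[0.0, 1.0, -0.5], [2.0, 0.1, 0.1]]

doc_topic = [softmax(r) for r in raw_doc_topic]    # each row sums to one
topic_term = [softmax(r) for r in raw_topic_term]  # each row sums to one
reconstructed = matmul(doc_topic, topic_term)      # each row also sums to one
```

Because both factors are row-stochastic in this sketch, every row of `reconstructed` is itself a probability distribution over terms.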
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | Tensor | Document-term matrix. | required |
k | int | Number of topics. | required |
num_epochs | int | Number of training epochs. Default is 200. | 200 |
batch_size | int | Number of documents per batch. Default is 16. | 16 |
base_lr | float | Minimum learning rate after annealing. Default is 0.01. | 0.01 |
max_lr | float | Starting maximum learning rate. Default is 0.05. | 0.05 |
T_0 | int | Number of epochs until the first restart. Default is 20. | 20 |
T_mult | int | Factor by which the restart interval increases after each restart. Default is 1. | 1 |
weight_decay | float | Weight decay for the AdamW optimizer. Default is 1e-5. | 1e-05 |
device | device | Device to run the training on. Defaults to CUDA if available, otherwise CPU. | None |
Returns:
Type | Description |
---|---|
NeuralPoissonNMF | Trained model. |
list | List of training losses for each epoch. |
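The interaction of `base_lr`, `max_lr`, `T_0`, and `T_mult` may be easier to see numerically. The sketch below evaluates the standard cosine annealing with warm restarts formula in plain Python. It assumes that `max_lr` is the rate at the start of each cycle, that `base_lr` is the floor the rate anneals toward, and that the schedule advances once per epoch. The helper name `warm_restart_lr` is hypothetical and not part of tinytopics.

```python
import math

def warm_restart_lr(epoch, base_lr=0.01, max_lr=0.05, T_0=20, T_mult=1):
    """Learning rate at a given 0-indexed epoch under cosine annealing with warm restarts."""
    T_i, T_cur = T_0, epoch
    # Skip over completed cycles; each cycle is T_mult times longer than the previous one.
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    # Cosine decay from max_lr toward base_lr within the current cycle.
    return base_lr + 0.5 * (max_lr - base_lr) * (1 + math.cos(math.pi * T_cur / T_i))

# Learning rates for the first 40 epochs with the default settings (two full cycles).
schedule = [warm_restart_lr(e) for e in range(40)]
```

With the defaults, the rate starts at 0.05, decays to 0.03 halfway through the cycle, and jumps back to 0.05 at epoch 20. Setting `T_mult=2` would make the second cycle last 40 epochs instead of 20.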
poisson_nmf_loss(X, X_reconstructed)
Compute the Poisson NMF loss function (negative log-likelihood).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | Tensor | Original document-term matrix. | required |
X_reconstructed | Tensor | Reconstructed matrix from the model. | required |