Fit topic models

tinytopics.fit

fit_model(X, k, num_epochs=200, batch_size=16, base_lr=0.01, max_lr=0.05, T_0=20, T_mult=1, weight_decay=1e-05, device=None)

Fit topic model using sum-to-one constrained neural Poisson NMF, optimized with AdamW and a cosine annealing with warm restarts scheduler.
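The "sum-to-one" constraint means each row of the factor matrices is a probability vector. This page does not say how the constraint is enforced. One common approach, assumed here purely for illustration, is to apply a softmax to unconstrained parameters. The helper names `softmax` and `reconstruct` below are hypothetical and are not part of tinytopics.

```python
import math


def softmax(row):
    """Map an unconstrained row to non-negative values that sum to one."""
    shift = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def reconstruct(L_raw, F_raw):
    """Multiply row-normalized factors (hypothetical helper).

    L_raw: n x k unconstrained document-topic parameters.
    F_raw: k x m unconstrained topic-term parameters.
    """
    L = [softmax(r) for r in L_raw]
    F = [softmax(r) for r in F_raw]
    k, m = len(F), len(F[0])
    return [[sum(row[t] * F[t][j] for t in range(k)) for j in range(m)] for row in L]


L_raw = [[0.2, -1.0], [1.5, 0.3], [0.0, 0.0]]   # 3 documents, 2 topics
F_raw = [[0.1, 0.4, -0.2, 0.0], [1.0, -1.0, 0.5, 0.2]]  # 2 topics, 4 terms
X_hat = reconstruct(L_raw, F_raw)
```

Because both factors have rows summing to one, every row of the product also sums to one, so each row can be read as a distribution over terms.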

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | Tensor | Document-term matrix. | required |
| k | int | Number of topics. | required |
| num_epochs | int | Number of training epochs. Default is 200. | 200 |
| batch_size | int | Number of documents per batch. Default is 16. | 16 |
| base_lr | float | Minimum learning rate after annealing. Default is 0.01. | 0.01 |
| max_lr | float | Starting maximum learning rate. Default is 0.05. | 0.05 |
| T_0 | int | Number of epochs until the first restart. Default is 20. | 20 |
| T_mult | int | Factor by which the restart interval increases after each restart. Default is 1. | 1 |
| weight_decay | float | Weight decay for the AdamW optimizer. Default is 1e-5. | 1e-05 |
| device | device \| None | Device to run the training on. Defaults to CUDA if available, otherwise CPU. | None |
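To see how `base_lr`, `max_lr`, `T_0`, and `T_mult` interact, the sketch below evaluates the standard cosine annealing with warm restarts formula in plain Python. It assumes that `max_lr` is the rate at the start of each cycle and `base_lr` is the floor the rate decays toward, which matches the parameter descriptions but is not confirmed by this page. The function name `cosine_warm_restart_lr` is hypothetical.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=0.01, max_lr=0.05, T_0=20, T_mult=1):
    """Learning rate at a given epoch under cosine annealing with warm restarts.

    Assumes max_lr is the rate at the start of each cycle and base_lr is the
    minimum (an interpretation of the documented parameters).
    """
    t_cur, t_i = epoch, T_0
    # Walk through completed cycles; each cycle is T_mult times longer.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= T_mult
    return base_lr + 0.5 * (max_lr - base_lr) * (1 + math.cos(math.pi * t_cur / t_i))


# With the defaults, the rate restarts at max_lr every 20 epochs.
schedule = [cosine_warm_restart_lr(e) for e in range(60)]

# With T_mult=2, cycles last 20, 40, 80, ... epochs.
schedule_growing = [cosine_warm_restart_lr(e, T_mult=2) for e in range(60)]
```

With `T_mult=1` every cycle has the same length, so the default 200 epochs contain ten identical cycles. Setting `T_mult` above 1 spends progressively more epochs at low learning rates between restarts.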

Returns:

| Type | Description |
| --- | --- |
| Tuple[NeuralPoissonNMF, Sequence[float]] | A tuple containing the trained NeuralPoissonNMF model and the list of training losses for each epoch. |

poisson_nmf_loss(X, X_reconstructed)

Compute the Poisson NMF loss function (negative log-likelihood).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | Tensor | Original document-term matrix. | required |
| X_reconstructed | Tensor | Reconstructed matrix from the model. | required |

Returns:

| Type | Description |
| --- | --- |
| Tensor | The computed Poisson NMF loss. |
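For intuition, the Poisson negative log-likelihood of counts x under rate x̂ is x̂ − x·log(x̂) plus a term that depends only on x. The plain-Python sketch below sums the first two terms over all entries. It drops the constant term and clamps x̂ at a small `eps` to avoid log(0). Both choices are assumptions, and the function name `poisson_nll` is hypothetical. The library function itself operates on tensors.

```python
import math


def poisson_nll(X, X_hat, eps=1e-10):
    """Sum of x_hat - x * log(x_hat) over all entries (constant term dropped).

    eps guards against log(0); its value is an assumption for this sketch.
    """
    total = 0.0
    for row, row_hat in zip(X, X_hat):
        for x, x_hat in zip(row, row_hat):
            total += x_hat - x * math.log(max(x_hat, eps))
    return total


X = [[1.0, 2.0], [0.0, 3.0]]
loss_exact = poisson_nll(X, X)                         # reconstruction equals the data
loss_off = poisson_nll(X, [[1.5, 2.0], [0.5, 3.0]])    # two entries perturbed
```

Each per-entry term is minimized when x̂ equals x, so `loss_exact` is the smallest value this loss can take for `X`, and any mismatch in the reconstruction increases it.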