References:
- Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora, Daniel Ramage...
- Parameter estimation for text analysis, Gregor Heinrich.
- Latent Dirichlet Allocation, David M. Blei, Andrew Y. Ng...
The following descriptions come from Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora, Daniel Ramage...
Labeled LDA is a topic model that constrains Latent Dirichlet Allocation by defining a one-to-one correspondence between LDA’s latent topics and user tags. This constraint lets Labeled LDA learn word-tag correspondences directly.
- Graphical model of Labeled LDA:
- Generative process for Labeled LDA:
- Gibbs sampling equation:
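
For reference, the collapsed Gibbs sampling update can be written roughly as follows. This is a reconstruction in the notation of the Ramage et al. paper cited above, not a copy of the original figure, so check it against the paper before relying on it. The candidate topic $j$ ranges only over the labels attached to document $d$.

```latex
P(z_i = j \mid \mathbf{z}_{-i}) \;\propto\;
  \frac{n_{-i,j}^{w_i} + \eta_{w_i}}{n_{-i,j}^{(\cdot)} + \boldsymbol{\eta}^{T}\mathbf{1}}
  \times
  \frac{n_{-i,j}^{(d)} + \alpha_j}{n_{-i,\cdot}^{(d)} + \boldsymbol{\alpha}^{T}\mathbf{1}}
```

Here $n_{-i,j}^{w_i}$ is the number of times word $w_i$ is assigned to topic $j$, excluding the current assignment $z_i$; $n_{-i,j}^{(d)}$ is the number of tokens in document $d$ assigned to topic $j$, again excluding position $i$; a dot in place of an index denotes a sum over that index; and $\mathbf{1}$ is a vector of ones of the appropriate dimension.
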
- create a new Labeled LDA model
- training
- inference
- save model to disk
- load model from disk
import model.labeled_lda as llda

# data: each item is (document text, list of labels)
labeled_documents = [("example example example example example", ["example"]),
                     ("test llda model test llda model test llda model", ["test", "llda_model"]),
                     ("example test example test example test example test", ["example", "test"])]

# create a new Labeled LDA model
llda_model = llda.LldaModel(labeled_documents=labeled_documents)
print(llda_model)

# training
llda_model.training(iteration=10, log=True)

# inference
document = "test example"
topics = llda_model.inference(document=document, iteration=10, times=10)
print(topics)

# save to disk
save_model_dir = "../data/model"
llda_model.save_model_to_dir(save_model_dir)

# load from disk
llda_model_new = llda.LldaModel()
llda_model_new.load_model_from_dir(save_model_dir)
print(llda_model_new)
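
To make the training and inference steps more concrete, here is a minimal sketch of the per-token Gibbs sampling draw that Labeled LDA relies on. It is not the code in `model.labeled_lda`: the function name, its arguments, and the dictionary-based count tables are all hypothetical, and it assumes symmetric scalar priors `alpha` and `eta` (passed as floats). The key point it illustrates is that a token may only be assigned to one of its document's own labels.

```python
import random


def sample_topic(word, doc_labels, n_wt, n_t, n_dt, alpha, eta, vocab_size, rng=random):
    """Draw a topic for one token, restricted to the document's labels.

    Hypothetical helper, not part of this repository's API.
    n_wt[(word, topic)]: times `word` is assigned to `topic` (corpus-wide)
    n_t[topic]:          total tokens assigned to `topic` (corpus-wide)
    n_dt[topic]:         tokens of the current document assigned to `topic`
    All counts must already exclude the token being resampled.
    """
    weights = []
    for t in doc_labels:
        # Word-given-topic term with a symmetric prior eta (an assumption).
        word_term = (n_wt.get((word, t), 0) + eta) / (n_t.get(t, 0) + eta * vocab_size)
        # Topic-given-document term; its denominator is the same for every
        # candidate topic, so it is dropped from the unnormalised weight.
        doc_term = n_dt.get(t, 0) + alpha
        weights.append(word_term * doc_term)

    # Roulette-wheel draw proportional to the unnormalised weights.
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for t, w in zip(doc_labels, weights):
        cumulative += w
        if threshold < cumulative:
            return t
    return doc_labels[-1]  # guard against floating-point round-off


# Toy counts: the word "test" has so far only been assigned to the topic "test".
n_wt = {("test", "test"): 50}
n_t = {"example": 50, "test": 50}
n_dt = {"example": 1, "test": 1}
print(sample_topic("test", ["example", "test"], n_wt, n_t, n_dt, 0.1, 0.01, 4))
```

A full sampler would decrement the counts for a token, call a function like this, and then increment the counts for the newly drawn topic, repeating over every token for the requested number of iterations.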

