topicwizard

Pretty and opinionated topic model visualization in Python.


New in version 0.3.0 🌟 🌟

Exclude pages that are not needed 🐦
Self-contained interactive figures 🎁
Topic name inference is now the default behavior and is done implicitly.

Features

Investigate complex relations between topics, words and documents
Highly interactive
Automatically infer topic names
Name topics manually
Pretty 🎨
Intuitive 🐮
Clean API 🍬
Sklearn, Gensim and BERTopic compatible 🔩
Easy deployment 🌍

Installation

Install from PyPI:

pip install topic-wizard

Usage (documentation)

Step 1:

Train a scikit-learn compatible topic model. (If you want to use a topic model from outside scikit-learn, check compatibility.)

from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

# Create topic pipeline
topic_pipeline = make_pipeline(
    CountVectorizer(),
    NMF(n_components=10),
)

# Then fit it on the given texts
topic_pipeline.fit(texts)

Step 2a:

Visualize with the topicwizard webapp 💡

import topicwizard

topicwizard.visualize(pipeline=topic_pipeline, corpus=texts)

From version 0.3.0 you can also disable pages you do not wish to display, which can save you a lot of preprocessing time:

import topicwizard

# A large corpus takes a looong time to compute 2D projections for,
# so you can speed up preprocessing by disabling the documents page altogether.
topicwizard.visualize(pipeline=topic_pipeline, corpus=texts, exclude_pages=["documents"])
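If you run the same script on corpora of very different sizes, you might want to decide this automatically. The sketch below is one way to do it. `pages_to_exclude` and its `max_documents` cut-off are hypothetical names and values introduced here for illustration; they are not part of topicwizard.

```python
from typing import List, Sequence


def pages_to_exclude(corpus: Sequence[str], max_documents: int = 10_000) -> List[str]:
    """Return the page names to skip for a corpus of this size.

    Assumption: 10_000 is an arbitrary cut-off chosen for this sketch;
    tune it to your own hardware and patience.
    """
    if len(corpus) > max_documents:
        # Skip the documents page, which is the slow one for large corpora.
        return ["documents"]
    return []
```

You could then call `topicwizard.visualize(pipeline=topic_pipeline, corpus=texts, exclude_pages=pages_to_exclude(texts))`, assuming `exclude_pages` accepts an empty list when nothing should be skipped.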

Ooooor...

Step 2b:

Produce high quality self-contained HTML plots and create your own dashboards/reports 🍓

Map of words

from topicwizard.figures import word_map

word_map(corpus=texts, pipeline=topic_pipeline)

Timelines of topic distributions

from topicwizard.figures import document_topic_timeline

document_topic_timeline(
    "Joe Biden takes over presidential office from Donald Trump.",
    pipeline=topic_pipeline,
)

Wordclouds of your topics ☁️

from topicwizard.figures import topic_wordclouds

topic_wordclouds(corpus=texts, pipeline=topic_pipeline)
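To turn several of these figures into one report, something like the following sketch may help. It assumes the figure functions return Plotly figures, which expose `to_html(full_html=..., include_plotlyjs=...)`; check the documentation for your installed version. `build_report` and `HtmlFigure` are hypothetical names introduced here and are not part of topicwizard.

```python
from html import escape
from typing import Mapping, Protocol


class HtmlFigure(Protocol):
    """Anything exposing the relevant part of Plotly's ``Figure.to_html``."""

    def to_html(self, *, full_html: bool, include_plotlyjs: object) -> str: ...


def build_report(title: str, figures: Mapping[str, HtmlFigure]) -> str:
    """Join several figures into one standalone HTML page."""
    sections = []
    for index, (heading, figure) in enumerate(figures.items()):
        # Embed plotly.js only once, with the first figure, so the page
        # works offline without repeating the library for every plot.
        fragment = figure.to_html(full_html=False, include_plotlyjs=(index == 0))
        sections.append(f"<section><h2>{escape(heading)}</h2>{fragment}</section>")
    body = "\n".join(sections)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n{body}\n</body></html>"
    )
```

For example, you could pass `{"Map of words": word_map(corpus=texts, pipeline=topic_pipeline)}` as `figures` and write the returned string to a `.html` file.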

And much more (documentation)
