nsbm's Introduction

Multipartite Stochastic Block Modeling

This package inherits hSBM from https://github.com/martingerlach/hSBM_Topicmodel and extends it to tripartite networks (aka supervised topic models).

The idea is to run SBM-based topic modeling on networks in which documents are linked to keywords as well as to words.
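To make the network structure concrete, here is a minimal sketch in plain Python of a tripartite edge list with document, word, and keyword nodes. The toy data and the helper `build_tripartite_edges` are hypothetical and are not part of nsbm; the package builds its graph internally with graph-tool.

```python
from collections import Counter

# Hypothetical toy data: per-document word counts and keyword labels.
word_counts = {
    "doc0": {"w0": 2, "w1": 1},
    "doc1": {"w1": 3},
}
keywords = {
    "doc0": ["keyword0"],
    "doc1": ["keyword0", "keyword1"],
}


def build_tripartite_edges(word_counts, keywords):
    """Return (nodes, edges); nodes maps name -> kind, edges are (doc, other, weight)."""
    nodes = {}
    edges = []
    # Document-word links, weighted by how often the word occurs in the document.
    for doc, counts in word_counts.items():
        nodes[doc] = "document"
        for word, n in counts.items():
            nodes[word] = "word"
            edges.append((doc, word, n))
    # Document-keyword links; words and keywords are never linked directly.
    for doc, kws in keywords.items():
        nodes[doc] = "document"
        for kw in kws:
            nodes[kw] = "keyword"
            edges.append((doc, kw, 1))
    return nodes, edges


nodes, edges = build_tripartite_edges(word_counts, keywords)
kinds = Counter(nodes.values())
print(kinds)
print(edges)
```

Every edge has a document at one end, which is what makes the network tripartite: words and keywords are connected only through the documents they share.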

Install

With pip

python3 -m pip install . -vv

With conda/mamba

conda install -c conda-forge nsbm

Example

from nsbm import nsbm
import pandas as pd
import numpy as np

# word counts: 1000 words (rows) x 250 documents (columns), random data
df = pd.DataFrame(
    index = ["w{}".format(w) for w in range(1000)],
    columns = ["doc{}".format(d) for d in range(250)],
    data = np.random.randint(1, 100, (1000, 250)))

# one DataFrame per additional kind of node, all sharing the document columns
df_key_list = []

## keywords
df_key_list.append(
    pd.DataFrame(
        index = ["keyword{}".format(w) for w in range(100)],
        columns = ["doc{}".format(d) for d in range(250)],
        data = np.random.randint(1, 10, (100, 250)))
)

## authors
df_key_list.append(
    pd.DataFrame(
        index = ["author{}".format(w) for w in range(10)],
        columns = ["doc{}".format(d) for d in range(250)],
        data = np.random.randint(1, 5, (10, 250)))
)

## other features
df_key_list.append(
    pd.DataFrame(
        index = ["feature{}".format(w) for w in range(25)],
        columns = ["doc{}".format(d) for d in range(250)],
        data = np.random.randint(1, 5, (25, 250)))
)

# build the graph from the word table and the list of extra tables
model = nsbm()
model.make_graph_multiple_df(df, df_key_list)

# fit the model and save the results
model.fit(n_init=1, B_min=50, verbose=False)
model.save_data()
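The example above uses random counts. With real text, the word table would hold actual word counts. A minimal sketch, assuming an already tokenised corpus (the `docs` dictionary is hypothetical), of how such a words x documents DataFrame could be assembled with pandas:

```python
from collections import Counter

import pandas as pd

# Hypothetical tokenised corpus; real tokens would come from your own preprocessing.
docs = {
    "doc0": ["topic", "model", "topic"],
    "doc1": ["block", "model"],
}

# Count words per document, then assemble a words x documents table.
# Words missing from a document become NaN, so fill them with 0.
counts = {name: dict(Counter(tokens)) for name, tokens in docs.items()}
df = pd.DataFrame(counts).fillna(0).astype(int)

print(df)
```

The resulting table has the same layout as `df` in the example: one row per word and one column per document.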

Run with Docker

docker run -it -u jovyan -v $PWD:/home/jovyan/work -p 8899:8888 docker.pkg.github.com/fvalle1/trisbm/trisbm:latest

If a graph.xml.gz file is found in the current directory, the analysis will be performed on it.

Tests

python3 tests/run_tests.py

Caveats

Before fitting, check the following in your data:

  • there should be no zero-degree nodes (every node should have at least one link)
  • there should be no duplicate nodes
  • the make_form_BoW_df function discretises the data
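The checks above can be sketched with pandas before building the graph. The helper `check_counts` below is hypothetical and not part of nsbm; it assumes a table of non-negative counts with one row per word (or keyword) and one column per document, as in the example.

```python
import numpy as np
import pandas as pd


def check_counts(df):
    """Collect problems found in a features x documents count table."""
    problems = []
    # Duplicate labels would correspond to duplicate nodes.
    if df.index.has_duplicates:
        problems.append("duplicate row labels")
    if df.columns.has_duplicates:
        problems.append("duplicate column labels")
    values = df.to_numpy()
    # A row or column summing to zero is a node without links
    # (assumes counts are non-negative).
    if (values.sum(axis=1) == 0).any():
        problems.append("rows with no links")
    if (values.sum(axis=0) == 0).any():
        problems.append("columns with no links")
    # Non-integer values would be affected by discretisation.
    if not np.issubdtype(values.dtype, np.integer):
        problems.append("non-integer values")
    return problems


good = pd.DataFrame([[1, 0], [2, 3]], index=["w0", "w1"], columns=["doc0", "doc1"])
bad = pd.DataFrame([[0, 0], [1.5, 2]], index=["w0", "w0"], columns=["doc0", "doc1"])

good_problems = check_counts(good)
bad_problems = check_counts(bad)
print(good_problems)
print(bad_problems)
```

Running the same check on each DataFrame in `df_key_list` would cover the keyword, author, and feature tables as well.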

Documentation

Docs

Readthedocs

License

See LICENSE.

This work is in part based on sbmtm

Third party libraries

This package depends on graph-tool

Contributors

fvalle1
