nemtropy's Introduction

Hello there 👋

I'm a PhD student in Italy and a research associate in Zurich. I work on statistical methods for network reconstruction, sometimes. Most of the time I'm learning something interesting and useless (yet).

  • 📫 How to reach me: nicolo.vallarano at imtlucca.it
  • 🌱 I’m currently learning:
    • mathematics
    • LaTeX
    • Python
    • HTML
    • Hugo to write websites
    • Crypto consensus protocols

nemtropy's Issues

Directed Graph Example Notebook

Hello,
I noticed that in the Directed Graph Example Notebook you created a graph using the function

# and generate a networkx graph from it
G = nx.from_numpy_array(ens_adj)

in the block [24].

However, according to
https://networkx.org/documentation/stable/reference/generated/networkx.convert_matrix.from_numpy_array.html
the function

nx.from_numpy_array()

creates a graph using nx.Graph by default, which is undirected.
Shouldn't the function call therefore be

G = nx.from_numpy_array(ens_adj, create_using=nx.DiGraph)

?
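As a small illustration of the point raised here, the sketch below builds a graph from a toy asymmetric matrix both ways. The matrix `adj` is made up for the example and stands in for `ens_adj`; only `nx.from_numpy_array` and its `create_using` argument are taken from the networkx documentation linked above.

```python
import numpy as np
import networkx as nx

# Toy asymmetric adjacency matrix (an assumption, standing in for ens_adj):
# it encodes the directed links 0 -> 1 and 1 -> 2.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]])

# Default call: the result is an undirected nx.Graph.
undirected = nx.from_numpy_array(adj)

# With create_using=nx.DiGraph the direction of each link is kept.
directed = nx.from_numpy_array(adj, create_using=nx.DiGraph)

print(undirected.is_directed(), undirected.has_edge(1, 0))
print(directed.is_directed(), directed.has_edge(1, 0))
```

In the undirected version the link 0 -> 1 can also be traversed as 1 -> 0, so direction-dependent statistics computed on it would not reflect the sampled matrix.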

Add warning when solver fails

Print or give a warning for the undirected and directed graph classes when the solution of the model is clearly wrong.
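One possible shape for such a warning, as a rough sketch only: the helper name `warn_if_unsolved`, the `max_error` argument and the `1e-5` tolerance are all assumptions made for illustration and are not part of NEMtropy.

```python
import warnings


def warn_if_unsolved(max_error, tolerance=1e-5):
    """Emit a RuntimeWarning when the largest constraint error is too big.

    Hypothetical helper: `max_error` is assumed to be the largest absolute
    difference between observed and expected constraints after solving.
    Returns True if a warning was issued.
    """
    if max_error > tolerance:
        warnings.warn(
            f"Solver may not have converged: max error {max_error:.3g} "
            f"exceeds tolerance {tolerance:.3g}.",
            RuntimeWarning,
        )
        return True
    return False
```

Using the `warnings` module rather than `print` lets users silence the message or turn it into an exception with the standard warning filters.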

ECM_EXP for non-integer weights

Hi,

First, thanks for providing such a package. I'm trying to use it to generate null models for a weighted undirected network (N=200, ~=10). The weights range from 0.001 to 0.1. However, the generated networks have integer weights instead of float values.

I'm using the following code:

import networkx as nx
from NEMtropy import UndirectedGraph

# networkFilename and outputDir are defined elsewhere by the user
G = nx.read_edgelist(networkFilename)
adj_kar = nx.to_numpy_array(G, dtype=float)
graph = UndirectedGraph(adj_kar)

graph.solve_tool(model="ecm_exp",
                 method="newton",
                 initial_guess="random",
                 linsearch=True)
graph.ensemble_sampler(1, cpu_n=1, output_dir=outputDir+"/")

Also, sometimes the whole process stalls. I solved this by adding a timeout of 5 minutes to the process generating the networks.

Running it on Linux.

Misspelled Method in .solve_tool

When calling solve_tool with method="quasi-newton", an error occurs:
" ValueError: Method must be "newton","quasi-newton", or "fixed-point". "

However, solve_tool works if "quasinewton" is passed instead of "quasi-newton".
Please make the accepted method string consistent with the error message.
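One way to reconcile the two spellings would be to normalise the string before checking it. The sketch below is only an illustration: `normalize_method` and the alias table are hypothetical names, and the list of valid methods is copied from the error message quoted above.

```python
# Method names as they appear in the error message quoted in this issue.
VALID_METHODS = ("newton", "quasi-newton", "fixed-point")

# Hypothetical alias table mapping unhyphenated spellings to the names above.
ALIASES = {"quasinewton": "quasi-newton", "fixedpoint": "fixed-point"}


def normalize_method(method):
    """Return the canonical method name or raise ValueError."""
    key = method.strip().lower().replace("_", "-")
    key = ALIASES.get(key, key)
    if key not in VALID_METHODS:
        raise ValueError(
            f"Unknown method {method!r}; expected one of {VALID_METHODS}."
        )
    return key


print(normalize_method("quasinewton"))
print(normalize_method("quasi-newton"))
```

Building the error message from the same tuple that is used for the check keeps the message and the accepted values from drifting apart.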

Error in Ensembler crema

Here is example code that raises the error:

import numpy as np
import networkx as nx
from NEMtropy import UndirectedGraph, matrix_generator

adj_weigh = matrix_generator.random_weighted_matrix_generator_uniform_custom_density(
    n=100,
    p=0.2,
    sym=True,
    sup_ext=30,
    intweights=True)

# build the graph object from the weighted adjacency matrix
graph = UndirectedGraph(adj_weigh)

graph.solve_tool(model="crema",
                 method="newton",
                 initial_guess="random",
                 adjacency="cm_exp",
                 method_adjacency="newton")

graph.ensemble_sampler(10, cpu_n=2, output_dir="sample/")

DCM_Exp: loss of nodes

Hello

I've noticed that using the dcm_exp model can lead to a loss of nodes in a network.
In this case, the graph G starts with 3214 nodes and 36907 edges, while the sampled graph has 3109 nodes and 36825 edges. While the counts are close, reruns also produce different numbers of nodes and edges.

import numpy as np
import networkx as nx
from NEMtropy import DirectedGraph

G = nx.read_gml('Graphs/airlines.gml')

adj_g = nx.to_numpy_array(G) # ndarray: (3214, 3214)
edges = np.array(G.edges) # ndarray: (36907, 2)
graph_d = DirectedGraph(edgelist=edges)
graph_d.solve_tool(model="dcm_exp")
graph_d.ensemble_sampler(1, cpu_n=4, output_dir='temp/')

# rebuild a networkx graph from the sampled edge list
edgelist_dbcm = np.loadtxt("temp/0.txt", dtype=str)
G_RANDOMIZED = nx.DiGraph()
G_RANDOMIZED.add_edges_from(edgelist_dbcm)

Solution errors are in the 1e-10 range.

Using Python 3.8.12 and NEMtropy 2.0.6
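One plausible contributor to the node count, not confirmed anywhere in this thread: a graph rebuilt from an edge list alone cannot contain nodes that received no links in that particular sample. The sketch below shows the effect on a made-up four-node graph; `original` and `sampled_edges` are hypothetical stand-ins for G and the contents of temp/0.txt.

```python
import networkx as nx

# Toy original graph with four nodes (stands in for G).
original = nx.DiGraph()
original.add_edges_from([("a", "b"), ("b", "c"), ("c", "d")])

# Hypothetical sampled edge list in which node "d" received no links.
sampled_edges = [("a", "c"), ("b", "a")]

# Rebuilding from edges only silently drops the isolated node.
from_edges_only = nx.DiGraph()
from_edges_only.add_edges_from(sampled_edges)

# Adding the original node set first keeps isolated nodes in the graph.
with_all_nodes = nx.DiGraph()
with_all_nodes.add_nodes_from(original.nodes)
with_all_nodes.add_edges_from(sampled_edges)

print(from_edges_only.number_of_nodes(), with_all_nodes.number_of_nodes())
```

If this is what happens here, the number of edges would still vary between reruns, since each sample is a random draw from the ensemble.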

edgelist input bug

edgelist input bug in Undirected_graph_class.py:

    1. doesn't work with a list of one edge
    2. doesn't work with a list of more than 2 edges

Point 2 is a consequence of a recent hotfix. Proposed fix: replace line 968,

edgelist = list(zip(*edgelist))

with the line:

edgelist = [tuple(item) for item in edgelist]

For point 1, no idea; just add a conditional check.
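To make the behaviour of the two candidate lines concrete, here is a small pure-Python comparison on made-up edge lists. It only illustrates what each expression returns and does not reproduce the surrounding code of Undirected_graph_class.py.

```python
# Made-up edge lists used only for illustration.
three_edges = [[0, 1], [1, 2], [2, 3]]
one_edge = [[0, 1]]

# zip(*edgelist) transposes the list: it groups all sources together and
# all targets together instead of keeping one tuple per edge.
transposed_three = list(zip(*three_edges))
transposed_one = list(zip(*one_edge))

# The comprehension keeps one (source, target) tuple per edge.
tuples_three = [tuple(item) for item in three_edges]
tuples_one = [tuple(item) for item in one_edge]

print(transposed_three)  # [(0, 1, 2), (1, 2, 3)]
print(transposed_one)    # [(0,), (1,)]
print(tuples_three)      # [(0, 1), (1, 2), (2, 3)]
print(tuples_one)        # [(0, 1)]
```

With exactly two edges the transposed result still consists of two 2-tuples, which may explain why the problem only shows up for other list lengths.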

Question CREM-A

Hello,

I am trying to implement CReM-A and I was checking your code. I think I am not understanding a small part in the sampling procedure. You implemented the sampling as:

q_ensemble = 1/(beta_i + beta_j)
w_link = np.random.exponential(q_ensemble)

Therefore, as far as I understand the documentation of numpy.random.exponential (https://numpy.org/doc/stable/reference/random/generated/numpy.random.exponential.html), you sample from a process with rate q_ensemble and mean 1/q_ensemble = beta_i + beta_j.

But in the paper (Equation III.14), the weights are distributed according to an exponential distribution with rate beta_i + beta_j and mean 1/(beta_i + beta_j). That makes sense, since the expected values of the row and column sums then match the observed values (Equation III.18).

Is this correct?

And if so, could one speed up the code by solving for 1/(beta_i + beta_j) directly, transforming it into a linear problem? I mean solving a linear equation in terms of rates rather than means.
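For reference, numpy.random.exponential is parameterised by the scale (the mean, i.e. 1/rate) rather than by the rate, so passing 1/(beta_i + beta_j) yields samples whose mean is 1/(beta_i + beta_j). The check below is a minimal sketch with made-up values for `beta_i` and `beta_j`; it uses `Generator.exponential`, which takes the same `scale` argument.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up multipliers, chosen only for illustration.
beta_i, beta_j = 0.5, 1.5
rate = beta_i + beta_j      # rate of the exponential distribution
scale = 1.0 / rate          # what the snippet above calls q_ensemble

# The argument is the scale, so the sample mean should be close to `scale`
# (1/rate), not to `rate`.
samples = rng.exponential(scale, size=200_000)
sample_mean = samples.mean()

print(scale, sample_mean)
```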

Raise error for wrong model

If the model passed by the user to solve_tool is wrong, the error that is raised refers to the method. We should have two different errors: one for a wrong model and one for a wrong method.
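A sketch of what two separate checks could look like. The function name `check_solver_arguments` is hypothetical, and the sets below only contain the model and method names that appear on this page, so they are almost certainly incomplete.

```python
# Names collected from the issues on this page; assumed incomplete.
KNOWN_MODELS = {"cm_exp", "ecm_exp", "dcm_exp", "crema"}
KNOWN_METHODS = {"newton", "quasinewton", "fixed-point"}


def check_solver_arguments(model, method):
    """Validate model and method independently (hypothetical helper)."""
    if model not in KNOWN_MODELS:
        raise ValueError(
            f"Unknown model {model!r}; expected one of {sorted(KNOWN_MODELS)}."
        )
    if method not in KNOWN_METHODS:
        raise ValueError(
            f"Unknown method {method!r}; expected one of {sorted(KNOWN_METHODS)}."
        )


def error_message(model, method):
    """Return the validation error text, or None if both arguments pass."""
    try:
        check_solver_arguments(model, method)
    except ValueError as exc:
        return str(exc)
    return None


print(error_message("cm_expo", "newton"))
print(error_message("cm_exp", "newtn"))
```

Checking the model first means a typo in the model name is reported as such, even when the method is also invalid.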

multiprocessing not working on Windows

Multiprocessing, used in the ensemble_generator functions, does not work on Windows. Its use could be avoided when working on Windows or replaced with another parallel computation package.
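A rough sketch of the "avoid it on Windows" option mentioned above. `run_tasks` and `square` are hypothetical names, and the fallback rule (serial execution on Windows or when a single CPU is requested) is an assumption for illustration, not how the ensemble functions are written.

```python
import multiprocessing as mp
import platform


def square(value):
    """Toy worker; defined at module level so it can be pickled."""
    return value * value


def run_tasks(func, items, cpu_n=1):
    """Apply func to every item, in parallel only where it is known to work."""
    items = list(items)
    # Fall back to a plain loop on Windows or when one CPU is requested.
    if cpu_n <= 1 or platform.system() == "Windows":
        return [func(item) for item in items]
    with mp.Pool(processes=cpu_n) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    print(run_tasks(square, [1, 2, 3], cpu_n=1))
```

Windows starts worker processes with the spawn method, which re-imports the main module, so scripts that do use a pool there generally need the `if __name__ == "__main__":` guard shown at the bottom.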

missing jit

In ensemble_functions.py, the functions

  • std_dcm_3motif_11
  • std_dcm_3motif_12

are missing the call to jit before their definition.

Plans for BiECM?

Any plans to include the bipartite enhanced configuration model?

I know that the DECM can be applied to this case, technically. But, with the BiECM, there are only two variables associated with each node, allowing reduction in unique (degree, strength) pairs.

Thanks!

Limits of scalability of UECM/DECM models

In the paper, these models are tested on networks with ~187 nodes and ~196 nodes.

Has testing been done on larger networks to assess the scalability?

This part

Second, recipes like the UECM and the DECM are,
generally speaking, difficult to solve; as we have already
observed, only Newton’s method performs in a satisfactory
way, both for what concerns accuracy and speed:
hence, easier-to-solve recipes are welcome.

suggests they don't scale very well, but I'm wondering at what scale they become unworkable. Any estimates here?

Initial guess

Initial guess is an optional argument, but when it is not passed to solve_tool it raises an error.

Error in model_loglikelihood

The solution_array passed to model_loglikelihood in the case of exponential methods is wrong. We should pass the thetas' solution instead of the exponential of the solution. A possible solution is to differentiate the model_loglikelihood function between the two cases: exponential and not exponential.
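To illustrate why the parameterisation matters, the sketch below writes the log-likelihood of the undirected binary configuration model in both forms and checks that they agree only when each receives the right variables. It assumes the exponential variants solve for theta with x = exp(-theta); the function names and toy values are made up and do not mirror the actual model_loglikelihood code.

```python
import math
from itertools import combinations


def loglik_from_theta(theta, degrees):
    """Log-likelihood written in terms of theta (x = exp(-theta))."""
    ll = -sum(t * k for t, k in zip(theta, degrees))
    for i, j in combinations(range(len(theta)), 2):
        ll -= math.log1p(math.exp(-theta[i] - theta[j]))
    return ll


def loglik_from_x(x, degrees):
    """Same log-likelihood written in terms of x."""
    ll = sum(k * math.log(xi) for xi, k in zip(x, degrees))
    for i, j in combinations(range(len(x)), 2):
        ll -= math.log1p(x[i] * x[j])
    return ll


# Toy parameters and degree sequence, chosen only for illustration.
theta = [0.2, 0.7, 1.1]
degrees = [2, 1, 1]
x = [math.exp(-t) for t in theta]

print(loglik_from_theta(theta, degrees), loglik_from_x(x, degrees))
# Passing x where theta is expected gives a different, wrong value.
print(loglik_from_theta(x, degrees))
```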
