okkevaneck / distributed_systems_lab

Python 69.13% Shell 30.87%

distributed_systems_lab's Issues

implement script to do analysis on resulting graphs

Use an existing graph library to perform the graph analysis.

Vertex IDs in Graphalytics dataset are random numbers

Vertex IDs in the Graphalytics dataset are actually random numbers; they do not start at some number n and end at n + n_vertices. This leads to a bug in the converter.
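One possible way around this, sketched under the assumption that the converter only needs dense indices: map every vertex ID to a contiguous index before converting. The names `build_id_map` and `remap_edges` are hypothetical and not taken from the repository.

```python
def build_id_map(vertex_ids):
    """Map arbitrary vertex IDs to contiguous indices 0..n-1 in first-seen order."""
    id_map = {}
    for vid in vertex_ids:
        if vid not in id_map:
            id_map[vid] = len(id_map)
    return id_map


def remap_edges(edges, id_map):
    """Rewrite (src, dst) pairs so both endpoints use the dense indices."""
    return [(id_map[src], id_map[dst]) for src, dst in edges]


# Made-up IDs that are neither sorted nor contiguous.
raw_vertices = [9041, 17, 300512]
raw_edges = [(9041, 17), (17, 300512)]

id_map = build_id_map(raw_vertices)
dense_edges = remap_edges(raw_edges, id_map)
```

Keeping `id_map` around makes it possible to translate the dense indices back to the original IDs when writing the result.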

Add a decorator to measure the execution time of certain functions

Multiple functions need the decorator.

graph.spread should probably be measured

List how many nodes have been burned per X time units.
The compute node should measure how many messages are sent and how many messages are received.
The head node should keep track of when heartbeats are received and how many nodes are burned per heartbeat.
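A minimal sketch of such a decorator, assuming timings are collected in a module-level dictionary. `TIMINGS`, `timed`, and the stand-in `spread` function are hypothetical names, not the project's code.

```python
import functools
import time
from collections import defaultdict

# Hypothetical registry: function name -> list of elapsed times in seconds.
TIMINGS = defaultdict(list)


def timed(func):
    """Record the wall-clock duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            TIMINGS[func.__name__].append(time.perf_counter() - start)
    return wrapper


@timed
def spread(n):
    # Stand-in for the real graph.spread; it only burns some CPU time.
    return sum(range(n))


result = spread(1000)
```

Because the decorator uses `functools.wraps`, the decorated function keeps its original name, which is what the registry uses as its key.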

Make a Head Node Class

Head Node is responsible for

Stitching together all graphs sampled from the compute nodes
listening for heartbeats
- send kill signal to compute nodes to stop burning
writing resulting graph to a file

Implement MPI communication in compute node classes

Issue to implement MPI communication in compute node classes

Implement a thread to listen for communication from other compute nodes
Implement a thread to listen for communication from the HEAD node telling the compute node to stop
Implement a function to send the head node heartbeats that communicate how many nodes have been sampled by the compute node
Implement function to listen for restart call

Implement functionality so that Compute Nodes can be assigned multiple partitions

Because we will have precomputed partitions with clustering, we will need each compute node to handle multiple partitions.

(E.g. 8 partitions for a graph and 2 compute nodes means each compute node gets 4 partitions.)

This task is to implement that functionality
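A sketch of one possible assignment, assuming round-robin distribution. The issue does not say which partitions go to which node, and `assign_partitions` is a hypothetical helper.

```python
def assign_partitions(n_partitions, n_compute_nodes):
    """Distribute partition indices over compute nodes in round-robin order."""
    assignment = {node: [] for node in range(n_compute_nodes)}
    for partition in range(n_partitions):
        assignment[partition % n_compute_nodes].append(partition)
    return assignment


# The example from the issue: 8 partitions over 2 compute nodes.
assignment = assign_partitions(8, 2)
```

With clustered partitions it may be preferable to give each node neighbouring partitions; that would only change the loop body.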

Create vertex -> Vertex_status dictionary in Graph.py

for faster set_vertex_status calls
and faster get_vertex_status calls

shouldn't these functions be in the vertex class anyway?
What am I doing?
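A sketch of the dictionary approach, assuming a status enum with three states. `VertexStatus` and its member names are hypothetical; the real Graph.py may differ.

```python
from enum import Enum


class VertexStatus(Enum):
    # Hypothetical states for the forest fire sampling.
    NOT_BURNED = 0
    BURNING = 1
    BURNED = 2


class Graph:
    def __init__(self):
        # vertex ID -> VertexStatus, giving average O(1) lookups and updates.
        self.vertex_status = {}

    def add_vertex(self, vertex_id):
        # Leave the status untouched if the vertex is already known.
        self.vertex_status.setdefault(vertex_id, VertexStatus.NOT_BURNED)

    def set_vertex_status(self, vertex_id, status):
        self.vertex_status[vertex_id] = status

    def get_vertex_status(self, vertex_id):
        return self.vertex_status[vertex_id]


graph = Graph()
graph.add_vertex(42)
graph.set_vertex_status(42, VertexStatus.BURNED)
```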

Compute Node might not close correctly

If the ComputeNode is blocking on a send (heartbeat) after the head has sent a kill request, the head will never accept that send, and thus the ComputeNode won't stop by itself.
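The general shape of a fix can be sketched without MPI: never block indefinitely on the send, and re-check a stop flag between attempts. The sketch below swaps MPI for the standard library's `queue` and `threading`, so it only illustrates the pattern; `send_heartbeats` and `poll_interval` are hypothetical names.

```python
import queue
import threading


def send_heartbeats(channel, stop_event, poll_interval=0.01):
    """Send heartbeats until stop_event is set, without blocking forever."""
    sent = 0
    while not stop_event.is_set():
        try:
            # Bounded wait: if the receiver stopped reading we time out
            # and get another chance to see the stop flag.
            channel.put(("heartbeat", sent), timeout=poll_interval)
            sent += 1
        except queue.Full:
            continue
    return sent


# A channel that nobody drains, mimicking a head node that stopped receiving.
channel = queue.Queue(maxsize=1)
stop_event = threading.Event()
result = []

worker = threading.Thread(
    target=lambda: result.append(send_heartbeats(channel, stop_event))
)
worker.start()
stop_event.set()        # the "kill request"
worker.join(timeout=2)  # the sender exits even though its send was never accepted
```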

Fix graph interpretation.

We should read 2 graph files.
a .v file, and a .e file.

Changes to graph interpreter and Graph.add_vertex_and_neighbor
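A sketch of reading the two files, under the assumption that each line of the .v file starts with a vertex ID and each line of the .e file starts with a source and a destination ID separated by whitespace, with any further columns ignored. `read_graph` is a hypothetical function and stores edges in the source-to-destination direction only.

```python
import io


def read_graph(v_file, e_file):
    """Build an adjacency dict from a vertex file and an edge file."""
    adjacency = {}
    for line in v_file:
        fields = line.split()
        if fields:  # skip blank lines
            adjacency.setdefault(int(fields[0]), set())
    for line in e_file:
        fields = line.split()
        if len(fields) >= 2:
            src, dst = int(fields[0]), int(fields[1])
            adjacency.setdefault(src, set()).add(dst)
            adjacency.setdefault(dst, set())  # make sure dst is known as well
    return adjacency


# In-memory stand-ins for a .v and a .e file.
v_file = io.StringIO("1\n2\n3\n")
e_file = io.StringIO("1 2\n2 3\n")
adjacency = read_graph(v_file, e_file)
```

The function accepts any iterable of lines, so real files opened with `open(path)` can be passed in the same way.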

Improve burned edge aggregation in heartbeats

It's pretty underperformant right now

develop upscaling algorithm

To upscale x2:

run the downscaling (0.5) algorithm 4 times
Then stitch together all (0.5) samples.

The head node sends a restart command when a downscaling iteration is done (i.e. 0.5 of the nodes collected).
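A rough sketch of the idea, under two loudly stated assumptions: the sampler here is a uniform random placeholder rather than the forest fire algorithm, and samples are stitched by tagging each vertex with its iteration index so the four samples stay disjoint. How the project actually stitches samples is not described in the issue, and `sample_half` and `upscale_x2` are hypothetical names.

```python
import random


def sample_half(vertices, edges, rng):
    """Placeholder sampler: keep a random half of the vertices and the induced edges."""
    kept = set(rng.sample(sorted(vertices), len(vertices) // 2))
    kept_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, kept_edges


def upscale_x2(vertices, edges, seed=0):
    """Run four 0.5 samples and stitch them: 4 * 0.5 * |V| = 2 * |V| vertices."""
    rng = random.Random(seed)
    out_vertices, out_edges = set(), []
    for iteration in range(4):
        kept, kept_edges = sample_half(vertices, edges, rng)
        # Tag each vertex with the iteration so samples do not collide.
        out_vertices.update((iteration, v) for v in kept)
        out_edges.extend(((iteration, s), (iteration, d)) for s, d in kept_edges)
    return out_vertices, out_edges


# A ring of 10 vertices as toy input.
vertices = set(range(10))
edges = [(v, (v + 1) % 10) for v in range(10)]
up_vertices, up_edges = upscale_x2(vertices, edges)
```

This sketch leaves the four samples as disconnected components; a real stitching step would presumably also add edges between them.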

move Compute node machine_with_vertex function to a simulation file

So when we run halted vs wild, we can just pass a different machine_with_vertex function to the compute node.
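A sketch of what passing the function in could look like. The constructor signature and both placement functions are hypothetical; in particular, modulo placement for the wild fire and "always stay local" for the halted fire are assumptions made only for illustration.

```python
class ComputeNode:
    def __init__(self, rank, n_machines, machine_with_vertex):
        self.rank = rank
        self.n_machines = n_machines
        # Injected strategy: (node, vertex_id) -> rank of the owning machine.
        self.machine_with_vertex = machine_with_vertex

    def owner_of(self, vertex_id):
        return self.machine_with_vertex(self, vertex_id)


def wild_machine_with_vertex(node, vertex_id):
    # Assumed placement: vertices are spread over machines by ID modulo.
    return vertex_id % node.n_machines


def halted_machine_with_vertex(node, vertex_id):
    # Assumed behaviour: a halted fire never leaves its own machine.
    return node.rank


wild_node = ComputeNode(1, 4, wild_machine_with_vertex)
halted_node = ComputeNode(1, 4, halted_machine_with_vertex)
```

The simulation file then only needs to export the two functions; the ComputeNode class itself stays unchanged between the halted and wild runs.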

Determine clustered partitions of input graphs

For halted fires we want clustering in the partitions

Implement functionality to run simulations locally

Manage differences in environment variables (e.g. a "LOCAL" variable).
Create all the compute node classes for each simulated node.
Run each node instance in its own thread.
handle communication between compute nodes with MPI??
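A sketch of a local run that uses threads and in-process queues as a stand-in for MPI, which the issue leaves as an open question. The accepted values of the "LOCAL" variable and the names `is_local_run` and `run_local_simulation` are assumptions.

```python
import os
import queue
import threading


def is_local_run(environ=None):
    """Treat LOCAL=1/true/yes as a request to simulate all nodes on one machine."""
    environ = os.environ if environ is None else environ
    return environ.get("LOCAL", "").lower() in {"1", "true", "yes"}


def run_local_simulation(n_nodes):
    """Run each simulated node in its own thread, with one inbox queue per node."""
    inboxes = [queue.Queue() for _ in range(n_nodes)]
    received = [None] * n_nodes

    def node_main(rank):
        # Each node sends its rank to the next node, then reads one message.
        inboxes[(rank + 1) % n_nodes].put(rank)
        received[rank] = inboxes[rank].get(timeout=5)

    threads = [threading.Thread(target=node_main, args=(rank,)) for rank in range(n_nodes)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received


received = run_local_simulation(3)
```

Each node receives the rank of its predecessor in the ring, so the same send and receive calls could later be swapped for their MPI counterparts.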

OSError: [Errno 122] Disk quota exceeded

Current data storage structure:

data/
|
| ---- kgs/
|        |
|        | ---- kgs-2-partitions/
|        |        |
|        |        | ---- node1.e
|        |        |
|        |        | ---- node1.p
|        |        |
|        |        | ---- node2.p 
|        |        |
|        |        | ---- node2.e
...
...
|        | ---- kgs-16-partitions/
|        |        |
|        |        | ---- node1.e
|        |        |
|        |        | ---- node1.p
...
...

|        |        | ---- node16.p 
|        |        |        
|        |        | ---- node16.e

Update:
Memory is still a problem, especially when partitioning a large graph (such as the S-scale LDBC datasets).

Halted forest fire script vs wild fire script

Compute nodes are responsible for sending messages to other compute nodes

In a halted forest fire there are two ideas that need to be implemented.

Don't pass messages
Read from assigned partition files. <- might be its own task
Run algorithm on assigned graph
Parse scale up call vs scale down call, what the percentage of the scale up/down call is.
