Giter VIP home page Giter VIP logo

graphchallenge.jl's People

Contributors

jpfairbanks avatar rohitvarkey avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar

graphchallenge.jl's Issues

sparse matrix as dict

Here is an example type for how to embed a sparse 3 tensor into a dense matrix.

type Messages{T<:Real, G} <: AbstractArray{T, 3}
    nrows::Int
    ncols::Int
    edges::Dict{Tuple{Int, Int}, Int}
    states::UnitRange
    storage::Array{T, 2}
end

Base.size(m::Messages) = (m.nrows, m.ncols, length(m.states))
function Base.getindex(m::Messages, I::Vararg{Int})
    pair = I[1:2]
    if pair in keys(m.edges)
        return m.storage[m.edges[pair], I[3]]
    else
        throw(KeyError(pair))
    end
end

function Base.setindex!(m::Messages, v, I::Vararg{Int})
    pair = I[1:2]
    if pair in keys(m.edges)
        return m.storage[m.edges[pair], I[3]] = v
    else
        throw(KeyError(pair))
    end
end

Type Stability

Can the current test suite be used to check for type stability?

Linear algebra information

Geoff had some good ideas about our method.

This is a big problem in Algebraic Multigrid if C is a matrix mapping vertices to communities. Then we need to update C but read from a matrix C'AC. So our write optimized DS is a vector for C. and our Read optimized DS is a CSC for C'AC, but we will calculate a sparse update to C'AC by using the formula:

(C+D)'A(C+D) = C'AC + D'AC + C'AD + D'AD where for undirected graphs we can exploit symmetry.

We can batch updates up to the point where Time to compute (C+D)'A(C+D) < Time to compute C'AC + D'AC + C'AD + D'AD

We should compare with the streaming of A' = A+D where (C+d)'(A+D)(C+d)

Baseline

compare performance with serial baseline

Reduce memory traffic

Any place with copy, hcat, vcat, zip, list comprehension is a place where memory is allocated. Tracking these down and replacing them with reused buffers will improve performance.

Roadmap

Here is the roadmap for the semester

  • PGSQL implementation
  • Benchmark and profiling system
  • Pick challenge conforming output formats
  • Updating the graph leveraging old clusters
  • Parallel implementation
  • Paper Submission

RNG parallelism

Multithreaded calls to rand serialize. Each thread should have their own RNG.

Stinger behavior influencing partitioning algorithm

Some major restrictions imposed by stinger make this algorithm non straightforward to implement using STINGER

  1. The degrees pre computed in the vertex of STINGER are only a cumulative count of the number of edges. It does not automatically compute the sum of the weights of the edges. This means to compute the degree (sum of the weights of the edges), we actually need to iterate over all the edges of the graph, keep track of the direction and update the outdegree and indegree for the source and destination. Rather than doing this for every vertex separately, I update when the STINGER is created, and then whenever the updates are made to the STINGER. This again increases complexity and makes it harder to get right.
  2. Self loop removals make STINGER go into an infinite loop. The algorithm creates subgraphs with infinite loops. Trying to remove one of these hangs the entire algorithm. A workaround was to insert edges of 0 weights, but this results in inaccurate degree counts and more complexity to keep track of the removed edges. The current workaround I've adopted is to keep track of just the self loops counts in a Vector{Int64} stored along with the STINGER data structure and only insert and remove edges that aren't self loops. This does increase complexity in writing code using the structure though.
  3. Removing edges doesn't actually remove the edge from the structure. It only marks the direction as -1. While iterating through edges, it should be checked if the direction bit is -1, and those edges should be discarded.
  4. The edge weight of a back edge is only stored in the source vertex edgeblock. This leads to us having to iterate through the source vertices of all back edges to find weights for each of these back edge. We will need to find a workaround for this or it will be a huge detriment to performance.

writing

  • Intro
  • Related Work
    • Piexoto algorithm
  • What data structures we used
  • Serial vs Bulk
    • In serial you cant do anything
    • In parallel segregation of read and writes help on the data structure too
  • Theory from spreadsheet analysis #7
  • Theory of #17
  • Benchmark results
    • Include the observation that entropy does not decay with nodal shifts within mering periods, until the end phases.
  • Conclusions
    • X is the best DS
    • Writes time takes up a tiny fraction of runtime so it doesn't matter.
    • Broader statement about how to apply this analysis to other problems

Postgres Performance

We need to make the PGSQL implementation faster.

  • Reusing connections
  • Indexing on any field with a where clause
  • Check query plans to eliminate any sequential scans

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.