Algostat

Tools to find the most frequently used C++ algorithms on GitHub.

Results

The results for 3869 analyzed C++ repos are available in my Google Spreadsheet, or you can use results.csv directly.

Diagram of top 10 algorithms

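| algorithm   | sum    | avg |
|-------------|--------|-----|
| swap        | 108363 | 28  |
| find        | 81006  | 21  |
| count       | 60306  | 16  |
| move        | 57595  | 15  |
| copy        | 48050  | 12  |
| sort        | 33317  | 9   |
| max         | 28848  | 7   |
| equal       | 27467  | 7   |
| min         | 21720  | 6   |
| unique      | 18484  | 5   |
| lower_bound | 15017  | 4   |
| remove      | 13972  | 4   |
| replace     | 13262  | 3   |
| upper_bound | 11835  | 3   |
| for_each    | 11518  | 3   |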
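
The avg column is not defined in the text, but the numbers are consistent with the sum divided by the 3869 analyzed repos, rounded to the nearest integer. The sketch below checks that reading for the first three rows; the helper name `average_per_repo` is made up for illustration, and the relationship itself is an inference from the table rather than something the project states.

```python
# Sums copied from the table above; 3869 is the number of analyzed repos.
REPO_COUNT = 3869
sums = {"swap": 108363, "find": 81006, "count": 60306}


def average_per_repo(total, repo_count=REPO_COUNT):
    """Return occurrences per repo, rounded to the nearest integer."""
    return round(total / repo_count)


averages = {name: average_per_repo(total) for name, total in sums.items()}
print(averages)  # {'swap': 28, 'find': 21, 'count': 16}
```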

Usage

For best results, disable Python's input and output buffering.

export PYTHONUNBUFFERED=true
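
Buffering matters here because the scripts are chained with pipes: when standard output is a pipe, Python collects writes in a buffer, so the next stage may see nothing for a long time. As a rough illustration (not code from this project), a single pipeline stage could get a similar effect by flushing after every line. The function name `relay` is hypothetical.

```python
import sys


def relay(lines, out=sys.stdout):
    """Write each line and flush right away so a downstream pipe sees it."""
    count = 0
    for line in lines:
        # Normalize the line ending so every record is newline-terminated.
        out.write(line if line.endswith("\n") else line + "\n")
        out.flush()
        count += 1
    return count
```

Setting PYTHONUNBUFFERED once, as shown above, avoids having to add such flushes to every script.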

Analyze top C++ repos on GitHub

Analyze the top C++ repos on GitHub and create a CSV file.

./top-github-repos.py | ./algostat.py | ./create-csv.py > results.csv
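
To give a feel for what counting algorithm usage can look like, here is a deliberately naive sketch. It is an assumption for illustration only and does not reproduce how algostat.py works: the algorithm list, the regular expression, and the function name `count_algorithms` are all made up.

```python
import re
from collections import Counter

# Hypothetical subset of algorithm names to look for.
ALGORITHMS = ("swap", "find", "count", "sort", "for_each")

# Match a call such as `std::sort(` or `sort(`, but not `find_if(` or `my_sort(`.
CALL_PATTERN = re.compile(r"\b(?:std::)?(" + "|".join(ALGORITHMS) + r")\s*\(")


def count_algorithms(source):
    """Count calls to the listed algorithm names in a C++ source string."""
    return Counter(CALL_PATTERN.findall(source))


example = "std::sort(v.begin(), v.end());\nstd::swap(a, b);\nsort(w.begin(), w.end());"
print(count_algorithms(example))
```

A text match like this also counts member functions with the same name (for example `v.count(x)`), so real numbers would need a more careful approach.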

Analyze all C++ repos on GitHub

Analyze all C++ repos listed in GHTorrent.

cat cpp_repos.txt | ./algostat.py | ./create-csv.py > results.csv

Distributed analysis with a Redis queue and workers

Use a Redis queue to distribute jobs among workers and then fetch the results. Set the ALGOSTAT_RQ_HOST and ALGOSTAT_RQ_PORT environment variables to the address of the Redis server.

export ALGOSTAT_RQ_HOST="localhost"
export ALGOSTAT_RQ_PORT="6379"
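
A script can pick up these two variables roughly as follows. This is a minimal sketch, not the project's own code: the function name `redis_address` and the fallback values (taken from the example exports above) are assumptions.

```python
import os


def redis_address(env=os.environ):
    """Read the Redis host and port from the environment.

    The defaults mirror the example values above; they are assumed here,
    not documented defaults of the scripts.
    """
    host = env.get("ALGOSTAT_RQ_HOST", "localhost")
    port = int(env.get("ALGOSTAT_RQ_PORT", "6379"))
    return host, port
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to try out without touching the real environment.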

Next, fill the job queue with the top GitHub repos and the repos listed in GHTorrent, removing duplicates.

./top-github-repos.py >> jobs.txt
cat cpp_repos.txt >> jobs.txt
sort -u jobs.txt | ./enqueue-jobs.py
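
The `sort -u` step matters because the two sources can list the same repo twice, and every duplicate job would be analyzed again. The same idea in Python looks roughly like this; the function name `unique_jobs` and the `owner/name` line format are assumptions for illustration, and unlike `sort -u` this version also drops blank lines.

```python
def unique_jobs(lines):
    """Return the distinct, non-empty job lines in sorted order."""
    return sorted({line.strip() for line in lines if line.strip()})


jobs = ["b/x\n", "a/y\n", "b/x\n", "\n"]  # hypothetical repo names
print(unique_jobs(jobs))  # ['a/y', 'b/x']
```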

On each worker, tell algostat.py to fetch jobs from the Redis queue and store its results in a results queue.

./algostat.py --rq | ./enqueue-results.py

After that, aggregate the results into a single CSV file.

./fetch-results.py | ./create-csv.py > results.csv

Installation

  1. Make sure you have Python 3 installed
  2. Clone the repository
  3. Install requirements with pip install -r requirements.txt

Using Docker for Deployment

You can use Docker to run the application in a distributed setup.

Redis

Run the Redis server.

docker run --name redis -p 6379:6379 -d sameersbn/redis:latest

Get the IP address of your Redis server and assign it to the ALGOSTAT_RQ_HOST environment variable in all of the following docker run commands. This example uses 104.131.5.11.

Get the image

I have already set up an automated build, lukasmartinelli/algostat, which you can use.

docker pull lukasmartinelli/algostat

Alternatively, clone the repo and build the Docker image yourself.

docker build -t lukasmartinelli/algostat .

Fill job queue

docker run -it --rm --name queue-filler \
-e ALGOSTAT_RQ_HOST=104.131.5.11 \
-e ALGOSTAT_RQ_PORT=6379 \
lukasmartinelli/algostat bash -c "cat cpp_repos.txt | ./enqueue-jobs.py"

Run the workers

Start as many workers as you like. Give each container a unique name (worker1, worker2, and so on).

docker run -it --rm --name worker1 \
-e ALGOSTAT_RQ_HOST=104.131.5.11 \
-e ALGOSTAT_RQ_PORT=6379 \
lukasmartinelli/algostat bash -c "./algostat.py --rq | ./enqueue-results.py"

Aggregate results

Note that this step is not repeatable: once you have aggregated the results, the Redis list will be empty.

docker run -it --rm --name result-aggregator \
-e ALGOSTAT_RQ_HOST=104.131.5.11 \
-e ALGOSTAT_RQ_PORT=6379 \
lukasmartinelli/algostat bash -c "./fetch-results.py | ./create-csv.py"
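
The one-shot behaviour follows from how a queue is read: each fetched result is removed from it. The sketch below imitates that with an in-memory `deque` standing in for the Redis list. It is an illustration under that assumption, not the code of fetch-results.py, and the sample records are invented.

```python
from collections import deque


def drain(results):
    """Pop every item off the queue, leaving it empty (a destructive read)."""
    fetched = []
    while results:
        fetched.append(results.popleft())
    return fetched


results = deque(["repo-a: 3", "repo-b: 5"])  # hypothetical stand-in data
first = drain(results)
second = drain(results)
print(first)   # ['repo-a: 3', 'repo-b: 5']
print(second)  # []
```

Because a second run finds nothing left, it is worth capturing the aggregated output in a file the first time.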
