git_statistics's Introduction

git_statistics's People

Contributors

kevinjalbert, mdespuits

git_statistics's Issues

Object not found - no matching loose object

Installed the latest git_statistics (v0.8.0):

gem install git_statistics
Fetching: rugged-0.23.3.gem (100%)
Building native extensions.  This could take a while...
Successfully installed rugged-0.23.3
Fetching: language_sniffer-1.0.2.gem (100%)
Successfully installed language_sniffer-1.0.2
Fetching: git_statistics-0.8.0.gem (100%)
Successfully installed git_statistics-0.8.0
3 gems installed

When running from the command line I get the following error:

git_statistics-0.8.0/lib/git_statistics/collector.rb:17:in `each': Object not found - no matching loose object

ASCII Art Graphs

It would be totally ballin' if you got some ASCII art graphs going on. :)
Just a thought!

Move from mojombo/grit to libgit2/rugged

While the move to mojombo/grit has been great (in both performance and ease of collection), grit currently has trouble with GPG-signed commits. The solution would be to move to libgit2/rugged.

In addition, grit is not ready for Ruby 2.0.x. Switching to rugged would also let us support Ruby 2.0.x.

Pipe spec is failing only on Travis CI

The builds on Travis CI are failing now, even though nothing is explicitly breaking our specs. The culprit seems to be a Pipe spec:

GitStatistics::Pipe#io 
  Failure/Error: it { pipe.io.should be_an IO }
  Errno::ENOENT:
  No such file or directory - time
   # ./lib/git_statistics/pipe.rb:24:in `open'
   # ./lib/git_statistics/pipe.rb:24:in `io'
   # ./spec/pipe_spec.rb:56:in `block (3 levels) in <top (required)>'

Only thing I really noticed is that Travis CI has changed its gem --system version from 1.8.x to 2.0.x. I updated my system gem's version to 2.0.x and I still can't reproduce the error.

I'm going to make another issue for this so we can resolve the problem. It looks like 'time' might be interpreted as a file or directory instead of a command, though I'm not sure why that would be the case now.

Add specs for existing source

A major refactoring was needed. New features were added as well (e.g., #1).

These changes rendered the old specs useless. This issue is to bring back specs for the old and new functionality. In addition, if any refactoring can be done to improve the maintainability then it should also be done in accordance with the new specs.

For commits acquire their source branch

In some situations it might be useful to know what branch a commit belongs to. This information can be further expressed in the summary results (e.g., which branch is most actively developed). This issue is about collecting the branch information and storing it within the commit JSON files.

As commits can reside in multiple branches at the same time (e.g., merged/rebased commits), this information will list the source branch (according to the git log --source command, which indicates how the commit can be reached).
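
As a rough illustration of how the source branch could be attached to each commit, the sketch below assumes the log output has already been reduced to lines of the form `<sha><TAB><ref>`. That line layout, the sample data, and the helper name `parse_source_branches` are all assumptions made for this example, not code from git_statistics.

```ruby
# Map each commit SHA to the branch it was reached from.
# ASSUMPTION: every input line looks like "<sha>\t<ref>"; the exact git
# invocation that produces such lines is deliberately left open here.
def parse_source_branches(log_output)
  log_output.each_line.with_object({}) do |line, sources|
    sha, ref = line.chomp.split("\t", 2)
    next if sha.nil? || sha.empty? || ref.nil?

    # Drop the "refs/heads/" prefix so only the short branch name is kept.
    sources[sha] = ref.sub(%r{\Arefs/heads/}, "")
  end
end

# Hypothetical sample input.
sample = "a1b2c3\trefs/heads/master\nd4e5f6\trefs/heads/issue_5\n"
sources = parse_source_branches(sample)
puts sources.inspect
```

The resulting hash could then be merged into each commit's data before it is written out as JSON.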

Consistent Style -- Use a Static Analysis Tool

It might be worth having a static analysis tool to ensure that the code base (and future contributions) share a similar style. For example, tailor will check style and report any violations of a set of rules.

Add proper logging

Currently puts is used to log specific situations. A proper logging gem should be used instead, possibly one that writes to a file.
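
One possible starting point, before reaching for a separate gem, is Ruby's standard-library `Logger`. The sketch below is only an illustration: the helper name `build_logger` and the program name are assumptions, and the target can be swapped for a file path to log to disk.

```ruby
require "logger"
require "stringio"

# Build a logger that writes to any IO-like target. Passing a file path
# (for example "git_statistics.log", a hypothetical name) would log to a
# file instead.
def build_logger(target = $stdout, level = Logger::INFO)
  logger = Logger.new(target)
  logger.level = level
  logger.progname = "git_statistics"
  logger
end

# Log into an in-memory buffer so the example has no side effects.
buffer = StringIO.new
log = build_logger(buffer)
log.info("Collected 42 commits")
log.debug("This line is dropped because the level is INFO")
puts buffer.string
```

Severity levels make it possible to keep verbose diagnostics in the code while hiding them from normal runs.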

Identify languages involved in each commit

We want to identify the languages of the files involved in each commit. This will allow authors to be associated with the normal attributes, but now broken down by specific language (e.g., find the author who wrote the most Python).

We could simply use file extensions as a language indicator, though there are likely to be many problems with this approach. We can use a language detection tool like github/linguist instead, as its approach is more sophisticated and accurate. The only problem is that it requires the actual file (git blob?). This means we essentially have to iterate over every commit, revert the files to their state at that commit, and then identify the language. The mentioned library can also detect vendored and generated files, so we can exclude those from our calculations.
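
For comparison, the naive extension-based approach might look like the sketch below. The extension table and the helper names are hypothetical, and the example shows the weakness mentioned above: a file without an extension cannot be classified.

```ruby
# HYPOTHETICAL extension table; a content-based detector would not rely on it.
EXTENSION_LANGUAGES = {
  ".rb" => "Ruby",
  ".py" => "Python",
  ".js" => "JavaScript",
  ".c"  => "C"
}.freeze

# Guess a language from the file extension alone.
def language_for(path)
  EXTENSION_LANGUAGES.fetch(File.extname(path).downcase, "Unknown")
end

# Count how many of the given paths fall under each language.
def tally_languages(paths)
  paths.each_with_object(Hash.new(0)) do |path, counts|
    counts[language_for(path)] += 1
  end
end

counts = tally_languages(["lib/collector.rb", "script/stats.py", "README"])
puts counts.inspect
```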

Sort languages in output

Right now the languages are unsorted (listed in order of appearance). Ideally they should be sorted either by the sort criteria (which would result in a different ordering for each author) or alphabetically (a consistent ordering for every author).
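
Both orderings are short to express in Ruby. The sketch below assumes the per-author language data is a hash of language name to a stats hash; that shape, the sample numbers, and the helper names are assumptions made for illustration.

```ruby
# HYPOTHETICAL per-author language data: language name => stats.
languages = {
  "Ruby"   => { additions: 120, deletions: 30 },
  "C"      => { additions: 400, deletions: 10 },
  "Python" => { additions: 15,  deletions: 5 }
}

# Same order for every author.
def sort_alphabetically(languages)
  languages.sort_by { |name, _stats| name.downcase }.to_h
end

# Largest value first for the chosen criterion; order differs per author.
def sort_by_criteria(languages, key)
  languages.sort_by { |_name, stats| -stats.fetch(key) }.to_h
end

puts sort_alphabetically(languages).keys.inspect
puts sort_by_criteria(languages, :additions).keys.inspect
```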

Reduce memory used for large projects

On large projects (e.g., torvalds/linux) there are so many commits that git_statistics will run out of memory. This is because all the commit data is held in memory.

To avoid this problem, the commits should be processed such that we only keep x commits in memory at a time. We can do this by keeping only cumulative data for the commits we process (i.e., read a commit and then toss it), though this will not allow us to save any of the data. Alternatively, we can append commits to a file in batches. This way only x commits are in memory, yet we still have a complete collection of commits in a JSON file for loading/updating.

In the second approach during the results phase (cumulative data is gathered) we would read x number of commits in memory and collect the data, then read the next x and so forth.

Overall this approach should reduce memory consumption, at the cost of a little extra I/O.
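
A minimal sketch of the batched approach follows. It assumes one JSON object per line (a JSON Lines layout, chosen here for easy appending and streaming) and a hypothetical commit shape with an `additions` field; neither is taken from git_statistics itself.

```ruby
require "json"
require "tempfile"

# Append commits to a file batch by batch so that at most batch_size
# commits are held in memory at once.
# ASSUMPTION: one JSON object per line rather than a single JSON array.
def write_in_batches(commits, path, batch_size)
  File.open(path, "a") do |file|
    commits.each_slice(batch_size) do |batch|
      file.write(batch.map { |commit| JSON.generate(commit) }.join("\n") + "\n")
    end
  end
end

# Results phase: stream the file back line by line and keep only the
# cumulative total.
def total_additions(path)
  File.foreach(path).inject(0) do |sum, line|
    sum + JSON.parse(line).fetch("additions")
  end
end

# Hypothetical commit source that yields commits one at a time.
commits = Enumerator.new do |out|
  1.upto(5) { |n| out << { "sha" => "commit#{n}", "additions" => n } }
end

file = Tempfile.new("commits")
write_in_batches(commits, file.path, 2)
total = total_additions(file.path)
line_count = File.foreach(file.path).count
puts "#{line_count} commits, #{total} additions"
```

Because the commit source is an enumerator, commits are produced only as each batch is filled, which is what keeps the memory footprint bounded.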

Crashes when repository is in a detached HEAD state

When the repository to be analyzed is in a detached HEAD state (e.g., switched to a specific tag or commit), the following will occur:

sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `git --no-pager log (no branch) general-cleanup issue_5 master --date=iso --reverse --no-color --find-copies-harder --numstat --encoding=utf-8 --summary   --format="%H,%an,%ae,%ad,%p"'
/Users/jalbert/.rvm/gems/ruby-1.9.3-p362/gems/git_statistics-0.5.1/lib/git_statistics/formatters/console.rb:17:in `prepare_result_summary': Parameter for --sort is not valid (RuntimeError)
    from /Users/jalbert/.rvm/gems/ruby-1.9.3-p362/gems/git_statistics-0.5.1/lib/git_statistics/formatters/console.rb:35:in `print_summary'
    from /Users/jalbert/.rvm/gems/ruby-1.9.3-p362/gems/git_statistics-0.5.1/lib/git_statistics.rb:47:in `execute'
    from /Users/jalbert/.rvm/gems/ruby-1.9.3-p362/gems/git_statistics-0.5.1/bin/git_statistics:5:in `<top (required)>'
    from /Users/jalbert/.rvm/gems/ruby-1.9.3-p362/bin/git_statistics:23:in `load'
    from /Users/jalbert/.rvm/gems/ruby-1.9.3-p362/bin/git_statistics:23:in `<main>'

As we can see, the (no branch) entry from the git branch command is being passed along and is causing the problem. We should probably just ignore (no branch) and continue to grab statistics from the other valid branches.
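
One way to skip the placeholder is to filter the branch list before building the git log command. The sketch below works on captured `git branch` output; the helper name and the sample text (modeled on the branch names in the error above) are assumptions. It also drops the "(HEAD detached at ...)" form that newer Git versions print.

```ruby
# Matches placeholder entries that `git branch` prints while HEAD is
# detached, such as "(no branch)" or "(HEAD detached at v1.0)".
DETACHED_PLACEHOLDER = /\A\(.*\)\z/

# Return only real branch names from `git branch` style output.
def valid_branches(branch_output)
  branch_output.each_line
               .map { |line| line.sub(/\A\*?\s+/, "").strip }
               .reject { |name| name.empty? || name =~ DETACHED_PLACEHOLDER }
end

# Hypothetical captured output while in a detached HEAD state.
sample = "* (no branch)\n  general-cleanup\n  issue_5\n  master\n"
branches = valid_branches(sample)
puts branches.inspect
```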
