Baseball Databank

Baseball Databank is a compilation of historical baseball data in a convenient, tidy format, distributed under Open Data terms.

This work is licensed by Chadwick Baseball Bureau under the Creative Commons Attribution-ShareAlike 3.0 Unported License. For details see http://creativecommons.org/licenses/by-sa/3.0/

About this data

  • This is a legacy resource. Data in this format has been circulated by various people for many years, and many applications and tools consume data in this format. It is maintained by Chadwick Baseball Bureau to support compatibility with those tools and programs. As such, the schema is not open to amendments, either in terms of the scope of coverage or in terms of the data categories available.
  • This is a free resource. Statistical data will be updated once, at some point during each MLB offseason. To borrow the slogan used by ProMods, "It's ready when it's ready." New releases will be announced via our Twitter account at @chadwickbureau. We regret that we are not able to respond to enquiries about when new versions of the data will be released.
  • These data are maintained wholly by Chadwick Baseball Bureau, for the benefit of the community. Users who require data of a different scope, in a different format, and/or with more specific schedules for updates are encouraged to enquire about our various licensing options.

Organisation of the files

There are three directories in the repository.

  • core/ contains the databank itself. These files are automatically produced from our larger dataset.
  • contrib/ contains files which are manually maintained by others using the same identifier system as the core. We bundle these for the convenience of the community.
  • upstream/ contains files used to construct the databank.
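
As a rough illustration of working with this layout, the sketch below lists the CSV tables in a local copy of core/ and reads each file's header row. It assumes the tables are comma-separated files with a header line; the function name table_headers and the path in the usage lines are hypothetical and not part of the Databank itself.

```python
import csv
from pathlib import Path


def table_headers(core_dir):
    """Map each CSV file name in core_dir to its list of column names.

    Hypothetical helper for illustration; assumes each table is a
    comma-separated file whose first row holds the column names.
    """
    headers = {}
    for path in sorted(Path(core_dir).glob("*.csv")):
        # utf-8-sig tolerates a byte-order mark if one is present.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            # An empty file yields an empty header list.
            headers[path.name] = next(csv.reader(handle), [])
    return headers


if __name__ == "__main__":
    # Assumed location of a local clone; adjust to your own checkout.
    for name, columns in table_headers("baseballdatabank/core").items():
        print(name, columns)
```

Reading only the header row keeps the example independent of any particular table's columns, which are not described in this document.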

Maintenance and sources

Most of the data in the Databank are provided by Chadwick Baseball Bureau (http://www.chadwick-bureau.com). The data differ from the data the Bureau provides to its clients in that they contain less detail, are updated less frequently, and are provided on an as-is basis.

The Databank is historically based in part on the Lahman Baseball Database, version 2015-01-24, which is Copyright (C) 1996-2015 by Sean Lahman.

The tables Parks.csv and HomeGames.csv are based on the game logs and park code table published by Retrosheet. This information is available free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at http://www.retrosheet.org.

Enquiries and suggested revisions

Enquiries and suggested revisions to the data can be posted in the issue tracker at https://github.com/chadwickbureau/baseballdatabank/issues.

Files in core/ are all generated by scripts. As such they are not edited manually (and therefore pull requests should not be submitted against these files).

Files in upstream/ are manually maintained and contain information specific to constructing the Databank. Because they are maintained manually, pull requests containing corrections or additions to these files may be submitted.

Contributors

tturocy, orrski, seanlahman, heike, cbrou, mskeen, nickball, wclark17
