
akronomicon's People

Contributors

iacolippo, jsevillamol, slippylolo, zarathoustra0

akronomicon's Issues

Include explanation or specific source for compute estimates

Training compute is not always stated in a model's publication. For example, the AI21 Jurassic whitepaper does not report how much compute was used to train the models, yet the database lists 3708 petaflop-days (incidentally, the units are not clear from the JSON files alone, but that's a separate issue). I'm not sure what the source or calculation behind this number is.

Given that training compute estimates involve some assumptions and uncertainty, it would be good to include some explanation of how the compute was calculated. I realise it might be too cumbersome to put this in the database files themselves, but it could be e.g. in a README in each folder of the database, and/or in the commit messages when a specific database file is updated.
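To make the request concrete, here is a minimal sketch of an estimate that carries its own explanation. It assumes the commonly used heuristic of roughly 6 FLOPs per parameter per training token for dense transformers; that heuristic is an assumption of this sketch, not necessarily how the database's numbers were derived. The `ComputeEstimate` record, its field names, and the input values are all hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
# One petaflop-day = 1e15 FLOP/s sustained for one day = 8.64e19 FLOPs.
FLOPS_PER_PETAFLOP_DAY = 1e15 * SECONDS_PER_DAY


@dataclass
class ComputeEstimate:
    """Hypothetical record: the number plus how it was obtained."""
    petaflop_days: float
    method: str
    assumptions: str


def estimate_from_params_and_tokens(n_params: float, n_tokens: float) -> ComputeEstimate:
    # Heuristic (an assumption): training FLOPs ~= 6 * parameters * tokens.
    total_flops = 6.0 * n_params * n_tokens
    return ComputeEstimate(
        petaflop_days=total_flops / FLOPS_PER_PETAFLOP_DAY,
        method="6 * parameters * tokens",
        assumptions=f"n_params={n_params:g}, n_tokens={n_tokens:g}, dense transformer",
    )


# Placeholder inputs, not taken from any real model.
example = estimate_from_params_and_tokens(n_params=1e9, n_tokens=2e10)
print(f"{example.petaflop_days:.2f} PF-days via {example.method} ({example.assumptions})")
```

Storing the `method` and `assumptions` strings next to the number would answer both questions raised here: where the figure comes from and what unit it is in.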

Open source leaderboard

Would there be interest in an option to filter the leaderboard to open source models only? Models such as T5 and GPT-J, which have been released, deserve to be applauded for that. As a practical matter, researchers who want to apply these models in their work currently have to do some searching to find the largest open source models of each type.
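A rough sketch of what such a filter could look like, assuming each database entry is loaded as a dict. The `open_source` and `petaflop_days` keys are hypothetical; the actual JSON schema may use different names, and the entries below are placeholders rather than real models.

```python
def open_source_leaderboard(entries):
    """Keep only released models, ranked by training compute (largest first)."""
    # Entries without the (hypothetical) "open_source" flag count as closed.
    released = [e for e in entries if e.get("open_source", False)]
    return sorted(released, key=lambda e: e["petaflop_days"], reverse=True)


# Placeholder entries with made-up values, for illustration only.
entries = [
    {"name": "model-a", "open_source": True, "petaflop_days": 10.0},
    {"name": "model-b", "open_source": False, "petaflop_days": 500.0},
    {"name": "model-c", "open_source": True, "petaflop_days": 40.0},
]

ranked = open_source_leaderboard(entries)
for rank, entry in enumerate(ranked, start=1):
    print(rank, entry["name"], entry["petaflop_days"])
```

Defaulting a missing flag to closed keeps the filtered view conservative until every entry has been annotated.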

New Models

GPT-NeoX 20B is a new language model by EleutherAI trained on the Pile. It is a decoder model that is competent at both English and generating code.

The training code can be found here and the model weights will be released next week. It was trained using PyTorch and DeepSpeed on 96 A100 GPUs at CoreWeave. You can find the compute states here... sorry, you'll have to do the math for the total compute yo
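For whoever ends up doing that math, a hardware-based estimate could be sketched as follows. The GPU count comes from the post and 312 TFLOP/s is the A100's peak dense FP16/BF16 tensor-core throughput; the training duration and utilization are placeholders chosen only to show the arithmetic, not GPT-NeoX's actual figures.

```python
def hardware_petaflop_days(n_gpus: int, peak_pflops_per_gpu: float,
                           utilization: float, training_days: float) -> float:
    """Petaflop-days = GPUs * peak PFLOP/s per GPU * achieved fraction * days."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return n_gpus * peak_pflops_per_gpu * utilization * training_days


A100_PEAK_PFLOPS = 0.312  # peak dense FP16/BF16 tensor-core throughput

estimate = hardware_petaflop_days(
    n_gpus=96,                          # from the post
    peak_pflops_per_gpu=A100_PEAK_PFLOPS,
    utilization=0.4,                    # placeholder assumption
    training_days=30.0,                 # placeholder assumption
)
print(f"{estimate:.1f} petaflop-days under the placeholder assumptions")
```

Swapping in the real wall-clock time and measured utilization from the compute stats would turn this into an entry that documents its own assumptions.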
