
history-of-deep-learning's Introduction

Speedrun implementation of the History of Deep Learning (inspired by "adam-maj", with a few more papers added).

Three stages of implementation: from scratch, in PyTorch, and in JAX (some, not all).

Total count: 9/32

01-deep-neural-networks

Concept Complete
BackPropagation
CNN
AlexNet
U-net

02-optimization-and-regularization

Concept Complete
weights-decay
relu
residuals
dropout
batch-norm
layer-norm
gelu
adam
early-stopping

03-sequence-modeling

Concept Complete
rnn
lstm
learning-to-forget
word2vec
seq2seq
attention
mixture-of-experts

04-transformer

Concept Complete
transformer
bert
t5
gpt
lora
rlhf
vision-transformer

05-image-generation

Concept Complete
gans
vae
diffusion
clip
dall-e

Papers

  • DNN - Learning Internal Representations by Error Propagation (1987), D. E. Rumelhart et al. [PDF]
  • CNN - Backpropagation Applied to Handwritten Zip Code Recognition (1989), Y. LeCun et al. [PDF]
  • LeNet - Gradient-Based Learning Applied to Document Recognition (1998), Y. LeCun et al. [PDF]
  • AlexNet - ImageNet Classification with Deep Convolutional Neural Networks (2012), A. Krizhevsky et al. [PDF]
  • U-Net - U-Net: Convolutional Networks for Biomedical Image Segmentation (2015), O. Ronneberger et al. [PDF]
  • Weight Decay - A Simple Weight Decay Can Improve Generalization (1991), A. Krogh and J. Hertz [PDF]
  • ReLU - Deep Sparse Rectifier Neural Networks (2011), X. Glorot et al. [PDF]
  • Residuals - Deep Residual Learning for Image Recognition (2015), K. He et al. [PDF]
  • Dropout - Dropout: A Simple Way to Prevent Neural Networks from Overfitting (2014), N. Srivastava et al. [PDF]
  • BatchNorm - Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015), S. Ioffe and C. Szegedy [PDF]
  • LayerNorm - Layer Normalization (2016), J. Lei Ba et al. [PDF]
  • GELU - Gaussian Error Linear Units (GELUs) (2016), D. Hendrycks and K. Gimpel [PDF]
  • Adam - Adam: A Method for Stochastic Optimization (2014), D. P. Kingma and J. Ba [PDF]
  • RNN - A Learning Algorithm for Continually Running Fully Recurrent Neural Networks (1989), R. J. Williams and D. Zipser [PDF]
  • LSTM - Long Short-Term Memory (1997), S. Hochreiter and J. Schmidhuber [PDF]
  • Learning to Forget - Learning to Forget: Continual Prediction with LSTM (2000), F. A. Gers et al. [PDF]
  • Word2Vec - Efficient Estimation of Word Representations in Vector Space (2013), T. Mikolov et al. [PDF]
  • Phrase2Vec - Distributed Representations of Words and Phrases and their Compositionality (2013), T. Mikolov et al. [PDF]
  • Encoder-Decoder - Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (2014), K. Cho et al. [PDF]
  • Seq2Seq - Sequence to Sequence Learning with Neural Networks (2014), I. Sutskever et al. [PDF]
  • Attention - Neural Machine Translation by Jointly Learning to Align and Translate (2014), D. Bahdanau et al. [PDF]
  • Mixture of Experts - Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (2017), N. Shazeer et al. [PDF]
  • Transformer - Attention Is All You Need (2017), A. Vaswani et al. [PDF]
  • BERT - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018), J. Devlin et al. [PDF]
  • RoBERTa - RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019), Y. Liu et al. [PDF]
  • T5 - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (2019), C. Raffel et al. [PDF]
  • GPT-2 - Language Models are Unsupervised Multitask Learners (2019), A. Radford et al. [PDF]
  • GPT-3 - Language Models are Few-Shot Learners (2020), T. B. Brown et al. [PDF]
  • LoRA - LoRA: Low-Rank Adaptation of Large Language Models (2021), E. J. Hu et al. [PDF]
  • RLHF - Fine-Tuning Language Models From Human Preferences (2019), D. Ziegler et al. [PDF]
  • PPO - Proximal Policy Optimization Algorithms (2017), J. Schulman et al. [PDF]
  • InstructGPT - Training language models to follow instructions with human feedback (2022), L. Ouyang et al. [PDF]
  • Helpful & Harmless - Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), Y. Bai et al. [PDF]
  • Vision Transformer - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020), A. Dosovitskiy et al. [PDF]
  • GAN - Generative Adversarial Networks (2014), I. J. Goodfellow et al. [PDF]
  • VAE - Auto-Encoding Variational Bayes (2013), D. Kingma and M. Welling [PDF]
  • VQ VAE - Neural Discrete Representation Learning (2017), A. van den Oord et al. [PDF]
  • VQ VAE 2 - Generating Diverse High-Fidelity Images with VQ-VAE-2 (2019), A. Razavi et al. [PDF]
  • Diffusion - Deep Unsupervised Learning using Nonequilibrium Thermodynamics (2015), J. Sohl-Dickstein et al. [PDF]
  • Denoising Diffusion - Denoising Diffusion Probabilistic Models (2020), J. Ho et al. [PDF]
  • Denoising Diffusion 2 - Improved Denoising Diffusion Probabilistic Models (2021), A. Nichol and P. Dhariwal [PDF]
  • Diffusion Beats GANs - Diffusion Models Beat GANs on Image Synthesis (2021), P. Dhariwal and A. Nichol [PDF]
  • CLIP - Learning Transferable Visual Models From Natural Language Supervision (2021), A. Radford et al. [PDF]
  • DALL E - Zero-Shot Text-to-Image Generation (2021), A. Ramesh et al. [PDF]
  • DALL E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents (2022), A. Ramesh et al. [PDF]
  • Deep Learning - Deep Learning (2015), Y. LeCun, Y. Bengio, and G. Hinton [PDF]
  • DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (2016), A. Radford et al. [PDF]
  • BigGAN - Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018), A. Brock et al. [PDF]
  • WaveNet - WaveNet: A Generative Model for Raw Audio (2016), A. van den Oord et al. [PDF]
  • BERTology - A Primer in BERTology: What We Know About How BERT Works (2020), A. Rogers et al. [PDF]
  • GPT - Improving Language Understanding by Generative Pre-Training (2018), A. Radford et al. [PDF]
  • GPT-4 - GPT-4 Technical Report (2023), OpenAI [PDF]
  • Deep Reinforcement Learning - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (2017), D. Silver et al. [PDF]
  • Deep Q-Learning - Playing Atari with Deep Reinforcement Learning (2013), V. Mnih et al. [PDF]
  • AlphaGo - Mastering the Game of Go with Deep Neural Networks and Tree Search (2016), D. Silver et al. [PDF]
  • AlphaFold - Highly accurate protein structure prediction with AlphaFold (2021), J. Jumper et al. [PDF]
  • ELECTRA - ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (2020), K. Clark et al. [PDF]
  • SimCLR - A Simple Framework for Contrastive Learning of Visual Representations (2020), T. Chen et al. [PDF]

history-of-deep-learning's People

Contributors

saurabhaloneai
