My Reading List

Publications, books, and web pages I've been reading or am planning on reading.

Why

I've been trying to level up recently on ML, LLMs, NLU, etc., and whenever I read a paper, I feel there are ten others I should read as well :). This repo is here to better track what I've read and what I want to read, and to jot down some learnings along the way.

I also want to give this Learning in Public thing a shot. Let's see how it goes!

ML Reading List

General

Paper Read Date Last Revise Date Notes
Evaluating Large Language Models Trained on Code 2023-03-12
Understanding HTML with Large Language Models 2023-03-12 2022-10-08 Notes
Multi-Task Sequence to Sequence Learning 2016-03-01
Emergent Abilities of Large Language Models 2022-10-06
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model 2022-12-11
Finetuned Language Models Are Zero-Shot Learners 2022-02-08
LLaMA: Open and Efficient Foundation Language Models 2023-03-27
Training language models to follow instructions with human feedback 2022-03-04
HTLM: Hyper-Text Pre-Training and Prompting of Language Models 2021-07-14
Environment Generation for Zero-Shot Compositional Reinforcement Learning 2022-01-21
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
LaMDA: Language Models for Dialog Applications

Training Speedups/Scaling

Paper Read Date Last Revise Date Notes
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
PaLM: Scaling Language Modeling with Pathways
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Training Compute-Optimal Large Language Models 2022-03-29
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts 2021-12-13

Non-LLMs

Paper Read Date Last Revise Date Notes
World of Bits: An Open-Domain Platform for Web-Based Agents 2017
Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
User-Driven Automation of Web Form Filling 2013
Learning Transferable Visual Models from Natural Language Supervision 2021-02-26
Learning to Generate Reviews and Discovering Sentiment 2017-04-06
WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset 2021-07-20
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Extracting Structured Data from Templatic Documents 2020-06-12

Bloom Filters

Read Date Resource Notes
2024-01-30 Bloom Filters by ByteByteGo Gives a decent intuition
2024-01-30 What are Bloom Filters? Not the best example. The previous video was better
2024-01-30 Advancing Spark - Bloom Filter Indexes in Databricks Delta Interesting, but more about Delta than Spark, as the title implies
The Case for Learned Index Structures
Optimizing Learned Bloom Filters by Sandwiching

Quantization, Model Compression & Optimization

Read Date Resource Notes
Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks
2024-01-30 How We Scaled Bert To Serve 1+ Billion Daily Requests on CPUs (Roblox) Interesting. Always nice to read actual case studies. I'd like to see how ONNX compares to their benchmarks.

Blog Posts I Liked

Read Date Post Notes
2024-01-30 How we reduced our text similarity runtime by 99.96% (Microsoft) I skimmed through it. Seems interesting and worth a reread
2024-01-30 How Roblox Reduces Spark Join Query Costs With Machine Learning Optimized Bloom Filters I wonder if this can be applied to other use cases too and not just fact/dim tables. Interesting read.

Blog Posts to Read

Post Notes
Using machine learning to index text from billions of images (Dropbox) Curious about the OCR/PDF text extraction part here. Need some caffeine in me to read this.

Contributors

bjamil
