brent's Introduction

🙂 Vincent D. Warmerdam
┣━━ 📦 Open Source Packages
┃   ┣━━ bulk              - simple bulk labelling interface
┃   ┣━━ embetter          - embeddings ready for sklearn
┃   ┣━━ doubtlab          - suite of tools to help find bad labels
┃   ┣━━ drawdata          - draw datasets in jupyter
┃   ┣━━ scikit-lego       - lego bricks for sklearn
┃   ┣━━ scikit-partial    - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-bloom      - bloom transformers for sklearn
┃   ┣━━ fh-matplotlib     - matplotlib for FastHTML
┃   ┣━━ fh-altair         - altair for FastHTML
┃   ┣━━ human-learn       - rule-based components for sklearn
┃   ┣━━ sentence-models   - a different take on textcat
┃   ┣━━ mktestdocs        - turn markdown files into pytest tests
┃   ┣━━ lazylines         - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar          - inspiration for your first text labels
┃   ┣━━ durations         - pytest duration insights
┃   ┣━━ tuilwindcss       - tailwindcss for textual tui apps
┃   ┣━━ memo              - saves a whole log of time
┃   ┣━━ skedulord         - makes cron a bit more fun
┃   ┣━━ icepickle         - cool and safe storage for linear models
┃   ┗━━ evol              - grammar for genetic heuristics
┣━━ 👍 Project Contributions
┃   ┣━━ fairlearn         - contributed the CorrelationFilter
┃   ┣━━ polars            - contributed the .pipe() method
┃   ┗━━ BERTopic          - added lightweight sklearn pipeline support
┣━━ ⭐ Online Projects
┃   ┣━━ calmcode.io       - intermediate developer education
┃   ┣━━ koaning.io        - personal blog
┃   ┗━━ dearme.email      - reflection via a 30 day delay
┣━━ 🎙️ Popular Talks
┃   ┣━━ Natural Intelligence is All You Need
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┣━━ 🔬 Random Experiments
┃   ┣━━ scikit-prune   - prune scikit learn pipelines
┃   ┣━━ gitlit         - tracking github action times across open source
┃   ┣━━ sentimany      - many sentiment models, one repo
┃   ┣━━ tokenwiser     - sklearn token tricks
┃   ┣━━ clumper        - functional API for lists of dicts
┃   ┗━━ whatlies       - exploration tools for word embeddings
┗━━ 👨‍💻 Employer
    ┣━━ 🎲 :probabl.   - scikit-learn and friends
    ┃   ┣━━ scikit-churn      - safety rails for churn work
    ┃   ┣━━ scikit-playtime   - rethinking pipelines
    ┃   ┗━━ scikit-mdn        - mixture density networks
    ┣━━ 💥 Explosion   - developer tools for nlp
    ┃   ┣━━ prodigy-hf        - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-pdf       - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann       - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-segment   - Prodigy integration for Segment Anything
    ┃   ┣━━ prodigy-lunr      - Search techniques to find relevant subsets
    ┃   ┣━━ prodigy-whisper   - Transcribe audio with OpenAI's whisper models
    ┃   ┣━━ prodigy-tui       - Prodigy from the terminal
    ┃   ┗━━ cluestar          - inspiration for your first text labels
    ┗━━ 🤖 Rasa        - conversational software provider
        ┣━━ nlu examples      - custom nlu components for Rasa
        ┣━━ taipo             - data augmentation tools
        ┗━━ algo whiteboard   - nlp education

Follow me on twitter @fishnets88

brent's People

Contributors

koaning, mbrouns

brent's Issues

names of queries

Looking at Judea Pearl's slide here, it dawns on me that it might make sense to have three query objects. Might this be a nicer API design?

events that *never* happened

This is a graph for the RISK board game.

from brent import DAG, Query 
from brent.examples import generate_risk_dag

dag = generate_risk_dag()
q = Query(dag).given(losses=1).given(best_a1=1).given(best_a2=1)
q.plot()

When querying this we get an empty result for every node:

> q.infer()
{'d2': {},
 'best_d2': {},
 'd1': {},
 'best_d1': {},
 'a1': {},
 'a2': {},
 'a3': {},
 'best_a2': {},
 'best_a1': {},
 'losses': {}}

It's certainly not bad behavior, but there may be a better way to deal with evidence that never occurs in the data.
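One possible way to make this case more explicit, sketched in plain Python rather than with brent's own classes: condition a table of observed rows on the evidence and raise a descriptive error when no rows remain, instead of returning empty dictionaries. The names `condition`, `marginals` and `ImpossibleEvidenceError` are hypothetical and are not part of brent.

```python
from collections import Counter


class ImpossibleEvidenceError(ValueError):
    """Raised when the evidence matches no observed rows (hypothetical name)."""


def condition(rows, **evidence):
    """Keep only the rows that agree with every piece of evidence."""
    kept = [r for r in rows if all(r[k] == v for k, v in evidence.items())]
    if not kept:
        raise ImpossibleEvidenceError(f"no observed rows match {evidence}")
    return kept


def marginals(rows):
    """Return {column: {value: probability}} for the given rows."""
    result = {}
    for col in rows[0]:
        counts = Counter(r[col] for r in rows)
        result[col] = {v: c / len(rows) for v, c in counts.items()}
    return result


# Toy data: the combination a=1, b=1 never occurs.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 0},
]

# Evidence that does occur gives ordinary marginals.
possible = marginals(condition(rows, a=1))

# Evidence that never occurs is reported instead of silently returning {}.
try:
    condition(rows, a=1, b=1)
    impossible_message = None
except ImpossibleEvidenceError as err:
    impossible_message = str(err)
```

Whether an exception, a warning, or the current empty result is the right behavior is a design choice; the sketch only shows that the situation can be detected at conditioning time.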

implement v1 of counterfact model

This seems like a nice enough API design:

q1 = Query(dag).do(a=0).given(g=1).do(b=1)
q2 = CounterFact(dag).when(q1).suppose_do(e=0).suppose_given(b=0).infer()
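A minimal sketch of how such a chained API could be structured, assuming each call returns a new object that records the step. `SketchQuery` and `SketchCounterFact` are hypothetical stand-ins rather than brent classes, and `describe()` is a made-up helper that replaces `infer()`, since the actual inference is out of scope here.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class SketchQuery:
    """Stand-in for Query that only records the chained steps."""
    dag: object
    do_dict: dict = field(default_factory=dict)
    given_dict: dict = field(default_factory=dict)

    def do(self, **kwargs):
        # Return a copy so earlier queries are never mutated.
        return replace(self, do_dict={**self.do_dict, **kwargs})

    def given(self, **kwargs):
        return replace(self, given_dict={**self.given_dict, **kwargs})


@dataclass
class SketchCounterFact:
    """Stand-in for CounterFact: a factual query plus supposed changes."""
    dag: object
    factual: Optional[SketchQuery] = None
    suppose_do_dict: dict = field(default_factory=dict)
    suppose_given_dict: dict = field(default_factory=dict)

    def when(self, query):
        return replace(self, factual=query)

    def suppose_do(self, **kwargs):
        return replace(self, suppose_do_dict={**self.suppose_do_dict, **kwargs})

    def suppose_given(self, **kwargs):
        return replace(self, suppose_given_dict={**self.suppose_given_dict, **kwargs})

    def describe(self):
        """Summarise the recorded steps (hypothetical helper, not inference)."""
        if self.factual is None:
            raise ValueError("call .when(query) before describing the counterfactual")
        return {
            "factual_do": self.factual.do_dict,
            "factual_given": self.factual.given_dict,
            "suppose_do": self.suppose_do_dict,
            "suppose_given": self.suppose_given_dict,
        }


dag = "placeholder-dag"  # any object works for this sketch
q1 = SketchQuery(dag).do(a=0).given(g=1).do(b=1)
summary = SketchCounterFact(dag).when(q1).suppose_do(e=0).suppose_given(b=0).describe()
```

Returning copies keeps `q1` reusable across several counterfactuals, which fits the chaining style proposed above.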

sklearn model needs to support NaN

The sklearn model can be refactored a bit so that it can deal with missing values in the dataframe. Tests should be added for this as well.
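One way this could work, under the assumption that a missing value simply means "not observed": drop the missing entries from each row and pass only the remaining columns as evidence. The helper names below are hypothetical, and the sketch uses plain dictionaries instead of a dataframe to stay self-contained.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def observed_evidence(row):
    """Drop missing entries so only observed columns are used as evidence."""
    return {k: v for k, v in row.items() if not is_missing(v)}


rows = [
    {"a": 1, "b": float("nan"), "c": 0},
    {"a": float("nan"), "b": float("nan"), "c": 1},
    {"a": 0, "b": 1, "c": None},
]

evidence_per_row = [observed_evidence(r) for r in rows]

# Assumed usage with the query API shown in the issues above:
# Query(dag).given(**evidence_per_row[0])
```

A test for the refactor could assert that rows with missing values produce predictions without raising, and that a row with no missing values gives the same result as before.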

dataset api

There's a data folder now. We probably want to turn this into a proper API instead, and think about file sizes too.
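A rough sketch of what such an API might look like, assuming the datasets are CSV files in a single folder. `make_loader`, the `max_bytes` limit and the error messages are all hypothetical; nothing here reflects how brent actually stores its data.

```python
import csv
import tempfile
from pathlib import Path


def make_loader(data_dir, max_bytes=1_000_000):
    """Return a function that loads a named CSV from data_dir as a list of dicts."""
    data_dir = Path(data_dir)

    def load(name):
        path = data_dir / f"{name}.csv"
        if not path.exists():
            available = sorted(p.stem for p in data_dir.glob("*.csv"))
            raise ValueError(f"unknown dataset {name!r}; available: {available}")
        # Guard against accidentally shipping or loading very large files.
        size = path.stat().st_size
        if size > max_bytes:
            raise ValueError(f"{path.name} is {size} bytes, over the {max_bytes} byte limit")
        with path.open(newline="") as f:
            return list(csv.DictReader(f))

    return load


# Demonstration with a temporary folder standing in for the data folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "toy.csv").write_text("a,b\n0,1\n1,0\n")
    load = make_loader(tmp)
    toy = load("toy")
    try:
        load("missing")
        unknown_message = None
    except ValueError as err:
        unknown_message = str(err)
```

Looking datasets up by name keeps file paths out of user code, and the size check gives one place to enforce whatever limit is decided on.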
