Maths | Data/ML | Fullstack | Nix/NixOS
"Basically a wizard"
Employed at Tweag to build software for a data world
- LinkedIn: https://www.linkedin.com/in/guillaume-desforges
- GitHub: https://github.com/GuillaumeDesforges/
I'm a freelancer; contact me by email. Currently not available for freelance work.
- "Supervised Learning" (16h)
- "Scraping and data cleaning" (10h)
- "Handling spatial data" (4h)
Tweag, a Modus Create company
- integrate ML in a marketing solution
- custom software integration (dev + ops)
- leadership: coaching, project management
- growth: hiring (interviews), marketing (speaker, blog editor), sales (solution design)
- scaffold Python monorepos (blog post)
- distributed cloud computing for ML (dataloader backed by Ray on Azure AKS)
- native extension for Spark in Scala (github:kaiko.ai/spark-dicom)
- analysis and processing of temporal geospatial data
- speaker at PyConFr 2023: "Python moderne et fonctionnel pour des logiciels robustes" ("Modern, functional Python for robust software") (video)
- "Arrow"-based effect system to author extendable type-safe workflows (github:tweag/funflow, blog post)
- integrate with many third-party data sources
- manage ETL jobs, data freshness and data accuracy
- 2019-2020: Master "Data and Artificial Intelligence", Institut Polytechnique de Paris
- 2016-2020: Engineering degree (Ingénieur), École des Ponts
- analytics (Hadoop MapReduce, Spark, Modern Data Stack)
- cloud data lakehouse (Spark SQL, BigQuery, Snowflake, Athena)
- parallel computing, distributed computing (Ray is cool)
- data transformation pipelines need features similar to those of build systems
- you've gotta love a good linear regression (or xgboost)
- aren't foundation models just crushing the field?
- functional programming (Haskell, Scala)
- apply FP ideas to other languages (Python, Rust, Java)
- Inheritance is bad
- Inheritance is bad, really
- Domain Driven Design (DDD) is good
- automated testing matters
- aim for 100% automated deployment
- technological success includes reproducibility (Nix ❤)
- frontend: the Open Web Platform is the most stable, React is nice, experimenting with HTMX
- backend: REST is good, GraphQL is nice for serving SPA data, but solving the N+1 query problem is not so simple
- make a web app unless you have a hard requirement to work offline