
My name is Celso, and I'm a data scientist. 📡

🏗 What am I working on?

You can see the pinned projects below, but let me tell you why I pinned them:

  1. tccENAP: capstone project for my graduate course on data analysis for public policy, where I examine the driving distance between municipalities without assistance centers of Brazil's IRS and municipalities that have them, in order to highlight possible opportunities for opening and closing centers.
  2. iusNominatim: a combination of OSM Nominatim in Docker, libpostal and some data wrangling with Brazilian geo data from geobr, to create a geocoding system that delivers better data than vanilla Nominatim.
  3. pypelineDeals: Python wrapper for the API of PipelineDeals.

You can also check my Kaggle profile for some interesting projects outside of GitHub.

🤐 What am I working on outside of GitHub?

Due to business strategy and NDAs, I can't put everything I do on GitHub - or we have to keep it on GitHub but private. I'm currently working for Quero Educação, a marketplace for private education in Brazil. Think Booking, but for private colleges, schools and other courses. I work as the data lead for the K12 branch. So, here's what takes most of my time:

  • I have experience leading people! Before the pandemic, I used to lead two BI analysts, but now I'm only leading one;
  • A looot of SQL. Like a lot. Mostly quick analyses in Spark SQL, using Databricks, for quick business decisions. Sometimes we do more complicated analyses, like regressions, classifications or even quasi-experimental studies, to support strategy pivots when needed;
  • I create a lot of datamarts in Databricks, mostly using Databricks jobs in Spark SQL and pyspark. I used to write them in R/SparkR as well, but since the community is stronger on Python/pyspark, I have less of a headache and more support if I keep it all in Python. I'm the R guy of the company and there's not a lot on Stack Overflow for SparkR issues, so...
  • Some dashboards, mostly in Data Studio. I can do them in Power BI too, but I try to keep dashboard tasks at a minimum in our team's backlog. Dashboards are dead, people!
  • Every semester, we forecast future growth of revenue using Facebook Prophet. We're thinking of using it for other kinds of timeseries forecasting, like B2C leads and visits;
  • We have a Bayesian A/B testing framework in the company, based on this white paper, that we're refactoring and trying to build as a Python package; I'm contributing to this cross-sector project;
  • And finally, we've been dabbling with Natural Language Processing (NLP) so that we can better know the K12 market in Brazil. Public data is sparse and diffuse, and we have to gather data from several public sources, where school names, addresses and owners are often not exactly the same. Therefore, a lot of fuzzy matching that we're constantly improving.
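To give a rough idea of the fuzzy matching mentioned in the last item, here is a minimal sketch using only Python's standard library. It is not the pipeline we actually run: the function names, the example school names and the 0.8 threshold are all hypothetical, chosen only to illustrate the idea of normalizing names and scoring their similarity.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(name, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate.

    Returns None when there are no candidates or when the best score is
    below `threshold` (a hypothetical cutoff, to be tuned on real data).
    """
    if not candidates:
        return None
    scored = [(candidate, similarity(name, candidate)) for candidate in candidates]
    candidate, score = max(scored, key=lambda pair: pair[1])
    return (candidate, score) if score >= threshold else None


if __name__ == "__main__":
    # Hypothetical school names, as they might appear in two public sources.
    registry = ["Colégio São José", "Escola Estadual Dom Pedro II"]
    print(best_match("colegio  sao jose", registry))
```

In practice a character-level ratio like this is only a starting point; addresses and owner names usually need their own normalization rules before scoring.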

📫 Get in touch!

The best way to reach me is through my LinkedIn.

💻 Check my most used languages below!

Top Langs

celsoMattheus's Projects

enapd6

Repository for the final assignment of course D6 in ENAP's graduate program in Data Analysis for Public Policy

iusnominatim

Docker container to parse addresses and return coordinates, checking them against IPEA/IBGE vectorized maps of Brazil

rnu-d2

Extraction and processing of current Nubank invoice data, using artoo.js for scraping in the browser itself and an R script to process the data

scarlet.ibis

Celso Mattheus's personal blog on data science topics

tccenap

Project for the capstone (final course project) in Data Analysis for Public Policy at ENAP
