Hi, how are you? My name is Lucas Okamura! 👋

About me

I am a mechanical engineer, graduated from the University of São Paulo, and I work as a Data Scientist at Mercado Livre. I have been interested in programming since the start of my degree, which led me to look for knowledge beyond the basics taught in class, through online courses and hackathons. I seek to work in areas involving AI, Data Science and Machine Learning, using predictive modeling techniques to analyze data and obtain insights for solving different business scenarios. In my GitHub profile you will find my personal projects, each developing a business solution with the concepts and tools of Data Science, from understanding the business to publishing the model in production using APIs.

Analytical Tools:

  • Data Collection and Storage: SQL, MySQL, Hadoop, Spark.
  • Data Processing and Analysis: Python, SQL.
  • Development: Git and Shell Script.
  • Data Visualization: Matplotlib, Plotly, Seaborn and Tableau.
  • Machine Learning Modeling: Classification, Regression and Clustering.
  • Machine Learning Deployment: Heroku, AWS RDS, AWS EC2, AWS S3.

Data Science Projects:

  • Business problem addressed: Eduardo and Marcelo are two Brazilians, friends and business partners. After several successful businesses, they are planning to enter the fashion market with an e-commerce business model. The initial idea is to enter the market with just one product for a specific audience: jeans for men. The objective is then to keep operating costs low and scale as they gain customers.
  • This project went through the entire data pipeline, from data extraction to the architecture of an Airflow automation.
  • Sales forecast for Rossmann stores, 6 weeks ahead. Rossmann's CFO needs this information to advance revenue for store renovations, based on each store's predicted sales.
  • The final model is an XGBoost Regressor, which obtained a MAPE of 9.81% and predicted a total income of $285 million across all stores.
  • A health insurance company is analysing the possibility of offering its clients a new product: car insurance. As with health insurance, clients of this new product would pay annually to have a certain value insured by the company for their cars. The company therefore needs a strategy to select the customers most likely to be interested, so it can call them and offer the new product.
  • The final model is a Logistic Regression that is roughly 2.5 times better than the random baseline model, finding 62.28% of the interested customers within the company's capacity to make calls.
  • The company All in One Place is a multibrand outlet, i.e., it sells second-line products of various brands at a lower price through an e-commerce platform. After 1 year of operation, the marketing team realized that some customers in its base buy more expensive products with high frequency and end up contributing a significant portion of the company's revenue. Based on this perception, the marketing team will launch a loyalty program for the best customers in the base, called Insiders. However, the team does not have advanced knowledge of data analysis to choose the program participants. For this reason, the marketing team asked the data team to select eligible customers for the program, using advanced data manipulation techniques.
  • The final model is K-Means with Random Forest embeddings, finding 8 clusters, which were loaded into a table in an AWS database.
  • Electronic House is an e-commerce company that sells computer products for homes and offices. The Director of Global Products asked the Head of Design to develop a new way to finalize a purchase with a credit card, without the need to manually fill in all credit card information, that would work in all countries. After months of development, the Backend Development team delivered a payment solution in which 90% of the information was filled in automatically. The Head of Design would like to measure the effectiveness of the new automatic filling of credit card data on the sales page and report the results to the Director of Global Products, to conclude whether the new payment method is really better than the old one.
  • Hypothesis testing was used to identify a possible increase in value per customer on the site with this new feature. However, there was no statistical evidence that the new feature increased the value spent by customers on the site.
  • In a proptech, there is a squad responsible for defining which apartments the company should list on its platform or bet on buying, in order to allow sales growth at an exponential pace, good use of financial resources and healthy unit economics. The policies created by this squad directly influence the liquidity and risk of the company's portfolio, both from a financial perspective (losses for the proptech) and in terms of the risk of compromising the user experience (poorly calibrated prices). The portfolio targets set by this squad unfold across the company, so it is critical that the models supporting these settings are accurate and that the portfolio strategy adopted follows a sound logic. The challenge is to create a portfolio allocation algorithm to decide, among the apartments available in target_apartments.csv, which ones the proptech should buy, refurbish and sell.
  • A survival analysis was performed, identifying the relationship between price variation and liquidity of apartments, to tell which are the best apartments to buy and at which price to sell them. In the end, by buying the apartments with the best price / liquidity ratio, a profit of 72.62%, or R$ 108,604,871.00, can be obtained relative to the amount spent on purchases.
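
The metrics quoted in the project results above (MAPE for the sales forecast, and the share of interested customers found within a call capacity for the insurance ranking) can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the code from the projects themselves: the function names `mape`, `recall_at_k` and `lift_at_k` and all of the toy data are hypothetical and chosen just for this example.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and no true value is zero.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def recall_at_k(y_true, scores, k):
    """Share of all positive rows found among the k highest-scored rows."""
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("y_true must contain at least one positive label")
    # Rank rows from the highest to the lowest model score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    found = sum(label for _, label in ranked[:k])
    return found / total_positives


def lift_at_k(y_true, scores, k):
    """Recall at k divided by the recall a random ordering is expected to reach (k / n)."""
    return recall_at_k(y_true, scores, k) / (k / len(y_true))


# Toy data (hypothetical): three stores with true and predicted sales.
sales_true = [100.0, 200.0, 400.0]
sales_pred = [110.0, 180.0, 400.0]

# Toy data (hypothetical): 10 customers, 1 = interested, with model scores.
interested = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
call_capacity = 4  # hypothetical number of calls the team can make

print(round(mape(sales_true, sales_pred), 2))
print(round(recall_at_k(interested, scores, call_capacity), 2))
print(round(lift_at_k(interested, scores, call_capacity), 2))
```

With this definition, a lift of 2.5 would mean that calling the top-ranked customers reaches 2.5 times as many interested customers as calling the same number of customers at random.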

Skills

MySQL

Python

pandas

NumPy

Matplotlib

seaborn

scikit-learn

SciPy

Flask

statsmodels

Git / GitHub

Heroku

AWS

Connect with me

LinkedIn     Instagram     Gmail
