Energy, Tech, and Data Enthusiast
Hey there!
I'm a highly analytical fresh graduate with a Bachelor of Geophysical Engineering and an immense passion for energy, tech, startups, and data. I'm continuously learning and seeking opportunities in data analytics or data science, for research or business objectives. Through various projects and certifications, I've gained experience in Python, SQL, R, and spreadsheets for data analysis, and in TensorFlow and scikit-learn for machine learning. I graduated from the Bangkit Academy 2022 Machine Learning Cohort in my final year of study.
Kimia Farma Big Data Analytics Virtual Internship (SQL | Google Looker Studio)
Dashboard | Slide
- Designed datamart for sales with Google Sheets and BigQuery.
- Created aggregate table with additional revenue field for sales analytics.
- Designed a dashboard to monitor sales performance with Looker Studio.
Quantium Virtual Internship (Python)
Notebook1, 2 | Slide1, 2, 3
- Handled inconsistent data by data slicing and filtering.
- Transformed and visualized data for high-level analysis.
- Assessed an A/B test by selecting control variables with similarity metrics and checking the significance of the change through hypothesis testing with p-values.
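A minimal sketch of the A/B assessment idea from the last bullet, not the notebook's actual code. It assumes the similarity metric is Pearson correlation over a pre-trial period and swaps in an exact permutation test for the hypothesis test; the function names and the toy numbers are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pick_control(trial_series, candidates):
    """Return the name of the candidate most correlated with the trial series."""
    return max(candidates, key=lambda name: pearson(trial_series, candidates[name]))


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    splits = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        splits += 1
    return hits / splits


# Hypothetical pre-trial series used only to choose a control.
pre_trial = [10, 12, 11, 13, 12]
candidates = {"unit_a": [20, 24, 22, 26, 24], "unit_b": [15, 14, 16, 13, 15]}
control = pick_control(pre_trial, candidates)

# Hypothetical trial-period values for the trial unit and the chosen control.
p_value = permutation_p_value([15, 16, 17, 16], [11, 12, 11, 12])
print(control, round(p_value, 4))
```

The exact enumeration is only practical for small samples; for larger ones a t-test or a randomly sampled permutation test would be the usual choice.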
Credit Card Fraud Detection (Python | scikit-learn)
Notebook | Slide
- Handled imbalanced data by combining undersampling and oversampling
- Detected and removed outliers with IQR
- Built classification models based on multiple algorithms (Logistic Regression, Random Forest Classifier, Support Vector Classification, Extreme Gradient Boosting) with scikit-learn and reached 0.97 as the best F1 score
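A minimal sketch of combining undersampling and oversampling, using only the standard library. It is an illustration, not the notebook's code: the midpoint target size and the function name `resample_to_midpoint` are assumptions.

```python
import random


def resample_to_midpoint(majority, minority, seed=0):
    """Undersample the majority and oversample the minority to one shared size.

    Assumes len(majority) >= len(minority) >= 1. The shared target size is the
    midpoint between the two class sizes (an arbitrary choice for this sketch).
    """
    rng = random.Random(seed)
    target = (len(majority) + len(minority)) // 2
    # Undersampling: keep a random subset of majority rows, without replacement.
    kept_majority = rng.sample(majority, target)
    # Oversampling: keep every minority row, then add random duplicates.
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    return kept_majority, list(minority) + extra


# Hypothetical row identifiers: 90 legitimate transactions, 10 fraudulent ones.
legit = list(range(90))
fraud = list(range(100, 110))
legit_balanced, fraud_balanced = resample_to_midpoint(legit, fraud)
print(len(legit_balanced), len(fraud_balanced))
```

Resampling like this is normally applied to the training split only, so that evaluation metrics such as F1 are still measured on the original class distribution.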
Yelp Dataset SQL Lookup (SQL)
Quiz Report | Slide
- Profiled and analyzed Yelp dataset with SQL
- Applied querying techniques (GROUP BY and ORDER BY) for correlation insights
- Performed sentiment analysis with regex for business improvement recommendations
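The regex-based sentiment idea can be sketched in Python as follows. The project itself was done in SQL, so this is only an illustration of the technique: the keyword lists and the `sentiment` function are hypothetical, not taken from the report.

```python
import re

# Hypothetical keyword lists; \b keeps matches to whole words only.
POSITIVE = re.compile(r"\b(great|good|love|friendly|delicious)\b", re.IGNORECASE)
NEGATIVE = re.compile(r"\b(bad|slow|rude|dirty|terrible)\b", re.IGNORECASE)


def sentiment(text):
    """Label a review by counting positive and negative keyword matches."""
    score = len(POSITIVE.findall(text)) - len(NEGATIVE.findall(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for review in ["Great food and friendly staff", "Slow service and rude waiter"]:
    print(review, "->", sentiment(review))
```

Keyword matching of this kind ignores negation ("not good") and sarcasm, so it suits a rough first pass rather than a precise measurement.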
Planticure (Python | TensorFlow)
This is my team's capstone project for Bangkit Academy 2022, led by Google, Tokopedia, Gojek, & Traveloka. My part was to build an image classification model that identifies diseases in home-grown plants from photos of their leaves.
Notebooks | Slide
- Transformed several plant disease datasets from TFDS and Kaggle for train-test input
- Applied a CNN with TensorFlow to classify home-grown plant diseases
- Achieved >85% accuracy in 10 tests
Breast Cancer Classification (Python | scikit-learn)
Notebook | Slide
- Performed exploratory data analysis (EDA)
- Detected and removed outliers with IQR and Z-Score methods
- Built classification models based on multiple algorithms (Logistic Regression, Random Forest, Support Vector Classification, KNN, Extreme Gradient Boosting) with scikit-learn and reached 0.98 as the best F1 score
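A minimal sketch of the two outlier rules mentioned above (IQR and Z-score), written with the standard library for a single numeric column. The function names, the 1.5 multiplier, and the thresholds are common defaults assumed here, not values confirmed by the notebook.

```python
from statistics import mean, pstdev, quantiles


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def zscore_filter(values, threshold=3.0):
    """Keep values whose absolute Z-score is at most `threshold`."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: nothing can be an outlier
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= threshold]


# Hypothetical feature column with one obvious outlier.
column = [10, 12, 11, 13, 12, 11, 10, 13, 12, 95]
print(iqr_filter(column))
print(zscore_filter(column, threshold=2.5))
```

With the population standard deviation, a single point among n values can never have a Z-score above (n - 1) / sqrt(n), so on a very small sample the usual threshold of 3 may remove nothing; that is why the usage line passes a lower threshold.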
Bellabeat Data Analysis (R)
This is a capstone project for the Google Data Analytics specialization on Coursera.
Notebook | Slide
- Analyzed Bellabeat dataset from Kaggle with R
- Combined multiple tables with join and handled missing values to gain correlation insights
- Suggested several key points for future survey and business development
Airbnb Dataset Analysis (Python)
Notebook | Slide
- Performed exploratory data analysis (EDA) with visualization and descriptive statistics
- Handled missing values, filtering, and other data manipulation with pandas
- Automated price categorization to create a new table based on the price category
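A dependency-free sketch of automated price categorization. The project used pandas; this version uses plain Python, and the cut points, labels, and listing rows are hypothetical. It reads "a new table based on the price category" as one group of rows per category, which is an assumption.

```python
from bisect import bisect_right

# Hypothetical cut points: below 50, from 50 up to (not including) 150, 150 and above.
BOUNDS = [50, 150]
LABELS = ["budget", "mid-range", "premium"]


def price_category(price):
    """Map a price to its category label using the sorted cut points."""
    return LABELS[bisect_right(BOUNDS, price)]


def split_by_category(listings):
    """Group listing rows into one table (list of rows) per price category."""
    tables = {label: [] for label in LABELS}
    for row in listings:
        tables[price_category(row["price"])].append(row)
    return tables


# Hypothetical listing rows.
listings = [
    {"id": 1, "price": 35},
    {"id": 2, "price": 120},
    {"id": 3, "price": 400},
]
for label, rows in split_by_category(listings).items():
    print(label, [row["id"] for row in rows])
```

In pandas, the same binning is commonly done with `pd.cut` followed by a `groupby` on the resulting category column.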
HackerRank SQL (Basic) Assessment | TensorFlow Developer Certificate | Google Data Analytics | DeepLearning.AI TensorFlow Developer Specialization | Structuring Machine Learning Projects | TensorFlow: Data and Deployment Specialization | Mathematics for Machine Learning