msikorski93's Projects
Demonstrating and testing a new approach for quantifying and identifying correlation.
The age of an abalone can be estimated by cutting its shell, staining it, and counting the rings in the shell through a microscope. However, this process is time-consuming and tedious, and it can kill the animal. A non-invasive method of age estimation is therefore needed. Other physical measurements, which are easier to collect, can be used to determine the age. This notebook estimates the number of rings from the other features, treating the problem either as regression (a continuous value) or as classification. We built two supervised models with the scikit-learn library, polynomial regression and k-nearest neighbors, and explored techniques for tuning them and improving their performance.
Multiclass classification of patients' brain MRI scans using scikit-learn and TensorFlow models.
This notebook aims to explore, summarize, and visualize potential future Earth collision events.
Identifying and assigning breast cancer diagnoses using machine learning methods, based on observations in the WDBC dataset. All classifiers were evaluated and performed well on this task.
Thematic maps plotted in Python.
An example of improving data quality and identifying anomalies within a real-life dataset for master data management or data engineering.
Panic disorder detection using machine learning techniques.
The purpose of this notebook is to develop an automated function that predicts the price of a diamond from its features (cut, color, dimensions, etc.). The target is a continuous value, so this is a supervised regression task, and we train a machine learning model to estimate the prices.
Inflation has recently been a popular topic in Poland, where it is at its highest level since 2001. Experts expect inflation in Poland to keep rising and to approach 8% by the end of 2021. This notebook aims to develop a time series forecasting model in Python.
This repository aims to determine a local geoid model for the Carpathian Mountains based on known topographic heights and to assess the precision of the Earth Gravitational Model (EGM2008) in the area of interest. The topographic points used in this task were obtained from the Topex satellite altimeter mission. Ellipsoidal heights were then converted into orthometric and geoid elevations with the UNAVCO online notebook. The local Carpathian geoid was visualized on a map. We also developed a function to model the geoid's prime profiles.
The subject of this repository was to perform binary classification based on features collected from respondents (age, cholesterol level, fasting blood sugar, thallium stress test results, etc.).
Linear and polynomial regression solutions made from scratch using the TensorFlow framework.
Basic data analysis focused mainly on visualizing geospatial data worldwide with cartopy.
Config files for my GitHub profile.
Image recognition of Persian digits with the LeNet-5 neural network.
Binary classification using a convolutional neural network (CNN, or ConvNet) model.
Cryptocurrencies have lately become very trendy and a major subject of investment. Stock market prediction is the act of trying to determine the future value of a company stock or other financial instrument traded on a financial exchange. Time series forecasting is one of the most demanding tasks in machine learning on today's market.
Predicting house prices using different regression analysis models.
Deep learning image recognition based on visual similarity for e-commerce.
Performing a regression task for estimating residue size based on given physicochemical properties of protein tertiary structures (CASP 5-9).
This repository contains thematic maps generated in Python.
To save working time and speed up model analysis for the MDM team as much as possible, a Python script was written for this task. The verification was based on analyzing which characteristics are present in each model. We ran it successfully on a sample dataset of 15 models, none of which were repeated.
Performing basic clustering on a seeds dataset.
A basic NLP project on Amazon reviews of musical instruments.
Detecting spam (a typical binary classification problem) on Polish emails.
Solutions to tasks and exercises from different websites
Time series forecasting (close prices) with different estimators.
This short repository contains a geospatial data visualization of the percentage of people with Ukrainian as their native (or first) language according to the 2001 census. A customized choropleth was developed with the folium library. The map is interactive and lets the user perform basic actions: zoom in and out, choose layers, and display tooltips. The choropleth was saved to an HTML file and is available for download.
Exploratory data analysis for US baby names dataset with SQL
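The Carpathian geoid project above converts between ellipsoidal, orthometric, and geoid heights. These are linked by the standard approximation h ≈ H + N, where h is the ellipsoidal height, H the orthometric height, and N the geoid undulation. The sketch below is only an illustration of that relation and of one possible way to score a global model against it (a root-mean-square difference). It is not the repository's code: the function names, the sample heights, and the "model" undulations are all hypothetical.

```python
import math


def geoid_undulation(ellipsoidal_h, orthometric_h):
    """Return the geoid undulation N from h ~ H + N (heights in metres)."""
    return ellipsoidal_h - orthometric_h


def rmse(observed, modelled):
    """Root-mean-square difference between two equal-length sequences."""
    if len(observed) != len(modelled) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - m) ** 2 for o, m in zip(observed, modelled)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical sample points: (ellipsoidal height h, orthometric height H).
points = [(1040.0, 1000.0), (1541.5, 1500.0), (839.0, 800.0)]
observed_n = [geoid_undulation(h, H) for h, H in points]

# Hypothetical undulations a global model might give at the same points.
model_n = [40.5, 41.0, 39.0]

print("observed undulations [m]:", observed_n)
print("RMSE vs. model [m]:", round(rmse(observed_n, model_n), 3))
```

A smaller RMSE would mean the global model agrees more closely with the locally derived undulations. Real work would use measured heights and undulations evaluated from the model itself.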