This is a dedicated repository containing Jupyter notebooks for my technical blog posts on Medium.
My Medium Profile Link - https://medium.com/@nroy0110
1. [Blog Link] Building a Text Normalizer using NLTK ft. POS tagger
Code - Text_Normalization_ft_POS_Tagger.ipynb
Summary: An NLTK implementation of basic text cleaning and normalization techniques using a POS tagger.
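For readers who want a quick feel for the idea before opening the notebook, here is a minimal sketch of POS-aware lemmatization with NLTK. It is not the notebook's code: the function names `penn_to_wordnet` and `normalize` are hypothetical, and the cleaning rule (lowercase, keep letters only) is an assumption chosen for brevity.

```python
import re


def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (as returned by nltk.pos_tag) to a WordNet POS letter."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"  # everything else is treated as a noun (the lemmatizer's default)


def normalize(text):
    """Lowercase, strip non-letters, tokenize, POS-tag, and lemmatize using the tag."""
    # Imported lazily so penn_to_wordnet can be used without NLTK installed.
    # Assumes the NLTK tokenizer, tagger, and WordNet data were fetched with
    # nltk.download(...); the exact resource names vary between NLTK versions.
    import nltk
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    tokens = nltk.word_tokenize(cleaned)
    tagged = nltk.pos_tag(tokens)
    return [lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag)) for word, tag in tagged]
```

The point of passing the tag to the lemmatizer is that the same surface form can reduce differently as a noun and as a verb; without a tag, `WordNetLemmatizer` treats every word as a noun.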
[Blog Link] Exploring News about Covid-19 in Indian Media | NLP| Wordcloud | Covid-19 Article 1
[Blog Link] Extracting Features From Covid’s News: A Simple Bag-of-Words Approach for Time-Series Text Data Modelling | Covid-19 Article 2
Code for Scraping - Git Repository NLP_using_News_API
Jupyter Notebooks: https://github.com/royn5618/Medium_Blog_Codes/tree/master/Covid-Blog-Codes
Summary: COVID-19 has turned the world upside-down. The global buzzwords are now coronavirus, death, quarantine and lockdown. So, in this blog, I dug into news articles from Indian media and used a wordcloud to visualize which words appeared the most. I also explored the texts further and tried to establish significant correlations between features extracted from them using a Bag-of-Words (BoW) approach. ** Work in progress **
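As a rough illustration of the Bag-of-Words idea mentioned in the summary, the sketch below counts words per document and builds a small document-term matrix using only the standard library. It is not the code from the notebooks: the helper names (`tokenize`, `bag_of_words`, `top_words`) are hypothetical, the tokenizer is deliberately naive, and the two headlines are made up for the example rather than taken from the scraped data.

```python
import re
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts vocabulary[j] in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


def top_words(documents, n=3):
    """Overall word frequencies, the same numbers a wordcloud sizes its words by."""
    total = Counter()
    for doc in documents:
        total.update(tokenize(doc))
    return total.most_common(n)


# Made-up headlines, used only to show the shape of the output.
headlines = [
    "Lockdown extended as coronavirus cases rise",
    "Coronavirus lockdown eased in some states",
]
vocabulary, matrix = bag_of_words(headlines)
```

Each row of `matrix` is one article and each column one word, so grouping the rows by publication date would give per-day word counts that can be treated as a time series.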
[Blog Link] Predicting Hazardous Seismic Bumps Part I : EDA, Feature Engineering & Train Test Split for Unbalanced Dataset
[Blog Link] Predicting Hazardous Seismic Bumps Part II: Training & Tuning Supervised ML Classifiers and Model Performance Analysis
Summary: The seismic bumps dataset is a lesser-known binary classification dataset. It captures geological conditions in longwall coal mines, recorded by seismic and seismo-acoustic systems, to assess whether a mine is prone to rockbursts that cause seismic hazards.
Link to the dataset: https://archive.ics.uci.edu/ml/datasets/seismic-bumps
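Part I deals with the train-test split for an unbalanced dataset. One common way to handle that is a stratified split, which keeps the class ratio the same in both partitions. The sketch below is a standard-library illustration of that idea, not necessarily the method used in the blog; `stratified_split` is a hypothetical helper, and the 90/10 label ratio in the usage lines is invented for the example, not the dataset's real class balance.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Split row indices so each class keeps roughly the same share in train and test."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one sample per class goes to the test set; a class with a
        # single sample therefore ends up in test only.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Invented labels: 90 non-hazardous (0) and 10 hazardous (1) rows.
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

In practice, scikit-learn's `train_test_split` does the same thing when the label column is passed through its `stratify` argument.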