tinya10 Goto Github PK

followers: 2 following: 1 repos: 40 gists: 0

Name: Tushar Kadam

Type: User

Company: M&G Plc

Bio: Data & Analytics consultant

Location: Mumbai

Tushar Kadam's Projects

1.basic-python-programs-for-beginners

analysing_weather_dataset

We'll be working with a CSV file that contains weather data for each hour in 2012. There are many interesting connections between everyday life and the weather that we will explore with the help of this dataset. Apply all the NumPy and pandas skills learned so far to analyze the data.

analyzing-weather-dataset

Analyzing a weather dataset using NumPy and pandas

banking_inferences

Bank of New York wants to expand its branches, and before doing so it has certain hypotheses and statements it wants to verify. Using the inferential statistics methods you just learned, help the bank.

decision_trees_ensemble_techniques_employee_attrition_prediction

decision_trees_loan_defaulters

For this project, we will be exploring the publicly available data from LendingClub.com. Lending Club connects people who need money (borrowers) with people who have money (investors). As an investor, one would want to invest in people whose profile shows a high probability of paying the amount back. The data that we have is from 2007-2010.

deep-learning-optimizing-neural-networks

In this project we are working on the Lending Club dataset. Lending Club is a peer-to-peer lending company based in the United States, in which investors provide funds for potential borrowers and earn a profit depending on the risk they take (the borrower's credit score). Lending Club provides the "bridge" between investors and borrowers. For more basic information about the company, please check out the Wikipedia article about it. From the given data we want to predict the loan_status of the borrower, based on features such as loan amount, payment plan, grade, verification status, recoveries, etc. The loan status has various categories, such as Fully paid, Charged off, Late, Issued, In a grace period, etc. In the last assignment you saw how to apply a basic neural network to the dataset. In this project we will see how to optimize a neural network in order to increase both the training speed and the accuracy.

deep_learning_lending_club_defaulters_prediction

ensemble_techniques_mars_crater_classification

This dataset was generated using the HRSC nadir panchromatic image h0905_0000 taken by the Mars Express spacecraft. The image is located in Xanthe Terra, centered on Nanedi Vallis, and covers mostly Noachian terrain on Mars. The image has a resolution of 12.5 meters/pixel. Problem statement: determine whether an instance is a crater or not (1 = crater, 0 = not a crater). About the dataset: using the technique described by L. Bandeira (Bandeira, Ding, Stepinski. 2010. Automatic Detection of Sub-km Craters Using Shape and Texture Information), we identify crater candidates in the image using the pipeline depicted in the figure below. Each crater candidate image block is normalized to a standard scale of 48 pixels. Each of the nine kinds of image masks probes the normalized image block at four different scales (12, 24, 36, and 48 pixels) with a step of a third of the mask size (meaning 2/3 overlap). In total, we extract 1,090 Haar-like attributes using nine types of masks as the attribute vectors to represent each crater candidate. The dataset was converted to the Weka ARFF format by Joseph Paul Cohen in 2012.
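The sliding-window arithmetic in this description can be checked with a short sketch. It assumes square masks that stay fully inside the 48-pixel block and one attribute per mask type per window position; both are assumptions, not details confirmed by the description. Under those assumptions the count comes to 1,089, one short of the 1,090 quoted, and the description does not say where the remaining attribute comes from.

```python
# Count sliding-window positions for the mask scales in the description.
# Assumptions (not stated in the source): square masks, windows kept fully
# inside the block, one attribute per mask type per window position.

BLOCK = 48                  # normalized crater-candidate block size, in pixels
SCALES = (12, 24, 36, 48)   # mask sizes, in pixels
MASK_TYPES = 9              # nine kinds of image masks


def positions_per_axis(block: int, mask: int) -> int:
    """Number of window positions along one axis for a given mask size."""
    step = mask // 3        # step of a third of the mask size (2/3 overlap)
    return (block - mask) // step + 1


def window_count(block: int, mask: int) -> int:
    """Number of window positions over the whole two-dimensional block."""
    n = positions_per_axis(block, mask)
    return n * n


per_scale = {s: window_count(BLOCK, s) for s in SCALES}
total_windows = sum(per_scale.values())
total_attributes = MASK_TYPES * total_windows

print(per_scale)          # windows per scale
print(total_windows)      # windows across all four scales
print(total_attributes)   # 1089 under the assumptions above
```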

extracting-bussiness-insights

Probability and Statistics: The dataset focuses on the banking, debt, financial, inflation, and systemic crises that occurred from 1860 to 2014 in 13 African countries: Algeria, Angola, Central African Republic, Ivory Coast, Egypt, Kenya, Mauritius, Morocco, Nigeria, South Africa, Tunisia, Zambia, and Zimbabwe. We have more than 1,000 data points. Apply your knowledge of descriptive statistics and probability to get meaningful insights out of it.

feature_selection_forest_cover_type_prediction

The problem statement revolves around the need to predict the forest cover type (the predominant kind of tree cover) from strictly cartographic variables (as opposed to remotely sensed data). The study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so that existing forest cover types are more a result of ecological processes than of forest management practices. Each observation is a 30 m x 30 m patch. You are asked to predict an integer classification for the forest cover type. The seven types are: Spruce/Fir, Lodgepole Pine, Ponderosa Pine, Cottonwood/Willow, Aspen, Douglas-fir, and Krummholz.

ga-learner-dsai-repo

A collection of projects as part of the Data Science with AI program at GreyAtom EduTech Pvt Ltd

glabs_dsmx

greyatom-python-for-data-science

A collection of projects as part of the Python for Data Science program at GreyAtom EduTech Pvt Ltd

hackathon

high_rated_games_on_playstore

Google Play Store serves as the official app store for the Android operating system, allowing users to browse and download applications. The success of an app is largely determined by its ratings. But is there any particular pattern among highly rated apps? Does the size or genre of an app play a role in determining its rating? Let's find out.

housing-loan-approval-analysis

Problem statement: Dream Housing Finance Inc. specializes in home loans across different market segments: rural, urban, and semi-urban. Their loan eligibility process is based on customer details provided while filling in an online application form. To create a targeted marketing campaign for different segments, they have asked for a comprehensive analysis of the data collected so far. Why solve this project? After completing this project, you will have a better grip on working with pandas. In this project you will apply the following concepts: DataFrame slicing, DataFrame aggregation, and pivot table operations.

indian_election_analysis_visualisations

Indian Election Analysis: India's lower house of Parliament, the Lok Sabha, has 543 seats in total. Members of the Lok Sabha (House of the People) are elected by all adult citizens of India from a set of candidates who stand in their respective constituencies. Every adult citizen of India can vote only in their own constituency. Candidates who win the Lok Sabha elections are called 'Members of Parliament' and hold their seats for five years or until the body is dissolved by the President on the advice of the council of ministers. There are more than 700 million voters and more than 800,000 polling stations. The Lok Sabha election is a very complex affair because it involves many factors, which is what makes it a perfect topic to analyze. Currently there are two major parties in India, the Bharatiya Janata Party (BJP) and the Indian National Congress (INC). As India is a country of great diversity, and each region is very different from every other, there are many regional or state parties with major influence. These parties can either support one of the alliances to form a government or stay independent. There are two major alliances, the NDA led by the BJP and the UPA led by the INC.

linear_regression_buidling_football_team

You are a data scientist who wishes to make it big by becoming a football club manager. A rich club has decided to hire you as their manager. You have all the money to build a team from scratch. Your aim is to find out the best squad for the upcoming football championship.

linear_regression_the_lego_collector-s_dilemma

You are a die-hard Lego enthusiast wishing to collect as many board sets as you can. But before that, you want to be able to predict the price of a new Lego product before its price is revealed so that you can budget for it from your revenue. Since (luckily!) you are a data scientist in the making, you wish to solve this problem yourself. This dataset contains information on Lego sets scraped from lego.com. Each observation is a different Lego set with various features such as the number of pieces in the set, the rating for the set, the number of reviews per set, etc. Your aim is to build a linear regression model to predict the price of a set.

logistic_regression_predict_insurance_claim

So far you have seen how to solve linear regression and regularization problems. In this project, you are going to predict insurance claims using logistic regression. This dataset contains information on insurance claims. Each observation is a different policyholder with various features such as the age of the person, the gender of the policyholder, body mass index (providing an understanding of the body), number of children of the policyholder, smoking status of the policyholder, and individual medical costs billed by health insurance. The dataset has details of 1,338 insurance claims with 8 features. You need to predict the insurance claim (Yes: 1 / No: 0).
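As a rough illustration of what predicting a Yes/No claim with logistic regression involves, here is a minimal pure-Python sketch trained by batch gradient descent. The two features (a scaled BMI value and a smoker flag), the toy rows, and the helper names are all made up for illustration; they are not taken from the project's dataset, and the repository itself may well use a library such as scikit-learn instead.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function, mapping a score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi                      # gradient of log loss w.r.t. score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Label a row 1 (claim) when the predicted probability reaches the threshold."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


# Hypothetical toy rows: [scaled BMI, smoker flag] -> claim (1) or no claim (0)
X = [[-1.0, 0], [-0.5, 0], [0.0, 0], [0.5, 1], [1.0, 1], [1.5, 1]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
preds = predict(X, w, b)
print(preds)
```

The 0.5 threshold is the conventional default; a real claims model would likely tune it against the relative cost of false positives and false negatives.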

mahindra-first-choice

make-sense-of-census

This project is about performing analysis on census data.

mobile_app_analysis

The ever-changing mobile landscape is a challenging space to navigate. The share of mobile over desktop is only increasing. Android holds about 53.2% of the smartphone market, while iOS holds 43%. To get more people to download your app, you need to make sure they can easily find it. Mobile app analytics is a great way to understand the existing strategy to drive growth and retention of future users. With millions of apps around nowadays, the following data set has become key to identifying the top trending apps in the iOS App Store. This data set contains details of more than 7,000 Apple iOS mobile applications.

movies_ratings

Our aim in this project is to explore the movie dataset and find some movies with high ratings. Your friend has just begun his vacation and wants you to suggest some good movies for him to watch. Since you have just learned Python, you decide to use your Python skills to analyze a movie dataset and explore the ratings of the movies. In our dataset, we have the details of movies in more than 50 languages, but your friend is interested only in watching English movies. Thus, our goal is to analyze the data and suggest English movies with high ratings to your friend.

nlp_amazon_alexa_reviews

You are working at Amazon as a data scientist. They want you to focus on customer reviews of their Alexa product. Your aim is to classify unhappy customers based on the features 'rating', 'date', 'variation', 'verified_reviews', and 'feedback'. So let's work on the customer reviews.

olympic_medals_comparison

The Olympic Games, considered to be the world's foremost sports competition, have more than 200 nations participating across the Summer and Winter Games, each of which is held every four years, with the two alternating two years apart. Throughout this project, we will explore the Olympics dataset (scraped from https://en.wikipedia.org/wiki/All-time_Olympic_Games_medal_table), look at some interesting statistics, and then try to find out which country is the King of the Olympic Games.

probability-of-the-loan-defaulters

probability-of-the-loan-defaulters-0

project_gradient_boosting_telecom_churn_prediction

Customer churn, also known as customer attrition, customer turnover, or customer defection, is the loss of clients or customers. Telephone service companies, Internet service providers, pay-TV companies, insurance firms, and alarm monitoring services often use customer attrition analysis and customer attrition rates as one of their key business metrics, because the cost of retaining an existing customer is far less than that of acquiring a new one. Predictive analytics uses churn prediction models that predict customer churn by assessing each customer's propensity to churn. Since these models generate a small prioritized list of potential defectors, they are effective at focusing customer retention marketing programs on the subset of the customer base who are most vulnerable to churn. For this project, we will be exploring the dataset of a telecom company and trying to predict customer churn.
