sukanya-93

followers: 0 following: 0 repos: 13 gists: 0

Name: Sukanya Roy Sikdar

Type: User

Sukanya Roy Sikdar's Projects

chatbot-using-rasa

In this project, you will build a chatbot for ‘Foodie’ and then deploy it on Slack.

deep-learning-algorithms-implementation

Implementations of deep learning and machine learning algorithms.

gesture-recognition-project

In this project, you will build a model to recognise 5 hand gestures.

hmms-and-viterbi-algorithm-for-pos-tagging

In this assignment, you need to modify the Viterbi algorithm to solve the problem of unknown words using at least two techniques.

investment-assignment

Project Brief

You work for Spark Funds, an asset management company. Spark Funds wants to make investments in a few companies. The CEO of Spark Funds wants to understand the global trends in investments so that she can make investment decisions effectively.

Business and Data Understanding

Spark Funds has two minor constraints for investments:

- It wants to invest between 5 and 15 million USD per round of investment.
- It wants to invest only in English-speaking countries, because of the ease of communication with the companies it would invest in.

For your analysis, consider a country to be English-speaking only if English is one of the official languages in that country. You may use the list of countries where English is an official language that is linked from the assignment.

These conditions will give you sufficient information for your initial analysis. Before getting to specific questions, let's understand the problem and the data first.

1. What is the strategy?

Spark Funds wants to invest where most other investors are investing. This pattern is often observed among early-stage startup investors.

2. Where did we get the data from?

We have taken real investment data from crunchbase.com, so the insights you get may be incredibly useful. For this assignment, the data has been divided into files, and you have to use three main data tables for the entire analysis (available for download on the next page).

3. What is Spark Funds' business objective?

The business objectives and goals of data analysis are pretty straightforward.

Business objective: The objective is to identify the best sectors, countries, and a suitable investment type for making investments. The overall strategy is to invest where others are investing, implying that the 'best' sectors and countries are the ones 'where most investors are investing'.

Goals of data analysis: Your goals are divided into three sub-goals:

- Investment type analysis: Comparing the typical investment amounts in the venture, seed, angel, private equity etc. types so that Spark Funds can choose the type that is best suited to its strategy.
- Country analysis: Identifying the countries which have been the most heavily invested in in the past. These will be Spark Funds' favourites as well.
- Sector analysis: Understanding the distribution of investments across the eight main sectors. (Note that we are interested in the eight 'main sectors' provided in the mapping file. The two files, companies and rounds2, have numerous sub-sector names; hence, you will need to map each sub-sector to its main sector.)

4. How do you approach the assignment? What are the deliverables?

The entire assignment is divided into checkpoints to help you navigate. For each checkpoint, you are advised to fill in the tables in the spreadsheet provided in the download segment. The tables are also mentioned under the 'Results Expected' section after each checkpoint.

Since this is the first assignment, you have been provided with some additional guidance. Going forward, you will be expected to structure and solve the problem by yourself, just as you would solve problems in real-life scenarios.

Important Note: All your code has to be submitted in one Jupyter notebook. For every checkpoint, keep writing code in one well-commented Jupyter notebook which you can submit at the end.
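The first two sub-goals above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the record fields (`funding_type`, `country`, `raised_usd`), the sample rows, and the choice of the median as the "typical" amount are all hypothetical and are not taken from the assignment's data files.

```python
from collections import defaultdict
from statistics import median

# Investment range per round stated in the brief (USD).
MIN_USD, MAX_USD = 5_000_000, 15_000_000


def typical_amount_by_type(rounds):
    """Median amount raised per funding type (median is an assumed choice of 'typical')."""
    by_type = defaultdict(list)
    for r in rounds:
        by_type[r["funding_type"]].append(r["raised_usd"])
    return {ftype: median(amounts) for ftype, amounts in by_type.items()}


def suitable_types(rounds):
    """Funding types whose typical amount falls inside the 5-15 million USD range."""
    typical = typical_amount_by_type(rounds)
    return sorted(t for t, m in typical.items() if MIN_USD <= m <= MAX_USD)


def top_countries(rounds, funding_type, english_speaking, n=3):
    """English-speaking countries ranked by total amount raised for one funding type."""
    totals = defaultdict(float)
    for r in rounds:
        if r["funding_type"] == funding_type and r["country"] in english_speaking:
            totals[r["country"]] += r["raised_usd"]
    return sorted(totals, key=totals.get, reverse=True)[:n]


# Hypothetical sample rows, for illustration only.
rounds = [
    {"funding_type": "seed", "country": "USA", "raised_usd": 500_000},
    {"funding_type": "seed", "country": "IND", "raised_usd": 700_000},
    {"funding_type": "venture", "country": "USA", "raised_usd": 10_000_000},
    {"funding_type": "venture", "country": "CHN", "raised_usd": 12_000_000},
    {"funding_type": "venture", "country": "GBR", "raised_usd": 6_000_000},
    {"funding_type": "private_equity", "country": "USA", "raised_usd": 50_000_000},
]
english_speaking = {"USA", "GBR", "IND"}

chosen = suitable_types(rounds)
ranking = top_countries(rounds, "venture", english_speaking)
```

With the real data, the same two steps would run over the full rounds table, and the sector analysis would add a lookup from each sub-sector to its main sector using the mapping file.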

lending-club-case-study

Identify risky loan applicants, to reduce the amount of credit loss. Identification of such applicants using EDA is the aim of this case study.

linear-regression---carprice

Problem Statement

This is a programming assignment in which you have to build a multiple linear regression model for the prediction of car prices. You will need to submit a Jupyter notebook for the same.

A Chinese automobile company, Geely Auto, aspires to enter the US market by setting up its manufacturing unit there and producing cars locally to compete with its US and European counterparts. It has contracted an automobile consulting company to understand the factors on which the pricing of cars depends. Specifically, it wants to understand the factors affecting the pricing of cars in the American market, since those may be very different from the Chinese market. The company wants to know:

- Which variables are significant in predicting the price of a car
- How well those variables describe the price of a car

Based on various market surveys, the consulting firm has gathered a large dataset of different types of cars across the American market.

Business Goal

You are required to model the price of cars with the available independent variables. The model will be used by the management to understand how exactly the prices vary with the independent variables. They can accordingly manipulate the design of the cars, the business strategy etc. to meet certain price levels. Further, the model will be a good way for management to understand the pricing dynamics of a new market.

Data Preparation

There is a variable named CarName which comprises two parts: the first word is the name of the 'car company' and the second is the 'car model'. For example, chevrolet impala has 'chevrolet' as the car company name and 'impala' as the car model name. You need to consider only the company name as the independent variable for model building.

Model Evaluation

When you're done with model building and residual analysis, and have made predictions on the test set, make sure you use the following two lines of code to calculate the R-squared score on the test set:

    from sklearn.metrics import r2_score
    r2_score(y_test, y_pred)

Here y_test is the test data set for the target variable, and y_pred is the variable containing the predicted values of the target variable on the test set. Please don't forget to perform this step, as the R-squared score on the test set carries some marks. The variable names passed to the 'r2_score' function can differ based on the variable names you have chosen.
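To make clear what the R-squared score measures, here is a minimal plain-Python sketch of the usual definition, R² = 1 - SS_res / SS_tot. The function name `r2_score_manual` and the sample numbers are illustrative assumptions; for the assignment itself, the `sklearn.metrics.r2_score` call above is the one to use.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        # A constant target makes the ratio undefined; this sketch simply refuses it.
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values, not taken from the car price dataset.
y_test = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r2_score_manual(y_test, y_pred)
```

A score of 1 means the predictions match the test targets exactly, while a model that always predicts the mean of the test targets scores 0.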

neural-network-project

In this assignment, you will build a complete neural network using Numpy. You will implement all the steps required to build a network - feedforward, loss computation, backpropagation, weight updates etc.
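As a rough illustration of those four steps, the sketch below trains a tiny one-hidden-layer network with NumPy. The architecture (tanh hidden layer, sigmoid output), the binary cross-entropy loss, the logical-OR toy data, and all function names are assumptions chosen for brevity; they are not the network the assignment specifies.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Feedforward: tanh hidden layer, then sigmoid output probability."""
    h = np.tanh(X @ params["W1"] + params["b1"])
    p = sigmoid(h @ params["W2"] + params["b2"])
    return h, p


def bce_loss(p, y):
    """Mean binary cross-entropy; eps guards against log(0)."""
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))


def backward(params, X, y, h, p):
    """Backpropagation of the mean cross-entropy loss through both layers."""
    n = X.shape[0]
    dz2 = (p - y) / n                    # gradient w.r.t. output pre-activation
    grads = {"W2": h.T @ dz2, "b2": dz2.sum(axis=0)}
    dh = dz2 @ params["W2"].T            # gradient flowing into the hidden layer
    dz1 = dh * (1.0 - h ** 2)            # derivative of tanh is 1 - tanh^2
    grads["W1"] = X.T @ dz1
    grads["b1"] = dz1.sum(axis=0)
    return grads


def train(X, y, n_hidden=4, lr=0.5, steps=500, seed=0):
    params = init_params(X.shape[1], n_hidden, seed)
    losses = []
    for _ in range(steps):
        h, p = forward(params, X)
        losses.append(bce_loss(p, y))
        grads = backward(params, X, y, h, p)
        for name in params:              # plain gradient-descent weight update
            params[name] -= lr * grads[name]
    return params, losses


# Toy data (logical OR), used only to exercise the code.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [1.0]])
params, losses = train(X, y)
```

Comparing `losses[0]` with `losses[-1]` is a quick sanity check that the forward pass, gradients, and updates are wired together consistently.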

probability-of-defaulters

Hackathon: Probability of Defaulters

restricted-boltzmann-machine---assignment

Topic Modelling is the art and science of identifying the 'latent topics' in a text. It is an unsupervised problem. You input a set of documents/ corpus into the model and the model finds the topics that describe the corpus. Each topic is a distribution over the words that best describe the topic.

supercabs---rl-project

Build an RL-based algorithm that can help cab drivers maximise their profits by improving their decision-making in the field.

telecom-churn-project

Business Problem Overview

In the telecom industry, customers are able to choose from multiple service providers and actively switch from one operator to another. In this highly competitive market, the telecommunications industry experiences an average annual churn rate of 15-25%. Given that it costs 5-10 times more to acquire a new customer than to retain an existing one, customer retention has now become even more important than customer acquisition. For many incumbent operators, retaining highly profitable customers is the number one business goal.

To reduce customer churn, telecom companies need to predict which customers are at high risk of churn. In this project, you will analyse customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn, and identify the main indicators of churn.

tic-tac-toe---assignment

The other popular variant of this game is Numerical Tic-Tac-Toe. Instead of X’s and O’s, the numbers 1 to 9 are used. In the 3x3 grid, numbers 1 to 9 are filled, with one number in each cell. The first player plays with the odd numbers, the second player plays with the even numbers, i.e. player 1 can enter only an odd number in the cell while player 2 can enter an even number in one of the remaining cells.
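The move rules described above can be sketched as a small Python class. This covers move validation only; the class and method names are hypothetical, and the rule that each number may be placed at most once is this sketch's reading of "one number in each cell".

```python
class NumericalTicTacToe:
    """Board and move validation for Numerical Tic-Tac-Toe (no win detection)."""

    def __init__(self):
        # Nine cells indexed 0..8, row by row; None marks an empty cell.
        self.board = [None] * 9

    def allowed_numbers(self, player):
        """Player 1 uses odd numbers, player 2 even numbers; used numbers are excluded."""
        if player not in (1, 2):
            raise ValueError("player must be 1 or 2")
        parity = 1 if player == 1 else 0
        used = {value for value in self.board if value is not None}
        return [n for n in range(1, 10) if n % 2 == parity and n not in used]

    def play(self, player, cell, number):
        """Place `number` in `cell` if the move is legal, otherwise raise ValueError."""
        if not 0 <= cell < 9:
            raise ValueError("cell must be between 0 and 8")
        if self.board[cell] is not None:
            raise ValueError("cell is already filled")
        if number not in self.allowed_numbers(player):
            raise ValueError("number is not available to this player")
        self.board[cell] = number


game = NumericalTicTacToe()
game.play(1, 4, 5)   # player 1 places the odd number 5 in the centre
game.play(2, 0, 2)   # player 2 places the even number 2 in the top-left corner
```

An agent or game loop would call `allowed_numbers` to enumerate the legal actions for the player whose turn it is.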
