Sazee S.'s Projects
These are projects I completed at university for a probability and statistics course, using RStudio. I scored 91% in the subject. The projects involve finding probabilities of events and the support and confidence of Bayes rules. I performed statistical analysis on the Groceries data set to find association rules. I also used probability density functions, the integrate function, joint probability mass functions, the Poisson distribution, probability mass functions, the binomial distribution, the M|M|1 queue and hypothesis testing in RStudio. Finally, I wrote an essay summarizing an article by a statistician (Jim Ridgway) for readers who are not familiar with statistical terms.
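As a rough illustration of the support and confidence measures mentioned above, here is a minimal sketch. The original work was done in R; this version uses Python instead, and the function names and the toy baskets are hypothetical, not taken from the Groceries data set.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    both = support(transactions, antecedent | frozenset(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Hypothetical toy baskets (an assumption for illustration only).
baskets = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "butter"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]

# Rule: {milk} -> {bread}
rule_support = support(baskets, {"milk", "bread"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"milk"}, {"bread"})  # 2 of the 3 milk baskets
```

Here the rule {milk} -> {bread} has support 0.5 and confidence 2/3, because two of the four baskets contain both items and two of the three baskets with milk also contain bread.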
Repository to store sample Python programs for learning Python
In this project, we'll get familiar with the Python programming language and the PyTorch machine learning framework. The combination of Python and PyTorch facilitates rapid machine learning development and experimentation, while also being suitable for production-ready systems. We will understand Python syntax and control structures and become familiar with PyTorch tensor operations.
I created programs in Python using Spyder (Anaconda3 Navigator).
I designed the popular tic-tac-toe game in Python. The game is implemented in two ways: procedural programming and object-oriented programming. In this project I used these programming concepts: classes/objects, conditions (if/elif/else), user-defined functions, built-in functions and exception handling. I also learned other OOP concepts such as inheritance, encapsulation and polymorphism.
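The concepts listed above (conditions, user-defined functions, exception handling) can be sketched in a few lines of procedural Python. This is only an illustrative sketch, not the code from the repository; the board layout (a flat list of nine cells) and the names `WIN_LINES`, `place` and `winner` are assumptions.

```python
from typing import List, Optional

# Index triples that form a winning line on a 3x3 board stored as a flat list.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def place(board: List[str], position: int, mark: str) -> None:
    """Put `mark` on the board, raising ValueError for an invalid move."""
    if not 0 <= position < 9:
        raise ValueError("position must be between 0 and 8")
    if board[position] != " ":
        raise ValueError("cell already taken")
    board[position] = mark


def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if a player holds a full line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


# A short scripted game: X takes the whole top row.
board = [" "] * 9
for pos, mark in [(0, "X"), (3, "O"), (1, "X"), (4, "O"), (2, "X")]:
    place(board, pos, mark)
result = winner(board)

# Exception handling: trying to reuse an occupied cell is rejected.
try:
    place(board, 0, "O")
    rejected = False
except ValueError:
    rejected = True
```

An object-oriented variant would typically wrap `board`, `place` and `winner` in a class, which is one way the two designs mentioned above could differ.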
Config files for my GitHub profile.
As a data scientist for the multinational technology company Apple Inc., I developed a sentiment analytics engine for Twitter, which is used to predict the sentiment of consumers’ reviews. The aim was to develop both dictionary-based and machine learning-based sentiment analytics scripts using a number of R libraries and SAS Sentiment Analysis Studio. I used the engine to predict Apple reviewers’ sentiments and to benchmark various algorithms and analytics tools.
Technical documentation for Microsoft SQL Server, tools such as SQL Server Management Studio (SSMS), SQL Server Data Tools (SSDT), etc.
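To make the dictionary-based approach concrete, here is a minimal sketch. The project itself used R libraries and SAS Sentiment Analysis Studio; this version is written in Python, and the tiny word lists and the function name `lexicon_sentiment` are assumptions made up for illustration, not the lexicon used in the project.

```python
import re

# Hypothetical miniature lexicons; a real dictionary would be far larger.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def lexicon_sentiment(text: str) -> str:
    """Label text by counting positive words minus negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    lexicon_sentiment("I love the new phone, great camera"),
    lexicon_sentiment("Terrible battery, I hate it"),
    lexicon_sentiment("It arrived on Tuesday"),
]
```

A machine learning-based script would instead learn the word weights from labelled reviews, which is what makes benchmarking the two approaches against each other meaningful.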
Azure Data SQL Samples - Official Microsoft GitHub Repository containing code samples for SQL Server, Azure SQL, Azure Synapse, and Azure SQL Edge
I worked as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study was to educate the marketing teams of the company's clients (in diverse industries) on how to market their products and/or services on social media to maximise customers’ involvement (positive interest and sharing). The company was interested in the relationship between keywords, comments and sentiments, and in whether that relationship differs across topic categories such as entertainment, technology and sports that are of interest to clients in various industries.
In this project I separated my code into a number of files instead of using the notebook to interact with Python, which is closer to how a real-life project would be structured. To facilitate this in Colaboratory, I mounted my Google Drive storage to the notebook so I could use it like a regular file system. After this, I moved on to completing a fully functional code base that allowed me to train models, save them to disk and load them again for later use. I demonstrate this with a text sentiment task: classifying a piece of text as either positive or negative in sentiment.
Text Classification Algorithms: A Survey
This project uses transfer learning on a raw dataset collected on a holiday in Africa, where the collector went on a safari to observe wild animals in their natural habitat. During this trip he captured 100 photos each of buffalo, elephants, rhinos and zebras, and then asked me to categorise them. To avoid manually labelling each of the 400 photos, I decided that it would be more fun to build an image classification neural network to solve the problem. This way, I only need to label 10 images of each animal and can let a classifier sort out the rest.
In this project we convert the problem we solved in the regression notebook into a classification problem. We assign each wine quality rating to its own class, so a quality of 0 belongs to a different class from a quality of 1. The biggest difference between a regression solution and multi-class classification in terms of implementation is that the model now needs to output 10 values (one for each quality rating) instead of a single output. We also need to change the loss function.
This project predicts the quality of wine based on features such as alcohol content and density, using a publicly available dataset. The target variable, "quality", is a subjective measure of the wine's quality based on expert tasters. I am using pandas, NumPy, torch and matplotlib in Python.
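The two changes described above (10 outputs instead of one, and a different loss) can be illustrated in plain Python. This is a sketch of the arithmetic only, not the project's PyTorch code; it assumes softmax with cross-entropy as the classification loss and squared error as the regression loss, and the example logits are made up.

```python
import math
from typing import List

NUM_CLASSES = 10  # one class per quality rating, as described above


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target_class: int) -> float:
    """Classification loss: negative log-probability of the true class."""
    return -math.log(softmax(logits)[target_class])


def squared_error(prediction: float, target: float) -> float:
    """Regression loss on a single scalar output, shown for contrast."""
    return (prediction - target) ** 2


# Hypothetical model output: 10 scores, with the highest on class 6.
logits = [0.0] * NUM_CLASSES
logits[6] = 2.0

probs = softmax(logits)
predicted_class = max(range(NUM_CLASSES), key=lambda k: probs[k])
loss_if_true_is_6 = cross_entropy(logits, target_class=6)
loss_if_true_is_5 = cross_entropy(logits, target_class=5)
```

The loss is smaller when the true class is the one the model scored highest, which is the behaviour a classification loss needs and which a squared error on a single number does not express.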