
👋 Hi, I’m Erinç Koç. I have worked as both a data engineer and a data scientist at banks in Turkey.

  • 👀 I’m interested in data applications
  • 🌱 I’m currently learning everything about data
  • 📫 How to reach me [email protected]

I just want to create a page that shows all my work up to this point in my life.

Erinc's Projects

anagram_check

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase. In other words, both strings must contain exactly the same letters with exactly the same frequencies. Write a Python script that reads 2 strings from the command line and finds out whether they are anagrams or not. If they are not anagrams, the script should find and print the minimum number of character deletions required to make the two strings anagrams. Otherwise, it should just print that they are anagrams.

**Input Format**

- The first line contains a single string, **a**.
- The second line contains a single string, **b**.

Expected input and output:

```
$ python3 solution.py
a: Tom Marvolo Riddle
b: I Am Lord Voldemort
remove 7 characters from 'Tom Marvolo Riddle' and 8 characters from 'I Am Lord Voldemort'

$ python3 solution.py
a: tom marvolo riddle
b: i am lord voldemort
remove 0 characters from 'tom marvolo riddle' and 1 characters from 'i am lord voldemort'

$ python3 solution.py
a: tom marvolo riddle
b: i am lordvoldemort
they are anagrams

$ python3 solution.py
a: tom riddle
b: voldemort
remove 3 characters from 'tom riddle' and 2 characters from 'voldemort'
```
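One possible way to approach this task, shown here as a minimal sketch rather than the repository's actual solution, is to count the characters of each string and compare the counts. The function names `deletions_needed` and `report` are made up for illustration. The sample outputs suggest that the comparison is case-sensitive and that spaces count as characters, so the sketch assumes both.

```python
from collections import Counter
from typing import Tuple


def deletions_needed(a: str, b: str) -> Tuple[int, int]:
    """Return how many characters must be deleted from a and from b.

    Assumption: case-sensitive, and spaces count as characters.
    """
    count_a, count_b = Counter(a), Counter(b)
    # Counter subtraction keeps only positive counts, i.e. the surplus
    # characters that have no partner in the other string.
    surplus_a = sum((count_a - count_b).values())
    surplus_b = sum((count_b - count_a).values())
    return surplus_a, surplus_b


def report(a: str, b: str) -> str:
    """Build the message in the format of the expected output."""
    from_a, from_b = deletions_needed(a, b)
    if from_a == 0 and from_b == 0:
        return "they are anagrams"
    return (
        f"remove {from_a} characters from '{a}' "
        f"and {from_b} characters from '{b}'"
    )
```

A script built on this sketch could read the two strings with `input("a: ")` and `input("b: ")` and print the result of `report`.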

bigquery_mysql_connect

Create an ETL job with Python. The Python file has to retrieve data from BigQuery piece by piece (10k, 100k, etc.). Data can be stored in any local relational database (MySQL, etc.).

  • The file takes two date parameters: batch and realtime. The batch parameter should get the past data and write it to a database as fast as possible. Please measure its time and improve the performance (hint: parallel processing). The realtime parameter should get the last day.
  • The file has to be robust in terms of logging and try-except mechanisms (DB connections, etc.).
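The batch requirement could be sketched roughly as follows: split the date range into daily windows, load the windows in parallel, and log failures instead of aborting the run. This is only an illustration under stated assumptions. `load_day` is a hypothetical stand-in that returns a fake row count and never touches BigQuery or MySQL, and `daily_windows` and `run_batch` are names invented for this sketch.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import date, timedelta

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")


def daily_windows(start: date, end: date):
    """Return every day from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]


def load_day(day: date) -> int:
    """Hypothetical stand-in for the real work.

    A real job would page through the BigQuery result in chunks
    (10k, 100k rows) and bulk-insert each chunk into the database.
    Here it only returns a fake row count.
    """
    return 10_000


def run_batch(start: date, end: date, loader=load_day, workers: int = 4):
    """Load all days in parallel; return (rows_loaded, failed_days)."""
    total, failed = 0, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(loader, day): day
                   for day in daily_windows(start, end)}
        for future in as_completed(futures):
            day = futures[future]
            try:
                total += future.result()
            except Exception:
                # Log the traceback and keep going with the other days.
                log.exception("load failed for %s", day)
                failed.append(day)
    return total, sorted(failed)
```

Threads are a reasonable first choice here because the work is dominated by network and database I/O. The realtime mode would then be the same call with a single-day window.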

clustering_categorical_data

Discover different segments of sessions which differ from each other by their navigational patterns before adding a product to the basket. You are free to differentiate your segments based on the category id or domain name of the products, if you feel it is necessary. Dimension reduction is also applied.

data_analysis

In this notebook, I applied statistical methods for imbalanced data analysis. It starts with the basics: a null check, data description, and handling of missing values. The numerical columns are right-skewed. Shapiro-Wilk and Anderson-Darling tests are applied to show that the data is not normally distributed. Outlier detection with IQR is applied to the numerical columns. A chi-square test is applied to the categorical columns in order to test whether their distributions differ across the target column. Correlation analysis for the imbalanced data set is applied by using undersampling methods.
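As an illustration of the IQR step, a minimal sketch using only the standard library might look like this. It is not taken from the notebook: the function name `iqr_outliers` is made up, and the 1.5 multiplier is the conventional default, assumed here rather than confirmed by the source.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: k = 1.5, the conventional fence multiplier.
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))  # → [100]
```

Because the fences are built from quartiles rather than the mean and standard deviation, the rule stays usable on right-skewed columns where a normality assumption does not hold.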

data_enhancement

Data Quality: How would you improve the data quality of this data set, what are your main conclusions about the data quality? What interventions have you done on the data set before analysing further? What did you learn?

e181337

Config files for my GitHub profile.

linkfire_data_analysis

Our goal is to understand this traffic better, in particular the volume and distribution of events, and to develop ideas on how to increase the links' click rates.

navigation_pattern_estimation

Come up with a prescriptive model that is able to give directions on how to maximize the “Purchase Completed” probability of a session. For example, at which state of a session what kind of directions may be given to customers, which patterns contribute the most to the “Purchase Completed” probability, etc.

python_hive_connection

Writing a pandas DataFrame to a Hive database using the pyhive library. Kerberos authentication is used to reach the cluster.

python_hive_sqlalchemy_connection

I will show how to connect to a kerberized Hadoop cluster using the sqlalchemy library. A connection engine is created and used to write a DataFrame to the database.

top_seller_class

Write a Python class using pandas that finds and prints:

  • the top-selling n products in a given date range (product name & quantity)
  • the top-selling n stores in a given date range (store name & quantity)
  • the top-selling n brands in a given date range (brand & quantity)
  • the top-selling n cities in a given date range (city & quantity)
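A class along these lines could be sketched as below. The column names (`date`, `quantity`, and the grouping column passed to `top`) and the class name `TopSellers` are assumptions made for illustration, since the source does not describe the layout of the sales data. One generic method covers all four rankings.

```python
import pandas as pd


class TopSellers:
    """Rank products, stores, brands, or cities by quantity sold.

    Assumption: the sales table has a "date" column, a "quantity"
    column, and one column per dimension to rank by.
    """

    def __init__(self, sales: pd.DataFrame):
        self.sales = sales.copy()
        self.sales["date"] = pd.to_datetime(self.sales["date"])

    def top(self, column: str, n: int, start: str, end: str) -> pd.Series:
        """Return the n largest quantity totals for `column` in the range."""
        # between() includes both endpoints by default.
        in_range = self.sales["date"].between(
            pd.Timestamp(start), pd.Timestamp(end)
        )
        totals = self.sales.loc[in_range].groupby(column)["quantity"].sum()
        return totals.nlargest(n)

    def print_top(self, column: str, n: int, start: str, end: str) -> None:
        """Print each name with its total quantity."""
        for name, quantity in self.top(column, n, start, end).items():
            print(f"{name}: {quantity}")
```

For example, `TopSellers(sales).print_top("product", 3, "2021-01-01", "2021-01-31")` would print the three best-selling products for January 2021, given a `sales` frame with a `product` column.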
