thekivs
Name: Souvik Dutta
Type: User
Company: Veritas Technologies
Bio: Sr. ML Scientist at Veritas | Ph.D. in Physics from UIUC. Interests: ML, AlgoTrading, Haiku.
Location: Santa Clara, CA
Brazil has the largest rainforest on the planet, the Amazon rainforest. Forest fires are a serious threat to the preservation of tropical forests. Understanding how the frequency of forest fires changes over time can help in taking action to prevent them.
Dataset with ~20% empty cells across numeric and categorical features. Performed XGBoost regression with the best_params_ obtained from a 4-fold cross-validated GridSearch. Training RMSE = 25667. The predictions for 1459 houses are included separately in pred.csv. (Credits: Dean de Cock on Kaggle.com)
Bike sharing systems are a means of renting bicycles where the process of obtaining membership, rental, and bike return is automated via a network of kiosk locations throughout a city. Using these systems, people are able to rent a bike from one location and return it to a different place on an as-needed basis. Currently, there are over 500 bike-sharing programs around the world. The data generated by these systems makes them attractive for researchers because the duration of travel, departure location, arrival location, and time elapsed are explicitly recorded. Bike sharing systems therefore function as a sensor network, which can be used for studying mobility in a city. In this competition, participants are asked to combine historical usage patterns with weather data in order to forecast bike rental demand in the Capital Bikeshare program in Washington, D.C.
Repository for code used in my blog posts
Edge detection is a multi-step algorithm that can detect edges with noise suppressed at the same time. We smooth the image with a Gaussian filter (via convolution) to reduce noise and unwanted details and textures.
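The smoothing step described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names `gaussian_kernel` and `convolve`, the kernel size, the value of sigma, and the edge-replication border handling are all assumptions made for the example.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def convolve(image, kernel):
    """Smooth a 2D list of pixel values with an odd-sized square kernel.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption). The Gaussian kernel is symmetric, so flipping it,
    as a strict convolution would, changes nothing.
    """
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(max(i + di, 0), h - 1)  # clamp row index
                    jj = min(max(j + dj, 0), w - 1)  # clamp column index
                    acc += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = acc
    return out

# Usage: blur a single bright pixel on a dark 5x5 background.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
blurred = convolve(spike, gaussian_kernel(5, 1.0))
```

Because the kernel is normalized to sum to 1, flat regions keep their brightness while isolated noisy pixels are spread over their neighbours.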
iPython code for testing canny edge detection
Cliff walking is a standard undiscounted episodic task with start and goal states, and the usual actions of going UP, DOWN, LEFT or RIGHT. The reward is -1 on all transitions, except in the "cliff" region. Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start. The graph below shows the performance of the Sarsa and Q-learning methods with epsilon-greedy action selection (epsilon=0.1).
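A minimal tabular sketch of the two methods on this task follows. It is not the repository's code: the 4x12 grid layout (borrowed from the common textbook version of the problem), the step size `alpha=0.5`, the episode count, and all function names are assumptions. Only the rewards, the four actions, the undiscounted setting, and epsilon=0.1 come from the description above.

```python
import random

ROWS, COLS = 4, 12                       # assumed grid size
START, GOAL = (3, 0), (3, 11)            # assumed start and goal cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # UP, DOWN, LEFT, RIGHT

def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    r = min(max(state[0] + ACTIONS[action][0], 0), ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:   # stepped into the cliff
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL

def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])

def train(method, episodes=500, alpha=0.5, epsilon=0.1, seed=0):
    """Run 'sarsa' or 'q_learning'; return the Q-table and episode returns."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(ROWS) for c in range(COLS)}
    returns = []
    for _ in range(episodes):
        state, total, done = START, 0, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            nxt, reward, done = step(state, action)
            if method == "sarsa":
                # On-policy: bootstrap from the action actually taken next.
                nxt_action = epsilon_greedy(q, nxt, epsilon, rng)
                bootstrap = q[nxt][nxt_action]
            else:
                # Off-policy: bootstrap from the greedy value of the next state.
                bootstrap = max(q[nxt])
            target = reward + (0.0 if done else bootstrap)  # gamma = 1
            q[state][action] += alpha * (target - q[state][action])
            if method != "sarsa":
                nxt_action = epsilon_greedy(q, nxt, epsilon, rng)
            state, action, total = nxt, nxt_action, total + reward
        returns.append(total)
    return q, returns

q_sarsa, sarsa_returns = train("sarsa")
q_qlearn, qlearn_returns = train("q_learning")
```

The only difference between the two methods is the bootstrap term: Sarsa uses the value of the action it will actually take next, while Q-learning uses the maximum over next actions regardless of what the exploratory policy does.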
Repo for the Deep Reinforcement Learning Nanodegree program
EPI Judge - Preview Release
Immersion in the world of finance through Python
Multiclass classification on the RNA-Seq (HiSeq) PANCAN data set, a random extraction of gene expressions of patients having different types of tumor: BRCA, KIRC, COAD, LUAD and PRAD.
A toolkit for developing and comparing reinforcement learning algorithms. I usually use it to develop new RL techniques and compare with existing motifs.
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Generic FastJet wrapper
Stores the ROOT tuple for the jet clustering exercise
This is a short Keras tutorial that I often revisit myself.
A pure python implementation of CRYSTALS-Kyber
Dataset of high-pT jets from simulations of proton-proton collisions at the Large Hadron Collider at CERN. The aim is binary classification for jet tagging: whether the jet originated from a quark or a gluon. Includes: High-level features (see https://arxiv.org/abs/1804.06913). List: list of jet features with up to 30 particles/jet (see https://arxiv.org/abs/1908.05318).
Build a linear regression based on OECD's life satisfaction data and the IMF's GDP per capita data.
A List of Recommender Systems and Resources
A generic implementation of classification for the MNIST dataset using scikit-learn and pandas libraries
We use content based recommendation systems to predict which movies are closest to the user profile
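One common way to measure how close a movie is to a user profile is cosine similarity between feature vectors. The sketch below assumes that approach; the repository may use a different one. The movie titles, the three genre features, the ratings, and the function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_profile(rated, features):
    """Rating-weighted sum of the feature vectors of the movies a user rated."""
    dim = len(next(iter(features.values())))
    profile = [0.0] * dim
    for title, rating in rated.items():
        for k, x in enumerate(features[title]):
            profile[k] += rating * x
    return profile

def recommend(rated, features, top_n=2):
    """Rank unrated movies by similarity to the user profile."""
    profile = build_profile(rated, features)
    candidates = [t for t in features if t not in rated]
    candidates.sort(key=lambda t: cosine(profile, features[t]), reverse=True)
    return candidates[:top_n]

# Hypothetical data: features are [action, comedy, sci-fi] genre flags.
features = {
    "Movie A": [1, 0, 1],
    "Movie B": [0, 1, 0],
    "Movie C": [1, 0, 0],
    "Movie D": [0, 1, 1],
}
rated = {"Movie A": 5, "Movie B": 1}   # hypothetical user ratings
top = recommend(rated, features, top_n=1)
```

Here the profile leans toward action and sci-fi because the highly rated movie carries those flags, so the action movie ranks above the comedy/sci-fi one.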
We use the "People Wikipedia Data", freely available at https://www.kaggle.com/sameersmahajan/people-wikipedia-data/data. We recommend the next article that is similar in spirit to the current article using Non-negative Matrix Factorization.
Hybrid PSO Clustering Algorithm with K-Means for Data Clustering
Ph.D. Thesis from Goldsmiths, University of London entitled, "Audiovisual Scene Synthesis". Hosts all images and latex files.
Problem sets for PHYS 598 that I designed at UIUC (Fall 2019)
We employ a Long Short Term Memory (LSTM) model on data pulled from coinmarketcap.com to predict cryptocurrency prices for the largest coins by total market cap.
Track reconstruction in Project-8 spectrograms
The interface between FastJet and NumPy