rachitsharma2001

followers: 9 following: 6 repos: 38 gists: 0

Type: User

rachitsharma2001's Projects

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

bpnet

Toolkit to train base-resolution deep neural networks on functional genomics data and to interpret them

classifying-handwritten-digits

This program takes in handwritten digits (data provided by MNIST) and identifies which of the 10 possible digits (0-9) each one is. It has an accuracy of over 80%.

ec2-github-runner

On-demand self-hosted AWS EC2 runner for GitHub Actions

fake-gcs-server

Google Cloud Storage emulator & testing library.

first-contributions

🚀✨ Help beginners to contribute to open source projects

googleapis

Public interface definitions of Google APIs.

During the fall quarter of my freshman year, I was taking a calculus course and learning about antiderivatives, or integrals. I kept wondering what exactly an integral was in a physical sense. I knew that I could picture a derivative as the slope of a graph at a certain point, but I couldn't see how to picture an integral. After thinking for a while, I realized that an integral could be thought of as a curve whose tangent line at each point has a slope equal to the value, at that same x, of the function being integrated. To test this idea, I decided to write a program that takes a simple function like y = x^2 and graphs its integral using the procedure I described. I learned a lot more about calculus along the way, which helped me a lot in the class!
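The idea in this description can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the project: the function name `antiderivative_points`, the trapezoidal stepping rule, and the choice of starting value `y_start` are all hypothetical. It builds the curve point by point so that its slope at each x matches the value of the original function there.

```python
def antiderivative_points(f, x_start, x_end, steps=1000, y_start=0.0):
    """Return (x, F(x)) points where F'(x) is approximately f(x).

    Assumption: F(x_start) = y_start, which fixes the constant of integration.
    """
    dx = (x_end - x_start) / steps
    x, y = x_start, y_start
    points = [(x, y)]
    for i in range(steps):
        x_next = x_start + (i + 1) * dx
        # Trapezoidal step: the curve rises by the average of f at both
        # ends of the interval, multiplied by the interval width.
        y += 0.5 * (f(x) + f(x_next)) * dx
        x = x_next
        points.append((x, y))
    return points


# Example: y = x^2 on [0, 3]; the exact antiderivative is x^3 / 3.
curve = antiderivative_points(lambda x: x ** 2, 0.0, 3.0)
```

The resulting list of points can be passed to any plotting library to draw the integral next to the original function.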

interpreter

mazegeneration

moby

Moby Project - a collaborative project for the container ecosystem to assemble container-based systems

personalmusicplayer

A personal music player that should let me add, remove, play, and shuffle songs.

portscanner

rachitsharma2001.github.io

reinforce

In this project, I implemented the reinforcement learning algorithm called REINFORCE.

reinforce_implementation

In this project, I implemented the reinforcement learning algorithm called REINFORCE. This algorithm belongs to the general class of policy gradient methods for reinforcement learning, meaning it optimizes the policy directly, adjusting it as the agent learns more about the environment. In this specific algorithm, the agent goes through many episodes. In each episode, it first does a Monte Carlo rollout, meaning it acts in the environment according to its parameters (the weights of a neural network that outputs a probability for each of the two possible actions) until it either "dies" (in cart pole, this means the cart goes off position or the pole goes off angle) or "survives" for 200 time steps. For each time step, the state, action, and reward are recorded. After the rollout, the algorithm looks at each recorded state, action, and reward and calculates, based on the reward the action earned, whether to increase or decrease the probability of that action being chosen by the neural network (whose weights represent the policy). We repeat this until the agent consistently gets a score of 200 for 100 episodes (which is considered a solve in the cart pole environment). I used PyTorch to implement the policy neural network. The biggest challenge I faced was a dumb mistake: at first, I only allowed the agent to pick the action with the highest probability. This meant the agent did not explore much of its environment! After fixing this, the agent was able to solve the environment in about 2,500 episodes, so this is certainly not the best implementation, but it's a great and fun start!
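The update described above can be sketched without any framework. This is a minimal illustration under several assumptions, not the project's PyTorch code: a two-action logistic policy stands in for the neural network, the discount factor `gamma` is an assumption (the description does not mention discounting), and all function names are hypothetical. It also shows sampling an action from the policy's probabilities, which is the usual alternative to always taking the most likely action.

```python
import math
import random


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every time step t."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def action_probability(weights, state):
    """Probability of action 1 under a logistic policy (stand-in for the network)."""
    z = sum(w * s for w, s in zip(weights, state))
    return 1.0 / (1.0 + math.exp(-z))


def sample_action(weights, state, rng):
    """Sample action 0 or 1 from the policy instead of taking the argmax."""
    return 1 if rng.random() < action_probability(weights, state) else 0


def reinforce_update(weights, episode, learning_rate=0.01, gamma=0.99):
    """One REINFORCE step over a recorded episode of (state, action, reward)."""
    states, actions, rewards = zip(*episode)
    returns = discounted_returns(rewards, gamma)
    new_weights = list(weights)
    for state, action, g in zip(states, actions, returns):
        p = action_probability(weights, state)
        # For a logistic policy, grad log pi(a | s) = (a - p) * s, so a
        # positive return makes the chosen action more likely next time.
        for i, s in enumerate(state):
            new_weights[i] += learning_rate * g * (action - p) * s
    return new_weights


# Example: one recorded step where action 1 earned a reward of 1.
rng = random.Random(0)
weights = [0.0, 0.0]
chosen = sample_action(weights, (1.0, 0.0), rng)
updated = reinforce_update(weights, [((1.0, 0.0), 1, 1.0)],
                           learning_rate=0.1, gamma=1.0)
```

In a full training loop, the rollout would call `sample_action` at every time step, record each (state, action, reward) triple, and pass the finished episode to `reinforce_update`.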
