Giter VIP home page Giter VIP logo

ka-infection's Introduction

An implentation of the Infection project for Khan Academy's application process. The official description of the parameters can be found here. In brief, the challenge is to implement a graph of students and teachers who can be "infected" classroom-by-classroom by a new version of a piece of software. Classrooms are infected as a whole to make sure classmates are using the same version, and in the Limited Infection case, the goal is to infect only a specified portion of the population.

Usage

Run total_infection.py to execute randomized correctness tests for the Total Infection scenario. (Use -h to see command-line options.) These tests simply create randomly-sized classrooms and then infect and/or disinfect them, checking along the way that the infection has spread properly.

Run limited_infection.py to execute randomized correctness tests for the Limited Infection scenario. (Use -h to see command-line options.) These tests construct randomly-sized classrooms and then use the close_enough algorithm (by default) to search for approximate solutions to infect a random percent of the userbase. Prints the percent of these scenarios for which an acceptable solution was found and an average accuracy, which is the discrepancy between the target and the solution, as a percentage of the target (i.e. 10% means the solutions are +/- 10% of the target on average, 0% accuracy means a perfect solution).

Implementation

I chose to complete this project in Python both because the majority of my portfolio is in a web-focused language (JavaScript, PHP, etc.) and so I wanted to highlight my ability in other environments, and because Python is high level enough to focus on the conceptual parts of the project.

The teaching graph is implemented via the classes Teacher and Student in infection.py. These classes include an infect method that toggles their infection state and (by default) immediately spreads the infection through the classroom. Although strictly for the purposes of this project, a smaller data structure (an array of boolean values representing students, for example) would easily suffice, I chose to use fully-fledged classes to mirror the real world where classrooms and students are associated with a lot more information.

Algorithm

The Limited Infection scenario is an example of the famous Subset Sum problem, where the goal is to choose a subset of a given set of numbers that sum to a certain target. If the goal is to infect k students and we have classrooms of sizes a_1, a_2, a_3, ..., a_n then we attempt to find a subset of these classrooms with size summing to k. The Subset Sum problem is known to be NP-complete, so a perfect polynomial time algorithm is beyond the scope of this project. Thus, we can either compromise on complexity (and implement a brute force, exponential time algorithm) and/or correctness (and implement an algorithm that finds a subset of classrooms with only approximately k total students). As a start, the bruteforce function in limited_infection.py provides a brute-force but correct algorithm.

However, even though this problem is very difficult in the general case, the fact that we know something about the sizes of classrooms allows for powerful hueristics. For most practical purposes, we will have a large number of classrooms of relatively small size (a large, distributed user base) and if we would like to infect 10% of the userbase, getting within a margin of one or two classroom sizes is very acceptable. In addition, we probably want the infected classrooms to be chosen randomly so that users are equally likely to try out new features (imagine if we had many classrooms of size one, and always infected those for every new feature). The close_enough function in limited_infection.py implements this kind of an algorithm.

ka-infection's People

Contributors

dzirbel avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.