Giter VIP home page Giter VIP logo

callrouter's Introduction

callrouter

The programming exercise with valuechecker{dot}ai

Data sample

price_list_sample = {
    "A": [
        ("1", 0.9),
        ("268", 5.1),
        ("46", 0.17),
        ("4620", 0.0),
        ("468", 0.15),
        ("4631", 0.15),
        ("4673", 0.9),
        ("46732", 1.1),
    ],
    "B": [
        ("1", 0.92),
        ("44", 0.5),
        ("46", 0.2),
        ("467", 1.0),
        ("48", 1.2),
    ],
}

Intuition

This challenge is similar with the autocomplete searching problem using trie. Intuitively, trie (maybe) is a good fit data structure to solve this challenge.

We can use a bruteforce approach to determine the answer, however, using a certain data structure, I believe there will be a more efficient solution.

Approach

#1

Very first idea, we can build a trie to store price list for each operator. With a certain phone number needs routing, using the trie to search the price of the longest matching prefix with it.

Do the same for all the remaining operators, then compare the results to find out which operator has the cheapest price, and how much the cheapest is.

Illustration for the first idea

#2

Having many trie objects and doing search on all trie is not really efficient, both in time and space complexity. It would be better if we have only one trie object to manipulate.

In combined trie, each node in trie will store a hashmap of <operator: price>.

With a certain phone number, traverse the nodes until reaching the leaf node, at each step, save the most recent price of each operator (if any).

In the end, we have a hashmap that stores the prices of all the operators whose prefix matches that phone number. The last thing is to determine which price is the cheapest in this hashmap.

Illustration for the 2nd idea

How to run

Required

  • Python: 3.10+
  • Packages: coverage

Run test

% coverage run -m unittest -v
test_prefix_namedtuple (tests.test_cases.PriceItemTestCase) ... ok
test_router_advanced (tests.test_cases.RouterTestCase) ... ok
test_router_factory (tests.test_cases.RouterTestCase) ... ok
test_router_simple (tests.test_cases.RouterTestCase) ... ok
test_trie_init (tests.test_cases.TrieTestCase) ... ok
test_trie_insert (tests.test_cases.TrieTestCase) ... ok
test_trie_search (tests.test_cases.TrieTestCase) ... ok

----------------------------------------------------------------------
Ran 7 tests in 0.001s

OK
% coverage report -m
Name                   Stmts   Miss  Cover   Missing
----------------------------------------------------
router.py                 51      2    96%   14, 18
tests/__init__.py          0      0   100%
tests/test_cases.py       60      1    98%   111
utils/__init__.py          8      0   100%
utils/data_sample.py       1      0   100%
utils/trie.py             28      0   100%
----------------------------------------------------
TOTAL                    148      3    98%

Run program

% python3 main.py simple 46743212345 89373822722 24628273929 11282739293 4429283473473
[MODE=SIMPLE]
- results: [(prefix: '46', operator: 'A', price: 0.17), None, None, (prefix: '1', operator: 'A', price: 0.9), (prefix: '44', operator: 'B', price: 0.5)]
- time elapsed: 0.000047s
- size of trie(s): 232 bytes
--------------------------------
% python3 main.py advanced 46743212345 89373822722 24628273929 11282739293 4429283473473
[MODE=ADVANCED]
- results: [(prefix: '46', operator: 'A', price: 0.17), None, None, (prefix: '1', operator: 'A', price: 0.9), (prefix: '44', operator: 'B', price: 0.5)]
- time elapsed: 0.000037s
- size of trie(s): 48 bytes
--------------------------------

Discussions

  • If we just have only one process and one thread program, the appr#2 can reduce memory space to store trie object, we also need only one searching operation (per phone number) to determine the cheapest price. So that, it's really better than appr#01. Here is the test result with 10_000_000 input random numbers:
% python3 main.py test-perf
[MODE=SIMPLE]
- time elapsed: 7.404595s
- size of trie(s): 232 bytes
--------------------------------
[MODE=ADVANCED]
- time elapsed: 5.088647s
- size of trie(s): 48 bytes
--------------------------------
  • In both approaches, multithreading does not help speed up the program, because it's CPU bound task and limitation of Python GIL (Global Interpreter Locking).
  • With single request, using appr#1 we have a good chance to run the program in parallel for a part of program ( build trie and search on it), taking advantage of the cpu with many cores, each trie and search operation on it (for a certain operator) can be handled by a completely separate process. Or imagine a scenario using map-reduce architecture if the problem needs to scale indefinitely.
  • If we need to handle a large number of phone number routing requests simultaneously, the appr#02 is a good candidate to choose. Using multiprocessing, each request is handled by 1 separate process. But we won't reduce the memory space anymore (even increase), because each "fork" process will clone the memory.

callrouter's People

Contributors

linhpduc avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.