
backend-engineering-challenge's Introduction

Solution

Considerations

When developing the script, I aimed to be pragmatic and to avoid over-engineering it.

I used pip for dependencies (coverage is the only package) and relied mainly on Python built-ins to develop the solution.

I decided to use a simple FIFO queue whose size equals the max_window attribute, with each slot representing one minute. For each minute, the script checks whether the first pending event has already been triggered: if so, its duration is appended to the queue; if not, a 0 is appended (no translation duration in that minute).

The average delivery time for each minute is the sum of all queue slot values divided by the number of slots with a value greater than 0 (that is, the events inside the current window).

Because the input lines are ordered from oldest to newest, each event can be removed from the list once it has been inserted in the queue, so the next minute's iteration only needs to look at the first element.
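
As an illustration only (this is not the repository's actual code), the queue idea above could be sketched as follows. The function name `moving_average` is hypothetical, and the sketch assumes at most one event per minute and durations greater than 0, as in the sample input of the challenge.

```python
from collections import deque
from datetime import datetime, timedelta


def moving_average(events, window_size):
    """Yield (minute, average) pairs for events sorted oldest first.

    Hypothetical helper: assumes at most one event per minute and
    durations greater than 0, as in the sample input.
    """
    # Pending events as (timestamp, duration), oldest on the left.
    pending = deque(
        (datetime.fromisoformat(e["timestamp"]), e["duration"]) for e in events
    )
    if not pending:
        return
    last = pending[-1][0]
    # One slot per minute; maxlen makes the deque drop the oldest slot itself.
    window = deque(maxlen=window_size)
    minute = pending[0][0].replace(second=0, microsecond=0)
    while True:
        # Append the duration if the oldest pending event has already
        # happened by this minute, otherwise a 0 for "no translation".
        if pending and pending[0][0] <= minute:
            window.append(pending.popleft()[1])
        else:
            window.append(0)
        # Average over the slots that actually hold a duration.
        active = [d for d in window if d > 0]
        yield minute, (sum(active) / len(active) if active else 0)
        if minute >= last:
            break
        minute += timedelta(minutes=1)
```

Using `deque(maxlen=window_size)` keeps the eviction of old minutes implicit: appending to a full deque discards the slot at the opposite end, so no explicit bookkeeping of the window start is needed.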

Requirements

  • The code was developed using Python 3.10 but can be used with Python 3.*
  • coverage package for testing reports

Installation

Run the application

To use the application script, run the following command in your terminal:

$ average_delivery_time_cli --input_file [INPUT_FILE] --window_size [WINDOW_SIZE]

or

$ python main.py --input_file [INPUT_FILE] --window_size [WINDOW_SIZE]

  • INPUT_FILE - the path for the file with the events
  • WINDOW_SIZE - the number of minutes to take into account when calculating the average
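
For readers curious how the two flags might be parsed, here is a minimal sketch using the standard-library `argparse` module. The function name `build_parser` and the help texts are assumptions for illustration; the actual `main.py` may be organised differently.

```python
import argparse


def build_parser():
    """Build a parser for the two documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Moving average of translation delivery times."
    )
    parser.add_argument(
        "--input_file", required=True, help="path to the file with the events"
    )
    parser.add_argument(
        "--window_size", type=int, required=True, help="window size in minutes"
    )
    return parser


# Parsing an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(
    ["--input_file", "events.json", "--window_size", "10"]
)
print(args.input_file, args.window_size)
```

Declaring `type=int` lets `argparse` reject a non-numeric window size before any event is read.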

Run tests

To run the tests, run the following command in your terminal:

$ python -m unittest discover -s tests

or with coverage (generating a report and an HTML version of it):

$ coverage run -m unittest discover -s tests/

$ coverage report

$ coverage html

Backend Engineering Challenge

Welcome to our Engineering Challenge repository 🖖

If you found this repository it probably means that you are participating in our recruitment process. Thank you for your time and energy. If that's not the case please take a look at our openings and apply!

Please fork this repository before you start working on the challenge, because we will evaluate the code on the fork. Read the challenge carefully, take your time, and think about the solution.

This is an opportunity for us both to work together and get to know each other in a more technical way. If you have any questions, please open an issue and we'll reach out to help.

Good luck!

Challenge Scenario

At Unbabel we deal with a lot of translation data. One of the metrics we use for our clients' SLAs is the delivery time of a translation.

In the context of this problem, and to keep things simple, our translation flow is going to be modeled as only one event.

translation_delivered

Example:

{
	"timestamp": "2018-12-26 18:12:19.903159",
	"translation_id": "5aa5b2f39f7254a75aa4",
	"source_language": "en",
	"target_language": "fr",
	"client_name": "airliberty",
	"event_name": "translation_delivered",
	"duration": 20,
	"nr_words": 100
}

Challenge Objective

Your mission is to build a simple command line application that parses a stream of events and produces an aggregated output. In this case, we're interested in calculating, for every minute, a moving average of the translation delivery time for the last X minutes.

If we want to compute, for each minute, the moving average delivery time of all translations for the past 10 minutes, we would call your application like this (feel free to name it anything you like!):

unbabel_cli --input_file events.json --window_size 10

The input file format would be something like:

{"timestamp": "2018-12-26 18:11:08.509654","translation_id": "5aa5b2f39f7254a75aa5","source_language": "en","target_language": "fr","client_name": "airliberty","event_name": "translation_delivered","nr_words": 30, "duration": 20}
{"timestamp": "2018-12-26 18:15:19.903159","translation_id": "5aa5b2f39f7254a75aa4","source_language": "en","target_language": "fr","client_name": "airliberty","event_name": "translation_delivered","nr_words": 30, "duration": 31}
{"timestamp": "2018-12-26 18:23:19.903159","translation_id": "5aa5b2f39f7254a75bb3","source_language": "en","target_language": "fr","client_name": "taxi-eats","event_name": "translation_delivered","nr_words": 100, "duration": 54}

Assume that the lines in the input are ordered by the timestamp key, from lower (oldest) to higher values, just like in the example input above.
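
Since every line is a standalone JSON object, the input can be read lazily, one line at a time. The following is a minimal sketch under that assumption; `read_events` is a hypothetical name, not part of the challenge.

```python
import json


def read_events(lines):
    """Yield one event dict per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Typical use with a file object, which iterates line by line:
#     with open("events.json") as handle:
#         for event in read_events(handle):
#             ...
sample = ['{"timestamp": "2018-12-26 18:11:08.509654", "duration": 20}\n']
for event in read_events(sample):
    print(event["timestamp"], event["duration"])
```

Because the lines are already ordered by timestamp, a generator like this never needs to hold the whole file in memory or sort it.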

The output file would be something in the following format.

{"date": "2018-12-26 18:11:00", "average_delivery_time": 0}
{"date": "2018-12-26 18:12:00", "average_delivery_time": 20}
{"date": "2018-12-26 18:13:00", "average_delivery_time": 20}
{"date": "2018-12-26 18:14:00", "average_delivery_time": 20}
{"date": "2018-12-26 18:15:00", "average_delivery_time": 20}
{"date": "2018-12-26 18:16:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:17:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:18:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:19:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:20:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:21:00", "average_delivery_time": 25.5}
{"date": "2018-12-26 18:22:00", "average_delivery_time": 31}
{"date": "2018-12-26 18:23:00", "average_delivery_time": 31}
{"date": "2018-12-26 18:24:00", "average_delivery_time": 42.5}
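
To make the numbers concrete: at 18:16:00 the window contains the events with durations 20 and 31, so the average is (20 + 31) / 2 = 25.5; at 18:24:00 it contains 31 and 54, giving 42.5. A line in this format could be produced with a sketch like the one below, where `format_line` is a hypothetical helper and printing whole-number averages as integers is an assumption based on the sample output.

```python
import json
from datetime import datetime


def format_line(minute, average):
    """Render one output line (hypothetical helper)."""
    # Print whole-number averages as integers (20 rather than 20.0).
    if float(average).is_integer():
        average = int(average)
    return json.dumps(
        {
            "date": minute.strftime("%Y-%m-%d %H:%M:%S"),
            "average_delivery_time": average,
        }
    )


print(format_line(datetime(2018, 12, 26, 18, 16), (20 + 31) / 2))
# prints {"date": "2018-12-26 18:16:00", "average_delivery_time": 25.5}
```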

Notes

Before jumping right into implementation, we advise you to think about the solution first. We will evaluate not only whether your solution works but also the following aspects:

  • Simple and easy to read code. Remember that simple is not easy
  • Comment your code. The easier it is to understand the complex parts, the faster and more positive the feedback will be
  • Consider the optimizations you can do, given the order of the input lines
  • Include a README.md that briefly describes how to build and run your code, as well as how to test it
  • Be consistent in your code.

Feel free to include in your solution some of your considerations while doing this challenge. We want you to solve this challenge in the language you feel most comfortable with. Our machines run Python (3.7.x or higher) or Go (1.16.x or higher). If you are thinking of using any other programming language, please reach out to us first 🙏.

Also, if you have any problem please open an issue.

Good luck and may the force be with you
