Giter VIP home page Giter VIP logo

immuni-backend-common's Introduction

Immuni Backend Common


Table of contents

Context

This repository contains the source code common to all of Immuni's backend services. Before proceeding, we recommend reading their documentation first:

Please take the time to read and consider the other repositories in full before digging into the source code or opening an Issue. They contain a lot of details that are fundamental to understanding the source code and this repository's documentation.

The common logic aims to consistently provide the following:

  • The Sanic App creation logic
  • The Celery App creation logic
  • The Prometheus monitoring logic and common metrics
  • The Python packages dependencies, with optional extras
  • The Python packages dev-dependencies
  • The Dockerfile
  • The linters configurations, and a consistent way to invoke them

Installation

This repository is not meant to be used as a standalone. On the contrary, it assumes the following:

  • It is used as a Git submodule of any of the various Immuni backend services
  • The folder containing the submodule is named common

To leverage the common logic in any of Immuni’s backend services, the following is added into their pyproject.toml:

[tool.poetry.dependencies]
immuni-common = { path = "common", develop = true, extras = ["aioredis", "celery"] }

[tool.poetry.dev-dependencies]
immuni-common-dev = { path = "common/dev", develop = true }

[tool.poetry.scripts]
checks = "common.scripts:checks"

The specific Immuni backend service specifies any extras that it requires to run.

For more information about how the project is generated and structured, please refer to the Contributing section below.

Contributing

Contributions are most welcome. Before proceeding, please read the Code of Conduct for guidance on how to approach the community and create a positive environment. Additionally, please read our CONTRIBUTING file, which contains guidance on ensuring a smooth contribution process.

The Immuni project is composed of different repositories—one for each component or service. Please use this repository for contributions strictly relevant to Immuni's backend services. To propose a feature request, please open an issue in the Documentation repository. This lets everyone involved see it, consider it, and participate in the discussion. Opening an issue or pull request in this repository may slow down the overall process.

Contributors

Here is a list of Immuni's contributors. Thank you to everyone involved for improving Immuni, day by day.

Licence

Authors / Copyright

Copyright 2020 (c) Presidenza del Consiglio dei Ministri.

Please check the AUTHORS file for extended reference.

Third-party component licences

Please see the Technology Description’s Backend Services Technologies section, which also lists the corresponding licences.

Licence details

The licence for this repository is a GNU Affero General Public Licence version 3 (SPDX: AGPL-3.0). Please see the LICENSE file for full reference.

immuni-backend-common's People

Contributors

alessandroberlati avatar alfo1995 avatar astagi avatar earien avatar gnfrisicaro avatar immuniopensource avatar immunistaff avatar immuniteam avatar ocardia avatar ralbertazzi avatar xelhark avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

immuni-backend-common's Issues

[BUG] JSON logging is not working

Describe the bug

Although the root logger is set to use a JSON formatter, when running the example of any Immuni backend service through docker-compose, logs are not in JSON format.

This may be caused by Gunicorn and/or Uvicorn overriding and/or disabling the Sanic's loggers.

Celery logs are also not in JSON format.

To Reproduce

Expected behaviour

The logs should be in JSON format.

[FEAT] Use a proper path `/status` for monitoring status

Feature

Use proper endpoint for /status leave the root path for future use.

Describe the solution you'd like
The status of the API to be exposed on a proper path (eg. /status for the ongoing Italian API Guidelines).
The root path should be reserved some sensible choice, eg providing guidance to the client.

There should be some more logic in determining the application status, otherwise it will only reply 200 Ok even when the service is not working.

Describe alternatives you've considered

  • /status :)

Additional context
This simplifies cache policies: see #29
This might require updating the app.

[BUG] Status endpoint should prevent caching

Describe the bug

The status endpoint does not prevent caching.

To Reproduce

  1. curl http://localhost:5000/ -i
< HTTP/1.1 200 OK
< date: Wed, 10 Jun 2020 12:39:08 GMT
< server: uvicorn
< content-length: 0
< content-type: text/plain
< 

Expected behaviour

The following header to be returned

Cache-Control: no-store

Additional context

Since origin servers do not always provide explicit expiration times,
a cache MAY assign a heuristic expiration time when an explicit time
is not specified [...]

https://tools.ietf.org/html/rfc7234#section-4.2.2

[BUG] TekListValidator raises ValueError with an empty list

Describe the bug

The TekListValidator should check if the length of the list is 0 and fail. Otherwise, themin will raise ValueError.

To Reproduce

  1. Try to validate an empty list
from marshmallow import fields
from immuni_common.models.marshmallow.validators import TekListValidator

fields.Nested(
            fields.Integer, required=True, many=True, validate=TekListValidator()
        ).validate([])

Expected behaviour

The validation should not raise

[GENERAL] Typos in CONTRIBUTING.md

There are two typos in CONTRIBUTING.md

  • <type>/name_of_feature_or_fix. should be <type>/name_of_feature_or_fix
  • security and quality standards in the development tools described in should be security and quality standards using the development tools described in

[GENERAL] mongoengine delete already returns the count of deleted objects

As it is already done in the Exposure Ingestion Service code, also the common codebase should leverage the return value of the mongoengine's delete call.

The current implementation sees the following:

    @classmethod
    def delete_older_than(cls, datetime_: datetime) -> None:
        """
        Delete all batches older than the given datetime.

        :param datetime_: the datetime before which the batches are to be deleted.
        """
        objects = cls.objects.filter(id__lte=ObjectId.from_datetime(datetime_))
        count = objects.count()
        objects.delete()
        _LOGGER.info(
		    "BatchFile documents deletion completed.",
            extra=dict(n_deleted=count, created_before=datetime_.isoformat()),
        )

Yet, count = objects.delete() would work just the same.
At this point, I'd rename count to n_deleted to match the logged extra field.

[FEAT] API for extenal contacts tracing

Feature:
Probably many sites allow customers to register in the place, at a certain moment. What about an API to trace it on their (if any) immuni encrypted ID?

Solution:
A callable API to add up contact tracing, when users record their access (mandatory for current italian rules), by an online form, to a public place where usually they do not bring their phones back them (e.g. physical activity)

Geo restrictions aren't working as advertised/intended

Describe the bug

There seem to be some sort of GeoIP block implemented on endpoints (ie: https://get.immuni.gov.it/v1/keys/index) which is preventing the app from working as advertised/intended.

The behaviour observed is far from what's documented in the FAQ:

You can download Immuni from anywhere in the world. However, you can use it only in Italy:
• The app only connects to the server (for example, to download the keys of users who tested positive for the virus) if you’re within the European Union. Currently, connections to the server from other territories are not supported for security reasons.
• Should you test positive for the virus, you’ll have to be in Italy to be able to upload your keys to the server.

To Reproduce

Steps to reproduce the behaviour:

  1. Get an UK or Chinese SIM card (or many other countries really)
  2. Install Immuni
  3. Roam in Italy for a few days, get infected by someone who has Immuni on their phone
  4. Your app won't run any exposure checks and thus not notify you. Opening https://get.immuni.gov.it/v1/keys/index directly returns a 403.
  5. Believe you are safe (spoiler: you are not)

Expected behaviour

People traveling to Italy should be able to use the app regardless of the nationality of their SIM card. You also want to cover the use case of people travelling to Italy for business reasons with mandatory VPN on their phone. Exposure checks should be carried out regardless of real or GeoIP location - and in particular should always be carried out if a user is in-country.

As someone who visited Italy and had Immuni installed, I would expect the app to flag if I have been in someone confirmed positive's proximity - even after I return to my country, being it in EU or not. It will then be on me to contact my local healthcare provider.

(iOS is currently blocking notifications out of the intended country - but you should still be able to flag in-app)

If the app is unable to contact the notification endpoint, I would expect a big red banner saying the app is NOT working. At the moment Immuni on my phone is showing as perfectly working, even if it has been unable to download key sets for 6 days, and the required endpoint is returning a 403.

Screenshots

None.

Smartphone (optional):

  • Device: iPhone, but behaviour can be confirmed by simply curl https://get.immuni.gov.it/v1/keys/index

Additional context

None.

[GENERAL] CODEOWNERS is wrongly structured

The CODEOWNERS file should be stuctured as follows

# These owners will be the default owners for everything in
# the repo. Unless a later match takes precedence,
# @global-owner1 and @global-owner2 will be requested for
# review when someone opens a pull request.
*       @global-owner1 @global-owner2

[GENERAL] Using an integer unique key for BatchFiles

I'd like to thank you guys first for all the work you put into this, you're definitely not getting enough credit for it.

The issue I'd like to point out is with the schema of the BatchFile documents: I see that you're using a field called index to uniquely identify and index the batches generated from the user's keys uploads.

However, using unique integer indexes in MongoDB is a common bad practice, especially for collections with a large amount of documents, as it's pointed out in the MongoDB 3.* documentation: https://docs.mongodb.com/v3.0/tutorial/create-an-auto-incrementing-field/#considerations
I'm not sure why this page is not included in the MongoDB 4 documentation as well (I think they just removed the tutorials section from it) but the considerations made there are just as valid for MongoDB 4 as they were for v3.

Is there any reason why an integer field was preferred instead of the ObjectId already indexed, unique and incremental by design in BatchFile._id?

Additionally, is there any reason why the increment of the index field is implemented as below, instead of using one of the suggested strategies to create an auto incrementing unique field, as described in the documentation page already linked above?

from immuni-backend-exposure-ingestion/tasks/process_uploads.py#L101-L110:

            index = last_index + 1


            batch_file = BatchFile(
                index=index,
                keys=keys,
                period_start=period_start,
                period_end=period_end,
                sub_batch_index=1,
                sub_batch_count=1,
            )

This approach is clearly flawed as it does not allow batch generation to be performed concurrently (hence the shared lock acquired before doing this).

By using an ObjectId, or even just a smarter solution to achieve auto increment, you could easily start processing batches concurrently instead, making better use of celery and reducing the processing time of batches, that would otherwise grow linearly with the number of users of the app. Of course that is after taking on the other concurrency issues pointed out in immuni-backend-exposure-ingestion#15

I would be happy to help if you need clarification or an additional pair of hands to dive into this issues.

Thank you again!

[BUG] Application status response mismatch.

Describe the bug

Using root path for getting status does not return a json response

To Reproduce

  1. curl -i http://localhost:5000/ -H 'accept: application/json'

Expected behaviour

< HTTP/1.1 200 OK
< content-type: application/json

{...json payload..}

** Instead**
Text plain is returned, with no payload

< HTTP/1.1 200 OK
< date: Wed, 10 Jun 2020 12:39:08 GMT
< server: uvicorn
< content-length: 0
< content-type: text/plain

Additional context

[BUG] The logs currently show IPs

Describe the bug

The logging utilities should make sure to redact the messages.
Specifically, they should remove any IP—from the message, and from the extra fields.

To Reproduce

Expected behaviour

No IP should part of the logged information.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.