prio-processor's Introduction

prio-processor

Prio is a system for aggregating data in a privacy-preserving way. This repository includes a command-line tool for batch processing in Prio's multi-server architecture.

For more information about Prio, see this blog post.

Docker

This project contains a pre-configured build and test environment via Docker.

make

# or run directly through docker-compose
docker-compose build

You can mount your working directory and shell into the container for development work.

docker-compose run -v $PWD:/app prio_processor bash

Adding new dependencies

To add new Python dependencies to the container, use pip-tools to manage the requirements.txt.

pip install pip-tools

# generate the installation requirements from setup.py
pip-compile

# generate dev requirements
pip-compile requirements-dev.in

Any new system dependencies should be added to the Dockerfile at the root of the repository. These will be available at runtime.

Deployment Configuration

See the deployment directory for examples of configuration that can be used to aid deployment. These may also be run as integration tests to determine whether resources are configured properly. These will typically assume Google Cloud Platform (GCP) as a resource provider.

See the guide for more details.

prio-processor's Issues

Fix dockerhub deploys

#79 separated building the dev image from the prod image. However, this makes deploys fail:

#!/bin/sh -eo pipefail
docker build --target production -t prio:prod .

unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /root/project/Dockerfile: no such file or directory

Exited with code exit status 1

The final workflow stage does not have the repository checked out.

`bin/process` fails when dataset for batch_id does not exist

In our nonprod job:

[2020-09-25 03:28:23,142] {logging_mixin.py:112} INFO - [2020-09-25 03:28:23,142] {pod_launcher.py:125} INFO - b'+ spark-submit --name origin-telemetry --py-files /tmp/bootstrap-Fdu/prio_processor.egg /tmp/bootstrap-Fdu/processor-spark.py verify1 --batch-id content.blocking_opener_after_user_interaction_exempt-0 --n-data 2046 --input gs://moz-fx-prio-dev-a-private/data/v1/F992B575840AEC202289FBF99D6C04FB2A37B1DA1CDEB1DF8036E1340D46C561/origin-telemetry/2020-09-24/raw/shares/batch_id=content.blocking_opener_after_user_interaction_exempt-0 --output gs://moz-fx-prio-dev-a-shared/data/v1/F992B575840AEC202289FBF99D6C04FB2A37B1DA1CDEB1DF8036E1340D46C561/origin-telemetry/2020-09-24/intermediate/internal/verify1/batch_id=content.blocking_opener_after_user_interaction_exempt-0\n'
[2020-09-25 03:28:26,361] {logging_mixin.py:112} INFO - [2020-09-25 03:28:26,361] {pod_launcher.py:125} INFO - b'20/09/25 03:28:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n'
[2020-09-25 03:28:28,616] {logging_mixin.py:112} INFO - [2020-09-25 03:28:28,616] {pod_launcher.py:125} INFO - b'Running verify1\n'
[2020-09-25 03:28:36,151] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,151] {pod_launcher.py:125} INFO - b'Traceback (most recent call last):\n'
[2020-09-25 03:28:36,152] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,152] {pod_launcher.py:125} INFO - b'  File "/tmp/bootstrap-Fdu/processor-spark.py", line 2, in <module>\n'
[2020-09-25 03:28:36,153] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,152] {pod_launcher.py:125} INFO - b'    commands.entry_point()\n'
[2020-09-25 03:28:36,153] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,153] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 829, in __call__\n'
[2020-09-25 03:28:36,154] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,154] {pod_launcher.py:125} INFO - b'    return self.main(*args, **kwargs)\n'
[2020-09-25 03:28:36,154] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,154] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 782, in main\n'
[2020-09-25 03:28:36,155] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,155] {pod_launcher.py:125} INFO - b'    rv = self.invoke(ctx)\n'
[2020-09-25 03:28:36,155] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,155] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke\n'
[2020-09-25 03:28:36,156] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,156] {pod_launcher.py:125} INFO - b'    return _process_result(sub_ctx.command.invoke(sub_ctx))\n'
[2020-09-25 03:28:36,156] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,156] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke\n'
[2020-09-25 03:28:36,157] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,156] {pod_launcher.py:125} INFO - b'    return ctx.invoke(self.callback, **ctx.params)\n'
[2020-09-25 03:28:36,157] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,157] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 610, in invoke\n'
[2020-09-25 03:28:36,158] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,158] {pod_launcher.py:125} INFO - b'    return callback(*args, **kwargs)\n'
[2020-09-25 03:28:36,159] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,158] {pod_launcher.py:125} INFO - b'  File "/tmp/bootstrap-Fdu/prio_processor.egg/prio_processor/spark/commands.py", line 261, in verify1\n'
[2020-09-25 03:28:36,159] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,159] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 300, in json\n'
[2020-09-25 03:28:36,162] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,162] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/pyspark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__\n'
[2020-09-25 03:28:36,163] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,163] {pod_launcher.py:125} INFO - b'  File "/usr/local/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 137, in deco\n'
[2020-09-25 03:28:36,163] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,163] {pod_launcher.py:125} INFO - b'  File "<string>", line 3, in raise_from\n'
[2020-09-25 03:28:36,191] {logging_mixin.py:112} INFO - [2020-09-25 03:28:36,191] {pod_launcher.py:125} INFO - b'pyspark.sql.utils.AnalysisException: Path does not exist: gs://moz-fx-prio-dev-a-private/data/v1/F992B575840AEC202289FBF99D6C04FB2A37B1DA1CDEB1DF8036E1340D46C561/origin-telemetry/2020-09-24/raw/shares/batch_id=content.blocking_opener_after_user_interaction_exempt-0;\n'

Create a standard benchmarking dataset

There should be a standard benchmarking dataset which can be used to stress a reasonable deployment of the servers.

10^6 data points with 2000 elements should result in a 50GB dataset with 200 partitions at 250MB each. A reasonable benchmark would be roughly a tenth of this size at 20 partitions or 5GB.

This supplements the existing end-to-end testing data, e.g. the data produced by processor/bin/generate.
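
A minimal sketch of generating such a dataset with PySpark is below. The schema, payload size, and output path are assumptions for illustration only; processor/bin/generate remains the reference for real test data.

# Sketch: synthesize a ~5 GB benchmark dataset (a tenth of the full size),
# assuming roughly 50 KB per row and 20 partitions of ~250 MB each.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

n_rows = 10 ** 5            # a tenth of the 10^6 rows described above
n_partitions = 20
payload = "x" * 50_000      # stand-in for an encoded share of ~50 KB

df = (
    spark.range(n_rows)
    .withColumn("batch_id", F.lit("benchmark-0"))
    .withColumn("payload", F.lit(payload))
    .repartition(n_partitions)
)
df.write.mode("overwrite").json("benchmark/")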

Add `BUCKET_INTERNAL_INGESTION` for initializing data

Currently, the admin server places data into BUCKET_INTERNAL_PRIVATE, which is also used for intermediate values. This can give the administrator (the server that handles ingesting and partitioning data) overly broad access to each of the server buckets. In the Origin Telemetry case, server A and the admin server are both run in Airflow and are accessible to the same set of individuals.

Set of valid batch ids should be filtered in staging

from pyspark.sql import Window
from pyspark.sql.functions import col, explode, lit, row_number

df = (
    data
    # NOTE: is it worth counting duplicates?
    .dropDuplicates(["id"])
    .withColumn("data", explode("prioData"))
    # drop the id and assign a new one per encoding type
    # this id is used as a join-key during the decoding process
    .select(col("data.encoding").alias("batch_id"), "data.prio")
    .withColumn(
        "id", row_number().over(Window.partitionBy("batch_id").orderBy(lit(0)))
    )
)

The partitioning job should limit the set of valid batch ids to those defined in content.json (a sketch follows). Otherwise downstream stages fail when they look up parameters for an unknown batch id, as in the log output below where --n-data resolves to null.
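
A minimal sketch of the filter, assuming content.json maps batch ids to their parameters (the path and structure here are illustrative):

import json
from pyspark.sql.functions import col

# Load the allow-list of batch ids from the shared configuration.
with open("config/content.json") as f:
    valid_batch_ids = list(json.load(f).keys())

# Keep only rows whose batch_id is declared in content.json.
df = df.where(col("batch_id").isin(valid_batch_ids))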

[2019-08-07 23:44:33,104] {logging_mixin.py:95} INFO - [2019-08-07 23:44:33,103] {pod_launcher.py:104} INFO - + parallel process ::: 'submission_date=2019-08-05/batch_id=content.blocking_blocked_TESTONLY-0/part-00000-389a770c-9507-4c6e-a669-160366928daf.c000.json
[2019-08-07 23:44:33,105] {logging_mixin.py:95} INFO - [2019-08-07 23:44:33,105] {pod_launcher.py:104} INFO - submission_date=2019-08-05/batch_id=content.blocking_blocked-1/part-00000-389a770c-9507-4c6e-a669-160366928daf.c000.json
[2019-08-07 23:44:33,105] {logging_mixin.py:95} INFO - [2019-08-07 23:44:33,105] {pod_launcher.py:104} INFO - submission_date=2019-08-05/batch_id=content.blocking_blocked-0/part-00000-389a770c-9507-4c6e-a669-160366928daf.c000.json
[2019-08-07 23:44:33,106] {logging_mixin.py:95} INFO - [2019-08-07 23:44:33,106] {pod_launcher.py:104} INFO - submission_date=2019-08-05/batch_id=content.blocking_blocked_TESTONLY-1/part-00000-389a770c-9507-4c6e-a669-160366928daf.c000.json'
[2019-08-07 23:44:42,679] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,678] {pod_launcher.py:104} INFO - Running verify1
[2019-08-07 23:44:42,681] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,681] {pod_launcher.py:104} INFO - Traceback (most recent call last):
[2019-08-07 23:44:42,684] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,683] {pod_launcher.py:104} INFO -   File "/usr/local/bin/prio", line 11, in <module>
[2019-08-07 23:44:42,688] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,687] {pod_launcher.py:104} INFO -     sys.exit(main())
[2019-08-07 23:44:42,690] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,689] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/click/core.py", line 764, in __call__
[2019-08-07 23:44:42,691] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,691] {pod_launcher.py:104} INFO -     return self.main(*args, **kwargs)
[2019-08-07 23:44:42,693] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,692] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/click/core.py", line 717, in main
[2019-08-07 23:44:42,695] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,694] {pod_launcher.py:104} INFO -     rv = self.invoke(ctx)
[2019-08-07 23:44:42,697] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,696] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/click/core.py", line 1137, in invoke
[2019-08-07 23:44:42,699] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,698] {pod_launcher.py:104} INFO -     return _process_result(sub_ctx.command.invoke(sub_ctx))
[2019-08-07 23:44:42,700] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,700] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/click/core.py", line 956, in invoke
[2019-08-07 23:44:42,701] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,701] {pod_launcher.py:104} INFO -     return ctx.invoke(self.callback, **ctx.params)
[2019-08-07 23:44:42,704] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,703] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/click/core.py", line 555, in invoke
[2019-08-07 23:44:42,707] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,706] {pod_launcher.py:104} INFO -     return callback(*args, **kwargs)
[2019-08-07 23:44:42,709] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,709] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/prio/cli/commands.py", line 134, in verify1
[2019-08-07 23:44:42,711] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,711] {pod_launcher.py:104} INFO -     libprio.PrioVerifier_set_data(verifier, share)
[2019-08-07 23:44:42,714] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,713] {pod_launcher.py:104} INFO -   File "/usr/local/lib64/python3.6/site-packages/prio/libprio.py", line 319, in PrioVerifier_set_data
[2019-08-07 23:44:42,716] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,715] {pod_launcher.py:104} INFO -     return _libprio.PrioVerifier_set_data(*args)
[2019-08-07 23:44:42,717] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,717] {pod_launcher.py:104} INFO - RuntimeError: PrioVerifier_set_data was not successful.
[2019-08-07 23:44:42,846] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,845] {pod_launcher.py:104} INFO - Usage: prio verify1 [OPTIONS]
[2019-08-07 23:44:42,847] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,846] {pod_launcher.py:104} INFO - Try "prio verify1 --help" for help.
[2019-08-07 23:44:42,848] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,848] {pod_launcher.py:104} INFO -
[2019-08-07 23:44:42,849] {logging_mixin.py:95} INFO - [2019-08-07 23:44:42,849] {pod_launcher.py:104} INFO - Error: Invalid value for "--n-data": null is not a valid integer
[2019-08-07 23:44:43,062] {logging_mixin.py:95} INFO - [2019-08-07 23:44:43,062] {pod_launcher.py:104} INFO - Usage: prio verify1 [OPTIONS]
[2019-08-07 23:44:43,063] {logging_mixin.py:95} INFO - [2019-08-07 23:44:43,063] {pod_launcher.py:104} INFO - Try "prio verify1 --help" for help.
[2019-08-07 23:44:43,064] {logging_mixin.py:95} INFO - [2019-08-07 23:44:43,064] {pod_launcher.py:104} INFO -
[2019-08-07 23:44:43,065] {logging_mixin.py:95} INFO - [2019-08-07 23:44:43,065] {pod_launcher.py:104} INFO - Error: Invalid value for "--n-data": null is not a valid integer

CODE_OF_CONDUCT.md file missing

As of January 1, 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:

  1. Required Text - All text under the headings Community Participation Guidelines and How to Report is required and should not be altered.
  2. Optional Text - The Project Specific Etiquette heading provides a space to speak more specifically about ways people can work effectively and inclusively together. Some examples can be found in the Firefox Debugger project and Common Voice. (The optional part is commented out in the raw template file and will not be visible until you modify and uncomment it.)

If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].

(Message COC001)

Set up key management using Cloud KMS

The production keys are currently stored in sops, per bug 1552315. The container should expect to pull the private keys from a bucket that has been populated with them. These keys are copied into the container and decrypted into exported environment variables. Cloud KMS provides audit logs that can be monitored for proper usage.
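
A minimal sketch of the decryption step using the google-cloud-kms Python client (an assumption — the issue only references the gcloud CLI; the key path, file name, and environment variable are hypothetical):

import os
from google.cloud import kms

client = kms.KeyManagementServiceClient()
# Hypothetical key resource; substitute the real project, keyring, and key.
key_name = client.crypto_key_path("my-project", "global", "prio", "server-key")

# The encrypted private key is assumed to have been copied from the bucket.
with open("/tmp/server_key.enc", "rb") as f:
    ciphertext = f.read()

response = client.decrypt(request={"name": key_name, "ciphertext": ciphertext})
# Make the decrypted key available to this process and its children.
os.environ["PRIVATE_KEY_HEX"] = response.plaintext.decode("utf-8")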

Remove `Prio` prefix from the SWIG wrapper

Section 5.4.7 of the SWIG documentation gives the syntax for renaming functions. In particular:

%rename("%(strip:[wx])s") ""; // wxHello -> Hello; FooBar -> FooBar

This should probably be done for the Prio-prefixed functions, since they are already implicitly namespaced by the Python module.

Add instructions for setting up AWS S3 configuration

gsutil supports S3 file operations via boto3. There should be documentation around enabling S3 support. This should be as straightforward as passing the appropriate environment variables, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, into the Docker container.

The authentication method should stat the relevant buckets before proceeding.

function authenticate() {
    local cred=${GOOGLE_APPLICATION_CREDENTIALS}
    local test_bucket=${BUCKET_INTERNAL_PRIVATE}
    if [[ -n "${cred}" ]]; then
        gcloud auth activate-service-account --key-file "${cred}"
    else
        # https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform
        echo "No JSON credentials provided, using default scopes."
    fi
    gsutil ls "gs://${test_bucket}"
}
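
A minimal sketch of the equivalent check for an S3 bucket using boto3 (an assumption — the shell function above uses gsutil against GCS):

import os
import boto3

# Credentials are expected to be passed into the container as environment
# variables, per the note above.
s3 = boto3.client(
    "s3",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Stat the relevant bucket before proceeding; raises if access is missing.
s3.head_bucket(Bucket=os.environ["BUCKET_INTERNAL_PRIVATE"])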

Missing prio-processor egg for bootstrapping Spark

It looks like the egg used for bootstrapping Spark jobs is no longer being built in the container.

[2020-03-12 23:27:25,907] {logging_mixin.py:112} INFO - [2020-03-12 23:27:25,906] {pod_launcher.py:125} INFO -   File "/app/processor/prio_processor/bootstrap.py", line 42, in run
[2020-03-12 23:27:25,907] {logging_mixin.py:112} INFO - [2020-03-12 23:27:25,907] {pod_launcher.py:125} INFO -     raise RuntimeError("missing bdist_egg artifact")
[2020-03-12 23:27:25,941] {logging_mixin.py:112} INFO - [2020-03-12 23:27:25,940] {pod_launcher.py:125} INFO - RuntimeError: missing bdist_egg artifact

Support generic s3v4 buckets for exchanging data

Currently, the container assumes that it will be run on GCP and relies on authorizing against a service account to connect to the various buckets. For example, the JSON credentials set in GOOGLE_APPLICATION_CREDENTIALS may be granted access from two separate GCP projects.

It's desirable to run the container agnostic to a cloud provider. MinIO provides a self-hosted object store that implements the s3v4 API. The hadoop-aws module allows compatibility with s3v4 APIs, and I've confirmed that I can read data from MinIO using docker-compose.

I'm proposing the following changes to support generic providers.

Use of access and secret keys on buckets

First, for normal operation of the container, GOOGLE_APPLICATION_CREDENTIALS will no longer be necessary. Instead, the following environment variables will need to be set with the HMAC keys for each party, e.g.:

INTERNAL_ACCESS_KEY
INTERNAL_SECRET_KEY
EXTERNAL_ACCESS_KEY
EXTERNAL_SECRET_KEY

These map onto the commonly used AWS environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. HMAC keys can be created for a GCP service account as follows:

gsutil hmac create SERVICE_ACCOUNT_EMAIL

The resulting key pair is then distributed to the co-processor (the partner party).
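
A minimal sketch of selecting a party's keys before a transfer (the helper is hypothetical; the variable names follow the list above):

import os

def use_credentials(party: str) -> None:
    # Point the standard AWS variables at the given party's HMAC keys, for
    # this process and any subprocesses it launches.
    os.environ["AWS_ACCESS_KEY_ID"] = os.environ[f"{party}_ACCESS_KEY"]
    os.environ["AWS_SECRET_ACCESS_KEY"] = os.environ[f"{party}_SECRET_KEY"]

use_credentials("INTERNAL")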

MinIO requires a bit of configuration for server-side encryption, which needs TLS or a KMS of some kind. That is a bit out of scope here; it should suffice to use the default user-supplied key and password, since the shares themselves are already encrypted.

Use of s3a hadoop connector as the default connector

Currently the jobs use the gs connector to connect to buckets. The s3a connector is compatible with anything that implements s3v4. Performance may vary, so the Spark configuration for uploads and downloads will be important to tune, though the stages should be bottlenecked on CPU.
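
A minimal sketch of configuring the s3a connector against a MinIO endpoint (the endpoint and bucket name are assumptions, and the hadoop-aws module must be on the classpath):

import os
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
    .config("spark.hadoop.fs.s3a.access.key", os.environ["INTERNAL_ACCESS_KEY"])
    .config("spark.hadoop.fs.s3a.secret.key", os.environ["INTERNAL_SECRET_KEY"])
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Read shares through the generic s3a connector instead of gs.
df = spark.read.json("s3a://bucket-internal-private/raw/shares/")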

Use mc (minio client) instead of gsutil

gsutil is the Google Cloud Storage command-line utility. While it does support S3 (for AWS), it does not work with MinIO. The MinIO client, on the other hand, supports the major cloud providers and provides some nice functionality on top, like being able to output JSON.

Create an examples folder

The wrapper_example should be moved into an examples folder. The purpose of the examples folder is to provide basic references for building more complex applications.

Some examples that I want to include:

  • An example of using the autogenerated lib.prio bindings. browser-test is probably a good example that uses most of the header's functions.
  • An example of using the pythonic wrapper. This could involve using numpy (vectorized data) and generating larger amounts of testing data.
  • An example of using this library asynchronously. This example should make the data-flow easier to understand. It might be useful to include asyncio and a messaging queue.
  • An example of a jupyter notebook performing analysis on encoded binary data.

The examples should fit in a single file and should include dependencies via a Pipfile.

# Run an example
$ cd examples/browser-test
$ pipenv run python browser_test.py

Add a Dockerfile to build a reproducible runtime environment

The build for this library is currently involved because of the dependencies introduced in libprio. On Fedora 27, I was able to build libprio by installing the following packages:

dnf install nss-devel msgpack-devel

A Dockerfile will also need to include build essentials like make, gcc, and python-devel.

A docker container should make building and testing less of a hassle.

`prio-processor` command fails with `pkg_resources.ContextualVersionConflict`

To reproduce:

% docker run -it mozilla/prio-processor:latest prio-processor
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 570, in _build_master
    ws.require(__requires__)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 888, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 779, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (setuptools 39.2.0 (/usr/lib/python3.6/site-packages), Requirement.parse('setuptools>=40.3.0'), {'google-auth'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/prio-processor", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3095, in <module>
    @_call_aside
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3079, in _call_aside
    f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3108, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 572, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 585, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 779, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (setuptools 39.2.0 (/usr/lib/python3.6/site-packages), Requirement.parse('setuptools>=40.3.0'), {'google-auth'})

The workaround is to invoke the module directly:

docker run -it mozilla/prio-processor:latest bash -c "cd processor; python3 -m prio_processor"

installation steps don't work

Using the recommended instructions in the README (docker build -t prio .), things seem to work fine for a while until...

...
  Downloading https://files.pythonhosted.org/packages/e3/d9/d9c56deb483c4d3289a00b12046e41428be64e8236fa210111a1f57cc42d/virtualenv_clone-0.5.1-py2.py3-none-any.whl
Installing collected packages: virtualenv, enum34, typing, certifi, virtualenv-clone, pipenv
Successfully installed certifi-2019.3.9 enum34-1.1.6 pipenv-2018.11.26 typing-3.6.6 virtualenv-16.4.3 virtualenv-clone-0.5.1
Removing intermediate container 1ed0d90316a5
 ---> 36988accc720
Step 8/13 : ENV PATH="$PATH:~/.local/bin"
 ---> Running in bf77f7a13f09
Removing intermediate container bf77f7a13f09
 ---> 5bd06bb8c8ce
Step 9/13 : WORKDIR /app
 ---> Running in 8fa3943d5ed3
Removing intermediate container 8fa3943d5ed3
 ---> b381387375f0
Step 10/13 : ADD . /app
 ---> 2809c4cd10cb
Step 11/13 : RUN make
 ---> Running in 0a9b6988ec3c
cd libprio && CCFLAGS='-fPIC' scons && cd ..

scons: *** No SConstruct file found.
File "/usr/lib/python2.7/site-packages/SCons/Script/Main.py", line 924, in _main
make: *** [Makefile:2: all] Error 2
The command '/bin/sh -c make' returned a non-zero code: 2

@acmiyaguchi -- did you forget to add something to the repository? Also this would be a good excuse to make CI work. :) Currently circle is saying there are no builds.

Fix `PrioPublickey_export` functions

On the publickey-export branch, there are typemaps defined so export functions read directly into a PyString or PyByteArray.

https://github.com/acmiyaguchi/python-libprio/blob/b0705d1416e8b66d0a2b4a1dd14da52544b022cc/libprio.i#L107-L131

This typemap is unused at SWIG generation time. I ran the following command to debug the typemap search.

$ swig -python -debug-tmsearch libprio.i | less

I searched for PrioPublickey_export and found that typemap(in) unsigned char data[CURVE25519_KEY_LEN] was not on the list of typemaps.

This could be due to the order of processing: CURVE25519_KEY_LEN is defined by the C preprocessor, which probably runs after SWIG's typemap matching. I could also be using SWIG incorrectly for array types.

Wiki changes

FYI: The following changes were made to this repository's wiki:

These were made as the result of a recent automated defacement of publicly writeable wikis.

Support Python 2.7

The build process currently only supports Python 3. It can be modified to support Python 2 too.

It seems like the SWIG interface is mostly independent of the Python version. There might be some type-mappings in the SWIG interface files that are version-specific, like the ones that handle string and bytestring interop with C.

Modify BUCKET variables to support protocol prefixes

We are currently setting up the Prio servers internally in a cross-cloud deployment. The generate and process scripts will need to be modified so they do not assume locations under gs://. For example:

function rsync() {
    local server_id=$1
    local bucket=$2
    local dest=gs://${bucket}/raw/

function send_output_external() {
    # Copy data generated by a processing step into the receiving bucket of the
    # co-processing server. A _SUCCESS file is generated on a successful copy.
    : "${BUCKET_EXTERNAL_SHARED?}"   # bucket of the external server
    local output_internal=$1         # path to internal output
    local output_external=$2         # relative path to external output
    local path="gs://${BUCKET_EXTERNAL_SHARED}/${output_external}"

The documentation will also need to be updated to note that BUCKET_* variables must be prefixed with the protocol (e.g. gs://bucket-name or s3a://bucket-name).
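
The processing scripts are shell, but the same idea is shown in a minimal Python sketch below (the helper name is hypothetical): accept a fully qualified bucket URL and only fall back to a default scheme when none is given.

import os
from urllib.parse import urlparse

def bucket_url(name: str, default_scheme: str = "gs") -> str:
    # Accept e.g. gs://bucket, s3a://bucket, or a bare bucket name.
    value = os.environ[name]
    return value if urlparse(value).scheme else f"{default_scheme}://{value}"

raw_path = f"{bucket_url('BUCKET_INTERNAL_PRIVATE')}/raw/"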
