
dataclay-packaging's People

Contributors

alexbarcelo, dgasull, marcmonfort, pierlauro


dataclay-packaging's Issues

Re-using (misusing) alias bug

Roughly, to reproduce the bug, run the following:

p = Person("Real VIP", 13)
p.make_persistent("vip")      # first object takes the alias
p = Person("Fake", 15)
try:
    p.make_persistent("vip")  # alias already in use: raises
except Exception:
    print("Expected exception, ignoring")
p = Person.get_by_alias("vip")
print(p.name)                 # prints "Fake" instead of "Real VIP"

The exception is raised as expected, but the final print shows the fake object's name instead of Real VIP.

On a marginally related note, would it make sense to add an "overwrite" flag to the make_persistent with alias?
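To make the "overwrite" proposal concrete, here is a minimal sketch of the intended semantics; AliasRegistry, AliasAlreadyInUse and the overwrite parameter are illustrative names, not part of the current dataClay API:

```python
# Hypothetical sketch of "overwrite" semantics for make_persistent with
# an alias. All names here are illustrative, not existing dataClay code.

class AliasAlreadyInUse(Exception):
    pass

class AliasRegistry:
    def __init__(self):
        self._aliases = {}  # alias -> object_id

    def make_persistent(self, object_id, alias, overwrite=False):
        if alias in self._aliases and not overwrite:
            raise AliasAlreadyInUse(alias)
        # With overwrite=True the alias is re-pointed; a failed attempt
        # without the flag leaves the existing alias untouched.
        self._aliases[alias] = object_id

    def get_by_alias(self, alias):
        return self._aliases[alias]

registry = AliasRegistry()
registry.make_persistent("real-vip-id", "vip")
try:
    registry.make_persistent("fake-id", "vip")
except AliasAlreadyInUse:
    pass
assert registry.get_by_alias("vip") == "real-vip-id"  # alias unchanged
registry.make_persistent("fake-id", "vip", overwrite=True)
assert registry.get_by_alias("vip") == "fake-id"      # explicit overwrite
```

With this design, the failed second make_persistent can never silently repoint the alias, which is the behaviour the bug above exhibits.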

Race condition during make persistent of alias and memory pressure

Race condition/bug found:

  1. The client sends an object to be made persistent: make_persistent(alias), CL -> EE
  2. The EE is under memory pressure
  3. The EE receives the make_persistent call, loads the object in memory and registers it in the LM by calling register_object: EE -> LM
  4. The LM receives the register_object call and registers the object before the client does
  5. The client tries to register the object in the LM because it has an alias: register_object, CL -> LM
  6. The LM receives the register_object call and sends an ObjectAlreadyRegistered exception to the client
  7. The client receives the exception and calls add_alias in the LM
  8. The LM should add the alias, but it does not
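One possible fix, sketched below with illustrative names (LogicModule, register_object, add_alias stand in for the real components), is to make the LM's registration path idempotent, so whichever of the EE or the client registers first, the later alias-carrying call still takes effect:

```python
# Hedged sketch of an idempotent LM registration path. Class and method
# names mirror the issue's terminology but are not real dataClay code.

class LogicModule:
    def __init__(self):
        self.objects = {}  # object_id -> set of aliases

    def register_object(self, object_id, alias=None):
        # Idempotent: a duplicate registration is not an error; it just
        # merges the alias instead of raising ObjectAlreadyRegistered.
        aliases = self.objects.setdefault(object_id, set())
        if alias is not None:
            aliases.add(alias)

    def add_alias(self, object_id, alias):
        self.objects.setdefault(object_id, set()).add(alias)

lm = LogicModule()
lm.register_object("obj1")               # EE wins the race (no alias)
lm.register_object("obj1", alias="vip")  # client's late registration
assert "vip" in lm.objects["obj1"]       # alias survives the race
```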

Re-design multiple aliases

Since release 2.0, dataClay no longer allows more than one alias per object. The APIs have not been updated accordingly, resulting in undefined behavior if an application tries to assign two or more aliases to the same object.

My proposal is to re-design the "additional aliases" feature in the same way file systems manage symbolic links: an alias is just a pointer to an objectID.
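The symlink analogy can be sketched in a few lines (names are assumptions for illustration): aliases live in their own table and are nothing but pointers to an objectID, so any number of them can target the same object.

```python
# Minimal sketch of the proposed symlink-style alias design.
# "aliases" plays the role of a symlink table; names are illustrative.

aliases = {}   # alias -> object_id
objects = {"oid-42": {"name": "Real VIP"}}

def add_alias(alias, object_id):
    aliases[alias] = object_id

def resolve(alias):
    # Two-step lookup, exactly like following a symlink to its target.
    return objects[aliases[alias]]

add_alias("vip", "oid-42")
add_alias("important-person", "oid-42")   # second alias, same object
assert resolve("vip") is resolve("important-person")
```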

Unified logging configuration

Design and implement a unique, common logging configuration for javaclay and pyclay. One idea could be to use special configurations or service parameters (--debug) that are translated into language-specific configurations (log4j in Java, the Python logging library, ...).

Tackle the teardown procedure for huge ephemeral datasets

When dataClay is being shut down, it stores all the data into the database.

This is correct for an object store, where data is persistent. However, all our HPC use cases are focused on volatile datasets that are huge. The unnecessary teardown procedure means that:

  • Executions may be half an hour longer than necessary (so batch exploration throughput is much lower)
  • The home quota fills up (so sequential executions have to wait for a human to manually clean the files)

dataClay should retain the feature --as it is an object store-- but we should improve its behaviour for the "ephemeral execution" case --as those are all our current HPC use cases.

Proposal:

  • Flag (or similar mechanism) for the orchestration to indicate a "dirty shutdown". This may be the default for enqueue_compss-triggered scenarios.
  • Alternatively, an "ephemeral HPC" flag which forces the dataClay DataServices to avoid serializing to disk altogether.
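As a rough illustration of the second option, the shutdown path could be gated by a flag; the EPHEMERAL_HPC variable name and the flush_to_database callback below are assumptions, not existing dataClay code:

```python
# Illustrative sketch only: a teardown gated by a hypothetical
# EPHEMERAL_HPC flag, skipping serialization for volatile datasets.

import os

def shutdown(objects_in_memory, flush_to_database):
    if os.environ.get("EPHEMERAL_HPC", "false").lower() == "true":
        # Dirty shutdown: drop volatile data instead of serializing it,
        # avoiding long teardowns and filled home quotas.
        return 0
    flushed = 0
    for obj in objects_in_memory:
        flush_to_database(obj)
        flushed += 1
    return flushed

os.environ["EPHEMERAL_HPC"] = "true"
assert shutdown([object(), object()], flush_to_database=lambda o: None) == 0
```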

COMPSs + dataClay synchronization of traces

While creating unified traces of COMPSs + dataClay using Docker containers, the resulting threads belonging to dataClay services (dsjava, dspython, logicmodule) are not synchronized with the COMPSs master and worker threads. This happens because the Extrae merger process used by COMPSs passes the -no-syn flag, which skips synchronizing traces created on different nodes/clocks. Currently, one workaround is to replace the trace.sh script in the COMPSs Docker container to force it to synchronize the resulting traces while merging them.

This may work on MareNostrum (still pending to test; there the nodes share a single clock).

More information about synchronization:
https://tools.bsc.es/sites/default/files/documentation/html/extrae-3.5.1/merge.html

Modify dataClay contributions structure

A dataClay contribution is something that we offer to users. dataClay contributions should be defined in two different packages:

  • dataclay.contrib.models: this package contains classes that must be registered in dataClay (and therefore execution classes and stubs must be generated), because those classes define object state: objects that can be persisted, moved, replicated...
  • dataclay.contrib.modules: this package contains methods or functionality that must not be registered in dataClay but that registered classes can use (like synchronization)

Then, users can register their own synchronization mechanisms or their own collections in either case.

Also, users could use external libraries by installing them in servers and clients. However, we should modify javaclay to avoid registering external libraries, and pyclay to not register inherited mixins as code.
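The split above could look like the following sketch; SynchronizationMixin and SharedCounter are hypothetical examples, not existing contrib code:

```python
# Hypothetical illustration of the proposed split: a state-carrying
# class (would live in dataclay.contrib.models and be registered)
# versus a behaviour-only mixin (would live in dataclay.contrib.modules
# and never be registered as model code).

# dataclay.contrib.modules -- behaviour only, not registered:
class SynchronizationMixin:
    def notify_update(self, attribute):
        # Placeholder for synchronization logic shared across models.
        return f"sync:{attribute}"

# dataclay.contrib.models -- state-carrying, would be registered:
class SharedCounter(SynchronizationMixin):
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.notify_update("value")

c = SharedCounter()
assert c.increment() == "sync:value"
assert c.value == 1
```

This is also the case the last paragraph refers to: pyclay would register SharedCounter's state but should skip the inherited mixin methods as model code.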

Add support for batch object info into ExecutionEnvironments

Add a new RPC in ExecutionEnvironments for retrieving batch information about the objects.

The current use case for this call is to enable a performant way to retrieve extra information to be used by the split, using a single RPC call per ExecutionEnvironment (instead of one RPC per object).

This is useful because the split will need information about a bunch of objects (e.g. whether they are in memory or have been evicted to disk), and it can use the object hint to aggregate all the objects into a single batch call to the ExecutionEnvironment.

The proof-of-concept implementation will allow asking for "is_in_memory", but further metadata may be provided in the future (e.g. replica information, versions, memory tier, placement, whatever).
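A sketch of what the batching could look like, with assumed names (get_objects_status, the hint-keyed environments dict) standing in for the eventual RPC:

```python
# Sketch of a batch metadata RPC: one call per ExecutionEnvironment,
# grouping objects by their hint. All names are assumptions.

from collections import defaultdict

class ExecutionEnvironment:
    def __init__(self, in_memory_ids):
        self._in_memory = set(in_memory_ids)

    def get_objects_status(self, object_ids):
        # Single RPC returning per-object metadata for the whole batch.
        return {oid: {"is_in_memory": oid in self._in_memory}
                for oid in object_ids}

def batch_status(objects_with_hints, environments):
    per_ee = defaultdict(list)
    for oid, ee_hint in objects_with_hints:
        per_ee[ee_hint].append(oid)          # aggregate by object hint
    result = {}
    for ee_hint, oids in per_ee.items():     # one RPC per EE, not per object
        result.update(environments[ee_hint].get_objects_status(oids))
    return result

ees = {"ee1": ExecutionEnvironment({"a"}), "ee2": ExecutionEnvironment(set())}
status = batch_status([("a", "ee1"), ("b", "ee1"), ("c", "ee2")], ees)
assert status["a"]["is_in_memory"] and not status["b"]["is_in_memory"]
```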

Automate merge and PR for dataclay-common

If a developer wants to change dataclay-common, they have to open a PR and, once it is accepted, modify the javaclay/pyclay code (update the submodule). Travis could automate this somehow so developers only need a PR in javaclay/pyclay, not in dataclay-common (like we are doing for packaging).

  • A developer creates a PR in javaclay/pyclay that points to a different dataclay-common submodule reference, e.g. feature/new_calls
  • The PR is accepted
  • Travis realizes that the submodule reference != the last commit of dataclay-common develop --> Travis merges the dataclay-common branch feature/new_calls into develop and updates the submodule reference
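The Travis step could be sketched roughly as follows; the repository URL placeholder, branch names and the GITHUB_TOKEN variable are all assumptions, not an existing setup:

```yaml
# .travis.yml fragment -- illustrative only
after_success:
  # The commit the javaclay/pyclay checkout points its submodule at:
  - SUBMODULE_REF=$(git -C dataclay-common rev-parse HEAD)
  - git clone https://github.com/<org>/dataclay-common.git /tmp/common
  - git -C /tmp/common checkout develop
  - >
    if [ "$(git -C /tmp/common rev-parse HEAD)" != "$SUBMODULE_REF" ]; then
      git -C /tmp/common merge --no-ff "$SUBMODULE_REF" &&
      git -C /tmp/common push
      "https://${GITHUB_TOKEN}@github.com/<org>/dataclay-common.git" develop;
    fi
```

The submodule reference in javaclay/pyclay would then be bumped to the resulting develop commit in a follow-up commit.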

@alexbarcelo what do you think?

REST API

Design and implement a REST API for dataClay (dataclaycmd and services)

Federate model

Design and implement a system that allows users to "federate" a class. One design could be:

  • The user calls "DataClay.federateClass(namespace, classname, destDataClay)" and the class is sent to an external dataClay. If the class is already there, there is no effect

Then, when the user wants to update the class, they can call "DataClay.updateFederatedClass", which works exactly like the model redefinition explained in issue #14
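The intended semantics can be sketched as follows; DataClayInstance and the snake_case function names are placeholders for the real calls mentioned above:

```python
# Illustrative sketch of federateClass / updateFederatedClass semantics.
# Names and the string "class definitions" are stand-ins.

class DataClayInstance:
    def __init__(self):
        self.classes = {}  # (namespace, classname) -> class definition

def federate_class(namespace, classname, definition, dest):
    key = (namespace, classname)
    if key in dest.classes:
        return False               # already federated: no effect
    dest.classes[key] = definition
    return True

def update_federated_class(namespace, classname, definition, dest):
    # Works like a model redefinition: the remote copy is replaced.
    dest.classes[(namespace, classname)] = definition

remote = DataClayInstance()
assert federate_class("demo", "Person", "v1", remote) is True
assert federate_class("demo", "Person", "v1", remote) is False  # no-op
update_federated_class("demo", "Person", "v2", remote)
assert remote.classes[("demo", "Person")] == "v2"
```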

Distributed Metadata system

Design and implement a distributed metadata system to:

  • Improve performance
  • Avoid going to LM to get metadata

Provide the Paraver configuration files in an easily reachable, centralized place

At the moment of writing this issue, there is a paraver folder in each demo in the dataclay-demos repository. The compss_and_dataclay.cfg can be found there, and it is the configuration a user should load in the Paraver application in order to see the tracing.

We may want a single central place for that configuration (e.g. just as COMPSs does: the files/paraver/cfgs folder in their repo) and let users take the configurations from there. Also, those configuration files should be available on MareNostrum, as users may sometimes want to use Paraver from within the MareNostrum login nodes.

Registered model redefinition

Design and implement a way to "redefine" or "replace" an already registered class. This will replace the classes currently deployed in execution environments, and objects belonging to the class will be lost/removed to avoid serialization issues. The call should warn the user about this. A name for the API call could be overwriteModel or replaceModel.

Simplify singularity builds

As remote builds from Dockerhub images preserve environment variables, it is no longer necessary to parse any Dockerfile.

  • Remove [begin|end]ENVruntime from dockerfiles
  • Simplify singularity build

Bootstrap sanity check

At bootstrap time, if mandatory environment variables and/or properties are not set, a proper error message should be returned.

Currently, when important parameters are missing, the exceptions thrown are not easily interpretable.
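One possible shape for the check, as a hedged sketch (the variable names LOGICMODULE_HOST/LOGICMODULE_PORT are examples, not a definitive list of mandatory settings):

```python
# Sketch of a bootstrap sanity check: validate mandatory settings up
# front and fail with a message naming every missing variable, instead
# of an opaque exception later on.

import os

def check_mandatory_env(names, environ=os.environ):
    missing = [n for n in names if not environ.get(n)]
    if missing:
        raise SystemExit(
            "Missing mandatory configuration: " + ", ".join(sorted(missing))
            + ". Set these environment variables and restart.")

ok_env = {"LOGICMODULE_HOST": "lm", "LOGICMODULE_PORT": "11034"}
check_mandatory_env(["LOGICMODULE_HOST", "LOGICMODULE_PORT"], ok_env)  # passes

try:
    check_mandatory_env(["LOGICMODULE_HOST", "LOGICMODULE_PORT"], {})
except SystemExit as e:
    assert "LOGICMODULE_HOST" in str(e)  # error names what is missing
```

Collecting all missing names in one pass means the operator fixes everything in a single iteration rather than discovering the variables one crash at a time.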

Investigate and address certain performance problems for high-volume workloads

We detected some misbehaviour when executing numerical applications without active features --i.e. doing attribute getters and setters with big numerical data structures.

At first we assumed it was a performance penalty at the gRPC or socket/serialization layer, and some preliminary results seemed to confirm that, but after more in-depth tests the results came up inconclusive.

Nightly builds or snapshots

Decide whether to use nightly builds or to run tests on the last available package in Maven, PyPI and Docker. Any votes?

This is associated with epic #6

Implement testing system

Refers to Epic #6

Create a repository that tests the current development packages (Docker, PyPI and Maven) using BDD functional tests.

Tests will be developed using Cucumber and Python behave. Reports will be published on github.io; one tool could be Allure.

Tests will run using the currently published dev packages in different environments and architectures (arm, amd64, ...). For that we will use the Travis build matrix.

Functional tests should be grouped into Travis jobs and may take a maximum of 50 minutes.

The orchestration of dataClay for each test is still under design. We could use bash scripts that execute docker-compose up or any needed docker command; the bash scripts could be reused in the orchestration repository in the future (which currently only uses Singularity).

Java functional tests could be "prepared" using Python behave (deployment, new account, contracts, ...) and, using JPype, we could call each Java-specific step. @alexbarcelo what do you think? This could allow us to reuse more code. Another option is to create two different feature files, one for Java and one for Python (and then use cucumber-jvm for Java and behave for Python), but it is not conceptually correct for a feature to depend on the environment (it is like writing features per OS).

Final structure would be:

.
├── make-persistent
│   ├── docker-compose.yml  (maybe as a docstring in feature)
│   ├── environment.py (include calls to test-orchestration)
│   ├── java
│   │   ├── app
│   │   │   ├── cfgfiles
│   │   │   └── src
│   │   └── model
│   ├── make-persistent.feature
│   ├── python
│   │   ├── app
│   │   │   ├── cfgfiles
│   │   │   └── src
│   │   │       └── hellopeople.py
│   │   └── model
│   │       └── src
│   │           ├── classes.py
│   │           └── __init__.py
│   └── steps
│       └── make-persistent.py
└── test-orchestration
    └── orchestrate.sh
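For concreteness, make-persistent.feature could read roughly like the sketch below; the step wording is illustrative, not an agreed interface:

```gherkin
# Hedged sketch of make-persistent.feature; steps are illustrative.
Feature: Make persistent
  As a dataClay user
  I want to persist objects with an alias
  So that I can retrieve them later by name

  Scenario: Persist and retrieve by alias
    Given a running dataClay with a registered Person model
    When I make a Person "Real VIP" persistent with alias "vip"
    And I get the object by alias "vip"
    Then the retrieved object name is "Real VIP"
```

A single feature like this could then be backed by either Python steps (behave) or Java steps (via JPype or cucumber-jvm), which is exactly the trade-off discussed above.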

Add support for Python 3.6

Integration with COMPSs + dataClay may require supporting Python 3.6 and/or making it the default.
One option could be to install numpy only for the x86_64 architecture.

Discuss what to do with the client image (tag it with the Python version it uses?). The client image is used for demos of COMPSs and dataClay in Docker.

Update 'delete alias'

Since an object can currently have just one alias, the following two things should be changed:

  • The deleteAlias method should not require any argument (both in Java and Python)
  • The documentation should be updated accordingly

Travis ARM release

When dataclay-packaging is merged to master, Travis can deploy Docker images to Dockerhub; check whether Travis can also deploy them for ARM.

This is associated with #6

Simplify logging

Simplify logging so users don't need to configure log4j and set CHECK_LOG4J_ENABLED=true in global.properties in Java, or DEBUG=True in Python. Also, simplify Python logging.

There should be two flags, "--debug" and "--verbose", that can be added in docker-compose and singularity-compose:

dsjava:
   command: "--debug"

When this is enabled, the services will print debug info (gRPC debug info will be printed in verbose mode, even if gRPC considers it debug-level).

Currently, for dataclaycmd, we just need to add "--debug" or "--verbose" to the command itself:

docker run -v $PWD/app/cfgfiles/:/home/dataclayusr/dataclay/cfgfiles/:ro \
	 bscdataclay/client:2.1 --debug WaitForDataClayToBeAlive 10 5

Distributed objects in cycle are not being cleaned by GC

We have two Storage Locations, with two objects A and B, where A references B and B references A. When those objects are no longer accessible, they should be cleaned by the GC, but currently they are not.

When the DS1 GC finds that object A is a candidate to be cleaned, it is marked as 'inaccessible' and all objects referenced from A should receive a -1 on their reference count; the GC notifies DS2 of a -1 reference for B. In the next iteration, the GC realizes that A still has one reference, so it is no longer an 'inaccessible' candidate; A is marked as 'accessible' again, and the GC should notify DS2 of a +1 reference for B.

If, during the -1 notification, DS2 processes candidate B and notifies DS1 of a -1, the objects are properly removed. However, this cannot be guaranteed: if DS2 processes the -1 notification but sends its own -1 later, the objects are never cleaned.

This does not happen if the cycle is not distributed.

Any ideas on how to fix that? @alexbarcelo
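The ordering hazard can be reproduced in a toy model (all names are illustrative, not dataClay code): if a Storage Location acts on a -1 before the compensating +1 arrives, it deletes an object that is still reachable; if the notifications are merged first, neither side ever reaches zero and the cycle leaks.

```python
# Toy reproduction of the unordered ref-count notification race.
# StorageLocation is a stand-in, not a real dataClay class.

class StorageLocation:
    def __init__(self):
        self.refs = {}       # object_id -> external reference count
        self.deleted = set()

    def notify(self, obj_id, delta):
        if obj_id in self.deleted:
            return  # a +1 arriving after deletion has nothing to revive
        self.refs[obj_id] = self.refs.get(obj_id, 0) + delta
        if self.refs[obj_id] <= 0:
            # Naive policy: act on zero immediately.
            self.deleted.add(obj_id)
            del self.refs[obj_id]

ds2 = StorageLocation()
ds2.refs["B"] = 1          # B is referenced once, by A on DS1

# DS1 marks A 'inaccessible' and sends -1 for B, then reverts the
# decision and sends +1 -- but DS2 acts on the -1 first:
ds2.notify("B", -1)        # B hits zero and is collected immediately
ds2.notify("B", +1)        # too late: B was still reachable
assert "B" in ds2.deleted  # premature deletion
```

This suggests the fix needs either ordered/acknowledged notifications or a two-phase "candidate" state in which zero-count objects are quarantined rather than deleted outright.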

"LOCAL" flag for make persistent not working in COMPSs

The "LOCAL" flag uses the session.properties file to determine in which Storage Location the persistent objects created during that session should be stored. However, in a COMPSs environment the session.properties file is propagated and copied to all workers, so all objects created by workers end up stored in the master's SL. This should be fixed with a new design.

New dataclaycmd library

A new dataclaycmd library that does not depend on Java or Python (maybe Go), so users can install and use it directly. The previous dataClay tool required Docker or the dataClay JAR; dataClay 2.0 avoids the user having to install anything, since the tool runs inside a Docker container. However, we should offer a better, friendlier way to do this (and not depend on Docker).
