bsc-dom / dataclay-packaging
License: BSD 3-Clause "New" or "Revised" License
Roughly, to reproduce the bug, run the following:
p = Person("Real VIP", 13)
p.make_persistent("vip")
p = Person("Fake", 15)
try:
    p.make_persistent("vip")
except:
    print("Expected exception, ignoring")
p = Person.get_by_alias("vip")
print(p.name)
The exception is raised as expected, but the final print shows the fake object's name instead of Real VIP.
On a marginally related note, would it make sense to add an "overwrite" flag to make_persistent with alias?
Create a demo that integrates all the components. Related to #2
Race condition/bug found in make_persistent(alias):
1. CL -> EE: make_persistent call; the EE loads the object in memory and registers it in the LM by calling register_object.
2. EE -> LM: register_object call; the object is registered before the client's own call arrives.
3. CL -> LM: register_object call; the LM sends an ObjectAlreadyRegistered exception back to the client.
The same applies to add_alias in the LM.
Design and implement an integration (contribution classes, needed methods...) with dislib.
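The register_object race described above can be illustrated with a toy registry where registration is not idempotent; the class and method names here are illustrative, not the real dataClay ones:

```python
import threading

class ObjectAlreadyRegistered(Exception):
    pass

class LogicModule:
    """Toy stand-in for the LM: the first register_object call wins."""
    def __init__(self):
        self._lock = threading.Lock()
        self._registered = set()

    def register_object(self, object_id):
        with self._lock:
            if object_id in self._registered:
                raise ObjectAlreadyRegistered(object_id)
            self._registered.add(object_id)

lm = LogicModule()
lm.register_object("obj-1")       # the EE's call arrives first...
try:
    lm.register_object("obj-1")   # ...so the client's own call blows up
except ObjectAlreadyRegistered:
    pass
```

Making register_object idempotent (or letting only the EE register) would remove the ordering dependency.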
Design and implement a new model registration mechanism registering source code
Implement a demo for COMPSs + dataClay in Docker containers, taking Python versioning into account.
Since release 2.0, dataClay no longer allows keeping more than one alias per object. The APIs have not been updated accordingly, resulting in undefined behavior if an application tries to assign two or more aliases to the same object.
My proposal is to re-design the "additional aliases" feature in the same way file systems manage symbolic links: an alias is just a pointer to an objectID.
Refers to #6
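A minimal sketch of the symlink-style design: aliases live in their own table and are plain pointers to an objectID, so several aliases can target the same object and dropping one never touches the others. All names below are hypothetical:

```python
class AliasTable:
    """Aliases behave like symlinks to inodes: many aliases may point to
    the same objectID, and removing one alias never touches the object
    or its remaining aliases."""
    def __init__(self):
        self._aliases = {}  # alias -> objectID

    def add_alias(self, alias, object_id):
        self._aliases[alias] = object_id

    def resolve(self, alias):
        return self._aliases[alias]

    def delete_alias(self, alias):
        del self._aliases[alias]

table = AliasTable()
table.add_alias("vip", "oid-42")
table.add_alias("important", "oid-42")  # second alias to the same object
table.delete_alias("vip")               # "important" and the object survive
```
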
Design and implement unique-common logging configuration for javaclay and pyclay. One idea could be to use special configurations or service parameters (--debug) that are translated to specific configurations for each language (log4j in java, python logging library...)
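A sketch of what the translation layer could look like, assuming a single --debug service parameter mapped to per-language settings; the concrete keys and levels are illustrative placeholders, not the real dataClay configuration:

```python
def logging_config(language, debug=False):
    """Translate a language-agnostic --debug flag into per-runtime settings.

    The keys and levels below are illustrative, not the real dataClay ones.
    """
    level = "DEBUG" if debug else "INFO"
    if language == "java":
        # Would be rendered into a log4j properties file.
        return {"log4j.rootLogger": level + ", console"}
    if language == "python":
        # Would be passed to logging.config.dictConfig.
        return {"version": 1, "root": {"level": level}}
    raise ValueError("unsupported language: " + language)
```

Each service would call this once at startup, so javaclay and pyclay share one user-facing knob.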
When dataClay is being shut down, it stores all the data into the database.
This is correct for an object store, where data is persistent. However, all our HPC use cases are focused on volatile datasets that are huge. The unnecessary teardown procedure means that:
dataClay should retain the feature (it is an object store, after all), but we should improve its behaviour for "ephemeral execution", since those are all our current HPC use cases.
Proposal:
enqueue_compss-triggered scenarios.
While creating unified traces of COMPSs + dataClay using Docker containers, the resulting threads belonging to dataClay services (dsjava, dspython, logicmodule) are not synchronized with the COMPSs master and worker threads. This happens because the merger process of Extrae in COMPSs uses the -no-syn flag, which skips synchronizing traces created on different nodes/clocks. Currently, one workaround is to replace the trace.sh script in the COMPSs docker container to force synchronization of the resulting traces while merging them.
This may work in MareNostrum (still pending to test, since it uses a single clock).
More information about synchronization:
https://tools.bsc.es/sites/default/files/documentation/html/extrae-3.5.1/merge.html
A dataClay contribution is something that we offer to users. DataClay contributions should be defined in two different packages:
Then, users can register their own synchronization mechanisms or their own collections in either case.
Also, users could use external libraries by installing them on servers and clients. However, we should modify javaclay to avoid registering external libraries, and pyclay to not register inherited mixins as code.
Add a new RPC in ExecutionEnvironments for retrieving batch information about the objects.
The current use case for this call is to enable a performant way to retrieve extra information to be used by the split, using a single RPC call per ExecutionEnvironment (instead of one RPC per object).
This is useful because the split will need information about a bunch of objects (e.g. the split needs to know if the objects are in-memory or have been evicted into disk) and the split can use the object hint to aggregate all the objects into a single batch call to the ExecutionEnvironment.
The proof-of-concept implementation will allow asking for "is_in_memory", but further metadata may be provided in the future (e.g. replica information, versions, memory tier, placement, whatever).
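A sketch of the batched call from the split's point of view, assuming a hypothetical get_objects_metadata RPC that accepts a list of object IDs; the method name and response shape are assumptions, not the real API:

```python
class ExecutionEnvironmentStub:
    """Toy stand-in for the EE-side RPC handler."""
    def __init__(self, in_memory_ids):
        self._in_memory = set(in_memory_ids)

    def get_objects_metadata(self, object_ids):
        # One call returns metadata for the whole batch, instead of
        # one RPC round-trip per object.
        return {oid: {"is_in_memory": oid in self._in_memory}
                for oid in object_ids}

def split_by_residency(ee, object_ids):
    """What a split could do: one batched RPC, then group by residency."""
    meta = ee.get_objects_metadata(object_ids)
    in_mem = [o for o in object_ids if meta[o]["is_in_memory"]]
    on_disk = [o for o in object_ids if not meta[o]["is_in_memory"]]
    return in_mem, on_disk
```

The split would first bucket objects by their hint (one bucket per EE), then issue one such call per bucket.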
If a developer wants to change dataclay-common, they have to open a PR and, once it is accepted, modify the javaclay/pyclay code (updating the submodule). Travis could automate this somehow so developers only need to open a PR against javaclay/pyclay instead of dataclay-common (like we are doing for packaging).
@alexbarcelo what do you think?
Annotate methods to be executed in federated or replica objects
Design and implement a REST API for dataClay (dataclaycmd and services)
Design and implement a system that allows users to "federate" a class. One design could be:
Then, when the user wants to update the class they can call "DataClay.updateFederatedClass" which will work exactly the same as a model redefinition explained in issue #14
Design and implement a distributed metadata system to:
At the moment of writing this issue, there is a paraver folder in each demo of the dataclay-demos repository. The compss_and_dataclay.cfg file can be found there; it is the configuration a user should load in the Paraver application in order to see the tracing.
We may want a single central place for that configuration (e.g. just as COMPSs does with the files/paraver/cfgs folder in their repo) and let users take the configurations from there. Also, those configuration files should be available in MareNostrum, since users sometimes want to run Paraver from within the MareNostrum login nodes.
Design and implement a way to "redefine" or "replace" an already registered class. This will replace the classes currently deployed in execution environments, and objects belonging to the class will be lost/removed to avoid serialization issues. The call should warn the user about this. A name for the API call could be overwriteModel or replaceModel.
Current scripts implemented for dataClay HPC using singularity can be used to implement local executions using docker.
Required support for arm64/v8 architecture
Design and implement a new exception management in a performant and usable way.
Since remote builds of Docker Hub images preserve environment variables, it is no longer necessary to parse any Dockerfile.
At bootstrap time, if mandatory environment variables and/or properties are not set, a proper error message should be returned.
When important parameters are missing, the exceptions currently thrown are not easily interpretable.
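A minimal sketch of such a fail-fast bootstrap check; the variable names are illustrative, not the real dataClay ones:

```python
import os

class MissingConfigurationError(Exception):
    """Raised with a readable message instead of an opaque stack trace."""

def check_mandatory_env(required):
    """Fail fast at bootstrap, naming every missing variable at once."""
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise MissingConfigurationError(
            "Missing mandatory environment variables: " + ", ".join(missing))

# Illustrative variable names, not the real dataClay ones:
# check_mandatory_env(["LOGICMODULE_HOST", "LOGICMODULE_PORT"])
```

Reporting all missing variables in one message spares the user the fix-one-rerun-find-the-next loop.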
We detected some misbehaviour when executing numerical applications without active features, i.e. doing attribute getters and setters with big numerical data structures.
At first we assumed it was a performance penalty at the gRPC or socket/serialization layer, and some preliminary results seemed to confirm that, but after more in-depth tests the results came up inconclusive.
Decide whether to use nightly builds or to run tests on the last available package in Maven, PyPI and Docker. Any votes?
This is associated to epic #6
Implement deployment in supercomputers using Singularity containers
Design and implement a CI system including CI Testing
Refers to Epic #6
Create a repository that tests the current development packages (dockers, pypi and maven) using BDD functional tests.
Tests will be developed using cucumber and python behave. Reports will be published on github.io; one tool for this could be allure.
Tests will run against the currently published dev packages in different environments and architectures (arm, amd64, ...). For that we will use the travis matrix.
Functional tests should be grouped into travis jobs and may take a maximum of 50 minutes.
Orchestration of dataClay for each test is still under design. We could use bash scripts that execute docker-compose up or any needed docker command; those bash scripts could be reused in an orchestration repository in the future (currently only using singularity).
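A sketch of what such an orchestration helper could look like, here written in Python as a thin wrapper over docker-compose; the helper names and the dry_run knob are hypothetical, meant only to show the shape of the per-test orchestration step:

```python
import subprocess

def compose_command(compose_file, action="up", detach=True):
    """Build the docker-compose invocation for one test scenario."""
    cmd = ["docker-compose", "-f", compose_file, action]
    if action == "up" and detach:
        cmd.append("-d")
    return cmd

def orchestrate(compose_file, dry_run=True):
    """Bring up one scenario; dry_run returns the command without Docker."""
    cmd = compose_command(compose_file)
    if dry_run:
        return " ".join(cmd)
    subprocess.run(cmd, check=True)
```

The behave environment.py hooks could call orchestrate() in before_scenario and the matching "down" action in after_scenario.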
Java functional tests could be "prepared" using python behave (deployment, new account, contracts, ...) and, using jpype, we could call each Java-specific step. @alexbarcelo what do you think? This could allow us to reuse more code. Another option is to create two different feature files, one for Java and one for Python (and then use cucumber-jvm for Java and behave for Python), but it is not conceptually correct for a feature to depend on the environment (it is like creating features per OS).
Final structure would be:
.
├── make-persistent
│ ├── docker-compose.yml (maybe as a docstring in feature)
│ ├── environment.py (include calls to test-orchestration)
│ ├── java
│ │ ├── app
│ │ │ ├── cfgfiles
│ │ │ └── src
│ │ └── model
│ ├── make-persistent.feature
│ ├── python
│ │ ├── app
│ │ │ ├── cfgfiles
│ │ │ └── src
│ │ │ └── hellopeople.py
│ │ └── model
│ │ └── src
│ │ ├── classes.py
│ │ └── __init__.py
│ └── steps
│ └── make-persistent.py
└── test-orchestration
└── orchestrate.sh
Integration with COMPSs + dataClay may require supporting Python 3.6 and/or making it the default.
One option could be to install numpy only for the x86_64 architecture?
Discuss what to do with the client image (tag it with the Python version it uses?). The client image is used for the COMPSs + dataClay demos in dockers.
The newVersion feature hasn't been properly implemented or tested for Python objects.
Since at the moment an object can have just one alias, the following two things should be changed:
the deleteAlias method should not require any argument (both in Java and Python)
Implement scripts for orchestrating singularity/docker instances in localhost/marenostrum/...
When dataclay-packaging is merged into master, Travis can deploy the Docker images to Docker Hub; check whether Travis can also deploy them for ARM.
This is associated to #6
Simplify logging so users don't need to configure log4j and global.properties (CHECK_LOG4J_ENABLED=true) in Java, or DEBUG=True in Python. Also, simplify python logging.
There should be two flags, "--debug" and "--verbose", that can be added in docker-compose and singularity-compose:
dsjava:
  command: "--debug"
When this is enabled, services will print debug info (grpc debug info will be printed in verbose mode, even if grpc considers it debug level).
Currently, for dataclaycmd, we just need to add "--debug" or "--verbose" to the command itself:
docker run -v $PWD/app/cfgfiles/:/home/dataclayusr/dataclay/cfgfiles/:ro \
bscdataclay/client:2.1 --debug WaitForDataClayToBeAlive 10 5
Design and implement a way to manage memory leaks caused by several requests at the same time (accounting)... Most of the memory management design can be found here:
Unlimited solution: Thread state persistence
Configure travis to run nightly singularity push and run singularity pulls in MN
We have two Storage Locations with two objects A and B, where A references B and B references A. When those objects are not accessible, they should be cleaned by the GC, but currently they are not.
When the DS1 GC finds that object A is a candidate to be cleaned, A is marked as 'inaccessible' and all objects referenced from A should receive a -1 on their reference count, so the GC notifies DS2 that B has one reference fewer. In the next iteration, the GC sees that A has 1 reference, so it is no longer an 'inaccessible' candidate and A is marked as 'accessible' again, so the GC should notify DS2 of a +1 reference to B.
If, during the -1 notification, DS2 processes candidate B and notifies -1 to DS1, then the objects are properly removed; however, this cannot be guaranteed, so if DS2 processes the -1 notification and sends its own -1 later, the objects are never cleaned.
This does not happen if the cycle is not distributed.
Any ideas on how to fix that? @alexbarcelo
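For context, the underlying difficulty is the classic weakness of reference-counting GC: an unreachable cycle keeps every count above zero, so no ordering of +1/-1 messages alone can make the counts hit zero. A toy illustration (plain Python, no dataClay code):

```python
class RefCounted:
    def __init__(self, name):
        self.name = name
        self.refs = []      # outgoing references
        self.refcount = 0   # incoming reference count

def link(src, dst):
    src.refs.append(dst)
    dst.refcount += 1

a, b = RefCounted("A"), RefCounted("B")
link(a, b)
link(b, a)

# Nothing outside the cycle references A or B, yet both counts stay at 1,
# so a pure reference-count GC never collects them.  Distributed stores
# typically add a cycle detector (trial deletion, or a mark-and-sweep over
# suspected subgraphs) rather than trying to order the +1/-1 messages.
assert a.refcount == 1 and b.refcount == 1
```
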
The "LOCAL" flag uses the session.properties file to determine in which Storage Location persistent objects created during that session should be stored. However, in a COMPSs environment the session.properties file is propagated and copied to all workers, so all objects created by workers end up stored in the master's SL. This should be fixed with a new design.
A new dataclaycmd tool that does not depend on Java or Python (maybe Go) so users can install and use it directly. The previous dataClay tool required Docker or the dataClay JAR; dataClay 2.0 avoids the user having to install anything since the tool runs inside a Docker container, but we should offer a better and friendlier way to do this (one that does not depend on dockers).