fuse-posix's Introduction

Rucio - Scientific Data Management

Rucio is a software framework that provides functionality to organize, manage, and access large volumes of scientific data using customisable policies. The data can be spread across globally distributed locations and across heterogeneous data centers, uniting different storage and network technologies as a single federated entity. Rucio offers advanced features such as distributed data recovery or adaptive replication, and is highly scalable, modular, and extensible. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is continuously extended to support LHC experiments and other diverse scientific communities.

Documentation

General information, API/REST description and guides can be found in our documentation or on our webpage.

Try it out

We provide a dockerized environment that serves as both a demo environment and a development environment. It includes all the necessary preconfigured components for developing with multiple storage systems and transfers.

Developers

For information on how to contribute to Rucio, please refer to and follow our CONTRIBUTING guidelines. We strongly recommend using the dockerized environment for development.

Operators

To learn how to deploy and configure Rucio, consult the documentation available online.

Getting Support

If you are looking for support, please contact us via one of our official channels.

fuse-posix's People

Contributors

gabrielefronze, viveknigam3003

fuse-posix's Issues

Make token expiration time eval consistent with UTC

The Rucio server always uses UTC, and the FUSE module must do the same.
At the moment, local timezones are used too heavily in the logs and in internal operations, which causes issues when moving from one timezone to another or connecting across multiple timezones.

Moving to UTC will make everything, including the logs, independent of the local timezone.
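A minimal sketch of what UTC-only handling could look like is given below. The function names and the "YYYY-MM-DD HH:MM:SS" timestamp format are assumptions made for illustration and are not taken from the fuse-posix code or from the Rucio server. The key point is to use timegm() and gmtime_r() rather than mktime() and localtime(), so that neither the expiration check nor the log stamps depend on the machine's timezone.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <optional>
#include <sstream>
#include <string>

// Parse "YYYY-MM-DD HH:MM:SS" as a UTC timestamp (the format is an assumption).
// timegm() is a widely available extension (glibc, BSD, macOS), not ISO C++.
std::optional<std::time_t> parse_utc_timestamp(const std::string& text) {
    std::tm tm{};
    std::istringstream in(text);
    in >> std::get_time(&tm, "%Y-%m-%d %H:%M:%S");
    if (in.fail()) return std::nullopt;
    return timegm(&tm);  // interprets tm as UTC, unlike mktime()
}

// A token is expired once the current UTC epoch time has reached the expiry.
bool token_expired(std::time_t expiry_utc, std::time_t now_utc) {
    return now_utc >= expiry_utc;
}

// Format an epoch time as a UTC log stamp, independent of the local timezone.
std::string utc_log_stamp(std::time_t t) {
    std::tm tm{};
    gmtime_r(&t, &tm);  // POSIX, thread-safe variant of gmtime()
    std::ostringstream out;
    out << std::put_time(&tm, "%Y-%m-%dT%H:%M:%SZ");
    return out.str();
}
```

With helpers of this kind, a check such as token_expired(expiry, std::time(nullptr)) compares two epoch values and gives the same answer in every timezone.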

Mark incoming files as downloading

When a file starts being downloaded, its inode should be marked as incoming, or an error message should be triggered to inform the opener that the file is not ready yet.
Alternatively, the download should be blocking.
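One way to picture the "mark as incoming" option is a small registry of per-file states that the open handler consults. The sketch below is hypothetical: DownloadRegistry, FileState and the choice of -EAGAIN as the "not ready yet" answer are assumptions for illustration, not part of the existing module.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <mutex>
#include <string>

enum class FileState { Unknown, Incoming, Ready };

// Hypothetical registry of per-file download state, keyed by FUSE path.
class DownloadRegistry {
public:
    void mark_incoming(const std::string& path) { set(path, FileState::Incoming); }
    void mark_ready(const std::string& path) { set(path, FileState::Ready); }

    FileState state(const std::string& path) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = states_.find(path);
        return it == states_.end() ? FileState::Unknown : it->second;
    }

    // Value an open() handler could return: 0 when ready, -EAGAIN while the
    // file is still arriving, -ENOENT when no download was ever started.
    int open_status(const std::string& path) const {
        switch (state(path)) {
            case FileState::Ready:    return 0;
            case FileState::Incoming: return -EAGAIN;
            default:                  return -ENOENT;
        }
    }

private:
    void set(const std::string& path, FileState s) {
        std::lock_guard<std::mutex> lock(mutex_);
        states_[path] = s;
    }

    mutable std::mutex mutex_;
    std::map<std::string, FileState> states_;
};
```

The blocking alternative would replace the -EAGAIN branch with a wait on a condition variable that mark_ready() signals. Which of the two behaves better for typical readers of the mount is left open here.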

Avoid segfault if file not found/not yet downloaded

In the use-rucio-cli branch, if the call through bash to rucio download fails, the FUSE module crashes.
It might be enough to touch the file and just return nothing if the file is not found.
Even though the call through bash will not be there in the production environment, this check will remain the same and will make the algorithm much more reliable.
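As a variation on the "touch the file" idea, the check could also report a clean "no such file" error instead of creating a placeholder. The sketch below assumes a hypothetical helper, check_downloaded_file, where download_rc stands in for the exit status of the external rucio download call and cache_path for wherever the file was expected to land. Neither name comes from the fuse-posix sources.

```cpp
#include <cassert>
#include <cerrno>
#include <fstream>
#include <string>

// Hypothetical post-download check: never hand FUSE a file that is not there.
// Returns 0 when the file can be opened, -ENOENT otherwise.
int check_downloaded_file(int download_rc, const std::string& cache_path) {
    if (download_rc != 0) return -ENOENT;  // the download command failed
    std::ifstream probe(cache_path, std::ios::binary);
    if (!probe.is_open()) return -ENOENT;  // nothing landed where expected
    return 0;
}
```

A handler that propagates this value lets the caller see an ordinary error rather than dereferencing data that was never produced.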

Improve and Fix Documentation

Motivation

The documentation for this software needs improvement to ease collaboration and make the code easier to read. A dedicated section with CONTRIBUTING and SETUP guides should be created, and the existing README should be properly formatted.

Add script to setup FUSE mount after build.

Motivation

The build script compiles the program but does not set up FUSE so that it can be used to mount the Rucio server(s) on a directory. A post-build script is therefore required to automate setting up FUSE and mounting the server(s).

Expected Result

vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ sudo ./postbuild.sh
Setting up FUSE for ruciofs.........
Done.

vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ ./cmake-build-debug/bin/rucio-fuse-main

Settings file at: ./settings.json
Parsing settings file:

	Server 0 -> rucio-dev-server:
		url = https://localhost/
		account = root

Actual Result

FUSE has to be set up manually: the user must be added to the fuse group and the right permissions must be set.

Convert and extend existing tests into unit tests

Motivation

Some tests are already implemented, but they are not part of a test framework.
Reimplementing and extending them with CTest or Google Test is crucial for CI and CD.

Modification

Add unit tests written with CTest or Google Test for Rucio-FUSE.

Hide hidden files from FUSE handling

Motivation

On both macOS and Ubuntu (and many other OSs), hidden files are used to implement the recycle bin and many other features.
We must prevent the Rucio FUSE mount from trying to handle them, since they result in unsuccessful server calls.

Expected result

Rucio FUSE should avoid handling .hidden files.

Actual Result

Example:
Environment: macOS

$ ./cmake-build-debug/bin/rucio-fuse-main

Settings file at: ./settings.json
Parsing settings file:
curl_easy_perform() failed: Couldn't resolve host name

Server https://rucio-server not reachable, skipping connection.
curl_easy_perform() failed: Couldn't resolve host name

Server https://rucio-server not reachable, skipping connection.
Server DCIM not found. Aborting!
Server .Spotlight-V100 not found. Aborting!
Server .Spotlight-V100 not found. Aborting!
Server .Spotlight-V100 not found. Aborting!
Server .Spotlight-V100 not found. Aborting!
Server .Spotlight-V100 not found. Aborting!
Server .metadata_never_index not found. Aborting!
Server .hidden not found. Aborting!
Server .localized not found. Aborting!
Server .DS_Store not found. Aborting!
Server ._.DS_Store not found. Aborting!

Environment: Ubuntu

$ ./cmake-build-debug/bin/rucio-fuse-main -f ~/ruciofs

Settings file at: ./settings.json
Parsing settings file:
curl_easy_perform() failed: Couldn't resolve host name
Server https://rucio-server not reachable, skipping connection.
curl_easy_perform() failed: Couldn't resolve host name
Server https://rucio-server not reachable, skipping connection.
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .xdg-volume-info not found. Aborting!
Server autorun.inf not found. Aborting!
Server .hidden not found. Aborting!

Modification

It should be enough to return early if the last item of the path starts with a dot (.).
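The proposed check can be sketched as a small string helper. The name is_hidden_path is hypothetical, and where the early return would go in the FUSE handlers is not shown in this document.

```cpp
#include <cassert>
#include <string>

// True when the last component of `path` starts with '.', e.g. "/.Trash".
// Trailing slashes are ignored; "/" and "" are not considered hidden.
bool is_hidden_path(const std::string& path) {
    std::size_t end = path.find_last_not_of('/');
    if (end == std::string::npos) return false;  // empty or only slashes
    std::size_t slash = path.rfind('/', end);
    std::size_t start = (slash == std::string::npos) ? 0 : slash + 1;
    return path[start] == '.';
}
```

Note that entries such as DCIM and autorun.inf from the logs above do not start with a dot, so a filter of this kind would not cover them on its own.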

Adding a settings.json.template

Motivation

The settings.json file provides the fixed key-value pairs that the program uses to fetch the details of the Rucio server. However, these values may need to change depending on the dev environment or user requirements. A settings.json.template is therefore required, which users and developers can modify according to their needs.

Expected Result

A settings.json.template based on the current settings.json file.

Bad requests & core dumps

After running the rucio-fuse-main executable, I try to list the /ruciofs/rucio-server directory:

# ls /ruciofs/rucio-server
00BadReques

If I list the contents beyond rucio-server/, everything crashes:

# ls /ruciofs/rucio-server/00BadReques
ls: reading directory /ruciofs/rucio-server/00BadReques: Software caused connection abort

Sometimes, if I try the rucio-server-clone directory, I can actually see the scopes from my server's namespace, though:

# ls /ruciofs/rucio-server-clone
ls: cannot access /ruciofs/rucio-server-clone/ER8: Software caused connection abort
ls: cannot access /ruciofs/rucio-server-clone/O3: Transport endpoint is not connected
ls: cannot access /ruciofs/rucio-server-clone/O2: Transport endpoint is not connected
ls: reading directory /ruciofs/rucio-server-clone: Transport endpoint is not connected
ER8  O2  O3

The log on my rucio server shows:

10.244.11.1	[31/Oct/2019:21:47:53 +0000]	-	1	1770247	"root-ligolab-unknown-3214775a91484c389a9634e4cf77f681"	XbtWiY7nZXm5q7NqKYYBpQAAAAI	-	"GET /scopes/ HTTP/1.1"	200	19
10.244.11.1	[31/Oct/2019:21:47:56 +0000]	-	0	118925	"root-ligolab-unknown-3214775a91484c389a9634e4cf77f681"	XbtWjI7nZXm5q7NqKYYBpgAAAAI	-	"GET /dids/00BadReques/ HTTP/1.1"	200	-

All of these actions eventually result in the whole thing crashing:

docker run -it  --device /dev/fuse --cap-add SYS_ADMIN jclarkastro/fuse-posix
----------------------------------------
Activating rucio fuse-posix interface...
Creating CURL instance
/docker-entrypoint.sh: line 4:     6 Segmentation fault      (core dumped) /fuse-posix/cmake-build-debug/bin/rucio-fuse-main

Adding documentation for post-build actions

Motivation

The FUSE mount does not directly work after executing ./build.sh and needs some additional steps to mount the server to a directory. This issue is opened to improve the documentation and add the post-build actions to successfully mount the server.

Implement file download to persistent cache

The FUSE module does not yet provide the ability to download files via the rucio CLI.
The implementation in the use-rucio-cli branch does that, but the downloaded files do not persist locally across system reboots.
Downloads should go to a separate cache location. The FUSE representation would then rely on this persistent cache to provide the actual file contents, via a mechanism similar to a symlink handled directly by FUSE.
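To make the idea concrete, the sketch below maps a file on a given server and scope to a location inside a cache root and checks whether it is already there. The directory layout (cache root, then server, scope and file name) and the helper names are assumptions for illustration only. The issue does not prescribe a layout.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical layout: <cache_root>/<server>/<scope>/<file name>.
fs::path cache_path_for(const fs::path& cache_root, const std::string& server,
                        const std::string& scope, const std::string& name) {
    return cache_root / server / scope / name;
}

// A file is served from the cache only if a regular file already exists there.
bool is_cached(const fs::path& cached_file) {
    std::error_code ec;  // non-throwing overload: any error counts as "not cached"
    return fs::is_regular_file(cached_file, ec);
}
```

A read handler could then resolve the FUSE path to cache_path_for(...) and trigger a download only when is_cached(...) is false, so that contents survive a remount or reboot as long as the cache root lives on persistent storage.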

Mount point specification flag -f not parsed at runtime.

Motivation

According to the documentation, the server can be mounted at a custom location, overriding the default /ruciofs directory, by passing the -f option when running the program. However, the option has no effect at runtime.

Environment: Ubuntu 18.04.4 LTS (bionic)

Steps to reproduce

vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ mkdir ~/ruciofs
vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ ./cmake-build-debug/bin/rucio-fuse-main -f ~/ruciofs

Expected Result

vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ mkdir ~/ruciofs
vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ ./cmake-build-debug/bin/rucio-fuse-main -f ~/ruciofs

Settings file at: ./settings.json
Parsing settings file:

	Server 0 -> rucio-dev-server:
		url = https://<rucio-server-url>/
		account = <your_account>
		username = <username>
		password = <password>
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!
Server .Trash-1000 not found. Aborting!

Actual Result

vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ mkdir ~/ruciofs
vivek@ubuntu:~/Desktop/GSoC_2020/fuse-posix$ ./cmake-build-debug/bin/rucio-fuse-main -f ~/ruciofs

 Settings file at: ./settings.json
Parsing settings file:	Server 0 -> rucio-dev-server:
		url = https://<rucio-server-url>/
		account = <your_account>
		username = <username>
		password = <password>
fuse: bad mount point `/ruciofs': No such file or directory
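The behaviour the documentation promises could be sketched as follows. The helper name mount_point_from_args is hypothetical; the flag name and the /ruciofs default are taken from the report above. One thing worth keeping in mind is that libfuse's own command-line parser uses -f for "run in foreground", so if the arguments are forwarded to libfuse unchanged, the flag may be consumed with that meaning instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the mount point given after `flag`, or `fallback` when the flag is
// absent or is the last argument and therefore has no value.
std::string mount_point_from_args(const std::vector<std::string>& args,
                                  const std::string& flag = "-f",
                                  const std::string& fallback = "/ruciofs") {
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == flag) return args[i + 1];
    }
    return fallback;
}
```

In a real main(), the vector would be built from argv, and ~/ruciofs would already have been expanded by the shell before the program sees it.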

Notice wrong authentication parameters with pingable server

At startup all the servers are pinged, but authentication is not tested.
Later on, at runtime, this can cause bizarre issues when trying to contact the server, such as error messages being rendered as folders and files.
The server credentials must be tested at startup, and any server that cannot be both pinged AND authenticated to must be disabled.
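The startup rule described above can be sketched with the two checks injected as callables, so the real network code stays out of the picture. ServerConfig, ServerCheck and validate_servers are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, stripped-down server entry.
struct ServerConfig {
    std::string name;
    bool enabled = true;
};

using ServerCheck = std::function<bool(const ServerConfig&)>;

// Disable every server that fails either check; returns how many stay enabled.
std::size_t validate_servers(std::vector<ServerConfig>& servers,
                             const ServerCheck& ping,
                             const ServerCheck& authenticate) {
    std::size_t usable = 0;
    for (auto& server : servers) {
        // Short-circuit: authentication is only attempted on reachable servers.
        server.enabled = ping(server) && authenticate(server);
        if (server.enabled) ++usable;
    }
    return usable;
}
```

Passing the checks in as parameters also makes the rule easy to exercise in a unit test with fake servers.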

cd to non-existent directory still possible at server mountpoint and scopes levels

Changing into a non-existent directory with cd is still possible at the server and scope levels.

[root@ruciofs]/ruciofs# ls
test-userpass  test-x509
[root@ruciofs]/ruciofs# cd this-server-doesnt-exist
[root@ruciofs]/ruciofs/this-server-doesnt-exist# ls
[root@ruciofs]/ruciofs/this-server-doesnt-exist#
[root@ruciofs]/ruciofs/test-userpass# ls
NoProjectDefined  archive  data13_hip  mock  test
[root@ruciofs]/ruciofs/test-userpass# cd this-scope-doesnt-exist
[root@ruciofs]/ruciofs/test-userpass/this-scope-doesnt-exist#
[root@ruciofs]/ruciofs/test-userpass# cd test
[root@ruciofs]/ruciofs/test-userpass/test# cd this-file-doesnt-exist
cd: Input/output error: this-file-doesnt-exist
[root@ruciofs]/ruciofs/test-userpass/test/container# cd this-file-doesnt-exist
cd: Input/output error: this-file-doesnt-exist
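At the server level, the fix presumably comes down to answering "no such file or directory" for names that are not configured servers. The sketch below shows that decision in isolation. first_component and server_level_status are hypothetical helpers, and the set of known servers is assumed to come from the parsed settings.

```cpp
#include <cassert>
#include <cerrno>
#include <set>
#include <string>

// Return the first component of an absolute path: "/a/b" -> "a", "/" -> "".
std::string first_component(const std::string& path) {
    std::size_t start = path.find_first_not_of('/');
    if (start == std::string::npos) return "";
    std::size_t end = path.find('/', start);
    return path.substr(start, end == std::string::npos ? end : end - start);
}

// Status a getattr-style handler could report for a server-level path:
// 0 for the mount root or a configured server, -ENOENT otherwise.
int server_level_status(const std::string& path,
                        const std::set<std::string>& servers) {
    const std::string server = first_component(path);
    if (server.empty()) return 0;  // the mount root itself
    return servers.count(server) ? 0 : -ENOENT;
}
```

The same pattern would apply one level down, with the set of scopes returned by the server in place of the set of configured servers.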

Retry download not working

Apparently there's an error within the code wrapping rucio download, which was previously commented out to avoid issues.
If a download fails, it should be retried by the pipeline:

if(output->freturn_code != MAX_ATTEMPTS){
  fastlog(INFO,"Trying again did %s download in %s.", output->fdid.data(), output->full_cache_path().data());
  // fInputQ->append(*output);
}

The idea is sound, but the if condition is wrong. The test:

output->freturn_code != MAX_ATTEMPTS

compares a return code against an attempt limit. It should be:

output->freturn_code != TOO_MANY_ATTEMPTS

or:

output->fattempt <= MAX_ATTEMPTS

based on the definition here:

rucio_download_info* rucio_download_wrapper(rucio_download_info& info){
  if(not server_exists(info.fserver_name)) {
    fastlog(ERROR, "Server %s not found. Aborting!", info.fserver_name.data());
    info.freturn_code = SERVER_NOT_FOUND;
    return &info;
  } else {
    if (info.fattempt <= MAX_ATTEMPTS and info.freturn_code != SETTINGS_NOT_FOUND) {
      info.fattempt++;
      info.freturn_code = rucio_download_wrapper(info.fserver_name, info.fserver_config, info.scopename(),
                                                 info.filename());
      info.fdownloaded = (info.freturn_code != FILE_NOT_FOUND and info.freturn_code != SETTINGS_NOT_FOUND);
    } else {
      info.freturn_code = TOO_MANY_ATTEMPTS;
      info.fdownloaded = false;
    }
  }
  return &info;
}
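The interaction between the wrapper and the retry check can be tried out in isolation with simplified stand-ins. Everything in the sketch below is an assumption made for illustration: DownloadInfo replaces rucio_download_info, the enum values and MAX_ATTEMPTS = 3 are invented, the attempt counter is assumed to start at 0, and fetch is a placeholder for the real download call.

```cpp
#include <cassert>

// Simplified stand-ins for the types in the issue; all values are assumptions.
constexpr int MAX_ATTEMPTS = 3;
enum ReturnCode { DOWNLOAD_OK = 0, FILE_NOT_FOUND, SETTINGS_NOT_FOUND, TOO_MANY_ATTEMPTS };

struct DownloadInfo {
    int attempt = 0;
    int return_code = DOWNLOAD_OK;
    bool downloaded = false;
};

// Mirrors the attempt accounting of the wrapper above; `fetch` is a placeholder
// for the real download call and returns a ReturnCode.
template <typename Fetch>
void attempt_download(DownloadInfo& info, Fetch fetch) {
    if (info.attempt <= MAX_ATTEMPTS && info.return_code != SETTINGS_NOT_FOUND) {
        ++info.attempt;
        info.return_code = fetch();
        info.downloaded = (info.return_code != FILE_NOT_FOUND &&
                           info.return_code != SETTINGS_NOT_FOUND);
    } else {
        info.return_code = TOO_MANY_ATTEMPTS;
        info.downloaded = false;
    }
}

// Requeue only failed downloads that have not exhausted their attempts.
bool should_retry(const DownloadInfo& info) {
    return !info.downloaded && info.return_code != TOO_MANY_ATTEMPTS;
}
```

Under these assumptions, a file that is never found is fetched MAX_ATTEMPTS + 1 times before the wrapper reports TOO_MANY_ATTEMPTS, because the counter starts at 0 and the comparison is <=. Whether that extra attempt is intended in the real code cannot be determined from the snippet alone.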
