SDA-Filesystem / Data Gateway

This project has been rebranded as Data Gateway

Data Gateway makes use of the:

It builds a FUSE (Filesystem in Userspace) layer and uses Airlock to export files to SD Connect. Software currently supports Linux, macOS and Windows for:

Binaries are built on each release for all supported Operating Systems.

Requirements

Go version 1.21

Set these environment variables before running the application:

  • FS_SD_CONNECT_API - API for SD-Connect
  • FS_SD_SUBMIT_API - a comma-separated list of APIs for SD Apply/SD Submit
  • SDS_ACCESS_TOKEN - a JWT for authenticating to the SD APIs
  • FS_CERTS - path to a file that contains certificates required by SD Connect, SD Apply/SD Submit, and SDS AAI
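
A startup check for the required variables listed above could look like the following minimal sketch. The helper name missingEnv and the choice to treat an empty value as missing are assumptions made for illustration; they are not taken from the Data Gateway source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredEnv lists the variables this README marks as required.
var requiredEnv = []string{
	"FS_SD_CONNECT_API",
	"FS_SD_SUBMIT_API",
	"SDS_ACCESS_TOKEN",
	"FS_CERTS",
}

// missingEnv (hypothetical helper) returns the names in keys for which
// lookup reports no value or an empty value. lookup has the same shape
// as os.LookupEnv, which makes the function easy to test.
func missingEnv(keys []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, k := range keys {
		if v, ok := lookup(k); !ok || v == "" {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	if missing := missingEnv(requiredEnv, os.LookupEnv); len(missing) > 0 {
		fmt.Println("missing environment variables:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all required environment variables are set")
}
```

Passing the lookup function as a parameter keeps the check independent of the real process environment.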

Optional environment variables:

  • CSC_USERNAME - username for SDA-Filesystem
  • CSC_PASSWORD - password for SDA-Filesystem and Airlock CLI

For the test environment, follow the instructions at https://gitlab.ci.csc.fi/sds-dev/local-proxy

Graphical User Interface

Dependencies

Install cgofuse and its dependencies for your operating system.

Install Wails and its dependencies.

Install pnpm

Build and Run

Before running or building the repository for the first time, generate the frontend assets by running:

pnpm install --prefix frontend
pnpm --prefix frontend run build

To run in development mode:

cd cmd/gui
wails dev

To build for production:

cd cmd/gui

# For Linux and macOS
wails build -upx -trimpath -clean -s

# For Windows
wails build -upx -trimpath -clean -s -webview2=embed

Deploy

See Linux setup.

Command Line Interface

Two command line binaries are released, one for SDA-Filesystem and one for Airlock.

SDA-Filesystem

The CLI binary requires a username and password for accessing the SD-Connect Proxy API. The username is given as input. The password is either given as input or read from the CSC_PASSWORD environment variable.

Build and Run

go build -o ./go-fuse ./cmd/fuse/main.go

Test install.

./go-fuse -help
Usage of ./go-fuse:
  -alsologtostderr
    	log to standard error as well as files
  -http_timeout int
    	Number of seconds to wait before timing out an HTTP request (default 20)
  -log_backtrace_at value
    	when logging hits line file:N, emit a stack trace
  -log_dir string
    	If non-empty, write log files in this directory
  -loglevel string
    	Logging level. Possible values: {debug,info,warning,error} (default "info")
  -logtostderr
    	log to standard error instead of files
  -mount string
    	Path to Data Gateway mount point
  -project string
    	SD Connect project if it differs from that in the VM
  -sdapply
      Connect only to SD Apply
  -stderrthreshold value
    	logs at or above this threshold go to stderr
  -v value
    	log level for V logs
  -vmodule value
    	comma-separated list of pattern=N settings for file-filtered logging

Example run: ./go-fuse -mount=$HOME/ExampleMount will create the FUSE layer in the directory $HOME/ExampleMount for both 'SD Connect' and 'SD Apply'.

User input

The user can update the filesystem by entering the command update. This requires that no files inside the filesystem are in use. Updating also clears the cache. As a result of this operation, new files may be added and some old ones removed.

The filesystem can also be updated programmatically with the SIGUSR2 signal.

To update the filesystem from bash in SD Desktop:

# Update CLI version
kill -s SIGUSR2 $(pgrep go-fuse)

# Update GUI version
kill -s SIGUSR2 $(pgrep sda-fuse)

If the user wants to update particular SD Connect files inside the filesystem, the user can enter the command clear <path>, where <path> is the path to the file or folder to update. <path> must contain at least a bucket, i.e. SD-Connect/project/bucket or SD-Connect/project/bucket/file would be acceptable paths, but not, e.g., SD-Connect/project. If the user gives a path to a folder, all files inside this folder are updated but no files are added or removed. This operation clears the cache for all the necessary files so that the new content is read from the database and the sizes of these files are updated in the filesystem.
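
The path rule for clear can be expressed as a small check. This is a sketch under stated assumptions: the function name hasBucket is hypothetical, and the path is assumed to be given relative to the mount point with "/" as separator, as in the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// hasBucket (hypothetical helper) reports whether path names at least a
// bucket, i.e. has the form SD-Connect/<project>/<bucket>[/...].
func hasBucket(path string) bool {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) < 3 || parts[0] != "SD-Connect" {
		return false
	}
	// Reject empty project or bucket components such as "SD-Connect//bucket".
	return parts[1] != "" && parts[2] != ""
}

func main() {
	paths := []string{
		"SD-Connect/project/bucket",
		"SD-Connect/project/bucket/file",
		"SD-Connect/project",
	}
	for _, p := range paths {
		fmt.Printf("%-32s %v\n", p, hasBucket(p))
	}
}
```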

Airlock

The CLI binary requires a username, a bucket and a filename. The password is either given as input or read from the CSC_PASSWORD environment variable.

Build and Run

go build -o ./airlock ./cmd/airlock/main.go

Test install.

./airlock -help
Usage of ./airlock:
  -alsologtostderr
    	log to standard error as well as files
  -debug
    	Enable debug prints
  -journal-number string
    	Journal Number/Name specific for Findata uploads
  -log_backtrace_at value
    	when logging hits line file:N, emit a stack trace
  -log_dir string
    	If non-empty, write log files in this directory
  -logtostderr
    	log to standard error instead of files
  -original-file string
    	Filename of original unecrypted file when uploading pre-encrypted file from Findata vm
  -project string
    	SD Connect project if it differs from that in the VM
  -quiet
    	Print only errors
  -segment-size int
    	Maximum size of segments in Mb used to upload data. Valid range is 10-4000. (default 4000)
  -stderrthreshold value
    	logs at or above this threshold go to stderr
  -v value
    	log level for V logs
  -vmodule value
    	comma-separated list of pattern=N settings for file-filtered logging

Example run: ./airlock username ExampleBucket ExampleFile will export file ExampleFile to bucket ExampleBucket.
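
To illustrate how the -segment-size option relates to the size of an upload, the sketch below validates the documented range (10-4000) and computes how many segments a file of a given size would need. The function name segmentCount is hypothetical, and treating one "Mb" as 2^20 bytes is an assumption; the help text does not say which unit Airlock uses.

```go
package main

import (
	"errors"
	"fmt"
)

// bytesPerMb is an assumption: the help text says "Mb" without defining it.
const bytesPerMb = 1 << 20

// segmentCount (hypothetical helper) returns the number of segments needed
// to upload fileSize bytes when each segment holds at most segmentSizeMb.
func segmentCount(fileSize int64, segmentSizeMb int) (int64, error) {
	if segmentSizeMb < 10 || segmentSizeMb > 4000 {
		return 0, fmt.Errorf("segment size %d is outside the valid range 10-4000", segmentSizeMb)
	}
	if fileSize < 0 {
		return 0, errors.New("file size must not be negative")
	}
	segBytes := int64(segmentSizeMb) * bytesPerMb
	// Round up so that a partial final segment is counted.
	return (fileSize + segBytes - 1) / segBytes, nil
}

func main() {
	n, err := segmentCount(10000*bytesPerMb, 4000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("segments needed:", n)
}
```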

Troubleshooting

See troubleshooting for fixes to known issues.

License

Data Gateway is released under MIT, see LICENSE.

Wails is released under MIT

CgoFuse is released under MIT

sda-filesystem's Issues

Remove .c4gh extension from encrypted files

At the moment, encrypted files have a .c4gh extension in mounted filesystems. This causes issues with applications that try to detect the file format from the filename extension. In particular, this causes problems on Windows, where this is the default behaviour.

We should also try to find a way to deal with filename conflicts as that might happen when we remove .c4gh extension.
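
One possible way to handle such conflicts, shown purely as an illustration and not as a decision made in this issue, is to drop the .c4gh suffix only when the shortened name is not claimed by any other entry in the same directory. The function name displayNames is hypothetical, and stored names are assumed to be unique within a directory.

```go
package main

import (
	"fmt"
	"strings"
)

const suffix = ".c4gh"

// displayNames (hypothetical helper) maps each stored name to the name shown
// in the mount. The suffix is dropped only if no other entry claims the
// shortened name, either as its stored name or as its own shortened name.
func displayNames(stored []string) map[string]string {
	claims := make(map[string]int)
	for _, n := range stored {
		short := strings.TrimSuffix(n, suffix)
		claims[short]++
		if short != n {
			claims[n]++
		}
	}
	out := make(map[string]string, len(stored))
	for _, n := range stored {
		short := strings.TrimSuffix(n, suffix)
		if short != n && short != "" && claims[short] == 1 {
			out[n] = short
		} else {
			out[n] = n // keep the stored name on conflict
		}
	}
	return out
}

func main() {
	stored := []string{"a.txt.c4gh", "a.txt", "b.csv.c4gh"}
	names := displayNames(stored)
	for _, n := range stored {
		fmt.Printf("%s -> %s\n", n, names[n])
	}
}
```

With this rule, a conflicting encrypted file simply keeps its .c4gh extension, so no two entries are ever shown under the same name.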

move selecting directory from main view

The main view should change as follows:

  • Choose directory should be removed from the Home tab, along with the title 2. Mount Directory.
  • We will default to the user's Home/Projects directory.
  • The ability to choose the directory is kept by introducing a Settings tab after Home, which allows users to change the directory.
  • If an error occurs and the Home/Projects directory already contains data or is already mounted, we will redirect users to the Settings tab to change it.

change default window size

The default window size should not be full screen.
Pending: information on what the default window size should be.

SD-Connect API merger

SD-Connect Data-API and Metadata-API are going to be merged into a single app. Old endpoints stay the same. Instead of two environment variables we're going to need only one.

Old

export FS_SD_CONNECT_METADATA_API=https://connect-metadata-api-test.sd.csc.fi
export FS_SD_CONNECT_DATA_API=https://connect-data-api-test.sd.csc.fi

New, e.g.

export FS_SD_CONNECT_API=https://connect-api-test.sd.csc.fi

Filter logs

Give the user the possibility to filter logs in the UI based on level (info, warning, error).

change main services names

Change the names of the services:

  • SD-Submit -> SD Apply (we should probably also change the folder name to SD Apply instead of SD-Submit)
  • SD-Connect -> SD Connect

Refresh FUSE layer

As a user, I need to be able to refresh the FUSE layer with new projects/datasets that have been added while I keep the UI open.

It is assumed that we need to check whether projects/datasets have been added/removed and what is inside projects/datasets

Export

Design: https://www.figma.com/file/L7lldtlYvpXNZrriYueZVC/DRAFT-2---Data-Gateway-with-compute-(Copy)?node-id=2434%3A3668

Given that user wants to select files
when user selects the Select files button
then user will see file browser and can select files
and when user has selected files
then user will see files in the Export list.

Given that user wants to drag and drop files
when user drags files to area
then user will see that area is highlighted
and when user drops files
then user will see files in the Export list.

Given that user wants to remove files from the Export list
when user selects Remove link next to the file
then user will see that file was removed from the Export list.

Given that user wants to export files
when user selects Export button
then user will see progress bar and file list with progress bars for individual files and can follow progress
and when export is ready
then user will see that progress bar is full
and also progress bars of individual files are full.

Prevent thumbnail creation for files

Updating the FUSE layer is not always possible even if the user is not using any files. This is because, when a folder is opened in a file browser, some process starts reading files in order to create thumbnails for them (QuickLook Thumbnailing on macOS). This could possibly be prevented by checking the pid of the program that reads the file.
