gin-cli's Introduction

GIN-CLI


G-Node Infrastructure Command Line Client

This package is a command line client for interfacing with repositories hosted on GIN. It offers a simplified interface for downloading files from, and uploading files to, those repositories.

It consists of commands for interfacing with the GIN web API (e.g., listing repositories, creating repositories, managing SSH keys) but primarily, it wraps git and git-annex commands to make working with data repositories easier.

Information, setup, and guides

For installation instructions see the GIN Client Setup page.

General information, help, and guides for using GIN can be found on the GIN Info Wiki. Help and information for the client in particular can be found on the following pages:

gin-cli's People

Contributors

achilleas-k, cgars, gicmo, hkchekc, mpsonntag

gin-cli's Issues

Host key verification failed

On new systems where the server's SSH host key has not yet been added to known_hosts, the git commands fail because the host key is unknown and the user is not prompted to accept it.

The SSH option is
ssh -o StrictHostKeyChecking=no [email protected]

Do not use agent key for different logged in user

Consider the following scenario:

  1. Alice logs in using the client: gin login alice
  2. Alice uploads her key to gin-auth: gin keys --add ~/.ssh/id_rsa.pub
  3. Alice does some work using her key in her SSH agent
  4. Alice logs out: gin logout
  5. Bob logs in on the same workstation: gin login bob
  6. Bob tries to perform a git transaction that Alice is not permitted to do: gin get bob/bobs_data

In step 6, the following will occur:

  1. The client will check to see if there's an SSH agent running
  2. It will find Alice's SSH agent and key
  3. It will use Alice's key to perform a git clone
  4. When Alice's key is sent to gin-repo, the repo service will query gin-auth and auth will respond that the key belongs to Alice.
  5. Bob will get an error that he is not authorised to download bob/bobs_data, because he is trying to authenticate using Alice's key.

Solution: Before using the key found in the agent, check with gin-auth that the currently logged in user (if any) matches the owner of the key.
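The check proposed above could look roughly like the following sketch. The OwnerLookup type is a hypothetical stand-in for whatever query gin-auth offers to map a key to its owner; the function names and the choice to skip the check when nobody is logged in are assumptions, not the client's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// OwnerLookup is a hypothetical stand-in for a gin-auth query that maps a
// public key fingerprint to the username that registered it.
type OwnerLookup func(fingerprint string) (string, error)

// ErrKeyOwnerMismatch reports that the agent key belongs to another user.
var ErrKeyOwnerMismatch = errors.New("agent key belongs to a different user")

// CheckAgentKey returns nil only when the key found in the agent is
// registered to the currently logged in user.
func CheckAgentKey(currentUser, fingerprint string, lookup OwnerLookup) error {
	if currentUser == "" {
		// Assumption: with nobody logged in there is no owner to compare.
		return nil
	}
	owner, err := lookup(fingerprint)
	if err != nil {
		return fmt.Errorf("looking up key owner: %w", err)
	}
	if owner != currentUser {
		return ErrKeyOwnerMismatch
	}
	return nil
}

func main() {
	// Fake lookup table standing in for the auth server.
	owners := map[string]string{"SHA256:alice-key": "alice"}
	lookup := func(fp string) (string, error) {
		owner, ok := owners[fp]
		if !ok {
			return "", errors.New("unknown key")
		}
		return owner, nil
	}
	fmt.Println("alice:", CheckAgentKey("alice", "SHA256:alice-key", lookup))
	fmt.Println("bob:", CheckAgentKey("bob", "SHA256:alice-key", lookup))
}
```

On a mismatch the client could fall back to a key loaded from a file instead of failing outright.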

Selective download and upload

By far the most requested feature is being able to selectively sync files to and from the server. This is particularly useful for working with very large repositories without cloning all the data locally.

Split out client library code

A new repository will be created for common code that will be shared between all user clients (e.g., gin-client-lib). This should contain most of the code found in this repository that is reusable, e.g., packages client, repo, auth.

The current repository, gin-cli, will then be a thin client on top of gin-client-lib which simply parses command line arguments, calls library functions, and handles output.

Other clients, like a desktop GUI or a filebrowser plugin, will also use gin-client-lib for all operations.

Create commits when getting files from remote

When an annexed file's contents are made available locally (git annex get file), a commit needs to be created (and pushed?) to register the new location of the file. Is this the proper way of handling this or does it detect modifications because of the WORM backend?

Handle merge conflicts

I can think of a few ways to handle merge conflicts:

  1. Pick either the remote or the local file to be the main version (consistently) and rename the other one to indicate it is a conflicting file (dropbox style). I think this is the friendliest solution.
  2. Rename both files with a timestamp of their last modification time. Not fun, since the user will be left with no file which carries the original name.
  3. Ask the user what to do. Maybe this could give a few options: Let the user indicate which is the primary file and rename the other one (never delete), or allow them to cancel the operation.
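As an illustration of the first option, the renaming could be sketched as below. The naming scheme (a label plus "-conflict" inserted before the extension) is an assumption chosen for the example, and names that begin with a dot are not treated specially.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ConflictName returns a sibling file name for the losing side of a merge
// conflict by inserting "<label>-conflict" before the file extension.
func ConflictName(path, label string) string {
	ext := filepath.Ext(path)
	base := strings.TrimSuffix(path, ext)
	return base + "." + label + "-conflict" + ext
}

func main() {
	fmt.Println(ConflictName("results/data.csv", "remote"))
	fmt.Println(ConflictName("notes", "remote"))
}
```

Keeping the original name for one side means the user always has a file under the expected name, which is the main advantage over the second option.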

Use external configuration file to define behaviour

A config file in $XDG_CONFIG_HOME/gin that can be used to configure the client's behaviour (e.g., host addresses for custom installations).

List of options:

  • Host address(es).
  • Key mechanism priority: file or agent.
  • File types that are added to git instead of annex.
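A minimal sketch of how the configuration directory could be resolved and what the options above might look like as a Go type. The Config field names are illustrative only; the fallback to $HOME/.config follows the XDG Base Directory convention for an unset XDG_CONFIG_HOME.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Config mirrors the options listed above. Field names are illustrative.
type Config struct {
	Hosts       []string // host address(es) of the GIN installation
	KeySource   string   // key mechanism priority: "file" or "agent"
	GitPatterns []string // file types added to git instead of annex
}

// ConfigDir returns the directory that holds the client configuration.
// The environment lookup is injected so the function is easy to test.
func ConfigDir(getenv func(string) string) string {
	if dir := getenv("XDG_CONFIG_HOME"); dir != "" {
		return filepath.Join(dir, "gin")
	}
	// Default location when XDG_CONFIG_HOME is unset or empty.
	return filepath.Join(getenv("HOME"), ".config", "gin")
}

func main() {
	fmt.Println(ConfigDir(os.Getenv))
}
```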

Verbose flags

Add two verbose flags.

The standard -v or --verbose flag should print extended output.
A second one should also be added that prints the git and annex commands that are being executed (e.g., --git).

When both are used in combination, the output of the git commands should also be printed.
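The flag combination described above could be parsed with Go's standard flag package, as in this sketch. The Verbosity type and function names are assumptions made for the example; --git is the flag name suggested in the text.

```go
package main

import (
	"flag"
	"fmt"
)

// Verbosity holds the two independent verbosity switches.
type Verbosity struct {
	Extended    bool // -v / --verbose: print extended output
	GitCommands bool // --git: print the git and annex commands being run
}

// ShowGitOutput reports whether the output of the git commands should be
// printed too, which is the case only when both flags are given.
func (v Verbosity) ShowGitOutput() bool {
	return v.Extended && v.GitCommands
}

// ParseVerbosity reads the verbosity flags from args.
func ParseVerbosity(args []string) (Verbosity, error) {
	var v Verbosity
	fs := flag.NewFlagSet("gin", flag.ContinueOnError)
	// Both spellings write to the same field.
	fs.BoolVar(&v.Extended, "v", false, "print extended output")
	fs.BoolVar(&v.Extended, "verbose", false, "print extended output")
	fs.BoolVar(&v.GitCommands, "git", false, "print git and annex commands")
	err := fs.Parse(args)
	return v, err
}

func main() {
	v, err := ParseVerbosity([]string{"--verbose", "--git"})
	fmt.Println(v.Extended, v.GitCommands, v.ShowGitOutput(), err)
}
```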

Library functions should not print output

Certain library functions, like GetRepos in repo/repo.go, print output. Anything that's in a subpackage should be a library function and so all console output should be handled by main.

We should consider how to handle partial progress status printing in these cases.
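One possible way to handle partial progress without printing from library code is to let the caller pass in a callback. The sketch below uses hypothetical names (StatusFunc, AddFiles) and does not reflect the existing functions in the repo package.

```go
package main

import "fmt"

// StatusFunc receives progress updates from library code. The caller
// decides whether and how to display them.
type StatusFunc func(done, total int, item string)

// AddFiles is a placeholder for a library operation over several files.
// It never prints; it only reports progress through the callback.
func AddFiles(paths []string, status StatusFunc) int {
	for i, p := range paths {
		// The real per-file work would happen here.
		if status != nil {
			status(i+1, len(paths), p)
		}
	}
	return len(paths)
}

func main() {
	// Only main writes to the console.
	AddFiles([]string{"a.dat", "b.dat"}, func(done, total int, item string) {
		fmt.Printf("[%d/%d] %s\n", done, total, item)
	})
}
```

A GUI client could pass a different callback that updates a progress widget instead.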

Better error messages

The error messages for the following cases are puzzling or uninformative:

  • gin info <user> when <user> doesn't exist.
  • gin get <reponame> when <reponame> doesn't exist.
  • gin repos <user>: regardless of whether <user> exists or has no public/shared repositories, currently the message is the same (because the server returns 404). On 404, the client should check if <user> exists and return with appropriate feedback.
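The follow-up check for the 404 case could be sketched as follows. The userExists parameter is a hypothetical hook for a second request to the server, and the message texts are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
)

// ReposMessage turns a failed repository listing into user feedback.
// userExists is a hypothetical hook that asks the server about the user.
// An empty string means the status needs no special explanation here.
func ReposMessage(user string, status int, userExists func(string) (bool, error)) string {
	if status != http.StatusNotFound {
		return ""
	}
	exists, err := userExists(user)
	switch {
	case err != nil:
		return fmt.Sprintf("could not check whether user '%s' exists: %v", user, err)
	case !exists:
		return fmt.Sprintf("user '%s' does not exist", user)
	default:
		return fmt.Sprintf("user '%s' has no public or shared repositories", user)
	}
}

func main() {
	known := map[string]bool{"alice": true}
	exists := func(u string) (bool, error) { return known[u], nil }
	fmt.Println(ReposMessage("alice", http.StatusNotFound, exists))
	fmt.Println(ReposMessage("carol", http.StatusNotFound, exists))
}
```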

Repository functionality

Should support:

  • Create repository
  • Upload data to repository
  • Download data from repository

EDIT: Not going to implement editing repository info. Can be done on the web and just changing the description on the client is unnecessary. Closing.

Documentation

  • Document all options properly. Currently, all we have is the output of gin cmdhelp in the repo readme.
  • Advanced documentation with explanation of what the client does in the background.
  • Create tutorial for common use cases.
  • Add basic documentation into the packaged archives.

Status output for time-consuming operations

This isn't exactly like issue #7 (Progress Bars), but it is more about informing the user about what is happening, especially when an operation can take a while.

When working with very large files, some steps in an operation, like computing the hash when adding new files, can take a long time. A progress bar for data transfers wouldn't help here; there's a long computation occurring before the actual transfer begins. What we need is a message in the style of Preparing files for upload..., perhaps even a file count progress indicator.

Basically, we need anything that informs the user that something is happening, otherwise it looks like the program is hanging.

Support annex v5

When v6 is not available, fall back to v5 and unlocked files or direct mode.

No upload with committed but non-synced annexes

Just because there are no local changes does not, IMHO, necessarily mean that everything has been uploaded (consider upload errors, both permanent and, more importantly, temporary), so I am not sure that this is a sufficient test.

Limit commit message length

When adding many (thousands of) files for upload, the commit message becomes long enough that it exceeds the limits of the command and the "annex push" command fails.
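One way to keep the message bounded is to list only the first few files and summarise the rest, as in this sketch. The "gin upload:" prefix and the layout are assumptions for illustration, not the client's current message format.

```go
package main

import (
	"fmt"
	"strings"
)

// CommitMessage builds a commit message that lists at most maxListed files
// and summarises the remainder, so its length no longer grows with the
// number of files.
func CommitMessage(files []string, maxListed int) string {
	if maxListed < 0 {
		maxListed = 0
	}
	var b strings.Builder
	fmt.Fprintf(&b, "gin upload: %d file(s) changed\n", len(files))
	shown := files
	if len(shown) > maxListed {
		shown = shown[:maxListed]
	}
	for _, f := range shown {
		b.WriteString("\n  " + f)
	}
	if rest := len(files) - len(shown); rest > 0 {
		fmt.Fprintf(&b, "\n  ... and %d more", rest)
	}
	return b.String()
}

func main() {
	fmt.Println(CommitMessage([]string{"a", "b", "c", "d", "e"}, 2))
}
```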

Download repository on create

Perhaps we should prompt the user for this, but it would probably be very convenient and low cost to do a clone after creating a repository on the server, especially considering the repository will be empty.

Create activity log

Make the client log all activities, while storing as little identifiable information as possible. The log should help in debugging during development, but also in troubleshooting client problems.

Use git annex in direct mode

Would make more sense to the user (their files would not be replaced by symlinks) and is also required for git annex on Windows.

Upload command does not add dotfiles

This is a tricky one.
Should we add dotfiles?

I might rethink this if I add a command that does git add exclusively.
Also under consideration is a command that performs git add $* && git commit ..., to record changes locally without necessarily uploading.

Command to make repository out of current directory

I imagine a scenario where a user has a directory with data and just wants to turn it into a gin repository. We could have a command that creates a remote and, instead of using gin get (git clone), the client performs git init, git annex init, sets defaults, adds the newly created repository as a remote, and begins tracking the current directory.
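The command sequence for such an operation might be assembled as in the sketch below. It only builds the argument lists and does not run them; the remote name "origin" and the function name are assumptions, and the "sets defaults" step is left out because the text does not specify it.

```go
package main

import (
	"fmt"
	"strings"
)

// InitCommands returns the git and git-annex invocations that would turn
// the current directory into a repository tracking the given remote.
func InitCommands(remoteURL, description string) [][]string {
	return [][]string{
		{"git", "init"},
		{"git", "annex", "init", description},
		// Assumption: the new server-side repository is added as "origin".
		{"git", "remote", "add", "origin", remoteURL},
	}
}

func main() {
	for _, cmd := range InitCommands("<remote-url>", "workstation") {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```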

I'd like to think about this more and hear other people's thoughts on this though. One downside is that this function enables users to start using their primary data storage as a gin repository. Forcing users to create a separate directory to initialise as a gin repo and having them copy their data manually, also forces them to think about which data they want to push.

gin get for clone

Separate the git clone command from the git pull command.

Currently they are both handled by gin download. To avoid ambiguity, cloning should be handled by a new command called gin get.

Help for individual commands

People expect to be able to use

gin help keys

to get help on the keys command.

Same for

gin keys --help

Would be nice to have both.

Create key pair and load user key from file for transactions

Currently, the client loads the key directly from the agent.
Eventually, the client should be able to generate its own key pair, upload the public key to the server, and directly load the private key from a local file.

Keys in an SSH agent could be used based on a global configuration setting.

Don't create unnecessary commits.

Pull should do a fast forward when possible without creating a new commit.
Uploading also creates a commit even when there are no changes to any of the files in the repository.
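For the upload case, one possible guard is to inspect the output of git status --porcelain, which is empty when the working tree has no changes, and skip the commit in that case. The helper name below is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// HasChanges reports whether the given `git status --porcelain` output
// lists any modified, added, deleted, or untracked paths.
func HasChanges(porcelain string) bool {
	return strings.TrimSpace(porcelain) != ""
}

func main() {
	clean := ""
	dirty := " M data/recording.dat\n?? notes.txt\n"
	fmt.Println(HasChanges(clean)) // false: nothing to commit
	fmt.Println(HasChanges(dirty)) // true: a commit is warranted
}
```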

Ask users to add keys in documentation/tutorial

An interesting edge case:
A user has a tool (e.g., shell prompt) that relies on being able to speak to a git remote. If their key isn't set up with gin, and they rely on the temporary key handling in the client, this will cause issues (specifically, it will prompt for a password).

In the tutorial or the basic docs, we should have a section that explicitly asks users to add their keys. We should also tell users that don't have SSH keys or don't generally use git to ignore this section.

Descriptive errors

Some common, very specific errors give vague messages.
One in particular was caught a few times during the BrainHack workshop.

When a gin get is called and the directory already exists, the command fails with a vague error message. It would be trivial to distinguish this case and give an informative error message.

There are other errors like this that could use work.


UPDATE
Checklist of errors to handle:

  • Directory already exists
  • Permission denied on git command
  • File not found (e.g., on gin get-content)
  • Unable to unlock file with no content
  • Unable to lock, unlock, get-content, or remove-content of untracked or non-annexed files
  • Unable to run any git commands outside a git repository
  • Unable to run any command on file that does not exist
  • Not enough free space (git annex performs the check itself. Related issue #21)
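The first item on the checklist could be caught before any git command runs. The following is a minimal sketch with a hypothetical function name and message wording.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// CheckCloneTarget returns a descriptive error when the directory that a
// clone would create already exists, and nil when the path is free.
func CheckCloneTarget(dir string) error {
	_, err := os.Stat(dir)
	switch {
	case err == nil:
		return fmt.Errorf("cannot get repository: directory '%s' already exists", dir)
	case errors.Is(err, fs.ErrNotExist):
		return nil
	default:
		// For example, permission denied on a parent directory.
		return fmt.Errorf("cannot inspect '%s': %w", dir, err)
	}
}

func main() {
	fmt.Println(CheckCloneTarget("."))
}
```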

Change behaviour of repos command

Current behaviour:

  • By default, it lists public repositories.
  • If username is specified, it lists the repositories owned by username and accessible by me (the logged in user). These can be either shared or public.

New behaviour:

  • By default, list my repositories.
  • Provide a flag that lists repositories shared with me.
  • Provide a flag that lists public repositories.
  • If username is specified, same as current behaviour.

Can't accept host key when using gin for the first time

Since the commands run by the command line client aren't interactive, the user isn't prompted to accept the host key of the gin repo server when the first git command is issued.

One way around this is to add -o StrictHostKeyChecking=no to all git{,-annex} commands.
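A sketch of how this option could be applied to every git invocation from Go, using git's GIT_SSH_COMMAND environment variable. The helper name is hypothetical, and whether git-annex honours the same variable depends on its version, so that part is an assumption. Disabling host key checking also removes protection against connecting to an impostor server; newer OpenSSH releases offer StrictHostKeyChecking=accept-new as a less drastic setting.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// sshEnv makes git run ssh with host key checking disabled.
const sshEnv = "GIT_SSH_COMMAND=ssh -o StrictHostKeyChecking=no"

// GitCommand prepares a git invocation whose SSH transport will not stop
// at the host key prompt. The command is returned without being run.
func GitCommand(args ...string) *exec.Cmd {
	cmd := exec.Command("git", args...)
	cmd.Env = append(os.Environ(), sshEnv)
	return cmd
}

func main() {
	cmd := GitCommand("clone", "<repository-url>")
	fmt.Println(cmd.Args)
	fmt.Println(cmd.Env[len(cmd.Env)-1])
}
```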

Another way would be to see if we can detect the prompt and pass it on to the user. This wasn't an issue before since gin-cli didn't capture stdout. But with the newest changes in logging, this will change. I decided that capturing stdout is important for troubleshooting and that the user shouldn't be presented with the output of the backend commands.

We could also have a command that simply attempts to connect to the git server (similar to repo.Connect()) and prompts the user to accept the key. I don't really like this last idea though.

Packaging and distribution

Need an automated way to package the client for different OSes and Linux distros (deb, rpm, etc).

Will need to figure out which distributions (and distribution versions) require bundling annex v6.

Add current user to commit messages

When doing a gin upload, consider using logged in user info to create the commit signature. Though this might break things for people who have a git configuration.

Perhaps a compromise could be to let the git configuration set the commit Author, but add the logged in user's info (username, real name) to the commit message.

Annex modes

I'm having trouble settling on the mode of operation for the client with respect to git annex. I'd like to describe all the cases and what they entail to see if anyone else has any idea how to best handle this.

Let me start with a summary of the situations and go into more detail below.

| Mode | Advantages | Disadvantages | Notes |
|---|---|---|---|
| Indirect mode | Default mode. Safe. | Doesn't work on Windows (requires symlinks). Could be confusing for the user. | Although this is the current default, it seems to be heading towards being deprecated for the v6 mode. |
| Direct mode | Closer to what a user unfamiliar with git-annex would expect. | Deprecated as of version 6. Unsafe, since the user is always working with "unlocked" files. | Although deprecated, I'm including this mode as we might consider it for compatibility with distributions that still run v5 of git-annex. |
| Version 6 unlocked | Best of both modes above. Files aren't symlinks but it's also not unsafe. Can use normal git commands and configuration options to commit files into annex. | Doubles the disk space usage for annexed files. Requires newest version of git-annex. | |
| Version 6 unlocked thin | Resolves issue with double disk space usage. | Unsafe, since there is no unmodified copy in .git/annex/objects when working on a file. | |

My issue here is with trying to balance convenience for the user and data safety. The last mode, V6 thin, is pretty convenient: It behaves like most users would expect (e.g., no symlinks and unlocking required) and it doesn't use up the extra disk space.

Since I started writing this I've experimented with all modes and I have some ideas about how to use v6 unlocked thin mode while also making sure the user is always aware of the safety of their data.

In thin mode, there is no local backup of the last checked in version of the annexed file. A file's state is only backed up once an upload to a remote is complete. If we stick to this mode, we could provide commands (and in the future, GUI indicators) showing which of the files in the current working directory are in a backed up state. In other words, we can tell a user if their latest changes have been recorded on the gin servers.

I'm going to follow up this issue with some more details on the other modes and how we can handle various cases, but in the meantime feel free to add comments, suggestions, and share concerns.

