Dat

Dat is the package manager for datasets. Share files with version control, back up data to servers, browse remote files on demand, and automate long-term data preservation. Secure, distributed, fast.

Table of Contents

  • Installation
  • Getting Started
  • Using Dat
  • Troubleshooting
  • For Developers
  • What is Dat?

    The Dat Project is the home to open source data sharing applications led by Code for Science & Society, a grant-funded nonprofit. The Dat Project developed the Decentralized Archive Transport (Dat) protocol, which transfers files over a secure, distributed, and fast network so you can focus on the fun work without worrying about moving files around.

    Key features

    • Secure - Data is encrypted upon transfer and the content is verified on arrival. Prevents third-party access to metadata and content. Learn more.
    • Transparent - Changes to data are written in an append-only log, creating a version history that improves transparency and auditability.
    • Distributed - With the Dat protocol you'll connect directly to other users or servers sharing or downloading common datasets. Any device can host files to share without the need for centralized servers. Read more.
    • Future-proof - Unique links are generated using a public key and thus can be used instantly and forever to verify the dataset from anywhere. You don't need to wait for the entire archive to be hashed before you can begin uploading to peers.
    • Fast - Files download from multiple sources. Quickly sync updates by only downloading the new bytes, saving time and bandwidth.
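To make the "Transparent" and "Fast" points concrete, here is a minimal sketch of an append-only log. It is a toy model, not Dat's actual storage format: the `AppendOnlyLog` class and its method names are made up for illustration. It shows how a log that only grows gives you a version history for free, and how a peer that already holds the first few entries only needs the new ones.

```javascript
// Toy append-only log: entries can be added but never changed or removed.
// Illustrative only; this is not Dat's on-disk format.
class AppendOnlyLog {
  constructor () {
    this.entries = []
  }

  // Record a change and return the new version number (1-based).
  append (path, bytes) {
    this.entries.push(Object.freeze({ path, bytes }))
    return this.entries.length
  }

  // Rebuild the file listing as it looked at a given version.
  checkout (version) {
    const files = new Map()
    for (const entry of this.entries.slice(0, version)) {
      files.set(entry.path, entry.bytes)
    }
    return files
  }

  // Entries a peer still needs if it already holds `haveVersion` entries.
  since (haveVersion) {
    return this.entries.slice(haveVersion)
  }
}

const log = new AppendOnlyLog()
log.append('data.csv', 120)
log.append('readme.txt', 40)
log.append('data.csv', 150)
console.log(log.checkout(2).get('data.csv')) // 120
console.log(log.checkout(3).get('data.csv')) // 150
console.log(log.since(2).length) // 1
```

Because old entries are never rewritten, checking out version 2 still returns the earlier size of `data.csv` even after version 3 has been appended.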

    Installation

    Visit our site for an installation guide or pick your favorite client application:

    • Dat Command Line - You are here! Scroll down for the installation details.
    • Dat Desktop - A desktop app to manage multiple Dats on your desktop machine.
    • Beaker Browser - An experimental p2p browser with built-in support for the Dat protocol.
    • Dat Protocol - Build your own application on the Decentralized Archive Transport (Dat) protocol.
    • require('dat') - Node.js library for downloading and sharing Dat archives.

    Dat Command Line

    Share, download, and back up files with the command line! Automatically sync changes to datasets. Never worry about manually transferring files again.

    Have questions or need some guidance? You can chat with us in IRC on #dat or Gitter!

    Command Line Installation

    The Dat command line tool can be installed with npm. Make sure you have node version 4 or above and npm installed. You can run node -v and npm -v to check!

    Need to install Node? Start here.

    Install dat from npm with the --global, -g option:

    npm install -g dat
    

    You should be able to run the dat command now. If not, see the installation troubleshooting for tips.

    Getting started

    We have Dat installed, let's use it! Currently, the Dat command line always moves files from a single source to any number of destinations. If you are creating files on your computer to share, you will be the source and will use the share commands. Otherwise, you can clone files from remote Dat archives shared to you with a Dat link.

    You can mostly get around in the Dat command line world with two commands:

    dat share <dir>

    Use dat share to send files from your computer to any number of other computers or servers. This will share a local <dir> and print a dat:// link. Send the printed link to other users so they can download your files.

    Dat share will create a .dat directory in the folder you share. The .dat folder allows you to add new files and update old files on the same Dat link.

    Do not share the .dat folder with anyone: it contains a private key that allows you to update and add files.

    dat clone dat://<link> <download-dir>

    Use dat clone to download files from a remote computer sharing files with Dat. This will download the files from dat://<link> to your <download-dir>. The download exits after it completes, but you can still update the files later.

    A Brief Primer on dat:// links

    You may have seen Dat links around: dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93. What is with the weird long string of characters? Let's break it down!

    dat:// - the protocol

    The first part of the link is the link protocol, Dat (read about the Dat protocol at datprotocol.com). The protocol describes what "language" the link is in and what type of applications can open it.

    ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93 - the unique identifier

    The second part of the link is a 64-character hex string (an ed25519 public key, to be precise). Each Dat archive gets a public key link to identify it. With the hex string as a link we can do two things: 1) encrypt the data transfer and 2) give each archive a persistent identifier, an ID that never changes, even as files are updated (as opposed to a checksum, which is based on the file contents).

    dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93

    All together, the links can be thought of similarly to a web URL, as a place to get content, but with some extra special properties. Links point to a set of files instead of a specific server. This means when you run dat clone dat://<link> you do not have to worry about who is hosting the files at that link or if the content has changed. You'll always get the latest content in the network and the link helps to verify the integrity of the content!

    Try out dat clone with the link above to read more about the protocol!
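The two parts of a link described above can be pulled apart with a few lines of Node.js. This is only a sketch: `parseDatLink` is a hypothetical helper, not something the Dat CLI exports, and it only accepts the plain `dat://` plus 64 hex characters form shown in this section.

```javascript
// Split a Dat link into its protocol and key parts.
// parseDatLink is a hypothetical helper, not part of the Dat CLI.
function parseDatLink (link) {
  const match = /^dat:\/\/([0-9a-f]{64})$/i.exec(link.trim())
  if (!match) return null
  const key = match[1].toLowerCase()
  return {
    protocol: 'dat:',
    key, // 64 hex characters
    keyBytes: Buffer.from(key, 'hex') // the same key as 32 raw bytes
  }
}

const parsed = parseDatLink('dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93')
console.log(parsed.key.length) // 64
console.log(parsed.keyBytes.length) // 32
console.log(parseDatLink('dat://not-a-key')) // null
```

Sixty-four hex characters encode 32 bytes, which is the size of an ed25519 public key.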

    Demo File Download

    To get started using Dat, we can download files via Dat. Similar to git, you do this by running dat clone:

    dat clone dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93 datprotocol-website
    
    Cloning Dat Archive: /Users/joe/Desktop/datprotocol-website
    Link: dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93
    
    Metadata: [==================================================>] 100%
    Content:  [=====>---------------------------------------------] 11%
    Total size: 8 files (43.3 kB)
    
    2 peers on the Dat Network
    Downloading: 2.68 kB/s
    

    This will download the files shared at that link to a folder named datprotocol-website. These files are being shared by a server over Dat (to ensure high availability) but you may connect to any number of peers also hosting the content.

    Get started using Dat today with the share and clone commands or read below for more details.

    Usage

    Dat archives have a one-to-many relationship. There is a single source that can create and write files. There are many peers that can download the files (in future versions there may be several sources).

    • Sharing: If you want to share files from your computer that you will update, you are the source.
    • Downloading: If your colleague has files they want to share, they will be the source and you'll be downloading from a remote archive.

    The first time you run a command, a .dat folder is created to store the Dat metadata. Once a Dat is created, you can run all the commands inside that folder, similar to git.

    Sharing

    The quickest way to get started sharing files is to share:

    > dat share
    
    Syncing Dat Archive: /Users/joe/Desktop/datproject-website
    Link: dat://b44a53f3dcad90349ba743e21fca4869cd3fb79d9f8b55a556af04e5ad49bb79
    
    Archive update finished! Sharing latest files.
    Total Size: 51 files (3.23 MB)
    
    Looking for connections in Dat Network...
    

    You can also run create and sync as separate steps if you'd like more control over the importing.

    Creating a Dat archive

    dat create [<folder>] [--import]
    

    Create a new Dat Archive in the current directory (or specify dir). Will automatically import the files in that directory to the archive.

    • Use --import to import files after the archive is created.

    Syncing to Network

    dat sync [<folder>] [--no-import] [--no-watch]
    

    Start sharing your Dat Archive over the network. Sync will import new or updated files since you last ran create or sync. Sync watches files for changes and imports updated files.

    • Use --no-import to not import any new or updated files.
    • Use --no-watch to not watch directory for changes. --import must be true for --watch to work.

    Snapshot

    A snapshot archive shares a folder with specific content. Content can only be added once and then the link is created. Once a snapshot archive has a link, it cannot be updated.

    dat snapshot [<folder>]
    

    Snapshot will create the archive in snapshot mode ({live: false}).

    Downloading

    Start downloading by running the clone command. This will create a folder, download the content and metadata, and create a .dat folder. Once you have started the download, you can resume it using clone or the other download commands.

    dat clone <dat-link> [<folder>] [--temp]
    

    Clone a remote Dat Archive to a local folder. This will create a folder named after the key if no folder is specified.

    Options

    --temp: Creates a temporary database and does not save the metadata to disk.

    Updating Downloaded Archives

    Once a Dat is cloned, you can run either dat pull or dat sync in the folder to update the archive.

    dat pull [<folder>]
    

    Update a cloned Dat Archive to latest files and exit.

    dat sync [<folder>]
    

    Download latest files and keep connection open to continue updating as remote source is updated.

    Shortcut commands

    • dat <link> {dir} will run dat clone for new dats or resume the existing dat in dir
    • dat {dir} is the same as running dat sync {dir}
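The two shortcut rules can be read as a small decision table. The sketch below spells that table out in Node.js. It is not the CLI's real argument parser: `expandShortcut` and the `dirHasDat` callback are hypothetical names, and the sketch only recognizes links that start with `dat://`.

```javascript
// Work out which full command a shortcut stands for.
// expandShortcut and dirHasDat are hypothetical; the real CLI may resolve
// its arguments differently.
function expandShortcut (args, dirHasDat) {
  const [first, second] = args
  if (/^dat:\/\//.test(first)) {
    // dat <link> {dir}: clone a new dat, or resume the one already in dir
    return {
      action: dirHasDat(second) ? 'resume' : 'clone',
      link: first,
      dir: second
    }
  }
  // dat {dir}: the same as dat sync {dir}
  return { action: 'sync', dir: first }
}

// Stub callbacks stand in for "does this folder already contain a .dat?"
console.log(expandShortcut(['dat://<link>', 'data'], () => false).action) // clone
console.log(expandShortcut(['dat://<link>', 'data'], () => true).action) // resume
console.log(expandShortcut(['data'], () => true).action) // sync
```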

    Dat Registry and Authentication

    As part of our Knight Foundation grant, we are building a registry for Dat archives. We will be running a Dat registry at datproject.org, but anyone will be able to create their own. Once registered, you will be able to publish Dat archives from our registry. Anyone can clone archives published to a registry without registration:

    dat clone datproject.org/karissa/more-tweets-more-votes
    

    Auth (experimental)

    Other auth commands are still in an experimental status. New registrations on the Dat archive registry are currently limited.

    dat register
    dat login
    dat logout
    dat whoami
    

    Once you are logged in to a server, you can publish a Dat archive:

    cd my-data
    dat create
    dat publish
    

    All authentication requests take the --server option. You can deploy your own compatible registry server if you'd rather use your own service.

    Upgrading from version 11

    If you are familiar with the Dat CLI version 11, there will only be minor changes to use dat version 12.

    To share files with Dat, you'll need to specify the share command now.

    cd my-folder
    dat share
    

    After the Dat archive is created, you can run dat sync in the folder to share updates or use the same command as the old dat: dat my-folder.

    To download files, you can use dat clone which will do the same thing as dat <link> did before:

    dat clone dat://<link> download-folder
    

    Once the initial download is complete, you can use dat pull inside the folder to update or use the same command as the old dat: dat download-folder.

    Troubleshooting

    We've provided some troubleshooting tips based on issues users have seen. Please open an issue or ask us in our chat room if you need help troubleshooting and it is not covered here.

    If you have trouble sharing/downloading in a directory with a .dat folder, try deleting it and running the command again.

    Check Your Dat Version

    Knowing the version is really helpful if you run into any bugs, and will help us troubleshoot your issue.

    Check your Dat version:

    dat -v
    

    You should see the Dat semantic version printed, e.g. 11.1.2.

    Installation Issues

    Node & npm

    To use the Dat command line tool you will need to have node and npm installed. Make sure those are installed correctly before installing Dat. You can check the version of each:

    node -v
    npm -v
    

    Global Install

    The -g option installs Dat globally allowing you to run it as a command. Make sure you installed with that option.

    • If you receive an EACCES error, read this guide on fixing npm permissions.
    • If you receive an EACCES error, you may also install dat with sudo: sudo npm install -g dat.
    • Have other installation issues? Let us know, you can open an issue or ask us in our chat room.

    Debugging Output

    If you are having trouble with a specific command, run it with the DEBUG environment variable set to dat (and optionally also dat-node). This will help us debug any issues:

    DEBUG=dat,dat-node dat clone dat://<link> dir
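If you launch dat from a Node.js script rather than typing the command yourself, the same variable can be passed through the child process environment. This is a sketch under assumptions: `debugEnv` and `runWithDebug` are hypothetical helper names, and the script expects a `dat` executable on your PATH.

```javascript
const { spawn } = require('child_process')

// Copy an environment and add the DEBUG namespaces used in the command above.
// debugEnv and runWithDebug are hypothetical helper names.
function debugEnv (baseEnv, namespaces = ['dat', 'dat-node']) {
  return Object.assign({}, baseEnv, { DEBUG: namespaces.join(',') })
}

// Start a dat command with debug output enabled.
function runWithDebug (args) {
  // stdio: 'inherit' sends the debug lines straight to your terminal
  return spawn('dat', args, { env: debugEnv(process.env), stdio: 'inherit' })
}

console.log(debugEnv({}).DEBUG) // dat,dat-node
```

Calling `runWithDebug(['clone', 'dat://<link>', 'dir'])` would then correspond to the command shown above.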
    

    Networking Issues

    Networking capabilities vary widely with each computer, network, and configuration. Whenever you run a Dat there are several steps to share or download files with peers:

    1. Discovering Peers
    2. Connecting to Peers
    3. Sending & Receiving Data

    When a connection succeeds, Dat will show Connected to 1 peer. If you never see a peer connected, your network may be restricting discovery or connection. Try running the dat doctor command (see below) on both of the computers that are not connecting. This will help troubleshoot the networks.

    • Dat may have issues connecting if you are using iptables.

    Dat Doctor

    We've included a tool to identify network issues with Dat, the Dat doctor. The Dat doctor will run two tests:

    1. Attempt to connect to a server running a Dat peer.
    2. Attempt a direct connection between two peers. You will need to run the command on both the computers you are trying to share data between.

    Start the doctor by running:

    dat doctor
    

    For direct connection tests, the doctor will print out a command to run on the other computer, dat doctor <64-character-string>. The doctor will run through the key steps in the process of sharing data between computers to help identify the issue.


    JS API

    You can use Dat in your JavaScript application:

    var Dat = require('dat')
    
    Dat('/data', function (err, dat) {
      // use dat
    })

    Read more about the JS usage provided via dat-node.
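As a slightly fuller sketch of what "use dat" might look like, the snippet below shares a folder and reports its link. It assumes the dat-node methods `importFiles()` and `joinNetwork()` and the `key` Buffer on the returned object; these names come from dat-node's documentation and may differ between versions, so check them against the version you have installed. `toLink` and `shareFolder` are hypothetical helper names.

```javascript
// Turn an archive key (a Buffer) into a dat:// link.
function toLink (key) {
  return 'dat://' + key.toString('hex')
}

// Share a folder and report its link.
// Assumes dat-node's importFiles(), joinNetwork() and key, as noted above.
function shareFolder (dir, done) {
  const Dat = require('dat') // loaded lazily so toLink works without it
  Dat(dir, function (err, dat) {
    if (err) return done(err)
    dat.importFiles() // add the files in dir to the archive
    dat.joinNetwork() // start looking for peers
    done(null, toLink(dat.key))
  })
}
```

A call such as `shareFolder('/data', function (err, link) { ... })` would hand the link to your callback once the archive has been opened.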

    For Developers

    Please see guidelines on contributing before submitting an issue or PR.

    This command line library uses dat-node to create and manage the archives and networking. If you'd like to build your own Dat application that is compatible with this command line tool, we suggest using dat-node.

    Installing from source

    Clone this repository and in a terminal inside of the folder you cloned run this command:

    npm link
    

    This should add a dat command to your PATH. Now you can run the dat command to try it out.

    The contribution guide also has more tips on our development workflow.

    • npm run test to run tests
    • npm run auth-server to run a local auth server for testing

    License

    BSD-3-Clause
