dat-wot's Introduction

Hello!

I am a software engineer living in Oakland, CA with interests in computational biology, creative coding, and sharing knowledge.

I currently work on infrastructure for genomics data analysis at Karius Diagnostics.

I've been spending some free time solving Rosalind bioinformatics problems and writing a small puzzle game in Unity involving cats.

Lately I've enjoyed programming in Rust and C!

dat-wot's People

Contributors

aschrijver, jayrbolton, karissa, okdistribute

dat-wot's Issues

super brain dump

terminology

  • metadat: this is a special behind-the-scenes dat that holds metadata about a user, device, listing of dats, or any user-to-user communication
  • public metadat: this is a metadat that holds a public listing of dats for a user, plus the user's pgp public key
  • relationship metadat: this is a metadat that holds relationship metadata for two users (eg, what private dats are shared between them)

public dat collections

In order to find and follow cool public metadats (eg "NPR's dats"), simply enter a public metadat address. With a UI, that address could be a clickable link.

following someone

To follow someone, you enter the address of their metadat, which gets saved locally/privately. Then all their public dats can be listed.

adding a contact

Adding a contact allows you to give them private access to certain dats. To create this relationship, both contacts need to initiate a contact request with the other.

On the creation of a new contact request, a new dat gets generated that will hold the relationship metadata (the "relationship metadat"). Once one contact initiates the contact request, that relationship metadat address gets encrypted using the other contact's public key and placed in the initiator's own public metadat. Once both contacts have read each other's relationship metadat addresses from each other's public metadats, the encrypted addresses can be deleted from their public metadats immediately.

For each contact, two relationship metadats should be saved, one that contact A writes to and contact B reads from, and the other for contact B to write to and contact A to read from.

Once both relationship metadats are established, then contact A can start privately sharing dats with contact B, and vice versa.
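
A rough sketch of that handshake, using Node's built-in crypto module rather than openpgp.js. Everything here is an assumption for illustration: users and public metadats are plain objects instead of real dats, the "address" is just random hex, and the function names (createUser, initiateContact, readContactRequest, clearContactRequest) are hypothetical.

```javascript
const crypto = require('crypto')

// Hypothetical user model: an RSA keypair plus a public metadat, represented
// as plain objects rather than real dats.
function createUser (name) {
  const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })
  return { name, publicKey, privateKey, publicMetadat: { pendingRequests: {} }, contacts: {} }
}

// Each contact entry tracks the metadat we write to and the one we read from.
function contactEntry (user, contactName) {
  if (!user.contacts[contactName]) user.contacts[contactName] = { writes: null, reads: null }
  return user.contacts[contactName]
}

// `from` generates a relationship metadat address, encrypts it with `to`'s
// public key, and places the ciphertext in its own public metadat.
function initiateContact (from, to) {
  const address = crypto.randomBytes(32).toString('hex') // stand-in for a dat address
  contactEntry(from, to.name).writes = address
  const encrypted = crypto.publicEncrypt(to.publicKey, Buffer.from(address, 'hex'))
  from.publicMetadat.pendingRequests[to.name] = encrypted.toString('base64')
  return address
}

// `reader` decrypts the address that `other` left for it and saves it privately.
function readContactRequest (reader, other) {
  const encrypted = other.publicMetadat.pendingRequests[reader.name]
  if (!encrypted) throw new Error('no pending request from ' + other.name)
  const address = crypto
    .privateDecrypt(reader.privateKey, Buffer.from(encrypted, 'base64'))
    .toString('hex')
  contactEntry(reader, other.name).reads = address
  return address
}

// Once the other side has read the address, remove it from the public metadat.
function clearContactRequest (owner, contactName) {
  delete owner.publicMetadat.pendingRequests[contactName]
}
```

After both sides have initiated and read, each contact entry holds one address to write to and one to read from, which matches the two relationship metadats per contact described above.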

setting permissions on dats

When you create or edit a dat, you can explicitly set which among your contacts have access to that dat. For each contact that you give access, the dat address will get saved into the relationship metadat that you have for each contact. If you make the dat public, then the dat address simply gets listed in your public metadat instead.

user groups

When you set permissions on a dat, it might be handy to give access to a whole group of contacts all at once, without having to select individually.

After a contact is established, you can simply put a group label within the relationship metadat. Each contact could request to be in a group from the other contact. When you set permission on a dat, you can select groups as well as individual contacts.
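
To make the permission and group ideas concrete, here is a small in-memory sketch. It assumes a store with one public metadat and one relationship metadat per contact, each with a list of dat addresses and a list of group labels. The names createStore and setPermissions and the option names are made up for illustration.

```javascript
// Hypothetical in-memory model. `contacts` maps a contact name to the group
// labels stored in that contact's relationship metadat.
function createStore (contacts) {
  const relationships = {}
  for (const [name, groups] of Object.entries(contacts)) {
    relationships[name] = { groups: groups.slice(), dats: [] }
  }
  return { publicMetadat: { dats: [] }, relationships }
}

// List a dat address either publicly or in the relationship metadat of every
// selected contact. A contact is selected by name or by any matching group.
// Returns the names that were granted access (or ['public']).
function setPermissions (store, address, { isPublic = false, contacts = [], groups = [] } = {}) {
  if (isPublic) {
    if (!store.publicMetadat.dats.includes(address)) store.publicMetadat.dats.push(address)
    return ['public']
  }
  const granted = []
  for (const [name, rel] of Object.entries(store.relationships)) {
    const selected = contacts.includes(name) || rel.groups.some((g) => groups.includes(g))
    if (!selected) continue
    if (!rel.dats.includes(address)) rel.dats.push(address)
    granted.push(name)
  }
  return granted
}
```

Selecting a group is then just a filter over the labels already stored per relationship, so no separate group membership list is needed in this model.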

cross-device pairing

Similar to creating a regular contact, you create a fully-trusted relationship between two devices. Any private dat address on one device would be immediately put in the "relationship dat" and listed and downloadable on the paired device (eg your laptop and your phone). This could be thought of as a kind of "full-trust device network". For each dat you create on any device, you could choose which other devices in your trust network should auto-download the full dat.

local privacy and encryption

Since everything is based around PGP keys, we could encrypt all metadata in all metadats. Even though the dats themselves are private, and are shared over SSL, if your device was stolen, then someone could find all your metadat information. Encrypting everything locally would ensure that simply stealing the device wouldn't expose all your dat addresses. All dat addresses that are shared with other contacts are encrypted using their pubkey.

messaging

Could have some simple device-to-device messaging by placing a pgp-encrypted message in your relationship metadat for someone else using their pubkey.

"forking" UI

If you download a dat and want to edit it, there could be a simple method for initializing a new dat with the edits and notifying the initial dat creator of your edited dat address, kind of like a fork.

data rendering

Say you have a dat with files in a certain format (.csv, .md, etc). When you click the file, you could choose a "renderer" for that file. The renderer could be a JS/HTML application that takes the file as input. You could simply provide the .git url of the renderer app when opening the file (and in a UI, the "renderers" would be pre-listed). This would open up potential to build open-source renderers for spreadsheets, wiki data, etc. The renderer itself does not need to be part of the dat, and can just be pulled off GitHub or similar.

encrypting arbitrary files

Since pgp is built into it, it would be easy to support pgp encryption of arbitrary files inside dats. For regular dats holding any data, metadat could handle encrypting the file and saving the encrypted version of the file to the dat, rather than any plaintext. If the dat is shared among users and groups, then metadat could automatically handle multi-key encryption of the files using the pubkeys of all the other users who have access to the dat.

Since dat addresses are private, and data is sent over an encrypted connection, the only advantage of this kind of encryption is if your device is stolen or if someone steals your login.

commenting and messaging on dats and on files within dats

Could potentially have a commenting and message-threading system on dats and on dat files, all encrypted using each user's pubkey.

Parallelize and clean up tests

The password hashing functions and some other things are slow, and the full test suite takes almost 60s to run. Maybe we could split the tests into separate files and run them in parallel. The trick will be to aggregate exit codes in all the child processes. Maybe there is a package for doing this? Possibly this: https://github.com/DiegoRBaquero/bogota
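
Without pulling in a package, the exit code aggregation can be sketched with Node's child_process module. This is only a minimal sketch: it assumes each test file is a standalone Node script that exits non-zero on failure, and runFile and runAll are hypothetical names.

```javascript
const { spawn } = require('child_process')

// Run one test file in its own Node process and resolve with its exit code.
function runFile (file) {
  return new Promise((resolve) => {
    const child = spawn(process.execPath, [file], { stdio: 'inherit' })
    child.on('error', () => resolve(1)) // the process could not be started
    child.on('close', (code) => resolve(code === null ? 1 : code)) // null means killed by a signal
  })
}

// Run all files in parallel. The aggregate exit code is 0 only if every
// child process exited with 0.
async function runAll (files) {
  const codes = await Promise.all(files.map(runFile))
  return codes.some((code) => code !== 0) ? 1 : 0
}

// Possible usage from a runner script (file names are placeholders):
// runAll(['test/a.js', 'test/b.js']).then((code) => process.exit(code))
```

Since the slow part is CPU-bound password hashing, separate processes should help more than running tests concurrently inside one process, though the output of the children will interleave with stdio set to 'inherit'.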

Migrate to libsodium

People suggest we use ECC keys instead of 2048-bit (or 4096-bit) RSA; ECC keys can be generated with libsodium.

  • is it too slow?
  • using openpgp.js has been handy; what do we lose by managing asymmetric keypairs without the OpenPGP stuff?
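
One way to get a feel for the speed question before committing to libsodium: Node's built-in crypto module can generate both RSA and Ed25519 keypairs, so key generation can be timed side by side. Note this swaps in Node's crypto as a stand-in and does not call libsodium at all; timeKeygen is a hypothetical helper and the timings will vary by machine.

```javascript
const crypto = require('crypto')

// Time one synchronous key generation and return the pair with elapsed ms.
function timeKeygen (type, options) {
  const start = process.hrtime.bigint()
  const pair = crypto.generateKeyPairSync(type, options)
  const ms = Number(process.hrtime.bigint() - start) / 1e6
  return { pair, ms }
}

const rsa = timeKeygen('rsa', { modulusLength: 2048 })
const ecc = timeKeygen('ed25519', {})
console.log('rsa-2048 keygen: ' + rsa.ms.toFixed(1) + ' ms')
console.log('ed25519 keygen: ' + ecc.ms.toFixed(1) + ' ms')

// Ed25519 keys sign and verify; the algorithm argument must be null for them.
const message = Buffer.from('hello')
const signature = crypto.sign(null, message, ecc.pair.privateKey)
const verified = crypto.verify(null, message, ecc.pair.publicKey, signature)
console.log('signature verified: ' + verified)
```

Ed25519 keys only sign, so encrypting to a contact would need a separate key agreement key (for example X25519) plus a symmetric cipher. That is part of what OpenPGP currently handles for us.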

Sharing dats privately

  • user A requests to be a contact with user B
  • both users generate new dats that hold metadata specifically for their relationship (what private dats they want to share with each other)
  • both users encrypt the addresses of these dats using each other's pubkeys and temporarily place those encrypted addresses in their own public metadats
  • each user decrypts each other's "contact metadat" address and then saves the address privately on their own device.
  • whenever user A wants to share a dat privately with user B, they save the dat address inside the "relationship dat" that they have for user B. user B reads that address and it shows up in the UI.

So the only downside is that there's a temporary file (the encrypted address) that could indicate to the public that a contact request is pending (maybe it could be pretty obfuscated though). The upside is no-one will ever be able to read the full list of your contacts, and nobody would ever have a clue about what you're sharing with whom.

Web of trust

https://en.wikipedia.org/wiki/Web_of_trust

A major initial problem with authority-less key sharing (ie adding someone as a trusted contact) is being able to verify that someone is who they say they are. PGP has a system where people can sign each other's keys, so you can verify that someone is who they say they are based on other contacts you have in common. It would be cool to have a UI that showed this web of trust for new contacts.
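
A simplified sketch of the "signed by contacts you have in common" check. It is not OpenPGP key signing: it assumes Ed25519 keys from Node's crypto module, and an endorsement is just a signature over the subject's raw public key bytes. The helper names (createIdentity, endorse, countTrustedSigners) are hypothetical.

```javascript
const crypto = require('crypto')

function createIdentity (name) {
  const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519', {})
  return { name, publicKey, privateKey }
}

// Raw bytes of a public key, used as the message that endorsers sign.
function keyBytes (publicKey) {
  return publicKey.export({ type: 'spki', format: 'der' })
}

// An endorsement: `signer` vouches for `subject`'s public key.
function endorse (signer, subject) {
  const signature = crypto.sign(null, keyBytes(subject.publicKey), signer.privateKey)
  return { signer: signer.name, signature }
}

// Count endorsements made by contacts we already trust.
// `trusted` maps a contact name to that contact's public key.
function countTrustedSigners (subjectPublicKey, endorsements, trusted) {
  const valid = new Set()
  for (const { signer, signature } of endorsements) {
    const signerKey = trusted[signer]
    if (!signerKey) continue // signed by someone we do not know
    if (crypto.verify(null, keyBytes(subjectPublicKey), signerKey, signature)) valid.add(signer)
  }
  return valid.size
}
```

A UI could show that count (and the names behind it) next to a pending contact request, so the user can judge whether enough mutual contacts vouch for the new key.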

Push based messaging.

So metadat is set up for:

  • public pull-based messaging. User A can broadcast public data into their public metadat and any followers can pull that data
  • private pull-based messaging with one-to-one and one-to-group. User A can put data in a "relationship metadat" for user B, and user B can read those updates

However, it would be nice to add push-based messaging. This would allow user A to message user B without user B knowing user A. The push messages could be handled by the receiver according to a hardcoded set of message types built into metadat (eg "initiate-contact", "send-dat", etc). Maybe this can be done with hypercore.
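
The "hardcoded set of message types" part could look something like the sketch below. The transport (hypercore or otherwise) is left out entirely; the message shape, the state object, and the handler bodies are assumptions for illustration.

```javascript
// Hypothetical handlers, keyed by the fixed message types metadat would accept.
const handlers = {
  'initiate-contact': (state, msg) => {
    state.pendingContacts.push(msg.from)
  },
  'send-dat': (state, msg) => {
    state.receivedDats.push({ from: msg.from, address: msg.address })
  }
}

// Apply one pushed message to local state. Unknown types are ignored, so a
// stranger cannot trigger anything outside the hardcoded set.
function handlePush (state, msg) {
  const known = Object.prototype.hasOwnProperty.call(handlers, msg.type)
  if (!known) return false
  handlers[msg.type](state, msg)
  return true
}
```

Because user B does not know user A yet, the receiver should treat every field of a pushed message as untrusted input; here the handlers only record the request for the user to review later.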

Group based sharing and privacy

First answer here is pretty good: https://security.stackexchange.com/questions/91704/which-strategy-to-encrypt-data-accessed-by-multiple-users

Summary of group encryption with asymmetric keys:

  • Generate a new key pair for the group and keep both keys private. One key is a write key and one is a read key (the write key is the "public key" and the read key is the "private key"---however, both are kept private)
  • Grant read access to a user by taking the group's read key and encrypting it using the user's own public key, then sending the encrypted version to them so they can store it
  • Grant write access to a user by taking the group's write key and also encrypting using a user's public key.
  • Revoke access by creating a new group key pair and re-sending all the user-encrypted keys.
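
The steps above can be sketched with Node's built-in crypto module. Some assumptions: RSA keys in PEM format stand in for PGP keys, each user gets only one kind of access, and the function names are hypothetical. One detail the summary glosses over is that a PEM-encoded key is too large to encrypt with RSA directly, so this sketch encrypts the group key with a random AES key and encrypts only that AES key with the user's public key.

```javascript
const crypto = require('crypto')

function newKeyPair () {
  return crypto.generateKeyPairSync('rsa', {
    modulusLength: 2048,
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' }
  })
}

// Encrypt a group key (a PEM string) so that only one user can recover it.
function wrapFor (userPublicKey, groupKeyPem) {
  const aesKey = crypto.randomBytes(32)
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', aesKey, iv)
  const data = Buffer.concat([cipher.update(groupKeyPem, 'utf8'), cipher.final()])
  const tag = cipher.getAuthTag()
  return { wrappedKey: crypto.publicEncrypt(userPublicKey, aesKey), iv, tag, data }
}

// Reverse of wrapFor: recover the group key with the user's private key.
function unwrap (userPrivateKey, box) {
  const aesKey = crypto.privateDecrypt(userPrivateKey, box.wrappedKey)
  const decipher = crypto.createDecipheriv('aes-256-gcm', aesKey, box.iv)
  decipher.setAuthTag(box.tag)
  return Buffer.concat([decipher.update(box.data), decipher.final()]).toString('utf8')
}

// The group's public key is the "write key" and its private key the "read key".
function createGroup () {
  const { publicKey, privateKey } = newKeyPair()
  return { writeKey: publicKey, readKey: privateKey, members: {} }
}

function keyFor (group, access) {
  return access === 'read' ? group.readKey : group.writeKey
}

// access is 'read' or 'write'; returns the encrypted key to send to the user.
function grant (group, userName, userPublicKey, access) {
  group.members[userName] = { publicKey: userPublicKey, access }
  return wrapFor(userPublicKey, keyFor(group, access))
}

// Revoke by rotating the group key pair and re-wrapping for remaining members.
function revoke (group, userName) {
  delete group.members[userName]
  const fresh = newKeyPair()
  group.writeKey = fresh.publicKey
  group.readKey = fresh.privateKey
  const boxes = {}
  for (const [name, member] of Object.entries(group.members)) {
    boxes[name] = wrapFor(member.publicKey, keyFor(group, member.access))
  }
  return boxes
}
```

Rotation only protects data written after the revoke: anything already encrypted under the old key pair stays readable by a revoked user who kept the old read key.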

Create a dat with read permissions for certain contacts

The createDat function could take an option for who you want to share it with, which could be an array of user ids. That dat address should be placed in a dats.json file inside each push-relationship dat for each contact.
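
A minimal sketch of what that option might look like. The option name shareWith, the directory layout (baseDir/relationships/userId/dats.json), and the listDat helper are all assumptions, and a random hex string stands in for actually creating a dat.

```javascript
const fs = require('fs').promises
const path = require('path')
const crypto = require('crypto')

// Append a dat address to dats.json in the relationship directory for one
// user, creating the directory and file if they do not exist yet.
async function listDat (baseDir, userId, address) {
  const dir = path.join(baseDir, 'relationships', userId)
  await fs.mkdir(dir, { recursive: true })
  const file = path.join(dir, 'dats.json')
  let dats = []
  try {
    dats = JSON.parse(await fs.readFile(file, 'utf8'))
  } catch (err) {
    if (err.code !== 'ENOENT') throw err // a missing file just means an empty list
  }
  if (!dats.includes(address)) dats.push(address)
  await fs.writeFile(file, JSON.stringify(dats, null, 2))
}

// Stand-in for the real createDat: generates a fake address and lists it for
// every user id in opts.shareWith.
async function createDat (baseDir, opts = {}) {
  const address = crypto.randomBytes(32).toString('hex')
  for (const userId of opts.shareWith || []) {
    await listDat(baseDir, userId, address)
  }
  return address
}
```

Calling createDat(baseDir, { shareWith: ['alice', 'bob'] }) would leave the same address in both contacts' dats.json files, and calling it with no options lists the dat nowhere.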

Use async throughout

This library will probably be used in other applications and so should be async. There are some synchronous fs calls.
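
As an illustration of the kind of change this means (the function names are made up, not taken from the library), a synchronous JSON read and its promise-based replacement:

```javascript
const fs = require('fs')

// Synchronous version: blocks the event loop until the file has been read.
function readJsonSync (file) {
  return JSON.parse(fs.readFileSync(file, 'utf8'))
}

// Asynchronous version: returns a promise, so the host application keeps
// handling other work while the read is in flight.
async function readJson (file) {
  return JSON.parse(await fs.promises.readFile(file, 'utf8'))
}
```

Callers have to change too, since every function that uses readJson must itself return a promise or take a callback.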

password recovery

The total non-recoverability of keypair passphrases is a little rough for user experience. Probably the biggest reason people lose private keys is that they lose the device the key is on, or the filesystem gets wiped. We could copy encrypted private keys for every device to every other device in their device network. Any UI built on top of this library should encourage users to add many devices to prevent key loss.

Device networks

I was thinking that metadat could have two types of "device networks"

  • user networks
  • group networks

A "device network" would simply be redundant devices that pull and share all the exact same stuff over the network. Since users will likely span multiple devices, you can replicate their full metadat directory over the two devices. Whenever you create a dat on one device, it gets auto-pulled and shared on the other device.

(If one device in a user device network is stolen, would we need to revoke the pgp key? The PGP private key would be encrypted using the user's passphrase, and all their data could be encrypted, so maybe it'd be ok.)

Likewise, groups may want to provide more bandwidth or storage for their dats. These "group device networks" would pull all the dats from a group, as soon as they are created for that group, and store and share the data. These group networks would not need to store any user information, but would just replicate the group dats.

Peer review use case

@lukeburns : I saw your post here and was excited to hear about your potential use cases.

I initially started this project with a UI in mind for collaborative data backups, but hopefully the public/private key sharing system could be generic enough that it could support any kind of public and private user-to-user identity and communication over dat. Check out this rough high-level overview I wrote up here: https://github.com/jayrbolton/dat-pki/wiki/How-it-Works. If you have any specific feature requests, ideas, etc for your use case, please post them here.

todos

  • read-only groups
    • create group, delete group, add/remove users to group, share dats with group
  • download dat for user from key
  • unfollow, remove contact, make dat private or revoke access from group or user
  • web of trust stuff---sign another user's pubkey and view number of signers
    • think about introduction system---ie. a trusted contact could auto-introduce you to other contacts (eg. within a group)
    • think about other classic web of trust ideas
  • delete hosted or downloaded dat for user
  • device pairing
    • add a "device pair" relationship that is a separate user account and auto-downloads all dats (or at least syncs all public/private metadata) from all other users in the device network.
    • think about this feature more. Need a handshake process for this -- probably not hard.
