generouslabs / brainstorming-encrypted-git

Brainstorming how to build encrypted git remotes on top of isomorphic-git

License: GNU Affero General Public License v3.0

brainstorming-encrypted-git's Issues

Encrypt every ref and object into a "new" repository

Following this comment, I'm breaking this out into a separate issue.

Idea: Encrypt each ref and object from the source repo as a file in the encrypted repo

When pushing the unencrypted repo:

Create .git/encrypted
- cd .git/encrypted && git init && git remote add ... && cd ../..
Encrypt the output of git show-ref
- Save this into .git/encrypted/refs
Iterate over each ref and recursively over every linked object
- Encrypt the output of git cat-file $ref
- Save this into .git/encrypted/objects/xx/xx/xxxx
Add a commit with all changes from above inside .git/encrypted
- cd .git/encrypted && git add . && git commit -m updates && git push
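
A minimal sketch of the per-object write step above, in Python. The function names, the in-memory `objects` mapping, and the injected `encrypt` callable are all hypothetical; the only detail taken from the steps is the objects/xx/xx/xxxx layout. No real cipher is chosen here, so any authenticated encryption function could be passed in.

```python
from pathlib import Path
from typing import Callable, Dict, List


def encrypted_object_path(oid: str) -> Path:
    """Map an object id to the objects/xx/xx/xxxx layout described above."""
    return Path("objects") / oid[:2] / oid[2:4] / oid[4:]


def write_encrypted_objects(
    objects: Dict[str, bytes],
    encrypt: Callable[[bytes], bytes],
    root: Path,
) -> List[Path]:
    """Encrypt each object and store it as one file under root.

    `objects` maps object id -> raw object bytes (assumed to have been
    collected already by walking the refs). `encrypt` is a placeholder for
    whatever cipher the design settles on.
    """
    written = []
    for oid, raw in sorted(objects.items()):
        target = root / encrypted_object_path(oid)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(encrypt(raw))
        written.append(target)
    return written
```

With one file per object, a later push only adds files for new objects, so the commit inside .git/encrypted stays proportional to what changed.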

Pulling an encrypted repo would then look like:

Create .git/encrypted
- cd .git/encrypted && git init && git remote add ... && git pull
Iterate over every object
- find objects/ -type f
- Decrypt the object
- Copy the object into the parent git object store
Decrypt .git/encrypted/refs and recreate the refs
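
The pull side could be sketched in Python as the reverse walk. Again the names are hypothetical and `decrypt` is an injected placeholder rather than a chosen cipher. The sketch assumes the object id is recovered by joining the path segments under objects/, and that the decrypted refs file has the "<oid> <refname>" line format that git show-ref prints.

```python
from pathlib import Path
from typing import Callable, Dict


def read_encrypted_objects(
    root: Path, decrypt: Callable[[bytes], bytes]
) -> Dict[str, bytes]:
    """Walk root/objects (like `find objects/ -type f`) and decrypt each file."""
    objects_dir = root / "objects"
    objects = {}
    for path in sorted(objects_dir.rglob("*")):
        if not path.is_file():
            continue
        # objects/ab/cd/ef... -> "abcdef..."
        oid = "".join(path.relative_to(objects_dir).parts)
        objects[oid] = decrypt(path.read_bytes())
    return objects


def parse_refs(show_ref_output: str) -> Dict[str, str]:
    """Parse decrypted `git show-ref` output into {refname: oid}."""
    refs = {}
    for line in show_ref_output.splitlines():
        if not line.strip():
            continue
        oid, name = line.split(" ", 1)
        refs[name] = oid
    return refs
```

Writing the decrypted objects into the parent object store and recreating the refs is left out, since that depends on whether git itself or isomorphic-git does the writing.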

Brainstorming privacy designs

Following #1, what privacy tradeoffs make sense?

git-remote-gcrypt obscures everything by encrypting the packfiles and replacing the remote's single commit with a new one on each push. This is very far towards the privacy end of the spectrum, but introduces trade-offs.

For mobile-first applications that use git to store data, what trade-offs would make sense?

How does git-remote-gcrypt work?

This issue is to track discussion around git-remote-gcrypt and whether we could achieve interop with its encryption format.

Having dug into the code, read the docs, and run some local tests, I think it works like this:

The git objects and pack files get encrypted
Their encrypted filename is the hash of their contents
The keys to encrypt those files are put into a "manifest" file
- This file is always called 91bd0c092128cf2e60e1a608c31e92caf1f9c1595f83f2890ef17c0e4881aa0a
The encrypted pack file, object files, maybe some others, and the manifest are put into a directory
This directory is then converted into a git repository with a single commit
- Put differently, if you push this repo to something like GitHub you get a repo with a single commit, with a fixed, hardcoded commit author, time, and a list of files which are the encrypted contents
- There is no commit history visible, and only one commit
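
To illustrate the naming and manifest ideas in the list above, here is a rough Python sketch. It is not git-remote-gcrypt's actual format: SHA-256 is an assumption (the manifest filename above is 64 hex characters, which matches a SHA-256 digest), and the JSON manifest layout and both function names are hypothetical.

```python
import hashlib
import json
from typing import Dict


def content_address(ciphertext: bytes) -> str:
    """Name an encrypted file after the hash of its contents (SHA-256 assumed)."""
    return hashlib.sha256(ciphertext).hexdigest()


def build_manifest(keys_by_file: Dict[str, bytes]) -> str:
    """Serialize a {filename: key} mapping; the manifest itself would then be encrypted.

    JSON is used purely for illustration, not because gcrypt uses it.
    """
    return json.dumps(
        {name: key.hex() for name, key in sorted(keys_by_file.items())},
        indent=2,
    )
```

Because the filenames are derived from ciphertext, the host learns nothing about the original object ids, and the manifest is the only place where keys and filenames are tied together.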

Effectively, a git hosting service is used as a store of the latest "encrypted" git repository. As a result, every push must upload the whole commit history again, because from GitHub's perspective there is only ever a single commit.

Pros

Strong privacy, not much info is leaked
Not even commit history, frequency, etc, is leaked
Long standing codebase that's probably been well tested

Cons

Brutal uploads for each push, especially harsh on mobile
Doesn't use any "git"ness of the host
Slightly complex to package up and unpack

Building a git remote helper

I have a working prototype now that takes a git repo and encrypts it by encrypting each object independently. I can push data to it and pull data back. That much works.

https://github.com/GenerousLabs/git-remote-encrypted

Now, I'm looking at how it would work if implemented as a git remote helper.

Firstly, reading this: https://git-scm.com/docs/gitremote-helpers

Then this is a very helpful guide: https://rovaughn.github.io/

Which in turn links to: https://github.com/git/git/blob/master/t/t5801/git-remote-testgit

The basics are:

The helper gets invoked with 2 arguments
- First: The remote name or its URL if there is no name
- Second: The remote URL
git passes data to the stdin of the helper and reads from the stdout
- Commands arrive one per line, and a batch is terminated by a blank line (\n\n)
- git sends a capabilities command first
- We probably want to support fetch and push only at first
- This actually means supporting several commands
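
The stdin/stdout exchange above could be sketched in Python roughly as follows. This is a minimal sketch, not a complete helper: `run_helper`, `_reply`, and the `on_batch` callback are hypothetical names, only the capabilities command and the batching of fetch/push lines are handled, and treating a blank line outside a batch as the end of the session is an assumption. Other commands git may send (such as list) simply raise here.

```python
from typing import Callable, List, TextIO

# We only want fetch and push at first, per the notes above.
CAPABILITIES = ["fetch", "push"]


def _reply(stdout: TextIO, lines: List[str]) -> None:
    """Write response lines followed by the terminating blank line."""
    for line in lines:
        stdout.write(line + "\n")
    stdout.write("\n")
    stdout.flush()


def run_helper(
    stdin: TextIO,
    stdout: TextIO,
    on_batch: Callable[[List[str]], List[str]],
) -> None:
    """Read commands from git line by line and answer on stdout.

    `on_batch` is a hypothetical callback that receives a full batch of
    fetch or push lines and returns the response lines for it.
    """
    batch: List[str] = []
    for raw in stdin:
        line = raw.rstrip("\n")
        if line == "capabilities":
            _reply(stdout, CAPABILITIES)
        elif line.startswith(("fetch ", "push ")):
            batch.append(line)  # batches end at the next blank line
        elif line == "":
            if not batch:
                break  # assumption: a lone blank line ends the session
            _reply(stdout, on_batch(batch))
            batch = []
        else:
            raise NotImplementedError(line)
```

A real helper would be an executable named git-remote-<something> on the PATH that calls this with sys.stdin and sys.stdout, and whose `on_batch` does the actual encryption and transfer.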

That's about as much as I've understood thus far. Some additional useful reading:
