datrs / hypercore

Secure, distributed, append-only log

Home Page: https://docs.rs/hypercore

License: Apache License 2.0

Rust 98.06% JavaScript 1.94%

hypercore's Issues

read-only mode

Hypercore should support not having a private key available. It currently doesn't.

Seal KeyPair type

We're forwarding the KeyPair type from another crate. We should create a wrapper type for this.

We should also be able to have a half-opened state for these - e.g. read-write vs just read
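A minimal sketch of what such a wrapper could look like, assuming 32-byte keys. `PublicKey`, `SecretKey` and the variant names are hypothetical stand-ins, not the types of the underlying crypto crate:

```rust
/// Hypothetical stand-ins for the key types of the underlying crypto crate.
#[derive(Debug, Clone, PartialEq)]
pub struct PublicKey(pub [u8; 32]);
#[derive(Debug, Clone, PartialEq)]
pub struct SecretKey(pub [u8; 32]);

/// Wrapper that keeps the foreign key types out of the public API and
/// models the "half-opened" state explicitly.
#[derive(Debug, Clone)]
pub enum KeyPair {
    /// Only the public key is known: data can be read and verified.
    ReadOnly { public: PublicKey },
    /// Both keys are known: data can also be signed and appended.
    ReadWrite { public: PublicKey, secret: SecretKey },
}

impl KeyPair {
    /// The public key is available in both states.
    pub fn public(&self) -> &PublicKey {
        match self {
            KeyPair::ReadOnly { public } | KeyPair::ReadWrite { public, .. } => public,
        }
    }

    /// The secret key is only available in the read-write state.
    pub fn secret(&self) -> Option<&SecretKey> {
        match self {
            KeyPair::ReadOnly { .. } => None,
            KeyPair::ReadWrite { secret, .. } => Some(secret),
        }
    }

    pub fn is_writable(&self) -> bool {
        self.secret().is_some()
    }
}
```

An enum like this would also cover the read-only mode requested above, since a feed without a private key is simply the `ReadOnly` variant.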

Change stack from `failure` to `std::error::Error`

Feature Request

Summary

Currently hypercore is defined over storage whose errors are failure::Error.

This design predates the current std::error::Error trait. failure::Error could now be replaced with any type that is Error + Send + Sync + 'static, for example via anyhow or snafu, or core-error for a no_std crate.

This is also one less crate on the stack.

Motivation

This should help us move to no_std in the future, which is especially useful for WASM and embedded targets.

Guide-level explanation

We could replace failure with anyhow in random-access-disk and random-access-storage to attach context to errors; anyhow is also compatible with std::error::Error.

We can then define hypercore using the std Error trait, and broaden the types of Storages hypercore can use.
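As a rough illustration of the idea, here is a self-contained sketch using only the standard library. The `RandomAccess` trait below is a simplified, hypothetical stand-in for the one in random-access-storage, and `Memory` and `OutOfBounds` are invented for the example:

```rust
use std::error::Error;
use std::fmt;

/// Simplified stand-in for a storage trait: the associated error only has to
/// implement the std `Error` trait, not be one specific error type.
pub trait RandomAccess {
    type Error: Error + Send + Sync + 'static;
    fn write(&mut self, offset: u64, data: &[u8]) -> Result<(), Self::Error>;
    fn read(&mut self, offset: u64, length: u64) -> Result<Vec<u8>, Self::Error>;
}

/// Example error type owned by one storage implementation.
#[derive(Debug)]
pub struct OutOfBounds {
    pub offset: u64,
    pub length: u64,
}

impl fmt::Display for OutOfBounds {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "read of {} bytes at offset {} is out of bounds", self.length, self.offset)
    }
}

impl Error for OutOfBounds {}

/// In-memory storage used only for this illustration.
#[derive(Debug, Default)]
pub struct Memory {
    buf: Vec<u8>,
}

impl RandomAccess for Memory {
    type Error = OutOfBounds;

    fn write(&mut self, offset: u64, data: &[u8]) -> Result<(), Self::Error> {
        let start = offset as usize;
        let end = start + data.len();
        if self.buf.len() < end {
            self.buf.resize(end, 0);
        }
        self.buf[start..end].copy_from_slice(data);
        Ok(())
    }

    fn read(&mut self, offset: u64, length: u64) -> Result<Vec<u8>, Self::Error> {
        let start = offset as usize;
        let end = start + length as usize;
        self.buf
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or(OutOfBounds { offset, length })
    }
}

/// Generic code can erase the concrete error with `Box<dyn Error>` and `?`.
pub fn roundtrip<S: RandomAccess>(storage: &mut S) -> Result<Vec<u8>, Box<dyn Error + Send + Sync>> {
    storage.write(0, b"hello")?;
    Ok(storage.read(0, 5)?)
}
```

With a bound like this, any storage whose error implements the std trait can be plugged in, which is the broadening described above.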

Reference-level explanation

Drawbacks

Increase min rust version (would be increased by #97 anyway)

Rationale and alternatives

Unresolved Questions

Upgrade sha2 to 0.8.x

This is proposed in #40 and currently blocked by dalek-cryptography/curve25519-dalek#201. This will probably be resolved at some point, so we'll just have to wait until then.

Seal Signature type

We're forwarding the Signature type from another crate. We should create a wrapper type for this.

expose the builder as hypercore::builder

Expose the builder as hypercore::builder, just like hyper::Server::builder() https://docs.rs/hyper/0.12.6/hyper/server/struct.Server.html#methods

Cannot declare a satisfiable feed reference

Question

Your Environment

Software	Version(s)
hypercore	0.9.0
Rustc	rustc 1.34.1 (fc50f328b 2019-04-24)
Operating System	linux

Question

I'm not sure whether I should ask this question here. I have tried to find help elsewhere and am writing here as a last resort.
I'm new to Rust and have been picking up the language at a steady pace.
But there's one issue that I've been struggling with to the point of insanity.

How do you store a hypercore::Feed in a struct or define a function that can take a feed as a parameter?

// What is the correct type declaration for feed? 
pub struct MyStruct {
  m_feed: Feed<RandomAccess>,
}; 

pub fn (a_feed: &Feed<RandomAccess>) { ... }

// Or even slightly modifying the example from the docs:
let path = PathBuf::from("./my-first-dataset");
let mut feed: TYPE_DECLARATION = Feed::new(&path).unwrap();

I've literally spent a week trying all kinds of exotic declarations to satisfy the compiler's trait bounds, but to no avail.

Again humble apologies for asking this here, please help!
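A sketch of the general pattern, using hypothetical stand-in types so that it compiles on its own. The bound on `Feed<T>` mirrors the shape `T: RandomAccess<Error = Error> + Debug` that appears in the compiler output of another issue on this page; with the real crates, `T` has to be a concrete storage type that implements the trait (or a generic parameter carrying the same bound), not the trait name itself:

```rust
use std::fmt::Debug;

// Hypothetical stand-ins for the real error type, storage trait and feed.
#[derive(Debug)]
pub struct Error;

pub trait RandomAccess {
    type Error;
}

#[derive(Debug, Default)]
pub struct Memory;

impl RandomAccess for Memory {
    type Error = Error;
}

#[derive(Debug)]
pub struct Feed<T>
where
    T: RandomAccess<Error = Error> + Debug,
{
    storage: T,
    len: usize,
}

impl<T> Feed<T>
where
    T: RandomAccess<Error = Error> + Debug,
{
    pub fn with_storage(storage: T) -> Self {
        Feed { storage, len: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn storage(&self) -> &T {
        &self.storage
    }
}

// Option 1: name one concrete storage type.
pub struct MemoryOwner {
    pub feed: Feed<Memory>,
}

// Option 2: stay generic and repeat the bound.
pub struct Owner<T>
where
    T: RandomAccess<Error = Error> + Debug,
{
    pub feed: Feed<T>,
}

// Functions taking a feed need the same bound.
pub fn feed_len<T>(feed: &Feed<T>) -> usize
where
    T: RandomAccess<Error = Error> + Debug,
{
    feed.len()
}
```

`Feed<RandomAccess>` is rejected because `RandomAccess` is a trait, not a type; the type parameter has to be whichever storage type the constructor actually returns.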

Building latest version fails

version 0.11.0 does not build.

Stacktrace:

   Compiling hypercore v0.11.0
error[E0277]: the trait bound `random_access_memory::RandomAccessMemory: random_access_storage::RandomAccess` is not satisfied
   --> /Users/dylan/.cargo/registry/src/github.com-1ecc6299db9ec823/hypercore-0.11.0/src/feed.rs:572:6
    |
30  | pub struct Feed<T>
    |            ---- required by a bound in this
31  | where
32  |     T: RandomAccess<Error = Error> + Debug,
    |        --------------------------- required by this bound in `feed::Feed`
...
572 | impl Default for Feed<RandomAccessMemory> {
    |      ^^^^^^^ the trait `random_access_storage::RandomAccess` is not implemented for `random_access_memory::RandomAccessMemory`
    |
help: trait impl with same name found
   --> /Users/dylan/.cargo/registry/src/github.com-1ecc6299db9ec823/random-access-memory-1.2.0/src/lib.rs:55:1
    |
55  | / impl RandomAccess for RandomAccessMemory {
56  | |   type Error = Box<dyn std::error::Error + Send + Sync>;
57  | |
58  | |   fn write(&mut self, offset: u64, data: &[u8]) -> Result<(), Self::Error> {
...   |
203 | |   }
204 | | }
    | |_^
    = note: perhaps two different versions of crate `random_access_storage` are being used?

How do I read an existing hypercore on disk?

Question

How do I read an existing hypercore on disk?

Your Environment

Software	Version(s)
hypercore	0.10.0
Rustc	rustc 1.37.0 (eae3437df 2019-08-13)
Operating System	macOS 10.12.6 (16G1815)

Question

I was doing some initial evaluation of rust hypercore. One of the things I wanted to try was to read existing hypercores, such as the ones written by mafintosh/hypercore. However, I could not figure out how to do that. Even when passing in the directory of an existing hypercore, the .len() method returns 0.

Context

I want to check compatibility with js hypercore and also compare read performance.

Merge with component repos

This issue is a proposal for merging the hypercore repo with its dependency libraries from the datrs project (and keeping them as separate crates if necessary). The merged dependency libraries would be the following:

flat-tree
memory-pager
merkle-tree-stream
random-access-disk
random-access-memory
random-access-storage
sleep-parser
sparse-bitfield
tree-index

There are a couple of benefits in doing this and not a lot of downsides that I know of. In short, it would facilitate faster development with fewer breakages and easier understanding of the whole project.

On the plus side:

it would be trivial to check whether a change breaks any of the dependent libs. Currently, you need to find out which libraries these are, change their Cargo.toml to use your locally modified library and run the tests. With this proposal, it would be a matter of running cargo test in the repo root.
rewrites touching several libs could be done in one single commit (e.g., moving a function from one library to another)
dependency versions would be consistent for all included libraries/crates (one Cargo.lock file in the repo root)

As for the disadvantages, these are what I could think of:

github issues of dependency libraries would be lost. These can be migrated, but the issue references in commits would point to the wrong issue.
every push will start a Travis build for the whole project (hypercore and its dependencies). Note that this can also be regarded as a benefit because the Travis runtime is dominated by rustup, the compilation of clippy, rustfmt and other dependencies, and these would be done only once, not for each library.

All in all, I think it's a net positive.

Why do it in the hypercore repo? It seems to me that this is the first layer of the Dat Project which should be usable on its own. The dependencies of this project are implementation details. Some of those could be usable on their own, so it might be worthwhile to keep them as separate crates. And I guess the least controversial move is to keep them as separate crates.

To keep the dependencies as separate crates, the workspace feature of cargo can be used: https://github.com/rust-lang/rfcs/blob/master/text/1525-cargo-workspace.md.
Here is a project using this feature: https://github.com/gfx-rs/gfx

What do you think?

If you think it might be worthwhile, I can do the merge in a test repo where anyone can try how it would work before committing to this repo overhaul.

Implement feed.audit()

Feature Request

Summary

Implement feed.audit() to verify all data you think you have is actually there.
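To make the idea concrete, here is a simplified sketch. It is not the real algorithm: it uses the standard library's `DefaultHasher` as a stand-in for the actual cryptographic hash, and the `AuditReport` type and the slice-based inputs are hypothetical:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in checksum; a real audit would use the feed's cryptographic hash.
pub fn checksum(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, Default, PartialEq)]
pub struct AuditReport {
    pub valid_blocks: usize,
    pub invalid_blocks: usize,
}

/// `blocks[i]` is `None` when block `i` is not stored locally;
/// `hashes[i]` is the checksum recorded for block `i`.
pub fn audit(blocks: &[Option<Vec<u8>>], hashes: &[u64]) -> AuditReport {
    let mut report = AuditReport::default();
    for (block, expected) in blocks.iter().zip(hashes) {
        match block {
            // Not stored locally, so there is nothing to verify.
            None => {}
            Some(data) if checksum(data) == *expected => report.valid_blocks += 1,
            Some(_) => report.invalid_blocks += 1,
        }
    }
    report
}
```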

Motivation

This gives us parity with holepunchto/hypercore#180.

implement as OO

Was thinking, a cleaner way of implementing many of the things we have in-source right now would be by using traits / structs a bit better. Currently we have the following problems:

crypto is not aware of the structs in storage that it needs to sign.
the struct in crypto/key_pair is unaware of methods that are available in crypto/mod.
we don't have a first-class Hash struct yet.
Signature lives in storage, but is relevant to crypto too.

Something like the following:

Traits

Encrypt: using a KeyPair, sign data. Alt name: Sign.
Hash: hash data using BLAKE2b into a Hash.
Store: persist data to a Storage instance.

Structs

Node: tree data that is stored.
Bitfield: the main bitfield instance (this should stay as an internal module)
Signature: data hash.
Storage: in charge of writing values to a random-access-storage instance.
KeyPair: set of keys to sign things.
Feed: the external facing API.

Storage API

It'd be good to have the Store trait to implement a method that can write to a Storage instance.

trait Store {
  fn from_bytes(index: usize, buf: &[u8]) -> Self;
  fn to_vec(&self) -> Vec<u8>;
  fn store<T>(&self, index: usize) -> Result<(), Error>;
}

The read and write methods can then just call the .from_bytes() and .to_vec() methods respectively. Persisting to disk can then be done by calling out to .store().

By having it in a trait we should be able to simplify the Storage module a bunch. We'll still need dedicated methods for reading / writing each type, but the hope is that we'll be able to do away with some of the inline magic numbers (e.g. 40, 32) and move them to be part of the corresponding structs.
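A sketch of how the magic numbers could move into the types, assuming a node is encoded as a 32-byte hash followed by an 8-byte big-endian length (40 bytes in total). The `ENCODED_LEN` constant, the `Option` return type and the field names are assumptions made for this example, and the `.store()` method is left out:

```rust
/// Hypothetical variant of the trait with the encoded size as an associated constant.
pub trait Store: Sized {
    /// Encoded size in bytes, replacing inline magic numbers.
    const ENCODED_LEN: usize;
    fn from_bytes(index: usize, buf: &[u8]) -> Option<Self>;
    fn to_vec(&self) -> Vec<u8>;
}

#[derive(Debug, PartialEq)]
pub struct Node {
    pub index: usize,
    pub hash: [u8; 32],
    pub length: u64,
}

impl Store for Node {
    // Assumed layout: 32-byte hash followed by an 8-byte big-endian length.
    const ENCODED_LEN: usize = 40;

    fn from_bytes(index: usize, buf: &[u8]) -> Option<Self> {
        if buf.len() != Self::ENCODED_LEN {
            return None;
        }
        let mut hash = [0u8; 32];
        hash.copy_from_slice(&buf[..32]);
        let mut length = [0u8; 8];
        length.copy_from_slice(&buf[32..]);
        Some(Node { index, hash, length: u64::from_be_bytes(length) })
    }

    fn to_vec(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(Self::ENCODED_LEN);
        out.extend_from_slice(&self.hash);
        out.extend_from_slice(&self.length.to_be_bytes());
        out
    }
}
```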

Why Encrypt & Hash traits?

I think it's pretty nice to have a separate submodule dedicated to crypto. By creating traits rather than functions, we can create a looser relationship between our crypto internals, and the way we provide data to the crypto functions.

Plan 0.12 release

This is a tracking issue for what I think we should merge for a breaking 0.12 release soonish:

breaking changes

Move to async/await #103
Move to generic std::error::Error #102
Erase generic type argument from hypercore #113

non-breaking changes

New signatures for Hypercore 9 compat #112
bugfix: put_data #110

I think it would be great to get those merged sooner rather than later, because at least the first three all change the public and internal API, so all other PRs often have to be updated manually.

Because we don't have, I think, anyone using hypercore-rs, and because there is no one "properly" maintaining this at the moment, I propose that we let this issue sit here for a couple of days or a week and then go ahead, merge the PRs, and figure out bugs and issues in the process.

Also very open to other suggestions :)

update random-access-*

Feature Request

Summary

Update to the most recent versions of the random-access-* packages

Motivation

Always good to keep up-to-date
RandomAccess trait contains len function, which seems to be required by #72

Unresolved Questions

Is there some reason this hasn't been done already? Are these packages in a state of flux? Are there any known issues which make this hard or impossible?

Implement peers / network

hypercore now works pretty well, it's time to implement the networking part.

Todos

Bonus

implement pub feed.cancel() (ref)
implement pub feed.clear() (ref)
implement pub feed.seek() (ref)

Skipping

implement pub feed.download() (ref)
implement pub feed.undownload() (ref)

Resources

Feed.put with data yields weird results

I was looking into coupling hypercore-protocol-rs to hypercore today a bit.

However, I was unable to put any data into a cloned hypercore. It seems the put method either has a bug or I did not understand how to use it correctly. I'll open a PR with a failing test.

use rust ed25519

https://github.com/dalek-cryptography/ed25519-dalek instead of the sodium lib

Async API on hypercore

Feature Request

Summary

The design of hypercore on Node is heavily callback-oriented: operations complete at some later point. We could provide an async API that feels more idiomatic in Rust than callbacks.

Motivation

Translating the callback-oriented operations from dat.js to Rust seems to call for a more asynchronous API. Now that async/await support is stable, it seems ideal to provide such an API, as we would be dealing with file and network operations.

This would be a breaking change as I don't know yet how to propose maintaining both the sync and async API on the same crate.

Guide-level explanation

We could have an API such as:

let mut feed = hypercore::open("./feed.db").await?;

feed.append(b"hello").await?;
feed.append(b"world").await?;

assert_eq!(feed.get(0).await?, Some(b"hello".to_vec()));
assert_eq!(feed.get(1).await?, Some(b"world".to_vec()));

This would require changing the underlying storage to benefit from async operations as well.

Reference-level explanation

Async traits are not yet stable, but we could use https://crates.io/crates/async-trait to implement it.

We could implement on https://github.com/datrs/random-access-storage a blanket implementation:

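#[async_trait]
trait AsyncRandomAccess {
  type Error;
  async fn async_open(&mut self) -> Result<(), Self::Error>;
}

// Blanket implementation for every type that implements RandomAccess.
#[async_trait]
impl<T: RandomAccess + Send> AsyncRandomAccess for T {
  type Error = T::Error;
  async fn async_open(&mut self) -> Result<(), Self::Error> {
    // Sketch only: spawn_blocking needs a 'static closure, so borrowing `self` like this would not compile as-is.
    async_std::task::spawn_blocking(move || self.open()).await
  }
}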

This is how far I've thought about it and have a concrete example. I would need to test it out further, but I would like to open for feedback.

Drawbacks

How to maintain both an async and a sync API?
We would require all consumers to also use an async executor if we don't expose both APIs.
Async traits are not stable yet, so this requires an extra dependency on the stack.

Rationale and alternatives

Not doing this would mean that networking might not be as performant as it could, as access to file would block the operations.

Unresolved Questions

Many:

How to maintain both an async and a sync API?
- Should we keep only an async API, like Node?
- Is it ok to require an async executor?
Is it ok to add async_trait as a dependency?
Async traits are not stable yet, so this requires an extra dependency on the stack.

Remove NodeTrait

Having to expose the NodeTrait just to be able to use the methods on the Node type feels leaky. We should create a "sealed" and "unsealed" Node type, where one acts as a front for the other.

We can do this by creating a facade using https://github.com/chancancode/rust-delegate, which should allow us to seal the trait requirements internally without leaking it.

We should probably name one SealedNode and the other Node, using Node exclusively throughout our code to prevent the sealed version from leaking.
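A hand-written sketch of such a facade, without the delegate crate. The methods on `NodeTrait` and the fields of `SealedNode` are hypothetical and only serve to show the shape:

```rust
mod sealed {
    /// Internal trait; callers outside this module never need to import it.
    pub trait NodeTrait {
        fn index(&self) -> usize;
        fn length(&self) -> u64;
    }

    #[derive(Debug)]
    pub struct SealedNode {
        pub index: usize,
        pub length: u64,
    }

    impl NodeTrait for SealedNode {
        fn index(&self) -> usize {
            self.index
        }
        fn length(&self) -> u64 {
            self.length
        }
    }
}

/// Public facade: exposes inherent methods, so users do not need the trait in scope.
#[derive(Debug)]
pub struct Node(sealed::SealedNode);

impl Node {
    pub fn new(index: usize, length: u64) -> Self {
        Node(sealed::SealedNode { index, length })
    }

    pub fn index(&self) -> usize {
        sealed::NodeTrait::index(&self.0)
    }

    pub fn length(&self) -> u64 {
        sealed::NodeTrait::length(&self.0)
    }
}
```

The forwarding methods are the boilerplate that a delegation macro would generate.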

README is incorrect about what the sample code prints

Bug Report

Your Environment

Software	Version(s)
hypercore	0.9.0
Rustc	rustc 1.35.0-nightly (237bf3244 2019-03-28)
Operating System	Ubuntu 18.04

Expected Behavior

Program prints

hello
world

Current Behavior

Program prints

Ok(Some([104, 101, 108, 108, 111]))
Ok(Some([119, 111, 114, 108, 100]))

Code Sample

extern crate hypercore;

use hypercore::Feed;
use std::path::PathBuf;

let path = PathBuf::from("./my-first-dataset");
let mut feed = Feed::new(&path).unwrap();

feed.append(b"hello").unwrap();
feed.append(b"world").unwrap();

println!("{:?}", feed.get(0)); // prints "hello"
println!("{:?}", feed.get(1)); // prints "world"
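One way the sample could produce readable text is to decode the bytes before printing. The helper below is a hypothetical sketch that only assumes the `Result<Option<Vec<u8>>, E>` shape visible in the output above; it does not depend on the crate itself:

```rust
/// Converts a fetched block into text, replacing invalid UTF-8 with U+FFFD.
pub fn block_to_string<E>(block: Result<Option<Vec<u8>>, E>) -> Result<Option<String>, E> {
    Ok(block?.map(|bytes| String::from_utf8_lossy(&bytes).into_owned()))
}
```

Applied to the first value above, `block_to_string` yields `Ok(Some("hello"))`, which can then be unwrapped and printed with `{}` instead of `{:?}`.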

Read key pair from disk

Hi, I was reading the code and stumbled onto the comment that said "@todo Read key pair from disk". It doesn't look too hard to implement, but I had a few questions:

Do you already know from where you'd want to read the keys? My guess is that it should use the Storage, not sure!
When the keypair doesn't exist (is generated), should it be written to disk (to the Storage?) as well?

Maybe it's too early in the development to implement this though :).

Cheers.

Dynamic dispatch for storage?

Hi,
I've been thinking about whether it would make sense to remove the generic Storage<T> field on the Feed struct and instead have a Box<dyn DynStorage> object on the Feed. DynStorage would be a trait with all public methods of Storage<T>. Thus, Feed<T> would just be Feed.

Benefits:

More ergonomic to work with in code that uses Feed. This becomes more apparent on the branch that removes failure (#102), where the generic type argument becomes really long and has to be carried through all code that uses Feed.
Allows combining both in-memory and disk-backed Feeds in a container. Currently, that isn't possible.

Drawbacks:

Cost of dynamic dispatch

I would guess that compiler optimizations might often reduce the cost of dynamic dispatch if you only use one implementation in your code paths, but I'm not sure about it.
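A small self-contained sketch of the proposal. The methods on `DynStorage` and both storage types are hypothetical; the second one merely stands in for a disk-backed storage so that the example has no dependencies:

```rust
use std::fmt::Debug;

/// Hypothetical object-safe trait exposing what `Feed` needs from its storage.
pub trait DynStorage: Debug {
    fn append(&mut self, data: &[u8]);
    fn len(&self) -> usize;
}

#[derive(Debug, Default)]
pub struct MemoryStorage {
    blocks: Vec<Vec<u8>>,
}

impl DynStorage for MemoryStorage {
    fn append(&mut self, data: &[u8]) {
        self.blocks.push(data.to_vec());
    }
    fn len(&self) -> usize {
        self.blocks.len()
    }
}

/// Stand-in for a disk-backed storage; it only counts blocks.
#[derive(Debug, Default)]
pub struct FakeDiskStorage {
    written: usize,
}

impl DynStorage for FakeDiskStorage {
    fn append(&mut self, _data: &[u8]) {
        self.written += 1;
    }
    fn len(&self) -> usize {
        self.written
    }
}

/// `Feed` no longer carries a type parameter.
#[derive(Debug)]
pub struct Feed {
    storage: Box<dyn DynStorage>,
}

impl Feed {
    pub fn new(storage: Box<dyn DynStorage>) -> Self {
        Feed { storage }
    }
    pub fn append(&mut self, data: &[u8]) {
        self.storage.append(data);
    }
    pub fn len(&self) -> usize {
        self.storage.len()
    }
}
```

Because `Feed` is a single type here, feeds with different backends can share one `Vec<Feed>`, which is the second benefit listed above.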

Opinions?

Running benchmarks fails

$ rustup toolchain list
stable-x86_64-unknown-linux-gnu (default)
1.47.0-x86_64-unknown-linux-gnu

And

$ rustc --test -O benches/bench.rs 
error[E0670]: `async fn` is not permitted in the 2015 edition
  --> benches/bench.rs:10:1
   |
10 | async fn create_feed(page_size: usize) -> Result<Feed<RandomAccessMemory>, Error> {
   | ^^^^^ to use `async fn`, switch to Rust 2018
   |
   = help: set `edition = "2018"` in `Cargo.toml`
   = note: for more on editions, read https://doc.rust-lang.org/edition-guide

error: expected one of `!`, `)`, `,`, `.`, `::`, `?`, `{`, or an operator, found keyword `move`
  --> benches/bench.rs:12:28
   |
12 |         |_| Box::pin(async move { Ok(RandomAccessMemory::new(page_size)) }),
   |                            ^^^^ expected one of 8 possible tokens

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0670`.

But in the toml file there's already

edition = "2018"

why is it being ignored?

Dependabot can't resolve your Rust dependency files

Dependabot can't resolve your Rust dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

    Updating crates.io index
error: failed to select a version for the requirement `curve25519-dalek = "^0.20"`
  candidate versions found which didn't match: 1.2.1, 1.1.4, 1.1.3, ...
  location searched: crates.io index
required by package `ed25519-dalek v0.8.1`
    ... which is depended on by `hypercore v0.9.0 (/home/dependabot/dependabot-updater/dependabot_tmp_dir

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

You can mention @dependabot in the comments below to contact the Dependabot team.

Remove Feed Builder in favor of Feed simple create function

hypercore/src/feed_builder.rs

Lines 14 to 18 in 482f491

 /// Construct a new `Feed` instance.
 // TODO: make this an actual builder pattern.
 // https://deterministic.space/elegant-apis-in-rust.html#builder-pattern
 #[derive(Debug)]
 pub struct FeedBuilder<T>

The link referenced shows a crate that offers a macro to automatically derive builders for structs. It seems like it should be relatively easy to use derive_builder with Feed. Is the intention to use this macro, or write a custom builder?
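For comparison, a hand-written builder could look roughly like the sketch below. The `page_size` and `sparse` options and their defaults are hypothetical and chosen only to show the chaining style; they are not taken from the crate:

```rust
#[derive(Debug, PartialEq)]
pub struct Feed {
    pub page_size: usize,
    pub sparse: bool,
}

#[derive(Debug)]
pub struct FeedBuilder {
    page_size: usize,
    sparse: bool,
}

impl Feed {
    /// Entry point in the style of `Type::builder()`.
    pub fn builder() -> FeedBuilder {
        FeedBuilder { page_size: 1024, sparse: false }
    }
}

impl FeedBuilder {
    /// Each setter consumes and returns the builder so calls can be chained.
    pub fn page_size(mut self, page_size: usize) -> Self {
        self.page_size = page_size;
        self
    }

    pub fn sparse(mut self, sparse: bool) -> Self {
        self.sparse = sparse;
        self
    }

    /// Finishes the builder and produces the configured value.
    pub fn build(self) -> Feed {
        Feed { page_size: self.page_size, sparse: self.sparse }
    }
}
```

A derive macro would generate the setters automatically, while a custom builder leaves room for validation inside `build`.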

refactor binary interface types from `usize` to their specific sizes

Feature Request

Summary

In some places usize is being used in situations where it would be better to explicitly define what bit size we want to use. One example can be found here:

hypercore/src/storage/node.rs

Lines 53 to 54 in c7c8757

 // TODO: This will blow up on 32 bit systems, because usize can be 32 bits.
 let length = reader.read_u64::<BigEndian>()? as usize;

Motivation

The crate is currently built with a 64 bit word size in mind, this will cause issues when run on a 32-bit system

Explanation

Review all uses of usize, evaluate whether u64 would be more appropriate, change the types, and fix any compiler errors that occur. (Optional: build and run on a 32-bit system to check compatibility.)
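A minimal sketch of the pattern using only the standard library. The function names are hypothetical; the point is to keep the on-disk value as `u64` and to convert to `usize` with a checked conversion only where memory is indexed:

```rust
use std::convert::TryFrom;

/// Decodes a big-endian length as a fixed-width u64, independent of the platform.
pub fn read_length(buf: &[u8]) -> Option<u64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(..8)?);
    Some(u64::from_be_bytes(bytes))
}

/// Converts to usize where needed and fails instead of silently truncating.
pub fn to_index(length: u64) -> Option<usize> {
    usize::try_from(length).ok()
}
```

On a 32-bit target, `to_index` returns `None` for values above `u32::MAX`, where an `as usize` cast would silently truncate.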

Unresolved Questions

Is this something that should be done now? Are there valid reasons to postpone this work? Are there any other relevant issues that have not been described in this ticket?

🙋 Android bindings

Hi there (not sure if this is the appropriate repo for this)

Thanks for the project, it seems to be moving quite well. The idea of a dat component in Rust is quite nice as it could reach many devices.

One platform that interests me is Android. I've been looking at staltz/dat-installer, which is quite a nice example of an Android app that would benefit a lot from having datrs available as a lib. I would like to help with providing such capabilities for Android, but I'm not sure how to help.

I have no experience on bindings and Android NDK's FFI, but I'm interested to be able to use dat on mobile applications. A bonus point would be to provide a Flutter plugin which would allow cross-platform development as well.

What would be the next steps for this?

Possible issue with signing data.

Bug Report

Your Environment

Software	Version(s)
hypercore	Master Branch
Rustc	N/A
Operating System	N/A

Expected Behavior

If I understand correctly the sign function in hypercore/crypto should be using the secret/private key to sign the data rather than the public key.

Current Behavior

From what I can tell, it currently uses the public key rather than the private key.

Code Sample

https://github.com/datrs/hypercore/blob/master/src/crypto/key_pair.rs#L15-L22

Networking: NOISE handshake and transport encryption

For a working Dat2 / hypercore 8 implementation in Rust, we'll have to implement the NOISE handshaking and transport encryption.

I'll document what I found out while looking into this.

I don't think there's any "official" documentation about the NOISE handshake and transport encryption in hypercore yet. Looking into the code reveals:

The handshake pattern used is Noise_XX_25519_XChaChaPoly_BLAKE2 (see noise-protocol/handshake-state and simple-hypercore-protocol/handshake). The handshake uses a state machine module called simple-handshake, which then uses noise-protocol, a Javascript implementation of some parts of the NOISE spec that uses sodium-native for the actual crypto.
The handshake messages are sent over the binary stream with a varint length prefix (see simple-hypercore-protocol/lib/handshake).
During the handshake, the payload being transmitted from each side is a protocol buffers encoded message with a 24 byte nonce (random bytes), created in simple-hypercore-protocol/index.js
After finishing the handshake, the transport encryption is not done by noise-protocol (in other NOISE impls, there's usually a function to switch from the handshake phase directly into transport phase, where the cipher type is the same in transport mode as in the handshake).
Instead, the transport is encrypted with XSalsa20 directly in simple-hypercore-protocol with sodium.crypto_stream_xor (see simple-hypercore-protocol/lib/xor.js and the sodium docs), with instances for receiving and transmitting, where for each the keys are coming from the split result of the NOISE handshake and the nonces are the payloads that were transmitted during the handshake. sodium.crypto_stream_xor resolves to crypto_stream_xsalsa20_xor which, in sodium-native, uses crypto_stream_xsalsa20
The messages in transport mode are not length-delimited. Each buffer is decrypted as it floats in.
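The length-prefixed framing of the handshake messages can be sketched as follows. This assumes the common LEB128-style unsigned varint (7 data bits per byte, high bit set on every byte except the last), as used by protocol buffers; the function names are hypothetical:

```rust
/// Appends `value` as an unsigned LEB128-style varint.
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Returns the decoded value and the number of bytes consumed.
pub fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A u64 needs at most 10 varint bytes.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Prefixes a message with its length.
pub fn frame(msg: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(msg.len() + 2);
    encode_varint(msg.len() as u64, &mut out);
    out.extend_from_slice(msg);
    out
}

/// Splits one complete message off the front of `buf`, returning (message, rest).
/// Returns `None` while the buffer does not yet hold a whole frame.
pub fn unframe(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let (len, used) = decode_varint(buf)?;
    let end = used.checked_add(len as usize)?;
    let msg = buf.get(used..end)?;
    Some((msg, &buf[end..]))
}
```

Only the handshake phase would use this framing, since the transport-mode messages described above are not length-delimited.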

I tried to connect to a Node.js hypercore-protocol stream from Rust, but I hit a few roadblocks. I started with snow, as it seems to be the most complete NOISE implementation in Rust. The following is, I think, what's missing to make connecting to a hypercore-protocol stream work:

snow doesn't support the XChaChaPoly cipher. I created an issue for that. There's a rust impl of XChaChaPoly in chacha20poly1305.
In a quick test where I added XChaChaPoly to snow, I still couldn't handshake with a hypercore-protocol stream (decryption error on either side); this needs to be investigated more.
snow doesn't allow switching cipher types between the handshake phase and the transport phase, nor does it currently allow accessing the split results directly.
