
swarm's Issues

Mock datastore for synchronization tests

To conduct fast synchronization tests, it would be useful to implement a mock datastore which does not actually store data.
A central storage component would just record which chunk is available at which node; there would be no real data involved.
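A minimal sketch of such a mock datastore in Go (all type and method names here are illustrative, not the actual swarm storage API):

```go
package main

import (
	"errors"
	"fmt"
)

// mockStore is a hypothetical in-memory index for sync tests: it
// records which node claims to have which chunk, but never stores the
// chunk payload itself.
type mockStore struct {
	// chunk key (hex) -> set of node IDs that have the chunk
	have map[string]map[string]bool
}

func newMockStore() *mockStore {
	return &mockStore{have: make(map[string]map[string]bool)}
}

// Put registers that node holds the chunk with the given key;
// the chunk data is deliberately dropped.
func (m *mockStore) Put(node, key string, _ []byte) {
	if m.have[key] == nil {
		m.have[key] = make(map[string]bool)
	}
	m.have[key][node] = true
}

// Has reports whether node has registered the chunk.
func (m *mockStore) Has(node, key string) bool {
	return m.have[key][node]
}

// Get fails by design: a mock store has no payloads.
func (m *mockStore) Get(key string) ([]byte, error) {
	return nil, errors.New("mock store holds no data")
}

func main() {
	s := newMockStore()
	s.Put("node-a", "c0ffee", []byte("ignored"))
	fmt.Println(s.Has("node-a", "c0ffee")) // true
	fmt.Println(s.Has("node-b", "c0ffee")) // false
}
```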

Swap with no chequebook contract

Let's make swap work without a chequebook contract, so people can start to use swarm without holding any ether.
If the node provides an account address instead of a chequebook contract, let's just send ether there directly.

swarm needs a gateway mode (transparent proxy that rewrites bzz:// URLs)

Context:
You can run swarm on a local node and see swarm content on http://localhost:8500 or you could access swarm via a gateway. These require different HTML content than if you have a bzz:// URL handler installed in your browser.

Problem:
Swarm is meant to serve HTTP content for web3 dapps and serverless websites such as our swarm homepage theswarm.eth. In the HTML of that website we want to have URLs of the form

bzz://theswarm.eth/images/some-image.png

but at the moment we are using the relative URL

bzz:/theswarm.eth/some-image.png

which loads

http://localhost:8500/bzz:/theswarm.eth/images/some-image.png
or
http://swarm-gateways.net/bzz:/theswarm.eth/images/some-image.png

This is not good. We cannot require the content of swarm-hosted websites to be adapted in this way. We need the following:

  1. Default behaviour of all swarm hosted websites should be to use bzz:// URLs.

2a. It should be possible to run swarm with a --gateway flag which adds a transparent proxy to all swarm content, replacing bzz:// with http://localhost:8500/bzz:/ or with http://gateway-url/bzz:/.

2b. Swarm should not come with a gateway mode itself, but come bundled with an nginx config file that achieves the same as 2a. We should then describe this method in documentation and here: https://ethereum.stackexchange.com/questions/8187/how-to-run-a-swarm-gateway


Some notes:
An alternative is to do the re-writing client side, but then you cannot surf swarm content without a browser plugin / bzz:// handler
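As a sketch of the rewrite that either the --gateway flag (2a) or an nginx rule (2b) would have to perform (the function name and behaviour here are assumptions for illustration, not an existing swarm API):

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteBzzURLs is a minimal sketch of the server-side rewrite a
// gateway mode would perform: every absolute bzz:// URL in an HTML
// body is replaced with a link that resolves on the given gateway.
func rewriteBzzURLs(html, gateway string) string {
	return strings.ReplaceAll(html, "bzz://", gateway+"/bzz:/")
}

func main() {
	page := `<img src="bzz://theswarm.eth/images/some-image.png">`
	fmt.Println(rewriteBzzURLs(page, "http://localhost:8500"))
	// <img src="http://localhost:8500/bzz:/theswarm.eth/images/some-image.png">
}
```

A real proxy would apply this to text/html responses only; an nginx deployment could achieve the same with response-body substitution.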

Negotiate on chunk price in link

Upon connection, nodes should agree on a swap chunk price.
Minimal implementation: both peers offer a price; if the offers are the same, connect, otherwise do not connect.
Use a default price.
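The minimal implementation above could look roughly like this (names and the wei denomination are assumptions):

```go
package main

import (
	"errors"
	"fmt"
)

// defaultChunkPrice is an illustrative default, denominated in wei.
const defaultChunkPrice uint64 = 100

// negotiateChunkPrice sketches the minimal handshake rule: each side
// offers a price; if the offers match, the connection proceeds at
// that price, otherwise it is refused.
func negotiateChunkPrice(local, remote uint64) (uint64, error) {
	if local != remote {
		return 0, errors.New("chunk price mismatch: refusing connection")
	}
	return local, nil
}

func main() {
	price, err := negotiateChunkPrice(defaultChunkPrice, defaultChunkPrice)
	fmt.Println(price, err) // 100 <nil>
	_, err = negotiateChunkPrice(defaultChunkPrice, 200)
	fmt.Println(err != nil) // true
}
```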

p2p/simulations: event journal posting to p2p layer

(reminder)

System information

Geth version: NA
OS & Version: any/tests

Expected behaviour

Journal events for msg and node start/stop in simulation should be triggered by callback in p2p layer.

Actual behaviour

It's now triggered in the simulation layer.

SIP - Changes in manifest handling (the trailing slash problem and other issues)

Manifest traversal, from trailing slashes to over/under matched paths has been a headache for a while. We have gotten bugs, we have gotten unexpected behaviour, we have gotten confused.

This issue is created as a placeholder for the following discussion:

"Manifests should treat / as a special character and should always break on / and not on any substring."

Calculate swap in ether instead of chunk

Current swap code calculates the swap balance in number of chunks.
Instead, peers should agree on a chunk price during handshake (see issue #223) and do the accounting in ether from then on.

Implement intervals

Intervals are meant to store the chunk index intervals which are already fetched. When a peer disconnects it can happen that we lose an interval and historic syncing will have to fetch it on the next connect.
Intervals should be saved every time a batch is done (Client.BatchDone), before the takeover proof is sent back (if there is one).
This is partially implemented and commented out, we need to finish the implementation and write tests for it.

We need two different types of syncer streams: historical and session syncing. Between full storer nodes both will be present in both directions (so 4 streams between the two nodes). Implement these in the syncer and extend the Subscribe message to specify whether the stream is a historical or a session syncer.
Implement functional tests for historical syncing using intervals
Implement functional tests for historical syncing across sessions using persisted intervals
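A sketch of the interval bookkeeping described above, merging ranges on overlap or adjacency (types and names are illustrative, not the partially implemented code):

```go
package main

import "fmt"

// interval is a closed range [Start, End] of chunk indexes that have
// already been fetched.
type interval struct{ Start, End uint64 }

// intervals keeps a sorted list of disjoint, non-adjacent ranges.
type intervals struct{ ranges []interval }

// Add records [start, end] as fetched, merging it with any existing
// ranges it overlaps or touches.
func (iv *intervals) Add(start, end uint64) {
	merged := interval{start, end}
	var out []interval
	for _, r := range iv.ranges {
		switch {
		case r.End+1 < merged.Start: // r is strictly before, keep it
			out = append(out, r)
		case merged.End+1 < r.Start: // r is strictly after, emit merged
			out = append(out, merged)
			merged = r
		default: // overlap or adjacency: absorb r into merged
			if r.Start < merged.Start {
				merged.Start = r.Start
			}
			if r.End > merged.End {
				merged.End = r.End
			}
		}
	}
	iv.ranges = append(out, merged)
}

func main() {
	var iv intervals
	iv.Add(1, 5)
	iv.Add(10, 12)
	iv.Add(6, 9) // bridges the gap
	fmt.Println(iv.ranges) // [{1 12}]
}
```

Persisting this structure at every Client.BatchDone, before the takeover proof is sent, would let historical syncing resume from the recorded gaps after a disconnect.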

Fix discovery tests

fix network/simulations/discovery_test.go and rework it using p2p/testing/simulations

reduce the number of errors in http access

  • the GET request /favicon.ico should return our favicon bzz:/22481deec05d53e909e4f3933842686113927c67ab2a22c8ad5614e4e3dc505c/favicon.ico
  • the GET request for robots.txt should return a valid robots.txt
  • the default behaviour of any manifest should be to redirect to the entry for '/' and what was previously called the --defaultpath should be the entry at '/'
  • the GET request for just '/' (as opposed to bzz:/ etc.) should return an HTTP 200 and some minimal status info, not an error about an invalid protocol https://github.com/ethereum/go-ethereum/blob/master/swarm/api/http/server.go#L555
  • manifests without a default path should default to the behaviour that ?list=true induces. (Or at least it should be possible to enable this behaviour on the local node)
  • the GET request to /status should output a status page showing sync status, connected peers, disk usage etc. etc. ... possibly leading in future to a configuration assistant?
  • The configuration file should be populated with default values or should be edited manually. Currently, empty entries in the config file are overwritten by any command line arguments. This leads to funky config files if you get an argument wrong the first time you run swarm. For example, if I change httpaddr to '' in my config and then run swarm with --httpaddr doofus, then 'doofus' will be stuck in my config until I change it manually.

Syncer in mock storage mode

Allow syncer to operate in mock storage mode, i.e., no need to pass the chunk data

Chunk validity check should be abstracted (different for content chunk, resource chunk, test-mock-storage): #263

Syncer tests should pass, see: #242

Create a custom 502 page for the gateway

In our swarm-gateways.net cluster, we run a nginx server reverse proxying to swarm nodes. If swarm nodes can't be reached, nginx displays a standard 502 Bad Gateway error message.

The gateway should have a nicer and more informative custom page.

Rename swarm deb package

The Ubuntu package for go-ethereum conflicts with an existing package which also has an executable named swarm. bzz is also taken.

Rename to:

  • bzz (taken)
  • go-swarm
  • gbzz
  • go-bzz-
  • bzzd
  • bee: after all, you control/start/configure a bee of the swarm

Default manifest behavior

The default behaviour of any manifest should be to redirect to the entry for '/' and what was previously called the --defaultpath should be the entry at '/'

Fix overlay simulation in swarm-network-rewrite

From @holisticode on Gitter:
The overlay simulation may not be covered in the current rewrite branch; it is easily missed because it is a standalone main executable.
It may be a good idea to add a test for it if we want to keep the overlay simulation.
Please also add it to CI, to keep it working.

Http status page for swarm nodes

We need a status page on the http interface, e.g. on /status

Show the information gathered from the status API endpoint: #243

Maybe this can be merged with the admin page: #159

Need proper URL handlers for browser

We need an easy way to teach browser about bzz:// bzz-raw:// bzz-list:// bzz-hash:// and bzz-immutable:// URLs.

There has been a URL handler in the repository since forever, but it does not work correctly.

This is also important because at the moment we use relative links "bzz:/theswarm.eth/a/b/c" in our HTML when we should be using absolute links "bzz://theswarm.eth"

One of these works on the gateway, the other works with proper URL handling... what should the correct solution be?

Http admin page for swarm nodes

Closely connected to the status page issue #158

The difference from the status page is that the admin page should be interactive: it should contain parameters which can be updated.

It doesn't necessarily have to be a separate page from the status page.

Examples

  • update storage size
  • upload file

Possible issues

Some parameters are not updatable on the fly in the swarm implementation. So this issue is not just UI development; some refactoring is possibly needed in swarm too.

We have to take care of security; perhaps there is an "internal" domain, for example bzz://upload.internal and bzz://status.internal (a bit like chrome://settings/).

Clean up swarm documentation

Parts of the documentation are really old and confusing. To get started, here is a list of some things that could be changed:

  • remove any nohup commands and any input redirects (i.e. no more of this madness: 2>> $DATADIR/swarm.log < <(echo -n "MYPASSWORD") & )
  • remove variables such as $DATADIR - they only serve to confuse those who are not used to the command line and don't really help those who are.
  • Remove any reference to /tmp/BZZ datadirs and instead use a proper standard datadir (.ethereum/swarm/ ?)
  • remove documentation about running your own blockchain (under "testing swap") - that's confusing and out of scope.
  • remove any --verbosity 6 from the docs!
  • update the enode addresses to correspond to our current cluster or take them out entirely.
  • remove any networkid 322 still in the docs (we are on networkid 3 now and should switch to 1 soon)
  • make sure all configuration parameters are correct in the documentation and document the config file generation - https://swarm-guide.readthedocs.io/en/latest/runninganode.html#general-configuration-parameters
  • remove ENS registration docs that are better handled in the ENS docs themselves and only focus on how to add "content" to an ENS resolver contract.
  • introduction.rst Fix 0.4 release date (POC 0.4 expected in Q2 2018. -- this is wrong), fix pss gitter link, fix About / This document links, Remove reference to Swatch gitter (no activity on over 1 year), fix all the links in section "roadmap and resources"
  • installation - check which version of Go we need. We say 1.7 or later is preferred, but I suspect we require even later.
  • installation - does sudo apt install golang work, or is that not going to get us the version of go we need?
  • connecting to swarm (simple) - do not use the env var. As written "open another terminal window and connect to Swarm with" will not work.
  • why is swarm up not part of "How do I upload and download?" in the "simple guide". Perhaps refactor?
  • connecting to swarm (advanced) - let's delete everything up to the configuration section. no? We can keep a few snippets such as "how to manually add enodes" but the rest is superfluous as it follows from the documentation of the configuration options.
  • check if "setting up swap" section is still accurate - I have not tested it in a loooong time
  • apropos swarm up - are we renaming the swarm binary for the next release? because we should change the docs accordingly.
  • decide whether to keep the section "Content Retrieval Using a Proxy"
  • (mutable) resource updates: "infer" -> "imply"... and generally this needs reformulation because I have trouble parsing it.
  • replace references to "theswarm.test" with "theswarm.eth"
  • ENS remove all but the first two paragraphs and maybe add a note that ENS registered swarm content must be prefixed by 0x.
  • PSS section needs work
  • FUSE documentation - the actual how-to of mounting a manifest is missing.
  • Architecture section - let's delete it. It doesn't have to be part of the user docs. This is not the right place for it.

Network rewrite streamer/syncer related issues

network rewrite streamer phase breakdown of tasks

  • persistent stateStore implementation for kademlia table
  • there should be a test in discovery checking that the persisted peer set (1) is found and loaded and (2) bootstraps a healthy kademlia without any connections, and (3) benchmarking the reduction of the time to reach health
  • allow syncer to operate in mock storage mode, i.e., no need to pass the chunk data
  • chunk validity check should be abstracted (different for content chunk, resource chunk, test-mock-storage)
  • chunk interface
  • after the request fails it should be removed from the @memstore
  • request repeat if downstream disconnects
  • streamer API, syncer API. used in sim test => simplifies tests
  • write in batches https://github.com/ethersphere/go-ethereum/blob/swarm-gateways-db-fixes/swarm/storage/dbstore.go
  • for proper handling of waiting for storage #179 dbStored field should be lock protected
  • possibly combine with https://github.com/ethersphere/go-ethereum/blob/swarm-db-sync-fix-pyramid and test
  • streamer protocol message exchange unit tests using p2p/testing
  • request/delivery functional tests using p2p/simulations chain of nodes
  • per-bin syncer functional tests using p2p/simulations chain of nodes
  • fix network/simulations/discovery_test.go and rework it using p2p/testing/simulations
  • implement unsubscribe message and method
  • implement subscribe error responses
  • implement Client Close (just like Server Close) and call it in f568ef1#diff-524f169c33e854ca57a6e668ce319a9bR345
  • pass Live bool flag similarly to Key
  • adapt syncer iterator to new syncer
  • save intervals - needs to be concurrent for racing mode light client download request
  • upstream peer sends sessionAt on subscribe
  • functional tests for historical syncing using intervals
  • functional tests for historical syncing across sessions using persisted intervals
  • ~db purge/delete triggers provable syncer~
  • cached deliveries should not enter syncpool?
  • light node client/server streamers for requests
  • adapt livepeer streams
  • rework network/stream/testing as p2p/testing/simulation.go unless we want to keep it in our own yard
  • implement subscribe request for light node upload
  • dbstore: export/import to be adapted; dump, reindex, cleanup removed
  • write syncer tests with netsim and mock storage for old syncers

multihash support in swarm

We are often asked about multihash support.

We should have a proper discussion about pros and cons at least. I suggest this issue shall be the place for that discussion until someone formulates this as an EIP (or is it SIP?)

chunker modifications

  • use context for abort etc
  • eliminate wait groups from API
    • storage should be waited on by default, if not needed caller starts split in go routine
    • processors quitting not needed to be waited on
  • backend for progress bar (unclear how to combine split progress with disk storage progress)
  • chunk encryption API
  • API for shannonian obfuscation for plausible deniability
  • API for erasure coding. two modes of operation with join:
    • cheap/slow mode: retrieve first n hashes of intermediate chunk, fallback to parity chunks only if some not found.
    • fast mode: n out of m race

Don't give error if hashes are prefixed with 0x

The hash for the site theswarm.eth is currently 2c2d2adb8fd0cba399282fb59f8219e5fbbd67ba06fcf5c8d343f5eb1c8be022

It is documented that if you are setting a content hash in ENS and you submit that hash, you are making an error. The correct hash to submit to ENS is 0x2c2d2adb8fd0cba399282fb59f8219e5fbbd67ba06fcf5c8d343f5eb1c8be022

Conversely, I have just discovered that calling the 0x hash on bzz gives an error too
See for example:
http://swarm-gateways.net/bzz:/0x2c2d2adb8fd0cba399282fb59f8219e5fbbd67ba06fcf5c8d343f5eb1c8be022/

Surely this can be fixed.

Possible solutions:

    • If you encounter a bzz:/0xHASH request, redirect HTTP 301 to bzz:/HASH
    • Serve the same content at 0xHASH as you would at HASH
    • Display a helpful error to the user, telling them to remove the 0x from the URL
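The first two options reduce to normalizing the hash before lookup, e.g. (the function name is an assumption):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeBzzHash sketches the proposed fix: strip an optional 0x
// prefix from the hash part of a bzz:/ URL so both forms resolve to
// the same content.
func normalizeBzzHash(hash string) string {
	return strings.TrimPrefix(hash, "0x")
}

func main() {
	h := "0x2c2d2adb8fd0cba399282fb59f8219e5fbbd67ba06fcf5c8d343f5eb1c8be022"
	fmt.Println(normalizeBzzHash(h))
	// 2c2d2adb8fd0cba399282fb59f8219e5fbbd67ba06fcf5c8d343f5eb1c8be022
}
```

Whether the server then redirects with HTTP 301 or serves the content directly under both forms is the remaining design choice.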

Swarm needs Browser plugins to handle bzz URLs

With the help of a browser plugin users should be able to use and select a bzz provider, be it localhost:8500, swarm-gateways.net or infra-swarm-gateway, and use bzz:// urls natively.

mod_time in manifest should be optional or better documented

When uploading using bzz, a manifest is created which includes a timestamp of when the upload was created, so the upload produces a different hash each time. Uploading using bzz-raw (i.e. just uploading the raw bytes) will always yield the same hash.
To demonstrate, if I upload the same data twice I'll get two hashes but inspecting the manifests, I see the same hash was stored under the path:

$ curl -F "my-file=data" http://localhost:8500/bzz:/
ca16a6b21ddb375fd718bb931cb039b0ff3fdaabc90b4c5e0e82604345764182

$ curl -F "my-file=data" http://localhost:8500/bzz:/
09b70636040f1a9cd88d399dc9cca3fe740e4b1283b5100e1eb57ac4d1b9c5ae

$ curl -s http://localhost:8500/bzz-raw:/ca16a6b21ddb375fd718bb931cb039b0ff3fdaabc90b4c5e0e82604345764182/ | jq .
{
  "entries": [
    {
      "hash": "61cc094e478970c7e58bf44cd1e13b2851d9cea254327d08dbdd1918b454b9f8",
      "path": "my-file",
      "size": 4,
      "mod_time": "2018-01-24T11:30:10.24894844Z"
    }
  ]
}

$ curl -s http://localhost:8500/bzz-raw:/09b70636040f1a9cd88d399dc9cca3fe740e4b1283b5100e1eb57ac4d1b9c5ae/ | jq .
{
  "entries": [
    {
      "hash": "61cc094e478970c7e58bf44cd1e13b2851d9cea254327d08dbdd1918b454b9f8",
      "path": "my-file",
      "size": 4,
      "mod_time": "2018-01-24T11:30:13.609200639Z"
    }
  ]
}

This being said, perhaps the addition of mod_time should be optional if people just want deterministic manifest creation.

This has been causing people headaches, so we should consider documenting it properly or making it optional.

Enable multihash support for swarm root hashes & ENS

Goal: as described in #166 we want to be able to request swarm data using URLs of the form bzz://<multi-hash>/path/in/manifest.

The reason is that this will allow people to store multi-hashes in the ENS resolver contracts at "content", thereby allowing swarm, ipfs and other systems to exist side by side.

This change also allows us to add to ENS Swarm content that has been uploaded with the --encrypt flag. In the current system that is not possible.

  • Enable retrieval of swarm-content using a multi-hash in the URL
  • Generate a multi-hash when uploading swarm content
  • Document the functionality in the swarm docs
  • Notify the ENS guys -> Need new resolver and new ENS tools.
  • Update all our own ENS names to use a multi-hash

swarm/storage: memstore cache of failed lookups

System information

Geth version: 1.5.10-unstable
OS & Version: Windows/Linux/OSX
Commit hash: 55901fffe2ae3b535b5063add279862e0484671e

Expected behaviour

A and B are fresh nodes with empty datastores
Bring up node A, upload to A, bring down A, bring up A and B, connect A and B (manually), request file from B, B forwards to A, B returns file.

Actual behaviour

If the file request is made too early after startup (possibly the threshold is the syncer run), the retrieval fails. Even if retrieval of other files works after this threshold, the file that failed will continue to fail until B (possibly also A, please double check this) is restarted.

Steps to reproduce the behaviour

See above…

p2p and p2p/protocols packages handle messages synchronously

The p2p and p2p/protocols packages are reading and then handling requests/messages synchronously.

As a result the PSS protocol is deadlocking on the sim adapter (which uses a net.Pipe).

# p2p/peer.go

func (p *Peer) readLoop(errc chan<- error) {
	defer p.wg.Done()
	for {
		msg, err := p.rw.ReadMsg()
		if err != nil {
			errc <- err
			return
		}
		msg.ReceivedAt = time.Now()
		if err = p.handle(msg); err != nil {
			errc <- err
			return
		}
	}
}

# p2p/protocols/protocol.go

// Run starts the forever loop that handles incoming messages
// called within the p2p.Protocol#Run function
func (p *Peer) Run(handler func(msg interface{}) error) error {
	for {
		if err := p.handleIncoming(handler); err != nil {
			return err
		}
	}
}

In effect the PSS protocol is handling one request/message at a time.

Let's discuss what's the best way to address the deadlock:

  1. Add concurrent multi-message handling to the PSS protocol ?
  2. Keep the synchronous behaviour, but make sure we have a sufficiently big read/write buffer on each connection/adapter ?
  3. Other ideas ?

Implement metrics for swarm

This issue has been opened in order to be tracked in the ethersphere project. Its source issue is ethereum/go-ethereum#15481:

For swarm, it would be good to be able to collect stats and metrics on a node concerning storage (chunks, local DB, etc.) and chequebook properties (consumed and delivered services, service peers, balances, cheques, etc.).

This issue is a parent issue for #177, which is about evaluating the technical infrastructure for the metrics implementation.

Also related are issues #159 and #158, which may visualize information based on metrics.

Mutable resource updates outstanding tasks

  • API for lookups: (1) latest (2) historical (3) particular version
  • use swarm storage stack for resource update chunks
  • signature implementation: the text that the owner signs should contain the key K = Hash(name|period|version), so S = Sign(Hash(K|CH)) where CH is the content hash of the version in question, and the data of the update chunk is D = S|CH|period|version|name. This is needed because otherwise any hash the owner ever signs could be submitted by malicious parties as a resource update.
  • resource update validation: content hash, period, version and name should be parsable from the data in order to validate the resource update, as per the previous point
  • http API for the API lookups
  • manifest entry content type to allow server to follow mutable resources;
  • strict lookup mode: complain if version forks (usecase: sw3 promise of no fork truthline)
  • resource update stream using network rewrite stream subscription layer; use pss for remote broadcast
  • create ENS resolver that accepts transactions with arguments blockNumber, version, CH, S from any third party to update the record see #205
  • The ENS resolver should register blockNumber and frequency of updates (the latter only changeable by the owner). Then root entry chunks are not needed.
  • instead of blockNumber, a period should be represented by its index, i.e., latestPeriod = (currentBlock - startBlock) / frequency. This would make period and version more like major and minor version numbers, plus we could use 32-bit integers for both
  • Parse update data as multihash if datalength is set to 0
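The digest construction from the signature bullet can be sketched as follows; sha256 stands in for whatever hash swarm actually uses, the field encoding is an assumption, and the actual signing step is elided:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// updateDigest binds a signature to one specific update of one
// specific resource: the owner signs Hash(K|CH) where
// K = Hash(name|period|version) and CH is the content hash.
func updateDigest(name string, period, version uint32, contentHash [32]byte) [32]byte {
	var buf [8]byte
	binary.BigEndian.PutUint32(buf[:4], period)
	binary.BigEndian.PutUint32(buf[4:], version)

	k := sha256.New()
	k.Write([]byte(name))
	k.Write(buf[:])
	key := k.Sum(nil) // K = Hash(name|period|version)

	d := sha256.New()
	d.Write(key)
	d.Write(contentHash[:])
	var out [32]byte
	copy(out[:], d.Sum(nil)) // digest to sign: Hash(K|CH)
	return out
}

func main() {
	var ch [32]byte
	a := updateDigest("theswarm.eth", 1, 1, ch)
	b := updateDigest("theswarm.eth", 1, 2, ch)
	fmt.Println(a != b) // different version, different digest
}
```

Because name, period and version all enter the digest, a signature captured for one update cannot be replayed for any other resource or version.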
