
Comments (8)

cobordism commented on May 30, 2024

Examples of unexpected behaviour currently:


compare with/without trailing /
http://swarm-gateways.net/bzz:/b7395fd6a3165b0166cb27c7cb3f5b15be90158bc742fb5d36d420d295d8465d/img
to
http://swarm-gateways.net/bzz:/b7395fd6a3165b0166cb27c7cb3f5b15be90158bc742fb5d36d420d295d8465d/img/
The first gives error 500; the second gives 300 (the correct status, but the links are wrong).


over/underspecified paths
http://swarm-gateways.net/bzz:/fc3f49ff5fa3ce86e97f081c9ac74751b48be3a4cccb54a7aed4c5feb2574d69/img/thumbs/THUMB_Alexey_1.png (image as expected)
http://swarm-gateways.net/bzz:/fc3f49ff5fa3ce86e97f081c9ac74751b48be3a4cccb54a7aed4c5feb2574d69/img/thumbs/THUMB_Alexey_1.pn (works)
http://swarm-gateways.net/bzz:/fc3f49ff5fa3ce86e97f081c9ac74751b48be3a4cccb54a7aed4c5feb2574d69/img/thumbs/THUMB_Alexey_1.pngg (error 404)

I bring this up because previously we had the opposite problem, where the third version worked but the second did not (hint: neither of them should work).
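
To make the expected behaviour concrete, here is a minimal sketch of strict (exact-match) lookup, under which both the truncated and the extended path fail. It assumes a flat manifest modelled as a Python dict; `strict_lookup`, `MANIFEST` and the placeholder hash value are hypothetical and are not Swarm's actual code:

```python
# Hypothetical flat manifest: path -> content hash (the value is a placeholder).
MANIFEST = {"img/thumbs/THUMB_Alexey_1.png": "hash-of-png"}


def strict_lookup(manifest, path):
    """Return (status, content hash or None) using exact path matching only."""
    if path in manifest:
        return 200, manifest[path]
    # Neither an under-specified nor an over-specified path is served.
    return 404, None


for suffix in ("png", "pn", "pngg"):
    status, _ = strict_lookup(MANIFEST, "img/thumbs/THUMB_Alexey_1." + suffix)
    print(suffix, status)
```

Under this model only the `.png` request succeeds; `.pn` and `.pngg` are both rejected.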


missing manifest entries
When we try to request a missing entry such as http://swarm-gateways.net/bzz:/fc3f49ff5fa3ce86e97f081c9ac74751b48be3a4cccb54a7aed4c5feb2574d69/img/thumbs/missing.png we get a manifest... why?

from swarm.

cobordism commented on May 30, 2024

For reference - earlier discussion on handling manifests


cobordism commented on May 30, 2024

From Gitter:

Lewis Marshall @lmars 14:33

@zelig
In order to support RESTFUL APIs via client side js we need to support fallback to longest existing prefix ... This has been a conscious feature from day 1. @lmars why does this shake the rock solid foundation you have in mind?

So it isn't specifically the feature which shakes the foundation. I am trying to convince you guys that the current implementation is giving us constant headaches and head-scratching moments because we don't know what the features are, so we keep breaking those features as they are untested, and we don't know when we break them until someone comes along and says they were a feature since day 1.
My focus has been trying to come up with a solution which is simpler (both as a model, Unix filesystem, and in code, my example above), and is easier for us to reason about so that we can easily spot when the code gets broken, whereas currently the code seems very fragile.
I admit my example falls down for large directories, but this is a well researched area (e.g. ext4 supports large directories, so does IPFS which already serves Wikipedia).
Whether it's an arbitrary prefix-trie or branching on / like a filesystem, let's document all the features it should support, write some more exhaustive tests and let's stop breaking it
I'll start by listing some of the features which have been mentioned:

  • have a catch-all function where the same content is served with an arbitrary suffix added to the path (e.g. a client-side REST API can be deployed at / and then paths like /user/1 will return the app, leaving the app to further process the path and act accordingly)
  • efficiently serve sites with large directories of files like Wikipedia
  • mount a manifest like a filesystem using FUSE (either just hide paths that end in /, don't allow any paths in the manifest to contain /, add options to control the behaviour)
  • match on path?query from the URI rather than just path so that different content can be served using the query string (to support map tile APIs as mentioned by @nagydani)
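
The catch-all feature in the first bullet amounts to a longest-prefix lookup. A minimal sketch, assuming a flat manifest modelled as a Python dict from path to content; `longest_prefix_entry` and the sample entries are hypothetical and do not reflect Swarm's actual implementation:

```python
# Hypothetical manifest: the empty path holds the single-page app.
SAMPLE = {"": "app.html", "static/app.js": "app.js"}


def longest_prefix_entry(manifest, path):
    """Return (key, value) for the longest key that is a prefix of path, or None."""
    best = None
    for key in manifest:
        if path.startswith(key) and (best is None or len(key) > len(best)):
            best = key
    return None if best is None else (best, manifest[best])


# "user/1" has no entry of its own, so it falls back to the app at the empty path.
print(longest_prefix_entry(SAMPLE, "user/1"))
print(longest_prefix_entry(SAMPLE, "static/app.js"))
```

Note that the same rule also serves `static/app.jsX` from the `static/app.js` entry, which is the kind of over-specified match reported at the top of this thread.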

EXT4: https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Hash_Tree_Directories
IPFS: ipfs/notes#76


cobordism commented on May 30, 2024

From the EXT4 link:

A linear array of directory entries isn't great for performance, so a new feature was added to ext3 to provide a faster (but peculiar) balanced tree keyed off a hash of the directory entry name. If the EXT4_INDEX_FL (0x1000) flag is set in the inode, this directory uses a hashed btree (htree) to organize and find directory entries.

So that would correspond to two different types of manifest, yes?


cobordism commented on May 30, 2024

What should the behaviour be?

  1. a manifest contains a hash of an html file entry for the empty path and also contains an entry for ?x=y. A user requests hash-of-manifest?x=y
  2. a manifest contains a hash of an html file entry for the empty path and also contains an entry for ?x=y. A user requests hash-of-manifest/?x=y
  3. a manifest contains entries for i and images and the i manifest contains an entry for mages. A user requests images.
  4. a manifest contains a hash of an html file as default (empty string) entry and the hash of something else at /. The manifest hash is saved as name.eth. A user calls bzz:/name.eth/
  5. a manifest contains a single entry for file, a user requests fil
  6. a manifest contains a single entry for file, a user requests fileXXX
  7. a manifest contains entries for abc, abd and abe. A user requests a.
  8. a manifest contains a default (empty string) entry as well as entries for abc, abd and abe. A user requests a.
  9. a manifest contains as default entry the hash of a manifest with default entry the hash of a manifest with default entry the hash of a manifest with default entry the hash of a manifest with entry index.html. A user mounts the original manifest via FUSE.
  10. a manifest contains only a default entry - hash of a file. A user mounts the manifest with FUSE.
  11. A manifest with hash H1 contains an entry for .eth with hash H2. The domain H1.eth is registered and hash H3 is added as content. A user opens bzz://H1.eth
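
Several of the cases above (5 to 8) can be described by how the request relates to the manifest keys. A rough sketch, assuming a flat set of keys and ignoring nested manifests, FUSE and ENS; `classify` is a hypothetical helper that only names the situation and does not decide what should be served:

```python
def classify(keys, request):
    """Return (exact, overmatched, undermatched) for a request against manifest keys.

    overmatched:  keys that are a proper prefix of the request (request is too long)
    undermatched: keys that the request is a proper prefix of (request is too short)
    """
    exact = request in keys
    overmatched = sorted(k for k in keys if k != request and request.startswith(k))
    undermatched = sorted(k for k in keys if k != request and k.startswith(request))
    return exact, overmatched, undermatched


print(classify({"file"}, "fil"))                  # case 5
print(classify({"file"}, "fileXXX"))              # case 6
print(classify({"abc", "abd", "abe"}, "a"))       # case 7
print(classify({"", "abc", "abd", "abe"}, "a"))   # case 8
```

In case 8 the request is both an overmatch of the default entry and an undermatch of three others, which is why an explicit rule is needed.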


holisticode commented on May 30, 2024
  1. A manifest contains an entry for a (dynamic js) single-page app, thus needs to handle everything behind a #, e.g. <host>/bzz:/<hash>#page1, <host>/bzz:/<hash>#page2, <host>/bzz:/<hash>#page3
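
One relevant point for this case: an HTTP client does not send the fragment (the part after #) to the server, so a gateway sees the same request for all three pages. A small sketch using Python's standard `urllib.parse`; `gateway.example` and `abc123` are placeholders for `<host>` and `<hash>`, and `request_target` is a hypothetical helper:

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the part of the URL that is sent to the server: path plus query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # parts.fragment is deliberately left out


PAGES = [f"http://gateway.example/bzz:/abc123#page{n}" for n in (1, 2, 3)]
TARGETS = {request_target(url) for url in PAGES}
print(TARGETS)
```

All three URLs collapse to a single request target, so the fragment has to be handled by the client-side app rather than by manifest matching.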


cobordism commented on May 30, 2024

some more thoughts https://gist.github.com/a83ad855a03190463738e93fcf6aa339


cobordism commented on May 30, 2024

Notes from yesterday's discussion:

  • We need to make sure we handle the ? correctly
  • manifests should declare explicitly what to do with overmatching (requesting fileX when manifest contains file); whether to serve the content or a 404. [Default 404 unless overmatch begins with a ? maybe?]
  • The / character that appears directly after bzz:/<hash> or bzz:/name.eth is special, and we should probably add it automatically with a redirect. This means that it is not possible to load name.eth, only name.eth/, but it has the benefit that HTML links are always handled correctly, whether loaded at bzz://name.eth or http://gateway/bzz:/name.eth
  • It should be possible to have the empty path resolve to a hash ("hard link") or to a string - specifying another entry in the manifest ("soft link").
  • we did not discuss URL fragments
  • we did not reach a conclusion about what the default behaviour for undermatches should be - 404 or 300 or ... but whatever the default is, we said that the manifest could provide an explicit override.
  • we will schedule another call to carry this discussion forward.
  • we did not discuss ENS
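
The points above could be combined roughly as follows. This is only a sketch of one possible reading of the notes: the `Manifest` fields `serve_overmatch` and `undermatch_status`, the choice of 301 for the redirect, and the rule that an overmatch is checked before an undermatch are all assumptions made for illustration, not decisions from the discussion:

```python
from dataclasses import dataclass


@dataclass
class Manifest:
    entries: dict                  # path -> content hash (placeholder values)
    serve_overmatch: bool = False  # hypothetical explicit override; default is 404
    undermatch_status: int = 404   # hypothetical; the default was left undecided


def resolve(manifest, rest):
    """Resolve `rest`, i.e. everything that follows bzz:/<hash> in the request."""
    if not rest.startswith("/"):
        # Add the special slash with a redirect, keeping any query string.
        return 301, "/" + rest
    # Split the query off first, so "file?x=1" is not treated as an overmatch.
    path, _, _query = rest[1:].partition("?")
    entries = manifest.entries
    if path in entries:
        return 200, entries[path]
    over = [key for key in entries if path.startswith(key)]
    if over:  # assumption: overmatch takes precedence over undermatch
        if manifest.serve_overmatch:
            return 200, entries[max(over, key=len)]
        return 404, None
    if any(key.startswith(path) for key in entries):
        return manifest.undermatch_status, None
    return 404, None


print(resolve(Manifest({"file": "H"}), ""))
print(resolve(Manifest({"file": "H"}), "/fileX"))
print(resolve(Manifest({"file": "H"}, undermatch_status=300), "/fil"))
```

Keeping the policy in explicit fields means a test can pin down each case, whichever defaults are eventually chosen.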

A note on mounting file systems:

  • Several team members feel that mounting a manifest as a filesystem is not of primary concern - or rather: not every manifest needs to be mountable as a filesystem.
  • Any default file in a manifest (hard link above) will be invisible in the mounted directory. [Unless further tooling is developed in which this default hash can be some form of attribute of the directory]
  • If there are (if we allow) keys ending with a / that resolve to hashes of content other than manifests, then that content will also be invisible in the mounted directory.
  • Suggestion: Any directory uploaded with swarm --recursive up should produce manifests that are mountable as filesystems.
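
The two invisibility rules above can be expressed as a simple check. A sketch under the assumption that a manifest is a flat dict from key to a flag saying whether the entry is itself a manifest; `hidden_when_mounted` and the sample data are hypothetical:

```python
def hidden_when_mounted(entries):
    """Return the keys that a mounted directory could not show.

    entries maps each manifest key to True if it points at another manifest
    and to False if it points at other content.
    """
    hidden = []
    for key, is_manifest in entries.items():
        if key == "":
            hidden.append(key)  # default entry: no file name to show
        elif key.endswith("/") and not is_manifest:
            hidden.append(key)  # looks like a directory but holds content
    return sorted(hidden)


SAMPLE = {"": False, "index.html": False, "img/": True, "notes/": False}
print(hidden_when_mounted(SAMPLE))
```

A manifest for which this list is empty would be mountable without losing entries under these two rules.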
