Giter VIP home page Giter VIP logo

Comments (10)

gmaclennan avatar gmaclennan commented on August 16, 2024

Also: if a way is forked and one fork is a delete, does the indexer correctly delete the reverse lookup? We should ensure we have a test for that.

I think we may be hitting this. I am seeing features appear in Mapeo/iD which do not actually exist in the database when you try to query them. I have a feeling this is because of the following scenario:

  1. Way A is created with points B, C, D
  2. User X edits A by changing an attribute --> A'
  3. User Y deletes A (not A')
  4. X and Y replicate data. I think the lookup code in queryStream may return the deleted way when any of points B, C, D are in the bbox.

Can we write a test for this?

from osm-p2p-db.

hackergrrl avatar hackergrrl commented on August 16, 2024

I wrote a test for this in osm-p2p-server (already had the scaffolding there), but what I see is the modified way being returned: digidem/osm-p2p-server@265d2ab

from osm-p2p-db.

hackergrrl avatar hackergrrl commented on August 16, 2024

@gmaclennan I remember you describing another case involving overlapping bounding boxes, or similar. Could you describe that?

from osm-p2p-db.

gmaclennan avatar gmaclennan commented on August 16, 2024

The bugs in Mapeo definitely seem related to queryStream lookups (from the api/0.6/map endpoint). iD makes queries to this end point for tiles at zoom 16 (regardless of the UI zoom of the map). This end point returns nodes outside the bounding box if they are referenced by ways that have nodes inside the bounding box, is accordance with the spec

What I am guess is happening is that this lookup code (essentially, hyperlog-index) has a bug. My guess would be to look at a forked way like the one described above and do a bounding box query that only includes one of its nodes. We should carefully think through what should actually happen in this scenario, both when we are requesting forks and when we are not. What happens when one of the "forks" is a deleted way?

from osm-p2p-db.

hackergrrl avatar hackergrrl commented on August 16, 2024

After much digging, I believe I have an explanation. Consider a way with the following DAG structure in osm-p2p-db:

83b1221b4be413a30478fbb0d5be06395bd0c55d6a760ef4cf564531b10eccc2
  <- b9d60d6eb459b2e0fc693226c847b51995923c46c8fde6a20c3b94ff8579f929
    <- 6117e6d75ddb56985a53f3e116d3b4c0da8f2a231bd861c97d98071f02146759
      <- 86679afd770983ff67c529948ad5a053819ae1199a996d122e5927ba5d8aab10
        <- 495ee12c58537aa44fef7bb872ccf6cfebe8e23b3b490817b2e0d98bd11fccba
          <- e8c20112a059945fe8a3dec28972979b9b488cf62ab43c49556053e9fec4d450
            <- a7accc4532bb53283c7e582eee9558c418591b4c05eb8892ec3f5c233b2c55bf
              D dded60535063f8202e8ab7d549a78ce3ae787e39940e02846c818e2aba679f50
    <- ce2a6bac3aff1587415add7e0028a7da1d7a35832416bcd25306121fbaed0471

Where 83b1221.. is the root, and the others are children, grandchildren, etc. In this tree, many edits happened to the root way, eventually culminating in a deletion of the way (and all of its nodes). However, elsewhere on another machine, another edit took place from an earlier version of the way (ce2a6b..).

This results in the osm-p2p-db being in the following state:

  1. osm.get(osm_id_of_the_way) => ce2a6bac3aff1587415..
  2. osm.get(a_node_in_that_way) => null

What's happening is, after these forks exist in the same osm-p2p-db, the database considers the non-deleted way as real, but still thinks all of its nodes are deleted.

from osm-p2p-db.

gmaclennan avatar gmaclennan commented on August 16, 2024

Yes, good find. When an element is deleted via the API we check whether it is referred to by any existing elements: https://github.com/digidem/osm-p2p-server/blob/master/lib/filter_deletes.js

Of course, as you point out, it could be referred to by forked elements on other peers.

I am not entirely sure about the best way to deal with this. I think if a way exists that refers to deleted nodes, we should return those nodes in the way.

I think osm.get() should have a way of distinguishing between a node that does not exist and a node that has been deleted. The OSM API uses an attribute visible=false for elements that have been deleted (http://wiki.openstreetmap.org/wiki/Elements#Common_attributes). We should perhaps use a similar convention, e.g. if an element exists but has been deleted osm.get() should return the element but with the property visible: false.

I think this is needed anyway for the filtering of forks to work correctly e.g. two forks of a point exist, and in Mapeo you would see the most recently edited. If you delete that fork, the de-forking code will return the non-deleted fork on the next request, so to the user it looks like the delete action failed.

from osm-p2p-db.

hackergrrl avatar hackergrrl commented on August 16, 2024

from osm-p2p-db.

gmaclennan avatar gmaclennan commented on August 16, 2024

Yes, the issue of ways and relations referring to nodes/members only by OsmID is a big problem, not just for osm-p2p but for anybody dealing with historic OSM data. I know the Mapbox data team has hit this in their need to review data changes. The workaround they use I think is to use the timestamps to reconstruct which version of nodes/members were referenced by a particular way/relation. This is obviously fragile and costly, especially in a p2p system.

I think we can add versions to the way/relations in a way that remains compatible with existing clients:

  1. Internally we should store both id and version of nodes within a way and members within a relation.
  2. We should prefer version when doing a lookup in the index, but fallback to id.
  3. When a non-p2p-aware client submits a change, use the version of the nodes / members from the changeset if present in the changeset, if not set the version by selecting the most recent fork using the same algorithm that would have been used to present the most recent fork to the client.

Some issues we would hit:

  1. iD editor does not include any unchanged nodes in a changeset if you change the tags on a way.
  2. If you move a node in a way, the way itself is not included in the changeset.

The version number of a way would need to change every time a node was changed. iD might need a patch to ensure it pulls down the updated way after only a node was updated.

For relations, this would make it hard to avoid forks: if we store version numbers on relations, any update to a member would need to update the version of the relation. This would mean we would not be able to use relations for long rivers to avoid forks - any edit to any segment of the river would create a new version of the relation.

from osm-p2p-db.

gmaclennan avatar gmaclennan commented on August 16, 2024

I think treating changesets as commits introduces new problems. This would mean we would have forks at a higher level (changesets) which would be more difficult to resolve.

from osm-p2p-db.

gmaclennan avatar gmaclennan commented on August 16, 2024

See also: #21

from osm-p2p-db.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.