Comments (4)

tillprochaska commented on June 10, 2024

@pudo I’ve added this as an issue here mostly to keep track of it as it’s part of a bigger project in Aleph. I’d be happy to put in a PR to resolve this though!

from followthemoney.

pudo commented on June 10, 2024

One thing I'd consider before making this guarantee to downstream consumers is how it will interact with entity fragment aggregation. I've looked at this before in the context of making the pages of a document show up in the proper order in the indexText/bodyText props, and could never quite conceptualise all the places in the pipeline that would need to be updated for this guarantee to really hold. It's also something that will be really hard to do with respect to the statement-based data model I use downstream in nomenklatura, but that's more of a "me" problem :)

tillprochaska commented on June 10, 2024

> One thing I'd consider before making this guarantee to downstream consumers is how this will intersect with entity fragment aggregation. I've looked at this before a bit in the context of making the pages of a document show up in proper order in the indexText/bodyText props, and could never quite conceptualise all the places in the pipeline that would need to be updated for this guarantee to really be held.

Thanks for mentioning this, I wasn’t aware of the (potentially) incorrect order of indexText property values before. My (maybe naive?) assumption though is that preserving the order of property values shouldn’t make anything worse (and might even be a prerequisite for tackling these ordering issues in the future), right?

While it won’t automatically solve the potentially incorrect order of indexText values (because the values are stored in different fragments), it will solve the issue with previews of multipart emails (because all parts of the original email end up in the same fragment).
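To make the distinction concrete, here is a minimal sketch of an order-preserving merge. It is not FtM's actual aggregation code: the `Fragment` type and `merge_fragments` helper are hypothetical, and fragments are modelled as plain dicts mapping a property name to a list of values.

```python
from typing import Dict, List

# Hypothetical model of a fragment: property name -> ordered list of values.
Fragment = Dict[str, List[str]]


def merge_fragments(fragments: List[Fragment]) -> Fragment:
    """Merge fragments in the order given, keeping first-seen value order."""
    merged: Dict[str, Dict[str, None]] = {}
    for fragment in fragments:
        for prop, values in fragment.items():
            # dict keys preserve insertion order and drop duplicates
            bucket = merged.setdefault(prop, {})
            for value in values:
                bucket.setdefault(value, None)
    return {prop: list(bucket) for prop, bucket in merged.items()}


# All parts of one email live in a single fragment: order is preserved.
email = [{"bodyText": ["part 1", "part 2"]}]

# Pages live in separate fragments: the result depends on fragment order.
pages_out_of_order = [{"indexText": ["page 2"]}, {"indexText": ["page 1"]}]

print(merge_fragments(email))
print(merge_fragments(pages_out_of_order))
```

In this sketch the email parts come out in their original order, while the page texts come out in whatever order the fragments happened to arrive in.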

tillprochaska commented on June 10, 2024

@pudo Do you have a preference regarding the alternative data structure?

  1. Using dict keys (or a small wrapper around dicts)
  2. Using lists (and manually checking if a value already exists when adding new values)
  3. Using a third-party ordered set implementation

I have a personal preference for option 2 to keep it simple, and I don’t think the perf overhead of O(n) membership tests would be relevant in Aleph. Not sure though about other FtM use cases like OpenSanctions?
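For comparison, options 1 and 2 could look roughly like the sketch below. The helper names are made up for illustration and are not part of the FtM API. Option 1 relies on dicts preserving insertion order, which Python guarantees from version 3.7 onwards.

```python
from typing import Dict, Iterable, List


def add_with_dict(values: Dict[str, None], value: str) -> None:
    """Option 1: dict keys keep insertion order and ignore duplicates."""
    values.setdefault(value, None)


def add_with_list(values: List[str], value: str) -> None:
    """Option 2: plain list with an O(n) membership test before appending."""
    if value not in values:
        values.append(value)


def dedupe_dict(items: Iterable[str]) -> List[str]:
    store: Dict[str, None] = {}
    for item in items:
        add_with_dict(store, item)
    return list(store)


def dedupe_list(items: Iterable[str]) -> List[str]:
    store: List[str] = []
    for item in items:
        add_with_list(store, item)
    return store


parts = ["text/plain part", "text/html part", "text/plain part"]
print(dedupe_dict(parts))
print(dedupe_list(parts))
```

Both variants return the same ordered, de-duplicated values. The dict variant has O(1) average-case inserts, while the list variant stays closer to a plain list at the cost of a linear scan per insert.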
