Comments (6)

edongashi commented on July 28, 2024

Same idea would work for contributions. We can have a union of known contribution types and functions which work on each of those.
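A minimal sketch of what such a union could look like in C. The type names, the two contribution kinds, and the helper functions are all hypothetical, chosen only to illustrate the pattern; they are not taken from the pg_diffix source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags for the known contribution types. */
typedef enum
{
  CONTRIBUTION_INT,
  CONTRIBUTION_REAL
} ContributionKind;

/* A contribution holds exactly one of the known types, selected by `kind`. */
typedef struct
{
  ContributionKind kind;
  union
  {
    int64_t integer;
    double real;
  } value;
} Contribution;

/* Adds two contributions of the same kind; mixed kinds are rejected. */
static Contribution contribution_add(Contribution a, Contribution b)
{
  assert(a.kind == b.kind);
  Contribution sum = a;
  if (a.kind == CONTRIBUTION_INT)
    sum.value.integer += b.value.integer;
  else
    sum.value.real += b.value.real;
  return sum;
}

/* Returns nonzero when a is strictly greater than b (same kind required). */
static int contribution_greater(Contribution a, Contribution b)
{
  assert(a.kind == b.kind);
  if (a.kind == CONTRIBUTION_INT)
    return a.value.integer > b.value.integer;
  return a.value.real > b.value.real;
}
```

Each function switches on the tag once, so supporting another contribution type means adding a tag, a union member, and one branch per function, with no macros involved.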

from pg_diffix.

cristianberneanu commented on July 28, 2024

We always did that. It is simpler and more efficient. So the question I see is why should we do it otherwise now?
We don't need a bijection for anonymization. We also need to hash the values in the end anyway, so that we can compute the seed.
Getting rid of macros would be just a bonus.

The more opaque EXPLAIN output is a drawback, but to recover the actual AID one could use:

select aid from table where hash(aid) = hash_value
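To make the hash-only approach concrete, here is a rough sketch in C. It assumes every AID, whatever its source type, is reduced to a 64-bit hash as soon as it is read, and that the seed is derived from those hashes. FNV-1a and the XOR combination are stand-ins picked for illustration; the function names are hypothetical and nothing here is taken from the actual pg_diffix code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Every AID is represented by a fixed-width hash, never by the raw value. */
typedef uint64_t aid_hash_t;

/* 64-bit FNV-1a over raw bytes (stand-in for the real hash function). */
static aid_hash_t hash_bytes(const void *data, size_t len)
{
  const unsigned char *bytes = data;
  aid_hash_t hash = 0xcbf29ce484222325ULL; /* FNV offset basis */
  for (size_t i = 0; i < len; i++)
  {
    hash ^= bytes[i];
    hash *= 0x100000001b3ULL; /* FNV prime */
  }
  return hash;
}

/* String AIDs are hashed once at the boundary; no string is kept around. */
static aid_hash_t aid_from_string(const char *aid)
{
  return hash_bytes(aid, strlen(aid));
}

/* Integer AIDs go through the same path, so later code sees a single type. */
static aid_hash_t aid_from_int64(int64_t aid)
{
  return hash_bytes(&aid, sizeof(aid));
}

/* Assumed combination step: XOR, so the seed ignores the order of the AIDs. */
static uint64_t seed_from_aids(const aid_hash_t *aids, size_t count)
{
  assert(aids != NULL || count == 0);
  uint64_t seed = 0;
  for (size_t i = 0; i < count; i++)
    seed ^= aids[i];
  return seed;
}
```

Because the aggregation state only ever holds `aid_hash_t` values, there is no per-type code path and no string handling beyond the single call to `aid_from_string`.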

edongashi commented on July 28, 2024

> So the question I see is why should we do it otherwise now?

No strong opinion. The argument was that if it's relatively cheap to maintain then why not...

I'll continue as planned.

cristianberneanu commented on July 28, 2024

> Same idea would work for contributions.

That is a more nuanced discussion, as the various types are not equivalent to one another for all purposes.
In the AID case, things look more straightforward to me.

cristianberneanu commented on July 28, 2024

> We also lose the nice explain features which could be a debugging and teaching utility.

One could also see this as a feature: if the AID is a name, email or phone number, one would never want that exposed directly to the analyst or system administrator. Showing only the hash in debug utilities would make such exposure less likely.

> The argument was that if it's relatively cheap to maintain then why not...

In my experience, things have a way of interacting in unforeseen ways. I think it pays to cull code aggressively. Especially when it comes to dealing with strings in C.

edongashi commented on July 28, 2024

> if the AID is a name, email or phone number, one would never want that exposed directly to the analyst or system administrator.

Yes, in fact these aggregates won't be callable at all on sensitive tables. Maybe only by a superuser or a special role if we decide so later, or we'll drop them completely in the future.

> I think it pays to cull code aggressively.

Okay, I agree. Let's simplify.
