
Comments (12)

gkirill commented on July 16, 2024 12
  1. Incremental ids are usually unique only within a single database instance. If your db is sharded, incremental ids become difficult to use, because you may end up with user/1 in instance 1 and another user/1 in instance 2.
  2. Another useful property of uuids is that clients can generate them themselves when they create new records, e.g. I could create a new user in JavaScript and set its id right there, without needing the db to assign one.
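
A minimal sketch of both points, assuming two hypothetical shards that each keep their own auto-increment counter, and a hypothetical client that picks its own id before sending anything to the server. The names `shard_a`, `shard_b`, and `build_create_user_payload` are made up for illustration.

```python
import itertools
import uuid

# Two independent database instances, each with its own counter (hypothetical).
shard_a = itertools.count(1)
shard_b = itertools.count(1)

# Both shards hand out the same first id, so "user/1" is ambiguous across shards.
int_id_a = next(shard_a)
int_id_b = next(shard_b)

# Random (version 4) UUIDs need no coordination between the shards.
uuid_a = uuid.uuid4()
uuid_b = uuid.uuid4()


def build_create_user_payload(name: str) -> dict:
    """Client side: choose the id locally, before the server is involved."""
    return {"id": str(uuid.uuid4()), "name": name}


payload = build_create_user_payload("alice")
```

The client already knows `payload["id"]` when it sends the request, so it can reference the new user immediately instead of waiting for the db to assign an id.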

from http-api-design.

alecmev commented on July 16, 2024 8
  1. The probability of a collision with UUIDs is not zero, and you can't trust client-generated IDs (for all we know, their random generator could be returning the same number on every invocation), so you still need the appropriate checks (and a mechanism for denying resource creation if the ID collides, just like with regular integers...)
  2. Performance varies from one DB type to another, but, for example, you still need a good old autoincrement integer in your Postgres (and the situation is even worse in MySQL and MSSQL, from what I've read)
  3. You completely ruin the aesthetics of your URLs: /memberships?user=123&team=456 vs. /memberships?user=1b2d9fb0-d232-49d5-9e60-334bc16d79bc&team=6f6f3d93-df18-495f-8de4-fa29cb2e5835

You make it sound like a no-brainer when it's not. The advantages are accidental (for example, I don't care about 3rd parties analyzing our well-being using the resource IDs because, firstly, I don't mind, and secondly, they'll find a way anyway), and there's nothing you can do about the aesthetics if you have no other unique identifier for a resource.
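
The check described in point 1 could look roughly like this. It is only a sketch: it assumes an in-memory SQLite table named `users` and a hypothetical `create_user` helper that returns an HTTP-style status code, and it leans on the primary key constraint to reject a duplicate id.

```python
import sqlite3
import uuid


def create_user(conn: sqlite3.Connection, user_id: str, name: str) -> int:
    """Insert a user under a client-supplied id and return a status code."""
    try:
        parsed = uuid.UUID(user_id)  # reject anything that is not a UUID
    except ValueError:
        return 400
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (id, name) VALUES (?, ?)",
                (str(parsed), name),
            )
    except sqlite3.IntegrityError:
        return 409  # id already taken: deny the creation
    return 201


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

client_id = str(uuid.uuid4())
first = create_user(conn, client_id, "alice")
second = create_user(conn, client_id, "bob")  # same id sent again
malformed = create_user(conn, "not-a-uuid", "carol")
```

The same uniqueness constraint is needed for integer keys too, which is the point being made: a UUID does not remove the check, it only makes the conflict rarer when the generator behaves.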


rafaelrabeloit commented on July 16, 2024 2

I was still thinking about it... If you open your API to the public, you obviously can't let the client create the UUIDs, because you can't assume they will be generated the way you'd expect.

Idk about the database scope, if you consider the distributed case, though.

For all the other arguments, simply append a fixed-length random number to the resource id (and persist it in the database alongside the id itself, maybe as a composite key). This masks your id and the size of your database, and prevents an attacker from iterating over all your entries, e.g.:

id + 6-digit random number: 1 + 005174 = 1005174, like /user/1005174
Even if the attacker knows the length of the random number, he won't know the number itself. So he wouldn't know the value for id 2 + rand (to iterate), or for id 545684 + rand (to try to guess the database size).
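
A rough sketch of that scheme, assuming a 6-digit suffix and a plain dict standing in for the database column that stores each record's suffix. `mask_id` and `unmask_id` are hypothetical names.

```python
import secrets
from typing import Dict, Optional, Tuple

SUFFIX_DIGITS = 6
SUFFIX_RANGE = 10 ** SUFFIX_DIGITS


def mask_id(internal_id: int) -> Tuple[int, int]:
    """Return (public_id, suffix); the suffix must be stored with the record."""
    suffix = secrets.randbelow(SUFFIX_RANGE)
    return internal_id * SUFFIX_RANGE + suffix, suffix


def unmask_id(public_id: int, stored_suffixes: Dict[int, int]) -> Optional[int]:
    """Split a public id and accept it only if the suffix matches the stored one."""
    internal_id, suffix = divmod(public_id, SUFFIX_RANGE)
    if stored_suffixes.get(internal_id) != suffix:
        return None  # unknown record or a guessed suffix
    return internal_id


# The example from above: id 1 with suffix 005174 becomes 1005174.
stored = {1: 5174}
resolved = unmask_id(1005174, stored)
guessed = unmask_id(1000000, stored)

public_id, suffix = mask_id(2)
stored[2] = suffix
```

An attacker who guesses a suffix has a 1 in 10^6 chance per attempt, so this slows iteration down rather than ruling it out.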

I don't care about aesthetics, because I believe APIs are for client software and not for users, but a 36-char string seems like overkill to me.
And the thought that the collision chance increases as your database gets more and more entries makes me uneasy. At Google scale, the number of database entries must eventually cause collisions, even with something as improbable as a UUID collision...
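
To put a number on that worry, here is a small sketch using the standard birthday-bound approximation. It assumes random version 4 UUIDs with 122 random bits and a well-behaved random generator; `collision_probability` is a made-up helper name.

```python
import math

RANDOM_BITS = 122  # version 4: 128 bits minus 6 fixed version/variant bits


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one collision among n random ids.

    Birthday bound: p ~= 1 - exp(-n * (n - 1) / (2 * 2**bits)).
    """
    exponent = -n * (n - 1) / (2 * 2 ** bits)
    return -math.expm1(exponent)  # expm1 keeps precision for tiny values


p_trillion = collision_probability(10 ** 12)
p_quadrillion = collision_probability(10 ** 15)
p_half = collision_probability(2_710_000_000_000_000_000)
```

Under these assumptions the chance only reaches about one half at roughly 2.7 * 10^18 ids; with a trillion rows it stays below 10^-12. The risk is real but grows slowly, and a broken generator is the more likely cause of a collision.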


rafaelrabeloit commented on July 16, 2024

Ok, I think I'm starting to get the idea... Thanks!


pedro commented on July 16, 2024

+1!

Also worth noting: uuids provide another layer of defense if you forget to scope a query, which is a pretty common mistake that even big companies make:

http://mashable.com/2015/04/28/twitter-earnings-selerity/


bjeanes commented on July 16, 2024

An additional non-technical reason is that, as a company grows and gets the attention of competitors, numeric IDs can allow people to discover the relative size of your data based on the IDs of newly created records. Analysts often use this method to estimate how much revenue a company earns, too. UUIDs aren't the only solution here, but in the context of an API you'd need to use something other than the numeric ID either way, so a UUID is a suitable alternative, especially given the other reasons to use them.


crazytonyi commented on July 16, 2024

+1 for code design that doesn't betray its inner functionality. It's also worth mentioning that UUIDs and GUIDs follow a defined standard/algorithm and are not simply a random series of 32 hex digits:

https://en.m.wikipedia.org/wiki/Globally_unique_identifier
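
A quick illustration of that structure with Python's standard `uuid` module, using one of the UUIDs quoted earlier in this thread. The version and variant are encoded in fixed positions, and name-based (version 5) UUIDs are fully deterministic. The variable names are only for illustration.

```python
import uuid

u = uuid.UUID("1b2d9fb0-d232-49d5-9e60-334bc16d79bc")

# The first hex digit of the third group carries the version ("4" in "49d5").
version = u.version
# The top bits of the fourth group carry the variant ("9" in "9e60").
variant = u.variant

# Version 5 hashes a namespace and a name, so the same input
# always produces the same UUID.
name_based_1 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
name_based_2 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
```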


alecmev commented on July 16, 2024

Regarding aesthetics: yes, they don't matter on the API level, but then you still need routing in your client-side application, right? Let's take a user resource: the service I'm making allows duplicate usernames, while a user's email is a private piece of info, so all I'm left with is some unique identifier, and I'd prefer it to be a short number / hash (think Trello), and not 36-char-long gibberish.

IMO, this is bad (ignore the product identifier, you get the point):
[image: bad]

And this is good:
[image: good]


geemus commented on July 16, 2024

In our case at least we expect all the uuids to be generated by us, server side, so the client concerns did not matter. I also agree that not leaking information about how many things you might have is pretty incidental, not really important for most use cases (but matters to some people). Similarly, preventing an attacker from iterating is nice-to-have, but ideally you have enough other protections in place that you would be ok even if they knew keys, so again incidental benefit.

The biggest reason for us, I think, is that UUIDs make it more feasible than integers to shard later as one grows. And if you don't do it sooner rather than later, the pain/difficulty of having to switch later is pretty bad. So the hope was to head that issue off at the pass and just start with something that should work into the future. Even though each service might be able to have its own ids, any individual service might still grow to the size where sharding becomes necessary, so simply dividing things up might delay this issue, but I don't think it could reliably prevent it.

The aesthetics issue is one that bothers me as well. I don't particularly like the way they look, and they are quite long, which in some cases becomes concretely problematic rather than just ugly, for instance because of the somewhat small limit on the total size of a query string (this can be worked around by doing a POST with the info in the body, but it still seems not-great). I still felt that using something that should scale more easily (and that has some of these other nice properties) outweighed the un-aesthetic aspects in a context where IDs will mostly be written/created by computers rather than humans. If this were being exposed more in web pages, it might well be another story.

I suppose if you feel that it is likely that your dataset would never need to grow beyond the bounds of a single database it would lessen some of these pressures, but I was unwilling to make that bet.


bjeanes commented on July 16, 2024

Instagram have an interesting blog post about their ID generation. Instagram IDs are shorter, (subjectively) more aesthetically pleasing, and shard ready.

http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram

That might be an appropriate alternative for those seeking to avoid UUIDs.
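
For a feel of how such an id can be put together, here is a sketch of a time-sortable 64-bit id. The 41/13/10 bit split (milliseconds since a custom epoch, logical shard id, per-shard sequence) is assumed from the linked post and should be checked against it; the epoch value and the function names are made up for illustration.

```python
from typing import Tuple

TIMESTAMP_BITS = 41
SHARD_BITS = 13
SEQUENCE_BITS = 10
CUSTOM_EPOCH_MS = 1_293_840_000_000  # hypothetical epoch: 2011-01-01T00:00:00Z


def make_id(now_ms: int, shard_id: int, sequence: int) -> int:
    """Pack timestamp, shard id and sequence into a single integer."""
    elapsed_ms = now_ms - CUSTOM_EPOCH_MS
    if not 0 <= elapsed_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard id out of range")
    seq = sequence % (1 << SEQUENCE_BITS)  # sequence wraps around
    return (
        (elapsed_ms << (SHARD_BITS + SEQUENCE_BITS))
        | (shard_id << SEQUENCE_BITS)
        | seq
    )


def split_id(value: int) -> Tuple[int, int, int]:
    """Recover (timestamp in ms, shard id, sequence) from a packed id."""
    seq = value & ((1 << SEQUENCE_BITS) - 1)
    shard_id = (value >> SEQUENCE_BITS) & ((1 << SHARD_BITS) - 1)
    elapsed_ms = value >> (SHARD_BITS + SEQUENCE_BITS)
    return elapsed_ms + CUSTOM_EPOCH_MS, shard_id, seq


now = CUSTOM_EPOCH_MS + 1000
example_id = make_id(now, shard_id=5, sequence=7)
```

Because the timestamp sits in the highest bits, ids sort by creation time, and the shard id can be read back from the id itself.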


frankieroberto commented on July 16, 2024

Just to chip in here, I also find UUIDs to be pretty ugly, and their primary use case (allowing distributed clients to generate IDs with a very low chance of collisions) isn't one that I've really come across.

UUIDs imply (in the JSON at least) that they're strings, but they're actually 128-bit values. Whilst many databases / storage engines support UUIDs natively (e.g. Postgres does, but SQLite doesn't), that's a bit less common than storing integers, and many users of your API might just store them as strings, which is probably ok, but might not scale as well?
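
The size difference is easy to see with Python's standard `uuid` module. This is just an illustration of the two representations, not a statement about how any particular database stores them.

```python
import uuid

u = uuid.UUID("6f6f3d93-df18-495f-8de4-fa29cb2e5835")

as_text = str(u)    # canonical form: 32 hex digits plus 4 hyphens
as_bytes = u.bytes  # the underlying 128-bit value as raw bytes

# Both forms carry the same value and convert back losslessly.
from_text = uuid.UUID(as_text)
from_bytes = uuid.UUID(bytes=as_bytes)
```

The text form takes 36 characters against 16 bytes for the binary form, which is where the concern about storing them as strings comes from.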

On the other hand, 64-bit integers can't always be parsed as integers in JavaScript environments if they're above 53 bits, so Twitter always includes a string version with a _str suffix (see https://dev.twitter.com/overview/api/twitter-ids-json-and-snowflake ).
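
The precision loss can be reproduced outside JavaScript, because it comes from IEEE 754 doubles. In this sketch, Python's `json.loads` with `parse_int=float` stands in for a parser that reads every number as a double; the payload shape with `id` and `id_str` mirrors the approach described above, and the id value is arbitrary.

```python
import json

big_id = 2 ** 53 + 1  # one past the largest exactly representable range

# Send the id both as a number and as a string.
body = json.dumps({"id": big_id, "id_str": str(big_id)})

# Emulate a parser that reads all numbers as doubles.
parsed = json.loads(body, parse_int=float)

lossy_id = int(parsed["id"])      # the numeric field has been rounded
safe_id = int(parsed["id_str"])   # the string field survives intact
```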


geemus commented on July 16, 2024

Yeah, I was about to mention snowflake/twitter as another case.

Distributed id generation is definitely not part of why we wanted unique stuff. Mostly it was future-proofing and a means of having consistency; the other stuff is more peripheral. We chose it over snowflake/etc. at least in part because we use postgres, so we already had easy native support.

They are ugly though, for sure. I guess I'm just on the fence about whether that is a strong enough reason to do something more complicated, since they will mostly only be "seen" by computers. I suppose it depends on whether the API is then exposed in user-facing APIs, where uuids would be more unfortunate.

