Giter VIP home page Giter VIP logo

Comments (20)

chjj avatar chjj commented on June 2, 2024 2

@vweevers, it's still trivial to optimize it with a lookup table. I don't see "legacy" as an excuse to be lazy.

Interesting that it mentions the performance issue in the code but not the docs (where it might be the most helpful).

In any case, filing an issue to have it optimized now does nothing, because we'll never know whether we're using the optimized version or the unoptimized version. @jonkoops' approach makes more sense.

@dcousens, performance in chromium seems optimal. I'm seeing a 1.4x speedup compared to the JS impl. I don't know if I'll get around to testing firefox anytime soon as I don't have good tooling to run FF headless, but I'll start working on a revision of #349 that uses atob in the browser and Buffer.from -> toString in node.

from buffer.

jonkoops avatar jonkoops commented on June 2, 2024 1

You are correct @dcousens, that is in fact what we did for Keycloak.

from buffer.

dcousens avatar dcousens commented on June 2, 2024 1

https://nodejs.org/api/globals.html#atobdata was added in Node 16, so I think we're good if we're assuming LTS for the library (we are)

from buffer.

jonkoops avatar jonkoops commented on June 2, 2024 1

Yeah, this is also an opportunity to align the implementations between Node.js and the browser, might be a good code-cleanup. Did some quick testing, but I was unable to get things to work in a 15 min time-span (had some trouble running the tests on a browser as well).

from buffer.

chjj avatar chjj commented on June 2, 2024 1

I suspect atob and btoa will have more performance overhead than a custom base64 implementation due to the need for string preprocessing (to match node.js behavior) and the extra conversion back and forth between a binary string a buffer.

I don't have proof because I haven't benchmarked it, but the implementation in #349 is probably the fastest/best we can get.

edit: On second thought, if we optimistically parse base64 with atob and fallback to preprocessing on error, it may be faster? Something like:

try {
  return Buffer.from(atob(str), 'binary');
} catch (e) {
  return Buffer.from(atob(removeInvalidCharacters(str)), 'binary')
}

I'll have a crack at it at some point.

from buffer.

chjj avatar chjj commented on June 2, 2024 1

Here are the numbers for atob (in node.js):

$ node perf/write-base64.js
BrowserBuffer#write(8192, "base64") x 1,638 ops/sec ±0.50% (93 runs sampled)

Compared to the implementation in #349:

$ node perf/write-base64.js
BrowserBuffer#write(8192, "base64") x 24,542 ops/sec ±0.02% (97 runs sampled)

This was the code used (with the optimized version of asciiWrite):

function base64Write(buf, string, offset, length) {
  return asciiWrite(buf, atob(string), offset, length);
}

If you're scratching your head like I was, look no further than the insanely broken implementation of atob in node.js itself.

This line in particular is the problem:

    const index = ArrayPrototypeIndexOf(
      kForgivingBase64AllowedChars,
      StringPrototypeCharCodeAt(input, n));

In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.

I'll have to benchmark this in an actual web browser to get some real numbers.

from buffer.

jonkoops avatar jonkoops commented on June 2, 2024 1

So perhaps take the Buffer.from() approach for Node.js and see if the perf for atob in browsers is acceptable? Would prevent the need to roll a bunch of custom code.

from buffer.

chjj avatar chjj commented on June 2, 2024 1

Just an update: while using atob is faster (in chromium), the same can't be said for btoa.

Current optimized branch:
BrowserBuffer#write(8192, "base64") x 49,551 ops/sec ±0.17% (65 runs sampled)
BrowserBuffer#toString(8192, "base64") x 34,617 ops/sec ±0.43% (65 runs sampled)
With atob/btoa:
BrowserBuffer#write(8192, "base64") x 70,796 ops/sec ±14.85% (61 runs sampled)
BrowserBuffer#toString(8192, "base64") x 24,838 ops/sec ±1.50% (59 runs sampled)

(Note the numbers differ from the ones in my PR because I ran this on a faster machine)

We're basically doing this:

function base64Slice (buf, start, end) {
  return btoa(latin1Slice(buf, start, end))
}

The overhead of Buffer -> binary string -> base64 is one of the things I was worried about. I guess we have to make a decision whether we want to keep the base64 serialization code, or take the perf hit and use btoa.

from buffer.

jonkoops avatar jonkoops commented on June 2, 2024 1

I have to agree with @dcousens that it's not really an issue that some of these functions are slow in Node.js, it is very clear that compatibility and performance with existing popular web APIs is an afterthought there. And it doesn't really matter anyways, as this library will almost always be used in a web context.

Thanks for your hard work @chjj, I appreciate you.

from buffer.

dcousens avatar dcousens commented on June 2, 2024

@jonkoops do we actually need it for browsers in 2024? Could we simply use atob and btoa now?

from buffer.

dcousens avatar dcousens commented on June 2, 2024

@jonkoops if you want to help review #349 🧡

from buffer.

dcousens avatar dcousens commented on June 2, 2024

Using atob/btoa is strongly preferred, if at least for the first attempt for easier code review.
We can optimize afterwards.

from buffer.

jonkoops avatar jonkoops commented on June 2, 2024

In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.

Oh damn, nice find! I wonder how other run-times such as Bun and Deno compare in this regard. Did you log an issue with the Node.js team?

from buffer.

dcousens avatar dcousens commented on June 2, 2024

What is the browser performance like? We could probably still rely on atob and btoa and then log the nodejs issue upstream?

from buffer.

vweevers avatar vweevers commented on June 2, 2024

In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.

Because Buffer exists as the better alternative (from Node.js' perspective that is). The atob function is marked as Legacy:

Stability: 3 - Legacy. Although this feature is unlikely to be removed and is still covered by semantic versioning guarantees, it is no longer actively maintained, and other alternatives are available.

Which explains this comment:

// The implementation here has not been performance optimized in any way and
// should not be.

That said, you could make the argument that this hurts web compatibility, and still file an issue.

from buffer.

dcousens avatar dcousens commented on June 2, 2024

I have added a comment in nodejs/node#38433 (comment), and I suspect that's where we can leave it. We are targeting browsers, and the fact that atob and btoa is slow in node is only a loss for node, not this library.

from buffer.

dcousens avatar dcousens commented on June 2, 2024

@chjj I think we can probably say that btoa and atob should be faster and better maintained in the long-term

from buffer.

chjj avatar chjj commented on June 2, 2024

New implementation up at #349

from buffer.

chjj avatar chjj commented on June 2, 2024

I made a PR fixing atob in node.js. Let's see if anything happens.

from buffer.

chjj avatar chjj commented on June 2, 2024

Wow their linter is strict about whitespace. Okay, I'm out of energy on that. If they accept it, they accept it, if not, we'll figure something out.

from buffer.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.