Comments (20)
@vweevers, it's still trivial to optimize it with a lookup table. I don't see "legacy" as an excuse to be lazy.
Interesting that it mentions the performance issue in the code but not the docs (where it might be the most helpful).
In any case, filing an issue to have it optimized now does nothing, because we'll never know whether we're using the optimized version or the unoptimized version. @jonkoops' approach makes more sense.
@dcousens, performance in chromium seems optimal. I'm seeing a 1.4x speedup compared to the JS impl. I don't know if I'll get around to testing firefox anytime soon as I don't have good tooling to run FF headless, but I'll start working on a revision of #349 that uses atob in the browser and Buffer.from -> toString in node.
from buffer.
You are correct @dcousens, that is in fact what we did for Keycloak.
https://nodejs.org/api/globals.html#atobdata was added in Node 16, so I think we're good if we're assuming LTS for the library (we are).
Yeah, this is also an opportunity to align the implementations between Node.js and the browser, might be a good code-cleanup. Did some quick testing, but I was unable to get things to work in a 15 min time-span (had some trouble running the tests on a browser as well).
I suspect atob and btoa will have more performance overhead than a custom base64 implementation due to the need for string preprocessing (to match node.js behavior) and the extra conversion back and forth between a binary string and a buffer.
I don't have proof because I haven't benchmarked it, but the implementation in #349 is probably the fastest/best we can get.
edit: On second thought, if we optimistically parse base64 with atob and fall back to preprocessing on error, it may be faster? Something like:
try {
  return Buffer.from(atob(str), 'binary');
} catch (e) {
  return Buffer.from(atob(removeInvalidCharacters(str)), 'binary');
}
I'll have a crack at it at some point.
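The optimistic approach above could be fleshed out roughly as follows. This is a minimal sketch, not the code in #349: `cleanBase64` is a hypothetical stand-in for the `removeInvalidCharacters` placeholder, its cleaning rules (stop at the first `=`, map the URL-safe characters, drop everything else) are assumptions about what matching node.js behavior requires, and it returns a plain `Uint8Array` instead of a `Buffer` so that it runs without the library.

```javascript
// Convert a binary string (one char per byte) into a Uint8Array.
function binaryStringToBytes(bin) {
  const out = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) {
    out[i] = bin.charCodeAt(i);
  }
  return out;
}

// Hypothetical cleanup step (assumed rules, see the note above).
function cleanBase64(str) {
  let s = str
    .split('=')[0]                 // treat the first '=' as end of input
    .replace(/-/g, '+')            // base64url -> base64
    .replace(/_/g, '/')
    .replace(/[^A-Za-z0-9+/]/g, ''); // drop anything outside the alphabet
  // A single dangling character cannot encode a byte; atob would reject it.
  if (s.length % 4 === 1) {
    s = s.slice(0, -1);
  }
  return s;
}

function decodeBase64(str) {
  try {
    // Fast path: well-formed input goes straight through atob.
    return binaryStringToBytes(atob(str));
  } catch (e) {
    // Slow path: only pay for preprocessing when atob rejects the input.
    return binaryStringToBytes(atob(cleanBase64(str)));
  }
}
```

The point of the design is that well-formed input, presumably the common case, never pays for the regex passes; only input that atob rejects is cleaned and retried.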
Here are the numbers for atob (in node.js):
$ node perf/write-base64.js
BrowserBuffer#write(8192, "base64") x 1,638 ops/sec ±0.50% (93 runs sampled)
Compared to the implementation in #349:
$ node perf/write-base64.js
BrowserBuffer#write(8192, "base64") x 24,542 ops/sec ±0.02% (97 runs sampled)
This was the code used (with the optimized version of asciiWrite):
function base64Write (buf, string, offset, length) {
  return asciiWrite(buf, atob(string), offset, length)
}
If you're scratching your head like I was, look no further than the insanely broken implementation of atob in node.js itself.
This line in particular is the problem:
const index = ArrayPrototypeIndexOf(
kForgivingBase64AllowedChars,
StringPrototypeCharCodeAt(input, n));
In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.
I'll have to benchmark this in an actual web browser to get some real numbers.
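For comparison, here is a minimal sketch of the lookup-table alternative mentioned earlier in the thread: instead of scanning a list of allowed characters for every input character, index a precomputed table by char code. The names `SEXTET` and `isBase64Body` are hypothetical, and this is not the node.js source.

```javascript
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// SEXTET[code] holds the 6-bit value for that char code,
// or -1 when the character is not part of the base64 alphabet.
const SEXTET = new Int8Array(128).fill(-1);
for (let i = 0; i < ALPHABET.length; i++) {
  SEXTET[ALPHABET.charCodeAt(i)] = i;
}

// One array read per character, with no per-character search.
function isBase64Body(input) {
  for (let n = 0; n < input.length; n++) {
    const code = input.charCodeAt(n);
    if (code >= 128 || SEXTET[code] === -1) {
      return false;
    }
  }
  return true;
}
```

The table costs 128 bytes and is built once, so validation does a constant amount of work per input character.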
So perhaps take the Buffer.from() approach for Node.js and see if the perf for atob in browsers is acceptable? Would prevent the need to roll a bunch of custom code.
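A rough sketch of what that split could look like, under stated assumptions: the `isNode` check via `process.versions.node` and the function name `base64ToBytes` are illustrative choices, not anything taken from the library or from #349.

```javascript
// Assumed runtime check; other detection strategies are possible.
const isNode =
  typeof process !== 'undefined' &&
  process.versions != null &&
  typeof process.versions.node === 'string';

function base64ToBytes(str) {
  if (isNode) {
    // Node.js: use the native decoder. Buffer is a Uint8Array subclass.
    return Buffer.from(str, 'base64');
  }
  // Browser: decode to a binary string, then copy into a byte array.
  const bin = atob(str);
  const out = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) {
    out[i] = bin.charCodeAt(i);
  }
  return out;
}
```

Note that the two branches are not equivalent on malformed input: `Buffer.from` is lenient, while `atob` throws, so a real implementation would still need to reconcile that difference.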
Just an update: while using atob is faster (in chromium), the same can't be said for btoa.
Current optimized branch:
BrowserBuffer#write(8192, "base64") x 49,551 ops/sec ±0.17% (65 runs sampled)
BrowserBuffer#toString(8192, "base64") x 34,617 ops/sec ±0.43% (65 runs sampled)
With atob/btoa:
BrowserBuffer#write(8192, "base64") x 70,796 ops/sec ±14.85% (61 runs sampled)
BrowserBuffer#toString(8192, "base64") x 24,838 ops/sec ±1.50% (59 runs sampled)
(Note the numbers differ from the ones in my PR because I ran this on a faster machine)
We're basically doing this:
function base64Slice (buf, start, end) {
  return btoa(latin1Slice(buf, start, end))
}
The overhead of Buffer -> binary string -> base64 is one of the things I was worried about. I guess we have to make a decision whether we want to keep the base64 serialization code, or take the perf hit and use btoa.
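To make the two-step pipeline concrete, here is a self-contained sketch of the btoa path. `bytesToBinaryString` is a hypothetical stand-in for the library's latin1Slice, and the chunk size is an arbitrary choice; the benchmarked code may differ.

```javascript
// Step 1: bytes -> binary string (one char per byte).
// Chunked so that apply() never receives an excessively long argument list.
function bytesToBinaryString(bytes, start = 0, end = bytes.length) {
  const CHUNK = 0x1000;
  let out = '';
  for (let i = start; i < end; i += CHUNK) {
    const chunk = bytes.subarray(i, Math.min(i + CHUNK, end));
    out += String.fromCharCode.apply(null, chunk);
  }
  return out;
}

// Step 2: binary string -> base64 via the platform's btoa.
function base64SliceSketch(bytes, start, end) {
  return btoa(bytesToBinaryString(bytes, start, end));
}
```

The intermediate string is the extra cost here: every byte is first materialized as a character before btoa reads it again, whereas a direct encoder reads the bytes once.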
I have to agree with @dcousens that it's not really an issue that some of these functions are slow in Node.js; it is very clear that compatibility and performance with existing popular web APIs is an afterthought there. And it doesn't really matter anyway, as this library will almost always be used in a web context.
Thanks for your hard work @chjj, I appreciate you.
@jonkoops do we actually need it for browsers in 2024? Could we simply use atob and btoa now?
@jonkoops if you want to help review #349 🧡
Using atob/btoa is strongly preferred, at least for the first attempt, for easier code review. We can optimize afterwards.
> In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.
Oh damn, nice find! I wonder how other run-times such as Bun and Deno compare in this regard. Did you log an issue with the Node.js team?
What is the browser performance like? We could probably still rely on atob and btoa and then log the nodejs issue upstream?
> In other words, the node.js atob function runs with O(n^2) time complexity! How code this irresponsible made it into the node.js codebase is beyond me.
Because Buffer exists as the better alternative (from Node.js' perspective that is). The atob function is marked as Legacy:
Stability: 3 - Legacy. Although this feature is unlikely to be removed and is still covered by semantic versioning guarantees, it is no longer actively maintained, and other alternatives are available.
Which explains this comment:
// The implementation here has not been performance optimized in any way and
// should not be.
That said, you could make the argument that this hurts web compatibility, and still file an issue.
I have added a comment in nodejs/node#38433 (comment), and I suspect that's where we can leave it. We are targeting browsers, and the fact that atob and btoa are slow in node is only a loss for node, not this library.
@chjj I think we can probably say that btoa and atob should be faster and better maintained in the long-term.
New implementation up at #349
I made a PR fixing atob in node.js. Let's see if anything happens.
Wow their linter is strict about whitespace. Okay, I'm out of energy on that. If they accept it, they accept it, if not, we'll figure something out.