
cbor-x's Issues

OOM when decoding a stream with `bundleStrings: true`

Sample code:

const fs = require('fs')
const { EncoderStream, DecoderStream } = require('cbor-x')

const recordNum = 10000

const enc = new EncoderStream({
  bundleStrings: true,
})

const read = () => {
  console.time('READ')

  const dec = new DecoderStream({
    bundleStrings: true,
  })

  fs.createReadStream('test.cbor')
    .on('data', (c) => console.log(c.length))
    .pipe(dec)
    .on('data', () => {})
    .on('end', () => console.timeEnd('READ'))

}

enc.pipe(fs.createWriteStream('test.cbor'))
enc.on('end', () => console.timeEnd('GEN') || read())

console.log('Generating')

console.time('GEN')

const curr = Date.now()

for (let i = 0; i < recordNum; ++i) {
  enc.write({ i, str: 'TEST_STR', ts: Date.now() })
}

enc.end()

In my test, it gets stuck after the first chunk and then runs out of memory (OOM).

[Feature]: Generate blob parts to support embedding blobs (aka files)

scroll down to my 3rd comment: #57 (comment)

Original post: if I have something that needs to be read asynchronously or with a stream, can I do that?

I'm thinking of ways to best support very large Blob/Files tags...

Here is some wishful thinking:

import { addExtension, Encoder } from 'cbor-x'

let extEncoder = new Encoder()
addExtension({
	Class: Blob,
	tag: 43311, // register our own extension code (a tag code)
	encode (blob, encode) {
		const iterable = blob.stream() // returns a ReadableStream / async iterable that yields Uint8Arrays
		encode(iterable) // pass along an iterable that yields Uint8Arrays
	},
	async decode (readableByteStream) {
		const blob = await new Response(readableByteStream).blob()
		return blob
	}
})

CBOR maps generated by this module cannot be read by nlohmann/json

I am using CBOR to encode messages between my client and server in this game I'm making. I'm using this module on the server to serialize and on the client I am using nlohmann/json to parse. The client is C++ and the server is JavaScript. When I console.log the packet I am trying to send I get these bytes:

<Buffer d8 69 90 8e 63 70 6b 74 64 74 69 6d 65 66 63 6c 6f 75 64 58 66 63 6c 6f 75 64 5a 65 77 69 6e 64 58 65 77 69 6e 64 5a 62 69 64 61 78 61 79 61 7a 63 79 ... 63 more bytes>

But then when the bytes start to be parsed on the other side by nlohmann/json it fails with this message:
13:15:56+334: ERROR - Socket: Could not decode packet: [json.exception.parse_error.112] parse error at byte 1: syntax error while parsing CBOR value: invalid byte: 0xD8

I just need some help understanding how to configure this library to create valid CBOR. I've tried objectsAsMaps: true, variableMapSize: true, and useRecords: false in different combinations, and it has not changed the resulting behavior.
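
For reference, the leading bytes d8 69 are a CBOR tag (tag 105), which appears to be cbor-x's record/structure tag rather than a plain map. A minimal sketch, assuming the documented useRecords option, of making cbor-x emit plain CBOR maps that a generic decoder like nlohmann/json can read (the payload here is a placeholder, not the reporter's packet):

const { Encoder } = require('cbor-x')

// Disable record/structure tags so the output is plain, untagged CBOR maps.
const encoder = new Encoder({ useRecords: false })
const bytes = encoder.encode({ pkt: 'time', cloudX: 0, cloudZ: 0 })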

Unknown tag with map value decode issue

Given: an encoded custom tag whose value is a map like {"type": str, "value": str}.

The cbor-x decode method returns a Tag object with the proper tag number, but value == {} (an empty map).

If the value is something like ["some", "str"], tag.value is correct; it only fails for maps.

So on custom tag value decode, lists and strings are decoded properly, but maps are not.

I checked the payload with the CBOR playground to confirm it is encoded properly.

It looks like a cbor-x issue.

Streams cannot handle root level null types

The cbor-x DecoderStream will happily decode and push a null value into the object stream, causing the stream to terminate. This can create nasty edge cases where users who control the contents of CBOR sequences are able to create large files which don't properly decode for cbor-x clients.

It's not currently possible to work around this, because cbor-x doesn't list decode.js in package.json/exports, and the index.js/node.js files don't export getPosition or clearSource from Decoder, so streams cannot be implemented more robustly externally without forking/vendoring the package.

It would be great for cbor-x to provide a public interface to do partial reads, perhaps by exporting getPosition and clearSource from index/node or by adding decode.js to the package.json exports field.

DecoderStream should adopt at least one of three possible strategies to resolve this:

  1. some similar libraries offer a { wrap: true } option on their transform stream constructors, which wraps every value in the object stream in an object { value: <any> }.
  2. the stream could throw an error via this.destroy(), making it explicitly incompatible with streams that contain a root-level null. That avoids such streams appearing to end prematurely, and the messy errors that can follow from attempting to push more values after ending the stream with push(null).
  3. the stream could push Symbol.for('null') instead of a real null value whenever the next object is a literal null.

Option 1 seems most common and can support older platforms that might not have Symbol.for available, but option 3 likely has better performance, since it avoids allocating an extra object for every push. (A rough sketch of option 1 follows.)
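
A rough sketch of what option 1 could look like, written as an external object-mode transform and assuming cbor-x's decodeMultiple(buffer, forEach) helper; this simplified version also assumes each chunk contains whole CBOR items, which the real DecoderStream cannot assume:

const { Transform } = require('stream')
const { decodeMultiple } = require('cbor-x')

// Wrap every decoded value as { value }, so a decoded root-level null can
// never be confused with the push(null) end-of-stream signal.
class WrappingDecoderStream extends Transform {
  constructor(options = {}) {
    super({ ...options, objectMode: true })
  }
  _transform(chunk, _encoding, callback) {
    try {
      decodeMultiple(chunk, (value) => this.push({ value }))
      callback()
    } catch (error) {
      callback(error)
    }
  }
}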

Hidden breaking change in v0.8.3

Hi @kriszyp,

Today, I've updated the cbor-x version from 0.8.2 to 0.8.3 in my project and it caused a breaking change because of this commit: b30f65c

I could fix it by setting useTag259ForMaps: false to stay backwards compatible with 0.8.2.

So I just want to ask that, in the future, you please bump the major version number (following semver) when a release contains breaking changes.

Thanks in advance and keep up the good work!

Icebob

Bench test Node.js's own `v8.serialize`

Node.js has a built-in v8.serialize(value) that also supports circular refs and many structures, and it's basically what globalThis.structuredClone uses (I guess). It's good for Node.js <-> Node.js data cloning, but not so good for browsers (as there isn't any client-side decoder/encoder).

Could you benchmark how fast cbor-x is compared to v8.serialize and v8.deserialize?

And also compare the output sizes (compression differences)?
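
A rough sketch of the comparison being asked for, using Node's built-in v8 module; this is a quick round-trip timing and size check, not a rigorous benchmark harness:

const v8 = require('v8')
const { encode, decode } = require('cbor-x')

const sample = { id: 1, name: 'test', nested: { values: [1, 2, 3], flag: true } }
const iterations = 100000

console.time('cbor-x round trip')
for (let i = 0; i < iterations; i++) decode(encode(sample))
console.timeEnd('cbor-x round trip')

console.time('v8 round trip')
for (let i = 0; i < iterations; i++) v8.deserialize(v8.serialize(sample))
console.timeEnd('v8 round trip')

// Compare encoded sizes rather than "compression" per se.
console.log('cbor-x bytes:', encode(sample).length)
console.log('v8 bytes:', v8.serialize(sample).length)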

Explicit Map type support

For use with structured cloning, it's sometimes necessary to preserve the type difference between a JavaScript Object (with string keys) and a Map (with keys of any type).

It would be useful in some applications to be able to explicitly support Maps as a type distinct from string-keyed objects. A spec for this, including rationale, is at: https://github.com/shanewholloway/js-cbor-codec/blob/master/docs/CBOR-259-spec--explicit-maps.md

cbor-x currently can't support this use case, because it doesn't check user-supplied extension types until after it checks whether a value is a Map and applies its own generic encoding.

cbor-x should do one of:

  • allow users to register map types, and correctly use the provided encode and decode functions, by checking for extensions before falling back to its built-in Map encoding (a sketch of such a registration follows this list)
  • provide built-in support for explicit maps, via an option, like it does for Set.
  • or throw an error if a user tries to register Map as an extension type, to make it clear why it's not working and that custom map encoding is not supported.
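
A sketch of the registration the first option would enable, using the documented addExtension API. Tag 259 is taken from the linked spec; for simplicity the entries are encoded as an array of pairs here, and whether cbor-x honours an extension registered for Map is exactly what this issue is about:

const { addExtension, Encoder } = require('cbor-x')

// Register an explicit-map extension for JavaScript Map instances.
addExtension({
  Class: Map,
  tag: 259,
  encode(map, encode) {
    encode([...map.entries()]) // array of [key, value] pairs
  },
  decode(entries) {
    return new Map(entries)
  },
})

const encoder = new Encoder()
const bytes = encoder.encode(new Map([[1, 'one'], ['two', 2]]))

Note that other issues on this page mention a useTag259ForMaps encoder option, which looks like built-in support along the lines of the second bullet.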

Global namespace pollution; `stringRefs` ?

Reading decode.js it looks like there is an undeclared variable stringRefs. Am I correct that the implementation of tags 25 and 256 is incomplete?

Perhaps the following should be commented out to avoid treating stringRefs as a property of the global object.

currentExtensions[25] = (id) => {
	return stringRefs[id]
}
currentExtensions[256] = (read) => {
	stringRefs = []
	try {
		return read()
	} finally {
		stringRefs = null
	}
}
currentExtensions[256].handlesRead = true

I'll submit a PR including that if you agree.
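
A patch sketch of an alternative to commenting the handlers out: declare the shared state in module scope so the tag 25 / 256 handlers still work without creating an implicit global. currentExtensions is stubbed here only so the snippet stands alone:

// Stand-in for the decoder's internal extension table, for illustration only.
const currentExtensions = []

// Declared in module scope instead of leaking onto the global object.
let stringRefs = null

currentExtensions[25] = (id) => stringRefs[id]
currentExtensions[256] = (read) => {
	stringRefs = []
	try {
		return read()
	} finally {
		stringRefs = null
	}
}
currentExtensions[256].handlesRead = true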

Keep getting warning

I keep getting this warning:

Native extraction module not loaded, cbor-x will still run, but with decreased performance. The module '/home/kuba/projects/jcubic/lips/node_modules/cbor-extract/build/Release/cbor-extract.node'

I've tried to modify node_modules/cbor-x/node-index.js, but apart from printing the whole message I don't see any effect.

safeKey in cbor-x encode / decode

I stumbled upon this report GHSA-9c47-m6qq-7p4h and I was curious to see how cbor-x dealt with similar cases (with records support enabled).

const { encode, decode } = require('cbor-x')

const record = { foo: "bar", __proto__: { isAdmin: true } };
const enc = encode(record)
const dec = decode(enc)
expect(dec).toEqual({ foo: "bar", undefined: true })

This does not feel quite right, although the dangerous __proto__ is correctly stripped.
The encoded CBOR is d9dfff8419e0008163666f6f63626172f5 which corresponds to 57343_1([57344_1, ["foo"], "bar", true]).

I see cbor-x uses safeKey while decoding

function safeKey(key) {
	return key === '__proto__' ? '__proto_' : key // clever trick :)
}

but it seems to me there's something wrong in the encoding part (the key is stripped while the value is kept).

Crash using decode in uWebSockets.js HTTP

Node.js crashes when using cbor-x decode in a uWebSockets.js HTTP server while reading POST request data.

/* Handler for reading data from POST and such requests.
   You MUST copy the data of chunk if isLast is not true.
   We Neuter ArrayBuffers on return, making it zero length. */
res.onData((ab, isLast) => {
    console.log('HTTP payload', ab); // Payload is single chunk
    const chunk = Buffer.from(ab); // ArrayBuffer -> Buffer
    console.log('Chunk', chunk);

    const obj = decode(chunk); // Crash
    console.log('Obj', obj);
});

Trace:

HTTP payload ArrayBuffer {
  [Uint8Contents]: <7b 0a 09 22 64 22 3a 20 7b 0a 09 09 22 5f 6b 65 79 22 3a 20 22 74 65 73 74 2d 6b 65 79 31 22 0a 09 7d 0a 7d>,
  byteLength: 36
}
Chunk <Buffer 7b 0a 09 22 64 22 3a 20 7b 0a 09 09 22 5f 6b 65 79 22 3a 20 22 74 65 73 74 2d 6b 65 79 31 22 0a 09 7d 0a 7d>
FATAL ERROR: v8::FromJust Maybe value is Nothing.
 1: 0xb2e750 node::Abort() [/usr/bin/node]
 2: 0xa40252 node::FatalError(char const*, char const*) [/usr/bin/node]
 3: 0xd1f6da v8::Utils::ReportApiFailure(char const*, char const*) [/usr/bin/node]
 4: 0x7f744beaea70  [/home/user/_Dev/node/nng-proxy/node_modules/.pnpm/[email protected]/node_modules/cbor-extract/prebuilds/linux-x64/node.abi102.node]
 5: 0x7f744beae458  [/home/user/_Dev/node/nng-proxy/node_modules/.pnpm/[email protected]/node_modules/cbor-extract/prebuilds/linux-x64/node.abi102.node]
 6: 0xd7bdde  [/usr/bin/node]
 7: 0xd7d1ff v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/usr/bin/node]
 8: 0x16326b9  [/usr/bin/node]

Exception: Data read but end of buffer not reached

I am trying to decode a CBOR string. I used TextEncoder to convert it to a Uint8Array and called the decode function, and I get this error:
index.js:138 Uncaught (in promise) Error: Data read, but end of buffer not reached
at checkedRead (index.js:138)
at Object.decode (index.js:86)

Please help.
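
For context, an assumption about what "CBOR string" means here: TextEncoder produces the UTF-8 bytes of the characters themselves, so if the source is a hex (or base64) text representation of CBOR, it has to be converted to raw bytes first, e.g.:

import { decode } from 'cbor-x'

// Convert a hex text representation (assumed input format) to raw bytes,
// instead of UTF-8-encoding the hex characters with TextEncoder.
function hexToBytes(hex) {
  const bytes = new Uint8Array(hex.length / 2)
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16)
  }
  return bytes
}

console.log(decode(hexToBytes('a161616162'))) // { a: 'b' }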

better structured clone support

I just published this test page today to test out different binary packages.

Maybe this is something you wish to improve on?
(you can also click on each ❌ to see more details, and check the console for even more detail)

A summary of what I think should be supported:

  • should be able to encode/decode -0
  • should be able to encode really large BigInts: structuredClone(BigInt('0x' + 'FF'.repeat(1024)))
  • should support object-wrapped primitives, e.g. Object(2n)
  • better handle sparse arrays
  • support cloning an ArrayBuffer and returning it as an ArrayBuffer (not as a Uint8Array with an offset)
  • support circular refs: var input = {}; input.input = input and var input = [0]; input[0] = input
  • support more instances of Error classes

I don't know about Blob/Files.
I think it would be better to just encode a reference pointing to some blob index rather than trying to encode a whole file into memory. So first you would encode the structure, emit all the data, and then transfer/pipe the blobs over the wire later in your own way, kind of like how the web has a transferable list when using postMessage.

Edge case with nested records sharing the same keys

Hi there. I am having an issue with encoding custom extensions inside a record when both the wrapping record and the wrapped extension share exactly the same keys. Example (note: standard setup of extension and config of encoder omitted for readability):

class MyExtension {
  constructor(value) {
    this.key = value;
  }
}
const extensionData = new MyExtension("foo");
const recordData = { key : extensionData }
const encoded = encode(recordData); // wrong encoded CBOR: d9dfff8319e00081636b6579d9a606d9e0008163666f6f

In the above example everything works fine if we change the property key of either the record or the extension (say we use key_ext for the extension). Now the encoded byte stream is correct: d9dfff8319e00081636b6579d9a606d9dfff8319e00181676b65795f65787463666f6f

Note that this issue can be reproduced with any number of keys - as long as the enclosing record has the same keys as the enclosed object. Actually this seems to be unrelated to extensions and can be reproduced with plain records and a new Encoder that supports records:

{"key":{"key":"foo"}} => d9dfff8319e00081636b6579d9e0008163666f6f // NOT OK
{"key":{"key_alt":"foo"}} => d9dfff8319e00081636b6579d9dfff8319e00181676b65795f616c7463666f6f // OK

At first glance the issue seems to be rooted in writeObject(...) https://github.com/kriszyp/cbor-x/blob/master/encode.js#L663 where the transition is not renewed unless the enclosed object has different keys (I didn't go much farther than that as I am not sure what "transitions" are used for).

Many thanks!

Invalid input causes crash in Node

Hi,

First of all, nice work!

I am using this module to decode CBOR data, but to keep backward compatibility in my application I also want to accept JSON input.

Unfortunately, when I try to decode JSON data with cbor-x, Node crashes:
Input in hex: 7b2273657269616c6e6f223a2265343a30222c226970223a223139322e3136382e312e3335222c226b6579223a226770735f736563726574227d

FATAL ERROR: v8::FromJust Maybe value is Nothing.
 1: 0xb02cd0 node::Abort() [node]
 2: 0xa1812d node::FatalError(char const*, char const*) [node]
 3: 0xceb46a v8::Utils::ReportApiFailure(char const*, char const*) [node]
 4: 0x7f2460728a50  [/home/lefteris/code/node_modules/cbor-extract/prebuilds/linux-x64/node.abi93.node]
 5: 0x7f2460728449  [/home/lefteris/code/node_modules/cbor-extract/prebuilds/linux-x64/node.abi93.node]
 6: 0xd4709b  [node]
 7: 0xd4831a  [node]
 8: 0xd487f6 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
 9: 0x15cdef9  [node]
[1]    16663 abort (core dumped)  node index.js --log-level=info -v

I suspect it is an out-of-memory issue.

Of course, I will attempt to detect JSON vs CBOR and not pass the input through the CBOR path, but for stability and security reasons it would be best if invalid input did not cause a crash. I know that my inputs will always be small in terms of object/array size.

  1. Is there a reliable way to detect if input is CBOR? (A rough heuristic sketch follows this list.)
  2. Can the module be updated so invalid input never causes a crash? Possibly by adding an option to limit how large arrays/objects can be?
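
Regarding the first question, a rough heuristic sketch (an assumption on my part, not something cbor-x provides): JSON text starts with whitespace or one of a few ASCII characters, so peeking at the first byte can route most inputs. It is not bulletproof, since some of those bytes are also valid CBOR headers:

const { decode } = require('cbor-x')

function parsePayload(buffer) {
  const first = buffer[0]
  const looksLikeJson =
    first === 0x7b || first === 0x5b ||  // '{' or '['
    first === 0x22 ||                    // '"'
    first === 0x20 || first === 0x09 || first === 0x0a || first === 0x0d // whitespace
  return looksLikeJson ? JSON.parse(buffer.toString('utf8')) : decode(buffer)
}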

Thanks in Advance

Fails to round-trip encode/decode dates

Thank you for these libraries! Unfortunately both msgpackr and cbor-x currently fail on various dates.

For cbor-x, it seems like we start to lose milliseconds due to floating point errors for dates > 2038:

    // Expected: 2039-07-05T16:22:35.792Z
    // Received: 2039-07-05T16:22:35.791Z

    // Expected: 2038-08-06T00:19:02.911Z
    // Received: 2038-08-06T00:19:02.910Z

For msgpackr, it doesn't seem to have that same floating point error, but it does "wrap around" after 2106:

   // Expected: 2106-08-05T18:48:20.323Z
   // Received: 1970-06-29T12:20:04.323Z

   // Expected: 2110-02-18T14:51:07.995Z
   // Received: 1974-01-12T08:22:51.995Z

FWIW, this comment helped me fix the issue with cbor-x by using string encoding: kriszyp/msgpackr#41 (comment)

By encoding as ISO 8601, cbor-x seems to handle any date I throw at it. But the default behavior feels like a bug - the libraries currently don't properly encode the full range of JS date objects.

IMHO, the default behavior should be to correctly encode any date without any loss, with an option to optimize for message size. But I realize that would be a breaking change - it would be nice if the docs at least mentioned the current limitations of date encoding.
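
For reference, a sketch of the string-encoding workaround mentioned above: registering Date under the standard CBOR tag 0 (RFC 3339 date/time string) via addExtension. Whether addExtension is allowed to override the built-in Date handling is an assumption here:

const { addExtension, Encoder } = require('cbor-x')

// Encode Dates as ISO 8601 strings (tag 0) instead of the default numeric
// timestamp, trading a few extra bytes for lossless round-tripping.
addExtension({
  Class: Date,
  tag: 0,
  encode(date, encode) {
    encode(date.toISOString())
  },
  decode(isoString) {
    return new Date(isoString)
  },
})

const encoder = new Encoder()
const roundTripped = encoder.decode(encoder.encode(new Date('2106-08-05T18:48:20.323Z')))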

Remove unused esbuild from dependencies

Is the esbuild package used? If not, I guess we might want to remove it from package.json/dependencies, as its size is about 8 MB in the production image.

Console warning intended for browsers may be printed in NodeJS

Hello! Thank you for the great library!

I am seeing the following warning in NodeJS:

  console.warn
    For browser usage, directly use cbor-x/decode or cbor-x/encode modules. The argument 'filename' must be a file URL object, file URL string, or absolute path string. Received 'http://blah/node.cjs'

I have been using this library in an environment where the window object is polyfilled by a test environment (JSDOM) for Jest-based tests. For complicated reasons, we cannot really alter the Jest environment to suppress the message.

Maybe something like this could work as the guard for the warning:

// taken from https://github.com/flexdinesh/browser-or-node/blob/ae67a84b7cdc65021a198b18f16c3bdbf4b480d8/src/index.js
const isBrowser =
  typeof window !== "undefined" && typeof window.document !== "undefined";

const isNode =
  typeof process !== "undefined" &&
  process.versions != null &&
  process.versions.node != null;

isBrowser && !isNode

I don't have a solid alternative, given that names on the global scope are not reserved and window might exist in a Node.js environment 😢.

Let me know what you think!

Canonical CBOR encoding

Hey!
The speed at which this library encodes is just marvelous, awesome project!
I noticed, however, that this speed comes at the cost of binary size, since for simple "tags" cbor-x doesn't always choose the shortest possible encoding, e.g.

encode(Buffer.alloc(0)) // returns <Buffer 58 00>

while other implementations return the shorter (and canonical) <Buffer 40>.
I would love for this to be corrected; cbor-x is just too fast! 😃 But I can totally understand if this is something that's been done for speed's sake or would require major effort.

BigUint64Array is not working on iPhones

Safari on iPhones doesn't implement BigUint64Array. Using cbor-x in browsers on these devices would cause the web app to break. A similar issue can be found on ChromeLabs' jsbi project here:
tweag/asterius#792

Might there be some way to avoid this problem?

Distribute ESM

I'm looking in the dist folder but can't find any ESM module, so I can't use import in browsers...

Whenever I try to use /+esm with jsDelivr, it tries to be a smartass and imports everything that it needs... including Buffer, which I do not want/need.

https://cdn.jsdelivr.net/npm/[email protected]/dist/index.js/+esm

I do not know if it's because you are using Buffer that makes it want to import it.

I really dislike that Node.js added Buffer onto the global namespace in the first place instead of depending on it like everything else; people should really be using import { Buffer } from 'node:buffer' or an async import...

I think you can maybe circumvent this if you instead use const Buffer = globalThis.Buffer at the very top... but I'm not sure... I really wish you could just remove all of the Node.js Buffer stuff...

cbor-x isn't WebSocket friendly

this.cborEncoder = new CBOR.Encoder({ useRecords: false, useTag259ForMaps: false });

const encoded: Uint8Array = this.cborEncoder.encode(event);
const decoded = this.cborEncoder.decode(encoded);

// 'decoded' is the same as 'event' and zero errors occurred, so success!

That example works very well, but when you try to send encoded through Engine.io/Socket.io to the server, it sends the whole underlying ArrayBuffer (~8.2 kB), so I tried this:

const encoded: Uint8Array = this.cborEncoder.encode(event);
const decoded = this.cborEncoder.decode(encoded);
const encoded_trimmed = new Uint8Array(encoded.buffer.slice(encoded.byteOffset, encoded.byteLength));

But then the backend sometimes successfully decodes the message, and sometimes yells an error:

Error: Unexpected end of buffer reading string

or

Unexpected end of CBOR data
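
For what it's worth, the slice end argument in the trimming snippet above looks like the culprit: ArrayBuffer.prototype.slice takes absolute begin and end offsets, so the end must be byteOffset + byteLength, not just byteLength. A sketch of the corrected trimming, assuming encoded is the Uint8Array returned by the encoder:

// slice(begin, end) uses absolute offsets into the underlying ArrayBuffer.
const trimmed = new Uint8Array(
  encoded.buffer.slice(encoded.byteOffset, encoded.byteOffset + encoded.byteLength)
)

// Equivalent and arguably clearer: copy the view directly.
const trimmedCopy = Uint8Array.from(encoded)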

How to use with Axios?

How do I use this with Axios? I am fiddling with setting the response type to stream, arraybuffer, or blob, and trying to decode/encode in the transformRequest and transformResponse methods of the Axios client, but cbor-x is not taking the data in.
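
A minimal sketch of one way to wire this up, assuming Node-side Axios with responseType: 'arraybuffer'; the URL and payload below are placeholders:

const axios = require('axios')
const { encode, decode } = require('cbor-x')

async function postCbor(url, payload) {
  const response = await axios.post(url, encode(payload), {
    responseType: 'arraybuffer',                      // hand back raw bytes
    headers: { 'Content-Type': 'application/cbor' },
    // Axios JSON-stringifies plain objects by default; pass the encoded
    // Buffer through untouched instead.
    transformRequest: [(data) => data],
    transformResponse: [(data) => decode(new Uint8Array(data))],
  })
  return response.data
}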

bson

Hi Kris,

Your code is, tbh, quite complex. Is there any way to figure out how we could implement a fast BSON serializer/deserializer?

Decoding of CBOR encoded float number returns wrong values

Hello,

I have the float number 12.75 encoded as [0xF9, 0x4A, 0x60] using the online CBOR encoder cbor.me. Now I want to decode these 3 bytes using cbor-x.

If I run the following code

import {decode} from "cbor-x";
console.log( decode(new Uint8Array([0xF9, 0x4A, 0x60])) );

it returns Infinity instead of the number 12.75. Do you have any explanation for this or could you please fix this?

Thank you very much :)

Edit: I just checked the RFC and it states here that the shortest possible representation should always be used.
There is an example in the RFC: 0xf94580 is also decoded by cbor-x to Infinity instead of 5.5. And encoding 5.5 using cbor-x returns a Uint8Array with 5 bytes instead of the 3 bytes required by the RFC.
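
For reference, 0xF9 introduces an IEEE 754 half-precision (16-bit) float, and decoding the two payload bytes by hand gives the expected values; a minimal sketch:

// Decode an IEEE 754 binary16 value from two big-endian bytes.
function decodeHalf(hi, lo) {
  const half = (hi << 8) | lo
  const sign = half & 0x8000 ? -1 : 1
  const exponent = (half >> 10) & 0x1f
  const mantissa = half & 0x03ff
  if (exponent === 0) return sign * mantissa * 2 ** -24           // subnormal
  if (exponent === 0x1f) return mantissa ? NaN : sign * Infinity  // Inf / NaN
  return sign * (1 + mantissa / 1024) * 2 ** (exponent - 15)
}

console.log(decodeHalf(0x4a, 0x60)) // 12.75
console.log(decodeHalf(0x45, 0x80)) // 5.5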

Problems with cbor-x streaming decoding

I feel like there are a bunch of little related issues with Decoder which require an exhaustive, test-driven development approach if cbor-x is going to continue offering a streaming decoder or iterator interface in the future. I have an iterator implementation written which seems to work well aside from these underlying problems.

  • incomplete strings are decoded as a string with the correct length, padded with \u0000 null codepoints for the bytes which weren't available in the chunk being decoded. This needs to throw an incomplete error
  • multibyte varint integers, when not all the bytes needed for decoding are present, need to throw an incomplete error; instead they throw a "RangeError: Offset is outside the bounds of the DataView"
  • incomplete buffers/byte arrays, similar to strings, are padded with 0 nul bytes for the section that is missing in the chunk being processed, but need to throw an incomplete error
  • bigints with missing bytes throw the same RangeError as regular ints
  • arrays with an explicit up-front length are padded with integer 0 values when values are missing from the chunk
  • incomplete Set chunks are decoded as a plain empty object {} (so weird??)
  • partial objects sometimes decode in funky ways with "0" keys and 0 values, and sometimes decode as an empty object

#6 provides a minimal test of the above problems.

Things that still need testing:

  • what happens to an array with prefixed length, where the length itself is incomplete?
  • as above with object/map?
  • what about indefinite length objects/map, arrays, buffers, strings where there is no break stop code?
  • Is there a behaviour difference when these problems are nested deeply inside of maps, arrays, or other nesting structures?

Things to think about:

  • Is there a better approach to resumable stream decoding where we might not need to throw away all the work done to decode a chunk, only to do it again the next time a chunk arrives? When large objects come through streams like network sockets, which have relatively small packet sizes (on the order of 1.5 kB), cbor-x may end up attempting to fully decode an object dozens, hundreds, maybe thousands of times before it has enough information to return a value. It would be great to be able to hold everything that was decodable in memory, pause, and resume when another chunk comes in that can expand the buffer's length.

Error: Unknown type function

Hi again,

There is a strange error when I'm using CBOR. But the error is not consistent and I can't reproduce it in my env, only on Github Actions CI.
Do you have an idea what "Unknown type function" means?

[2021-05-18T18:03:23.335Z] WARN  node2/TRANSPORTER: Invalid incoming GOSSIP_REQ packet. Error: Unknown type function
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1188:11)
    at /home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1224:6
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1096:7)
    at /home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1224:6
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1096:7)
    at /home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1224:6
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1096:7)
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1105:8)
    at /home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1224:6
    at encode (/home/runner/work/moleculer/moleculer/node_modules/cbor-x/dist/node.cjs:1096:7)

I'm trying to isolate the problem and get more information.

cbor-x doesn't support integer keys

CBOR map keys can be numbers, but this library converts them to strings.

As a result, encode(decode(data)) != data if there's a number key in a map.

I wanted to modify some CBOR data, but this totally messed it up. (A possible workaround sketch follows the dump below.)

 [
      {
        '0': [ [Array], [Array] ],
        '1': [ [Array], [Array] ],
        '2': 571741,
        '11': <Buffer 6f d3 c7 48 fa b5 6a 25 59 ba 43 b1 2d 27 1e 36 2c d1 65 19 0d 67 27 b6 6c a3 f4 e0 31 1a d6 44>,
        '13': [ [Array] ]
      },
      {
        '0': [ [Array] ],
        '3': [],
        '5': [ [Array] ],
        '6': [
          <Buffer 59 13 3a 01 00 00 32 32 32 32 33 22 33 22 32 32 32 32 32 33 22 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 33 22 32 32 ... 4875 more bytes>
        ]
      },
      true,
      null
    ]
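
A possible workaround sketch, assuming the documented mapsAsObjects option: decoding with it disabled should yield Map instances, which keep their integer keys on re-encode:

const { Encoder } = require('cbor-x')

// Decode CBOR maps into JavaScript Map objects rather than plain objects,
// so integer keys survive a decode/encode round trip.
const codec = new Encoder({ mapsAsObjects: false, useRecords: false })

const original = codec.encode(new Map([[0, 'zero'], [11, 'eleven']]))
const decoded = codec.decode(original)   // Map { 0 => 'zero', 11 => 'eleven' }
const reencoded = codec.encode(decoded)  // keys stay numeric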

Support for cbor-x on plain v8 javascript engine

Hi @kriszyp

I am trying to use this package in an environment other than Node and the browser, one which doesn't have global or any browser/Node-specific functions.
Executing on plain V8 gives an undefined-variable error for the global object declared here: https://github.com/kriszyp/cbor-x/blob/master/decode.js#L825

It looks like the glbl object could be made a map from name => native constructor. Can we do something like this, or am I missing something?

let glbl = {
  'Uint8Array': Uint8Array,
  'Uint8ClampedArray': Uint8ClampedArray,
  'Uint16Array': Uint16Array,
  'Uint32Array': Uint32Array,
  'BigUint64Array': BigUint64Array,
  'Int8Array': Int8Array,
  'Int16Array': Int16Array,
  'Int32Array': Int32Array,
  'BigInt64Array': BigInt64Array,
  'Float32Array': Float32Array,
  'Float64Array': Float64Array,
  'RegExp': RegExp
};

TypeScript reporting issues in shipped `index.d.ts` file

Hello!

When building a project I am seeing the following compiler issues from TS:

node_modules/cbor-x/index.d.ts:1:58 - error TS2305: Module '"./decode.js"' has no exported member 'clearSource'.

1 export { Decoder, decode, addExtension, FLOAT32_OPTIONS, clearSource, roundFloat32, isNativeAccelerationEnabled } from './decode.js'
                                                           ~~~~~~~~~~~

node_modules/cbor-x/index.d.ts:7:24 - error TS2304: Cannot find name 'Options'.

7  constructor(options?: Options | { highWaterMark: number, emitClose: boolean, allowHalfOpen: boolean })
                         ~~~~~~~

node_modules/cbor-x/index.d.ts:10:24 - error TS2304: Cannot find name 'Options'.

10  constructor(options?: Options | { highWaterMark: number, emitClose: boolean, allowHalfOpen: boolean })
                          ~~~~~~~

Looking at the file, it does look like these issues are accurate since there is no Options and it seems as though decode.d.ts needs to be updated to declare and export the clearSource function.

record encoding creating duplicate define-record entries

Here's my simple repro:
https://deno-playground.mahardi.me?id=OWYwNTYyN2M

That produces:

83 d8 69 84 82 64 6e 61 6d 65 65 76 61 6c 75 65 19 69 00 63 6f 6e 65 01 d8 69 84 82 64 6e 61 6d 65 65 76 61 6c 75 65 19 69 01 63 74 77 6f 02 d9 69 01 82 65 74 68 72 65 65 03

Which inspected looks like this:
http://cbor.me/?bytes=83%20d8%2069%2084%2082%2064%206e%2061%206d%2065%2065%2076%2061%206c%2075%2065%2019%2069%2000%2063%206f%206e%2065%2001%20d8%2069%2084%2082%2064%206e%2061%206d%2065%2065%2076%2061%206c%2075%2065%2019%2069%2001%2063%2074%2077%206f%2002%20d9%2069%2001%2082%2065%2074%2068%2072%2065%2065%2003

support string objects

The structured clone algorithm supports cloning String objects and keeping the same reference.
I expected this to work similarly to globalThis.structuredClone, but it did not...

Example code:

const {Encoder} = mod
let obj = Array(1000).fill(new String('a'.repeat(1024)))

var structures = []
let encoder = new Encoder({ structuredClone: true });
// I also expect this to be serialized to a very small array buffer as the string is an instance,
// so all items in the array should only be references pointing to the same obj
let serialized = encoder.encode(obj) // size : 1051651 bytes !!!!
let copy = encoder.decode(serialized)
console.log(copy)

expected clone result

var str = new String(...)
clone = [str, str, ....n]
clone[0] === clone[1]

actual clone result

[["a","b","c"],["a","b","c"],...]

just doing this can lower the size to something like ~6000 bytes

const {Encoder} = mod
const ref = {x: 'a'.repeat(1024)} // create an object
let obj = Array(1024).fill(ref) // fill with same object

var structures = []
let encoder = new Encoder({ structuredClone: true });
let serialized = encoder.encode(obj);
let copy = encoder.decode(serialized);
console.log(serialized.byteLength)

Compilation error: Cannot find module './unpack'

I have a Typescript project with this dependency: "cbor-x": "^0.9.4"

When importing the encoder:

import { encode } from 'cbor-x/encode';

the typescript compiler outputs this error:

$ ./node_modules/.bin/tsc
node_modules/.pnpm/[email protected]/node_modules/cbor-x/encode.d.ts:2:47 - error TS2307: Cannot find module './unpack' or its corresponding type declarations.

2 export { addExtension, FLOAT32_OPTIONS } from './unpack'

Maybe the export of './unpack' does not need to be there? Or should it be exported from './decode' instead of './unpack'?

encode.d.ts

import { Decoder } from './decode'
export { addExtension, FLOAT32_OPTIONS } from './unpack'
export class Encoder extends Decoder {
	encode(value: any): Buffer
}
export function encode(value: any): Buffer

Add an option to disable the compiled reader

Hey there, I've been using cbor-x in Cloudflare Workers quite successfully for a while now, but ran into a niche error when trying to do some bulk work:

EvalError: Code generation from strings disallowed for this context
    at new Function (<anonymous>)

The Workers runtime spits this out because it disables the use of eval() and new Function(source) to avoid security issues.

I've patched the package locally (just removing the whole if (this.slowReads++ >= 3) block here), but it would be nice to have an option in the package that disables that code path (I also noticed another potential user was concerned about the security of it while researching this issue). I'm more than happy to eat the reduced performance.

Sort record keys

Currently

    const obj1 = { a: "A", b: "B", c: "C" };
    const obj2 = { c: "C", b: "B", a: "A" };

serialize to different CBOR bytes because keys are sorted differently (although the objects are deep-equal).

I'm not sure what the specs say in this regard, but it would be very useful to have the possibility (perhaps via an option flag) to get record keys sorted lexicographically so that equal objects would produce the same octet stream. This could be useful in a number of contexts - e.g. signatures, hashing...

It could be something as trivial as

Object.keys(record)
  .sort()
  .reduce((obj, key) => {
    obj[key] = record[key];
    return obj;
  }, {});

Encoder options for record extension

Thank you so much for this awesome library (and for the integration with the lmdb package).

I have a question regarding Encoder setup with records / addExtension (which works great) and serialization of Map objects
Basically I want to be serialize / deserialize custom records (= classes) preserving Map instances and not converting them into plain objects. For this to work I have to explicitly pass { mapsAsObjects: true } to the Encoder used by extensions (which has the record extension enabled automatically as the docs say) otherwise Map objects are serialized as plain objects. The docs say "This is disabled by default if useRecords is enabled (Maps are preserved since they are distinct from records)" which seem to imply that useRecords preserves Maps by default, but that does not seem the case. Also, it seems counter-intuitive to set { mapsAsObjects: true } to produce the effect of not converting Map to objects (maybe it's a naming issue?). That made me think there might be some option-handling issue. Lastly, the docs state that "structured cloning also enables preserving certain typed objects like Error, Set, RegExp and TypedArray instances, using registered CBOR tag extensions". However, this does not seem to be needed to encode Set etc. with a new Encoder (I was trying to figure out if enabling this option was going to preserve Map but it doesn't) even though the option is not set at all, AFAICT.

Many thanks!

No encoding support for 64 bit format integers

Looking at encode.js, it appears that there is no support for 64-bit integers. If encoding a large integer value (greater than 0xFFFFFFFF), the encoding logic does not use the option of a 9-byte head.

This means, for example, that Number.MAX_SAFE_INTEGER (0x1FFFFFFFFFFFFF on 64-bit platforms) cannot be round-tripped.

Restrict decoding

How can I restrict decoding a message in cases where the encoded message contains values of types beyond what regular JSON supports (like ES6 Map/Set or similar), and discard them in the decoded result, for example by setting them to undefined?

Besides, very impressive work!

Is there a registered media type for when using the structured clone extension

Discussed in #65

Originally posted by lmaccherone January 16, 2023
I have noticed that if you encode with the structured clone extension but attempt to decode without it, cbor-x throws an error. I'm building an API where every response body is encoded in cbor-x with the structured clone extension, and I wonder how best to tell clients precisely what encoding is being used. RFC 6839 says you should not use an unregistered suffix like application/vnd.my-app+cbor-sc, so for now I'm just using application/vnd.my-app+cbor and giving good documentation. I'm also catching the error thrown by cbor-x and explicitly mentioning that you need to use the structured clone extension in my error response. Still, I wonder if there is a valid way to specify it in the Content-Type header.
