Comments (5)
Hi @acromarco. Reading data across compatible schemas requires a resolver. See #383 (comment) for more information and an example.
from avsc.
> Reading data across compatible schemas requires a resolver.
Thank you for your answer! Somehow I thought/expected that "basic schema evolution" works out of the box.
For me, the documentation at https://github.com/mtth/avsc/wiki/Advanced-usage#schema-evolution created the impression that creating a resolver is only needed in special cases or to improve performance.
Would it be technically possible for Avro to allow schema evolution without creating resolvers?
When I create a resolver, I can now read data written with the old schema, but no longer data written with the new schema:
// typeVersion1 has a single field `name`; typeVersion2 adds `newField`
// with default 'myDefault'. `buf` was written with typeVersion1.
const resolver = typeVersion2.createResolver(typeVersion1);

// works fine now, cool :-) !
const deSerialized2 = typeVersion2.fromBuffer(buf, resolver);
expect(deSerialized2).toEqual({ ...dummyObjectToSerialize, newField: 'myDefault' });

const dummyObjectToSerialize2 = { name: 'Albert', newField: 'myValue' };
const buf2 = typeVersion2.toBuffer(dummyObjectToSerialize2);

// works fine
const deSerialized3 = typeVersion2.fromBuffer(buf2);
expect(deSerialized3).toEqual(dummyObjectToSerialize2);

// throws "trailing data" error :-(
const deSerialized4 = typeVersion2.fromBuffer(buf2, resolver);
expect(deSerialized4).toEqual(dummyObjectToSerialize2);
Reading a buffer written with the new schema using the resolver results in a "trailing data" error :-(.
So, what is the recommended way to decode a buffer that may have been written with any of several schema versions?
Is it necessary to add a kind of "schema-version" field in order to decide whether to use a resolver or not?
This could get messy after a few iterations of schema evolution.
Also, what should I do when the reader doesn't know about the new schema? Imagine an old client that tries to read data written with a new, extended schema:
// throws "trailing data" error :-(
const deSerialized5 = typeVersion1.fromBuffer(buf2); // buf2 was written with the new, extended schema
expect(deSerialized5).toEqual(dummyObjectToSerialize);
This also fails with the "trailing data" error. I would expect this to work automatically because all required fields are present in the data. How should an old client know about a new schema version in order to create resolvers?
Sorry for all the "dumb" questions. I'm new to Avro and maybe my expectations are wrong.
Is the following explanation correct?
Decoding Avro-encoded data requires knowing exactly the schema that was used for encoding.
This is a consequence of Avro's compact binary format: the encoded data doesn't contain enough structural information or metadata to allow decoding with a slightly different (compatible) schema, such as one with additional optional fields.
Therefore, a decoding client must create a resolver from the encoding (writer) schema and its own compatible (reader) schema.
In practice this means that, to support reading data from multiple and possibly unknown compatible schemas, the Avro-encoded data must be accompanied either by the encoding schema itself, or by a schema version together with a way to look up the corresponding encoding schema (e.g. a schema registry). Such a schema version must be provided outside the actual Avro-encoded data, because otherwise there would be no way to read it.
Yes, that's right.
Thank you!
I will close this issue now, as it was never a bug but just my misunderstanding of how Avro works.
However, maybe the documentation could be made more foolproof in the future by:
- Making it clearer that deserialization requires exactly the same schema that was used for serialization (or a resolver built from it).
- Adding examples that are not mixed with optimization concerns. The current section https://github.com/mtth/avsc/wiki/Advanced-usage#schema-evolution was confusing for me in this regard.