
serde-diff

A small helper that can

  1. Serialize the fields that differ between two values of the same type
  2. Apply previously serialized field differences to other values of the same type.

The SerdeDiff trait impl can serialize field paths recursively, greatly reducing the amount of data that needs to be serialized when only a small part of a struct/enum has changed.
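To make the idea concrete without pulling in the crate, here is a minimal std-only sketch of field-level diffing. The names `Player`, `Change`, `diff` and `apply` are hypothetical and are not part of serde-diff's API; the real crate derives equivalent logic and writes the changes through a serde serializer.

```rust
// Hypothetical, std-only illustration of field-level diffing.
// None of these names come from serde-diff itself.
#[derive(Debug, Clone, PartialEq)]
struct Player {
    health: u32,
    score: u32,
}

/// One recorded change: which field differs, and its new value.
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Health(u32),
    Score(u32),
}

/// Record only the fields whose values differ between `old` and `new`.
fn diff(old: &Player, new: &Player) -> Vec<Change> {
    let mut changes = Vec::new();
    if old.health != new.health {
        changes.push(Change::Health(new.health));
    }
    if old.score != new.score {
        changes.push(Change::Score(new.score));
    }
    changes
}

/// Apply recorded changes to any other value of the same type.
/// Fields that were not part of the diff are left untouched.
fn apply(changes: &[Change], target: &mut Player) {
    for change in changes {
        match change {
            Change::Health(v) => target.health = *v,
            Change::Score(v) => target.score = *v,
        }
    }
}

fn main() {
    let old = Player { health: 100, score: 5 };
    let new = Player { health: 100, score: 8 };
    let mut target = Player { health: 40, score: 0 };

    let changes = diff(&old, &new);
    apply(&changes, &mut target);
    println!("{:?} -> {:?}", changes, target);
}
```

Only `score` is recorded, so applying the diff leaves the target's own `health` value alone.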


Usage

On a struct or enum:

#[derive(SerdeDiff, Serialize, Deserialize)]

Serialize & apply differences for various formats:

rmp_serde (MessagePack - very small messages)

let msgpack_data = rmp_serde::to_vec_named(&Diff::serializable(&old, &new))?;
let mut deserializer = rmp_serde::Deserializer::new(msgpack_data.as_slice());
Apply::apply(&mut deserializer, &mut target)?;

bincode (very fast serialize/deserialize)

let bincode_data = bincode::serialize(&Diff::serializable(&old, &new))?;
bincode::config().deserialize_seed(Apply::deserializable(&mut target), &bincode_data)?;

serde_json

let json_data = serde_json::to_string(&Diff::serializable(&old, &new))?;
let mut deserializer = serde_json::Deserializer::from_str(&json_data);
Apply::apply(&mut deserializer, &mut target)?;

Built-in type support

  • Primitive types
  • std::time::{Duration, SystemTime}
  • IP addresses in std
  • Vec
  • HashMap (thanks @milkey-mouse)
  • BTreeMap (thanks @milkey-mouse)
  • Fixed-size arrays (thanks @Boscop)
  • Tuples (thanks @Boscop)

Simple example

Cargo.toml

[dependencies]
serde-diff = "0.3"
serde = "1"
serde_json = "1" # all serde formats are supported, serde_json is shown in this example

main.rs

use serde_diff::{Apply, Diff, SerdeDiff};
use serde::{Serialize, Deserialize};
#[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug)]
struct TestStruct {
    a: u32,
    b: f64,
}

fn main() {
    let old = TestStruct {
        a: 5,
        b: 2.,
    };
    let new = TestStruct {
        a: 8, // Differs from old.a, will be serialized
        b: 2.,
    };
    let mut target = TestStruct {
        a: 0,
        b: 4.,
    };
    let json_data = serde_json::to_string(&Diff::serializable(&old, &new)).unwrap();
    let mut deserializer = serde_json::Deserializer::from_str(&json_data);
    Apply::apply(&mut deserializer, &mut target).unwrap();


    let result = TestStruct {
        a: 8,
        b: 4.,
    };
    assert_eq!(result, target);
}

Derive macro attributes

Opaque structs:

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq)]
#[serde_diff(opaque)] // opaque structs are serialized as a unit and fields do not need to implement SerdeDiff
struct DoesNotRecurse {
    value: ExternalType, 
}

Opaque fields:

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq)]
struct WrapperStruct {
    #[serde_diff(opaque)]
    value: ExternalType, // opaque fields only need to implement Serialize + Deserialize + PartialEq,
}

Skip fields:

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq)]
struct WrapperStruct {
    #[serde_diff(skip)]
    value: ExternalType,
}

Generics:

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug)]
struct GenericStruct<T>
where
    T: SerdeDiff,
{
    a: T,
}

Enums:

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug)]
enum TestEnum {
    Structish { x: u32, y: u32 },
    Enumish(i32, i32, i32),
    Unitish,
}

Contribution

All contributions are assumed to be dual-licensed under MIT/Apache-2.

License

Distributed under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE and LICENSE-MIT.

serde-diff's People

Contributors

aclysma, atul9, boscop, byron, cleancut, crzysdrs, kabergstrom, milespossing, milkey-mouse, timonpost


serde-diff's Issues

serde-diff panics on first diff msg upon ws reconnect

I'm sending diffs to my frontend via websockets, and I auto-reconnect to the backend because I'm often restarting it during development.
Normal msgs are received perfectly fine after the reconnect but serde-diff panics on the first diff msg it receives after a ws reconnect.
It seems to be happening in the deserializing code for Option:

output @ frontend.js:175
frontend.js:476 Uncaught RuntimeError: unreachable
    at __rust_start_panic (wasm-function[7100]:0x1fae53)
    at rust_panic (wasm-function[7094]:0x1fabb0)
    at _ZN3std9panicking20rust_panic_with_hook17hb92129b68ed9b10fE (wasm-function[7089]:0x1fa8d4)
    at _ZN3std9panicking11begin_panic17h742ec3e2a1b90089E (wasm-function[2101]:0xd142f)
    at _ZN71_$LT$core..option..Option$LT$T$GT$$u20$as$u20$serde_diff..SerdeDiff$GT$5apply17h1c3c59d79ddb2975E (wasm-function[637]:0x304c2)
    at _ZN64_$LT$$u5b$T$u3b$$u20$_$u5d$$u20$as$u20$serde_diff..SerdeDiff$GT$5apply17hde05dc65b8d09068E (wasm-function[636]:0x2f81b)
    at _ZN77_$LT$core_types..views..SessionBoardView$u20$as$u20$serde_diff..SerdeDiff$GT$5apply17hd9c10f0374c45c18E (wasm-function[2098]:0xcff1e)
    at _ZN65_$LT$serde_diff..Apply$LT$T$GT$$u20$as$u20$serde..de..Visitor$GT$9visit_seq17ha60d341d7bbe4841E (wasm-function[1171]:0x8c09c)
    at _ZN94_$LT$$RF$mut$u20$bincode..de..Deserializer$LT$R$C$O$GT$$u20$as$u20$serde..de..Deserializer$GT$17deserialize_tuple17he6c1b01e599ea424E (wasm-function[3041]:0x11ae57)
    at _ZN94_$LT$$RF$mut$u20$bincode..de..Deserializer$LT$R$C$O$GT$$u20$as$u20$serde..de..Deserializer$GT$15deserialize_seq17hcb00bbed63102822E (wasm-function[3040]:0x11adb4)

__cargo_web_snippet_dc2fd915bd92f9e9c6a3bd15174f1414eee3dbaf @ frontend.js:775
stdweb::webcore::initialization::initialize::{{closure}}
std::panicking::rust_panic_with_hook
std::panicking::begin_panic
<core::option::Option<T> as serde_diff::SerdeDiff>::apply
<[T; _] as serde_diff::SerdeDiff>::apply
<core_types::views::SessionBoardView as serde_diff::SerdeDiff>::apply
<serde_diff::Apply<T> as serde::de::Visitor>::visit_seq
<&mut bincode::de::Deserializer<R,O> as serde::de::Deserializer>::deserialize_tuple
<&mut bincode::de::Deserializer<R,O> as serde::de::Deserializer>::deserialize_seq
<serde_diff::Apply<T> as serde::de::DeserializeSeed>::deserialize
bincode::internal::deserialize_from_custom_seed
bincode::internal::deserialize_seed
<frontend::app::Model as yew::html::Component>::update
<yew::html::scope::UpdateComponent<COMP> as yew::scheduler::Runnable>::run
yew::scheduler::Scheduler::put_and_try_run
yew::html::scope::Scope<COMP>::update
yew::html::scope::Scope<COMP>::send_message
yew::html::ComponentLink<COMP>::callback::{{closure}}
yew::callback::Callback<IN>::emit
yew::services::websocket::WebSocketService::connect::{{closure}}
<F as stdweb::webcore::callfn::CallMut<(A12,)>>::call_mut

Reduce serialisation size

Thanks for this awesome crate. It is exactly what I need for serializing entity changes before sending them over the network. I have one big concern, though: the size of the JSON it outputs.

Are there any plans to improve this in the future? I am willing to invest some time in this.

Currently, we have those tag names:

enum DiffCommandField {
    Enter,
    Value,
    Remove,
    AddKey,
    EnterKey,
    RemoveKey,
    Exit,
}

pub enum DiffPathElementValue<'a> {
    /// A struct field
    #[serde(borrow)]
    Field(Cow<'a, str>),
    FieldIndex(u16),
    CollectionIndex(usize),
    AddToCollection,
}

These tag names end up in the final serialized output, which I think is completely unnecessary.

I am thinking about solutions to make those fields smaller or allow the user to specify their preference. For example, we can use the enum tag index instead of enum tag name. This would fit into a single byte.

Do you have any idea on how to make this smaller and more compact?
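To put rough numbers on the suggestion, here is a std-only sketch comparing name-based and index-based tags. The variant names mirror the `DiffCommandField` enum quoted above; `CommandTag`, `encode_names` and `encode_indices` are hypothetical helpers for illustration and say nothing about how any particular serde format lays out enum tags.

```rust
// Hypothetical sketch: the variant names mirror the `DiffCommandField`
// enum quoted above; the two encoders are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CommandTag {
    Enter,
    Value,
    Remove,
    AddKey,
    EnterKey,
    RemoveKey,
    Exit,
}

impl CommandTag {
    fn name(self) -> &'static str {
        match self {
            CommandTag::Enter => "Enter",
            CommandTag::Value => "Value",
            CommandTag::Remove => "Remove",
            CommandTag::AddKey => "AddKey",
            CommandTag::EnterKey => "EnterKey",
            CommandTag::RemoveKey => "RemoveKey",
            CommandTag::Exit => "Exit",
        }
    }
}

/// Encode each tag as a one-byte length followed by its name.
fn encode_names(tags: &[CommandTag]) -> Vec<u8> {
    let mut out = Vec::new();
    for tag in tags {
        let name = tag.name().as_bytes();
        out.push(name.len() as u8);
        out.extend_from_slice(name);
    }
    out
}

/// Encode each tag as its variant index: exactly one byte per tag.
fn encode_indices(tags: &[CommandTag]) -> Vec<u8> {
    tags.iter().map(|tag| *tag as u8).collect()
}

fn main() {
    let tags = [
        CommandTag::Enter,
        CommandTag::Enter,
        CommandTag::Value,
        CommandTag::Exit,
    ];
    println!("names:   {} bytes", encode_names(&tags).len());
    println!("indices: {} bytes", encode_indices(&tags).len());
}
```

For this four-command stream the name encoding needs 23 bytes for the tags alone, while the index encoding needs 4.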

Apply Option difference to `None` target doesn't work as expected

Example to reproduce the issue

use serde::{Deserialize, Serialize};
use serde_diff::{Apply, Diff, SerdeDiff};

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug, Clone)]
struct TestStruct {
    a: Option<i32>,
}

fn main() {
    let old = TestStruct { a: Some(5) };
    let new = TestStruct { a: Some(10) };
    let mut target = TestStruct { a: None };
    let expected = new.clone();

    let diff = Diff::serializable(&old, &new);
    let json_diff = serde_json::to_string(&diff).unwrap();
    let mut deserializer = serde_json::Deserializer::from_str(&json_diff);

    Apply::apply(&mut deserializer, &mut target).unwrap();

    assert_eq!(target, expected);
}

Expected result

For target to equal TestStruct { a: Some(10) }

Actual result

Target equals TestStruct { a: None }

Broken Build

The build appears to be broken on both Ubuntu 20.04 and macOS Big Sur.
Build Logs:

โฏ cargo b
   Compiling serde-diff v0.4.0 (/home/miles/Dev/repos/serde-diff)
error[E0603]: module `export` is private
   --> src/difference.rs:400:37
    |
400 |                 let value = &serde::export::from_utf8_lossy(value);
    |                                     ^^^^^^ private module
    |
note: the module `export` is defined here
   --> /home/miles/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.119/src/lib.rs:275:5
    |
275 | use self::__private as export;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0603`.
error: could not compile `serde-diff`

To learn more, run the command again with --verbose.

It would seem that this was caused by this commit. I'm proposing changing export to __private. It does seem that it isn't the developer's intent that these methods be externally available, but for now this will fix the build.
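A possible alternative that avoids serde's private module altogether: the standard library has `String::from_utf8_lossy`, which performs a lossy UTF-8 conversion. The sketch below assumes the call site only needs to turn a byte slice into text and that `std` is available; `bytes_to_text` is a hypothetical helper name, and I have not checked this against the rest of src/difference.rs.

```rust
use std::borrow::Cow;

/// Lossily convert raw bytes to text using only the standard library.
/// Invalid UTF-8 sequences are replaced with U+FFFD.
fn bytes_to_text(value: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(value)
}

fn main() {
    // Valid UTF-8 is borrowed as-is.
    println!("{}", bytes_to_text(b"field_name"));
    // An invalid byte is replaced rather than causing an error.
    println!("{}", bytes_to_text(&[b'a', 0xFF, b'b']));
}
```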

Problems understanding error: `trailing characters`

The error occurs on the Apply::apply line. My guess is that the deserializer, which is fed the output of serde_json::to_string(&Diff::serialize...), receives invalid JSON. I've uploaded the JSON and a small reproduction to https://github.com/SirWindfield/serde-diff-error. The error occurs on line 62:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("trailing characters", line: 1, column: 552)', src/main.rs:62:36
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Derived "apply" implementation for enums is broken

When diffing an enum like:

#[derive(SerdeDiff, Serialize, Deserialize, Debug, PartialEq, Clone)]
enum Value {
    Str(String),
    Int(i32),
}

Let's say we have the diff Value::Str("A") -> Value::Str("B").

The diff implementation will produce the following sequence of commands:

[{"Enter":{"EnumVariant":"Str"}},{"Enter":{"FieldIndex":0}},{"Value":"B"},"Exit"]

However, the apply implementation does not consume the final Exit command, causing the rest of the command stream to get out of sync.

This is because of this code:

fn apply<'de, A>(
    &mut self,
    seq: &mut A,
    ctx: &mut serde_diff::ApplyContext,
) -> Result<bool, <A as serde_diff::_serde::de::SeqAccess<'de>>::Error>
where
    A: serde_diff::_serde::de::SeqAccess<'de>,
{
    let mut __changed__ = false;
    match (self, ctx.next_path_element(seq)?) {
        (this, Some(serde_diff::DiffPathElementValue::FullEnumVariant)) => {
            ctx.read_value(seq, this)?;
            __changed__ = true;
        }
        #(#apply_match_arms)*
        _ => ctx.skip_value(seq)?,
    }
    Ok(__changed__)
}

Specifically it's because next_path_element is only called once, instead of being called until it finds an Exit command. It makes sense to only call it once because an enum can only have one variant, but since the variant still counts as a path segment, an additional Exit should be consumed.
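The desync can be reproduced with a small std-only model of the command stream. Everything here (`Command`, `apply_enum`, `two_diffs`, `apply_both`) is a hypothetical stand-in for the generated code, simplified to the one-field variants from the example above; it only illustrates why the trailing Exit has to be consumed.

```rust
// Hypothetical, std-only model of the command stream described above.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    EnterVariant(&'static str),
    EnterField(usize),
    Value(String),
    Exit,
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i32),
}

/// Apply one enum diff from `stream`. When `consume_exit` is true, the
/// `Exit` that closes the variant's path segment is consumed as well.
fn apply_enum(
    target: &mut Value,
    stream: &mut impl Iterator<Item = Command>,
    consume_exit: bool,
) {
    let variant = match stream.next() {
        Some(Command::EnterVariant(name)) => name,
        _ => return, // Not positioned at an enum diff: nothing is applied.
    };
    // Expect the field path element, followed by the new value.
    if stream.next() != Some(Command::EnterField(0)) {
        return;
    }
    if let Some(Command::Value(raw)) = stream.next() {
        match variant {
            "Str" => *target = Value::Str(raw),
            "Int" => {
                if let Ok(n) = raw.parse::<i32>() {
                    *target = Value::Int(n);
                }
            }
            _ => {}
        }
    }
    if consume_exit {
        // The variant counts as a path segment, so its `Exit` is ours to read.
        let _ = stream.next();
    }
}

/// Two enum diffs back to back: Str("A") -> Str("B"), then Int(1) -> Int(7).
fn two_diffs() -> Vec<Command> {
    vec![
        Command::EnterVariant("Str"),
        Command::EnterField(0),
        Command::Value("B".to_string()),
        Command::Exit,
        Command::EnterVariant("Int"),
        Command::EnterField(0),
        Command::Value("7".to_string()),
        Command::Exit,
    ]
}

/// Apply both diffs in sequence from one shared stream.
fn apply_both(consume_exit: bool) -> (Value, Value) {
    let mut first = Value::Str("A".to_string());
    let mut second = Value::Int(1);
    let mut stream = two_diffs().into_iter();
    apply_enum(&mut first, &mut stream, consume_exit);
    apply_enum(&mut second, &mut stream, consume_exit);
    (first, second)
}

fn main() {
    println!("without consuming Exit: {:?}", apply_both(false));
    println!("consuming Exit:         {:?}", apply_both(true));
}
```

Without consuming the Exit, the second apply starts on the leftover Exit and never updates its target; with it, both targets are updated.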

Split up lib.rs

I had an idea to split up lib.rs. For this I had the following structure in mind:

| src
| --- lib.rs
| --- difference.rs
| --- apply.rs
| --- implementation.rs
| --- counting-serialize.rs

The current lib.rs is quite big; it would be easier to read if the code were divided into multiple files.

What do you think? Would you mind if I opened a PR for this?

SerdeDiff uses generic parameter `A`.

SerdeDiff derive macro uses generic parameter A.
It might be better to rename it to __A, __FirstParam, or something similar, to free up this name for end users.
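For context, as far as I know current compilers reject a method-level generic parameter that reuses the name of an impl-level one (error E0403), which is why a user type with its own `A` parameter would clash. The sketch below shows the renamed form compiling alongside a user-defined `Wrapper<A>`; the `Visit` trait and `Wrapper` type are hypothetical stand-ins, not the code the derive macro generates.

```rust
/// Hypothetical stand-in for a trait whose method has its own generic
/// parameter, named `__A` so it cannot clash with user-chosen names.
trait Visit {
    fn visit<__A: Iterator<Item = u32>>(&mut self, seq: &mut __A) -> bool;
}

/// A user type that is free to call its own parameter `A`.
struct Wrapper<A> {
    inner: A,
}

impl<A: From<u32>> Visit for Wrapper<A> {
    fn visit<__A: Iterator<Item = u32>>(&mut self, seq: &mut __A) -> bool {
        match seq.next() {
            Some(value) => {
                self.inner = A::from(value);
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut wrapper = Wrapper { inner: 0u64 };
    let mut values = vec![7u32].into_iter();
    let changed = wrapper.visit(&mut values);
    println!("changed = {}, inner = {}", changed, wrapper.inner);
}
```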

Enum variants are serialized as strings even when FieldPathMode == Index

let x = QueryResult::Test(1);
let y = QueryResult::Test(2);

let diff = serde_diff::Config::new()
    .with_field_path_mode(FieldPathMode::Index)
    .serializable_diff(&x, &y);
let z = ron::to_string(&diff).unwrap();
println!("{}", z);

Expected output:

[Enter(EnumVariantIndex(1)),Enter(FieldIndex(0)),Value(2),Exit]

Actual output:

[Enter(EnumVariant("Test")),Enter(FieldIndex(0)),Value(2),Exit]

This crate looks unmaintained

Looking at the issues, PR and latest commit, this crate looks unmaintained, which is a pity.

Is there a fork that is maintained?

Unknown length error with rmp-serde

I use the rmp-serde crate for serialisation. When I serialise a diff, I get this error:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: UnknownLength'

If None is passed as the length to serialize_seq, rmp-serde returns this error. See:

https://github.com/3Hren/msgpack-rust/blob/master/rmp-serde/src/encode.rs#L516

This is what serde-diff does:
https://github.com/amethyst/serde-diff/blob/master/src/lib.rs#L403

When serde-diff calls unwrap() on that result, it panics. With bincode, the same code does not panic.

use serde_diff::{Apply, Diff, SerdeDiff, FieldPathMode, Config};
use serde::{Serialize, Deserialize};

#[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug)]
struct TestStruct {
    a: u32,
    b: f64,
}

fn main () {
    let old = TestStruct {
        a: 5,
        b: 2.,
    };
    let new = TestStruct {
        a: 8, // Differs from old.a, will be serialized
        b: 2.,
    };

    let diff = Config::new()
        .with_field_path_mode(FieldPathMode::Index)
        .serializable_diff(&old, &new);

    let msgpack_data = rmp_serde::to_vec(&diff).unwrap();
//    let bincode_data = bincode::serialize(&diff).unwrap();
}
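One general way to satisfy a format that needs the sequence length up front is a two-pass approach: count the entries first, then write them. The std-only sketch below shows that pattern with a toy length-prefixed encoding. `EncodeError`, `begin_seq`, `changed` and `encode_diff` are hypothetical names, and this is not a description of how serde-diff or rmp-serde are implemented.

```rust
/// Toy error type standing in for a format's "length required" failure.
#[derive(Debug, PartialEq)]
enum EncodeError {
    UnknownLength,
}

/// Toy sequence header: the format must know the length in advance.
/// Assumes fewer than 256 entries to keep the sketch short.
fn begin_seq(out: &mut Vec<u8>, len: Option<usize>) -> Result<(), EncodeError> {
    match len {
        Some(n) => {
            out.push(n as u8);
            Ok(())
        }
        None => Err(EncodeError::UnknownLength),
    }
}

/// Lazily yield (index, new value) for every position that differs.
fn changed<'a>(old: &'a [u8], new: &'a [u8]) -> impl Iterator<Item = (u8, u8)> + 'a {
    old.iter()
        .zip(new.iter())
        .enumerate()
        .filter(|(_, (o, n))| o != n)
        .map(|(i, (_, n))| (i as u8, *n))
}

/// Two passes: count the changes first, then emit them after the header.
fn encode_diff(old: &[u8], new: &[u8]) -> Result<Vec<u8>, EncodeError> {
    let count = changed(old, new).count();
    let mut out = Vec::new();
    begin_seq(&mut out, Some(count))?;
    for (index, value) in changed(old, new) {
        out.push(index);
        out.push(value);
    }
    Ok(out)
}

fn main() {
    println!("{:?}", encode_diff(&[5, 2, 9], &[8, 2, 9]));
    println!("{:?}", begin_seq(&mut Vec::new(), None));
}
```

The trade-off is that the diff is computed twice, in exchange for never buffering the commands in memory.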

Incorrect handling of map like structures

See the attached test, which exercises a map-like structure (the patch below adds a HashMap case to test_option).

running 4 tests
test tests::test_targeted ... ok
test tests::test_array ... ok
test tests::test_tuple ... ok
test tests::test_option ... FAILED

failures:

---- tests::test_option stdout ----
thread 'tests::test_option' panicked at 'unexpected DiffCommand Value or Remove', src/apply.rs:96:49
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::test_option

test result: FAILED. 3 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass '--lib'
diff --git a/src/tests.rs b/src/tests.rs
index 082846b..83d117c 100644
--- a/src/tests.rs
+++ b/src/tests.rs
@@ -1,7 +1,9 @@
 use crate as serde_diff;
 use crate::{Apply, Diff, SerdeDiff};
 use serde::{Deserialize, Serialize};
+use std::collections::HashMap;
 use std::fmt::Debug;
+use std::iter::FromIterator;
 
 #[derive(SerdeDiff, Serialize, Deserialize, PartialEq, Debug, Copy, Clone)]
 struct TestStruct {
@@ -59,6 +61,18 @@ fn test_option() {
         Some(TestStruct { a: 52, b: 32. }),
         Some(TestStruct { a: 42, b: 12. }),
     );
+    roundtrip(
+        HashMap::from_iter([
+            (1, TestStruct { a: 1, b: 1. }),
+            (2, TestStruct { a: 2, b: 2. }),
+            (3, TestStruct { a: 3, b: 3. }),
+        ]),
+        HashMap::from_iter([
+            (1, TestStruct { a: 1, b: 1. }),
+            (3, TestStruct { a: 4, b: 4. }),
+            (4, TestStruct { a: 1, b: 1. }),
+        ]),
+    );
 
     partial(
         Some(TestStruct { a: 5, b: 2. }),
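For reference, here is the round-trip behaviour the added test expects, as a std-only sketch: keys present only in the old map are removed, keys present only in the new map are added, and shared keys with different values are updated. `MapCommand`, `diff_map` and `apply_map` are hypothetical names loosely modelled on the map-related tags mentioned elsewhere on this page; they are not the crate's types, and plain integers stand in for TestStruct.

```rust
use std::collections::BTreeMap;

// Hypothetical command set; not serde-diff's actual types.
#[derive(Debug, Clone, PartialEq)]
enum MapCommand {
    RemoveKey(u32),
    AddKey(u32, i64),
    ChangeValue(u32, i64),
}

/// Describe how to turn `old` into `new`, key by key.
fn diff_map(old: &BTreeMap<u32, i64>, new: &BTreeMap<u32, i64>) -> Vec<MapCommand> {
    let mut commands = Vec::new();
    // Keys that disappeared.
    for key in old.keys() {
        if !new.contains_key(key) {
            commands.push(MapCommand::RemoveKey(*key));
        }
    }
    // Keys that were added or whose value changed.
    for (key, value) in new {
        match old.get(key) {
            None => commands.push(MapCommand::AddKey(*key, *value)),
            Some(previous) if previous != value => {
                commands.push(MapCommand::ChangeValue(*key, *value))
            }
            Some(_) => {}
        }
    }
    commands
}

/// Replay the commands against `target`.
fn apply_map(commands: &[MapCommand], target: &mut BTreeMap<u32, i64>) {
    for command in commands {
        match command {
            MapCommand::RemoveKey(key) => {
                target.remove(key);
            }
            MapCommand::AddKey(key, value) | MapCommand::ChangeValue(key, value) => {
                target.insert(*key, *value);
            }
        }
    }
}

fn main() {
    // Same key layout as the failing test: 2 removed, 3 changed, 4 added.
    let old = BTreeMap::from([(1, 1), (2, 2), (3, 3)]);
    let new = BTreeMap::from([(1, 1), (3, 4), (4, 1)]);
    let commands = diff_map(&old, &new);
    let mut target = old.clone();
    apply_map(&commands, &mut target);
    println!("{:?} -> {:?}", commands, target);
}
```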

Make Option impl more readable (use same code as for array impl)

The code for Option

let mut self_iter = self.iter();
let mut other_iter = other.iter();
let mut idx = 0;
let mut need_exit = false;
let mut changed = false;
loop {
    let self_item = self_iter.next();
    let other_item = other_iter.next();
    match (self_item, other_item) {
        (None, None) => break,
        (Some(_), None) => {
            let mut num_to_remove = 1;
            while self_iter.next().is_some() {
                num_to_remove += 1;
            }
            ctx.save_command::<()>(&DiffCommandRef::Remove(num_to_remove), true, true)?;
            changed = true;
        }
        (None, Some(other_item)) => {
            ctx.save_command::<()>(
                &DiffCommandRef::Enter(DiffPathElementValue::AddToCollection),
                false,
                true,
            )?;
            ctx.save_command(&DiffCommandRef::Value(other_item), true, true)?;
            need_exit = true;
            changed = true;
        }
        (Some(self_item), Some(other_item)) => {
            ctx.push_collection_index(idx);
            if <T as SerdeDiff>::diff(self_item, ctx, other_item)? {
                need_exit = true;
                changed = true;
            }
            ctx.pop_path_element()?;
        }
    }
    idx += 1;
}
if need_exit {
    ctx.save_command::<()>(&DiffCommandRef::Exit, true, false)?;
}

would be more readable if written like the code for arrays:

let mut need_exit = false;
let mut changed = false;
for (idx, (self_item, other_item)) in self.iter().zip(other.iter()).enumerate() {
    ctx.push_collection_index(idx);
    if <T as $crate::SerdeDiff>::diff(self_item, ctx, other_item)? {
        need_exit = true;
        changed = true;
    }
    ctx.pop_path_element()?;
}
if need_exit {
    ctx.save_command::<()>(&DiffCommandRef::Exit, true, false)?;
}
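One caveat worth checking before unifying the two: `Iterator::zip` stops as soon as either side runs out, and `Option::iter` yields at most one item. A zip-based loop therefore never visits a Some/None pair, whereas the longer loop above has explicit arms for those cases. Fixed-size arrays always have equal lengths, so the problem cannot arise there. A small std-only check, where `zipped_pairs` is a hypothetical helper:

```rust
/// Count how many pairs a zip-based loop over two Options would visit.
fn zipped_pairs<T>(a: &Option<T>, b: &Option<T>) -> usize {
    a.iter().zip(b.iter()).count()
}

fn main() {
    // Both present: the loop body runs once and can diff the contents.
    println!("Some/Some: {}", zipped_pairs(&Some(1), &Some(2)));
    // One side missing: the loop body never runs, so no change is recorded.
    println!("Some/None: {}", zipped_pairs(&Some(1), &None));
    println!("None/Some: {}", zipped_pairs(&None, &Some(2)));
}
```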

Iterate over differences

Hello!
It seems the output of the algorithm can be serialized; however, I see no way of iterating over the changes in order to process them manually.

When deriving `SerdeDiff`, allow specifying if the impl should make the type `opaque`

  1. Currently, one has to write simple_serde_diff!(Foo); to make Foo opaque during diffing.
     It would be nicer (regarding API design) to be able to specify it when deriving SerdeDiff, like:

#[derive(SerdeDiff)]
#[serde_diff(opaque)]
struct Foo { ... }

  2. For making individual fields opaque (instead of their whole type in all occurrences), currently the syntax is #[serde_diff(inline)], but #[serde_diff(opaque)] would be more intuitive/obvious for someone reading the code, and would be the same attribute as for a whole type (but on a field).
