Comments (11)
The performance comes from more than encode_fnum/2.
These map lookups, in the main encode/3, are eliminated:
prop = props.field_props[tag]
val = Map.get(struct, prop.name_atom)
Validation and the call to empty_val?/1 (which currently does 5 comparisons per value) are merged into a single step that is both more accurate and faster, becoming something like:
def encode_field(_tag, _type, nil) do
  <<>>
end
def encode_field(_tag, :uint32, 0) do
  <<>>
end
def encode_field(tag, :uint32, value) when value > 0 and value <= 4_294_967_295 do
  # ... encode ...
end
# ... other types
def encode_field(_tag, type, value) do
  # fail validation
end
Some of these gains are probably achievable without turning the entire encode function into a macro (the fnum value could be stored in the props, for example).
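To make the merged validation and empty check concrete, here is a minimal runnable sketch for the uint32 case only. The module name FieldSketch and the varint helper are hypothetical and not part of the library; the first argument is assumed to be the already-encoded field key.

```elixir
defmodule FieldSketch do
  # Hypothetical module for illustration; not the library's encoder.
  import Bitwise

  @uint32_max 4_294_967_295

  # nil and the proto3 default (0) produce no output at all.
  def encode_field(_key, _type, nil), do: <<>>
  def encode_field(_key, :uint32, 0), do: <<>>

  # The guard doubles as validation: only in-range integers match here.
  def encode_field(key, :uint32, value)
      when is_integer(value) and value > 0 and value <= @uint32_max do
    [key, varint(value)]
  end

  # Anything that reaches this clause fails validation.
  def encode_field(_key, type, value) do
    raise ArgumentError, "invalid #{inspect(type)} value: #{inspect(value)}"
  end

  # Base-128 varint: 7 bits per byte, high bit set on all but the last byte.
  defp varint(n) when n < 128, do: <<n>>
  defp varint(n), do: <<1::1, n::7, varint(n >>> 7)::binary>>
end
```

With this sketch, IO.iodata_to_binary(FieldSketch.encode_field(<<8>>, :uint32, 300)) gives <<8, 172, 2>>, and an out-of-range value such as 4_294_967_296 raises ArgumentError instead of being encoded.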
from protobuf.
I'm trying to optimize rather than take validations away. After all, data validity is very important.
from protobuf.
I agree. I started looking at generating an encoder. If you take Encoder.encode/3, it's doing a lot of things at runtime that can be pushed to compile time. The downside is a larger beam file.
This is what I have so far. For now, I only support proto3 since that's all we need:
https://github.com/2nd/protobuf-elixir/blob/feature/generate_encoder/lib/protobuf/generator.ex
Just want to add more baseline tests before adding more features (oneof, performance optimizations, ...)
For my Everything test struct, the numbers look good (and the validation is even more precise as I added specific range checks for integers):
Protobuf.generator 100000 11.81 µs/op
Jason.encode 100000 16.43 µs/op
Poison.encode 100000 21.93 µs/op
Protobuf.encode 100000 29.40 µs/op
from protobuf.
The result looks good, but I doubt it will perform the same in many situations. For example, what if most fields in the struct are not literal values:
a = get_from_func_a()
b = get_from_func_b()
struct = %Foo{a: a, b: b}
But one thing you've inspired me to consider is encoding the fields with default values at compile time, because we often have some empty fields.
I prefer treating the macro solution as a last resort. Before that, I'll look into other possibilities. (Maybe the real last resort is a NIF XD)
btw, what I'm trying to do is reduce some validations, because some functions in Encoder may already include them.
from protobuf.
I don't understand. It doesn't matter whether the values are literals or not. The macro code expands to something like:
def encode(struct) do
  :erlang.iolist_to_binary([
    Generator.encode_field(<<8>>, :uint32, struct.id),
    Generator.encode_field(<<18>>, :string, struct.name)
  ])
end
8 and 18 just being the precomputed tag+type encoding, (field_number <<< 3) ||| wire_type. This precomputation is one example of things that don't need to happen on each call to encode/1. Given a proto of:
message Whatever {
  uint32 id = 1;
  string name = 2;
}
The encode_fnum/2 result for these is always the same (8 and 18), so why do it over and over again?
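As a rough sketch of pushing that work to compile time: by the wire-format rule, the key byte is (field_number <<< 3) ||| wire_type, which gives 8 for uint32 id = 1 and 18 for string name = 2. The module below computes each key once while it compiles and embeds it as a literal in a generated clause. Everything here (CompileTimeSketch, the @fields list, the helper names) is hypothetical and is not the code in the linked generator; only uint32 and string are handled.

```elixir
defmodule CompileTimeSketch do
  import Bitwise

  # Hypothetical schema standing in for a parsed message definition:
  # {field name, field number, type}.
  @fields [{:id, 1, :uint32}, {:name, 2, :string}]

  # Wire types: 0 = varint, 2 = length-delimited.
  @wire_types %{uint32: 0, string: 2}

  # This comprehension runs once, while the module compiles. Each key is
  # computed here and injected into the generated clause as a literal.
  for {name, fnum, type} <- @fields do
    key_int = (fnum <<< 3) ||| Map.fetch!(@wire_types, type)
    key = <<key_int>>

    defp encode_field(unquote(name), value) do
      encode_value(unquote(key), unquote(type), value)
    end
  end

  @field_names Enum.map(@fields, &elem(&1, 0))

  def encode(msg) do
    # The walk over field names still happens per call; a fuller generator
    # could unroll this list as well.
    @field_names
    |> Enum.map(fn name -> encode_field(name, Map.fetch!(msg, name)) end)
    |> IO.iodata_to_binary()
  end

  # Defaults are skipped; guards validate the remaining values.
  defp encode_value(_key, :uint32, 0), do: <<>>

  defp encode_value(key, :uint32, v)
       when is_integer(v) and v > 0 and v <= 0xFFFFFFFF,
       do: [key, varint(v)]

  defp encode_value(_key, :string, ""), do: <<>>

  defp encode_value(key, :string, v) when is_binary(v),
    do: [key, varint(byte_size(v)), v]

  # Base-128 varint: 7 bits per byte, high bit set on all but the last byte.
  defp varint(n) when n < 128, do: <<n>>
  defp varint(n), do: <<1::1, n::7, varint(n >>> 7)::binary>>
end
```

Calling CompileTimeSketch.encode(%{id: 150, name: "hi"}) returns <<8, 150, 1, 18, 2, 104, 105>>. The values passed in can come from anywhere at runtime; only the keys and the type dispatch are fixed at compile time.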
from protobuf.
@karlseguin Yes, you're right. It works for encode_fnum/2. I thought the macro would try to handle all literal values when I saw encode_field.
from protobuf.
I moved the validations into the encoding step, which cuts average encoding time by roughly 40 to 50%:
# use bench/script/bench.exs, but change time to 1m and disable HTML
Operating System: Linux
CPU Information: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Number of Available Cores: 2
Available memory: 3.86 GB
Elixir 1.6.5
Erlang 20.3
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 1 min
parallel: 1
inputs: none specified
Estimated total run time: 2.07 min
# before
Name                            ips       average   deviation   median   99th %
google_message1_proto3 Encode   18.03 K   55.47 μs   ±84.48%    53 μs    72 μs
google_message1_proto2 Encode   14.57 K   68.63 μs   ±72.05%    65 μs    93 μs
# after
Name                            ips       average   deviation   median   99th %
google_message1_proto3 Encode   35.84 K   27.90 μs   ±212.31%   26 μs    37 μs
google_message1_proto2 Encode   23.75 K   42.11 μs   ±170.63%   39 μs    51 μs
The code is on master already.
from protobuf.
The latest benchmark result is:
Name                            ips       average   deviation   median   99th %
google_message1_proto3 Encode   53.08 K   18.84 μs   ±364.79%   17 μs    28 μs
google_message1_proto2 Encode   34.89 K   28.66 μs   ±251.24%   26 μs    37 μs
@karlseguin Could you verify the performance on your benchmarks?
from protobuf.
Yes. I see a similar change. From the initially reported 29µs/op to 19. Nice work!
from protobuf.
btw, decoding is faster too:
# before
Name                            ips       average   deviation   median   99th %
google_message1_proto2 Decode   28.59 K   34.98 μs   ±110.21%   33 μs    49 μs
google_message1_proto3 Decode   28.55 K   35.03 μs   ±97.93%    33 μs    49 μs
# after
Name                            ips       average   deviation   median   99th %
google_message1_proto2 Decode   51.85 K   19.29 μs   ±280.78%   18 μs    29 μs
google_message1_proto3 Decode   51.77 K   19.32 μs   ±278.03%   18 μs    30 μs
from protobuf.
@tony612 Hi, I noticed that after the optimization, proto3 allows nil to be encoded for basic types (string, int32, etc.), whereas previously the validator raised an error.
iex(1)> Example.new(message: "") |> Example.encode |> Example.decode
%Example{message: ""}
iex(2)> Example.new(message: nil) |> Example.encode |> Example.decode
%Example{message: ""}
Is this intentional or a bug?
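One way this round trip could arise, sketched under the assumption that the optimized encoder skips nil the same way it skips a proto3 default (the thread does not confirm this): both nil and "" then produce zero bytes, so the decoder can only return the default. NilSketch is a hypothetical stand-in for a message with a single string field numbered 1, limited to strings shorter than 128 bytes.

```elixir
defmodule NilSketch do
  # Hypothetical single-field message: string message = 1.
  # Key byte is (1 <<< 3) ||| 2 = 10 (length-delimited).

  # Assumption: nil is treated like the proto3 default and omitted.
  def encode(nil), do: <<>>
  def encode(""), do: <<>>

  # Short strings only, so the length fits in a single varint byte.
  def encode(s) when is_binary(s) and byte_size(s) < 128 do
    <<10, byte_size(s), s::binary>>
  end

  # An absent field decodes to the proto3 default for strings.
  def decode(<<>>), do: ""
  def decode(<<10, len, s::binary-size(len)>>), do: s
end
```

Under that assumption, NilSketch.decode(NilSketch.encode(nil)) and NilSketch.decode(NilSketch.encode("")) both return "", matching the iex output above. A stricter encoder would need a separate clause that rejects nil before the default check.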
from protobuf.