Comments (8)
For reference, shared dictionary compression examples in HTTP:
https://chromium.googlesource.com/chromium/src/+/53.0.2744.1/net/sdch/README.md (vcdiff, the first example of this)
https://learn.microsoft.com/en-us/deployedge/learnmore-zsdch-compression (zstd)
https://chromestatus.com/feature/5124977788977152 (originally just brotli)
Those last two are related, and both link to a shared draft RFC.
from grpc-go.
Here's an example of how I will probably work around this until there's a better way. Feel free to steer me in another direction :)
Protobuf
syntax = "proto3";

package zstd;

option go_package = "repro/zstd";

// Compressed is a wrapper for zstd encoded data in lieu of better ways of sharing metadata
message Compressed {
  string category = 1;
  string namespace = 2;
  bytes data = 3;
}
Codec
// codec is a dictionary-aware compression implementation.
//
// Given a category and namespace, it knows how to find the correct dictionary
// to compress with. For decompression, it reads the dictionary ID from the
// zstd payload.
//
// This could later be augmented to make decompression dictionary IDs
// category:namespace-aware instead of assuming they're globally scoped.
type codec struct {
	Category  string
	Namespace string
}

func (c *codec) Marshal(v any) ([]byte, error) {
	return marshal(c.Category, c.Namespace, v)
}

func (c *codec) Unmarshal(data []byte, v any) error {
	return unmarshal(data, v)
}

func (c *codec) Name() string {
	return "zstd-wrapper"
}

// marshal and compress 'v', wrap in [zstd.Compressed], and return the result
// of marshalling that
//
// Skip wrapping if 'v' is already a [zstd.Compressed]
func marshal(category, namespace string, v any) ([]byte, error) {
	switch vv := v.(type) {
	case *zstd.Compressed:
		return proto.Marshal(vv)
	case proto.Message:
		data, err := proto.Marshal(vv)
		if err != nil {
			return nil, err
		}
		container := &zstd.Compressed{
			Category:  category,
			Namespace: namespace,
			// TODO: Imagine there was compression happening before this
			Data: data,
		}
		fmt.Println("using the codec for compression")
		return proto.Marshal(container)
	default:
		return nil, fmt.Errorf("unsure what to do for type: %T", v)
	}
}

// unmarshal data into a [zstd.Compressed], then unmarshal its data field into 'v'
func unmarshal(data []byte, v any) error {
	switch vv := v.(type) {
	case *zstd.Compressed:
		return proto.Unmarshal(data, vv)
	case proto.Message:
		var container zstd.Compressed
		if err := proto.Unmarshal(data, &container); err != nil {
			return err
		}
		fmt.Println("using the codec for decompression")
		// TODO: Imagine there was decompression happening before this
		return proto.Unmarshal(container.Data, vv)
	default:
		return fmt.Errorf("unsure what to do for type: %T", v)
	}
}
Usage
package main

import (
	"context"
	"fmt"
	"log"
	"net"

	zstd "repro/gen"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/encoding"
	pb "google.golang.org/grpc/examples/helloworld/helloworld"
	"google.golang.org/protobuf/proto"
)

const addr = "localhost:50051"

type server struct {
	pb.UnimplementedGreeterServer
}

func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	return &pb.HelloReply{Message: "Hello " + in.GetName()}, nil
}

func main() {
	// Register default codec without category/namespace for decompression
	encoding.RegisterCodec(&codec{})

	lis, err := net.Listen("tcp", addr)
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &server{})
	go func() {
		if err := s.Serve(lis); err != nil {
			log.Fatalf("failed to serve: %v", err)
		}
	}()

	// Now connect from client
	conn, err := grpc.Dial(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("did not connect: %v", err)
	}
	defer conn.Close()

	c := pb.NewGreeterClient(conn)
	r, err := c.SayHello(
		context.Background(),
		&pb.HelloRequest{Name: "coxley"},
		// TODO: This would ideally be cached and retrieved via an accessor somewhere
		grpc.ForceCodec(&codec{Category: "test", Namespace: "foo"}),
	)
	if err != nil {
		log.Fatalf("could not greet: %v", err)
	}
	fmt.Printf("Result: %v\n", r)
}
from grpc-go.
Another option here would be to pass `UseCompressor` and `SetSendCompressor` an optional `any` argument that gets passed to the compressor as an extra parameter to `Compress`. This would not be a backward compatible change (it could be if we made it a variadic parameter, but that's a bit ugly from an API documentation POV). With this approach, `Decompress` can't really be parameterized, so the compressor would need to encode whatever the decompressor needs in its header.
If we wanted to pass the metadata around, then we could pass the metadata to both sides (outgoing metadata to the compressor, and incoming metadata to the decompressor). This then has the unfortunate effect of requiring the metadata to be polluted in order to parameterize the compressor.
Instead of passing metadata, we could pass the context directly. This would enable parameterization via the context instead of the metadata. I'm not sure we want to encourage blocking on this path, though, and passing a context might do that.
from grpc-go.
And with interesting timing, this recent one: https://developer.chrome.com/blog/shared-dictionary-compression
from grpc-go.
From a quick reading, it looks like none of these approaches requires decompression to have access to anything besides the compressed data -- is that correct? So we don't have any known use case that requires passing the incoming metadata to the decompressor?
from grpc-go.
@dfawley I think that could be true. Some cases may need to have multiple decompressor "names" for ID space scoping if that's the direction we go, though (e.g. `zstd-groupA`, `zstd-groupB`).
At least in `zstd`, the dictionary ID is limited to 4 bytes. I'm not sure how other companies "scope" their dictionary registries, but it's at least a consideration. Maybe not the end of the world.
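As a rough illustration of why the decompressor can get by with only the compressed data, the dictionary ID can be read straight out of the zstd frame header. The sketch below follows the frame layout in RFC 8878 (magic number, Frame_Header_Descriptor, optional Window_Descriptor, then a 0, 1, 2, or 4 byte little-endian Dictionary_ID). The `dictionaryID` helper is a hypothetical name, and the sample bytes in `main` are a hand-built header prefix, not a complete frame.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// zstdMagic is the zstd frame magic number, stored little-endian on the wire.
const zstdMagic = 0xFD2FB528

// dictionaryID returns the Dictionary_ID recorded in a zstd frame header.
// It returns 0 when the frame does not carry one.
func dictionaryID(frame []byte) (uint32, error) {
	if len(frame) < 5 {
		return 0, errors.New("frame too short")
	}
	if binary.LittleEndian.Uint32(frame[:4]) != zstdMagic {
		return 0, errors.New("not a zstd frame")
	}
	desc := frame[4] // Frame_Header_Descriptor
	pos := 5
	// Single_Segment_flag (bit 5) unset means a Window_Descriptor byte follows.
	if desc&0x20 == 0 {
		pos++
	}
	// Dictionary_ID_flag (bits 1-0) selects a field size of 0, 1, 2, or 4 bytes.
	var size int
	switch desc & 0x03 {
	case 1:
		size = 1
	case 2:
		size = 2
	case 3:
		size = 4
	}
	if len(frame) < pos+size {
		return 0, errors.New("truncated frame header")
	}
	var id uint32
	for i := 0; i < size; i++ {
		id |= uint32(frame[pos+i]) << (8 * i) // little-endian
	}
	return id, nil
}

func main() {
	// Magic, descriptor with a 4-byte dictionary ID, window descriptor, ID bytes.
	header := []byte{0x28, 0xB5, 0x2F, 0xFD, 0x03, 0x00, 0x78, 0x56, 0x34, 0x12}
	id, err := dictionaryID(header)
	if err != nil {
		panic(err)
	}
	fmt.Printf("dictionary ID: %#x\n", id)
}
```

Since the field tops out at 4 bytes, any category or namespace scoping has to either fit into that ID space or be signalled some other way, such as the per-group compressor names mentioned above.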
from grpc-go.
My proposed change, then, would be:
package grpc

func UseCompressor(name string, compressorOptions ...any) CallOption {}
func SetSendCompressor(ctx context.Context, name string, compressorOptions ...any) error {}

package encoding

type Compressor interface {
	Compress(/*see below*/, ...any) /*see below*/
	...
}
Essentially this simply adds "...any" to all the places where we set and invoke the compressor.
Regarding the Compressor interface, we are planning some changes to the encoding package to support scatter/gather and memory re-use. Those will be covered in #6619 and I would propose incorporating these changes into the work there.
How does all of that sound?
from grpc-go.
@dfawley Sorry for the late reply; my GitHub notifications organization is abysmal.
I think that sounds good as long as it does for you!
from grpc-go.