Comments (4)
A streaming API for a deserializer is hard to support, since the decoder's internal state has to be kept across operations so that decoding can be resumed later on. This comes with a performance hit for non-streaming decoding, and would complicate the codebase. Implementing it would require changes to the parser state machine, which is implemented entirely at the C level (so no, Cython wouldn't work here).
Rather, I recommend implementing message framing at a higher level. A simple protocol would be length-prefix framing (see e.g. this blog post for reference). This has a few benefits besides simplifying our codebase:
- It lets you easily swap out the serialization mechanism without changing the framing (so you could try out e.g. json, pickle, quickle, etc. instead).
- It lets you set limits on the size of received messages, so a client can't send a giant message that would crash the server. This is harder to handle at the msgspec level, but easier at the framing level (where you can set a max size on the total message).
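To make the two points above concrete, here is a minimal sketch of length-prefix framing on plain byte buffers, with no I/O involved. The names `frame`, `unframe`, `PREFIX_SIZE` and `MAX_MESSAGE_SIZE` are hypothetical and not part of msgspec, and the 1 MiB limit is an arbitrary assumption. `json` stands in for the serializer only because it ships with Python; any encoder that produces `bytes` should slot in the same way.

```python
import json
from typing import Optional, Tuple

PREFIX_SIZE = 4                 # bytes used for the big-endian length prefix
MAX_MESSAGE_SIZE = 1024 * 1024  # hypothetical 1 MiB cap; tune for your service


def frame(payload: bytes) -> bytes:
    """Prepend a 4-byte big-endian length to an already-serialized payload."""
    if len(payload) > MAX_MESSAGE_SIZE:
        raise ValueError("payload exceeds MAX_MESSAGE_SIZE")
    return len(payload).to_bytes(PREFIX_SIZE, "big") + payload


def unframe(buffer: bytes) -> Tuple[Optional[bytes], bytes]:
    """Split one frame off the front of buffer.

    Returns (payload, rest), or (None, buffer) if a full frame hasn't arrived.
    """
    if len(buffer) < PREFIX_SIZE:
        return None, buffer
    n = int.from_bytes(buffer[:PREFIX_SIZE], "big")
    if n > MAX_MESSAGE_SIZE:
        # Reject based on the declared size, before decoding anything
        raise ValueError("declared frame size exceeds MAX_MESSAGE_SIZE")
    end = PREFIX_SIZE + n
    if len(buffer) < end:
        return None, buffer
    return buffer[PREFIX_SIZE:end], buffer[end:]


# The serializer is independent of the framing; json is only a stand-in here.
wire = frame(json.dumps({"id": 1}).encode()) + frame(json.dumps({"id": 2}).encode())
first, rest = unframe(wire)
second, rest = unframe(rest)
print(json.loads(first), json.loads(second))
```

Because the size check happens on the prefix alone, an oversized message can be rejected before its body is decoded, which is the limit described in the second bullet.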
from msgspec.
Rather, I recommend implementing message framing at a higher level.
@jcrist cool, this def makes sense to me.
Are you suggesting that a framing protocol could be added to this project, or that client code should implement it in its own codebase?
Thanks again for the in-depth answers btw.
This would be something you'd handle in your own codebase. A naive asyncio implementation might be (untested, please note I don't have these APIs memorized):
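import asyncio

async def write(stream: asyncio.StreamWriter, msg: bytes) -> None:
    # Prefix the message with its length as a 4-byte big-endian integer
    n = len(msg)
    stream.write(n.to_bytes(4, "big"))
    stream.write(msg)
    await stream.drain()

async def read(stream: asyncio.StreamReader) -> bytes:
    # Read the 4-byte length prefix, then exactly that many bytes
    prefix = await stream.readexactly(4)
    n = int.from_bytes(prefix, "big")
    return await stream.readexactly(n)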
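As a hedged follow-up to the `read` helper above, the sketch below adds the size limit mentioned earlier in the thread and decodes the body once the full frame has arrived. `read_frame`, `demo` and `MAX_SIZE` are hypothetical names and the 1 MiB default is an assumption. `json` is used as a stand-in serializer, and the `StreamReader` is fed by hand so the example runs without a socket.

```python
import asyncio
import json

MAX_SIZE = 2**20  # hypothetical 1 MiB limit on a single frame


async def read_frame(reader: asyncio.StreamReader, max_size: int = MAX_SIZE) -> bytes:
    """Read one length-prefixed frame, refusing frames larger than max_size."""
    prefix = await reader.readexactly(4)
    n = int.from_bytes(prefix, "big")
    if n > max_size:
        # Bail out before reading the body at all
        raise ValueError(f"frame of {n} bytes exceeds limit of {max_size}")
    return await reader.readexactly(n)


async def demo() -> dict:
    # Feed an in-memory StreamReader directly instead of opening a connection
    reader = asyncio.StreamReader()
    body = json.dumps({"hello": "world"}).encode()
    reader.feed_data(len(body).to_bytes(4, "big") + body)
    reader.feed_eof()
    # The frame is complete here, so the decoder needs no streaming support
    return json.loads(await read_frame(reader))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the peer disconnects mid-frame, `readexactly` raises `asyncio.IncompleteReadError`, so real code would likely want to catch that around the call.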
@jcrist if you're ok with it I might also put an example of this in the docs PR for #25 just so any newcomers have an example to work off. I'll probably link to the protobuf post you sent as well. Imo it'd be pretty handy to have some examples for multiple async frameworks as well.