Comments (11)
and formatted more nicely:
class FastReader(Construct):
    def _parse(self, stream, context):
        return stream.read()
    def _build(self, obj, stream, context):
        stream.write(obj)

SegBody = Struct(None,
    UBInt16('size'),
    Field('data', lambda ctx: ctx['size'] - 2),
)

Seg = Struct('seg',
    Literal('\xff'),
    Byte('kind'),
    Switch('body', lambda c: c['kind'],
        {
            SOS: FastReader('data'),
        },
        default = Embed(SegBody),
    )
)

JPEG = Struct('jpeg',
    Literal('\xff\xd8'),
    GreedyRange(Seg),
)
from construct.
Hi,
I'm not sure about the FastReader, as I still don't grok that section of Construct yet.
There is a PascalString, in construct.macros, which takes a length_field as a kwarg. An example usage:
>>> from construct import PascalString, UBInt16
>>> s = PascalString("hurp", length_field=UBInt16("length"))
>>> s.parse("\x00\x05Hello")
'Hello'
Thanks for your comments. Let me know if you have any patches you wish to contribute.
from construct.
Hi - the issue with the PascalString is that its length field doesn't count the bytes that make up the length field itself. In several protocols, we get fields like this, 0x0004babe, where the length (4) includes the first 2 bytes.
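To make the 0x0004babe case concrete, here is a minimal sketch in plain Python 3 using only the standard library, not Construct's API. The function names `parse_inclusive` and `build_inclusive` are made up for illustration, and a 2-byte big-endian length is assumed.

```python
import struct

def parse_inclusive(data, offset=0):
    """Parse one field whose 2-byte big-endian length counts its own 2 bytes.

    Returns (payload, next_offset).
    """
    (total,) = struct.unpack_from(">H", data, offset)
    if total < 2:
        raise ValueError("inclusive length must be at least 2")
    end = offset + total
    if end > len(data):
        raise ValueError("field runs past end of input")
    # The payload is whatever follows the 2 length bytes.
    return data[offset + 2:end], end

def build_inclusive(payload):
    """Inverse of parse_inclusive: the stored length is len(payload) + 2."""
    return struct.pack(">H", len(payload) + 2) + payload

payload, next_offset = parse_inclusive(b"\x00\x04\xba\xbe")
```

Here `payload` is the two bytes after the length field and `next_offset` is 4, so the next field would start right after it.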
from construct.
@akvadrako: this could be done like so
>>> s=PascalString("data", ExprAdapter(ULInt16("length"),
... lambda val, ctx: val + 2, lambda val, ctx: val - 2))
>>> s.parse("\x05\x00helloxxxx")
'hel'
>>> s.build("foo")
'\x05\x00foo'
on the other hand, your straightforward solution is better.
as for your FastReader class -- i would consider it bad design. i understand you simply wanted to read everything in, but it's not predictable (can't tell how much it will read or write) and thus not symmetric. for instance, the following construct would work only in one direction:
Struct("a",
    FastReader("blob"),
    UBInt32("x"),
)
you would be able to build anything you want, but you'll never be able to parse it back.
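The asymmetry can be reproduced without Construct. The sketch below is plain Python 3 with hypothetical helper names; it mimics a read-to-EOF blob followed by a 4-byte big-endian integer, and is only an illustration of the pattern, not of Construct's internals.

```python
import io
import struct

def build_blob_then_x(blob, x):
    # Building is unambiguous: write the blob, then the 4-byte integer.
    return blob + struct.pack(">I", x)

def parse_blob_then_x(stream):
    blob = stream.read()    # greedy read: also swallows the bytes meant for x
    tail = stream.read(4)   # nothing is left at this point
    if len(tail) < 4:
        raise ValueError("no bytes left for x")
    return blob, struct.unpack(">I", tail)[0]

data = build_blob_then_x(b"abc", 7)
try:
    parse_blob_then_x(io.BytesIO(data))
    parsed_ok = True
except ValueError:
    parsed_ok = False
```

Building succeeds, but parsing the result fails because the greedy read has already consumed the trailing integer.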
from construct.
I suggested a variant to PascalString because length+data is common in network protocols and apparently JPEG too.
FastReader is the best we can do with construct's internals. Your example wouldn't work with RepeatUntil and Range either. I'm not sure it should - since constructs need to know about future constructs and you'll get ambiguity:
Struct("a",
    GreedyRange("b"),
    GreedyRange("c"),
)
Probably better to make a FastReadUntil('BOUNDARY').
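A rough sketch of what such a read-until-boundary helper might do, written as a standalone Python 3 function rather than a Construct class. The name `read_until`, the `chunk_size` parameter, and the requirement that the stream be seekable are all assumptions made for this example.

```python
import io

def read_until(stream, boundary, chunk_size=4096):
    """Return the bytes before the first occurrence of boundary.

    The stream is left positioned at the start of the boundary, so the
    next parser can consume it. Requires a seekable stream.
    """
    if not boundary:
        raise ValueError("boundary must be non-empty")
    start = stream.tell()
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("boundary not found before end of stream")
        # Only rescan the region where a new match could begin, including
        # a match that straddles the previous chunk and this one.
        search_from = max(0, len(buf) - len(boundary) + 1)
        buf += chunk
        idx = buf.find(boundary, search_from)
        if idx != -1:
            stream.seek(start + idx)
            return buf[:idx]

s = io.BytesIO(b"scan-data\xff\xd9tail")
blob = read_until(s, b"\xff\xd9", chunk_size=4)
rest = s.read()
```

Because the amount read is determined by the data itself, a field like this can be followed by other fields and still be parsed back, unlike a read-to-EOF field.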
from construct.
Length + data is perfectly serviced by PascalString; the case where the length of the length is included in the length is actually rather uncommon though. Maybe a new String subclass is needed for it.
As far as "fast" reading, why not examine other optimizations first? There are optimization opportunities in Construct core, I think.
from construct.
@MostAwesomeDude: no need to subclass, it would be much simpler to just define an InclusivePascalString "macro" that takes care of subtracting/adding the size of the length field from the length.
@akvadrako: your "fast" reader isn't any faster than the plain old Field except that it doesn't check the length. since this greedy construct can only appear once at the end of a data structure, i don't suppose it would make much difference in terms of speed. also, my tests back in the day showed that psyco can speed up parsing tenfold.
on the other hand, as you said, it poses a problem of breaking the symmetry between parsing and building... but i think it's inherent to the pattern and there isn't any real solution.
from construct.
it's much faster - construct is unusable for parsing JPEG images without it - where 99% of the data is an unbounded blob at the end of the file.
from construct.
if you're using GreedyRange, then yes, it would be much faster. i was talking about Field. on the other hand, Field must have a predetermined length, so it's not suitable for your purpose.
what do you mean, though, that 99% of the file is a blob? doesn't it have an internal structure? if so, i assume you have no real interest in it, so you may want to use OnDemand, so it will actually be read only when asked for.
from construct.
Yes, you are correct. OnDemand doesn't help though, because it requires a known length.
from construct.
well, i just had an idea: assuming you're working on a file/stringIO, you can write a construct that simply returns the remaining length till EOF. e.g.
p = stream.tell()
stream.seek(0, 2)   # whence=2: seek relative to the end of the stream
p2 = stream.tell()
stream.seek(p)      # restore the original position
return p2 - p
and then you could combine it with Field and OnDemand.
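As a self-contained illustration of the remaining-length idea, here is a minimal Python 3 sketch. The function name `remaining_length` is hypothetical, and it assumes a seekable stream such as a file or `io.BytesIO`; how it would be wired into Field and OnDemand is not shown.

```python
import io

def remaining_length(stream):
    """Return the number of bytes between the current position and EOF."""
    pos = stream.tell()
    stream.seek(0, io.SEEK_END)  # jump to the end to learn the total size
    end = stream.tell()
    stream.seek(pos)             # put the stream back where it was
    return end - pos

s = io.BytesIO(b"\xff\xd8" + b"x" * 10)
s.read(2)                        # pretend a 2-byte header was already parsed
n = remaining_length(s)
```

After the call the stream position is unchanged, so the computed length can be used as the size of the field that follows.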
from construct.