scott-griffiths / bitstring
A Python module to help you manage your bits
Home Page: https://bitstring.readthedocs.io/en/stable/index.html
License: MIT License
It would be nice to have "hex"-like properties that work on BitStrings even
when length is not evenly divisible by 4. The new properties (rhex and lhex?)
would ideally right or left justify the data (depending on the property
called), pad with zeros to ensure length is multiple of 4, and then return
the corresponding hex value.
>>> b = BitString('0b11')
>>> b.rhex
'0x3'
>>> b.lhex
'0xc'
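The padding rule described above can be sketched as follows, assuming the bits are held as a plain string of '0' and '1' characters rather than in a BitString. The name `padded_hex` and its `justify` parameter are hypothetical and not part of the module.

```python
def padded_hex(bits: str, justify: str = "right") -> str:
    """Zero-pad a string of '0'/'1' characters to a multiple of 4 and return its hex.

    justify="right" pads on the left (the proposed rhex);
    justify="left" pads on the right (the proposed lhex).
    """
    pad = -len(bits) % 4  # number of zero bits needed to reach a multiple of 4
    if justify == "right":
        bits = "0" * pad + bits
    elif justify == "left":
        bits = bits + "0" * pad
    else:
        raise ValueError("justify must be 'right' or 'left'")
    # Convert each group of four bits to one hex digit.
    digits = (format(int(bits[i:i + 4], 2), "x") for i in range(0, len(bits), 4))
    return "0x" + "".join(digits)


rhex = padded_hex("11")          # right-justified
lhex = padded_hex("11", "left")  # left-justified
```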
Original issue reported on code.google.com by [email protected]
on 8 Sep 2009 at 4:51
These are effectively the same:
a.append(b)
b.prepend(a)
but the performance could be wildly different, depending on how much data
has to be bit-shifted to the correct alignment before joining.
We should make append() and prepend() private, then the new append() and
prepend() can call whichever one is most appropriate for the input they're
given.
Original issue reported on code.google.com by [email protected]
on 15 Jul 2009 at 2:09
>>> a = BitString('0b11')
>>> a.replace('0b1', a)
2
>>> a
BitString('0b11')
So nothing changed when replacing one bit with two. But using a copy
everything works:
>>> a.replace('0b1', a[:])
2
>>> a
BitString('0xf')
Original issue reported on code.google.com by [email protected]
on 28 Jul 2009 at 3:34
Have:
deletebits(bits, bitpos)
Need:
deletebytes(bytes, bytepos)
Original issue reported on code.google.com by [email protected]
on 13 Feb 2009 at 12:16
Using the filename= method of initialisation is fine, but we really should
accept a file object that has been opened elsewhere. This would be more in
keeping with standard library behaviour. So:
>>> f = open('somefile', 'rb')
>>> s = BitString(f)
would be roughly equivalent to
>>> s = BitString(filename='somefile')
except that you'd still have the file object in scope if you wanted it.
We should also add a tofile function, that also takes a file object as a
parameter. It should write the BitString to the file in chunks, to avoid it
being read into memory unnecessarily. This doesn't really come into its own
until the immutable arrays come along, but could still be useful for
copying a chunk of one file to another.
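As a rough illustration of the chunked writing idea, here is a byte-level sketch using only the standard library. The function name `copy_in_chunks` and its parameters are assumptions made for this example; a real tofile would also have to deal with bit offsets and lengths that are not whole bytes.

```python
import io


def copy_in_chunks(src, dst, nbytes, chunk_size=1 << 20):
    """Copy up to nbytes from file object src to dst without reading it all at once.

    Returns the number of bytes actually copied.
    """
    remaining = nbytes
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:  # source exhausted early
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return nbytes - remaining


# In-memory file objects stand in for files opened elsewhere.
src = io.BytesIO(b"abcdefghij")
dst = io.BytesIO()
copied = copy_in_chunks(src, dst, nbytes=7, chunk_size=4)
```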
Original issue reported on code.google.com by [email protected]
on 4 Aug 2009 at 8:59
Currently we have a module-level join function:
bitstring.join(bsl)
We could copy the standard library by implementing one that uses the
BitString instance as the separator:
BitString('0b1').join(bsl) # Put a '1' bit between every item in bsl.
Easy enough to do, but would it actually be useful to anyone? And should we
deprecate the current join function?
Original issue reported on code.google.com by [email protected]
on 18 Jun 2009 at 1:34
For example:
a = pack('8*uint:16', range(8))
The '*' isn't strictly necessary but it makes the intent a fair bit
clearer. Brackets could also be employed:
b = BitString('3*(bit:1, uint:7)', '1', 34, '0', 12, '1', 33
and finally with a variable factor:
c = b.unpack('n*(bin:1, uint:7)', n=3)
Original issue reported on code.google.com by [email protected]
on 30 Aug 2009 at 7:55
Raising IndexError if a slice index is greater than the length of the
BitString was the intended behaviour, but doesn't match the usual sequence
slicing behaviour.
Standard Python sequences silently clamp slice indices that exceed the
length of the container to that length (this applies only to slices, not
to indexing).
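For reference, this is how a built-in list behaves. The sketch uses a list of bits as a stand-in for a BitString.

```python
bits = [1, 0, 1, 1]

# Slice indices beyond the end are clamped to len(bits), so no error is raised.
tail = bits[2:100]
empty = bits[100:]

# Plain indexing beyond the end still raises IndexError.
try:
    bits[100]
    index_failed = False
except IndexError:
    index_failed = True
```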
Original issue reported on code.google.com by [email protected]
on 5 May 2009 at 9:01
Much nicer if findall() returns a generator because if you only need the
first few results you won't need to find them all.
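A minimal sketch of the generator approach, assuming the data is a plain string of '0' and '1' characters. Including overlapping matches is an assumption of this sketch, and this `findall` is a stand-alone illustration, not the module's method.

```python
from itertools import islice


def findall(bits: str, pattern: str):
    """Lazily yield each position where pattern occurs in bits (overlaps included)."""
    pos = bits.find(pattern)
    while pos != -1:
        yield pos
        pos = bits.find(pattern, pos + 1)


# Only the first two matches are ever searched for.
first_two = list(islice(findall("1011011", "1"), 2))
```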
Original issue reported on code.google.com by [email protected]
on 7 May 2009 at 4:44
Not sure if interpret is the best name. Serves a similar purpose to read or
peek, but doesn't depend on the current bitpos.
a, b, c = s.interpret('10:uint8, +5, hex4, 100:se')
is equivalent to:
s.bitpos = 10
a = s.read('uint8')
s.bitpos += 5
b = s.read('hex4')
s.bitpos = 100
c = s.read('se')
Note that '+5' means advance 5 bits, '-5' means retreat 5 bits and '5'
means return next 5 bits as a BitString.
Could also allow one token to be indeterminate length. This would then
consume the rest of the BitString.
a, b, c = s.interpret('oct9, bin, uint12')
Note that if the first bit position isn't given then it defaults to zero.
Also I think that I might have to think carefully about what the flexible
size item does when multiple bit positions and movements are present...
If you need the rest of the BitString just as a BitString then use the
'rest' token (A better name?)
a, b = s.read('uint32, rest')
Here I'm using it in the read function, as I think it would work well there
too (and peek of course).
Original issue reported on code.google.com by [email protected]
on 17 Jul 2009 at 5:36
To complement the bit shift functions ( <<, <<=, >>, >>= ) it would be nice
to have some bit rotation functions.
s.ror(12) # rotate bits to the right by 12
s.rol(10) # rotate bits to the left by 10
>>> s = BitString('0b001111')
>>> s.ror(1)
BitString('0b100111')
It would be consistent to have startbit and endbit parameters too, which
then leaves open the question of whether startbit and endbit are needed for
the ordinary shift operations. As they couldn't be used for the operators,
we could have:
s.shl(bits, startbit, endbit) # a bit like s[startbit:endbit] << bits
# except it's done in-place.
s.shr(bits, startbit, endbit)
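The rotation itself can be sketched on a plain string of '0' and '1' characters. Unlike the in-place methods proposed above, these hypothetical helpers return a new string, and they omit the startbit and endbit parameters.

```python
def ror(bits: str, n: int) -> str:
    """Rotate bits to the right by n places."""
    if not bits:
        return bits
    n %= len(bits)
    return bits[-n:] + bits[:-n]


def rol(bits: str, n: int) -> str:
    """Rotate bits to the left by n places."""
    if not bits:
        return bits
    n %= len(bits)
    return bits[n:] + bits[:n]


rotated_right = ror("001111", 1)
rotated_left = rol("001111", 2)
```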
Original issue reported on code.google.com by [email protected]
on 17 Jul 2009 at 10:10
The python style guide suggests avoiding the use of properties where their
use could be computationally expensive.
I've mostly ignored that advice here so expressions like
{{{
a = BitString()
a.bin = '10001001101'
print(a.bin)
}}}
can be more expensive than they look. At present it doesn't even cache the
binary representation.
Original issue reported on code.google.com by [email protected]
on 21 Dec 2008 at 11:06
This would bring its interface closer to replace and split.
list(a.findall(s, count=n))
should be equivalent to
list(a.findall(s))[:n]
I think that there is also case for renaming the split function's maxsplit
parameter to count also. It's nice not to have to remember the difference...
Original issue reported on code.google.com by [email protected]
on 2 Jun 2009 at 1:35
>>> b = BitString(data='\x30', length=2, offset=2)
>>> b.prepend(b)
BitString('0x3')
It should be '0xf' (0b1111). Not good.
Original issue reported on code.google.com by [email protected]
on 11 Jun 2009 at 8:00
There remain some problems analysing very large files. The magic number is
probably 4GB.
For example, using findbytealigned a BitString initialised with a filename
of a very large file may raise an OverflowError (at least it does for me).
Might be platform dependent.
Original issue reported on code.google.com by [email protected]
on 16 Jan 2009 at 4:45
Might be nice to have some shorthand for common types, for example:
byte -> bytes:1
bit -> bits:1
short -> uint:16
long -> uint:32
quad -> uint:64
etc.
Or we could go for something more snappy. These are lifted from Perl's pack:
c -> int:8
C -> uint:8
s -> int:16
S -> uint:16
l -> int:32
L -> uint:32
q -> int:64
Q -> uint:64
So we internally translate from
>>> s = bitstring.pack('C, l, Q', 10, 100, 1000)
>>> a, b, c = s.unpack('Q, C, l')
to
>>> s = bitstring.pack('uint:8, int:32, uint:64', 10, 100, 1000)
>>> a, b, c = s.unpack('uint:64, uint:8, int:32')
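The internal translation could be as simple as a lookup table. This is a sketch only: `PERL_STYLE` and `expand_format` are names invented for the example, and tokens that are not in the table are passed through unchanged.

```python
# Shorthand codes from the list above, mapped to the long-form tokens.
PERL_STYLE = {
    "c": "int:8", "C": "uint:8",
    "s": "int:16", "S": "uint:16",
    "l": "int:32", "L": "uint:32",
    "q": "int:64", "Q": "uint:64",
}


def expand_format(fmt: str) -> str:
    """Replace each comma-separated shorthand token with its long form."""
    tokens = [token.strip() for token in fmt.split(",")]
    return ", ".join(PERL_STYLE.get(token, token) for token in tokens)


expanded = expand_format("C, l, Q")
```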
Original issue reported on code.google.com by [email protected]
on 13 Aug 2009 at 4:32
At the moment the whole of the remaining BitString is returned as the final
item from split(). If you have specified maxsplit, then this probably isn't
what you wanted (the final item could be huge!)
list(a.split(delimiter, maxsplit=n))
should give the same result as
list(a.split(delimiter))[:n]
(but hopefully much more quickly!)
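The requested behaviour can be sketched as a generator over a plain string of '0' and '1' characters. It assumes that the first item is whatever precedes the first delimiter and that each later item starts with the delimiter; this stand-alone `split` is illustrative and not the module's implementation.

```python
def split(bits: str, delimiter: str, maxsplit=None):
    """Yield sections of bits, stopping after maxsplit items if it is given."""
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    produced = 0
    start = 0
    pos = bits.find(delimiter)
    while maxsplit is None or produced < maxsplit:
        if pos == -1:
            # No further delimiter: the remainder is the last item.
            yield bits[start:]
            return
        yield bits[start:pos]
        produced += 1
        start = pos
        pos = bits.find(delimiter, start + len(delimiter))


everything = list(split("0010110", "1"))
limited = list(split("0010110", "1", maxsplit=2))
```

Because the generator stops searching once maxsplit items have been produced, the limited call never scans the rest of the data.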
Original issue reported on code.google.com by [email protected]
on 2 Jun 2009 at 1:26
You can't use .int, .uint, .se and .ue properties on BitStrings initialised
using filename.
The work-around is just to copy the whole BitString and use the copy.
Original issue reported on code.google.com by [email protected]
on 22 Jan 2009 at 4:46
For example:
a = BitString(filename='foo')
a.append('0xff')
will fail.
Original issue reported on code.google.com by [email protected]
on 18 Feb 2009 at 8:40
This may or may not make sense...
Allow lists to initialise BitString objects, by evaluating each element as
a bool. e.g.
>>> a = BitString([True, False, 7, [False], '0', 'hello', []])
>>> a.bin
'0b1011110'
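The truth-value rule behind that result can be checked with ordinary Python, producing a string of '0' and '1' characters instead of a BitString. Note that the non-empty list [False] and the non-empty string '0' both count as true.

```python
items = [True, False, 7, [False], '0', 'hello', []]

# Each element contributes one bit according to its truth value.
bits = "".join("1" if item else "0" for item in items)
```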
Original issue reported on code.google.com by [email protected]
on 27 Apr 2009 at 5:04
Summary says it all really. Want to be able to say
t = s.slice(a, b, c)
instead of having to use
t = s[a:b:c]
Original issue reported on code.google.com by [email protected]
on 28 May 2009 at 10:09
Improvements to current ones:
find(bs, bytealigned=True, startbit=None, endbit=None)
split(delimiter, bytealigned=True, startbit=None, endbit=None)
And some new ones:
replace(old, new, bytealigned=True, startbit=None, endbit=None)
count(bs, bytealigned=True, startbit=None, endbit=None)
rfind(bs, bytealigned=True, startbit=None, endbit=None)
Original issue reported on code.google.com by [email protected]
on 17 Mar 2009 at 12:21
To try to get better symmetry between pack and unpack it would be nice to
return a dictionary with unpack.
>>> f = 'hex:32=start_code, uint:12=width, uint:12=height'
>>> s = pack(f, start_code='0x000001b3', width=352, height=288)
>>> s.unpack(f)
{'height': 288, 'start_code': '0x000001b3', 'width': 352}
Which is fine and lovely, but what happens if there is also a list being
returned?
>>> s.unpack('hex:32, uint:12, uint:12=height')
Should it return a tuple of a list and dictionary? Seems a bit extreme...
Original issue reported on code.google.com by [email protected]
on 30 Aug 2009 at 8:06
s = BitString('0b111')
s.truncatestart(2)
s.truncateend(1) # asserts
Original issue reported on code.google.com by [email protected]
on 6 Apr 2009 at 11:30
For example, allow
a = BitString('0b000b10b111') # a.bin == 0b001111
b = BitString('0xff0xe2') # b.hex == 0xffe2
This will allow constructions like
a += '0b0' + '0b1' + '0b1110'
which currently fail as the strings are concatenated first.
Note that we can't combine '0x' and '0b' strings (unfortunately) because
'0b' is valid hex as well as being the binary indicator. Annoying that.
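One way to parse such strings is to split on the prefix, since neither 'b' nor 'x' is a valid digit in its own base. The sketch below handles a single prefix type per string, in line with the restriction just noted; `concat_literals` is a hypothetical helper that returns a prefixed digit string rather than a BitString.

```python
def concat_literals(text: str, prefix: str) -> str:
    """Join back-to-back literals such as '0b000b10b111' into one literal."""
    valid = {"0b": "01", "0x": "0123456789abcdefABCDEF"}[prefix]
    if not text.startswith(prefix):
        raise ValueError("text must start with " + prefix)
    pieces = text.split(prefix)[1:]  # drop the empty piece before the first prefix
    for piece in pieces:
        if not piece or any(ch not in valid for ch in piece):
            raise ValueError("invalid digits in " + repr(text))
    return prefix + "".join(pieces)


a_bin = concat_literals("0b000b10b111", "0b")
b_hex = concat_literals("0xff0xe2", "0x")
```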
Original issue reported on code.google.com by [email protected]
on 16 Feb 2009 at 2:32
i.e. reversebits(startbit, endbit) would reverse the bits in the slice
[startbit:endbit] in place.
This would let you write things like:
>>> a = BitString('0x01020408')
>>> for i in range(a.length/8):
...     a.reversebits(i*8, (i+1)*8)
>>> a.hex
'0x80402010'
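The same per-byte reversal can be reproduced with plain strings of '0' and '1' characters. This hypothetical `reversebits` returns a new string instead of working in place.

```python
def reversebits(bits: str, startbit: int = 0, endbit=None) -> str:
    """Return bits with the slice [startbit:endbit] reversed."""
    if endbit is None:
        endbit = len(bits)
    return bits[:startbit] + bits[startbit:endbit][::-1] + bits[endbit:]


bits = format(0x01020408, "032b")  # 32-bit binary text for 0x01020408
for i in range(len(bits) // 8):
    bits = reversebits(bits, i * 8, (i + 1) * 8)
result = format(int(bits, 2), "08x")
```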
Original issue reported on code.google.com by [email protected]
on 27 Apr 2009 at 1:32
Would need to use the Python 3.0 notation (prefix of '0o' or '0O') rather
than the '0' prefix.
a = BitString('0o777')
b = BitString(oct='777')
Original issue reported on code.google.com by [email protected]
on 16 Feb 2009 at 4:29
Instead of
>>> a = s.readbits(10)
>>> b = s.readbits(4)
Why not
>>> a, b = s.readbits(10, 4)
You could then also write things like
>>> [x.uint for x in s.readbits(5, 6, 5)]
Would need to modify readbits, peekbits, readbytes, peekbytes (but not
peekbit etc.)
Original issue reported on code.google.com by [email protected]
on 29 Jun 2009 at 3:08
Currently using the stride is not allowed when slicing a BitString. This is
primarily because it just isn't very useful - each item is just a single bit.
Suggestion is to use the stride to indicate the *size* of the items being
sliced. For example using a stride of 8 would make the start and stop
indices into byte indices:
>>> a = BitString('0xabcdef')
>>> print a[0:16]
'0xabcd'
>>> print a[0:16:1]
'0xabcd'
>>> print a[0:2:8]
'0xabcd'
>>> print a[1:2:4]
'0xb'
I think that the notation a[x:y:8] is cleaner than the equivalent (and
frequently used) a[x*8:y*8].
Negative strides are interesting too. a[::-1] would be the reversed bit
BitString, whereas a[::-8] would reverse the byte order.
What could possibly go wrong?
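The positive-stride case amounts to scaling both indices by the stride. A minimal sketch on a string of '0' and '1' characters follows; `stride_slice` and `to_hex` are hypothetical helpers, and negative strides and negative indices are deliberately left out.

```python
def stride_slice(bits: str, start: int, stop: int, step: int = 1) -> str:
    """Treat start and stop as indices of step-bit items."""
    if step < 1:
        raise ValueError("this sketch handles positive strides only")
    return bits[start * step:stop * step]


def to_hex(bits: str) -> str:
    """Render a bit string whose length is a multiple of 4 as hex."""
    return "0x" + format(int(bits, 2), "0{}x".format(len(bits) // 4))


a = format(0xABCDEF, "024b")
plain = to_hex(stride_slice(a, 0, 16))         # like a[0:16]
first_bytes = to_hex(stride_slice(a, 0, 2, 8)) # like a[0:2:8]
last_bytes = to_hex(stride_slice(a, 1, 3, 8))  # like a[1:3:8]
```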
Original issue reported on code.google.com by [email protected]
on 24 Apr 2009 at 3:57
Some of these should be appropriate:
__invert__
__mul__
__lshift__
__rshift__
__hex__
__oct__
__imul__
__ilshift__
__irshift__
__setitem__
Original issue reported on code.google.com by [email protected]
on 17 Feb 2009 at 2:53
For example:
s += BitString(uint=12, length=8)
could be written as
s += 'uint8 12'
while
s = BitString('0x12') + BitString(ue=4) + BitString('0b1')
becomes
s = BitString('0x12, ue4, 0b1')
Lots of questions as to what the best format is. Separator could be ',' or
':' (or either). Is 'ue4' better than 'ue=4' or 'ue 4'?
Of course the one that wouldn't work is the 'data' initialiser, as it would
be impossible to work out when the data ended...
Original issue reported on code.google.com by [email protected]
on 28 Jun 2009 at 8:50
In particular if there are no bytes before the delimiter then it should
yield an empty BitString as the first item, but it fails to do so.
It could just be the documentation for split() that is incorrect and the
behaviour is intended.
Original issue reported on code.google.com by [email protected]
on 7 Jan 2009 at 2:36
Many operations that return a new BitString don't alter the underlying data
in any way, often just needing a slice of it. Currently the data is always
copied, which could be rather expensive in some cases.
Suggestion is to improve memory and computational efficiency by allowing a
BitString's internal byte data store to reference another BitString's data
rather than taking a copy.
Original issue reported on code.google.com by [email protected]
on 17 Jan 2009 at 10:54
[deleted issue]
Rationale: In Python 2.6 there's also a bin() function, but it can't be
overloaded in the same way as oct() and hex() (i.e. treating leading zeros
as significant).
In Python 3.0 it's even worse as the oct() and hex() won't work either.
Overall I think it's better to have a consistent interface across
hex/oct/bin as well as across Python 2.x/3.x, so the only way to go is to
get rid of hex() and oct().
Original issue reported on code.google.com by [email protected]
on 16 Jun 2009 at 4:50
Rather than
>>> h = s.readbits(12).hex
use
>>> h = s.read('hex12')
Then we can start joining them:
>>> start_code, width, height = s.read('hex32, uint12, uint12')
Needs to work for peek() as well as read() of course.
Original issue reported on code.google.com by [email protected]
on 29 Jun 2009 at 3:02
It would be nice to be able to use the '0x' and '0b' prefixes to specify
hex and binary without the explicit initialiser. For example:
s = BitString('0xff')
t = BitString('0b0001')
instead of
s = BitString(hex='0xff')
t = BitString(bin='0b0001')
Also, this could be used in functions that require a BitString argument:
s.append('0b0')
t.findbytealigned('0x47')
instead of
s.append(BitString(bin='0'))
t.findbytealigned(BitString(hex='0x47'))
Original issue reported on code.google.com by [email protected]
on 13 Feb 2009 at 11:35
Some code won't run under Python 2.4.
In particular the 'a if c else b' construction is used.
It wouldn't be too much work to get the unit tests to pass for python 2.4.
Original issue reported on code.google.com by [email protected]
on 21 Jan 2009 at 5:14
The int, uint properties are bit-wise big-endian. To support other
endianness suggest we add:
intle - little endian int. Must be a multiple of 8 bits long
uintle - little endian uint. Must be a multiple of 8 bits long
intbe - synonym for int
uintbe - synonym for uint
Suggest that we don't add explicit support for bit-wise little-endian
interpretations. (use reversebits() or [::-1] slice)
Also having things like hexle or binle would just get very confusing!
For example:
s = BitString(intle=104, length=16)
(or)
s = BitString('intle16=104')
assert s.intle == 104
s.intle = 950
i = s.read('intle16')
assert i == 950
assert s[::-8].int == 950
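The byte-order relationship in that example can be checked with the integer methods in the Python 3 standard library, using raw bytes in place of a BitString. Treating the value as signed and 16 bits wide mirrors the intle16 token above.

```python
value = 950

# Little-endian, 16-bit, signed encoding of the value.
raw = value.to_bytes(2, byteorder="little", signed=True)

# Reading it back as little-endian recovers the value (the intle view).
as_intle = int.from_bytes(raw, byteorder="little", signed=True)

# Reversing the byte order and reading big-endian gives the same number,
# which is the equivalent of s[::-8].int.
as_reversed_int = int.from_bytes(raw[::-1], byteorder="big", signed=True)
```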
Original issue reported on code.google.com by [email protected]
on 16 Jul 2009 at 4:52
Any length set when creating a file-based BitString will be ignored when,
for example, displaying the BitString as a hex string.
Original issue reported on code.google.com by [email protected]
on 22 Jan 2009 at 4:48
b = BitString(data='\x28\x28', offset=1)
b.append('0b0') # asserts
It also fails for prepend, and probably more. The assert itself isn't all
that important so programs should still function if you use -O.
Need to add unit test to cover this!
Original issue reported on code.google.com by [email protected]
on 12 Mar 2009 at 4:14
For example:
>>> s = BitString('0o777', length=1, offset=1)
>>> s
BitString('0b11')
...which clearly doesn't have a length of 1.
Original issue reported on code.google.com by [email protected]
on 6 Jun 2009 at 7:32
__setitem__ can be used to replace a slice of a BitString with another, but
can't be used to insert. This should be possible:
a = BitString('0x0011223344')
a[16:16] = '0xff'
print a # 0x0011ff223344
But instead it raises an IndexError. Of course you can still use insert()
to do this.
Original issue reported on code.google.com by [email protected]
on 27 Apr 2009 at 8:45
The first parameter of split() could be an integer, which would then mean
that it would return a generator for constant sized chunks. For example
for byte in s.split(8):
    do_something_with(byte)
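A fixed-size chunk generator of this kind might look like the sketch below, which works on a string of '0' and '1' characters. The name `chunks` is hypothetical, and yielding a shorter final chunk when the length is not a multiple of the size is an assumption.

```python
def chunks(bits: str, size: int):
    """Lazily yield consecutive size-bit pieces of bits."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(bits), size):
        yield bits[start:start + size]


pieces = list(chunks("1111000010", 8))
```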
Original issue reported on code.google.com by [email protected]
on 5 Jun 2009 at 8:30
The option to initialise a BitString with a filename isn't fully
implemented yet.
If you want to analyse a file the suggested method is still to do something
like:
s = BitString(data=open('filename', 'rb').read())
which obviously isn't going to work very well if the file is very large.
If you need to analyse 20GB files (as I occasionally do) then feel free to
try the filename initialiser, but the interface and functionality have yet
to be finalised.
Original issue reported on code.google.com by [email protected]
on 21 Dec 2008 at 11:40
c = BitString('0x1122334455667788')
c.bitpos = 40
c.append('0b1').prepend('0x6666666') # asserts in _assertsanity()
Original issue reported on code.google.com by [email protected]
on 20 Mar 2009 at 4:21
Generally it's not possible to use integers to initialise a BitString
without providing a length, which means that it can be more cumbersome than
hex or bin initialisation.
However, if a slice is being specified then we already have a default
length so that this could make sense:
>>> a = BitString('0x000000')
>>> a[8:16] = 100
>>> print a
'0x006400'
If the signed or unsigned integer doesn't fit then a ValueError would be
raised.
Original issue reported on code.google.com by [email protected]
on 1 May 2009 at 3:04
What steps will reproduce the problem?
Python 2.6.1 (r261:67515, Jan 22 2009, 11:41:14)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import bitstring
>>> bs = bitstring.BitString('0x900dbeef')
>>> bs.bitpos
0
>>> bs.advancebits(0)
>>> bs.bitpos
0
>>>
What is the expected output? What do you see instead?
The docstring for advancebits() says:
"""Advance position by bits.
bits -- Number of bits to increment bitpos by. Must be >= 0.
Raises ValueError if bits is negative or if bitpos goes past the end
of the BitString.
"""
The doc text for bits should read "Must be > 0." The last sentence is correct.
What version of the product are you using? On what operating system?
Path: .
URL: http://python-bitstring.googlecode.com/svn/trunk
Repository Root: http://python-bitstring.googlecode.com/svn
Repository UUID: 442ccf1e-c85e-11dd-94fd-9de6169c3690
Revision: 288
Node Kind: directory
Schedule: normal
Last Changed Author: python.bitstring
Last Changed Rev: 285
Last Changed Date: 2009-04-24 03:38:13 +1000 (Fri, 24 Apr 2009)
Please provide any additional information below.
This may be the most trivial issue I have ever raised...sorry. It's in good
faith, I promise.
I believe the same bug occurs in the docstrings for advancebytes and the
retreat* methods.
Original issue reported on code.google.com by [email protected]
on 1 May 2009 at 4:49
Personally I dislike the name byteswap() (as used in the array module) as
it doesn't really say what's going on - i.e. which bytes are being swapped
with which. bytereverse() is closer to the truth.
Suggestion:
To change endianness of 2-byte data:
>>> s.reversebytes(size=2)
So base it on reversebits(), which could also change to have a size parameter.
def reversebytes(startbit=None, endbit=None, size=0)
def reversebits(startbit=None, endbit=None, size=0)
A size==0 implies that the whole slice just gets reversed, which is
backward compatible with the current reversebits().
Examples:
s = BitString('0x0011002200330044')
s.bytereverse() # 0x4400330022001100
s.bytereverse(size=2) # 0x1100220033004400
s.bytereverse(size=4) # 0x2200110044003300
s.bytereverse(size=3) # 0x001100330022 (the rest gets truncated)
s.bytereverse(size=1) # Unchanged - no effect
I'm not sure I like the name of the 'size' parameter, but I can't think of
anything better right now.
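The examples above can be reproduced on a bytes object with a short sketch. This stand-alone `bytereverse` returns a new value instead of modifying in place and omits the startbit and endbit parameters.

```python
def bytereverse(data: bytes, size: int = 0) -> bytes:
    """Reverse byte order within each size-byte group; size=0 reverses everything.

    Trailing bytes that do not fill a whole group are dropped.
    """
    if size < 0:
        raise ValueError("size must not be negative")
    if size == 0:
        return data[::-1]
    whole = len(data) - len(data) % size  # length covered by complete groups
    return b"".join(data[i:i + size][::-1] for i in range(0, whole, size))


s = bytes.fromhex("0011002200330044")
reversed_all = bytereverse(s).hex()
reversed_pairs = bytereverse(s, size=2).hex()
reversed_triples = bytereverse(s, size=3).hex()
```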
Original issue reported on code.google.com by [email protected]
on 10 Jul 2009 at 10:37
Example:
import bitstring
format = 'bits:4=BL_OFFT, uint:12=width, uint:12=height'
d = {'BL_OFFT': '0b1011', 'width': 352, 'height': 288}
s = bitstring.pack(format, **d)
No output expected. Instead, got a ValueError:
Traceback (most recent call last):
  File "trybs.py", line 4, in <module>
    s = bitstring.pack(format, **d)
  File "C:\Python26\lib\site-packages\bitstring.py", line 2663, in pack
    s.append(_init_with_token(name, length, value))
  File "C:\Python26\lib\site-packages\bitstring.py", line 101, in _init_with_token
    b = BitString(value)
  File "C:\Python26\lib\site-packages\bitstring.py", line 576, in __init__
    func(d, offset, length)
  File "C:\Python26\lib\site-packages\bitstring.py", line 1115, in _setauto
    self.append(_init_with_token(*token))
  File "C:\Python26\lib\site-packages\bitstring.py", line 107, in _init_with_token
    raise ValueError("Can't parse token name %s." % name)
ValueError: Can't parse token name bl_offt.
Note the lower-case name 'bl_offt' in the last line of output.
Changing the key to lower-case in the format and dictionary allowed the
example to run.
I'm using r456 in Subversion. Python version 2.6.2 on Windows XP.
Original issue reported on code.google.com by [email protected]
on 8 Sep 2009 at 2:19