Giter VIP home page Giter VIP logo

bit-vec's People

Contributors

alexcrichton avatar apasel422 avatar bholley avatar bluss avatar cuviper avatar dingelish avatar dpc avatar est31 avatar florin-saftoiu avatar fuuzetsu avatar gankra avatar gguoss avatar ifreund avatar j-schwar avatar koushiro avatar mre avatar pczarn avatar pr0methean avatar reem avatar stepancheg avatar ufoot avatar yihuang avatar zackpierce avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

bit-vec's Issues

Why does to_bytes() reverse the bits?

I was looking for an efficient way to get a &[u8] from the BitVec to write the bits out to file (when the memory layout is little-endian). I was thinking of adding an as_bytes() view, but I noticed that to_bytes() flips the bits within each byte so that the as_bytes() I want would be inconsistent with it. And from_bytes() does the same flip (which made my tests in the code I'm writing fail confusingly).

So

  1. Why was the current version implemented as such?
  2. Can we expose a byte slice view or an iterator that views the bytes in little-bit-endian format?

Should `set` be renamed to `switch`?

set usually means "set to 1", clear means "set to 0", toggle means "set to negation" and from my experience switch is a good word for setting to given boolean value.

'blocks_for_bits' implementation.

I don't know how the function as it exists is currently optimised, but in principle you can much more simply use (without risk of over/underflow):

if bits == 0 {
    0
} else {
    (bits-1) / B::bits() + 1
}

If bits were signed (and you have a signed right shift operated that was not UB) you can just use (bits-1) >> B::bitsbits() + 1 where bitsbits() is the number of bits required to store the index into a word (i.e. 5 for 32 bit words). This is probably not practical though.

Concurrent read/write support

Hello, I've been using BitVec for both a single-threaded and a multi-threaded version of the Sieve of Eratosthenes algorithm. Please note that I'm a beginner in the Rust language, so I might be doing things wrong.

In the documentation I haven't found anything indicating that this structure supports concurrent access and putting it in an Arc<Mutex<BitVec>> seems like a terrible idea when using multiple threads for computations as then it is effectively the same or worse than using a single thread.

Have I missed a part of the documentation where it says that concurrency is supported?
If not, are there plans to support concurrency?

To me it would seem most logical to put the BitBlock under a Mutex<BitBlock>, but that is just my beginner's view on this matter. On the other hand I could also hand out slices of the full vector to each thread and in that case I wouldn't even need a mutex conceptually to access the values safely, not sure how this is done in Rust though.

Illegal memory access

It appears that sometimes this crate does illegal memory access, and/or the test harness somehow does. When I run the tests with Valgrind, I get the following output:

 /  tmp  bit-vec  valgrind ./target/debug/deps/bit_vec-3219dec6733cc80b                                                                                                                                                          master 
==31860== Memcheck, a memory error detector
==31860== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==31860== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==31860== Command: ./target/debug/deps/bit_vec-3219dec6733cc80b
==31860== 

running 32 tests
==31860== Thread 5 tests::test_1_el:
==31860== Invalid read of size 8
==31860==    at 0x1BB9F0: je_tcache_bin_flush_small (tcache.c:132)
==31860==    by 0x1BB9F0: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x6000000 is in a rwx anonymous segment
==31860== 
==31860== Invalid read of size 8
==31860==    at 0x1BB813: tcache_destroy (tcache.c:109)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x6000000 is in a rwx anonymous segment
==31860== 
==31860== Invalid read of size 8
==31860==    at 0x1BB874: je_tcache_bin_flush_small (tcache.c:132)
==31860==    by 0x1BB874: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x6000000 is in a rwx anonymous segment
==31860== 
==31860== Invalid read of size 8
==31860==    at 0x19C5CC: je_arena_mapbits_small_runind_get (arena.h:857)
==31860==    by 0x19C5CC: arena_dalloc_bin_locked_impl (arena.c:2954)
==31860==    by 0x19C5CC: je_arena_dalloc_bin_junked_locked (arena.c:2981)
==31860==    by 0x1BBD94: je_tcache_bin_flush_small (tcache.c:137)
==31860==    by 0x1BBD94: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x6000018 is in a rwx anonymous segment
==31860== 
==31860== Invalid read of size 8
==31860==    at 0x19C636: je_arena_ptr_small_binind_get (arena.h:1096)
==31860==    by 0x19C636: arena_run_reg_dalloc (arena.c:307)
==31860==    by 0x19C636: arena_dalloc_bin_locked_impl (arena.c:2963)
==31860==    by 0x19C636: je_arena_dalloc_bin_junked_locked (arena.c:2981)
==31860==    by 0x1BBD94: je_tcache_bin_flush_small (tcache.c:137)
==31860==    by 0x1BBD94: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x6000018 is in a rwx anonymous segment
==31860== 
==31860== Invalid read of size 8
==31860==    at 0x19C6C7: je_bitmap_unset (bitmap.h:247)
==31860==    by 0x19C6C7: arena_run_reg_dalloc (arena.c:323)
==31860==    by 0x19C6C7: arena_dalloc_bin_locked_impl (arena.c:2963)
==31860==    by 0x19C6C7: je_arena_dalloc_bin_junked_locked (arena.c:2981)
==31860==    by 0x1BBD94: je_tcache_bin_flush_small (tcache.c:137)
==31860==    by 0x1BBD94: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  Address 0x15555555b556138 is not stack'd, malloc'd or (recently) free'd
==31860== 
==31860== Can't extend stack to 0x402a138 during signal delivery for thread 5:
==31860==   no stack segment
==31860== 
==31860== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==31860==  Access not within mapped region at address 0x402A138
==31860==    at 0x19C6C7: je_bitmap_unset (bitmap.h:247)
==31860==    by 0x19C6C7: arena_run_reg_dalloc (arena.c:323)
==31860==    by 0x19C6C7: arena_dalloc_bin_locked_impl (arena.c:2963)
==31860==    by 0x19C6C7: je_arena_dalloc_bin_junked_locked (arena.c:2981)
==31860==    by 0x1BBD94: je_tcache_bin_flush_small (tcache.c:137)
==31860==    by 0x1BBD94: tcache_destroy (tcache.c:371)
==31860==    by 0x1BCCC1: je_tcache_cleanup (tcache.c:410)
==31860==    by 0x1BD854: je_tsd_cleanup (tsd.c:82)
==31860==    by 0x524B1B7: __nptl_deallocate_tsd.part.5 (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x524C1E4: start_thread (in /usr/lib/libpthread-2.26.so)
==31860==    by 0x576FE7E: clone (in /usr/lib/libc-2.26.so)
==31860==  If you believe this happened as a result of a stack
==31860==  overflow in your program's main thread (unlikely but
==31860==  possible), you can try to increase the size of the
==31860==  main thread stack using the --main-stacksize= flag.
==31860==  The main thread stack size used in this run was 8388608.
==31860== Invalid write of size 8
==31860==    at 0x4A27660: _vgnU_freeres (vg_preloaded.c:59)
==31860==  Address 0x402aff8 is on thread 5's stack
==31860== 
==31860== 
==31860== Process terminating with default action of signal 11 (SIGSEGV)
==31860==  Access not within mapped region at address 0x402AFF8
==31860==    at 0x4A27660: _vgnU_freeres (vg_preloaded.c:59)
==31860==  If you believe this happened as a result of a stack
==31860==  overflow in your program's main thread (unlikely but
==31860==  possible), you can try to increase the size of the
==31860==  main thread stack using the --main-stacksize= flag.
==31860== 
==31860== HEAP SUMMARY:
==31860==     in use at exit: 1,220 bytes in 7 blocks
==31860==   total heap usage: 20 allocs, 13 frees, 3,444 bytes allocated
==31860== 
==31860== LEAK SUMMARY:
==31860==    definitely lost: 0 bytes in 0 blocks
==31860==    indirectly lost: 0 bytes in 0 blocks
==31860==      possibly lost: 1,152 bytes in 4 blocks
==31860==    still reachable: 68 bytes in 3 blocks
==31860==         suppressed: 0 bytes in 0 blocks
==31860== Rerun with --leak-check=full to see details of leaked memory
==31860== 
==31860== For counts of detected and suppressed errors, rerun with: -v
==31860== ERROR SUMMARY: 8 errors from 7 contexts (suppressed: 0 from 0)
zsh: segmentation fault (core dumped)  valgrind ./target/debug/deps/bit_vec-3219dec6733cc80b

Maintenance?

Is anyone still maintaining this lib? Is there a better fork available? We've got a lot of open PRs/issues that I'd like to see worked towards. Would be happy to take over if there's no one else doing it.

Implement bit-wise traits?

I'm currently trying to implement a few bit-parallel algorithms in Rust that I planned to use the BitVec type for. I frequently have to use bit-wise operations (|,&,^,<<,>>) on the bit vectors and was a bit disappointed that only &, | and ^ are supported, and that only as in-place operations via union, difference and intersection. Shifting is currently not supported at all.

Is there a specific reason why the bit-wise traits for these operations from std::ops (BitAnd, Shl, etc) are not implemented for this type? If not, would there be interest in a PR that implements them?

Stack-based storage for short vectors?

Since Vec is three usize values, we could have a stack-based variant of BitVec that could store up to 96 bits (on x86) or 192 bits (on x64) in the struct itself. This could potentially be much faster for short bitvectors since it doesn't require allocations. For example, my crate img_hash by default uses bitvectors of size 64.

With const fn, this could even work with the current user-defined block size design using mem::size_of() to calculate at compile time how many blocks of the given type could fit in the space that Vec would take up:

enum BitVecStorage<B> {
    Heap(Vec<B>),
    Stack([B; mem::size_of::<Vec<B>>() /  mem::size_of::<B>()]),
}

pub struct BitVec<B=u32> {
    storage: BitVecStorage<B>,
    nbits: usize,        
} 

With the exception of B=u64 on a x86 system, as 3 * 32 % 64 != 0, both variants will essentially use the same amount of space. And the stack storage can be transformed to heap storage very efficiently via slice::into_vec(self: Box<[T]>).

The only downsides I can see are an extra byte for the enum discriminator, and system-dependent performance since the point at which it switches to heap storage depends on the block type and the bitness of the system.

no way to disable serde on 0.6 with stable Rust

There seems to be no way to disable serde support on the recent bit-vec 0.6.0 release without having to disable std support as well. It's because of the line std = ["serde/std"] line in Cargo.toml: if you enable std you automatically also enable serde. This is a 0.5 → 0.6 regression. Either, add a std_no_deps feature that is used inside the code and that std depends on, or just erase that line and expect users to enable the std feature in serde themselves.

Support Custom Allocators

may be

let mut bits = BitVec::from_bytes(&mut mmap[..]);

but now (can't be mut)

let mut bits = BitVec::from_bytes(&mmap[..]);

Infinite recursion issue

The implementation of count_ones() does not make sense. It's actually an infinite recursion. Or is this intended? Or a bug? If intended, what is it supposed to do? Is it not infinite, or am I reading the code correctly?

bit-vec/src/lib.rs

Lines 153 to 164 in d3585be

impl BitBlock for $t {
#[inline]
fn bits() -> usize { $size }
#[inline]
fn from_byte(byte: u8) -> Self { $t::from(byte) }
#[inline]
fn count_ones(self) -> usize { self.count_ones() as usize }
#[inline]
fn one() -> Self { 1 }
#[inline]
fn zero() -> Self { 0 }
}

Actually contemplate the internals we expose

  • set_len
  • blocks
  • storage
  • storage_mut

were exposed to permit a separation into two crates. While I think these are useful in general, a more principled approach is probably desirable.

Also set_len and storage_mut were marked as unsafe, which I think is strictly unnecessary (bitvec has no unsafe code), but I left as a "here be dragons" reservation for future unsafety. Might not want to have it like that, though.

Add a method to get a partition of `BitVec`

It's sometimes helpful to get a partition of BitVec specified by a given range. Specifically, the following API might be useful:

/// Return a new `BitVec` at a given range. This operation is taking
/// O(range) time.
    pub fn get_range(&self, range: Range<usize>) -> Self;

This function will essentially copy range/block_size blocks with some constant time adjustments.

It's also helpful to have its iterator equivalent:

  pub fn iter(&self, range: Range<usize>) -> Iter<B> {
      self.ensure_invariant();
      Iter { bit_vec: self, range }
  }

Those two features will be useful in applications like MPC. It would be helpful if similar features are already implemented. Otherwise I can submit a PR. Thanks!

Push new documentation

I'd like to xor two bitvecs. The documentation suggests there is no such method. However, the latest release does contain the xor method. In other words: The documentation is outdated.

Please rebuild and re-upload, thanks!

(And thanks for the nice crate! No additional dependencies, just what I like to see.)

Publish 0.4.2?

I'd like to publish a new version of mioco and it depends on the recent fix for usize. Could you push a new version to crates.io?

Discussion: u32 default?

Not really a bug but I wanted to bring this up. In the bitvec struct the default storage size is u32. I typically use it with u8 but I appreciate the use of u32 or a higher size storage since this is allows register alignment. However, shouldn't it be usize to account for the wordsize on 64bit processors as well as 32bit?

`set_all` slow?

I noticed a regression in w3f/vdf@961caee which indicates set_all being slower than an allocation. Is it perhaps that the allocation is being optimized away, but the for loop in set_all has some overhead?

Restore `append` feature

This feature is unstable in nightly branch of rust, but has been forgotten when migrating from std::collections to this external repo.

`BitVec::append` fails silently

Hello, I've been using BitVec extensively for a project and I noticed that BitVec::append doesn't work all the time. See below for a minimal example, where I would expect the output to be [01, 00, 02, 03], but BitVec returns [01, 00, 02, 00].

use bit_vec::BitVec;

#[test]
fn it_works() {
    // Create a 8 bit integer with value (`1`) encoded in BE.
    let mut a = BitVec::from_elem(8, false);
    a.set(7, true);

    // Create a 16 bit integer with value (`2`) encoded in BE.
    let mut b = BitVec::from_elem(16, false);
    b.set(14, true);

    // Create a 8 bit integer with value (`3`) encoded in BE.
    let mut c = BitVec::from_elem(8, false);
    c.set(6, true);
    c.set(7, true);

    // Merge all bit_vecs together
    a.append(&mut b);
    a.append(&mut c);

    assert_eq!(&[01, 00, 02, 03][..], &*a.to_bytes());
}

More efficient `reverse_bits` implementation.

In general the reverse_bits function should be generalised to work on whatever word type the BitVec object is using and then be applied to accumulator directly instead of each byte (assuming the bytes are reversed while assembling the accumulator value). Additionally bit switching should be cheaper than shifting. Consider:

fn reverse_bits<B: BitBlock>(block: B) -> B {
    let mut result = B::zero();
    result = ((block >> 1) & (!B::zero() / 3)) | ((block & (!B::zero() / 3)) << 1);
    result = ((block >> 2) & (!B::zero() / 5)) | ((block & (!B::zero() / 5)) << 2);
    result = ((block >> 4) & (!B::zero() / 17)) | ((block & (!B::zero() / 17)) << 4);
    result
}

I'm assuming !B::zero() / 3 etc. become compile time constants.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.