ericboesch / bitset
This project forked from tyler/bitset
Bitset implementation for Ruby
License: MIT License
I'm working with some large bitsets and I'm experiencing some issues:
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin21]
Creating a bitset of 2**31 bits or more fails:
[1]dev(main)> Bitset.new(2**30); # Is fine
[2]dev(main)> Bitset.new(2**31);
RangeError: integer 2147483648 too big to convert to `int'
from (pry):2:in `initialize'
The cause is probably the NUM2INT used here: https://github.com/ericboesch/bitset/blob/master/ext/bitset/bitset.c#L73
Working around it with the if branch of that method, by passing an array, gets interesting:
[10]dev(main)> b = Bitset.new([nil] * (2**30)); # Is fine
[13]dev(main)> b.size
=> 1073741824
[14]dev(main)> b = Bitset.new([nil] * (2**31)); # Does not crash!
[16]dev(main)> b.size # The size gets signed though, probably due to https://github.com/ericboesch/bitset/blob/master/ext/bitset/bitset.c#L80
=> -2147483648
[17]dev(main)> b.set(1); # Indexes get set out of bounds
IndexError: Index out of bounds
from (pry):17:in `set`
[18]dev(main) > b # printing it segfaults
=> [redacted]/1.2.0/lib/bitset.rb:19: [BUG] Segmentation fault at 0xffffffff80081270
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin21]
-- Crash Report log information --------------------------------------------
See Crash Report log file under the one of following:
* ~/Library/Logs/DiagnosticReports
* /Library/Logs/DiagnosticReports
for more details.
Don't forget to include the above Crash Report log file in bug reports.
-- Control frame information -----------------------------------------------
c:0053 p:---- s:0296 e:000295 CFUNC :to_s
c:0052 p:0019 s:0292 e:000289 METHOD /[redacted]/ruby/2.7.6/lib/ruby/gems/2.7.0/gems/bitset-1.2.0/lib/bitset.rb:19
I guess the NUM2INT and INT2NUM calls are the cause here. My CRuby knowledge is limited, but switching to NUM2ULL and ULL2NUM, or NUM2SIZET and SIZET2NUM (and changing the ints to size_t), could perhaps work?
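To make the size_t idea concrete, here is a minimal standalone sketch. It is not the gem's code: words_for_bits and exceeds_int are hypothetical names, and the 64-bit word size is an assumption based on the uint64_t helpers in bitset.c. It only illustrates that the length arithmetic stays well-defined at 2**31 once it is done in size_t instead of int.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of 64-bit words needed to hold `bits` bits.
   With size_t, a length of 2**31 or more stays representable on 64-bit
   platforms instead of wrapping to a negative int. */
static size_t words_for_bits(size_t bits) {
    return bits / 64 + (bits % 64 != 0);
}

/* Returns 1 when a requested length does not fit in a C int, which is
   the condition under which NUM2INT raises RangeError. */
static int exceeds_int(size_t bits) {
    return bits > (size_t)INT_MAX;
}
```

In the extension itself the same change would presumably also have to cover every loop counter and struct field that holds a length, since a single remaining int would reintroduce the truncation.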
I've observed a memory leak while using this gem. Valgrind shows that it's happening in to_s.
92 bytes in 6 blocks are definitely lost in loss record 12,037 of 19,580
malloc (at /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
*rb_bitset_to_s (bitset.c:308)
vm_call_cfunc_with_frame (vm_insnhelper.c:2514)
vm_call_cfunc (vm_insnhelper.c:2539)
vm_call_method (vm_insnhelper.c:3053)
vm_sendish (vm_insnhelper.c:4023)
vm_exec_core (insns.def:801)
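The trace points at a malloc inside rb_bitset_to_s whose block is never released. A likely shape for the fix, sketched under assumptions: the function fills a temporary buffer, hands it to something that copies the bytes (rb_str_new does copy), and then needs to free the temporary. The names copy_string and bits_to_s below are hypothetical stand-ins, and the bit order is assumed, so this compiles without ruby.h and is not the gem's actual implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a copying constructor such as rb_str_new: it duplicates
   the bytes it is given into memory owned by the caller. */
static char *copy_string(const char *src, size_t len) {
    char *out = malloc(len + 1);
    if (out == NULL) return NULL;
    memcpy(out, src, len);
    out[len] = '\0';
    return out;
}

/* Hypothetical to_s: render the first `len` bits of `words` as '0'/'1'. */
static char *bits_to_s(const uint64_t *words, size_t len) {
    char *tmp = malloc(len + 1);
    if (tmp == NULL) return NULL;
    for (size_t i = 0; i < len; i++) {
        tmp[i] = ((words[i / 64] >> (i % 64)) & 1) ? '1' : '0';
    }
    tmp[len] = '\0';
    char *result = copy_string(tmp, len);
    free(tmp); /* without this, every call loses len + 1 bytes */
    return result;
}
```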
On some compilers (e.g. g++ 7.3.0) the library compiles but doesn't load, because the functions marked inline don't get emitted into the .so. For example, with inline uint64_t xor(uint64_t a, uint64_t b) { return a ^ b; }, nm bitset.so shows:
0000000000000f50 T _init
U xor
0000000000002950 T assign_bit
00000000000012d0 T bitset_free
0000000000002030 T bitset_new
0000000000002060 T bitset_setup
0000000000204058 B cBitset
U calloc@@GLIBC_2.2.5
0000000000204008 b completed.7696
00000000000011f0 t deregister_tm_clones
U difference
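The definitions need to be switched to static (or static inline) so that a definition is always emitted inside bitset.so; with plain inline, the compiler may emit only a reference to an external symbol that nothing provides. Like so: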
static uint64_t xor(uint64_t a, uint64_t b) { return a ^ b; }
see https://github.com/QuoineFinancial/bitset which did just that
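For reference, a small sketch of what the helpers could look like with internal linkage. The names word_xor and word_difference are hypothetical, and the body of the difference helper is an assumption (a AND NOT b); only xor is quoted in this report.

```c
#include <stdint.h>

/* `static` gives each helper internal linkage, so the definition lives in
   this translation unit and no external symbol has to be resolved when
   the shared object is loaded. */
static inline uint64_t word_xor(uint64_t a, uint64_t b) { return a ^ b; }

/* Assumed semantics for the `difference` symbol seen in the nm output. */
static inline uint64_t word_difference(uint64_t a, uint64_t b) { return a & ~b; }
```

Keeping inline next to static still lets the compiler inline the calls; the static is what removes the undefined U entries.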
Attempting to build the extension with Clang 16 fails like this:
current directory: /home/somers/.gem/ruby/3.1/gems/bitset-1.2.0/ext/bitset
/usr/local/bin/ruby31 extconf.rb
creating Makefile
current directory: /home/somers/.gem/ruby/3.1/gems/bitset-1.2.0/ext/bitset
make DESTDIR\= sitearchdir\=./.gem.20230916-8087-i84bj2 sitelibdir\=./.gem.20230916-8087-i84bj2 clean
current directory: /home/somers/.gem/ruby/3.1/gems/bitset-1.2.0/ext/bitset
make DESTDIR\= sitearchdir\=./.gem.20230916-8087-i84bj2 sitelibdir\=./.gem.20230916-8087-i84bj2
compiling bitset.c
bitset.c:83:33: warning: function 'raise_index_error' could be declared with attribute 'noreturn' [-Wmissing-noreturn]
static void raise_index_error() {
^
bitset.c:318:18: warning: implicit conversion loses integer precision: 'long' to 'int' [-Wshorten-64-to-32]
int length = RSTRING_LEN(s);
~~~~~~ ^~~~~~~~~~~~~~
/usr/local/include/ruby-3.1/ruby/internal/core/rstring.h:52:27: note: expanded from macro 'RSTRING_LEN'
#define RSTRING_LEN RSTRING_LEN
^
bitset.c:509:16: warning: implicit conversion loses integer precision: 'long' to 'int' [-Wshorten-64-to-32]
int alen = RARRAY_LEN(index_array);
~~~~ ^~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/ruby-3.1/ruby/internal/core/rarray.h:68:36: note: expanded from macro 'RARRAY_LEN'
#define RARRAY_LEN rb_array_len /**< @alias{rb_array_len} */
^
bitset.c:605:5: error: incompatible function pointer types passing 'VALUE (VALUE, VALUE)' (aka 'unsigned long (unsigned long, unsigned long)') to parameter of type 'VALUE (*)(VALUE)' (aka 'unsigned long (*)(unsigned long)') [-Wincompatible-function-pointer-types]
rb_define_method(cBitset, "size", rb_bitset_size, 0);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:287:135: note: expanded from macro 'rb_define_method'
#define rb_define_method(klass, mid, func, arity) RBIMPL_ANYARGS_DISPATCH_rb_define_method((arity), (func))((klass), (mid), (func), (arity))
^~~~~~
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:276:1: note: passing argument to parameter here
RBIMPL_ANYARGS_DECL(rb_define_method, VALUE, const char *)
^
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:254:72: note: expanded from macro 'RBIMPL_ANYARGS_DECL'
RBIMPL_ANYARGS_ATTRSET(sym) static void sym ## _00(__VA_ARGS__, VALUE(*)(VALUE), int); \
^
bitset.c:649:5: error: incompatible function pointer types passing 'VALUE (VALUE, VALUE)' (aka 'unsigned long (unsigned long, unsigned long)') to parameter of type 'VALUE (*)(VALUE)' (aka 'unsigned long (*)(unsigned long)') [-Wincompatible-function-pointer-types]
rb_define_method(cBitset, "reverse", rb_bitset_reverse, 0);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:287:135: note: expanded from macro 'rb_define_method'
#define rb_define_method(klass, mid, func, arity) RBIMPL_ANYARGS_DISPATCH_rb_define_method((arity), (func))((klass), (mid), (func), (arity))
^~~~~~
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:276:1: note: passing argument to parameter here
RBIMPL_ANYARGS_DECL(rb_define_method, VALUE, const char *)
^
/usr/local/include/ruby-3.1/ruby/internal/anyargs.h:254:72: note: expanded from macro 'RBIMPL_ANYARGS_DECL'
RBIMPL_ANYARGS_ATTRSET(sym) static void sym ## _00(__VA_ARGS__, VALUE(*)(VALUE), int); \
^
3 warnings and 2 errors generated.
*** Error code 1
My environment is:
> freebsd-version
15.0-CURRENT
> clang --version
FreeBSD clang version 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)
Target: x86_64-unknown-freebsd15.0
Thread model: posix
InstalledDir: /usr/bin
> ruby --version
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [amd64-freebsd15]
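Both errors say the same thing: a function taking two VALUE arguments is registered with arity 0, where Ruby expects a function taking only the receiver. The usual fix is to drop the extra parameter from rb_bitset_size and rb_bitset_reverse. The sketch below shows that shape with stand-in types so it builds without ruby.h; value_t, arity0_fn, fake_bitset, sketch_size and call_arity0 are all hypothetical and not part of Ruby's API or the gem.

```c
#include <stdint.h>

/* Stand-ins for VALUE and for the arity-0 callback type that
   rb_define_method(..., 0) expects: a function of the receiver only. */
typedef uintptr_t value_t;
typedef value_t (*arity0_fn)(value_t self);

struct fake_bitset { long len; };

/* Arity-0 method: exactly one parameter, the receiver. A second
   parameter here would make the pointer types incompatible, which
   Clang 16 reports as an error by default. */
static value_t sketch_size(value_t self) {
    const struct fake_bitset *bs = (const struct fake_bitset *)self;
    return (value_t)bs->len;
}

/* Hypothetical dispatcher that accepts only arity-0 signatures. */
static value_t call_arity0(arity0_fn fn, value_t self) {
    return fn(self);
}
```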