natalie's Introduction

Natalie

Natalie is a work-in-progress Ruby implementation.

It provides an ahead-of-time compiler using C++ and gcc/clang as the backend. Also, the language has a REPL that performs incremental compilation.

There is much work left to do before this is useful. Please let me know if you want to help!

Helping Out

Contributions are welcome! You can learn more about how I work on Natalie via the hacking session videos on YouTube.

The easiest way to get started right now would be to find a method on an object that is not yet implemented and make it yourself! Also take a look at good first issues. (See the 'Building' and 'Running Tests' sections below for some helpful steps.)

We have a very quiet Discord server -- come and hang out!

Building

Natalie is tested on macOS and Ubuntu Linux. Windows is not yet supported.

Natalie requires a system Ruby (MRI) to host the compiler, for now.

Prerequisites:

  • git
  • autoconf
  • automake
  • libtool
  • GNU make
  • gcc or clang
  • Ruby 3.1 or higher with dev headers
    • Using rbenv to install Ruby is preferred.
    • Installing rbenv-aliases along with rbenv helps with matching Ruby versions like 3.1 to the latest patch release.
    • If not using rbenv or another version manager, you'll need the ruby and ruby-dev packages from your system.
  • ccache (optional, but recommended)
  • compiledb (optional, but recommended)

Install the above prerequisites on your platform, then run:

git clone https://github.com/natalie-lang/natalie
cd natalie
rake

Troubleshooting Build Errors

  • Don't use sudo! If you already made that mistake, then you should sudo rm -rf build and try again.
  • If you get an error about file permissions, e.g. unable to write a file to somewhere like /usr/lib/ruby, or another path that would require root, then you have a couple of options:
    • Use a tool like rbenv to install a Ruby version in your home directory. Gems will also be installed there. Run rbenv version to see which version is currently selected. Run rbenv shell followed by a version to select that version.
    • Specify where to install gems with something like:
      mkdir -p ~/gems
      export GEM_HOME=~/gems
      
      You'll just have to remember to do that every time you open a new terminal tab.
  • If you get an error about missing bundler, then your operating system probably didn't install it alongside Ruby. You can run gem install bundler to get it.

NOTE: Currently, the default build is the "debug" build, since Natalie is in active development. But you can build in release mode with rake build_release.

Usage

REPL:

bin/natalie

Run a Ruby script:

bin/natalie examples/hello.rb

Compile a file to an executable:

bin/natalie -c hello examples/hello.rb
./hello

Using With Docker

docker build -t natalie .                                            # build image
docker run -it --rm natalie                                          # repl
docker run -it --rm natalie -e "p 2 * 3"                             # immediate
docker run -it --rm -v$(pwd)/myfile.rb:/myfile.rb natalie /myfile.rb # execute a local rb file
docker run -it --rm --entrypoint bash natalie                        # bash prompt

Running Tests

To run a test (or spec), you can run it like a normal Ruby script:

bin/natalie spec/core/string/strip_spec.rb

This will run the tests and tell you if there are any failures.

If you want to run all the tests that we expect to pass, you can run:

rake test

Lastly, if you need to run a handful of tests locally, you can use the test/runner.rb helper script:

bin/natalie test/runner.rb test/natalie/if_test.rb test/natalie/loop_test.rb

What's the difference between the 'spec/' and 'test/' directories?

The files in spec/ come from the excellent ruby/spec project, which is a community-curated repo of test files that any Ruby implementation can use to compare its conformance to what MRI (Matz's Ruby Interpreter) does. We copy specs over as we implement the part of the language that they cover.

Everything in test/ is stuff we wrote while working on Natalie. These are tests that helped us bootstrap certain parts of the language and/or weren't covered as much as we would like by the official Ruby specs. We use this to supplement the specs in spec/.

Copyright & License

Natalie is copyright 2023, Tim Morgan and contributors. Natalie is licensed under the MIT License; see the LICENSE file in this directory for the full text.

Some parts of this program are copied from other sources, and the copyright belongs to the respective owner. Such copyright notices are either at the top of the respective file, in the same directory with a name like LICENSE, or both.

file(s)        | copyright                          | license
bigint.*       | 983                                | Unlicense
delegate.rb    | Yukihiro Matsumoto                 | BSD
dtoa.c         | David M. Gay, Lucent Technologies  | custom permissive
ipaddr.rb      | Hajimu Umemoto and Akinori Musha   | BSD
find.rb        | Kazuki Tsujimoto                   | BSD
linenoise      | S. Sanfilippo and P. Noordhuis     | BSD
minicoro.h     | Eduardo Bart                       | MIT
pp.rb          | Yukihiro Matsumoto                 | BSD
prettyprint.rb | Yukihiro Matsumoto                 | BSD
shellwords.rb  | Akinori MUSHA                      | BSD
spec/*         | Engine Yard, Inc.                  | MIT
uri.rb         | Akira Yamada                       | BSD
uri/*          | Akira Yamada                       | BSD
version.rb     | Engine Yard, Inc.                  | MIT
zlib           | Jean-loup Gailly and Mark Adler    | zlib license

See each file above for full copyright and license text.

natalie's People

Contributors

ai-mozi, alimpfard, andrykonchin, awesomekling, borisromanov, cian911, crystalsage, davidot, duncan-britt, fncontroloption, g-flat, hendiadyoin1, herwinw, imustafin, jbampton, jcs, kddnewton, linusg, mateusdeap, nanobowers, ohadrau, richardboehme, robertbendun, ryangjchandler, seven1m, stevegeek, tekknolagi, thegrizzlydev, timcraft, watzon

natalie's Issues

Implement remaining Float methods

Much of Float has been implemented, but there are still some methods missing and some specs disabled:

  • spec/core/float/angle_spec.rb
  • spec/core/float/arg_spec.rb
  • spec/core/float/constants_spec.rb
  • spec/core/float/denominator_spec.rb
  • spec/core/float/infinite_spec.rb
  • spec/core/float/magnitude_spec.rb
  • spec/core/float/modulo_spec.rb
  • spec/core/float/multiply_spec.rb
  • spec/core/float/numerator_spec.rb
  • spec/core/float/phase_spec.rb
  • spec/core/float/plus_spec.rb
  • spec/core/float/rationalize_spec.rb

Implement Kernel#loop method

I had to use while true just now because I guess I forgot to implement loop. Whoops!

It appears that loop should be implemented as a method on the Kernel module:

irb(main):001:0> method(:loop)
=> #<Method: Object(Kernel)#loop()>
irb(main):002:0> method(:loop).owner
=> Kernel
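
For reference, the core behaviour of loop can be sketched in plain Ruby. The module name LoopSketch and the method name my_loop are made up here so the snippet does not clobber the real Kernel#loop. It only illustrates the semantics (repeat the block, stop on StopIteration and return its result, return an enumerator when no block is given), not how Natalie would implement it on the C++ side.

```ruby
# Hypothetical names: LoopSketch and my_loop are for illustration only.
module LoopSketch
  def my_loop
    # Without a block, hand back an enumerator that reports an infinite size.
    return enum_for(:my_loop) { Float::INFINITY } unless block_given?

    begin
      while true
        yield
      end
    rescue StopIteration => e
      # StopIteration ends the loop quietly; its result becomes the return value.
      e.result
    end
  end
end

include LoopSketch

count = 0
my_loop do
  count += 1
  break if count == 3
end
p count # => 3
```

A break inside the block leaves my_loop directly, so no special handling is needed for it in this sketch.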

Just two methods that I had to remove to make my program run with Natalie

srand and Math.hypot

The rest is working well -- my program compiles and works correctly!

However, it's much slower than regular Ruby. I'll leave the code here in case you are interested in a benchmark:

def find_path distances
  struct = Struct.new :i, :adjacent
  nodes = distances.size.times.map{ |i| struct.new i, [] }
  i = distances.map(&:max).index distances.map(&:max).max
  j = distances[i].index distances[i].max
  nodes[i].adjacent.push nodes[j]
  nodes[j].adjacent.push nodes[i]
  transposed = nodes.reject{ |n| n.adjacent.empty? }.map{ |n| distances[n.i] }.transpose.map(&:min)
  puts "pathing"
  until nodes.map(&:adjacent).map(&:size).reduce(:+) / 2 == nodes.size - 1
    candidate = nodes.select{ |n| n.adjacent.empty? }.max_by{ |n| transposed[n.i] }
    a = nodes.find{ |_| _.adjacent.size == 1 }
    prev, ways = nil, []
    while b = a.adjacent.find{ |_| _ != prev }
      prev = a
      ways.push [a, b, distances[candidate.i][a.i] + distances[candidate.i][b.i] - distances[a.i][b.i]]
      a = b
    end
    a, b, _ = ways.min_by(&:last)
    a.adjacent.delete_at a.adjacent.index b
    b.adjacent.delete_at b.adjacent.index a
    a.adjacent.push candidate; candidate.adjacent.push a
    b.adjacent.push candidate; candidate.adjacent.push b
    distances[candidate.i].each_with_index{ |e, i| transposed[i] = e if e < transposed[i] }
  end
  puts "fixing"
  1.times do
    a = nodes.find{ |_| _.adjacent.size == 1 }
    p1, p2 = nil, a
    *mid, e = ARGV[0].to_i.times.map{ p1, p2 = p2, p2.adjacent.find{ |_| _ != p1 }; p2 }
    while e
      best = mid.permutation.min_by{ |perm| [a,*perm,e].each_cons(2).map{ |a,b| distances[a.i][b.i] }.reduce(:+) }
      if mid != best
        a.adjacent[a.adjacent.index{ |_| _ == mid.first }] = best.first
        [a, *best, e].each_cons(3){ |a,b,c| b.adjacent = [a,c] }
        e.adjacent[e.adjacent.index{ |_| _ == mid.last }] = best.last
      end
      a, *mid, e = *best, e, e.adjacent.find{ |_| _ != best.last }
    end
  end
  a = nodes.find{ |_| _.adjacent.size == 1 }
  prev, ways = nil, []
  while b = a.adjacent.find{ |_| _ != prev }
    ways.push [a, b].tap{ prev, a = a, b }.map(&:i)
  end
  [*ways.map(&:first), ways.last.last]
end

size, w = 100, 5
# vertices = Array.new(ARGV[1].to_i){ [rand(1...size), rand(1...size)] }
# vertices = [[50, 55], [50, 45], [70, 50], [35, 25], [35, 75]]
vertices = Array.new(ARGV[1].to_i-1){ |i| Array.new(ARGV[1].to_i-1){ |j| [i*10+10, j*10+10] } }.flatten(1)

p find_path vertices.map{ |v1| vertices.map{ |v2| (v1[0] - v2[0])**2 + (v1[1] - v2[1])**2 } }

$ time ./2 5 15
real	0m9.207s
user	0m7.694s
sys	0m1.349s
$ time ruby 2.rb 5 15
real	0m0.239s
user	0m0.172s
sys	0m0.054s

Build fails on macOS (arm64)

Cool project!

Unfortunately when I try to build from source I get the following errors:

[  6%] Building CXX object CMakeFiles/natalie-base.dir/src/fiber_value.cpp.o
<inline asm>:3:8: error: unexpected token in argument list
        mov 12(%esp), %eax
              ^
<inline asm>:4:7: error: unexpected token in argument list
        mov 8(%esp), %ecx
             ^
<inline asm>:5:7: error: unexpected token in argument list
        mov 4(%esp), %edx
             ^
<inline asm>:6:7: error: unknown token in expression
        push %ebx
             ^
<inline asm>:6:7: error: invalid operand
        push %ebx
             ^
<inline asm>:7:7: error: unknown token in expression
        push %ebp
             ^
<inline asm>:7:7: error: invalid operand
        push %ebp
             ^
<inline asm>:8:7: error: unknown token in expression
        push %esi
             ^
<inline asm>:8:7: error: invalid operand
        push %esi
             ^
<inline asm>:9:7: error: unknown token in expression
        push %edi
             ^
<inline asm>:9:7: error: invalid operand
        push %edi
             ^
<inline asm>:10:6: error: unknown token in expression
        mov %esp, (%ecx)
            ^
<inline asm>:10:6: error: invalid operand
        mov %esp, (%ecx)
            ^
<inline asm>:11:7: error: unknown token in expression
        mov (%edx), %esp
             ^
<inline asm>:11:6: error: invalid operand
        mov (%edx), %esp
            ^
<inline asm>:12:6: error: unknown token in expression
        pop %edi
            ^
<inline asm>:12:6: error: invalid operand
        pop %edi
            ^
<inline asm>:13:6: error: unknown token in expression
        pop %esi
            ^
<inline asm>:13:6: error: invalid operand
        pop %esi
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
make[4]: *** [CMakeFiles/natalie-base.dir/src/fiber_value.cpp.o] Error 1
make[3]: *** [CMakeFiles/natalie-base.dir/all] Error 2
make[2]: *** [all] Error 2
make[1]: *** [build_debug] Error 2
make: *** [build] Error 2

Is there an alternative to fiber_value.* that supports arm64?

Implement methods of Integer

Our main approach is to work on one spec from ruby/spec at a time, implementing as much of the spec as possible, then check it off the list:

  • abs_spec.rb
  • allbits_spec.rb
  • anybits_spec.rb
  • bit_and_spec.rb
  • bit_length_spec.rb
  • bit_or_spec.rb
  • bit_xor_spec.rb
  • case_compare_spec.rb
  • ceil_spec.rb
  • chr_spec.rb
  • coerce_spec.rb
  • comparison_spec.rb
  • complement_spec.rb
  • constants_spec.rb
  • denominator_spec.rb
  • digits_spec.rb
  • div_spec.rb
  • divide_spec.rb
  • divmod_spec.rb
  • downto_spec.rb
  • dup_spec.rb
  • element_reference_spec.rb
  • equal_value_spec.rb
  • even_spec.rb
  • exponent_spec.rb
  • fdiv_spec.rb
  • floor_spec.rb
  • gcd_spec.rb
  • gcdlcm_spec.rb
  • gt_spec.rb
  • gte_spec.rb
  • integer_spec.rb
  • lcm_spec.rb
  • left_shift_spec.rb
  • lt_spec.rb
  • lte_spec.rb
  • magnitude_spec.rb
  • minus_spec.rb
  • modulo_spec.rb
  • multiply_spec.rb
  • next_spec.rb
  • nobits_spec.rb
  • numerator_spec.rb
  • odd_spec.rb
  • ord_spec.rb
  • plus_spec.rb
  • pow_spec.rb
  • pred_spec.rb
  • rationalize_spec.rb
  • remainder_spec.rb
  • right_shift_spec.rb
  • round_spec.rb
  • size_spec.rb
  • sqrt_spec.rb
  • succ_spec.rb
  • times_spec.rb
  • to_f_spec.rb
  • to_i_spec.rb
  • to_int_spec.rb
  • to_r_spec.rb
  • to_s_spec.rb
  • truncate_spec.rb
  • try_convert_spec.rb
  • uminus_spec.rb
  • upto_spec.rb

Implement Fibers

I was hoping to hold off working on Fibers, but I believe we need them to be able to implement Enumerator properly. Thus this is blocking #21.

From my research, I believe fibers should be implemented using getcontext and friends. Should be fun!
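
To make the dependency concrete, here is a minimal sketch of how external enumeration (calling next repeatedly) can be built on top of Fiber. The class name FiberEnumerator and its interface are assumptions for illustration; this is neither Natalie's nor MRI's actual Enumerator implementation.

```ruby
# Hypothetical class: a tiny external iterator built on Fiber.
class FiberEnumerator
  def initialize(&producer)
    @fiber = Fiber.new do
      # The producer receives a callable; each call suspends the fiber
      # and hands one value back to whoever called #next.
      producer.call(->(value) { Fiber.yield(value) })
      # Once the producer finishes, signal the end of iteration.
      raise StopIteration
    end
  end

  # Resume the producer until it yields the next value.
  def next
    @fiber.resume
  end
end

numbers = FiberEnumerator.new do |yielder|
  yielder.call(1)
  yielder.call(2)
end

p numbers.next # => 1
p numbers.next # => 2
```

The producer keeps its own stack between calls to next, which is exactly the part that is hard to do without fibers (or something like getcontext underneath).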

Move logic out of array.cpp into array_value.cpp, etc. and use auto-generated bindings

I've started moving the logic out of loose functions into methods on the Value classes. Bindings from Ruby to C++ will all be generated by the new binding_gen.rb script.

This should make it easier to read and understand where something should go and saves a lot of work grabbing the arguments and other stuff you need from the Ruby side.

Yay! 🎉

  • src/array.cpp
  • src/class.cpp
  • src/encoding.cpp
  • src/exception.cpp
  • src/false_class.cpp
  • src/float.cpp
  • src/hash.cpp
  • src/integer.cpp
  • src/io.cpp
  • src/matchdata.cpp
  • src/module.cpp
  • src/nil_class.cpp
  • src/object.cpp
  • src/proc.cpp
  • src/range.cpp
  • src/regexp.cpp
  • src/string.cpp
  • src/symbol.cpp
  • src/true_class.cpp

Docs on running specs needed

Disclaimer: I'm a complete ruby noob :)

Just wrote my first couple of lines of code for natalie & found several issues (#195, #201, #217) referring to specs. While I did manage to get it to run some tests, the whole process was & still is really unclear to me, as I couldn't find any docs.

Specifically:

  1. From various other PRs I noticed that spec files are added on a case-by-case basis:
    • Are these copied from https://github.com/ruby/spec?
    • If so: why not use a submodule or separate clone of that repository? Manually copying files initially and then again for every upstream change seems incredibly cumbersome and error-prone. We've had a good experience with using a plain checkout of the upstream test262 repository for Serenity's LibJS, both for local development and CI.
  2. Simply doing what the GitHub Actions workflow does causes thousands of unstaged git changes. Surely that's not the right way to test locally? Or is the whole idea that you only run a subset of known working tests locally (part of the natalie repo) and all of them on CI?
    https://github.com/seven1m/natalie/blob/d012e0f86304f4d1282dc77f6d588c49e31ecc40/.github/workflows/run_all_specs.yml#L21-L29
  3. Is there a way to specify a subdirectory or specific spec file? I had to edit this line manually:
    https://github.com/seven1m/natalie/blob/bc22ef43ae1200eb57f33ac7459b3e3c9209f48f/spec/support/ruby_spec_runner.rb#L20
  4. Is there a way of getting detailed test failure output? I only got this:
    Start running specs spec/core/string/lstrip_spec.rb
    Ran spec spec/core/string/lstrip_spec.rb
    {"Successful Examples"=>2, "Failed Examples"=>2, "Errored Examples"=>2, "Compile Errors"=>0, "Crashes"=>0, "Timeouts"=>0}
    
  5. How do I install the needed dependencies for the spec runner? The command from the CI workflow (bundle exec ruby spec/support/ruby_spec_runner.rb) failed with:
    spec/support/ruby_spec_runner.rb:1:in `require': cannot load such file -- concurrent (LoadError)
        from spec/support/ruby_spec_runner.rb:1:in `<main>'
    
    Running bundle add concurrent-ruby didn't fix that. However, gem install concurrent-ruby followed by ruby spec/support/ruby_spec_runner.rb worked. Again, no idea what I'm doing. :)

Thanks!

+= operator (and friends) on accessor does not compile

class Foo
  def initialize
    @foo = 0
  end

  attr_accessor :foo
end

Foo.new.foo += 1

Compile error:

/Users/tim/pp/natalie/lib/natalie/compiler/pass1.rb:663:in `process_op_asgn2': expected || (RuntimeError)

...same applies for -=, *=, etc.
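
As a reference for what the compiler needs to produce, an operator assignment on an accessor can be expanded by hand into a getter call followed by a setter call, with the receiver evaluated only once. The sketch below shows that expansion in plain Ruby; the variable names receiver and other are made up for illustration, and this says nothing about how pass1.rb is structured.

```ruby
class Foo
  attr_accessor :foo

  def initialize
    @foo = 0
  end
end

# The form from the report:
obj = Foo.new
obj.foo += 1

# The same thing expanded by hand (names are illustrative):
other = Foo.new
receiver = other                     # evaluate the receiver once
receiver.foo = receiver.foo + 1      # call the getter, apply the operator, call the setter

p obj.foo   # => 1
p other.foo # => 1
```

The same shape applies to -=, *= and friends with the operator swapped; ||= and &&= differ only in that the setter is called conditionally.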

Implement methods of String

Our main approach is to work on one spec from ruby/spec at a time, implementing as much of the spec as possible, then check it off the list:

  • allocate_spec.rb
  • append_spec.rb
  • ascii_only_spec.rb
  • b_spec.rb
  • byteindex_spec.rb
  • byterindex_spec.rb
  • bytesize_spec.rb
  • byteslice_spec.rb
  • bytesplice_spec.rb
  • bytes_spec.rb
  • capitalize_spec.rb
  • casecmp_spec.rb
  • case_compare_spec.rb
  • center_spec.rb
  • chars_spec.rb
  • chilled_string_spec.rb
  • chomp_spec.rb
  • chop_spec.rb
  • chr_spec.rb
  • clear_spec.rb
  • clone_spec.rb
  • codepoints_spec.rb
  • comparison_spec.rb
  • concat_spec.rb
  • count_spec.rb
  • crypt_spec.rb
  • dedup_spec.rb
  • delete_prefix_spec.rb
  • delete_spec.rb
  • delete_suffix_spec.rb
  • downcase_spec.rb
  • dump_spec.rb
  • dup_spec.rb
  • each_byte_spec.rb
  • each_char_spec.rb
  • each_codepoint_spec.rb
  • each_grapheme_cluster_spec.rb
  • each_line_spec.rb
  • element_reference_spec.rb
  • element_set_spec.rb
  • empty_spec.rb
  • encode_spec.rb
  • encoding_spec.rb
  • end_with_spec.rb
  • eql_spec.rb
  • equal_value_spec.rb
  • force_encoding_spec.rb
  • freeze_spec.rb
  • getbyte_spec.rb
  • grapheme_clusters_spec.rb
  • gsub_spec.rb
  • hash_spec.rb
  • hex_spec.rb
  • include_spec.rb
  • index_spec.rb
  • initialize_spec.rb
  • insert_spec.rb
  • inspect_spec.rb
  • intern_spec.rb
  • length_spec.rb
  • lines_spec.rb
  • ljust_spec.rb
  • lstrip_spec.rb
  • match_spec.rb
  • modulo_spec.rb
  • multiply_spec.rb
  • new_spec.rb
  • next_spec.rb
  • oct_spec.rb
  • ord_spec.rb
  • partition_spec.rb
  • plus_spec.rb
  • prepend_spec.rb
  • replace_spec.rb
  • reverse_spec.rb
  • rindex_spec.rb
  • rjust_spec.rb
  • rpartition_spec.rb
  • rstrip_spec.rb
  • scan_spec.rb
  • scrub_spec.rb
  • setbyte_spec.rb
  • size_spec.rb
  • slice_spec.rb
  • split_spec.rb
  • squeeze_spec.rb
  • start_with_spec.rb
  • string_spec.rb
  • strip_spec.rb
  • sub_spec.rb
  • succ_spec.rb
  • sum_spec.rb
  • swapcase_spec.rb
  • to_c_spec.rb
  • to_f_spec.rb
  • to_i_spec.rb
  • to_r_spec.rb
  • to_s_spec.rb
  • to_str_spec.rb
  • to_sym_spec.rb
  • tr_spec.rb
  • tr_s_spec.rb
  • try_convert_spec.rb
  • uminus_spec.rb
  • undump_spec.rb
  • unicode_normalized_spec.rb
  • unicode_normalize_spec.rb
  • unpack1_spec.rb
  • unpack_spec.rb
  • upcase_spec.rb
  • uplus_spec.rb
  • upto_spec.rb
  • valid_encoding_spec.rb

Note: there is another issue for String#unpack here: #667

Issue with processing default arguments with an argument splat

After playing around a bit with implementing some more Enumerable methods I think I found a natalie bug with the handling of default arguments. The following code prints 2 in Ruby and 1 in Natalie:

def foo(a = 1, *args)
  puts a
end
foo(2)

I tried to understand where to look for this issue but I did not really find the position where this issue occurs. Maybe a small hint could help me finding the issue here :^)
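
For comparison, this is the binding order Ruby applies: optional parameters take the leading arguments first, and only what is left over goes to the splat. The helper bind_optional_then_splat below is hypothetical and exists only to spell out that rule; it is not taken from Natalie's compiler.

```ruby
def foo(a = 1, *args)
  [a, args]
end

p foo          # => [1, []]
p foo(2)       # => [2, []]
p foo(2, 3, 4) # => [2, [3, 4]]

# Hypothetical helper: fill optional parameters from the given arguments,
# fall back to defaults where arguments run out, then pass the rest to the splat.
def bind_optional_then_splat(given, defaults)
  optionals = defaults.each_with_index.map do |default, i|
    i < given.size ? given[i] : default
  end
  rest = given.drop(defaults.size)
  [optionals, rest]
end

p bind_optional_then_splat([2], [1]) # => [[2], []]
```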

Implement methods of Hash

Our main approach is to work on one spec from ruby/spec at a time, implementing as much of the spec as possible, then check it off the list:

  • allocate_spec.rb
  • any_spec.rb
  • assoc_spec.rb
  • clear_spec.rb
  • clone_spec.rb
  • compact_spec.rb
  • compare_by_identity_spec.rb
  • constructor_spec.rb
  • deconstruct_keys_spec.rb
  • default_proc_spec.rb
  • default_spec.rb
  • delete_if_spec.rb
  • delete_spec.rb
  • dig_spec.rb
  • each_key_spec.rb
  • each_pair_spec.rb
  • each_spec.rb
  • each_value_spec.rb
  • element_reference_spec.rb
  • element_set_spec.rb
  • empty_spec.rb
  • eql_spec.rb
  • equal_value_spec.rb
  • except_spec.rb
  • fetch_spec.rb
  • fetch_values_spec.rb
  • filter_spec.rb
  • flatten_spec.rb
  • gte_spec.rb
  • gt_spec.rb
  • hash_spec.rb
  • has_key_spec.rb
  • has_value_spec.rb
  • include_spec.rb
  • index_spec.rb
  • initialize_spec.rb
  • inspect_spec.rb
  • invert_spec.rb
  • keep_if_spec.rb
  • key_spec.rb
  • keys_spec.rb
  • length_spec.rb
  • lte_spec.rb
  • lt_spec.rb
  • member_spec.rb
  • merge_spec.rb
  • new_spec.rb
  • rassoc_spec.rb
  • rehash_spec.rb
  • reject_spec.rb
  • replace_spec.rb
  • ruby2_keywords_hash_spec.rb
  • select_spec.rb
  • shift_spec.rb
  • size_spec.rb
  • slice_spec.rb
  • sort_spec.rb
  • store_spec.rb
  • to_a_spec.rb
  • to_hash_spec.rb
  • to_h_spec.rb
  • to_proc_spec.rb
  • to_s_spec.rb
  • transform_keys_spec.rb
  • transform_values_spec.rb
  • try_convert_spec.rb
  • update_spec.rb
  • values_at_spec.rb
  • value_spec.rb
  • values_spec.rb

Hash cannot be used after calling `Hash#each` and returning inside the block

The following program throws a RuntimeException:

a = { a: 2 }
a.first
a[:b] = 1

When using Hash#first, the Enumerable method first is called, which looks like this (shortened):

def first
  each do |arg|
    return arg
  end
end

If you try to insert a value to the hash, it will fail because m_is_iterating is still set to true. When returning in such a case we throw a LocalJumpError and rescue it. This means the macro block handler that hash uses (NAT_RUN_BLOCK_AND_POSSIBLY_BREAK_WHILE_ITERATING_HASH) does not know that it stopped iterating. I guess a simple fix would be to rescue the local jump error when running the block?

However, I'm not sure if "normal" block execution macros also need to handle this and if my proposed solution is the most elegant way of doing it...
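
One way to picture the fix is a guard that is always cleared when iteration ends, whichever way the block exits. The sketch below models this in plain Ruby with ensure; GuardedHash and its @iterating flag are stand-ins for the C++ hash value and m_is_iterating, so this is an illustration of the idea under those assumptions rather than a patch.

```ruby
# Hypothetical model of a hash that forbids adding keys while iterating.
class GuardedHash
  def initialize
    @data = {}
    @iterating = false
  end

  def []=(key, value)
    if @iterating && !@data.key?(key)
      raise "can't add a new key into hash during iteration"
    end
    @data[key] = value
  end

  def each
    @iterating = true
    @data.each { |pair| yield pair }
  ensure
    # Runs on normal completion, on break, on return from the block,
    # and on exceptions, so the flag can never be left set.
    @iterating = false
  end

  def first
    each { |pair| return pair }
    nil
  end
end

h = GuardedHash.new
h[:a] = 2
p h.first # => [:a, 2]
h[:b] = 1 # works, because the flag was reset when first returned early
```

This simple boolean would be reset too early by nested iteration over the same hash; a counter would handle that case, which is left out here to keep the sketch short.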

Issue with `super` and `alias`

When using super in an aliased method, natalie crashes.

Example:

module Base
  def chunk
    puts "called Base#chunk"
  end
end


class Child
  include Base

  def chunk
    puts "called Child#chunk"
    super
  end
  alias chunk_while chunk
end

p Child.new.chunk_while

Ruby output:

called Child#chunk
called Base#chunk
nil

Natalie output:

called Child#chunk
Traceback (most recent call last):
        1: from example.rb:18:in `<main>'
example.rb:12:in `chunk': super: no superclass method `chunk' for #<Child:0x55f253565120> (NoMethodError)

Instead of a module, Base can also be used as a superclass of Child.
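
The Natalie error mentions `chunk` at the point where super fails, so one thing worth checking is which name and owner are used for the super lookup. In MRI an aliased method still remembers the name it was defined under, which can be inspected as shown below. This only demonstrates the expected Ruby behaviour; where the lookup goes wrong in Natalie is not established here.

```ruby
module Base
  def chunk
    :base
  end
end

class Child
  include Base

  def chunk
    [:child, super]
  end
  alias chunk_while chunk
end

aliased = Child.instance_method(:chunk_while)
p aliased.name          # => :chunk_while
p aliased.original_name # => :chunk
p Child.new.chunk_while # => [:child, :base]
```

In other words, super inside the aliased method continues the lookup for the original name, starting after the class where the method body was defined.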

Break from begin/rescue doesn't compile

Probably related to #48, putting a break inside a begin/rescue doesn't work. In fact, it doesn't compile.

def foo
  while true
    begin
      raise 'foo'
    rescue
      break
    end
  end
  'done'
end

p foo

This is the error from the compiler:

/tmp/natalie.cpp20210713-425385-1hup9gz: In function ‘Natalie::ValuePtr begin_fn4(Natalie::Env*, Natalie::ValuePtr, size_t, Natalie::ValuePtr*, Natalie::Block*)’:
/tmp/natalie.cpp20210713-425385-1hup9gz:37:17: error: ‘while_result2’ was not declared in this scope
   37 |                 while_result2 = ValuePtr { NilValue::the() };
      |                 ^~~~~~~~~~~~~
At global scope:
cc1plus: note: unrecognized command-line option ‘-Wno-unknown-warning-option’ may have been intended to silence earlier diagnostics
Traceback (most recent call last):
	4: from bin/natalie:215:in `<main>'
	3: from bin/natalie:108:in `run'
	2: from bin/natalie:133:in `compile_and_run'
	1: from /home/tim/pp/natalie/lib/natalie/compiler.rb:49:in `compile'
/home/tim/pp/natalie/lib/natalie/compiler.rb:57:in `compile_c_to_binary': There was an error compiling. (Natalie::Compiler::CompileError)

Issue when using the REPL with Fibers

When working on Enumerator::Lazy I noticed that the REPL does not support Fibers. The following error occurs:

nat> [1,2,3].to_enum.each {}
ruby: /home/richard/repos/natalie/include/natalie/fiber_value.hpp:120: void Natalie::FiberValue::create_stack(Natalie::Env*, int): Assertion `comparison_ptr < Heap::the().start_of_stack()' failed.

I already debugged this and found the issue. The start of the stack (set by Heap::the().set_start_of_stack()) will not be set because it is being set in the main function (src/main.cpp) which is not invoked when using the REPL.

I'm not sure about how we can fix this because I'm a bit confused about the whole start_of_stack-thingy. I tried using a new extern function setup_heap and to call it with a Fiddle::Function from the REPL. The problem is that I do not know which address I should pass to this function. Is this the correct way or are we able to use a fixed address for the start of the stack instead of the arguments passed to the program?

extern "C" void setup_heap(void *address) {
    Heap::the().set_start_of_stack(address);
}

Checking equality on recursive arrays needs some more work

  describe '==' do
    it 'does not return true always for recursive arrays' do
      a = []
      a << a
      a.should_not == [[2]]
    end
  end

  describe 'eql?' do
    it 'does not return true always for recursive arrays' do
      a = []
      a << a
      a.should_not eql [[2]]
    end
  end

These two tests fail currently.
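
A common way to handle this is to track which pairs of arrays are currently being compared and to treat a pair that is seen again as equal, so the comparison terminates and the remaining elements decide the result. Below is a minimal Ruby sketch of that idea; the method name arrays_equal? and the in_progress hash are made up for illustration and are not Natalie's code.

```ruby
# Hypothetical helper: structural equality that tolerates recursive arrays.
def arrays_equal?(a, b, in_progress = {})
  # Fall back to ordinary equality as soon as either side is not an array.
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  return true if a.equal?(b)
  return false unless a.size == b.size

  key = [a.object_id, b.object_id]
  # Already comparing this exact pair further up the call stack:
  # assume equal and let the other elements decide.
  return true if in_progress[key]

  in_progress[key] = true
  begin
    a.each_index.all? { |i| arrays_equal?(a[i], b[i], in_progress) }
  ensure
    in_progress.delete(key)
  end
end

a = []
a << a
p arrays_equal?(a, [[2]]) # => false

b = []
b << b
p arrays_equal?(a, b) # => true
```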

break statement is broken in a loop inside a method

I haven't investigated further, but it appears that break is messed up when used inside a loop inside a method. Here is some sample code to demonstrate:

def broken_break
  x = 0
  while true
    x += 1
    break if x > 2
  end
  'this should be returned'
end

p broken_break # => 'this should be returned'

The code above should return 'this should be returned', but it instead returns nil. 😭

Integrate AddressSanitizer

We currently have CI running a few files through Valgrind, but it would be better to integrate AddressSanitizer (supported by both gcc and clang) so we find bugs faster. Also, not all the code is running through Valgrind, so I'm sure there are bugs we are not catching.

Bug compiling &&/and causes double-eval of left side

# wat.rb
def x
  puts 'should only print once'
  false
end

if x && false
  puts 'true'
else
  puts 'false'
end

→ ruby wat.rb
should only print once
false

→ bin/natalie wat.rb
should only print once
should only print once
false
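
The expected semantics can be written out by hand: the left side is evaluated once into a temporary, and the right side is only evaluated if that temporary is truthy. The sketch below shows that expansion in Ruby with a call counter; the names calls, left and tmp are illustrative, and the generated C++ will of course look different.

```ruby
calls = 0
left = lambda do
  calls += 1
  false
end

# `left.call && false` expanded by hand:
tmp = left.call             # the left side runs exactly once
result = tmp ? false : tmp  # the right side is only reached when tmp is truthy

p result # => false
p calls  # => 1
```

A double evaluation typically appears when the expansion re-emits the left expression in the else branch instead of reusing the temporary.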

Bug with begin/rescue at top-level of file

If you create a file with the following code:

begin
  raise 'wat'
rescue
end

...and run it with natalie, you get a compiler error:

→ bin/natalie top_level_rescue_bug.rb
In file included from /home/tim/pp/natalie/build/include/natalie/void_p_value.hpp:9,
                 from /home/tim/pp/natalie/build/include/natalie.hpp:50,
                 from /tmp/natalie.cpp20210704-253398-1syoq0s:1:
/tmp/natalie.cpp20210704-253398-1syoq0s: In function ‘Natalie::Value* EVAL(Natalie::Env*)’:
/tmp/natalie.cpp20210704-253398-1syoq0s:242:71: error: ‘argc’ was not declared in this scope
  242 |         ValuePtr NATCALLBEGIN8 = NAT_CALL_BEGIN(env, self, begin_fn2, argc, args, block);
      |                                                                       ^~~~
/home/tim/pp/natalie/build/include/natalie/macros.hpp:80:52: note: in definition of macro ‘NAT_CALL_BEGIN’
   80 |     Natalie::ValuePtr _result = begin_fn(&e, self, argc, args, block); \
      |                                                    ^~~~
/tmp/natalie.cpp20210704-253398-1syoq0s:242:77: error: ‘args’ was not declared in this scope
  242 |         ValuePtr NATCALLBEGIN8 = NAT_CALL_BEGIN(env, self, begin_fn2, argc, args, block);
      |                                                                             ^~~~
/home/tim/pp/natalie/build/include/natalie/macros.hpp:80:58: note: in definition of macro ‘NAT_CALL_BEGIN’
   80 |     Natalie::ValuePtr _result = begin_fn(&e, self, argc, args, block); \
      |                                                          ^~~~
/tmp/natalie.cpp20210704-253398-1syoq0s:242:83: error: ‘block’ was not declared in this scope; did you mean ‘mlock’?
  242 |         ValuePtr NATCALLBEGIN8 = NAT_CALL_BEGIN(env, self, begin_fn2, argc, args, block);
      |                                                                                   ^~~~~
/home/tim/pp/natalie/build/include/natalie/macros.hpp:80:64: note: in definition of macro ‘NAT_CALL_BEGIN’
   80 |     Natalie::ValuePtr _result = begin_fn(&e, self, argc, args, block); \
      |                                                                ^~~~~
/home/tim/pp/natalie/build/include/natalie/macros.hpp:82:16: error: cannot convert ‘Natalie::ValuePtr’ to ‘Natalie::Value*’ in return
   82 |         return _result;                                                \
      |                ^~~~~~~
/tmp/natalie.cpp20210704-253398-1syoq0s:242:34: note: in expansion of macro ‘NAT_CALL_BEGIN’
  242 |         ValuePtr NATCALLBEGIN8 = NAT_CALL_BEGIN(env, self, begin_fn2, argc, args, block);
      |                                  ^~~~~~~~~~~~~~
At global scope:
cc1plus: note: unrecognized command-line option ‘-Wno-unknown-warning-option’ may have been intended to silence earlier diagnostics
Traceback (most recent call last):
	4: from bin/natalie:215:in `<main>'
	3: from bin/natalie:108:in `run'
	2: from bin/natalie:133:in `compile_and_run'
	1: from /home/tim/pp/natalie/lib/natalie/compiler.rb:49:in `compile'
/home/tim/pp/natalie/lib/natalie/compiler.rb:57:in `compile_c_to_binary': There was an error compiling. (Natalie::Compiler::CompileError)

This appears to be due to an assumption about the presence of argc and args, which is not true at the top level.

We either need to make those variables available at the top level or change the compiler assumption.

Implementing methods of array

Not all the methods specified at https://ruby-doc.org/core-2.7.3/Array.html are implemented in Natalie at the moment. Some of the methods are partially implemented but do not match the spec. This is a list of specs we need to copy over and fully implement.

Spec files can be found at: https://github.com/ruby/spec/tree/master/core/array

  • allocate_spec.rb
  • any_spec.rb
  • append_spec.rb
  • array_spec.rb
  • assoc_spec.rb
  • at_spec.rb
  • bsearch_index_spec.rb
  • bsearch_spec.rb
  • clear_spec.rb
  • clone_spec.rb
  • collect_spec.rb
  • combination_spec.rb
  • compact_spec.rb
  • comparison_spec.rb
  • concat_spec.rb
  • constructor_spec.rb
  • count_spec.rb
  • cycle_spec.rb
  • deconstruct_spec.rb
  • delete_at_spec.rb
  • delete_if_spec.rb
  • delete_spec.rb
  • difference_spec.rb
  • dig_spec.rb
  • drop_spec.rb
  • drop_while_spec.rb
  • dup_spec.rb
  • each_index_spec.rb
  • each_spec.rb
  • element_reference_spec.rb
  • element_set_spec.rb
  • empty_spec.rb
  • eql_spec.rb
  • equal_value_spec.rb
  • fetch_spec.rb
  • fill_spec.rb
  • filter_spec.rb
  • find_index_spec.rb
  • first_spec.rb
  • flatten_spec.rb
  • frozen_spec.rb
  • hash_spec.rb
  • include_spec.rb
  • index_spec.rb
  • initialize_spec.rb
  • insert_spec.rb
  • inspect_spec.rb
  • intersection_spec.rb
  • join_spec.rb
  • keep_if_spec.rb
  • last_spec.rb
  • length_spec.rb
  • map_spec.rb
  • max_spec.rb
  • minmax_spec.rb
  • min_spec.rb
  • minus_spec.rb
  • multiply_spec.rb
  • new_spec.rb
  • partition_spec.rb
  • permutation_spec.rb
  • plus_spec.rb
  • pop_spec.rb
  • prepend_spec.rb
  • product_spec.rb
  • push_spec.rb
  • rassoc_spec.rb
  • reject_spec.rb
  • repeated_combination_spec.rb
  • repeated_permutation_spec.rb
  • replace_spec.rb
  • reverse_each_spec.rb
  • reverse_spec.rb
  • rindex_spec.rb
  • rotate_spec.rb
  • sample_spec.rb
  • select_spec.rb
  • shift_spec.rb
  • shuffle_spec.rb
  • size_spec.rb
  • slice_spec.rb
  • sort_by_spec.rb
  • sort_spec.rb
  • sum_spec.rb
  • take_spec.rb
  • take_while_spec.rb
  • to_ary_spec.rb
  • to_a_spec.rb
  • to_h_spec.rb
  • to_s_spec.rb
  • transpose_spec.rb
  • try_convert_spec.rb
  • union_spec.rb
  • uniq_spec.rb
  • unshift_spec.rb
  • values_at_spec.rb
  • zip_spec.rb

Hide data members in classes

When converting from C to C++, I did not hide all of the data members in classes. For example, value.hpp exposes klass directly rather than providing a klass() getter method.

  • include/natalie/array_value.hpp
  • include/natalie/block.hpp
  • include/natalie/class_value.hpp
  • include/natalie/encoding_value.hpp
  • include/natalie/env.hpp
  • include/natalie/exception_value.hpp
  • include/natalie/false_value.hpp
  • include/natalie/float_value.hpp
  • include/natalie/global_env.hpp
  • include/natalie/hash_value.hpp
  • include/natalie/integer_value.hpp
  • include/natalie/io_value.hpp
  • include/natalie/match_data_value.hpp
  • include/natalie/method.hpp
  • include/natalie/module_value.hpp
  • include/natalie/nil_value.hpp
  • include/natalie/proc_value.hpp
  • include/natalie/range_value.hpp
  • include/natalie/regexp_value.hpp
  • include/natalie/string_value.hpp
  • include/natalie/symbol_value.hpp
  • include/natalie/true_value.hpp
  • include/natalie/value.hpp
  • include/natalie/void_p_value.hpp

Issue with default and keyword arguments

If a method has a default argument and keyword arguments, the default argument is filled with the keyword arguments, which is contrary to the behavior of MRI.

def foo(a = 1, **)
  puts a 
end

# Natalie:
foo(bar: :foo)
# => { :bar => :foo }

# MRI:
foo(bar: :foo)
# => 1

I'm not sure if we have to fix this in arg_value_by_path or beforehand in the compiler.

Send respond_to to the right receiver from the core API

Found: #105

The test "compares with an equivalent Array-like object using #to_ary" highlighted an issue with how we use respond_to? in Natalie's standard library. We call the method on the C++ implementation directly rather than sending the message "respond_to? :method_name", so mocks (and anything else that overrides respond_to?) cannot change the result. :) We should replace all the direct respond_to calls with a send of the Ruby method, and maybe create a utility that wraps send(env, Symbol::intern("respond_to?")...) so the ergonomics won't change much for us but Natalie will behave correctly.

We can rename the existing respond_to to respond_to_method (there is precedent for this naming suffix, e.g. ClassValue::new_method, ModuleValue::private_method, etc.) so it is clear it's what the bindings point to.

Then we can make a new convenience method called respond_to that does send(env, Symbol::intern("respond_to?")...).
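To illustrate the difference in plain Ruby (a sketch, not Natalie code): ArrayLike below is a hypothetical stand-in for a spec mock that overrides respond_to?. Asking via a normal message send honours the override, while looking at the method table directly, which is roughly what a direct C++-level check amounts to, does not.

```ruby
# Hypothetical mock: it claims to respond to :to_ary by overriding
# respond_to? instead of defining the method.
class ArrayLike
  def respond_to?(name, include_all = false)
    name == :to_ary || super
  end
end

obj = ArrayLike.new

# Dynamic dispatch: the override is consulted.
sent = obj.respond_to?(:to_ary)

# Direct method-table check: the override is bypassed.
direct = obj.class.method_defined?(:to_ary)

p sent   # true
p direct # false
```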

Implement redo statement

Honestly, I didn't even know this was a thing in Ruby. 😄

a = []
loop do
  a << 1
  redo if a.size < 2 # restart the loop
  a << 2
  break if a.size == 3
end
a.should == [1, 1, 2]

Build with CMake

I'd like to use CMake rather than our bespoke Makefile.

Implement respond_to_missing and method_missing

These two methods let a subclass of Object respond to methods dynamically, even when no such method is defined and respond_to would otherwise return false. It is a useful mechanism for dynamic delegation and is used throughout the core library, including Array, which needs it for flatten and flatten! among other methods (#75).

An example of these two methods can be found here.
Links:
respond_to_missing (part of Kernel):

method_missing (part of Object):
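
As a rough illustration of how the two hooks work together in MRI, here is a minimal delegation sketch. Forwarder is a hypothetical class invented for this example; it is not part of Natalie or the Ruby core library.

```ruby
# Hypothetical wrapper that forwards unknown messages to a wrapped object.
class Forwarder
  def initialize(target)
    @target = target
  end

  # Called when no method named `name` is found on Forwarder.
  def method_missing(name, *args, &block)
    if @target.respond_to?(name)
      @target.public_send(name, *args, &block)
    else
      super
    end
  end

  # Consulted by respond_to? when the method is not defined directly.
  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

f = Forwarder.new([3, 1, 2])
p f.respond_to?(:sort) # true
p f.sort               # [1, 2, 3]
p f.respond_to?(:nope) # false
```

Methods that Object already defines (to_s, inspect, and so on) are not forwarded by this sketch, because method_missing is only reached when lookup fails.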

Lambdas are not asserting argument count

Lambdas should normally raise ArgumentErrors if passed a wrong number of arguments.

MRI:

2.7.3 :001 > ->(a) { a }.call
Traceback (most recent call last):
        5: from /usr/share/rvm/rubies/ruby-2.7.3/bin/irb:23:in `<main>'
        4: from /usr/share/rvm/rubies/ruby-2.7.3/bin/irb:23:in `load'
        3: from /usr/share/rvm/rubies/ruby-2.7.3/lib/ruby/gems/2.7.0/gems/irb-1.2.6/exe/irb:11:in `<top (required)>'
        2: from (irb):1
        1: from (irb):1:in `block in irb_binding'
ArgumentError (wrong number of arguments (given 0, expected 1))

Natalie:

nat> ->(a) { a }.call
nil

I tried looking at how we do this with "normal" method calls and I think it is done by pass1 of the compiler (see prepare_argc_assertion). Lambdas seem to be processed in the same step (line 393 of pass1.rb). I guess we can just add the argc assertion there somehow, right?
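
For reference, this is the MRI behaviour a fix would need to match, written as a small script rather than an irb session. The variable names are just for illustration. Note that only lambdas are strict about argument count; plain procs fill missing arguments with nil.

```ruby
strict = ->(a) { a }      # lambda: argument count is enforced
loose  = proc { |a| a }   # proc: missing arguments become nil

lambda_error =
  begin
    strict.call
    nil
  rescue ArgumentError => e
    e.message
  end

p lambda_error    # "wrong number of arguments (given 0, expected 1)"
p loose.call      # nil
p strict.arity    # 1
p strict.lambda?  # true
p loose.lambda?   # false
```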

Use const more consistently

As a C/C++ noob, and as a lazy human being, I have not been great about using const everywhere I should have. This issue will track progress toward a full "const-aware" Natalie.

I will be adding clang-tidy configuration and tooling to help us find missing const qualifiers and to add them.

Speed up method calls with a vector of function pointers

Consider the following Ruby code:

class Foo
  def foo
    'foo'
  end

  def go
    10_000.times { foo }
  end
end

Foo.new.go

The "foo" method is looked up in the methods hashmap 10,000 times, even though the compiler could already have all the information it needed to compile that code into direct function calls.

When profiling code compiled with Natalie, method hashmap lookup accounts for a considerable share of the time spent, sometimes 30% or more. For methods known at compile time, this could be much faster.

Basics:

  1. The compiler keeps track of methods defined on classes.
  2. In addition to being added to the methods hashmap, add the method pointer to a Vector.
  3. The symbol "foo" can be turned into an index into the Vector, 0, 1, 2, ...

Of course, this would only work for method names known at compile time. For other cases, we'd still fall back to a hash lookup (the current technique).

It may not be this easy, but I think it's worth looking into!
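
The idea can be modelled in a few lines of Ruby. This is a toy sketch only: MethodTable, slot_for, call_slot, and call_name are hypothetical names invented here, and the real change would live in the compiler and the C++ runtime.

```ruby
# Toy model of a method table with both lookup strategies.
class MethodTable
  def initialize
    @by_name = {} # current technique: name -> callable
    @by_slot = [] # proposed addition: slot index -> callable
    @slots = {}   # "compile-time" map: name -> slot index
  end

  def define(name, callable)
    @by_name[name] = callable
    @slots[name] ||= @by_slot.size # redefinition keeps the same slot
    @by_slot[@slots[name]] = callable
  end

  # "Compile time": resolve a known method name to its slot once.
  def slot_for(name)
    @slots[name]
  end

  # "Run time", fast path: plain array indexing.
  def call_slot(slot, *args)
    @by_slot[slot].call(*args)
  end

  # "Run time", fallback: hash lookup for names not known in advance.
  def call_name(name, *args)
    @by_name.fetch(name).call(*args)
  end
end

table = MethodTable.new
table.define(:foo, -> { 'foo' })

foo_slot = table.slot_for(:foo) # resolved once, outside the loop
results = Array.new(3) { table.call_slot(foo_slot) }

p results               # ["foo", "foo", "foo"]
p table.call_name(:foo) # "foo"
```

Because define writes to the same slot on redefinition, a method replaced at runtime is still picked up by callers that resolved the slot earlier. Whether that property is easy to keep in the real runtime is an open question.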

Implement Array#pack

  1. Remove "skip-test" comment from top of a spec/core/array/pack/*_spec.rb file.
  2. Modify ArrayPacker and/or StringPacker to make the specs pass
  3. Commit! (EASY! 😆)
  • a_spec.rb
  • at_spec.rb
  • b_spec.rb
  • buffer_spec.rb
  • comment_spec.rb
  • c_spec.rb
  • d_spec.rb
  • empty_spec.rb
  • e_spec.rb
  • f_spec.rb
  • g_spec.rb
  • h_spec.rb
  • i_spec.rb
  • j_spec.rb
  • l_spec.rb
  • m_spec.rb
  • n_spec.rb
  • percent_spec.rb
  • p_spec.rb
  • q_spec.rb
  • s_spec.rb
  • u_spec.rb
  • v_spec.rb
  • w_spec.rb
  • x_spec.rb
  • z_spec.rb
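
For anyone new to Array#pack, here are a few directives as they behave in MRI. Each spec file above covers one directive letter. The variable names are only for illustration.

```ruby
# 'C': unsigned 8-bit integers; '*' consumes all remaining elements.
chars = [65, 66, 67].pack('C*')

# 'n': 16-bit unsigned integer, big-endian (network byte order).
network = [1].pack('n').bytes

# 'a': arbitrary string, padded with null bytes up to the given width.
null_padded = ['abc'].pack('a5').bytes

# 'A': like 'a', but padded with spaces.
space_padded = ['abc'].pack('A5')

# 'U': Unicode code points encoded as UTF-8.
utf8 = [97, 8364].pack('U*')

p chars        # "ABC"
p network      # [0, 1]
p null_padded  # [97, 98, 99, 0, 0]
p space_padded # "abc  "
p utf8         # "a€"
```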

(more) weird `break' behaviours

break should (probably) propagate out of the block.

This:

def x(x)
  x.each do |y|
    p y
    yield y
    p y
  end
end


x([1, 2, 3]) {
  break
}

should only print 1.

Return from begin/rescue doesn't work

This Ruby code runs forever in Natalie:

def foo
  while true
    begin
      raise 'foo'
    rescue
      return 'error'
    end
  end
end

p foo

(In MRI, it prints "error".)

Implement Enumerator::Lazy

Enumerator::Lazy (the class returned by Enumerable#lazy) allows lazy chaining of some Enumerable methods.

The following methods have to be implemented:

  • chunk
  • chunk_while
  • collect
  • collect_concat
  • drop
  • drop_while
  • eager
  • enum_for
  • filter
  • filter_map
  • find_all
  • flat_map
  • force
  • grep
  • grep_v
  • lazy
  • map
  • reject
  • select
  • slice_after
  • slice_before
  • slice_when
  • take
  • take_while
  • to_a
  • to_enum
  • uniq
  • with_index
  • zip

Implement the Enumerable module

I started work in 2508f40 on the Enumerable module. It's a biggie! Anyone is welcome to jump in and lend a hand!

Here are the specs that need to be enabled and passing:

  • all_spec.rb
  • any_spec.rb
  • chain_spec.rb
  • chunk_spec.rb
  • chunk_while_spec.rb
  • collect_concat_spec.rb
  • collect_spec.rb
  • count_spec.rb
  • cycle_spec.rb
  • detect_spec.rb
  • drop_spec.rb
  • drop_while_spec.rb
  • each_cons_spec.rb
  • each_entry_spec.rb
  • each_slice_spec.rb
  • each_with_index_spec.rb
  • each_with_object_spec.rb
  • entries_spec.rb
  • filter_map_spec.rb
  • filter_spec.rb
  • find_all_spec.rb
  • find_index_spec.rb
  • find_spec.rb
  • first_spec.rb
  • flat_map_spec.rb
  • grep_spec.rb
  • grep_v_spec.rb
  • group_by_spec.rb
  • include_spec.rb
  • inject_spec.rb
  • lazy_spec.rb
  • map_spec.rb
  • max_by_spec.rb
  • max_spec.rb
  • member_spec.rb
  • min_by_spec.rb
  • minmax_by_spec.rb
  • minmax_spec.rb
  • min_spec.rb
  • none_spec.rb
  • one_spec.rb
  • partition_spec.rb
  • reduce_spec.rb
  • reject_spec.rb
  • reverse_each_spec.rb
  • select_spec.rb
  • slice_after_spec.rb
  • slice_before_spec.rb
  • slice_when_spec.rb
  • sort_by_spec.rb
  • sort_spec.rb
  • sum_spec.rb
  • take_spec.rb
  • take_while_spec.rb
  • tally_spec.rb
  • to_a_spec.rb
  • to_h_spec.rb
  • uniq_spec.rb
  • zip_spec.rb

REPL bug: cannot have variable and method of same name

→ bin/natalie
nat> def foo
nat> end
:foo
nat> foo = foo
Trying to get variable `foo' at index 1 which is not set.
zsh: abort (core dumped)  bin/natalie

This only happens in the REPL.

In a regular .rb file, foo is set to nil, as it is in MRI.

`Array#hash` producing too many collisions

I'm currently tinkering around with Hash#hash and noticed this bug in Array#hash:

[2, 2].hash == [1, 1].hash
# => true

[2, 2].hash returns 100019, which is the starting point of the hash calculation, because the two (2.hash * 207269) terms cancel each other out when XORed together. My first thought for fixing this is to multiply by a random prime number instead of a fixed one. Would this work?
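
The cancellation can be reproduced with a small model. This is a sketch under assumptions: xor_hash is a reconstruction of the scheme from the description above (not the actual Natalie source), the integers themselves stand in for each element's hash, and mixing_hash is just one common order-dependent alternative, not a proposal for the final implementation.

```ruby
MASK = (1 << 64) - 1 # keep results within 64 bits

# Assumed current scheme: XOR each scaled element hash into the seed.
def xor_hash(values)
  values.reduce(100019) { |h, v| h ^ (v * 207269) }
end

# Alternative: scale the accumulator before mixing in each element,
# so the result depends on both position and value.
def mixing_hash(values)
  values.reduce(100019) { |h, v| ((h * 207269) ^ v) & MASK }
end

p xor_hash([2, 2])                           # 100019
p xor_hash([2, 2]) == xor_hash([1, 1])       # true
p xor_hash([1, 2]) == xor_hash([2, 1])       # true
p mixing_hash([2, 2]) == mixing_hash([1, 1]) # false
p mixing_hash([1, 2]) == mixing_hash([2, 1]) # false
```

As far as I can tell, swapping in a different or random prime would not help by itself: any value XORed with itself is zero, so two equal elements cancel regardless of the multiplier. The XOR-only scheme is also order-independent, which is why [1, 2] and [2, 1] collide in the model.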
