Ruby2CExtension Limitations

Ruby2CExtension has some limitations. Some things do not work “by design”; other things don’t work because they are not (yet) implemented.

Generally, Ruby2CExtension tries to match Ruby’s semantics as closely as possible. If a Ruby file compiles without errors and has none of the issues described below, then the compiled C extension should just work and behave exactly as the Ruby code would.

Sections: Warnings and Exceptions, C Extension Limitations, Arity, Continuations, Scope, Vmode, Cref, Block Pass, Block Argument Semantics, super with implicit arguments and optional arguments, Not Supported Features.

Warnings and Exceptions

Not all warnings that Ruby code might emit are reproduced in the compiled C extension.

Ruby2CExtension also omits some checks that might raise exceptions for Ruby code (it won’t warn or raise exceptions on wrong argument count in lambdas for example).

As a rule of thumb: if Ruby code works without warnings, then the compiled C extension will also work without warnings, but if the Ruby code emits warnings, then the compiled C extension might emit fewer warnings.

Also, sometimes the compiled C extension will raise slightly different exceptions than the Ruby code would.
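If code relies on an exception for a wrong argument count in a lambda, one option is to check the count explicitly instead of depending on Ruby’s built-in check. The following is only a sketch; checked_add is a hypothetical name and the error message is illustrative:

```ruby
# Hypothetical lambda that validates its own argument count, so the check
# does not depend on Ruby's built-in lambda argument checking.
checked_add = lambda { |*args|
  unless args.size == 2
    raise ArgumentError, "wrong number of arguments (#{args.size} for 2)"
  end
  a, b = args
  a + b
}

sum = checked_add.call(1, 2)
```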

C Extension Limitations

Because the compiled code is a C extension, it will also behave like a C extension:

Arity

All methods and procs defined in Ruby code that is compiled to a C extension will return -1 as arity. So if your Ruby code depends on the method arity, you might have to change it.
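As an illustration of code that depends on arity, the sketch below shows a check that still works when arity is -1. The helper accepts? is hypothetical and not part of Ruby2CExtension; it simply treats a negative arity as “variable number of arguments”:

```ruby
def pair(a, b)
  [a, b]
end

# In interpreted Ruby this is 2; per the text above, compiled code reports -1.
reported = method(:pair).arity

# Hypothetical helper: a non-negative arity must match exactly, a negative
# arity -n-1 means "at least n arguments", so -1 accepts any count.
def accepts?(callable, count)
  arity = callable.arity
  arity >= 0 ? arity == count : count >= -arity - 1
end

exact    = accepts?(method(:pair), 2)
variadic = accepts?(proc { |*args| args }, 5)
```

With a check like this, code that only wants to know whether a call is possible keeps working, although it can no longer reject a wrong argument count for compiled methods.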

Continuations

Continuations (callcc) are possible in compiled Ruby code, but you might see wrong behavior with local variables (i.e. they will have an old value), depending on whether these local variables are needed for a closure or not.

In short, you probably should not use continuations in Ruby code that you want to compile to a C extension.

Scope

In Ruby every scope (class scope, methods, ...) has an associated C struct SCOPE, which stores the local variables (including $~ and $_) for that scope. Methods that are implemented in a C extension don’t get such a SCOPE, because it is expensive to set up and they don’t need it (they just use variables on the stack).

That means that methods in Ruby code that is compiled to a C extension won’t get a SCOPE either, which is usually no problem because local variables are handled differently anyway. It only becomes a problem if the code uses methods that work with the current SCOPE, such as eval and local_variables (both are used in the example below).

These methods will work, but they will see the SCOPE of the nearest Ruby code scope. This is also true for $_ and $~ (and the derived $&, $`, $', $1, ...). Example:

def bar
  a = 2
  "xxx" =~ /./ # changes $~ of the nearest Ruby code scope
  p eval("a") # won't return 2 (a's value), instead it will be evaled in the
              # nearest Ruby code scope
  local_variables # returns local variables of the nearest Ruby code scope
end

If bar is compiled into a C extension and then called from the following Ruby code:

a = b = 42
$~ = nil
p bar
p $~

Then it will output

42
["a", "b"]
#<MatchData: ...>

instead of the expected

2
["a"]
nil

Another consequence is that if compiled methods call each other, they will all share the same $_ and $~ (those of the nearest Ruby code scope). This might make some code behave unexpectedly.
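One way to reduce the dependence on $~ is to keep the MatchData in a local variable instead of reading $~, $1 and friends. This is a sketch of that idea, not something Ruby2CExtension prescribes; first_number is a hypothetical helper, and the call to Regexp#match may still set $~ as a side effect:

```ruby
# Hypothetical helper: extracts the first number from a string without
# reading $~ or $1 afterwards.
def first_number(text)
  match = /(\d+)/.match(text) # MatchData is kept in a local variable
  match && match[1].to_i
end

found   = first_number("abc 42 def")
missing = first_number("no digits here")
```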

Vmode

The so-called vmode is also associated with the Ruby SCOPE (though not directly stored inside the SCOPE struct). The vmode is set by the methods private, protected, public and module_function and determines which visibility newly defined methods get.

Ruby2CExtension cannot (easily) access Ruby’s internal vmode; instead, it tries to figure out at compile time what visibility a method defined using def will get. But be careful with Module#define_method and Module#attr*: those methods will use Ruby’s internal vmode.

So, there are actually two different vmodes at work for compiled Ruby code: the vmode that Ruby2CExtension uses at compile time to guess which visibility methods defined with def get and Ruby’s internal vmode at runtime which is used by Module#define_method and Module#attr*.

Ruby2CExtension’s compile time vmode heuristic works by replacing calls to the vmode changing methods with nil and instead changing the compile time vmode. The calls are only replaced if they have no receiver and no arguments. Whether a call is replaced also depends on where it is made:

In the toplevel scope, only calls to private and public are replaced; the default vmode in the toplevel scope is private.

In a module scope, calls to private, protected, public and module_function are replaced; the default vmode in a module scope is public.

In a class scope, calls to private, protected and public are replaced; the default vmode in a class scope is public.

In methods and blocks, no calls are replaced and all methods defined with def will be public.

If your code doesn’t do anything tricky, then the compile time heuristic should just work as expected. Here is an example:

# start of file
# default vmode is private
def m1; end # private

public # this call will be replaced with nil
# vmode is now public
def m2; end # public

class A
  # default vmode is public
  def m2; end # public

  private # this call will be replaced with nil
  # vmode is now private
  def m3 # private
    def foo # public, because it is in a method
    end
  end

end

protected # this call is not replaced and will probably fail because
          # #protected is not defined at toplevel

# end of file

Here is an example where it fails:

class A
  def pub; end # public

  private if nil # Ruby2CExtension replaces the call to private with nil
  # vmode is now private

  def pub2; end # will be private in the compiled C extension
end

But this should be a pretty uncommon case. The visibility of methods defined using def should be correct for most Ruby code.

It is a bit more complicated for methods/attributes defined using define_method or the attr* methods. Their visibility is determined by Ruby’s internal vmode at run time.

As explained above, C extension code does not get its own SCOPE, so it also doesn’t get its own vmode. When a C extension is required, Ruby sets the vmode to public before loading the C extension. All the class/module scopes that are executed during the require are executed in the same SCOPE and so also with the same vmode.

Because all the vmode changing method calls are replaced with nil, Ruby’s internal vmode will probably stay public all the time and so all methods/attributes defined using define_method or the attr* methods will be public. Here is an example:

# Ruby's internal vmode is public
class A
  # still the same internal vmode, still public
  define_method(:a) {} # public

  private # is replaced with nil, does not affect Ruby's internal vmode
  def b; end # private

  # but Ruby's internal vmode is still public
  define_method(:c) {} # public
  attr_accessor :d, :e # also public
end

If those methods really need another visibility, then it can be changed explicitly:

class A
  define_method(:a) {}
  private :a
end

In methods, the internal vmode is that of the nearest Ruby SCOPE (as explained above), so it is usually unpredictable. In addition, calling one of the vmode changing methods will also affect the SCOPE of the caller:

# in compiled C extension
def my_private
  private
end
# in Ruby code
class A
  def a;end # public
  my_private
  def b;end # private, but would be public if the above code was not compiled
end

To be safe, don’t use the vmode changing methods without arguments inside methods; instead, set the visibility explicitly (e.g. private :a).
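The explicit style can be sketched as follows. The class and method names are hypothetical; the point is that every visibility change names its methods, so it does not depend on the current vmode:

```ruby
class Account
  def balance; 100; end          # stays public

  def audit; :audited; end
  private :audit                 # explicit, independent of the vmode

  define_method(:token) { "secret" }
  private :token

  attr_accessor :pin
  private :pin, :pin=
end

visibility = {
  :balance => Account.public_method_defined?(:balance),
  :audit   => Account.private_method_defined?(:audit),
  :token   => Account.private_method_defined?(:token),
  :pin     => Account.private_method_defined?(:pin)
}
```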

Cref

The so-called cref is another Ruby internal that Ruby2CExtension has to emulate. The cref is a linked list that describes the current lexical class/module nesting. Example:

class A
  class B
    # cref here is B -> A -> Object
  end
  # cref here is A -> Object
  module C
    # cref here is C -> A -> Object
  end
end
# cref here is Object

The current cref is used for constant and class variable lookup and for def, undef and alias.

Ruby2CExtension emulates cref with the same semantics as Ruby. There are only two problems.

First, when a method is defined in Ruby code, it saves the current cref for later use. From C it isn’t possible to store an extra value when defining a method, so Ruby2CExtension works around this problem by storing the cref in a global variable. But there is still a problem when a def is used/run multiple times, because the cref can differ each time.

So if a def that requires a cref is used multiple times, then the compiled C extension will raise an exception (this is the only case where code compiled with Ruby2CExtension raises an exception that Ruby wouldn’t raise). If a def doesn’t need a cref, then everything is fine and it can be used multiple times. Examples:

["a", "b"].each { |s|
  class << s
    def bar; self; end # is OK, doesn't need cref
  end
}

["a", "b"].each { |s|
  class << s
    def baz; Array.new(self); end # fails the 2nd time
  end
}

The second case fails because the constant lookup for Array needs a cref. If you really need this to work, then you can use ::Array instead of Array, because that won’t need a cref.

Again, in the usual case everything should be fine. If you really need to use a def that requires a cref multiple times, then you might be able to modify it so that it won’t need a cref.
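A cref-free variant of the failing example might look like the sketch below. The method body is changed slightly so that it can actually be called (Array.new with a size and a fill value); the only part taken from the text above is the use of ::Array:

```ruby
lists = ["a".dup, "b".dup].map { |s|
  class << s
    # ::Array is looked up from the top level, so no cref is needed
    def baz; ::Array.new(2, self); end
  end
  s.baz
}
# lists is [["a", "a"], ["b", "b"]]
```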

The second problem that arises from Ruby2CExtension emulating crefs is that methods that access Ruby’s internal cref will see the wrong cref and thus not behave as expected. Those methods are Module.nesting, Module.constants (but Module#constants works), autoload and autoload? (use Module#autoload and Module#autoload? instead).
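To make the difference concrete, this sketch shows what the two kinds of calls return in interpreted Ruby. Outer and Inner are hypothetical names. Module.nesting depends on the internal cref and is therefore one of the calls to avoid in compiled code, while Module#constants is called on an explicit receiver:

```ruby
module Outer
  module Inner
    # Depends on Ruby's internal cref; interpreted Ruby gives [Outer::Inner, Outer]
    NESTING = Module.nesting
  end
end

nesting   = Outer::Inner::NESTING
constants = Outer.constants # Module#constants with an explicit receiver
```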

Block Pass

There currently is no way to pass a Proc instance as the block parameter to a method call on the C side, i.e. there is no way to do the following from C:

def foo(proc)
  bar(&proc)
end

Ruby2CExtension works around this by compiling the above to something similar to this:

def foo(proc)
  tmp_proc = proc.to_proc
  bar { |*arg| tmp_proc.call(*arg) }
end

The downside of this workaround is that it doesn’t work (as expected) with methods like instance_eval, and it is problematic if a proc is passed deep into a recursive method, because that block will then be wrapped multiple times.

This issue might be fixed if a future Ruby version provides a way to cleanly pass a proc to a method call.
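The instance_eval downside can be reproduced in plain Ruby by writing the wrapper by hand. This is only an emulation of the workaround described above, with hypothetical names, run as a top-level script:

```ruby
class Box
  def initialize
    @value = 42
  end
end

reader = proc { @value } # created at top level, so its own self is not a Box
box = Box.new

# Passing the proc directly: instance_eval rebinds self to box.
direct = box.instance_eval(&reader)

# Emulated workaround: only the outer block is rebound; the inner proc
# still runs with its original self, where @value is not set.
tmp_proc = reader.to_proc
wrapped = box.instance_eval { |*arg| tmp_proc.call(*arg) }
```

Here direct is 42, while wrapped is nil because the wrapped proc never sees box as self.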

Block Argument Semantics

Block argument semantics are a bit tricky, but Ruby2CExtension should get them right for most cases. Here are two cases where the result is wrong:

def t1(*a); yield *a; end
p t1([1, 2]) { |a| a }

def bl_pa_tst(); p yield([1, 2]); end
pc = proc { |*a| a }
bl_pa_tst(&pc)
bl_pa_tst { |*a| a }

Ruby outputs:

[1, 2]
[[1, 2]]
[[1, 2]]

The compiled C extension outputs:

[[1, 2]]
[1, 2]
[1, 2]

But again, for most cases it should just work and maybe it will get better in future Ruby2CExtension versions.

super with implicit arguments and optional arguments

In a compiled C extension super with implicit arguments will not use optional arguments that have not been given by the original caller. Example:

class A
  def foo(*a)
    p a
  end
end

class B < A
  def foo(a = nil)
    super
  end
end

B.new.foo

In Ruby this code will output [nil], in the compiled form it will output [].
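A simple way to sidestep the difference is to pass the arguments to super explicitly, so the default value is forwarded regardless of how super with implicit arguments is handled. A sketch based on the example above, with the classes renamed to Base and Derived:

```ruby
class Base
  def foo(*a)
    a
  end
end

class Derived < Base
  def foo(a = nil)
    super(a) # explicit argument: the default value is always forwarded
  end
end

result = Derived.new.foo
# result is [nil]
```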

Not Supported Features

Ruby files that use one or more of the problematic things described above will usually compile just fine and then fail or behave incorrectly at runtime. In contrast, the following things will be caught at compile time. Most of them are just not supported yet and might be added later.

Control Flow

Most of the things that are not supported are related to control flow. Ruby implements most of its control flow using setjmp() and longjmp(). Most of this is in eval.c and there is no real API to access it, so Ruby2CExtension has to do some tricks to make control flow work. For most cases workarounds are implemented, but some of the harder cases are not (yet) supported.

The first thing that isn’t supported is return from inside a block. Also break with a value from inside a block is not supported (but break without a value is):

def foo
  bar(1, 2, 3) {
    next      # works
    next 42   # works
    redo      # works
    break     # works
    break 42  # does not work
    return    # does not work
    return 42 # does not work
  }
  return    # works
  return 42 # works
end

For while/until loops everything works.
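Code that returns a value from inside a block can often be restructured to use only the supported forms. In this sketch (first_even is a hypothetical method), the result is stored in a local variable, the block is left with a plain break, and the return happens outside the block:

```ruby
def first_even(numbers)
  found = nil
  numbers.each { |n|
    next unless n.even?
    found = n # remember the result instead of "return n" or "break n"
    break     # break without a value
  }
  found       # value is returned outside the block
end

hit  = first_even([1, 3, 4, 6])
miss = first_even([1, 3])
```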

Another problematic area in Ruby2CExtension 0.1.0 was control flow “through” a rescue or ensure clause. This is mostly implemented now; it still does not work only inside ensure clauses (it does work in the bodies they protect):

def foo
  while bar?
    begin
      next      # works
      next 42   # works
      redo      # works
      break     # works
      break 42  # works
      return    # works
      return 42 # works
    rescue
      next      # works
      next 42   # works
      redo      # works
      break     # works
      break 42  # works
      return    # works
      return 42 # works
    ensure
      next      # does not work
      next 42   # does not work
      redo      # does not work
      break     # does not work
      break 42  # does not work
      return    # does not work
      return 42 # does not work
    end
  end
end
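If a loop needs to be left based on something decided in an ensure clause, one possible restructuring is to only record the decision there and perform the break after the begin/end block. This is a sketch with hypothetical names, not a pattern taken from Ruby2CExtension itself:

```ruby
def process_until_stop(items)
  processed = []
  index = 0
  while index < items.size
    item = items[index]
    stop = false
    begin
      processed << item
    ensure
      stop = (item == :stop) # only record the decision, no break here
    end
    break if stop            # control flow happens outside the ensure clause
    index += 1
  end
  processed
end

handled = process_until_stop([1, :stop, 2])
# handled is [1, :stop]
```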

defined?

The defined? statement is also not fully supported. It works for most of the common cases like constants, global variables, instance variables, class variables and $~ (and derived). Ruby2CExtension fails at compile time if a case is not supported.
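For reference, the common cases named above look like this in Ruby source. The names are hypothetical; defined? returns a descriptive string for things that exist and nil otherwise:

```ruby
LIMIT = 10
$counter = 0

class Probe
  @@shared = 1

  def initialize
    @value = 1
  end

  # Returns the defined? result for each of the common cases.
  def report
    {
      :constant => defined?(LIMIT),
      :global   => defined?($counter),
      :ivar     => defined?(@value),
      :cvar     => defined?(@@shared),
      :missing  => defined?(@missing) # nil, because @missing was never set
    }
  end
end

report = Probe.new.report
```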

super with implicit arguments in Ruby 1.8.5 and later

For Ruby 1.8.5, super with implicit arguments is only supported in methods (not in blocks or rescue/ensure clauses). For Ruby 1.8.4, super with implicit arguments is supported everywhere.

To work around this problem, just specify the arguments explicitly:

def foo(a, b)
  3.times { super(a, b) }
end

instead of:

def foo(a, b)
  3.times { super }
end