Ruby2CExtension has some limitations. Some things do not work “by design”; other things don’t work because they are not (yet) implemented.
Generally Ruby2CExtension tries to match Ruby’s semantics as closely as possible, so if a Ruby file compiles and doesn’t have any of the issues described below, then the compiled C extension should just work and behave exactly as the Ruby code would.
Sections: Warnings and Exceptions, C Extension Limitations, Arity, Continuations, Scope, Vmode, Cref, Block Pass, Block Argument Semantics, super with implicit arguments and optional arguments, Not Supported Features.
Not all warnings that Ruby code might emit are reproduced in the compiled C extension.
Ruby2CExtension also omits some checks that might raise exceptions for Ruby code (it won’t warn or raise exceptions on wrong argument count in lambdas for example).
As a rule of thumb: if Ruby code works without warnings, then the compiled C extension will also work without warnings, but if the Ruby code emits warnings, then the compiled C extension might emit fewer warnings.
Also, sometimes the compiled C extension will raise slightly different exceptions than the Ruby code would.
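If code relies on a lambda rejecting a wrong argument count, one option is to make that check explicit instead of depending on a check that the compiled extension may omit. The following is only a minimal sketch; the lambda name `add` and the error message are illustrative and not part of Ruby2CExtension.

```ruby
# Illustrative lambda that validates its own argument count, so the check
# does not depend on Ruby's built-in argument count enforcement.
add = lambda { |*args|
  unless args.size == 2
    raise ArgumentError, "wrong number of arguments (#{args.size} for 2)"
  end
  args[0] + args[1]
}

sum = add.call(1, 2)
```

With this shape, calling `add.call(1)` raises an ArgumentError from the lambda body itself.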
Because the compiled code is a C extension, it will also behave like a C extension:
set_trace_func will only get C extension events ("c-call", "c-return", "raise"), instead of the usual Ruby code events.

All methods and procs defined in Ruby code that is compiled to a C extension
will return -1 as arity. So if your Ruby code depends on the method
arity, you might have to change it.
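As an illustration of what "depending on arity" can look like and how it might be avoided, the sketch below first shows the value interpreted Ruby reports and then a helper that takes the expected argument count from the caller. The names `pair` and `invoke` are hypothetical.

```ruby
def pair(a, b)
  [a, b]
end

# In interpreted Ruby, Method#arity reflects the declared parameters (2 here).
# According to the text above, the compiled version would report -1 instead.
interpreted_arity = method(:pair).arity

# Hypothetical helper: the caller states the expected argument count
# explicitly instead of having the helper inspect callable.arity.
def invoke(callable, values, expected_count)
  callable.call(*values.first(expected_count))
end

result = invoke(method(:pair), [1, 2, 3], 2)
```

Passing the count explicitly keeps the calling code independent of whether the target method was compiled or not.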
Continuations (callcc) are possible in compiled Ruby code, but you might see
wrong behavior with local variables (i.e. they will have an old value),
depending on whether these local variables are needed for a closure or not.
In short, you probably should not use continuations in Ruby code that you want to compile to a C extension.
In Ruby every scope (class scope, methods, ...) has an associated C struct
SCOPE, which stores the local variables (including $~ and $_) for that
scope. Methods that are implemented in a C extension don’t get such a SCOPE,
because it is expensive to set up and they don’t need it (they just use
variables on the stack).
That means that the methods in Ruby code that is compiled to a C extension won’t get a SCOPE either, which is usually no problem because local variables are handled differently anyway. It only becomes a problem if the code uses methods that work with the current SCOPE. Here are some of these methods:
local_variables
eval
binding
These methods will work, but they will see the SCOPE of the nearest Ruby code
scope. This is also true for $_ and $~ (and the derived $&, $`, $',
$1, ...). Example:
def bar
  a = 2
  "xxx" =~ /./    # changes $~ of the nearest Ruby code scope
  p eval("a")     # won't return 2 (a's value), instead it will be evaled in the
                  # nearest Ruby code scope
  local_variables # returns local variables of the nearest Ruby code scope
end
If bar is compiled into a C extension and then called from the following Ruby code:
a = b = 42
$~ = nil
p bar
p $~
Then it will output
42
["a", "b"]
#<MatchData: ...>
instead of the expected
2
["a"]
nil
Another consequence is that if compiled methods call each other, they will all
share the same $_ and $~ (those of the nearest Ruby code scope). This
might make some code behave unexpectedly.
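One way to reduce the exposure to a shared $~ is to keep match results in local variables. The sketch below assumes that a MatchData object held in a local variable is unaffected by what other compiled methods do to $~; the helper name `extract_number` is hypothetical.

```ruby
# Hypothetical helper: keeps the match result in a local variable
# instead of reading it back through $~ or $1.
def extract_number(text)
  match = /(\d+)/.match(text)
  match ? match[1].to_i : nil
end

number = extract_number("abc 42 def")
```

Regexp#match still sets $~ as a side effect, but the method above never reads it back.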
The so-called vmode is also associated with the Ruby SCOPE (though not
directly stored inside the SCOPE struct). The vmode is set by the methods
private, protected, public and module_function and determines which
visibility newly defined methods get.
Ruby2CExtension cannot (easily) access Ruby’s internal vmode. Instead, it
tries to figure out at compile time what visibility a method defined using
def will get. But be careful with Module#define_method and
Module#attr*: those methods will use Ruby’s internal vmode.
So, there are actually two different vmodes at work for compiled Ruby
code: the vmode that Ruby2CExtension uses at compile time to guess which
visibility methods defined with def get and Ruby’s internal vmode at
runtime which is used by Module#define_method and Module#attr*.
Ruby2CExtension’s compile time vmode heuristic works by replacing calls to the
vmode changing methods with nil and instead changing the compile time vmode.
The calls are only replaced if they are without receiver and without
arguments. And it also depends on where the calls are made:
In the toplevel scope only calls to private and public are replaced; the
default vmode in the toplevel scope is private.
In a module scope calls to private, protected, public and
module_function are replaced; the default vmode in a module scope is public.
In a class scope calls to private, protected and public are replaced;
the default vmode in a class scope is public.
In methods and blocks no calls are replaced and all methods defined with
def will be public.
If your code doesn’t do anything tricky, then the compile time heuristic should just work as expected. Here is an example:
# start of file
# default vmode is private
def m1; end # private
public      # this call will be replaced with nil
# vmode is now public
def m2; end # public
class A
  # default vmode is public
  def m2; end # public
  private     # this call will be replaced with nil
  # vmode is now private
  def m3      # private
    def foo   # public, because it is in a method
    end
  end
end
protected # this call is not replaced and will probably fail because
          # #protected is not defined at toplevel
# end of file
Here is an example where it fails:
class A
  def pub; end   # public
  private if nil # Ruby2CExtension replaces the call to private with nil
  # vmode is now private
  def pub2; end  # will be private in the compiled C extension
end
But this should be a pretty uncommon case. The visibility of methods defined
using def should be correct for most Ruby code.
It is a bit more complicated for methods/attributes defined using
define_method or the attr* methods. Their visibility is determined by
Ruby’s internal vmode at run time.
As explained above, C extension code does not get its own SCOPE, so it also
doesn’t get its own vmode. When a C extension is required, Ruby
sets the vmode to public before loading the C extension. All the class/module
scopes that are executed during the require are executed in the same SCOPE
and so also with the same vmode.
Because all the vmode changing method calls are replaced with nil, Ruby’s
internal vmode will probably stay public all the time and so all
methods/attributes defined using define_method or the attr* methods will
be public. Here is an example:
# Ruby's internal vmode is public
class A
  # still the same internal vmode, still public
  define_method(:a) {} # public
  private              # is replaced with nil, does not affect Ruby's internal vmode
  def b; end           # private
  # but Ruby's internal vmode is still public
  define_method(:c) {} # public
  attr_accessor :d, :e # also public
end
If those methods really need another visibility, then it can be changed explicitly:
class A
  define_method(:a) {}
  private :a
end
In methods the internal vmode is that of the nearest Ruby SCOPE (as explained above), so it is usually unpredictable. And additionally calling one of the vmode changing methods will also affect the SCOPE of the caller:
# in compiled C extension
def my_private
  private
end
# in Ruby code
class A
  def a;end # public
  my_private
  def b;end # private, but would be public if the above code was not compiled
end
To be safe, don’t use the vmode changing methods without arguments inside
methods; instead set the visibility explicitly (e.g. private :a).
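The advice above might look like the following in practice. This is a sketch only: the module `PrivateReaders`, the method `private_reader` and the class `Account` are hypothetical names, and the helper names the method it hides instead of calling private without arguments.

```ruby
# Hypothetical helper module: defines a reader and makes it private by
# naming it explicitly, rather than calling private without arguments.
module PrivateReaders
  def private_reader(name)
    attr_reader name
    private name
  end
end

class Account
  extend PrivateReaders
  private_reader :secret

  def initialize(secret)
    @secret = secret
  end

  def secret_length
    secret.length
  end
end

length = Account.new("abc").secret_length
```

Because the visibility is set for a named method, the result does not depend on whichever vmode happens to be current when the helper runs.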
The so-called cref is another Ruby internal that Ruby2CExtension has to emulate. The cref is a linked list that describes the current lexical class/module nesting. Example:
class A
  class B
    # cref here is B -> A -> Object
  end
  # cref here is A -> Object
  module C
    # cref here is C -> A -> Object
  end
end
# cref here is Object
The current cref is used for constant and class variable lookup and for def,
undef and alias.
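In interpreted Ruby, the lexical nesting that the cref describes can be observed with Module.nesting (which does not list the implicit Object at the end). The class names `Outer` and `Inner` below are illustrative only.

```ruby
class Outer
  class Inner
    # Innermost scope first: Outer::Inner, then Outer.
    INNER_NESTING = Module.nesting
  end
  # Back in the enclosing class body: only Outer.
  OUTER_NESTING = Module.nesting
end

inner = Outer::Inner::INNER_NESTING
outer = Outer::OUTER_NESTING
```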
Ruby2CExtension emulates cref with the same semantics as Ruby. There are only two problems.
First, when a method is defined in Ruby code, it saves the current cref for
later use. From C it isn’t possible to store an extra value when defining a
method, so Ruby2CExtension works around this problem by storing the cref in a
global variable. But there is still a problem when a def is used/run
multiple times, because the cref can differ each time.
So if a def that requires a cref is used multiple times, then the compiled C
extension will raise an exception (this is the only case where code compiled
with Ruby2CExtension raises an exception that Ruby wouldn’t raise). If a def
doesn’t need a cref, then everything is fine and it can be used multiple
times. Examples:
["a", "b"].each { |s|
  class << s
    def bar; self; end # is OK, doesn't need cref
  end
}
["a", "b"].each { |s|
  class << s
    def baz; Array.new(self); end # fails the 2nd time
  end
}
The second case fails because the constant lookup for Array needs a cref. If
you really need this to work, then you can use ::Array instead of Array,
because that won’t need a cref.
Again in the usual case everything should be fine and if you really need to
use a def that requires a cref multiple times, then you might be able to
modify it so that it won’t need a cref.
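A sketch of the ::Array rewrite described above, written so that the def can run once per element. The method name `wrapped` is hypothetical, and the strings are duplicated only so that the example also runs on Ruby versions that freeze string literals.

```ruby
strings = ["a".dup, "b".dup]
strings.each { |s|
  class << s
    # ::Array is looked up from the top level, so according to the text
    # above this def does not need a cref.
    def wrapped; ::Array.new(1, self); end
  end
}

wrapped = strings.map { |s| s.wrapped }
```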
The second problem that arises from Ruby2CExtension emulating crefs is that
methods that access Ruby’s internal cref will see the wrong cref and thus
not behave as expected. Those methods are Module.nesting, Module.constants
(but Module#constants works), autoload and autoload? (use
Module#autoload and Module#autoload? instead).
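The receiver-based variants mentioned above can be used as in this sketch. The module `Shapes`, the constant `Lazy` and the path "shapes/lazy" are hypothetical; the file is only registered for autoloading and never actually loaded here.

```ruby
module Shapes
  CIRCLE = :circle
  SQUARE = :square
end

# Explicit receiver: Module#constants instead of Module.constants.
# to_s is used because older Rubies return strings and newer ones symbols.
shape_names = Shapes.constants.map { |c| c.to_s }.sort

# Explicit receiver: Module#autoload and Module#autoload?.
Shapes.autoload(:Lazy, "shapes/lazy")
pending_path = Shapes.autoload?(:Lazy)
```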
There currently is no way to pass a Proc instance as the block parameter to a method call on the C side, i.e. there is no way to do the following from C:
def foo(proc)
  bar(&proc)
end
Ruby2CExtension works around this by compiling the above to something similar to this:
def foo(proc)
  tmp_proc = proc.to_proc
  bar { |*arg| tmp_proc.call(*arg) }
end
The downside of this workaround is that it doesn’t work (as expected) with
methods like instance_eval and it is problematic if a proc is passed deep
into a recursive method, because that block will then be wrapped multiple
times.
This issue might be fixed if a future Ruby version provides a way to cleanly pass a proc to a method call.
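The instance_eval problem can be reproduced in plain Ruby by writing the wrapper by hand. This only imitates the workaround and is not the code Ruby2CExtension generates; the variable names are illustrative.

```ruby
pr = proc { self }
target = Object.new

# Passing the proc directly: the block body runs with self == target.
direct = target.instance_eval(&pr)

# Hand-written imitation of the wrapper: only the outer block is re-bound,
# while the proc called inside it keeps its original self.
tmp_proc = pr.to_proc
wrapped = target.instance_eval { |*arg| tmp_proc.call(*arg) }

direct_is_target  = direct.equal?(target)
wrapped_is_target = wrapped.equal?(target)
```

Here `direct_is_target` is true while `wrapped_is_target` is false, which is the kind of difference the paragraph above warns about.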
Block argument semantics are a bit tricky, but Ruby2CExtension should get them right for most cases. Here are two cases where the result is wrong:
def t1(*a); yield *a; end
p t1([1, 2]) { |a| a }
def bl_pa_tst(); p yield([1, 2]); end
pc = proc { |*a| a }
bl_pa_tst(&pc)
bl_pa_tst { |*a| a }
Ruby outputs:
[1, 2]
[[1, 2]]
[[1, 2]]
The compiled C extension outputs:
[[1, 2]]
[1, 2]
[1, 2]
But again, for most cases it should just work and maybe it will get better in future Ruby2CExtension versions.
super with implicit arguments and optional arguments

In a compiled C extension super with implicit arguments will not use
optional arguments that have not been given by the original caller. Example:
class A
  def foo(*a)
    p a
  end
end
class B < A
  def foo(a = nil)
    super
  end
end
B.new.foo
In Ruby this code will output [nil], in the compiled form it will output
[].
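A possible way to avoid the difference is to pass the optional argument to super explicitly. The sketch below shows the interpreted behavior; that the compiled form passes explicit arguments unchanged is an assumption, and the class names `Base` and `Derived` are illustrative.

```ruby
class Base
  def foo(*a)
    a
  end
end

class Derived < Base
  def foo(a = nil)
    super(a) # pass the optional argument explicitly, even when defaulted
  end
end

result = Derived.new.foo
```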
Ruby files that use one or more of the problematic things described above will usually compile just fine and then fail or behave incorrectly at runtime. In contrast, the following things will be caught at compile time. Most of them are just not supported yet and might be added later.
Most of the things that are not supported are related to control flow. Ruby
implements most of its control flow using setjmp() and longjmp(). Most of
this is in eval.c and there is no real API to access it, so Ruby2CExtension
has to do some tricks to make control flow work. For most cases workarounds
are implemented, but some of the harder cases are not (yet) supported.
The first thing that isn’t supported is return from inside a block. Also
break with a value from inside a block is not supported (but break without
a value is):
def foo
  bar(1, 2, 3) {
    next      # works
    next 42   # works
    redo      # works
    break     # works
    break 42  # does not work
    return    # does not work
    return 42 # does not work
  }
  return    # works
  return 42 # works
end
For while/until loops everything works.
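Code that would normally use return or break with a value inside a block can often be restructured to use only the supported forms. The following is a sketch under that assumption; the method name `first_even` is hypothetical.

```ruby
# Carries the result out of the block in a local variable and uses a
# plain break, then returns from the method body after the block.
def first_even(numbers)
  found = nil
  numbers.each { |n|
    if n % 2 == 0
      found = n
      break # break without a value
    end
  }
  found
end

even = first_even([1, 3, 4, 6])
```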
Another problematic area in Ruby2CExtension 0.1.0 was control flow “through” a
rescue or ensure clause. This is mostly implemented now; only in ensure
clauses does it still not work (it does work in ensure bodies):
def foo
  while bar?
    begin
      next      # works
      next 42   # works
      redo      # works
      break     # works
      break 42  # works
      return    # works
      return 42 # works
    rescue
      next      # works
      next 42   # works
      redo      # works
      break     # works
      break 42  # works
      return    # works
      return 42 # works
    ensure
      next      # does not work
      next 42   # does not work
      redo      # does not work
      break     # does not work
      break 42  # does not work
      return    # does not work
      return 42 # does not work
    end
  end
end
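When an ensure clause needs to influence control flow, one option is to record the decision in a local variable and act on it after the begin/ensure block. This is a sketch only; the method name `drain` and the :stop marker are hypothetical.

```ruby
def drain(queue)
  processed = []
  until queue.empty?
    stop = false
    begin
      processed << queue.shift
    ensure
      # Only record the decision here; no break/next/return in the clause.
      stop = true if processed.last == :stop
    end
    break if stop # control flow happens after the begin/ensure block
  end
  processed
end

drained = drain([1, :stop, 2])
```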
defined?

The defined? statement is also not fully supported. It works for most of the
common cases like constants, global variables, instance variables, class
variables and $~ (and derived). Ruby2CExtension fails at compile time if a
case is not supported.
super with implicit arguments in Ruby 1.8.5 and later

For Ruby 1.8.5 super with implicit arguments is only supported in methods (not in blocks or
rescue/ensure clauses). For Ruby 1.8.4 super with implicit arguments is
supported everywhere.
To work around this problem, just specify the arguments explicitly:
def foo(a, b)
  3.times { super(a, b) }
end
instead of:
def foo(a, b)
  3.times { super }
end