My "Ruby interface for C++ extensions" repo (this is not the one you want, probably -- see http://github.com/jameskilton/rice instead)
cout/rice
\mainpage Rice - Ruby Interface for C++ Extensions


\section intro Introduction

Rice is a C++ interface to Ruby's C API. It provides a type-safe and
exception-safe interface in order to make embedding Ruby and writing
Ruby extensions with C++ easier. It is similar to Boost.Python in many
ways, but also attempts to provide an object-oriented interface to all
of the Ruby C API.

What Rice gives you:
\li A simple C++-based syntax for wrapping and defining classes
\li Automatic conversion of exceptions between C++ and Ruby
\li Smart pointers for handling garbage collection
\li Wrappers for most builtin types to simplify calling code


\section installation Installation

There are two ways to install Rice. You can download the tarball from
the project page:

  http://rubyforge.org/projects/rice

extract it and run:

\code
  ./configure
  make
  sudo make install
\endcode

or you can install via RubyGems:

\code
  gem install rice
\endcode

Note to Windows users: Rice is only known to compile and run properly
under Cygwin and MinGW.


\section tutorial Tutorial

\subsection geting_started Getting started

Writing an extension with Rice is very similar to writing an extension
with the C API.

The first step is to create an extconf.rb file:

\code
  require 'mkmf-rice'
  create_makefile('test')
\endcode

Note that we use mkmf-rice instead of mkmf. This ensures that the
extension is linked with the standard C++ library.

Next we create our extension and save it to test.cpp:

\code
  extern "C"
  void Init_Test()
  {
  }
\endcode

Note the extern "C" line above. This tells the compiler that the
function Init_Test should have C linkage and calling convention. This
turns off name mangling so that the Ruby interpreter will be able to
find the function (remember that Ruby is written in C, not C++).

So far we haven't put anything into the extension, so it isn't
particularly useful. The next step is to define a class so we can add
methods to it.
\subsection classes Defining classes

Defining a class in Rice is easy:

\code
  #include "rice/Class.hpp"

  using namespace Rice;

  extern "C"
  void Init_Test()
  {
    Class rb_cTest = define_class("Test");
  }
\endcode

This will create a class called Test that inherits from Object. If we
wanted to inherit from a different class, we could easily do so:

\code
  #include "rice/Class.hpp"

  using namespace Rice;

  extern "C"
  void Init_Test()
  {
    Class rb_cMySocket = define_class("MySocket", rb_cIO);
  }
\endcode

Note the prefix rb_c on the name of the class. This is a convention
that the Ruby interpreter and many extensions tend to use. It signifies
that this is a class and not some other type of object. Some other
naming conventions that are commonly used:

\li rb_c variable name prefix for a Class
\li rb_m variable name prefix for a Module
\li rb_e variable name prefix for an Exception type
\li rb_ function prefix for a function in the Ruby C API
\li rb_f_ function prefix to differentiate between an API function that
    takes Ruby objects as arguments and one that takes C argument types
\li rb_*_s_ indicates the function is a singleton function
\li *_m suffix to indicate the function takes a variable number of
    arguments

Also note that we don't include "ruby.h" directly. Rice has a wrapper
for ruby.h that handles some compatibility issues across platforms and
Ruby versions. Always include Rice headers before including anything
that might include "ruby.h".


\subsection methods Defining methods

Now let's add a method to our class:

\code
  #include "rice/Class.hpp"
  #include "rice/String.hpp"

  using namespace Rice;

  Object test_hello(Object /* self */)
  {
    String str("hello, world");
    return str;
  }

  extern "C"
  void Init_Test()
  {
    Class rb_cTest =
      define_class("Test")
      .define_method("hello", test_hello);
  }
\endcode

Here we add a method Test#hello that simply returns the string
"hello, world". The method takes self as an implicit parameter, but it
isn't used, so we comment out the parameter name to prevent a compiler
warning.
We could also add an #initialize method to our class:

\code
  #include "rice/Class.hpp"
  #include "rice/String.hpp"

  using namespace Rice;

  Object test_initialize(Object self)
  {
    self.iv_set("@foo", 42);
    return self; // the declared return type requires a value
  }

  Object test_hello(Object /* self */)
  {
    String str("hello, world");
    return str;
  }

  extern "C"
  void Init_Test()
  {
    Class rb_cTest =
      define_class("Test")
      .define_method("initialize", test_initialize)
      .define_method("hello", test_hello);
  }
\endcode

The initialize method sets an instance variable @foo to the value 42.
The number is automatically converted to a Fixnum before doing the
assignment.

Note that we're chaining calls on the Class object. Most member
functions in Module and Class return a reference to self, so we can
chain as many calls as we want to define as many methods as we want.


\subsection data_types Wrapping C++ Types

It's useful to be able to define Ruby classes in a C++ style rather
than using the Ruby API directly, but the real power of Rice is in
wrapping already-defined C++ types.

Let's assume we have the following C++ class that we want to wrap:

\code
  class Test
  {
  public:
    Test();
    std::string hello();
  };
\endcode

This is a C++ version of the Ruby class we just created in the previous
section. To wrap it:

\code
  #include "rice/Data_Type.hpp"
  #include "rice/Constructor.hpp"

  using namespace Rice;

  extern "C"
  void Init_Test()
  {
    Data_Type<Test> rb_cTest =
      define_class<Test>("Test")
      .define_constructor(Constructor<Test>())
      .define_method("hello", &Test::hello);
  }
\endcode

This example is similar to the one before, but we use Data_Type<>
instead of Class and the template version of define_class() instead of
the non-template version. This creates a binding in the Rice library
between the Ruby class Test and the C++ class Test, so that we can pass
member function pointers to define_method() and have conversions be
done automatically.

It's possible to write the conversion functions ourselves (as we'll see
below), but Rice does all the dirty work for us.
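To give a feel for how a library can accept a plain member function
pointer and convert the result automatically, here is a minimal sketch
that does not use Rice or Ruby at all. It assumes a C++11 compiler. The
names to_text and wrap_method are hypothetical and exist only for this
illustration; Rice's real implementation is more involved and produces
Ruby objects rather than strings.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical stand-in for a "script-side" value: everything becomes
// text. (Rice converts to Ruby objects; this only mirrors the idea.)
template<typename T>
std::string to_text(T const & x)
{
  std::ostringstream out;
  out << x;
  return out.str();
}

// The compiler deduces the class C and the return type R from the
// member function pointer, so the caller never spells them out. The
// returned callable converts the result with to_text automatically.
template<typename C, typename R>
std::function<std::string(C &)> wrap_method(R (C::*method)())
{
  return [method](C & self) { return to_text((self.*method)()); };
}

class Test
{
public:
  std::string hello() { return "hello, world"; }
  int answer() { return 42; }
};

// Both wrappers share one signature even though the wrapped member
// functions return different types.
inline std::string call_hello()
{
  Test t;
  return wrap_method(&Test::hello)(t);
}

inline std::string call_answer()
{
  Test t;
  return wrap_method(&Test::answer)(t);
}
```

The point of the sketch is that template argument deduction recovers
the signature from the pointer itself, which is why nothing about
arguments or return types has to be repeated when a method is
registered.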
\subsection conversions Type conversions

Let's look again at our example class:

\code
  class Test
  {
  public:
    Test();
    std::string hello();
  };
\endcode

When we wrote our class, we never wrote a single line of code to
convert the std::string returned by hello() into a Ruby type.
Nevertheless, the conversion works, and when we write:

\code
  test = Test.new
  puts test.hello
\endcode

we get the expected result.

Rice has two template conversion functions to convert between C++ and
Ruby types:

\code
  template<typename T>
  T from_ruby(Object x);

  template<typename T>
  Object to_ruby(T const & x);
\endcode

Rice includes default specializations for many of the builtin types. To
define your own conversion, you can write a specialization:

\code
  template<>
  Foo from_ruby<Foo>(Object x)
  {
    // ...
  }

  template<>
  Object to_ruby<Foo>(Foo const & x)
  {
    // ...
  }
\endcode

The implementation of these functions would, of course, depend on the
implementation of Foo.


\subsection data_conversions Conversions for wrapped C++ types

Take another look at the wrapper we wrote for the Test class:

\code
  extern "C"
  void Init_Test()
  {
    Data_Type<Test> rb_cTest =
      define_class<Test>("Test")
      .define_constructor(Constructor<Test>())
      .define_method("hello", &Test::hello);
  }
\endcode

When we called define_class<Test>, it created a Class for us and
automatically registered the new Class with the type system, so that
the calls:

\code
  Data_Object<Test> obj(new Test);
  Test * t = from_ruby<Test *>(obj);
  Test const * ct = from_ruby<Test const *>(obj);
\endcode

work as expected.

The Data_Object class is a wrapper for the Data_Wrap_Struct and the
Data_Get_Struct macros in C extensions. It can be used to wrap or
unwrap any class that has been assigned to a Data_Type.
It inherits from Object, so any member functions we can call on an
Object we can also call on a Data_Object:

\code
  Object object_id = obj.call("object_id");
  std::cout << object_id << std::endl;
\endcode

The Data_Object class can be used to wrap a newly-created object:

\code
  Data_Object<Foo> foo(new Foo);
\endcode

or to unwrap an already-created object:

\code
  VALUE obj = ...;
  Data_Object<Foo> foo(obj);
\endcode

A Data_Object functions like a smart pointer:

\code
  Data_Object<Foo> foo(obj);
  foo->foo();
  std::cout << *foo << std::endl;
\endcode

Like a VALUE or an Object, data stored in a Data_Object will be marked
by the garbage collector as long as the Data_Object is on the stack.


\subsection exception Exceptions

Suppose we added a member function to our example class that throws an
exception:

\code
  class MyException
    : public std::exception
  {
  };

  class Test
  {
  public:
    Test();
    std::string hello();
    void error();
  };
\endcode

If we were to wrap this function:

\code
  extern "C"
  void Init_Test()
  {
    Data_Type<Test> rb_cTest =
      define_class<Test>("Test")
      .define_constructor(Constructor<Test>())
      .define_method("hello", &Test::hello)
      .define_method("error", &Test::error);
  }
\endcode

and call it from inside Ruby:

\code
  test = Test.new
  test.error()
\endcode

we would get an exception. Rice will automatically convert any C++
exception it catches into a Ruby exception. But what if we wanted to
use a custom error message when we convert the exception, or what if we
wanted to convert to a different type of exception?
We can write this:

\code
  extern "C"
  void Init_Test()
  {
    Data_Type<Test> rb_cTest =
      define_class<Test>("Test")
      .add_handler<MyException>(handle_my_exception)
      .define_constructor(Constructor<Test>())
      .define_method("hello", &Test::hello)
      .define_method("error", &Test::error);
  }
\endcode

The handle_my_exception function need only rethrow the exception as a
Rice::Exception:

\code
  void handle_my_exception(MyException const & ex)
  {
    throw Exception(rb_eRuntimeError, "Goodnight, moon");
  }
\endcode

And what if we want to call Ruby code from C++? These exceptions are
also converted:

\code
  Object o;
  o.call("some_function_that_raises", 42);

  protect(rb_raise, rb_eRuntimeError, "some exception msg");
\endcode

Internally, whenever Rice catches a C++ or a Ruby exception, it
converts it to an Exception object. This object is later re-raised as a
Ruby exception when control returns to the Ruby VM.

Rice uses a similar class called Jump_Tag to handle symbols thrown by
Ruby's throw/catch or other non-local jumps from inside the Ruby VM.


\subsection builtin Builtin types

You've seen this example:

\code
  Object object_id = obj.call("object_id");
  std::cout << object_id << std::endl;
\endcode

Rice mimics the Ruby class hierarchy as closely as it can given that
C++ is statically typed.
In fact, the above code also works for Classes:

\code
  Class rb_cTest = define_class<Test>("Test");
  Object object_id = rb_cTest.call("object_id");
  std::cout << object_id << std::endl;
\endcode

Rice provides builtin wrappers for many builtin Ruby types, including:

\li Object
\li Module
\li Class
\li String
\li Array
\li Hash
\li Struct
\li Symbol
\li Exception

The Array and Hash types can even be iterated over the same way one
would iterate over an STL container:

\code
  Array a;
  a.push(to_ruby(42));
  a.push(to_ruby(43));
  a.push(to_ruby(44));

  Array::iterator it = a.begin();
  Array::iterator end = a.end();
  for(; it != end; ++it)
  {
    std::cout << *it << std::endl;
  }
\endcode

STL algorithms should also work as expected on Array and Hash
containers.


\subsection inheritance Inheritance

Inheritance is a tricky problem to solve in extensions. This is because
wrapper functions for base classes typically don't know how to accept
pointers to derived classes. It is possible to write this logic, but
the code is nontrivial.

Fortunately, Rice handles this gracefully:

\code
  class Base
  {
  public:
    virtual void foo();
  };

  class Derived
    : public Base
  {
  };

  extern "C"
  void Init_Test()
  {
    Data_Type<Base> rb_cBase =
      define_class<Base>("Base")
      .define_method("foo", &Base::foo);

    Data_Type<Derived> rb_cDerived =
      define_class<Derived, Base>("Derived");
  }
\endcode

The second template parameter to define_class indicates that Derived
inherits from Base.

Rice does not yet support multiple inheritance, but it is believed that
this is possible through the use of mixins.


\subsection overloading Overloaded functions

If you try to create a member function pointer to an overloaded
function, you will get an error. So how do we wrap classes that have
overloaded functions?
Consider a class that uses this idiom for accessors:

\code
  class Container
  {
  public:
    size_t capacity();         // Get the capacity
    void capacity(size_t cap); // Set the capacity
  };
\endcode

We can wrap this class by using typedefs:

\code
  extern "C"
  void Init_Container()
  {
    // One pointer-to-member typedef per overload
    typedef size_t (Container::*get_capacity)();
    typedef void (Container::*set_capacity)(size_t);

    Data_Type<Container> rb_cContainer =
      define_class<Container>("Container")
      .define_method("capacity", get_capacity(&Container::capacity))
      .define_method("capacity=", set_capacity(&Container::capacity));
  }
\endcode

A future version of Rice may provide a simplified interface for this.


\subsection user_defined_conversions User-defined type conversions

Rice provides default conversions for many built-in types. Sometimes,
however, the default conversion is not the right conversion. For
example, consider a function:

\code
  void foo(char * x);
\endcode

Is x a pointer to a single character, a pointer to the first character
of a null-terminated string, or a pointer to the first character of an
array of char?

Because the second case is the most common use case (a pointer to the
first character of a C string), Rice provides a default conversion that
treats a char * as a C string. But suppose the above function takes a
pointer to a single char instead?

If we write this:

\code
  extern "C"
  void Init_Test()
  {
    define_global_function("foo", foo);
  }
\endcode

it will likely have the wrong behavior.

To avoid this problem, it is necessary to write a wrapper function:

\code
  Object wrap_foo(Object o)
  {
    char c = from_ruby<char>(o);
    foo(&c);
    return to_ruby(c);
  }

  extern "C"
  void Init_Test()
  {
    define_global_function("foo", wrap_foo);
  }
\endcode

Note that the out parameter is returned from wrap_foo, as Ruby does not
have pass-by-variable-reference (it uses pass-by-object-reference).

Future versions of Rice will have a cleaner way of dealing with this.
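The typedef-and-cast idiom from the Overloaded functions subsection can
be tried without Rice. The sketch below is plain C++ and only shows
that converting the address of an overloaded member function to a
specific pointer-to-member type selects the matching overload. The
member capacity_ and the helper round_trip are hypothetical additions
for this illustration.

```cpp
#include <cassert>
#include <cstddef>

class Container
{
public:
  Container() : capacity_(0) { }
  std::size_t capacity() { return capacity_; }        // Get the capacity
  void capacity(std::size_t cap) { capacity_ = cap; } // Set the capacity

private:
  std::size_t capacity_; // hypothetical storage for the example
};

// One pointer-to-member typedef per overload.
typedef std::size_t (Container::*get_capacity)();
typedef void (Container::*set_capacity)(std::size_t);

// Sets a capacity through the setter overload and reads it back
// through the getter overload, using only the typed pointers.
inline std::size_t round_trip(std::size_t cap)
{
  // The target type of each cast picks the overload; a bare
  // &Container::capacity with no target type would be ambiguous.
  get_capacity getter = get_capacity(&Container::capacity);
  set_capacity setter = set_capacity(&Container::capacity);

  Container c;
  (c.*setter)(cap);
  return (c.*getter)();
}
```

A static_cast to the same typedefs would work equally well and makes
the intent a little more explicit.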
\section motivation Motivation

There are a number of common problems when writing C or C++ extensions
for Ruby:

\li Type safety. It is easy to mix up integral types such as ID and
    VALUE. Some of the functions in the Ruby API are not consistent
    with which types they take (e.g. rb_const_defined takes an ID and
    rb_mod_remove_const takes a Symbol).

\li DRY principle. Specifying the number of arguments that each wrapped
    function takes is easy to get wrong. Adding a new argument to the
    function means that the number of arguments passed to
    rb_define_method must also be updated.

\li Type conversion. There are many different functions to convert data
    to and from Ruby types. Many of them have different semantics or
    different forms. For example, to convert a string, one might use
    the StringValue macro, but to convert a fixnum, one might use
    FIX2INT. Unwrapping previously wrapped C data uses yet another
    form.

\li Exception safety. It is imperative that C++ exceptions never make
    their way into C code, and it is also imperative that a Ruby
    exception never escape while there are objects on the stack with
    nontrivial destructors. Rules for when it is okay to use which
    exceptions are difficult to get right, especially as code is
    maintained through time.

\li Thread safety. Because the Ruby interpreter is not thread-safe, it
    must not be run from more than one thread. Because of tricks the GC
    and scheduler play with the C stack, it's not enough to ensure that
    only one thread runs the interpreter at any given time; once the
    interpreter has been run from one thread, it must only ever be run
    from that thread in the future. Additionally, because Ruby copies
    the stack when it switches threads, C++ code must be careful not to
    access objects in one Ruby thread that were created on the stack in
    another Ruby thread.

\li C-based API.
    The Ruby API is not always convenient for accessing Ruby data
    structures such as Hash and Array, especially when writing C++
    code, as the interface for these containers is not consistent with
    standard containers.

\li Calling convention. Function pointers passed into the Ruby API must
    follow the C calling convention. This means that it is not possible
    to pass a pointer to a template function or static member function
    (that is, it will work on some platforms, but isn't portable).

\li Inheritance. When wrapping C++ objects, it is easy to store a
    pointer to a derived class, but then methods in the base class must
    have knowledge of the derived class in order to unwrap the object.
    It is possible to always store a pointer to the base class and then
    dynamic_cast the pointer to the derived type when necessary, but
    this can be slow and cumbersome, and it isn't likely to work with
    multiple inheritance. A system that properly handles inheritance
    for all corner cases is nontrivial.

\li Multiple inheritance. C++ supports true multiple inheritance, but
    the Ruby object model uses single inheritance with mixins. When
    wrapping a library whose public interface uses multiple
    inheritance, care must be taken in constructing the mapping.

\li GC safety. All live Ruby objects must be marked during the garbage
    collector's mark phase, otherwise they will be prematurely
    destroyed. The general rule is that object references stored on the
    heap should be either registered with rb_gc_register_address or
    marked by a data object's mark function; object references stored
    on the stack will be automatically marked, provided the Ruby
    interpreter was properly initialized at startup.

\li Callbacks. C implements callbacks via function pointers, while Ruby
    typically implements callbacks via procs. Writing an adapter
    function to call the proc is not difficult, but there is much
    opportunity for error (particularly with exception-safety).

\li Data serialization.
    By default, data objects defined at the C layer are not
    marshalable. The user must explicitly define functions to marshal
    the data member-by-member.

Rice addresses these issues in many ways:

\li Type safety. Rice provides encapsulation for all builtin types,
    such as Object, Identifier, Class, Module, and String. It
    automatically checks the dynamic type of an object before
    constructing an instance of a wrapper.

\li DRY principle. Rice uses introspection through the use of templates
    and function overloading to automatically determine the number and
    types of arguments to functions. Default arguments must still be
    handled explicitly, however.

\li Type conversions. Rice provides cast-style to_ruby<> and
    from_ruby<> template functions to simplify explicit type
    conversions. Automatic type conversions for parameters and return
    values are generated for all wrapped functions.

\li Exception safety. Rice automatically converts common exceptions and
    provides a mechanism for converting user-defined exception types.
    Rice also provides convenience functions for converting exceptions
    when calling back into Ruby code.

\li Thread safety. Rice provides no mechanisms for dealing with thread
    safety. Many common thread safety issues should be alleviated by
    YARV, which supports POSIX threads.

\li C++-based API. Rice provides an object-oriented C++-style API to
    most common functions in the Ruby C API.

\li Calling convention. Rice automatically uses the C calling
    convention for all function pointers passed into the Ruby API.

\li Inheritance. Rice provides automatic conversion to the base class
    type when a wrapped member function is called on the base class.

\li Multiple inheritance. Rice provides no mechanism for multiple
    inheritance. Multiple inheritance can be simulated via mixins,
    though this is not yet as easy as it could be.

\li GC safety. Rice provides a handful of convenience classes for
    interacting with the garbage collector.
    There are still basic rules which must be followed to ensure that
    objects get properly destroyed.

\li Callbacks. Rice provides a handful of convenience classes for
    dealing with callbacks.

\li Data serialization. Rice provides no mechanism for data
    serialization, but this may be added in a future release.


\section what_not What Rice is Not

There are a number of projects which serve similar functions to Rice.
Two such popular projects are SWIG and Boost.Python. Rice has some
distinct features which set it apart from both of these projects.

Rice is not trying to replace SWIG. Rice is not a generic wrapper
interface generator. Rice is a C++ library for interfacing with the
Ruby C API. This provides a very natural way for C++ programmers to
wrap their C++ code, without having to learn a new domain-specific
language. However, there is no reason why SWIG and Rice could not work
together; a SWIG module could be written to generate Rice code. Such a
module would combine the portability of SWIG with the maintainability
of Rice (I have written extensions using both, and I have found Rice
extensions to be more maintainable when the interface is constantly
changing. Your mileage may vary).

Rice is also not trying to simply be a Ruby version of Boost.Python.
Rice does use some of the same template tricks that Boost.Python uses;
however, there are some important distinctions. First of all,
Boost.Python attempts to create a declarative DSL in C++ using
templates. Rice is a wrapper around the Ruby C API and attempts to make
its interface look like an OO version of the API; this means that class
declarations look procedural rather than declarative. Secondly, the
Ruby object model is different from the Python object model. This is
reflected in the interface to Rice; it mimics the Ruby object model at
the C++ level. Thirdly, Rice uses Ruby as a code generator; I find this
to be much more readable than using the Boost preprocessor library.
\section history History

Rice originated as a project to interface with C++-based trading
software at Automated Trading Desk in Mount Pleasant, South Carolina.
The Ruby bindings for SWIG were at the time less mature than they are
today, and did not suit the needs of the project.

Excruby was written not as a wrapper for the Ruby API, but rather as a
set of helper functions and classes for interfacing with the Ruby
interpreter in an exception-safe manner. Over the course of five years,
the project grew into wrappers for pieces of the API, but the original
helper functions remained as part of the public interface.

This created confusion for the users of the library, because there were
multiple ways of accomplishing most tasks -- directly through the C
API, through a low-level wrapper around the C API, and through a
high-level abstraction of the lower-level interfaces.

Rice was then born in an attempt to clean up the interface. Rice keeps
the lower-level wrappers, but as an implementation detail; the public
interface is truly a high-level abstraction around the Ruby C API.


\section gc The GC

\li Objects are not automatically registered with the garbage
    collector.

\li If an Object is on the stack, it does not need to be registered
    with the garbage collector.

\li If an Object is allocated on the heap or if it is a member of an
    object that might be allocated on the heap, use an
    Address_Registration_Guard to register the object with the garbage
    collector.

\li If a reference counted object is being wrapped, or if another type
    of smart pointer is wrapped, ensure that only one mechanism is used
    to destroy the object. In general, the smart pointer manages the
    allocation of the object, and Ruby should hold only a reference to
    the smart pointer. When the garbage collector determines that it is
    time to clean up the object, the smart pointer will be destroyed,
    decrementing the reference count; when the reference count drops to
    0, the underlying object will be destroyed.
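The last rule above can be illustrated without Ruby. The following
sketch assumes C++11 and uses std::shared_ptr as the reference counted
pointer. The names Resource, Handle, free_handle, and simulate are
hypothetical; free_handle merely plays the role of the free function a
garbage collector would invoke, and nothing here is Rice API.

```cpp
#include <cassert>
#include <memory>

// A resource that records its own destruction in an external counter.
struct Resource
{
  explicit Resource(int * destroyed) : destroyed_(destroyed) { }
  ~Resource() { ++*destroyed_; }
  int * destroyed_;
};

// What the scripting side would hold: a heap-allocated smart pointer,
// never the raw Resource itself.
typedef std::shared_ptr<Resource> Handle;

// Hypothetical stand-in for the function a garbage collector calls when
// the wrapper object is collected. It destroys only the smart pointer.
inline void free_handle(Handle * h) { delete h; }

struct Counts
{
  int after_gc;             // destructions seen after the "GC" ran
  int after_owner_released; // destructions seen after C++ let go too
};

inline Counts simulate()
{
  int destroyed = 0;
  Counts counts;
  {
    Handle owner = std::make_shared<Resource>(&destroyed); // C++ owner
    Handle * wrapped = new Handle(owner); // reference held by the wrapper
    free_handle(wrapped);                 // count drops from 2 to 1
    counts.after_gc = destroyed;          // resource is still alive
  }
  // The C++ owner is gone, so the count reached 0 and the resource died.
  counts.after_owner_released = destroyed;
  return counts;
}
```

Because the wrapper owns a smart pointer rather than the object, the
collector and the C++ code never both try to delete the same Resource;
whichever reference is released last triggers the single destruction.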
\section embedding Embedding

You can embed the Ruby interpreter in your application by using the VM
class:

\code
  int main(int argc, char * argv[])
  {
    Rice::VM vm(argc, argv);
    vm.run();
  }
\endcode

If the VM is not initialized from main() -- from a callback, for
example -- then you may need to initialize the stack whenever you use
Rice or the Ruby API:

\code
  std::auto_ptr<Rice::VM> vm;
  Rice::Object obj;

  void some_application_extension_init()
  {
    vm.reset(new Rice::VM("some_application"));
  }

  void some_application_extension_callback()
  {
    // Need to initialize the stack here, because we don't know if
    // we are at the same stack depth as when the VM was initialized
    vm->init_stack();

    // Now do some work...
    obj.call("some_callback_function");
  }
\endcode

Be aware that initializing the Ruby VM can cause a call to exit() if
certain command-line options are specified. This has two implications:

\li an application that constructs a Ruby VM may terminate unexpectedly
    if the options passed to the interpreter are not tightly controlled
    (a security issue), and

\li an application that constructs a Ruby VM should not have any
    objects with nontrivial destructors on the stack when the VM is
    created, otherwise those objects might not get correctly
    destructed.