Creating a PHP extension in Rust

UPDATE: A few hours after posting the initial draft of this I realized my PHP benchmark was broken. I’ve since updated the PHP and Rust versions to be more fair. You can see the changes in the GitHub repo (link at the bottom).

Last October I had a discussion with one of my coworkers at Etsy about how writing extensions to interpreted languages like PHP, Ruby or Python should be a lot easier than it is now. We talked a bit about how one of the barriers to successfully writing an extension is that they’re generally written in C, and it’s hard to gain confidence in C code if you’re not an expert in the language.

Ever since then I’ve toyed with the idea of writing one in Rust, and for the past few days have been playing around with it. I finally got it to work this morning.

Rust in C in PHP #

My basic idea was to compile some Rust code into a library, write some C headers for it, and use it from a C extension that PHP can call. Not the most straightforward thing in the world, but it seemed like a bit of fun.

Rust FFI #

The first thing I did was start toying around with the Rust Foreign Function Interface (FFI), which allows Rust to talk to C. I wrote a quick library with a single function, hello_from_rust, that takes a single argument (a pointer to a C char, i.e. a C string) and prints “Hello from Rust, ” followed by the input.

// hello_from_rust.rs
#![crate_type = "staticlib"]

#![feature(libc)]
extern crate libc;
use std::ffi::CStr;

#[no_mangle]
pub extern "C" fn hello_from_rust(name: *const libc::c_char) {
    let buf_name = unsafe { CStr::from_ptr(name).to_bytes() };
    let str_name = String::from_utf8(buf_name.to_vec()).unwrap();
    let c_name   = format!("Hello from Rust, {}", str_name);
    println!("{}", c_name);
}

I ripped most of this from “Calling a Rust library from C (or anything else!)”. There’s a good explanation of what’s going on there.
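One caveat with the function above: unwrap() will panic if the input isn’t valid UTF-8, and handing a null pointer to CStr::from_ptr is undefined behavior. Here’s a minimal sketch of a more defensive variant. The names greeting and hello_from_rust_checked are hypothetical (they’re not in the repo), and I’m assuming std::os::raw types instead of the libc crate so it builds without the feature gate.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Formats the greeting. Invalid UTF-8 is replaced with U+FFFD
/// instead of causing a panic.
fn greeting(name: &CStr) -> String {
    format!("Hello from Rust, {}", name.to_string_lossy())
}

/// Hypothetical defensive variant of `hello_from_rust`.
/// Returns 0 on success and -1 if `name` is null.
#[no_mangle]
pub extern "C" fn hello_from_rust_checked(name: *const c_char) -> c_int {
    if name.is_null() {
        return -1;
    }
    // SAFETY: `name` is non-null; the caller must still guarantee that it
    // points to a NUL-terminated string that stays alive during this call.
    let name = unsafe { CStr::from_ptr(name) };
    println!("{}", greeting(name));
    0
}
```

The return code is just one possible way to report the error to C; the original function returns nothing.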

Compiling this gives us a .a file, libhello_from_rust.a. This is a static library that contains all of its own Rust dependencies, and we can link to it when compiling a C program, which we’ll do next. Notice that when we compile it we get the following output:

note: link against the following native artifacts when linking against this static library
note: the order and any duplication can be significant on some platforms, and so may need to be preserved
note: library: System
note: library: pthread
note: library: c
note: library: m

This is the Rust compiler telling us what else we need to link against when using this dependency.

Calling Rust from C #

Now that we have a library, we have to do two things to make it callable from C: create a C header file for it, hello_from_rust.h, and link against the library when we compile.

Here’s the header file:

// hello_from_rust.h
#ifndef HELLO_FROM_RUST_H
#define HELLO_FROM_RUST_H

void hello_from_rust(const char *name);

#endif

This is a pretty basic header file and just provides the signature/definition for a single function. Next we need to write a C program that uses it.

// hello.c
#include <stdio.h>
#include <stdlib.h>
#include "hello_from_rust.h"

int main(int argc, char *argv[]) {
    hello_from_rust("Jared!");
}

And to compile it we run:

gcc -Wall -o hello_c hello.c -L /Users/jmcfarland/code/rust/php-hello-rust -lhello_from_rust -lSystem -lpthread -lc -lm

Notice the -lSystem -lpthread -lc -lm at the end telling gcc to link against those “native artifacts” in the order they were given by the Rust compiler when we compiled our Rust library.

Once we run that we’ll have a binary, hello_c, that we can run:

$ ./hello_c
Hello from Rust, Jared!

Nice! We just called a Rust library from C. Now we just need to figure out how to shoe-horn that into a PHP extension.

Calling C from PHP #

This part took me a while to figure out. The documentation on how to write a PHP extension isn’t the best in the world. The nicest part is that the PHP source comes bundled with a script, ext_skel (which probably stands for “extension skeleton”), that will generate the majority of the boilerplate code you’ll need. I leaned pretty heavily on this PHP doc, “Extension structure”, while trying to get this to work.

You start by downloading and untarring the PHP source, cd'ing into the PHP directory and running:

$ cd ext/
$ ./ext_skel --extname=hello_from_rust

This will generate the basic skeleton needed to create a PHP extension. Now move that folder to wherever you want to keep your extension locally, and move your Rust source (hello_from_rust.rs), C header (hello_from_rust.h) and compiled library (libhello_from_rust.a) into that same directory. So now you should have a directory that looks like this:

.
├── CREDITS
├── EXPERIMENTAL
├── config.m4
├── config.w32
├── hello_from_rust.c
├── hello_from_rust.h
├── hello_from_rust.php
├── hello_from_rust.rs
├── libhello_from_rust.a
├── php_hello_from_rust.h
└── tests
    └── 001.phpt

1 directory, 11 files

You can see a good description of what most of these files are at the PHP Docs, “Files which make up an extension”. We’ll start by editing config.m4.

Without all the comments, here’s what mine looked like after I got it to work:

PHP_ARG_WITH(hello_from_rust, for hello_from_rust support,
[  --with-hello_from_rust             Include hello_from_rust support])

if test "$PHP_HELLO_FROM_RUST" != "no"; then
  PHP_SUBST(HELLO_FROM_RUST_SHARED_LIBADD)

  PHP_ADD_LIBRARY_WITH_PATH(hello_from_rust, ., HELLO_FROM_RUST_SHARED_LIBADD)

  PHP_NEW_EXTENSION(hello_from_rust, hello_from_rust.c, $ext_shared)
fi

As I understand it, these are basically macros. The documentation for them is pretty awful (for example, googling “PHP_ADD_LIBRARY_WITH_PATH” turns up no front-page results written by the PHP team). I stumbled upon the PHP_ADD_LIBRARY_WITH_PATH macro in some ancient thread where someone talked about linking a static library to a PHP extension. The rest of the macros in use were recommended in the comments generated when I ran ext_skel.

Now that we have the build configuration set up, we need to actually call into our library from the extension. To do this we modify the automagically generated file, hello_from_rust.c. First we add our hello_from_rust.h header file to the includes, and then we modify the PHP function definition of confirm_hello_from_rust_compiled.

#include "hello_from_rust.h"

// a bunch of comments and code removed...

PHP_FUNCTION(confirm_hello_from_rust_compiled)
{
    char *arg = NULL;
    int arg_len, len;
    char *strg;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &arg, &arg_len) == FAILURE) {
        return;
    }

    hello_from_rust("Jared (from PHP!!)!");

    len = spprintf(&strg, 0, "Congratulations! You have successfully modified ext/%.78s/config.m4. Module %.78s is now compiled into PHP.", "hello_from_rust", arg);
    RETURN_STRINGL(strg, len, 0);
}

Notice I added hello_from_rust("Jared (from PHP!!)!");.

Now we can try to build our extension:

$ phpize
$ ./configure
$ sudo make install

That’s it! phpize generates the configure script for the extension, ./configure generates the Makefile, and make install builds the extension and installs it. I have to use sudo personally when installing because my user doesn’t own the PHP extension install directory.

Now we can run it!

$ php hello_from_rust.php
Functions available in the test extension:
confirm_hello_from_rust_compiled

Hello from Rust, Jared (from PHP!!)!
Congratulations! You have successfully modified ext/hello_from_rust/config.m4. Module hello_from_rust is now compiled into PHP.
Segmentation fault: 11

Sweet! PHP is reaching into our C extension, seeing our list of available functions and calling one. Then the C extension is reaching into our Rust library and printing our string. That’s rad! But… what’s up with that segfault at the end?

As far as I can tell, it’s related to using the Rust println! macro, but I haven’t dug into it that far. If we remove that and return a char* instead from our Rust library, the segfault goes away.

Here’s the Rust code:

#![crate_type = "staticlib"]

#![feature(libc)]
extern crate libc;
use std::ffi::{CStr, CString};

#[no_mangle]
pub extern "C" fn hello_from_rust(name: *const libc::c_char) -> *const libc::c_char {
    let buf_name = unsafe { CStr::from_ptr(name).to_bytes() };
    let str_name = String::from_utf8(buf_name.to_vec()).unwrap();
    let c_name   = format!("Hello from Rust, {}", str_name);

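    // into_raw() hands ownership of the buffer to the caller. as_ptr() would
    // return a pointer to memory that is freed when this function returns.
    // The caller has to pass it back to Rust (CString::from_raw) to free it.
    CString::new(c_name).unwrap().into_raw()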
}

And the C header change:

#ifndef HELLO_FROM_RUST_H
#define HELLO_FROM_RUST_H

const char * hello_from_rust(const char *name);

#endif

And the C extension change:

PHP_FUNCTION(confirm_hello_from_rust_compiled)
{
    char *arg = NULL;
    int arg_len, len;
    char *strg;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &arg, &arg_len) == FAILURE) {
        return;
    }

    /* const to match the return type declared in hello_from_rust.h */
    const char *str = hello_from_rust("Jared (from PHP!!)!");
    printf("%s\n", str);

    len = spprintf(&strg, 0, "Congratulations! You have successfully modified ext/%.78s/config.m4. Module %.78s is now compiled into PHP.", "hello_from_rust", arg);
    RETURN_STRINGL(strg, len, 0);
}
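Returning a char* from Rust raises the question of who frees it. CString::as_ptr only borrows the buffer, so the pointer is valid only while the CString is alive, whereas CString::into_raw transfers ownership to the caller, who then has to hand the pointer back to Rust to release it. Here’s a rough sketch of that pairing. Both function names are hypothetical (the repo doesn’t have them), and I’m again assuming std::os::raw rather than the libc crate.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Returns a heap-allocated greeting, or null on bad input.
/// The caller owns the result and must release it with
/// `hello_from_rust_free` (hypothetical companion function below).
#[no_mangle]
pub extern "C" fn hello_from_rust_owned(name: *const c_char) -> *mut c_char {
    if name.is_null() {
        return std::ptr::null_mut();
    }
    // SAFETY: `name` is non-null; the caller guarantees NUL termination.
    let name = unsafe { CStr::from_ptr(name) }.to_string_lossy();
    match CString::new(format!("Hello from Rust, {}", name)) {
        // into_raw() gives up ownership, so the buffer is not dropped here.
        Ok(greeting) => greeting.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Frees a string previously returned by `hello_from_rust_owned`.
#[no_mangle]
pub extern "C" fn hello_from_rust_free(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: `ptr` must come from `hello_from_rust_owned` and must not
    // be freed more than once.
    unsafe { drop(CString::from_raw(ptr)) };
}
```

On the C side the extension would call hello_from_rust_free(str) after the printf. Calling C’s free() on the pointer instead is not safe, because the memory was allocated by Rust’s allocator.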

Useless micro-benchmarks #

So why might you want to do this? Well, I honestly don’t have a real world use case for this yet. But I did think of a good example of how a PHP extension could be radical: the Fibonacci sequence. The naive algorithm is pretty straightforward (here it is in Ruby):

def fib(at)
    if (at == 1 || at == 0)
        return at
    else
        return fib(at - 1) + fib(at - 2)
    end
end

And it has really horrible performance (the recursion takes exponential time), which can be improved by not using recursion:

def fib(at)
    if (at == 1 || at == 0)
        return at
    end

    total  = 1
    parent = 1
    gp     = 0

    # total holds fib(2) after the first pass, so loop at - 1 times
    (1...at).each do |i|
        total  = parent + gp
        gp     = parent
        parent = total
    end

    return total
end

So let’s write two examples of this, one in PHP and one in Rust, and see which is faster. Here’s the PHP version:

<?php

function fib($at) {
    if ($at == 0 || $at == 1) {
        return $at;
    } else {
        $total  = 1;
        $parent = 1;
        $gp     = 0;

        for ($i = 1; $i < $at; $i++) {
            $total  = $parent + $gp;
            $gp     = $parent;
            $parent = $total;
        }

        return $total;
    }
}

for ($i = 0; $i < 100000; $i ++) {
    fib(92);
}

And here’s how long it takes to run:

$ time php php_fib.php

real    0m2.046s
user    0m1.823s
sys     0m0.207s

Now let’s do the Rust version. Here’s the library source:

#![crate_type = "staticlib"]

fn fib(at: usize) -> usize {
    if at == 0 {
        return 0;
    } else if at == 1 {
        return 1;
    }

    let mut total  = 1;
    let mut parent = 1;
    let mut gp     = 0;
    for _ in 1 .. at {
        total  = parent + gp;
        gp     = parent;
        parent = total;
    }

    return total;
}

#[no_mangle]
pub extern "C" fn rust_fib(at: usize) -> usize {
    fib(at)
}

Quick note, I compiled the library with rustc -O rust_lib.rs to enable compiler optimizations (because we are benchmarking here). Here’s the C extension source (relevant excerpt):

PHP_FUNCTION(confirm_rust_fib_compiled)
{
    long number;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l", &number) == FAILURE) {
        return;
    }

    RETURN_LONG(rust_fib(number));
}
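A side note on why the benchmark stops at 92: as far as I can tell that’s the largest Fibonacci index whose value still fits in a signed 64-bit integer, which is what PHP uses for integers on 64-bit builds. Past that, an optimized Rust build wraps silently on overflow by default, and RETURN_LONG would reinterpret a large unsigned value as a negative number. Here’s a minimal sketch of a variant that reports overflow instead. The names checked_fib and rust_fib_checked, and the -1 sentinel, are hypothetical choices of mine.

```rust
/// Iterative Fibonacci that returns None on u64 overflow
/// instead of wrapping or panicking.
fn checked_fib(at: u32) -> Option<u64> {
    if at == 0 {
        return Some(0);
    }
    let mut prev: u64 = 0; // F(0)
    let mut curr: u64 = 1; // F(1)
    for _ in 1..at {
        // checked_add yields None on overflow; `?` propagates it.
        let next = prev.checked_add(curr)?;
        prev = curr;
        curr = next;
    }
    Some(curr)
}

/// Hypothetical FFI wrapper: returns -1 when the result does not fit
/// in a signed 64-bit integer.
#[no_mangle]
pub extern "C" fn rust_fib_checked(at: u32) -> i64 {
    checked_fib(at)
        .filter(|&value| value <= i64::MAX as u64)
        .map(|value| value as i64)
        .unwrap_or(-1)
}
```

A sentinel works here because no Fibonacci number is negative; a real extension might prefer to raise a PHP warning or return false instead.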

And the PHP script to run:

<?php
$br = (php_sapi_name() == "cli")? "":"<br>";

if(!extension_loaded('rust_fib')) {
    dl('rust_fib.' . PHP_SHLIB_SUFFIX);
}

for ($i = 0; $i < 100000; $i ++) {
    confirm_rust_fib_compiled(92);
}
?>

And here’s it running:

$ time php rust_fib.php

real    0m0.586s
user    0m0.342s
sys     0m0.221s

You can see that it’s ~3.5x as fast (0.586s vs. 2.046s of wall-clock time)! Sweet micro-benchmarks for Rust!
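Keep in mind that time php measures the whole process, including interpreter startup and loading the extension, so both numbers include overhead that has nothing to do with Fibonacci. To get a feel for how long the Rust loop takes on its own, one could time it directly. This is only a rough sketch: time_fib is a hypothetical helper, fib is rewritten here with u64 so the block stands alone, and std::hint::black_box needs a reasonably recent Rust toolchain.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Same iterative algorithm as the library above, with u64 spelled out.
fn fib(at: u64) -> u64 {
    if at < 2 {
        return at;
    }
    let mut total = 1;
    let mut parent = 1;
    let mut gp = 0;
    for _ in 1..at {
        total = parent + gp;
        gp = parent;
        parent = total;
    }
    total
}

/// Calls `fib(at)` `iterations` times and returns the last result
/// together with the elapsed wall-clock time.
fn time_fib(at: u64, iterations: u32) -> (u64, Duration) {
    let start = Instant::now();
    let mut last = 0;
    for _ in 0..iterations {
        // black_box discourages the optimizer from hoisting or
        // deleting the repeated calls.
        last = black_box(fib(black_box(at)));
    }
    (last, start.elapsed())
}
```

Calling time_fib(92, 100_000) mirrors the loop in the PHP scripts. Compile with -O as before, otherwise the comparison isn’t meaningful.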

Conclusions #

There’s really almost nothing to conclude here. I’m not sure where writing a PHP extension in Rust is a good idea, but it was a fun way to spend a few hours digging into Rust, PHP and C.

If you want to see all of the code and screw around with it, go ahead and take a look at the GitHub Repo for it.

 