Ruby's Bowels: Array
Exactly how Ruby’s arrays work has always been somewhat of a mystery to me. They’re certainly fast enough for most things, but where exactly they shine and where exactly they fail was for me and, I imagine, most Ruby programmers not well known.
So, let’s get right down to it. Ruby’s Arrays are backed by actual C arrays.
When you first declare a basic array ([] or Array.new)
an instance of the RArray struct is created, and it points to a
C array which is allocated with a size of 16. (As indiciated in array.c by the
const ARY_DEFAULT_SIZE.)
rb_ary_store function in array.c. rb_ary_store is the
function which runs when you try to store a new value in an array. The snippet
below is what runs when the index of that value exceeds the currently allocated
space.
if (idx >= RARRAY(ary)->aux.capa) {
long new_capa = RARRAY(ary)->aux.capa / 2;
if (new_capa < ARY_DEFAULT_SIZE) {
new_capa = ARY_DEFAULT_SIZE;
}
new_capa += idx;
if (new_capa * (long)sizeof(VALUE) <= new_capa) {
rb_raise(rb_eArgError, "index too big");
}
REALLOC_N(RARRAY(ary)->ptr, VALUE, new_capa);
RARRAY(ary)->aux.capa = new_capa;
}
Alright, let’s break this down. RARRAY(ary)->aux.capa is the
current size of the C array. (Note that this is not necessarily the same size
as the Ruby array, but it is at least the size of said Ruby array.) A few
lines down from where we see the initial use of the above statement we see:
new_capa += idx. idx refers to the index of the value
that we are currently trying to store in the array, which has caused this code
to run.
Now, new_capa is now the old size divided by 2 plus the index of
the value we’re trying to add.
After a check to make sure we haven’t exceeded the maximum size of Ruby’s max
array size, we run REALLOC_N (a helper function defined in the Ruby
source code) to increase the size of C array to new_capa and
finally we store new_capa in place of the old value.
So, let’s take a snippet of Ruby code and see what this code would try to do.
a = [] # the C array is initialized to 16 elements a[16] = 'foo' # it is reallocated to (16 / 2 + 16) or 24 a[50] = 'blah!' # now it is (24 / 2 + 50) or 62
So, what if you were to run the following, incredibly common, code in Ruby?
a = []
100_000.times do |i|
a << 'shenanigans!'
end
At what indices do the reallocations take place? You can use the following little snippet to figure that out.
x = 16
ultimate_size = 100_000
ultimate_size.times do |i|
if i >= x
x = x / 2 + i
print "#{x}, "
end
end
That blurb yields the following output: 24, 36, 54, 81, 121, 181, 271, 406, 609, 913, 1369, 2053, 3079, 4618, 6927, 10390, 15585, 23377, 35065, 52597, 78895, 118342. So, appending 100,000 elements to sequentially to an empty Ruby array means reallocating the C array backing it 22 times. It also means that the larger the size of the array the greater the amount of wasted memory. After the last reallocation the C array has a size of 118,342. We end up using 100,000. So we end up with 18,342 bytes of wasted memory. 18k isn’t terrible. But how about a 1 million element array? How much is wasted for that? It ends up being 347,984 bytes.
That sucks, but if you know how many things you’re going to be putting into it,
this problem has a solution. Array.new(n) where n is
the number of elements. Ruby will initialize the C array to the correct number
of elements, and no reallocations will have to take place.
So, you might be thinking that doing so many reallocations is probably really slow, right? Well… Not really. Keep in mind that the difference between the number of allocations for 100,000 elements and 1,000,000 elements is only 6. It goes from 22 to 28. (Thus the increase in wasted memory.) Beyond that, reallocations are really pretty fast.
The reason I started looking into Ruby’s arrays in more depth is because I’m working on a package to extend the number of data structures available in Ruby. There are of course, many many more data structures than the standard Array and Hash that Ruby offers. (I know there are more, but those are the most common.) It seems to me that offering more data structures could be a very good thing… If people can figure out which ones are best for their problem, of course.
As part of that, I think that offering linked lists could be a nifty feature, depending on what you’re doing. I figured that adding an arbitrary number of elements might be somewhat faster, due to not having to reallocate things. Turns out I was wrong. Here’s the results of sequentially appending 1,000,000 elements onto a blank list and a blank array.
user system total real
list: 0.430000 0.020000 0.450000 ( 0.449394)
array: 0.330000 0.000000 0.330000 ( 0.332871)
Turns out that (using my linked list implementation anyway) it’s slower. Now, of course, that’s just when appending to the array. Pushing things onto the front of the array is a totally different story.
Since Ruby arrays are backed by actual C arrays, pushing something onto the front of the array means moving all of the rest of the elements one byte forward. Here, the linked lists shine. (I reduced the amount to 100,000 elements, because I’m impatient.)
user system total real
list: 0.040000 0.000000 0.040000 ( 0.046550)
array: 3.840000 0.010000 3.850000 ( 3.855183)
So, in this case the code runs about 100 times slower than the equivalent
appending code. The story gets more interesting when we change the number of
elements we’re appending. The above 100x slowdown was for 100,000 records.
My results for 50,000 records was a 50x slowdown. 170x for 200,000 records, and
358x for 400,000 records. This all points to the fact that unshift
as well as shift are O(n) operations. (As opposed to the O(1) of
appending.)
Of course, random access is in Ruby arrays is blazingly fast as well.
So, to sum up: Ruby arrays are backed by actual C arrays, which are reallocated on the fly. Appending to Ruby arrays is O(1) and fast. Prepending to Ruby arrays is O(n)... Don’t do it.