Bubble sorting in R, C++ and Julia: code improvements and the R compiler

In the past few months I have written posts about implementing the bubble sort algorithm in different languages. In the mean while I have gotten some feedback and suggestions regarding improvements to the implementation I made, see the end of the post for the new source code of the algorithms. These often had a quite profound effect on performance.

One of the best tips was to use the R compiler, i.e. the compiler package which is now part of the standard R distribution. This works by simply calling cmpfun:

The following table presents the timings in microseconds of the different algorithms:

The following things I find striking:

  1. Compiling the for/while loop based R solution benefits massively form the compiler package, increasing the speed almost 5 fold.
  2. Compiling the recursive R based solution does not yield any improvements.
  3. The C++ solution is obviously much faster than any R based solution, increasing the speed between 1900 and 400 times.
  4. The Julia based solution is almost as fast as the C++ solution, which is very impressive for a high-level programming language.
  5. The native R sort is almost 8 times faster than the fastest bubble sort in C++ and Julia, but sort probably uses a faster algorithm that scales O(n log n) in stead of the 0(n^2) of the bubble sort algorithm.

R (recursive implementation). No improvements.

R (for/while loop implementation). This implementation was previously not present.

C++ (linked into R using Rcpp/inline). Precomputing vec.size() outside the loop improved performance by a factor of 2.

Julia. Subtly changing the definition of the for loop (1:(length(vec_in) - 1 - passes) vs [1:(length(vec_in) - 1 - passes)]) improved performance two fold.

Tagged with: , , ,
Posted in R stuff
5 Comments » for Bubble sorting in R, C++ and Julia: code improvements and the R compiler
  1. Julia is not interpreted — it’s just-in-time compiled.

    Bubble sort is slow, requiring n^2 operations rather than n * log n, like any library function would use.

    • Paul Hiemstra says:

      Thanks for the feedback, I will edit in some more details into my post. I chose to try and implement the bubble sort algorithm quite arbitrarily a few months back.

  2. Wayne Folta says:

    R’s sort uses Shellsort or Quicksort, which you can specify with the ‘method=’ keyword. It appears it may default to Shellsort.

    Note also that if the R structure being sorted has names, order will be preserved in the case of ties, which takes additional work but is probably what you want to have happen. I doubt that Julia does this, just as Julia doesn’t handle NA’s by default.

    • Arun says:

      Right, but quick sort isn’t stable. So using base, shell sort would be the only go to.

      • Jens Oehlschlägel says:

        Just for completeness: shellsort is not stable either, but has been made stable in R at additional cost. We should avoid comparing different algos when talking about speed of languages. We also should avoid toy-algos like bubble sort. With a serious N*log(N) algo like mergesort or quicksort, benchmarking gains more credibility, especially because it allows to seriously compare at different N, where beyond pure CPU the timings of memory allocation and cache-efficiency strongly inluence real-world performance. Finally: those serious divide-and-conquer algos are often for good reasons implemented recursively. If one language can compile a recursive algo bether than another language, that is an important result. BTW: the above function bubble_sort_recursive can by definition not benefit from compilation because it recalls bubble_sort, not itself. However recalling bubble_sort_recursive() or using the R convention Recall() doesn’t seem to speed-up recursive call either. However, this is likely not a problem of recursion itself but a consequence of creating multiple copies of the array to be sorted (R passes parameters by value). Compared to all that copying, the recursive overhead is minor. I pointed out already in 1998 that pass-by-value kills recursion performance in S+ (see http://math.yorku.ca/Who/Faculty/Monette/S-news/0190.html) and created package ref as a workaround (now defunct). Therefore Julia wins, R looses. Anyhow, something happens when compiling recursive R functions:

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

*