Blog Archives

Using mutate from dplyr inside a function: getting around non-standard evaluation

To edit or add columns to a data.frame, you can use mutate from the dplyr package:
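The example from the original post is not part of this excerpt. A minimal sketch, assuming the built-in mtcars dataset (which has mpg and wt columns) and a hypothetical new column name mpg_per_wt, might look like this:

```r
library(dplyr)

# Assumption: mtcars stands in for the data.frame used in the original post.
# mutate() resolves mpg and wt as columns of mtcars, not as global variables.
result <- mutate(mtcars, mpg_per_wt = mpg / wt)

# Show the two source columns next to the newly added one.
head(result[, c("mpg", "wt", "mpg_per_wt")])
```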

Here, dplyr uses non-standard evaluation to find the contents of mpg and wt, knowing that it needs to look in the context of

Posted in R stuff

Parsing a large amount of characters into a POSIXct object

When trying to parse a large number of datetime strings into POSIXct objects, it struck me that strftime and as.POSIXct were actually quite slow. The parsing functions from lubridate were a lot faster. The following benchmark shows

Posted in R stuff

Tutorials freely available from a course I taught, including ggplot2, dplyr and shiny

I was asked to write an R course for a group of innovative companies in the North of the Netherlands. The group of 12 people was a mix of engineers and programmers, and the course aimed at giving them a

Posted in R stuff

Data mining with R course in the Netherlands taught by Luis Torgo

In the course of this year, Dr. Luis Torgo will teach a Data Mining with R course together with the DIKW Academy in Nieuwegein, The Netherlands. Dr. Torgo is an Associate Professor at the department of Computer Science at the

Posted in R stuff

Vectorisation is your best friend: replacing many elements in a character vector

Like any programming language, R allows you to tackle the same problem in many different ways or styles. These styles differ in the amount of code, readability, and speed. In this post I want to illustrate this by

Posted in R stuff

The performance of dplyr blows plyr out of the water

Together with many other packages written by Hadley Wickham, plyr is a package that I use a lot for data processing. The syntax is clean, and it works great for breaking down larger data.frames into smaller summaries. The greatest disadvantage

Posted in R stuff

Bubble sorting in R, C++ and Julia: code improvements and the R compiler

In the past few months I have written posts about implementing the bubble sort algorithm in different languages. In the meantime I have received some feedback and suggestions for improving the implementation I made, see the end of

Posted in R stuff

Parallel processing with short jobs only increases the run time

Parallel processing has become much more important over the years as multi-core processors have become commonplace. From version 2.14 onwards, parallel processing has been part of the standard R installation in the form of the parallel package. This package

Posted in R stuff

Julia is lightning fast: bubble sort revisited

I had heard the name of the new technical computing language Julia buzzing around for some time already. Now during Christmas I had some time on my hands, and implemented the bubble sort algorithm that I have already posted about

Posted in R stuff

Much more efficient bubble sort in R using the Rcpp and inline packages

Recently I wrote a blog post showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that it was awfully slow. And by slow, I mean really slow, as in “a 100

Posted in R stuff