Blog Archives

Using mutate from dplyr inside a function: getting around non-standard evaluation

To edit or add columns to a data.frame, you can use mutate from the dplyr package:
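
A minimal sketch of such a call, assuming the built-in mtcars data set; the new column name `mpg_per_wt` is hypothetical and chosen only for illustration:

```r
library(dplyr)

# Add a hypothetical column `mpg_per_wt` to the built-in mtcars data set.
# mpg and wt are written as bare names; mutate looks them up
# as columns of the data.frame passed as the first argument.
mtcars_new <- mutate(mtcars, mpg_per_wt = mpg / wt)

# Inspect the first rows, including the new column
head(mtcars_new)
```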

Here, dplyr uses non-standard evaluation in finding the contents for mpg and wt, knowing that it needs to look in the context of

See more ›

Posted in R stuff

Parsing a large amount of characters into a POSIXct object

When trying to parse a large number of datetime character strings into POSIXct objects, it struck me that strftime and as.POSIXct were actually quite slow. The parsing functions from lubridate were a lot faster. The following benchmark shows

See more ›

Posted in R stuff

Tutorials from a course I taught are freely available, including ggplot2, dplyr and shiny

I was asked to write an R course for a group of innovative companies in the north of the Netherlands. The group of 12 people was a mix of engineers and programmers, and the course aimed at giving them a

See more ›

Posted in R stuff

The performance of dplyr blows plyr out of the water

Together with many other packages written by Hadley Wickham, plyr is a package that I use a lot for data processing. The syntax is clean, and it works great for breaking down larger data.frames into smaller summaries. The greatest disadvantage

See more ›

Posted in R stuff

Bubble sorting in R, C++ and Julia: code improvements and the R compiler

In the past few months I have written posts about implementing the bubble sort algorithm in different languages. In the meantime I have received some feedback and suggestions regarding improvements to the implementation I made, see the end of

See more ›

Posted in R stuff

Parallel processing with short jobs only increases the run time

Parallel processing has become much more important over the years as multi-core processors have become commonplace. From version 2.14 onwards, parallel processing has been part of the standard R installation in the form of the parallel package. This package

See more ›

Posted in R stuff

Much more efficient bubble sort in R using the Rcpp and inline packages

Recently I wrote a blog post showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that it was awfully slow. And by slow, I mean really slow, as in “a 100

See more ›

Posted in R stuff

Bubble sort implemented in pure R

Please note that I did this programming purely for the learning experience. The pure R bubble sort implemented in this post is veeeeery slow for two reasons: Interpreted code with lots of iteration is very slow. Bubble sort is

See more ›

Posted in R stuff

Parsing complex text files using regular expressions and vectorization

When text data is in a nice CSV format, read.csv is enough to parse it into a usable format. But if this is not the case, getting the data into a usable form is not so straightforward. In this post

See more ›

Posted in R stuff

Automatic spatial interpolation with R: the automap package

In case of continuously collected data, e.g. observations from a monitoring network, spatial interpolation of this data cannot be done manually. Instead, the interpolation should be done automatically. To achieve this goal, I developed the automap package. automap builds on

See more ›

Posted in R stuff