To estimate whether a vector of numbers will fit into memory, you can predict its memory usage from its length. An `integer` vector uses 4 bytes per number, and a `numeric` vector 8 bytes (double-precision float). The following function prints the estimated memory usage of a vector based on its length and type:

```r
predict_data_size = function(numeric_size, number_type = "numeric") {
  if(number_type == "integer") {
    byte_per_number = 4
  } else if(number_type == "numeric") {
    byte_per_number = 8
  } else {
    stop(sprintf("Unknown number_type: %s", number_type))
  }
  estimate_size_in_bytes = (numeric_size * byte_per_number)
  class(estimate_size_in_bytes) = "object_size"
  print(estimate_size_in_bytes, units = "auto")
}
```

For example:

```r
> predict_data_size(1518*1518, "numeric")
17.6 Mb
> predict_data_size(1518*1518, "integer")
8.8 Mb
```

To print the size of the vector in a nice format, I change the class of `estimate_size_in_bytes` to `"object_size"`. This way, when I call `print` on the object, R calls `print.object_size` (see `utils:::print.object_size` for the source), which performs the formatting.
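
The same class trick should also work with `format`, which returns the formatted string instead of printing it. The snippet below is a minimal sketch of that idea; `size_label` is a hypothetical helper name that does not appear in the function above.

```r
# Hypothetical helper (not part of the original function): tag a raw byte
# count as "object_size" and return the human-readable string rather than
# printing it.
size_label = function(n_bytes) {
  class(n_bytes) = "object_size"
  format(n_bytes, units = "auto")
}

# Same byte count as the numeric example: 1518 * 1518 doubles of 8 bytes each
label = size_label(1518 * 1518 * 8)
print(label)
```

Returning a string can be handy if you want to put the estimate in a log message or a table instead of on the console.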

You can also use this function to estimate the size of matrices and multi-dimensional arrays: only the total number of elements matters, so pass the product of the dimensions. Note that the R object (vector, matrix, array) takes a little more space for its header and any metadata (e.g. dimnames), but for any decently sized object this is probably small compared to the size of the numbers.
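
To get a feel for how small that overhead is, you can compare the estimate against what `object.size` reports for a real matrix. This is only a rough check: the variable names are made up for illustration, and the exact overhead may differ between R versions and platforms.

```r
# Dimensions taken from the example above
n_row = 1518
n_col = 1518

# Estimate: number of elements times 8 bytes per double
estimated_bytes = n_row * n_col * 8

# Allocate an actual numeric matrix and ask R how large it is
m = matrix(0, nrow = n_row, ncol = n_col)
actual_bytes = as.numeric(object.size(m))

# Difference is the object header plus the dim attribute
overhead_bytes = actual_bytes - estimated_bytes
print(overhead_bytes)
print(overhead_bytes / actual_bytes)  # fraction of the total taken by overhead
```

Note that this check allocates the matrix, so it is only useful for sizes you already know will fit; the point of the estimate is to avoid that allocation for larger objects.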
