2.2 Excel

No discussion of survey datasets would be complete without a mention of good old Excel. Our advice is to avoid Excel if possible - Excel files can have strange import issues, particularly with dates and formulas.
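As a small illustration of the date problem, a date column sometimes arrives in R as plain numbers rather than dates. The sketch below assumes the workbook uses Excel's default 1900 date system, where those numbers count days from 1899-12-30; the serial values are invented for illustration.

```r
# Hypothetical serial numbers, as a date column might appear after a bad import
serials <- c(44256, 44257, 44260)

# Assumption: default 1900 date system, so day 0 corresponds to 1899-12-30
converted <- as.Date(serials, origin = "1899-12-30")

print(converted)
# "2021-03-01" "2021-03-02" "2021-03-05"
```

Workbooks saved with the 1904 date system use a different origin, so it is worth checking a few converted values against the original file.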

Where it is necessary to import Excel files, there are many choices, but we would recommend these packages:

  • openxlsx - a custom Excel library built in C++ with an R frontend. In our experience this package has the fewest weird Excel issues, and it is fantastic for writing and styling output tables.

  • readxl - part of the tidyverse, built on the libxls C library and the RapidXML C++ library. It is very fast and reliable, but can do unexpected things when trying to guess column types. It does not support writing Excel workbooks.

  • xlsx - a wrapper for the Apache POI Java library. Apache POI is well maintained, but it has extremely high memory usage for larger datasets.
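A minimal sketch of how the first two packages might be combined: openxlsx writes and styles a workbook, and readxl reads it back with explicit column types so that nothing is left to type guessing. The data frame, sheet name and column names are hypothetical, and both packages are assumed to be installed.

```r
library(openxlsx)
library(readxl)

# Hypothetical survey extract, invented for illustration
survey <- data.frame(
  respondent_id  = c("001", "002", "003"),
  interview_date = as.Date(c("2021-03-01", "2021-03-02", "2021-03-05")),
  weight         = c(1.25, 0.80, 1.10),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".xlsx")

# Write the table with openxlsx and make the header row bold
wb <- createWorkbook()
addWorksheet(wb, "responses")
writeData(wb, "responses", survey)
header_style <- createStyle(textDecoration = "bold")
addStyle(wb, "responses", header_style,
         rows = 1, cols = 1:ncol(survey), gridExpand = TRUE)
saveWorkbook(wb, path, overwrite = TRUE)

# Read it back with readxl, declaring each column type explicitly
# instead of relying on readxl's guesses
imported <- read_excel(
  path,
  sheet = "responses",
  col_types = c("text", "date", "numeric")
)

str(imported)
```

Declaring `respondent_id` as text is what preserves leading zeros in identifiers; readxl returns columns declared as `date` as date-times, so `as.Date()` may still be needed afterwards.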