5 handy options in R data.table’s fread

Like all functions in the data.table R package, fread is fast. Very fast. But there’s more to fread than speed. It has several helpful features and options when importing external data into R. Here are five of the most useful.

Note: If you’d like to follow along, download the New York Times CSV file of daily Covid-19 cases by U.S. county at https://github.com/nytimes/covid-19-data/raw/master/us-counties.csv.

Use fread’s nrows option

Is your file large? Would you like to examine its structure before importing the whole thing – without having to open it in a text editor or Excel? Use fread’s nrows option to import only a portion of a file for exploration.

The code below imports just the first 10 rows of the CSV.

library(data.table)
mydt10 <- fread("us-counties.csv", nrows = 10)

If you just want to see column names without any data at all, you can use nrows = 0.
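Here is a minimal sketch of both uses of nrows. So that it runs without the download, it first writes a tiny stand-in file to a temporary path. The sample rows and the names sample_csv, preview and header_only are illustrative assumptions, and the column names are assumed to match the Times CSV. To try it on the real data, swap in "us-counties.csv" for the temporary path.

```r
library(data.table)

# Illustrative stand-in for us-counties.csv (sample rows only; the column
# names are assumed to match the real file).
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "date,county,state,fips,cases,deaths",
  "2020-01-21,Snohomish,Washington,53061,1,0",
  "2020-01-22,Snohomish,Washington,53061,1,0",
  "2020-01-23,Snohomish,Washington,53061,1,0"
), sample_csv)

# Import only the first two data rows, then look at the column types.
preview <- fread(sample_csv, nrows = 2)
str(preview)

# Import zero rows: the result keeps the column names but holds no data.
header_only <- fread(sample_csv, nrows = 0)
names(header_only)
nrow(header_only)
```

Running str() on a small preview is a quick way to check how fread guessed each column's type before you commit to reading the full file.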

Use fread’s select option

Once you know the file structure, you can choose which columns to import. fread’s select option lets you pick the columns you want to keep. select takes a vector of either column names or column-position numbers. If you use names, they need to be in quotation marks, like most vectors of character strings.
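As a rough sketch of the two forms select accepts, the code below reads the same four columns once by name and once by position. It uses a small stand-in file so it runs on its own. The sample rows, the choice of columns and the names sample_csv, by_name and by_position are illustrative assumptions, not code from the original article.

```r
library(data.table)

# Illustrative stand-in for us-counties.csv (sample rows only; the column
# names are assumed to match the real file).
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "date,county,state,fips,cases,deaths",
  "2020-01-21,Snohomish,Washington,53061,1,0",
  "2020-01-22,Snohomish,Washington,53061,1,0"
), sample_csv)

# Select by name: a character vector, so each name is quoted.
by_name <- fread(sample_csv, select = c("date", "county", "state", "cases"))

# Select by position: the same four columns, given as numbers.
by_position <- fread(sample_csv, select = c(1, 2, 3, 5))

names(by_name)
names(by_position)
```

Names tend to be the safer choice when a file's column order might change between downloads, while positions are handy for files with long or awkward headers.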

Copyright © 2020 IDG Communications, Inc.
