R data.table symbols and operators you should know

R data.table code becomes more efficient — and elegant — when you take advantage of its special symbols and functions. With that in mind, we’ll look at some special ways to subset, count, and create new columns.

For this demo, I’m going to use data from the 2019 Stack Overflow Developer Survey, which has about 90,000 responses. If you want to follow along, you can download the data from Stack Overflow.

If the data.table package is not installed on your system, install it from CRAN and then load it as usual with library(data.table). To start, you may want to read in just the first few rows of the data set to make it easier to examine the data structure. You can do that with data.table’s fread() function and the nrows argument. I’ll read in 10 rows:

data_sample <- fread("data/survey_results_public.csv", nrows = 10)

As you’ll see, there are 85 columns to examine. (If you want to know what all the columns mean, there are files in the download with the data schema and a PDF of the original survey.) 
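One way to examine the structure of a small sample is with a few base R and data.table calls. The sketch below is only an illustration: so that it runs without the survey download, it writes a tiny stand-in CSV to a temporary file, and the three columns and their values are made up for the example. With the real file, you would point fread() at data/survey_results_public.csv instead.

```r
library(data.table)

# Hypothetical stand-in for the survey file so this sketch runs without the
# download; the columns and values below are invented for illustration.
tmp <- tempfile(fileext = ".csv")
fwrite(
  data.table(
    Respondent = 1:20,
    Hobbyist   = rep(c("Yes", "No"), 10),
    Country    = rep(c("Canada", "India", "Germany", "Brazil"), 5)
  ),
  tmp
)

# Read only the first 10 data rows, as in the text
data_sample <- fread(tmp, nrows = 10)

dim(data_sample)    # number of rows and columns
names(data_sample)  # column names
str(data_sample)    # column types and the first few values of each column
```

Reading a small sample first keeps these checks fast, which helps when you are still deciding which of the columns you need.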

To read in all the data, I’ll use:

Copyright © 2020 IDG Communications, Inc.
