Finding number of NAs across all columns, row by row, and assigning number to new variable

Question

I have a data frame with multiple columns and many rows. For each row, I want to count the NAs across all columns and store that count in a new variable (NACount). Something like this:

Col1 Col2 Col3 Col4 NACount
A    A    B    NA   1
B    B    NA   NA   2

I built a loop to do this but my data set is huge so the loop takes forever! Here is my code:

    for (i in seq_len(nrow(dat))) {
      # positions of the NA entries in row i
      temp <- which(is.na(dat[i, ]))
      # store how many NAs were found in this row
      dat$NACount[i] <- length(temp)
    }

Please help me find an easier approach/way to do this!

Thanks so much!

dat$na_count = apply(dat,1,function(x) sum(is.na(x))) – spazznolo, Jun 27 '19 at 16:50

score 3 · Accepted Answer · answered Jun 27 '19 at 16:50

Use rowSums:

dat[["NACount"]] <- rowSums(is.na(dat))
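As a quick check, here is a minimal sketch on the two example rows from the question. How the data frame is built is my assumption (character columns, with Col4 entirely NA); only the rowSums line comes from the answer itself. is.na(dat) returns a logical matrix, and rowSums counts the TRUE values in each row.

```r
# Example rows from the question (construction is an assumption)
dat <- data.frame(
  Col1 = c("A", "B"),
  Col2 = c("A", "B"),
  Col3 = c("B", NA),
  Col4 = c(NA, NA),
  stringsAsFactors = FALSE
)

# Count the NAs in each row and store them in a new column
dat[["NACount"]] <- rowSums(is.na(dat))

print(dat$NACount)  # 1 2
```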

This is much faster than, say, apply (roughly six times faster at the median in the benchmark below):

microbenchmark::microbenchmark(
  rowSums = rowSums(is.na(dat)), 
  apply = apply(dat, 1, function(x) sum(is.na(x)))
)

Output:

Unit: microseconds
    expr     min       lq     mean  median       uq      max neval cld
 rowSums  78.033  88.4245 112.5160 106.839 116.1365  439.751   100  a 
   apply 632.643 657.8040 768.2667 674.395 725.2615 6124.064   100   b
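The benchmark above does not show how dat was built, so the following is only a sketch for trying the comparison on a synthetic data frame. The name big, the size n, and the seed are hypothetical choices, and timings will vary by machine. It uses base R's system.time rather than microbenchmark, and first checks that both approaches return the same counts.

```r
set.seed(42)  # hypothetical synthetic data, not the data behind the timings above
n <- 10000
big <- as.data.frame(
  matrix(sample(c("A", "B", NA), n * 4, replace = TRUE), ncol = 4)
)

# Per-row NA counts computed both ways
via_rowsums <- rowSums(is.na(big))
via_apply <- apply(big, 1, function(x) sum(is.na(x)))

# Both approaches should agree on every row
stopifnot(all(via_rowsums == via_apply))

# Rough timing with base R only
print(system.time(rowSums(is.na(big))))
print(system.time(apply(big, 1, function(x) sum(is.na(x)))))
```

The gap tends to come from apply calling an R function once per row, while rowSums handles the whole logical matrix in one vectorized call.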
