
I have a dataset of protest events in the US and want to assign values to a district variable based on the state and the county. Basically, I want to "tell" the code: "If an event occurred a) in state X and b) in one of the counties Y, it belongs in district Z." I tried this:

#Alabama
dfprotest$district[dfprotest$ADMIN1=="Alabama" & dfprotest$ADMIN2=="Washington" | "Mobile" | "Baldwin" | 
                                                          "Escambia" | "Monroe" | "Clarke"]<- "AL-1"

It didn't work though. Anyone want to help an R-noob out?

Gregor Thomas
    Noob (working) way is `dfprotest$ADMIN2=="Washington" | dfprotest$ADMIN2=="Mobile" | dfprotest$ADMIN2=="Baldwin"...`. Better way is `dfprotest$ADMIN2 %in% c("Washington", "Mobile", "Baldwin", ...)` – Gregor Thomas Nov 02 '20 at 20:08
    Alternately, pro way is to make a look-up table. Create a data frame that has columns `ADMIN1`, `ADMIN2` and `district`, use a merge/join to add the `district` column to `dfprotest`: `dplyr::left_join(dfprotest, district_lookup, by = c("ADMIN1", "ADMIN2"))`. – Gregor Thomas Nov 02 '20 at 20:11
    Here's a worked example of [using a lookup table](https://stackoverflow.com/q/64649153/903061). – Gregor Thomas Nov 02 '20 at 20:12
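
The lookup-table idea from the comments could look roughly like the sketch below. It uses base R's `merge()` rather than `dplyr::left_join()` so it runs without extra packages. The toy rows in `dfprotest` are invented for illustration, and `district_lookup` is a hypothetical table built from the county names in the question.

```r
# Toy stand-in for dfprotest; these rows are made up for illustration.
dfprotest <- data.frame(
  ADMIN1 = c("Alabama", "Alabama", "Georgia"),
  ADMIN2 = c("Mobile", "Monroe", "Monroe"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per (state, county) pair.
# The scalar ADMIN1 and district values are recycled across the six counties.
district_lookup <- data.frame(
  ADMIN1 = "Alabama",
  ADMIN2 = c("Washington", "Mobile", "Baldwin", "Escambia", "Monroe", "Clarke"),
  district = "AL-1",
  stringsAsFactors = FALSE
)

# Left join: all.x = TRUE keeps every protest row, and rows without
# a match in the lookup table get NA in the district column.
result <- merge(dfprotest, district_lookup,
                by = c("ADMIN1", "ADMIN2"), all.x = TRUE)
```

Because the join matches on both columns, Monroe County in Georgia is not assigned to AL-1. Note that `merge()` sorts the result by the key columns by default, so the row order may differ from the input.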

1 Answer


I would use dplyr's `mutate()` together with `case_when()`, checking the state as well as the county:

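library(dplyr)
dfprotest <- dfprotest %>%
    # match on both state (ADMIN1) and county (ADMIN2); rows that match no condition get NA
    mutate(district = case_when(
        ADMIN1 == "Alabama" &
            ADMIN2 %in% c("Washington", "Mobile", "Baldwin", "Escambia", "Monroe", "Clarke") ~ "AL-1"))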
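
For comparison, the same condition can be written in base R, staying close to the indexing style in the question. The original attempt fails because `"Washington" | "Mobile"` applies `|` to character strings, which R does not allow; `%in%` tests membership in the whole set of counties at once. This is a minimal sketch with invented toy rows, and `al1_counties` and `is_al1` are names introduced here for illustration.

```r
# Toy stand-in for dfprotest; these rows are made up for illustration.
dfprotest <- data.frame(
  ADMIN1 = c("Alabama", "Alabama", "Georgia"),
  ADMIN2 = c("Mobile", "Jefferson", "Monroe"),
  stringsAsFactors = FALSE
)

# Counties listed in the question for district AL-1.
al1_counties <- c("Washington", "Mobile", "Baldwin", "Escambia", "Monroe", "Clarke")

# Start with an empty district column so unmatched rows are explicitly NA.
dfprotest$district <- NA_character_

# Logical vector: TRUE only where both the state and the county match.
is_al1 <- dfprotest$ADMIN1 == "Alabama" & dfprotest$ADMIN2 %in% al1_counties

# Assign the district only to the matching rows.
dfprotest$district[is_al1] <- "AL-1"
```

Repeating the last two statements with a different county vector and label would fill in further districts without overwriting the ones already assigned.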
markus
hachiko