Creating time intervals and assigning a unique ID to every unique interval

Question

I have a group of times that I have already converted to time intervals, and I would like to assign a unique ID to every unique time interval. The data frame y looks something like:

start        end
01:00:00     05:00:00
13:00:00     17:00:00
12:00:00     16:00:00
01:00:00     05:00:00
13:00:00     17:00:00

And I used the following code to create a time interval:

y$interval = data.frame(interval=paste(start,end))

and the results look like

start        end           interval
01:00:00     05:00:00      01:00:00 05:00:00
13:00:00     17:00:00      13:00:00 17:00:00
12:00:00     16:00:00      12:00:00 16:00:00
01:00:00     05:00:00      01:00:00 05:00:00
13:00:00     17:00:00      13:00:00 17:00:00

I would now like to create a new column in y that assigns a unique ID to every unique time interval:

start        end           interval               ID
01:00:00     05:00:00      01:00:00 05:00:00      1
13:00:00     17:00:00      13:00:00 17:00:00      2
12:00:00     16:00:00      12:00:00 16:00:00      3
01:00:00     05:00:00      01:00:00 05:00:00      1
13:00:00     17:00:00      13:00:00 17:00:00      2

I have tried using dplyr's group_indice:

y$id = group_indices(y$interval)

but it assigns ID number 1 to every interval. What should I do?

Thanks so much!

Be attentive that the function has an additional `s`. Did you try `group_indices` instead? — rdornas, Mar 01 '20 at 00:31
I forgot to type the 's' in here. But I did have it in my R code. — dddd_y, Mar 01 '20 at 00:33
`group_indices()` works on grouped data, so you need to do `y %>% group_by(interval) %>% mutate(id = group_indices())`, or you can do `y$id <- match(y$interval, unique(y$interval))`. — Ritchie Sacramento, Mar 01 '20 at 00:42
@dddd_y You need to provide a sample of your data. Run `dput(head(y))` and paste the output into your question. — Ritchie Sacramento, Mar 01 '20 at 01:01
@H1, I was using time columns at first when I was running the pipe, and I've changed it to characters and it worked. — dddd_y, Mar 01 '20 at 01:10

score 0 · Accepted Answer · answered Mar 01 '20 at 00:51

I was working on an answer that is really similar (the same?) to what @H 1 just posted. Note that all columns are character, because it is not clear from your example whether you are really working with time columns.

library(dplyr)

y <- data.frame(
  stringsAsFactors = FALSE,
             start = c("01:00:00","13:00:00",
                       "12:00:00","01:00:00","13:00:00"),
               end = c("05:00:00","17:00:00",
                       "16:00:00","05:00:00","17:00:00")
)

y %>% 
  mutate(interval = paste(start, end)) %>% 
  group_by(interval) %>% 
  mutate(id = group_indices())

#> # A tibble: 5 x 4
#> # Groups:   interval [3]
#>   start    end      interval             id
#>   <chr>    <chr>    <chr>             <int>
#> 1 01:00:00 05:00:00 01:00:00 05:00:00     1
#> 2 13:00:00 17:00:00 13:00:00 17:00:00     3
#> 3 12:00:00 16:00:00 12:00:00 16:00:00     2
#> 4 01:00:00 05:00:00 01:00:00 05:00:00     1
#> 5 13:00:00 17:00:00 13:00:00 17:00:00     3
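
Note that the grouped approach numbers the intervals in sorted order, so `13:00:00 17:00:00` gets 3 here rather than the 2 shown in the question's desired output. If the IDs should follow order of first appearance instead, the `match()` idea from the comments is one option. The following is a minimal base R sketch on the same sample data, assuming the columns are character (the question does not show the real column types):

```r
# Same sample data as above, stored as character columns (an assumption).
y <- data.frame(
  start = c("01:00:00", "13:00:00", "12:00:00", "01:00:00", "13:00:00"),
  end   = c("05:00:00", "17:00:00", "16:00:00", "05:00:00", "17:00:00"),
  stringsAsFactors = FALSE
)

# Build the interval label as a plain character vector.
y$interval <- paste(y$start, y$end)

# unique() keeps the distinct intervals in order of first appearance, so
# match() returns, for each row, the position of its interval in that list.
y$id <- match(y$interval, unique(y$interval))

y$id
#> [1] 1 2 3 1 2
```

Because this assigns ordinary columns with `$<-`, `y` stays an ungrouped data frame.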

Thanks, it worked. I am really new to pipe operators, and I would like to know how I can store these new columns, 'interval' and 'id', in the original dataset y for further use. I tried using y <- y %>%, but it said that column interval can't be modified because it's a grouping variable. — dddd_y, Mar 01 '20 at 01:02
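
On the follow-up question: a likely cause of that error is that the stored result is still grouped by `interval`, so running the pipeline again tries to overwrite a grouping column. One way around it, sketched here on the sample data, is to finish the pipeline with `ungroup()` before assigning back to `y`. This assumes dplyr 1.0.0 or later, where `cur_group_id()` is the replacement for calling `group_indices()` with no arguments inside `mutate()`:

```r
library(dplyr)

# Sample data as character columns (an assumption; real types are not shown).
y <- data.frame(
  start = c("01:00:00", "13:00:00", "12:00:00", "01:00:00", "13:00:00"),
  end   = c("05:00:00", "17:00:00", "16:00:00", "05:00:00", "17:00:00"),
  stringsAsFactors = FALSE
)

y <- y %>%
  mutate(interval = paste(start, end)) %>%  # build the interval label
  group_by(interval) %>%                    # one group per distinct interval
  mutate(id = cur_group_id()) %>%           # integer ID of the current group
  ungroup()                                 # drop grouping before storing

# y now carries 'interval' and 'id' and is no longer grouped,
# so later mutate() calls can modify 'interval' again.
is_grouped_df(y)
#> [1] FALSE
```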
