I have a group of times that I have already converted to time intervals, and I would like to assign a unique ID to every unique time interval. The data frame y looks something like:

start        end
01:00:00     05:00:00
13:00:00     17:00:00
12:00:00     16:00:00
01:00:00     05:00:00
13:00:00     17:00:00

And I used the following code to create a time interval:

y$interval = data.frame(interval=paste(start,end))

and the results look like

start        end           interval
01:00:00     05:00:00      01:00:00 05:00:00
13:00:00     17:00:00      13:00:00 17:00:00
12:00:00     16:00:00      12:00:00 16:00:00
01:00:00     05:00:00      01:00:00 05:00:00
13:00:00     17:00:00      13:00:00 17:00:00

I would now like to create a new column in y that assigns a unique ID to every unique time interval:

start        end           interval               ID
01:00:00     05:00:00      01:00:00 05:00:00      1
13:00:00     17:00:00      13:00:00 17:00:00      2
12:00:00     16:00:00      12:00:00 16:00:00      3
01:00:00     05:00:00      01:00:00 05:00:00      1
13:00:00     17:00:00      13:00:00 17:00:00      2

I have tried using dplyr's group_indices:

y$id = group_indices(y$interval)

but it assigns ID number 1 to every interval. What should I do?

Thanks so much!

dddd_y

1 Answer

I was working on an answer that is really similar (the same?) to what @H 1 just posted. Note that all columns are character, because it is not clear from your example whether you are really working with time columns.

library(dplyr)

y <- data.frame(
  stringsAsFactors = FALSE,
             start = c("01:00:00","13:00:00",
                       "12:00:00","01:00:00","13:00:00"),
               end = c("05:00:00","17:00:00",
                       "16:00:00","05:00:00","17:00:00")
)

y %>% 
  mutate(interval = paste(start, end)) %>% 
  group_by(interval) %>% 
  mutate(id = group_indices())

#> # A tibble: 5 x 4
#> # Groups:   interval [3]
#>   start    end      interval             id
#>   <chr>    <chr>    <chr>             <int>
#> 1 01:00:00 05:00:00 01:00:00 05:00:00     1
#> 2 13:00:00 17:00:00 13:00:00 17:00:00     3
#> 3 12:00:00 16:00:00 12:00:00 16:00:00     2
#> 4 01:00:00 05:00:00 01:00:00 05:00:00     1
#> 5 13:00:00 17:00:00 13:00:00 17:00:00     3
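
Note that the ids above follow the sorted order of interval, which is why 13:00:00 17:00:00 gets 3 rather than 2. If you need the ids numbered by first appearance, as in the expected output in the question, a base R sketch along these lines might do it. The name y2 is only illustrative (it is not from the question), and the columns are again assumed to be character:

```r
# Same example data as above, rebuilt so the snippet is self-contained.
# `y2` is a hypothetical name used only for this illustration.
y2 <- data.frame(
  stringsAsFactors = FALSE,
  start = c("01:00:00", "13:00:00", "12:00:00", "01:00:00", "13:00:00"),
  end   = c("05:00:00", "17:00:00", "16:00:00", "05:00:00", "17:00:00")
)

# Build the interval label as a plain character column
y2$interval <- paste(y2$start, y2$end)

# match() returns, for each row, the position of its interval among the
# unique intervals, which are kept in order of first appearance
y2$id <- match(y2$interval, unique(y2$interval))

y2
```

Because the new columns are assigned directly to a plain data frame, there is no grouping left over to get in the way of later steps. If you would rather stay with the pipeline, in dplyr 1.0.0 and later cur_group_id() is the suggested replacement for calling group_indices() with no arguments inside mutate(), and adding ungroup() as the last step drops the grouping before you assign the result back to y.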
rdornas
  • Thanks, it worked. I am really new to pipe operators, and I would like to know how I can store these new columns, 'Interval' and 'ID', in the original dataset y for further use. I tried using y <- y %>%, but it said that the column interval can't be modified because it's a grouping variable. – dddd_y Mar 01 '20 at 01:02
  • Never mind, it worked! Thanks so much! – dddd_y Mar 01 '20 at 01:07