2

I got a problem about recreating sessions in Kusto. I got an event named client_session_start, which timetamp gives me info about when a player starts a new session. The problem is that I don't have an event for the session end, so I should label everything between those 2 events per each player with a new session ID. This is how it looks like

ID_player   Timestamp   event_name
aaa         12:00:00    client_session_start
aaa         12:30:00    client_session_start

In this case, I need to label everything between 12:00 and 12:30 and also the last session (past 12:30) which doesn't have the next ''client_session_start''.

I thought I could extract all the events named ''client_session_start'', rank them by user and timestamp, then merge the new session_id and use something like the SQL function last_value() to fill the gaps (something like this SQL QUERY replace NULL value in a row with a value from the previous known value)

but it seems I cannot do that in that way in Kusto. Any suggestion?

ianux22
  • 405
  • 4
  • 16
  • this should be solvable using either [row_window_session()](https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/row-window-session-function) or the [scan operator](https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/scan-operator). if you were to provide an input sample data set (with the additional records between session start events), and the expected output for it - people may be able to show you how to use the above operators. – Yoni L. Jun 02 '22 at 17:51

1 Answers1

1

partition & row_cumsum()

datatable (ID_player:string, Timestamp:timespan, event_name:string)
[
     "aaa", "12:00:00", "client_session_start"
    ,"aaa", "12:10:00", ""
    ,"aaa", "12:20:00", ""
    ,"aaa", "12:30:00", "client_session_start"
    ,"bbb", "12:00:00", "client_session_start"
    ,"bbb", "12:10:00", ""
    ,"bbb", "12:20:00", ""
    ,"bbb", "12:30:00", "client_session_start" 
    ,"bbb", "12:40:00", ""    
    ,"bbb", "12:50:00", "client_session_start"  
    ,"bbb", "13:00:00", ""        
    ,"ccc", "12:00:00", "client_session_start" 
    ,"ccc", "12:10:00", ""
    ,"ccc", "12:20:00", ""
    ,"ccc", "12:30:00", ""
    ,"ccc", "12:40:00", ""             
]
|   partition hint.strategy=native by ID_player
    (
            order by Timestamp asc
        |   extend session_index = row_cumsum(case(event_name == "client_session_start", 1, 0))
    )
ID_player Timestamp event_name session_index
aaa 12:00:00 client_session_start 1
aaa 12:10:00 1
aaa 12:20:00 1
aaa 12:30:00 client_session_start 2
bbb 12:00:00 client_session_start 1
bbb 12:10:00 1
bbb 12:20:00 1
bbb 12:30:00 client_session_start 2
bbb 12:40:00 2
bbb 12:50:00 client_session_start 3
bbb 13:00:00 3
ccc 12:00:00 client_session_start 1
ccc 12:10:00 1
ccc 12:20:00 1
ccc 12:30:00 1
ccc 12:40:00 1

Fiddle

David דודו Markovitz
  • 42,900
  • 6
  • 64
  • 88
  • Can you explain me a bit? Bacause I tried to attach your code from '' | partition hint '' to the end in a script in which I have ['events.all'] | where ...... | partition hint .... (your code), but and error was raised. – ianux22 Jun 03 '22 at 17:41
  • (1) What is the error that you get? (2) what part do you need explanation for? – David דודו Markovitz Jun 03 '22 at 18:26
  • (1) clientRequestId: KustoWebV2;1b45e9ad-d530-41a4-bf60-b3daa64beb20 (2) I did not understand why your solution work on the table you create for example but doesn't work on mine. Is there some problem with the data type? Here you can see my code https://pastebin.com/BqEsxsJF – ianux22 Jun 03 '22 at 19:20
  • (1) Not much information.... (2) Can you add to your post a datatable with a few rows? – David דודו Markovitz Jun 03 '22 at 19:27
  • 1
    Sorry Bro, your code is great and I'm dumb. I forgot to modify event_name with my actual event_name column in the data. Now works perfectly. Thanks for your time and patience and have a good day/evening! – ianux22 Jun 03 '22 at 19:37