I have a non-relational table that records visits per area across time (months):
| area_id | period | visitors | visits |
|---|---|---|---|
| 91387821 | 2022-04 | 452 | 664 |
| 91387821 | 2022-05 | 516 | 704 |
| 105252924 | 2022-04 | 8834 | 20445 |
etc.
I need to load this data into a Postgres DB, and generate a primary key for each tuple.
My options for the primary key are:
- Use auto-increment key.
- Use a composite key combining
area_idandperiod. - Generate a unique key from
area_idandperiode.g. 20220491387821 or 91387821202204 for the first record.
Which option will provide the best performance for running summaries across periods per area id? The table is going to have hundreds of millions of records, corresponding to millions of unique area_ids.