Primary key design - autoincrement vs composite vs artificial key for time-series data

Asked May 26 '22 at 19:19

Active May 26 '22 at 19:19

Viewed 13 times

I have a non-relational table that records visits per area across time (months):

area_id	period	visitors	visits
91387821	2022-04	452	664
91387821	2022-05	516	704
105252924	2022-04	8834	20445

etc.

I need to load this data into a Postgres DB, and generate a primary key for each tuple.

My options for the primary key are:

1. Use an auto-increment key.
2. Use a composite key combining area_id and period.
3. Generate a unique key from area_id and period, e.g. 20220491387821 or 91387821202204 for the first record.
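
To make the generated-key option concrete, here is a minimal sketch of how such a key could be built from the two columns and split back apart. The function names are hypothetical and only illustrate the digit concatenation described above. Note that the resulting values exceed the 32-bit integer range, so in Postgres they would need a bigint column.

```python
def make_key_period_first(area_id: int, period: str) -> int:
    """Concatenate the 'YYYY-MM' digits and the area_id digits, period first."""
    return int(period.replace("-", "") + str(area_id))


def make_key_area_first(area_id: int, period: str) -> int:
    """Concatenate the area_id digits and the 'YYYY-MM' digits, area first."""
    return int(str(area_id) + period.replace("-", ""))


def split_key_area_first(key: int) -> tuple:
    """Recover (area_id, period) from an area-first key.

    The last six digits are always YYYYMM, so the split is unambiguous.
    """
    area_id, yyyymm = divmod(key, 10**6)
    return area_id, f"{yyyymm // 100:04d}-{yyyymm % 100:02d}"


first = (91387821, "2022-04")  # first record from the sample data
print(make_key_period_first(*first))  # 20220491387821
print(make_key_area_first(*first))    # 91387821202204
print(split_key_area_first(make_key_area_first(*first)))  # (91387821, '2022-04')
```

Because the key can be split back into its parts, it carries exactly the same information as the pair (area_id, period).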

Which option will provide the best performance for running summaries across periods per area id? The table is going to have hundreds of millions of records, corresponding to millions of unique area_ids.
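
For reference, the composite-key option and the kind of per-area summary I have in mind could look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in so that it runs without a server; the table name area_visits is hypothetical, and in Postgres the period column would more likely be a date type. It shows only the shape of the schema and query and says nothing about Postgres performance.

```python
import sqlite3

# In-memory SQLite is only a stand-in for Postgres; "area_visits" is a
# hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE area_visits (
        area_id  INTEGER NOT NULL,
        period   TEXT    NOT NULL,  -- 'YYYY-MM'
        visitors INTEGER NOT NULL,
        visits   INTEGER NOT NULL,
        PRIMARY KEY (area_id, period)  -- composite key: one row per area per month
    )
    """
)

rows = [
    (91387821, "2022-04", 452, 664),
    (91387821, "2022-05", 516, 704),
    (105252924, "2022-04", 8834, 20445),
]
conn.executemany("INSERT INTO area_visits VALUES (?, ?, ?, ?)", rows)

# Summary across periods per area_id: total visits in a range of months.
summary = conn.execute(
    """
    SELECT area_id, SUM(visits)
    FROM area_visits
    WHERE period BETWEEN '2022-04' AND '2022-05'
    GROUP BY area_id
    ORDER BY area_id
    """
).fetchall()
print(summary)  # [(91387821, 1368), (105252924, 20445)]

# The primary key rejects a second row for the same (area_id, period).
try:
    conn.execute("INSERT INTO area_visits VALUES (91387821, '2022-04', 0, 0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```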

asked May 26 '22 at 19:19

Ruslan

I would prefer 2 or maybe 1. Option 3 is pretty much the same as 2, but makes things unnecessarily complicated and has no advantage over 2. – May 26 '22 at 19:31
Does the combination of an integer and a date have any performance implications? – Ruslan May 26 '22 at 19:36
If you know that the combination is guaranteed to be unique, and you never want to update those columns, go with 2. – Laurenz Albe May 27 '22 at 06:56

0 Answers