I have a table with a deleted_at column, see comment.
The deleted_at column behaves like a one-way boolean event flag, i.e. once a row is deleted it can't be restored.
There is also an asynchronous process that permanently deletes rows older than a certain timestamp. This process works in batches, for example: "pt-archiver nibbles records from a MySQL table"
For efficiently looking up rows to delete (and paging in general) the table has an id column as primary key. This column is generated in code, it's a GUID, and it's sortable by creation date.
primary key (`id`)
Deleting rows is a three step process
-- De-activate
update x set deleted_at = :timestamp where id = :id
-- Select IDs to delete
select id from x where deleted_at is not null
-- Hard delete
delete from x where id in (...)
Also, the table is updated in batches with an upsert query. The "on duplicate key update" clause makes use of a unique index consisting of two columns (code, deleted_at). The code must be unique over all active rows, deleted_at is null for active rows, and the upsert query looks like this
insert into x (id, code, foo, deleted_at)
values (...)
on duplicate key update foo=values(foo)
Is it a bad idea to make deleted_at part of a unique index, and if so why?
unique key `uniq_code_deleted_at`(`code`, `deleted_at`)
coderows with differentdeleted_at, unless you have another unique index overcodeby itself. What do you want to happen? – Charlieface May 10 '23 at 11:13x_Activewhich is foreign-keyed tox– Charlieface May 11 '23 at 09:31on duplicate key updateclause of the upsert query ensures there is only one active row per code – mozey May 11 '23 at 09:49