Programmers hate deleting things; we’ve all felt that feeling in the pit our stomach when we realise that thing we deleted was really deleted, and o

Avoiding the soft delete anti-pattern · Cultured Systems

submited by
Style Pass
2024-05-11 08:30:02

Programmers hate deleting things; we’ve all felt that feeling in the pit our stomach when we realise that thing we deleted was really deleted, and on the other hand the security of deleting some unused code, safe in the knowledge that it’s still really there in version control. In the sphere of databases, this terror of deleting things leads people to advocate soft deletion: instead of really deleting a record, you add a field which marks the record as deleted, and you treat any record marked in that way as if it were deleted. This is generally a bad idea, and there are a number of better ways of ensuring access to old data.

The main problem with soft deletion is that you’re systematically misleading the database. This is most obvious in the case of foreign key constraints. If it’s important to your data integrity that some column in one table references a row that exists in another table, a foreign key constraint instructs the database to enforce that by, among other things, preventing the deletion. A similar issue arises with unique constraints: conceptually, uniqueness probably shouldn’t consider deleted items (say a user deletes their account, and later wants to sign up again with the same email address), but the database doesn’t know that. You also face difficulties with modifying the database structure, as you have to consider, not just how these modification affect live data, but how it would affect historical data, where the change might not make sense. You can mitigate some of these problems by carefully crafting triggers and constraints to take into account the deletion marker, but that adds significant complexity.

Because you’re misleading the database, you also encounter problems with querying the data. You can no longer rely on SELECTs to only return live data; you have to remember to check the deletion marker in each WHERE clause (and in each JOIN; I nearly forgot about that while writing this, a reflection of the many times I’ve forgotten about it while implementing this). You can potentially avoid this by creating views that filter out the deleted rows, or you can outsource the problem to an ORM (with something like Hibernate’s @Where annotation which can add the deletion marker to its constructed queries). But these are fragile and, especially for the ORM case, leaky abstractions; it’s easy to make a mistake and end up querying soft deleted data, or for the ORM to make a mistake and try and query data which it then fails to find because it thinks its deleted.

Leave a Comment