Database design where % of rows have unused attributes/columns

Question

Is there a theory/practice that promotes a design by analysing the usage of attributes of a table, such that given a threshold of unused columns over a percentage of rows would suggest that the attributes do not actually belong that that table, but instead in a relationship?

example:

users
| id | name  | completed_sign_up |
| 1  | Bob   | null |
| 2  | Sarah | false |
| 3  | Penny | true |

The above false and nil cases are the same, so instead one could say completed_sign_up is not a property of users, but instead should be a join table on it, eg:

sign_ups
| id | user_id | 
| 10 | 3

If by "nil" you mean "null", there is a guy called C. J. Date who claims nulls don't belong in an RDBMS. The meaning of null is not false, it is "not known". I can imagine attributes for your "sign_ups", like "when". So a separate entity may not be a bad idea. — Gerard H. Pille, Mar 04 '20 at 13:33
Oops. updated nil->null. cheers... I'm wondering if there is a practice or theory for this, or is it as CJ Date has said. — Ian Vaughan, Mar 04 '20 at 15:03
It may be useful to consider "what is the set of all database principles that deal with null values?". — Joe Borysko, Mar 04 '20 at 19:03

Joe Borysko · Answer 1 · 2020-03-04T19:02:15.310

0

The question of 'when should nulls be used?' is touchy. See this post as an example. I haven't heard of a principle that is based on the percentage of null values. How would the threshold be decided?

To avoid nulls completely, I also thought of the same 'join table' solution you did: for every nullable column, convert it to a one-to-zero-or-one lookup table. I use that pattern occasionally in practice.

edited Mar 04 '20 at 19:02

answered Mar 04 '20 at 18:42

Joe Borysko

406
3
7

Database design where % of rows have unused attributes/columns

1 Answers1