1

Is there a theory/practice that promotes a design by analysing the usage of attributes of a table, such that given a threshold of unused columns over a percentage of rows would suggest that the attributes do not actually belong that that table, but instead in a relationship?

example:

users
| id | name  | completed_sign_up |
| 1  | Bob   | null |
| 2  | Sarah | false |
| 3  | Penny | true |

The above false and nil cases are the same, so instead one could say completed_sign_up is not a property of users, but instead should be a join table on it, eg:

sign_ups
| id | user_id | 
| 10 | 3
Ian Vaughan
  • 111
  • 4
  • 1
    If by "nil" you mean "null", there is a guy called C. J. Date who claims nulls don't belong in an RDBMS. The meaning of null is not false, it is "not known". I can imagine attributes for your "sign_ups", like "when". So a separate entity may not be a bad idea. – Gerard H. Pille Mar 04 '20 at 13:33
  • Oops. updated nil->null. cheers... I'm wondering if there is a practice or theory for this, or is it as CJ Date has said. – Ian Vaughan Mar 04 '20 at 15:03
  • It may be useful to consider "what is the set of all database principles that deal with null values?". – Joe Borysko Mar 04 '20 at 19:03

1 Answers1

0

The question of 'when should nulls be used?' is touchy. See this post as an example. I haven't heard of a principle that is based on the percentage of null values. How would the threshold be decided?

To avoid nulls completely, I also thought of the same 'join table' solution you did: for every nullable column, convert it to a one-to-zero-or-one lookup table. I use that pattern occasionally in practice.

Joe Borysko
  • 406
  • 3
  • 7