36

If A is a friend of B, then should I store both values AB and BA, or one is enough? What are the advantages and disadvantages of both methods.

Here is my observation:

  • If I keep both then I have to update both when receive a request from a friend.
  • If I don't keep both, then I found it difficult when having to do multiple JOIN with this table.

Currently, I keep the relationship one way.

enter image description here

So what should I do in this case? Any advice?

Nick Chammas
  • 14,670
  • 17
  • 75
  • 121
roxrook
  • 485
  • 1
  • 4
  • 7

5 Answers5

34

I would store AB and BA. A friendship is really a two-way relationship, each entity is linked to another. Even though intuitively we think of the "friendship" as one link between two people, from a relational point of view it is more like "A has a friend B" and "B has a friend A". Two relationships, two records.

datagod
  • 7,091
  • 4
  • 36
  • 56
  • 4
    Many thanks. I really need to think about your idea carefully! The reason that I avoid storing AB and BA is because of the storage, since each time I have a friendship, my table would store twice as much. – roxrook Jan 05 '12 at 18:31
  • 1
    You are right about the storage, but remember that if stored as integers, each friend-friend relationship would take around 30 byes (2 records x 3 columns x 4 bytes per integer = 24 bytes plus some padding). 1 million people with 10 friends each would still be only around 300MB of data. – datagod Jan 05 '12 at 19:48
  • 1
    datagod: that's right! – roxrook Jan 06 '12 at 02:32
  • This is how I designed my tables as well, AB & BA. – kabuto178 Feb 06 '13 at 13:14
  • 3
    Plus, in situations where there is only AB and not BA, this can represent a 'pending friend request'. – Greg May 13 '14 at 13:31
  • Wouldn't this slow down the "get A's friends" query? – Zorayr Oct 23 '20 at 00:04
  • How do you find unique friendship from this data. If I want a set of friendships where both directions (A->B and B->A) are satisfied, how do I query a list of unique tuples of (A, B)? Right now I can do an inner join and get a list of entries with bidirectional relationships only, but both (A, B) and (B, A) show up in my results. How do you get only the unique results from them? – NurShomik Jan 26 '22 at 17:14
  • @NurShomik you are asking for half of a relationship. If you could post your question along with an example of your query I am sure you would get good help with the answer. Tag me if you can, and I'll put in my two cents. – datagod Mar 19 '22 at 13:25
14

If friendship is intended to be symmetrical (i.e. it is not possible for A to be friends with B but not vice-versa) then I would just store the one way relationship with a check constraint ensuring that each relationship can only be represented one way.

Also I would ditch the surrogate id and have a composite PK instead (and possibly a composite unique index also on the reversed columns).

CREATE TABLE Friends
  (
     UserID1 INT NOT NULL REFERENCES Users(UserID),
     UserID2 INT NOT NULL REFERENCES Users(UserID),
     CONSTRAINT CheckOneWay CHECK (UserID1 < UserID2),
     CONSTRAINT PK_Friends_UserID1_UserID2 PRIMARY KEY (UserID1, UserID2),
     CONSTRAINT UQ_Friends_UserID2_UserID1 UNIQUE (UserID2, UserID1)
  ) 

You don't say the queries that this makes difficult but you can always create a View

CREATE VIEW Foo
AS
SELECT UserID1,UserID2 
FROM Friends
UNION ALL
SELECT UserID2,UserID1 
FROM Friends
Martin Smith
  • 84,644
  • 15
  • 245
  • 333
  • I know this is quite old, so sorry for digging this up. Wouldn't it be better to NOT define the reverse friendship index UNIQUE, in order to not put unnecessary and redundant extra burden on INSERTs? Since we have PRIMARY KEY (a,b) and since a PK is UNIQUE, the reversed KEY (b,a) is also UNIQUE no matter what. – tfrommen Sep 17 '15 at 16:51
  • 1
    @tf Guess that depends on the querŷ optimiser. As you point out it is only necessary to check one way round so the insert plan might do this anyway. The question is tagged MySQL - no idea how that behaves. – Martin Smith Sep 17 '15 at 19:21
  • I know this is an old answer, but I just want to point out to anyone stumbling on this that MySQL completely ignores CHECK constraints (though it will "parse" them successfully) so this approach is probably not the way to go with that technology. – Micah Aug 08 '16 at 21:17
  • @Micah true. I wasn't aware of that in 2012. Still will work in other DBMSs... – Martin Smith Aug 08 '16 at 21:22
  • +1 for implementing a View for that. Having AB & BA stored brings in inconsistency (if the relationship is not bi-directional) whereas this method is a better approach – imans77 Mar 14 '19 at 16:39
7

Assuming a "friendship" is always two-way/mutual, I'd probably handle it something like this.

CREATE TABLE person (
    person_id int IDENTITY(1,1) PRIMARY KEY,
    ...other columns...
)

CREATE TABLE friendship (
    friendship_id int IDENTITY(1,1) PRIMARY KEY,
    ...other columns, if any...
)

CREATE TABLE person_friendship (
    person_id int NOT NULL,
    friendship_id int NOT NULL
    PRIMARY KEY (person_id, friendship_id)
)

The result is that you change it from a many-to-many join from "person" to "person", to a many-to-many join from "person" to "friendship". This will simplify joins and constraints, but has the side effect of allowing more than two people in a single "friendship" (though maybe the additional flexibility would be a potential advantage).

db2
  • 9,658
  • 3
  • 34
  • 58
5

You may need to define indexes around friendships instead of doubling the number of rows:

CREATE TABLE person
(
    person_id INT NOT NULL AUTO_INCREMENT,
    ...
    PRIMARY KEY (person_id)
);
CREATE TABLE friendship
(
    friend_of INT NOT NULL,
    friend_to INT NOT NULL,
    PRIMARY KEY (friend_of,friend_to),
    UNIQUE KEY friend_to (friend_to,friend_of)
);

This way, you double the storage for indexes but not for the table data. As a result, this should be a 25% savings on diskspace. The MySQL Query Optimizer will choose perform index range scans only, which is why the concept of covering indexes works well here.

Here are some nice links on Covering Indexes:

CAVEAT

If friendship is not mutual, you have the basis for another type of relationship : FOLLOWER

If friend_to is not a friend of friend_of, you can simply leave that relationship out of the table.

If you want to define relationships for all types, whether they are mutual or not, you could probably use the following table layout:

CREATE TABLE person
(
    person_id INT NOT NULL AUTO_INCREMENT,
    ...
    PRIMARY KEY (person_id)
);
CREATE TABLE relationship
(
    rel_id INT NOT NULL AUTO_INCREMENT,
    person_id1 INT NOT NULL,
    person_id2 INT NOT NULL,
    reltype_id TINYINT,
    PRIMARY KEY (rel_id),
    UNIQUE KEY outer_affinity (reltype_id,person_id1,person_id2),
    UNIQUE KEY inner_affinity (reltype_id,person_id2,person_id1),
    KEY has_relationship_to (person1_id,reltype_id),
    KEY has_relationship_by (person2_id,reltype_id)
);
CREATE TABLE relation
(
    reltype_id TINYINT NOT NULL AUTO_INCREMENT,
    rel_name VARCHAR(20),
    PRIMARY KEY (reltype_id),
    UNIQUE KEY (rel_name)
);
INSERT INTO relation (relation_name) VALUES
('friend'),('follower'),('foe'),
('forgotabout'),('forsaken'),('fixed');

From the relation table, you could arrange the relationships to include the following:

  • Friends should be mutual
  • Foes could be mutual or not
  • Followers could be mutual or not
  • The other relationships would be subject to interpretation (by the forgotten or forsaken or the recipient of revenge (fixed))
  • Possibie relationships can be further extended

This should be more robust for all relationships, whether the relationship is mutual or not.

RolandoMySQLDBA
  • 182,700
  • 33
  • 317
  • 520
  • hi @rolandomysqldba, I am big fan of your answers. its really helpful to me (in this case 1st example). Now here is one caveat for me, I want unique relationship. (eg. If user A friends with B then, B friend with A is not acceptable.) should I do with trigger? and what about performance? because I have very huge table (about 1 million records), and If I searching for User A's friends (A is stored in both (friend_of, friend_to) fields, and mysql using only one index, then It is performing very slow. Thats way I must have to store duplicate entries in my table (eg.A->B,B->A). Any better option? – Manish Sapkal Dec 13 '14 at 11:57
1

If you can control within the application that id of A is always lower than id of B ( pre order the A,B elements ids ) you can leverage of asking without an OR ( select where id_A = a AND id_B = b, instead of asking (id_A = a AND id_B = b) OR (id_A = b AND id_B = a) ), and also maintain half of the records you'll need with the other's approximations. Then you should use another field to maintain the state of the relationship ( are-friends, a-solicited-to-b, b-solicited-to-a, exfriends-a, exfriends-b ), and you're done.

This is the way I've managed my friendship system, and this simplifies the system and uses half the rows you'll need with other systems, only saying A is equals to the lower id value in code.

appartisan
  • 304
  • 1
  • 3