I have a username column in my table: username VARCHAR(50) I want to index this column, so I can find a username easily and send messages to users like: "this username has already been taken." So what is the best index plan. the table use innodb engine. Can I use fulltext index?
- 165
- 7
-
1Full text indexes are supported for InnoDB since 5.6 version. But you don't need them. – ypercubeᵀᴹ Sep 07 '16 at 17:45
1 Answers
Enforcing username uniqueness
If you want to enforce uniqueness for the values stored in your username column and you also have a user_id column set up as the PRIMARY KEY (PK) of the relevant table, you might be interested in the implementation of a
for the username column.
Such situation implies that, at the logical level, the username would be an ALTERNATE KEY (AK). An AK is a column, or a combination of columns, that holds values that uniquely identify a certan row but was not chosen as the PK of the pertinent table; each table can have zero, one or more AKs. As suggested above, they are usually defined in a SQL DDL structure via a UNIQUE constraint which is served, at the physical level, by a UNIQUE index.
Logical and physical levels
As it is paramount to distinguish logical from physical aspects when constructing a relational database —independently of the platform of choice—, you might find the concepts and links included in this answer of help.
Displaying a warning
If you want to display a message to the people trying to enter username values that have already been taken, you should develop such functionality in the code of the application programs that have access to your database.
Setting up the length of a column
If you don’t want to allow username values with a length greater than 18 characters, then you might like to define the column size accordingly, i.e., VARCHAR(18), so you may find of value the following content from the MYSQL reference manual about
from which I will stand out the following paragraph:
The
CHARandVARCHARtypes are declared with a length that indicates the maximum number of characters you want to store. For example,CHAR(30)can hold up to 30 characters.
Prefix and Fulltext indexes
@Ypercube has enriched this answer with responses to your individual questions about some types of indexes:
Can I use Prefix index?
Technically you could but there is no reason for that. Prefix indexes are rarely - or never - useful in MySQL. If you want to restrict the length of the usernames to 18, the column should be VARCHAR(18) and not (50). If you allow only ASCII characters in the usernames, the column should have an ascii character set and not UTF-8. And even if you keep the VARCHAR(50), the index should still be in the whole column, not a prefix.
What about a Full Text index?
No, you don’t need them either. Full text indexes are for when one wants to search words or phrases in large texts. Not useful in your case. A simple (unique) B-tree index would be the best.
-
I added the username to the primary key as well: 'PRIMARY KEY (user_id, username)' , Now, If I send a query and ask for username only, without user_id, is it still a fast query? You know, because of the message I mention in my question. – cool Sep 11 '16 at 01:56
-
@cool There are some things that I need to know before answering that question: (1) To define the situation clearly: You are modeling the classic
usertable, right? (2) What are all the columns of the table under discussion? (3) Is there a particular business rule that demands creating a composite PRIMARY KEY made up of the columns nameduser_idandusername? (4) The column calleduser_idis a surrogate, right? -- Once I know those points, I can provide a proper answer. – MDCCL Sep 12 '16 at 20:49 -
(1) Yes, it's a ordinary user table. (2) The columns are
user_id , username, password, security_question, answer, date_of_registration, school_id. (3) I'm not sure if I understand the question but I have a leader board that every user can see, and I don't want they have similar names. (4) Yes user_id is a surrogate key. – cool Sep 13 '16 at 12:28 -
@cool Ok, very good. Now, clarifying question (4) is very important because a relational database is supossed to reflect the business context of interest with accuracy, so that's why I need to know if there is a particular business rule that you are representing by means of the implementation of the primary key (PK) made up of the combination of
user_idandusername. So, to put it another way: Why did you decide to define the PK of theusertable in that way? Is there a particular reason for doing so (perhaps, a real business aspect or, maybe, a personal preference)? – MDCCL Sep 14 '16 at 15:41 -
thank you for your time. Why did you decide to define the PK of the user table in that way? at first I decided to define an id for every user, and use that id as a foreign key in other table (3 other tables), then I realize that I don't like users to have same usernames because of the leaderboard thing I mentioned, I thought maybe using
unique indexwould help, but then in my registration page I wanted to add this feature to message the user for duplicate username and ask him to change it. so I suppose I should have a quick answer from server. – cool Sep 15 '16 at 06:43 -
Summary: I need an id column to use it as a foreign key, it should be unique, I need the username to be unique, for my leaderboard, and I send queries that uses the username only and also queries which only use id. Like
SELECT * FROM user WHERE username = '{$username}' LIMIT 1and other queries like this:SELECT * FROM user WHERE user_id = {$user_id} LIMIT 1– cool Sep 15 '16 at 06:52 -
@cool You are welcome. So, as I mentioned in the answer body, this post where I discuss some relational topics (e.g. logical vs. physical aspects, key identification and declaration, capturing meaning, database design, etc.) is quite relevant to this situation and might help you to avoid future problems of a similar kind. – MDCCL Sep 16 '16 at 15:53
-
@cool As for the definition of the
userPK made up ofuser_idandusername, you should not proceed that way as it is not a real PK because it would not enforce uniqueness for each of the columns individually, so that combination of columns should not be declared as the PK. Instead, you should leaveuser_idas the PK (aided, physically, by its individual index), and setusernamewith a UNIQUE constraint (served, physically, by its own INDEX, which protects uniqueness and, at the same time, enhances speed retrieval), as it is, logically, an ALTERNATE KEY. – MDCCL Sep 16 '16 at 15:59 -
The
user_idcolumn should be used, as you rightly mentioned, for FK references from (or to) other tables, (e.g., it might be a FK that referencesperson.person_id) and its PK index might make it, physically, a little faster than theusername's one regarding queries, but the index of the UNIQUE constraint fixed tousernameshould be quite fast as well. So, both indexes should behave quite well depending on the server and hardware configurations, network speed, etc. – MDCCL Sep 16 '16 at 16:10 -
In this situation, the
usernameALTERNATE KEY is a natural key that is made up of real data. Theuser_idsurrogate (which you specified) should be meaningless and system generated value that, by itself does not provide row meaning uniqueness, so it is necessary to find a natural key (usernamein this case), and declare it properly so that the platform can protect the meaning uniqueness of each row. Surrogates should be used only in specific cases as they have to be implemented as an extra column and very likely and additional index, which has physical and logical repercussions, etc. – MDCCL Sep 16 '16 at 16:25