
I have already seen some posts comparing varchar with numeric fields, but most only compare equality lookups and do not talk about performance for primary keys.

I have a user registration table where each user needs a unique code, and auto-incrementing sequential numbers (1, 2, 3, ...) are not a good idea here.

So I'm using PHP's uniqid() function to generate the user codes. Something like this: it5f29b715cff42, containing on average 13 to 15 characters, stored in a varchar field.
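For context, a minimal sketch of how such a code can be generated (the 'it' prefix is my assumption, chosen to match the example above):

    <?php
    // uniqid() returns a hex string derived from the current time in
    // microseconds: 13 characters, plus whatever prefix you pass in.
    $code = uniqid('it');                  // e.g. "it5f29b715cff42"

    echo $code, "\n";
    echo strlen($code), " characters\n";   // 15 with the 2-character prefix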

I use this code in the primary key because it is the main code of the user's registration, and there will always be a user search for this code in the table.

Some people have suggested switching to a varbinary type field to gain performance. And yet I still don't know if it's the best idea.

So I thought of turning this code into a number using PHP's hexdec() function, leaving the code more or less like this: 10001674124319326018, containing on average 16 to 20 digits, and storing it in a bigint field.
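A sketch of that conversion, assuming the code is pure hex with no prefix (hexdec() skips non-hex characters, but relying on that is fragile, so strip any prefix first):

    <?php
    $code = uniqid();        // 13 hex characters, e.g. "5f29b715cff42"

    // 13 hex digits = 52 bits, so the result fits in a 64-bit PHP integer
    // and in a BIGINT column (roughly 16 decimal digits).
    $number = hexdec($code);

    echo $number, "\n";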

I know that a bigint field is always 8 bytes, but how many bytes would varchar and varbinary occupy at those lengths? What is the best option for a primary key in a table with around 2 billion records?

Fernando VR
  • @LowlyDBA Not yet; his situation is completely different, and the answers there leave the same doubt I asked about: the smaller size of the data in the varchar versus the larger size of the numbers in the bigint. They do not even mention the possibility of a varbinary, so I still have the same doubt. But thank you very much for the suggestion. – Fernando VR Aug 04 '20 at 21:00

2 Answers


Let's itemize the purposes/goals of a PRIMARY KEY:

  • Uniquely identify each row of a table. This is a requirement for InnoDB.
  • It provides the disk layout order for the data, which sometimes plays into performance issues. (Regular UUIDs are bad for performance because the "next" UUID is nowhere near the "last" UUID; that is, the scattering of the data is detrimental.)
  • It "should be" small, since it is used a lot -- for joining, for locating the data when using a secondary index, etc.
  • If there is a single place for assigning the PK, simply use AUTO_INCREMENT. It is fast and well optimized; a sketch follows this list.
  • If you must create the PK from multiple clients, explain further; there are other tricks.
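A minimal sketch of the AUTO_INCREMENT approach, assuming a MySQL connection via PDO (the database, table, and column names are illustrative):

    <?php
    // Hypothetical connection details -- adjust to your environment.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');

    // BIGINT UNSIGNED AUTO_INCREMENT as the clustered primary key:
    // InnoDB then inserts rows in key order, appending at the end.
    $pdo->exec("
        CREATE TABLE users (
            id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
            name VARCHAR(100) NOT NULL,
            PRIMARY KEY (id)
        ) ENGINE=InnoDB
    ");

    $pdo->prepare('INSERT INTO users (name) VALUES (?)')->execute(['Fernando']);
    echo $pdo->lastInsertId(), "\n";   // the PK MySQL just assigned, e.g. 1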

On the other hand, I suggest that fewer than 5% of tables are so big or so busy that this much effort should be spent on designing the PK.

The size and datatype make some difference in performance. (Others will say they make a big difference.)

Hex is more compact than decimal, as you found. But you may as well go even more compact by packing the hex into raw bytes with hex2bin(). Unfortunately, uniqid() output may be an odd length, so tacking on a '0' to make it even would let you get a BINARY(7) from the value. Messy.
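A sketch of that packing, assuming the code is pure hex with no prefix:

    <?php
    $code = uniqid();                  // 13 hex characters, e.g. "5f29b715cff42"

    // hex2bin() requires an even number of hex digits, so pad odd-length
    // values with a leading '0' before packing.
    if (strlen($code) % 2 !== 0) {
        $code = '0' . $code;
    }

    $packed = hex2bin($code);          // 7 raw bytes -> fits a BINARY(7) column
    echo strlen($packed), " bytes\n";  // 7

    // Reading it back: bin2hex() reverses the packing. Note that trimming
    // the pad '0' would also strip a genuine leading zero, hence "messy".
    echo bin2hex($packed), "\n";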

Bottom line: For a table that is, or will grow to be, over a billion rows, simply use BIGINT AUTO_INCREMENT. The rest of the stuff is not worth the hassle.

Then shift your focus to other things, such as normalization, partitioning, sharding, etc.

Rick James

Bigint is much faster than varchar or varbinary, at any time.

And an unsigned bigint gives you room for 18446744073709551615 rows.

You would use UUIDs when you enter data into different databases and later need to unite them.

But if you only need unique IDs, use bigint.

nbk
  • Thanks for answering. Is there any possibility that a varchar or varbinary is lighter than a bigint when the amount of data is smaller? Or, regardless of situation and size, will the bigint always be lighter and faster for queries? – Fernando VR Aug 04 '20 at 21:05
  • A bigint is always 8 bytes; your 15-character code needs 15 bytes. But that is not the issue: searching the index is simply quicker with a fixed-size integer. – nbk Aug 04 '20 at 21:10
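To illustrate the byte counts discussed in these comments, a small sketch (the code value is the one from the question; VARCHAR and VARBINARY also add a 1-2 byte length prefix on disk):

    <?php
    $code = 'it5f29b715cff42';

    // Variable-length types store the actual bytes; BIGINT is a fixed 8 bytes.
    echo strlen($code), " bytes as VARCHAR/VARBINARY payload\n";  // 15
    echo PHP_INT_SIZE, " bytes as a 64-bit integer (BIGINT)\n";   // 8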