Why did MySQL remove the query cache feature after version 8.0?
3 Answers
There is a detailed blog from the MySQL server team about this, where Matt Lord says:
The query cache has been disabled-by-default since MySQL 5.6 (2013) as it is known to not scale with high-throughput workloads on multi-core machines.
We considered what improvements we could make to query cache versus optimizations that we could make which provide improvements to all workloads.
While these choices themselves are orthogonal, engineering resources are finite. That is to say that we are shifting strategy to invest in improvements that are more generally applicable to all workloads.
- 7,719
- 2
- 10
- 27
Good riddance !!!
It is a challenge for most database developers to correctly estimate the size of the most common result sets in their applications. Having a large query cache was just a big bandage for that.
There is a bigger reason that foreshadowed the demise of the query cache: Four years ago (June 07, 2014), I answered the post Why query_cache_type is disabled by default start from MySQL 5.6?. In short, the query cache was always inspecting the InnoDB Buffer Pool for changes. You can find this on Pages 209-215 of High Performance MySQL (2nd Edition).
I mentioned this over the years:
Sep 05, 2012: Is the overhead of frequent query cache invalidation ever worth it?Sep 25, 2013: invalidating query cache entries(key)Sep 26, 2013: query cache hit value is not changing in my databaseDec 23, 2013: MySQL with high CPU and memory usage
RIP Query Cache !!!
- 182,700
- 33
- 317
- 520
-
1I agree that query cache was a bandage mostly for counts being so slow on innodb. It made queries and applications be completely unpredictable in response time, you can see milliseconds to hours on the same query depending if it was cached or not.
However, at least it was a bandage. Now you're left bleeding :)
– John Feb 06 '20 at 16:02 -
1The QC existed before InnoDB existed. The old story I heard (perhaps around 2002) was that the QC was added to win a benchmark with MyISAM. – Rick James Feb 06 '20 at 16:36
-
@RickJames Hah sounds plausible, though it's cheating Mysql needs one thing more than anything else in my opinion: full multithreading on all types of queries. They seem to work in that direction (a bit here a bit there) bit but it should have been implemented long ago. That also would have been a killer feature to not just win but 'own' many types of benchmarks – John Feb 07 '20 at 17:53
-
@John - MySQL 8.0.17(?) have a few parallel queries. I'm not excited yet. A product called "shard query" has been around for a long time; it demonstrates that parallelism typically gives only half the theoretical capacity. (eg, 8 cores yields only 4x speedup.) Also, keep in mind that most I/O has no parallelism unless you have certain RAID setups. – Rick James Feb 08 '20 at 01:23
-
@John - and... "Long ago" servers had one, maybe two CPU cores. So parallelism was not very useful. Once there were multiple cores, there was a flurry of activity to clean up the Mutexes so that multiple connections could work in parallel. – Rick James Feb 08 '20 at 01:25
-
@John - and... I suspect that most of the competition has already implemented parallelism. So, "too late". But... Most apps have lots of connections, not a single connection with a long running query. That is, there is a tradeoff between more connections and longer-running queries. So, I focus on speeding up queries. – Rick James Feb 08 '20 at 01:28
-
@RickJames Yea, in my case I've (currently) low amount of connections for the real big queries. Usually just one but that should be served quickly. You are right about performance loss, using parallel connections you "should" significantly lose performance as IO sequential read is much faster and CPU time shoots up from overhead.
Ridiculously I consistently get 25-30 times speedups until CPU limit is reached, with more CPU (currently 36 cores) I should get 250-300 times speedup. Even for locky UPDATES.. Is there a competitive better server that's easy to migrate to from linux mysql ? – John Feb 08 '20 at 03:36
(I agree with the other Answer, but here is my 2-cents-worth.)
As implemented, ...
The QC cannot work with Galera or Group Replication, both of which are getting more traction in the HA arena.
When
query_cache_sizegot big, it got less efficient. This is due to inefficiencies in "pruning". (Note: Aurora reimplemented it, and seems to have fixed this issue.)There is an overhead in every
SELECTbecause it does not know whether the QC will come into play. Decades ago, this was estimated at 11%. Until getting rid of the QC, the only workaround was to do bothquery_cache_size = 0andquery_cache_type = 0in the config file. (Few people realized both were needed.)In the typical Production server, inserts are happening frequently. Since each insert caused a pruning of all entries for that table(s), the QC was virtually useless for such tables.
Perhaps 95% of the hundreds of systems I have reviewed for performance problems are better off without the QC.
- 78,038
- 5
- 47
- 113
Man in the middle caches are no replacement, they don't know if the data has changed. Given that it was a near fatal decision for huge databases. – John Feb 06 '20 at 13:40
The only way to count this below 5 minutes is to spawn 30 threads, separate the data into 30 id brackets, and sum the results up
– John Feb 06 '20 at 23:42The removal of query cache went heavy on me, probably 1000 development hours went into various caching mechanisms. since then. Though query cache was horrible as well, unpredictable and often hard to understand what is going on. – John Feb 06 '20 at 23:46
The problem is that mysql has low single threaded performance, I've best performance when going for 20-30 threads (manually constructing the query)
Currently it can reach about 100mb/sec read on primary index count, ~20-30 mb on non primary access
– John Feb 07 '20 at 17:17From SSH login root, Text results of: B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; AND Optional very helpful information, if available includes - htop OR top for most active apps, ulimit -a for a Linux/Unix list of limits, iostat -xm 5 3 for IOPS by device and core/cpu count, for server workload tuning analysis to provide suggestions. – Wilson Hauck Feb 07 '20 at 21:28