Monday, December 31, 2007

A Nice Side Effect Of Query Cache

I came across an interesting advantage of the query cache over the MSSQL internal caching strategy today. Because the query cache is simplistic - that is it stores queries and results and returns results from the pre-computed cache when a repetitive query is encountered, it is also great for repetitive data analysis in a more OLAP environment.

For example - if the client application asks many queries, and the data isn't updated very frequently - say once a week, the mySQL query cache beats out the performance of MSSQL's strategy the more often you repeat the query or the longer the time between data updates.

The result of this seems to be that MySQL is a great storage engine for things like neural analysis and even testing and storing the results of multiple regression analyses.

As a practical example - I have a regression equation in a client application and I want to test and store the results of the regression given various weights on different variables. Because the base data which is being inputted in to the regression equation doesn't change, if I want to test a universe of 100 variations over 12 variables, I'm going to run 1200 regressions on the same base data coming out of the database. The query cache through computing the first time and serving cached answers the second time can put the onus of speed back on the client application thereby dividing the area for possible performance bottleneck across the whole system.

In short, MSSQL doesn't do this except as its proprietary caching strategy sees fit. My results net MySQL about 5-6 times faster even after the MSSQL cache has adapted and almost 20 times faster on the second through 5th regression.