Optimisation benchmark results

In an earlier post I outlined some options for optimising a bottleneck in a system I am working on. Some basic benchmark results:

  • Original box query times:

    • 8141 milliseconds - queried all boxes in March
    • 8093 milliseconds - queried all boxes in February
    • 8110 milliseconds - queried all boxes in January
    • 8063 milliseconds - added a new box and queried all boxes in January (again)
    • 7953 milliseconds - queried all boxes in January (again)
  • Query times with 20 MB cache set in MySql:

    • 10666 milliseconds - queried all boxes in March
    • 8555 milliseconds - queried all boxes in February
    • 7927 milliseconds - queried all boxes in January
    • 7919 milliseconds - added a new box and queried all boxes in January (again)
    • 55 milliseconds - queried all boxes in January (again)
  • Query times after adding a File Count column to the box table and using this instead of joins to count files. Modified app code to set the file count correctly when adding or editing a box:

    • 974 milliseconds - queried all boxes in March
    • 843 milliseconds - queried all boxes in February
    • 858 milliseconds - queried all boxes in January
    • 861 milliseconds - added a new box and queried all boxes in January (again)
    • 834 milliseconds - queried all boxes in January (again)
  • Query times after adding a File Count column to the box table and using this instead of joins to count files. Modified app code to set the file count correctly when adding or editing a box:

    • 3548 milliseconds - queried all boxes in March
    • 854 milliseconds - queried all boxes in February
    • 840 milliseconds - queried all boxes in January
    • 840 milliseconds - added a new box and queried all boxes in January (again)
    • 38 milliseconds - queried all boxes in January (again)

Conclusion

Cache

Using the cache seems to add overhead (increased query time) for the very first query. After that, subsequent queries which are different (e.g. queries for January, February and March), or are done after the table has changed (e.g. after adding a new box, which will flush the cache) are about the same as without the cache. But when two subsequent queries are the same and have the same result set, the performance increase is jaw dropping - going from 8 seconds to about 50 milliseconds!

File count column

A nice, easy, predictable and decent decrease in query time. Not as good as 50 milliseconds, but it is consistently good on all queries, regardless if anything has changed in the tables.

Both

An interesting mix. Probably what I will go for. Still a big overhead on that first query, but after that I get the benefit of good performance on all queries, and extra-good performance when the cache kicks in (e.g. the last query for January).

posted 2 years ago