Hitting Snooze on Cache Keys, or How to Improve Web Application Performance
Using an in-memory look-aside cache to improve response times and reduce database load is one of the most common design patterns in web applications. The pattern is well understood and has low setup and maintenance costs. A look-aside cache can be added to an existing web application to speed up retrieval of all kinds of data, from database records to larger pieces such as JSON fragments or full HTML responses. By setting an expiration time on keys, "lazy caching" (or "lazy loading") can be implemented, serving data as stale as users can tolerate. Memcached and Redis are the most popular choices.
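As a minimal sketch of the pattern, assuming a local Memcached accessed through pymemcache (load_user_from_db is a hypothetical stand-in for the real database read):

    # Look-aside (cache-aside) read with lazy expiration.
    from pymemcache.client.base import Client

    cache = Client(("localhost", 11211))
    TTL_SECONDS = 20 * 60  # expire keys after 20 minutes

    def get_user(user_id):
        key = f"user:{user_id}"
        cached = cache.get(key)
        if cached is not None:
            return cached  # cache hit: no database access
        value = load_user_from_db(user_id)  # cache miss: slow database read
        cache.set(key, value, expire=TTL_SECONDS)
        return value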
As system load grows, challenges inevitably appear. Imagine your web application processing millions of requests daily. During peak hours, users experience high latency and timeout errors. To relieve these problems, you scale out the cache infrastructure by adding more nodes, expecting that cache capacity is the bottleneck. Yet the performance problems persist, and your application continues to struggle under load. This scenario, which I have faced repeatedly in projects I worked on, calls for a careful reassessment of the caching approach under high demand.
Large technology companies often address such challenges with sophisticated software and complex architectural changes. However, these solutions are not always accessible or affordable for smaller teams and organizations. Techniques such as deploying intermediate proxies or adopting custom communication protocols may not fit your system's constraints. This article presents a practical method that my colleagues and I have successfully employed to meet similar challenges in large-scale projects.
Latency spikes in the upper percentiles
One recurring problem we faced was high response times in the upper percentiles, such as p99, while the mean and median looked perfectly fine. These spikes occurred because a database read was required whenever a cache key was missing.
I built a simulation representing a typical workload for the web applications we operated (see the Simulation section below for details). The model application processes 100 requests per second for a single abstract key type. Key popularity follows a Pareto distribution, with two-thirds of user requests concentrated on the top 1,000 keys out of one million in total.
The average database read takes about 50 times longer than a cache read. The cache key expiration time is 20 minutes.
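To give a feel for the workload shape, here is a sketch of such a key-popularity distribution; the parameters are illustrative guesses chosen so that roughly two-thirds of requests land on the top 1,000 keys, not the exact ones from my simulation:

    import random
    from collections import Counter

    TOTAL_KEYS = 1_000_000

    def sample_key():
        # Pareto-distributed popularity (shape and scale are illustrative).
        return min(int(random.paretovariate(0.5) * 125), TOTAL_KEYS - 1)

    hits = Counter(sample_key() for _ in range(100_000))
    top_share = sum(c for k, c in hits.items() if k < 1000) / 100_000
    print(f"share of requests to the top 1,000 keys: {top_share:.0%}")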
P99 response times for these top 1,000 keys are very high (top-left chart), while the mean and median response times stay within an acceptable threshold. The long tail of less popular keys is not considered here, because their request rate is much lower, which makes caching less effective for them.
But are high 99th-percentile values for popular keys really that bad? They mean that about 1% of reads of these keys are slow. While this depends on your application and usage patterns, it is generally undesirable. Consider the following real-world scenarios:
- A single user request fetches hundreds of keys: a typical case for social feeds, news streams, and so on.
- Hundreds of users request the same key at the same moment: a common pattern for any popular site. When that key expires, all of these requests go straight to the database and hit it simultaneously, slowing down the entire backend.
I want to emphasize that this happens not to "some requests" but to the most popular keys, which form the core of user traffic.
Increasing the keys' expiration time may look like a solution to the problem of long load times. Unfortunately, it also defeats "lazy caching" and requires explicit cache-invalidation mechanisms. Is it possible to eliminate the key-expiration issue while still serving acceptably fresh data?
Extending key expiration times dynamically
Serving fresh data while keeping response times and database load low are goals that contradict each other by nature. The ideal solution would be to know whether a key retrieved from the cache is about to expire, and to refresh it if so, preferably in the background. Below is an example in Python-like pseudocode:
    cache_response = cache.get(key)
    if cache_response is not None:
        if random() < α:
            run async:  # this block runs in the background and does not delay the response
                db_response = database.get(key)
                if db_response is not None:
                    # note the cas_token; see the explanation of CAS below
                    cache.set(key, db_response.value, cache_response.cas_token)
        # return the cached value to the user
        return cache_response.value
We read a key from the cache and, on a hit, re-set its value with a small probability, extending its expiration time. This involves re-reading the key from the database to pick up any updates.
To avoid race conditions, CAS (check-and-set) mode should be used. CAS guarantees that the cache is updated only if the data has not been modified by another process since it was last read, preserving data consistency.
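A sketch of such a CAS-protected refresh, assuming pymemcache's gets/cas calls; load_from_db is a hypothetical database read:

    from pymemcache.client.base import Client

    cache = Client(("localhost", 11211))

    def refresh_key(key, ttl_seconds):
        value, cas_token = cache.gets(key)  # read the value together with its CAS token
        if value is None:
            return
        db_value = load_from_db(key)
        # cas() stores the value only if the key was not modified since gets();
        # it returns False when another process got there first.
        cache.cas(key, db_value, cas_token, expire=ttl_seconds)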
Keys should be refreshed asynchronously to avoid delaying user responses. Even where that is not possible, it may still be worthwhile to refresh the key synchronously, at the cost of blocking that one user's request.
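One possible way to run the refresh in the background is a small thread pool; this sketch reuses the cache client and refresh_key helper from above:

    from concurrent.futures import ThreadPoolExecutor
    import random

    refresh_pool = ThreadPoolExecutor(max_workers=4)
    ALPHA = 0.05  # refresh probability per cache hit

    def get_with_refresh(key, ttl_seconds):
        cached = cache.get(key)
        if cached is not None:
            if random.random() < ALPHA:
                refresh_pool.submit(refresh_key, key, ttl_seconds)  # does not delay the caller
            return cached
        # cache miss: read the database synchronously, as in a plain look-aside cache
        value = load_from_db(key)
        cache.set(key, value, expire=ttl_seconds)
        return value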
The optimal probability α depends on your traffic pattern; values between 1% and 10% are generally recommended. Higher values increase database load without noticeably benefiting the popular keys.
In systems where keys are updated infrequently, expiration times are long, and request rates are high, an additional strategy can be applied: keys are still refreshed with probability α, but only if their remaining time to live falls below a threshold R.
This technique requires the ability to read a key's remaining expiration time. Redis provides the TTL command for this, while Memcached does not. You can work around this by storing the expiration time alongside the value, for example packed together in JSON.
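A sketch of that workaround for Memcached, reusing the cache client, load_from_db, and refresh_pool from the earlier sketches; the constants are illustrative:

    import json
    import random
    import time

    ALPHA = 0.05      # refresh probability per cache hit
    R_SECONDS = 60    # refresh only when less than a minute of TTL remains

    def set_with_ttl(key, value, ttl_seconds):
        # store the expiration timestamp next to the value, since Memcached
        # has no command to read a key's remaining TTL
        payload = json.dumps({"value": value, "expires_at": time.time() + ttl_seconds})
        cache.set(key, payload, expire=ttl_seconds)

    def refresh(key, ttl_seconds):
        # re-read the database and rewrite the envelope (CAS omitted for brevity)
        set_with_ttl(key, load_from_db(key), ttl_seconds)

    def get_with_conditional_refresh(key, ttl_seconds):
        raw = cache.get(key)
        if raw is None:
            return None  # cache miss: fall back to the database as usual
        payload = json.loads(raw)
        remaining = payload["expires_at"] - time.time()
        if remaining < R_SECONDS and random.random() < ALPHA:
            refresh_pool.submit(refresh, key, ttl_seconds)
        return payload["value"]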
Simulation
Modifying the caching of a production system is risky, to say the least, so it is safer to roll out changes gradually. However, given the unpredictable nature of incoming traffic, separating small improvements from noise requires complex data analysis. Development and staging environments proved not much better in my case, since they receive far less traffic.
To address this, I used discrete-event simulation tools. This approach let me safely test any system behavior I wanted to examine, and much faster than running the same experiments in a development environment.
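For illustration, a toy version of such a simulation might look like the following (this is not the actual tool from the repository; parameters are illustrative): simulate requests against a cache with TTLs and collect response times for percentile analysis.

    import random

    CACHE_READ_MS, DB_READ_MS = 1.0, 50.0   # DB reads ~50x slower than cache reads
    TTL_S, RPS, DURATION_S = 20 * 60, 100, 3600

    expires_at = {}   # key -> simulated expiration time
    latencies = []

    for i in range(RPS * DURATION_S):
        now = i / RPS                        # virtual clock, one request per tick
        key = min(int(random.paretovariate(0.5) * 125), 999_999)
        if expires_at.get(key, 0.0) > now:
            latencies.append(CACHE_READ_MS)  # cache hit
        else:
            latencies.append(DB_READ_MS)     # miss: read the DB and repopulate
            expires_at[key] = now + TTL_S

    latencies.sort()
    print("p99 (ms):", latencies[int(len(latencies) * 0.99)])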
While writing this article, I collected the code snippets used during these experiments into a single tool and made it available on GitHub: https://github.com/whisk/cachestudy.
Conclusion
The expiration-extension technique described in this article shows that improving web application performance does not always require expensive tools or complex architectural changes. By addressing common caching challenges, developers can increase efficiency and handle high traffic with simple, cost-effective solutions.
However, such improvements should be carefully tested in production to ensure they work as intended. Simulations and pre-production environments often lack the complexity and traffic volume of the real world; only by testing under actual conditions can you confirm tangible gains without introducing new problems.
Cover image created by DALL·E.