Overview
This case study describes general optimization concepts and offers suggestions for preventing a loss of performance in the database and in the backend as a whole. The Clever team recently tackled two projects with different technology stacks and requirements that shared one problem: constantly expanding database volumes.
Industry
Our projects:
- A catalog-like service that had been under development for over 4 years. Some unsuccessful solutions had been introduced prior to our intervention, so we had to fix the consequences.
- An online store with tens of thousands of products that needed improvements to help it adapt to explosive growth.
Each project serves a different purpose and uses a different type of database. Nevertheless, certain approaches to improving them coincided, which allowed us to derive a common optimization logic.
Background
A drop in performance and the failure of individual services inevitably occur as a long-term project grows and develops. Such deficiencies directly affect the website’s responsiveness and its positions in search rankings.
It is well known that Google’s algorithms and search robots are sensitive to server response times. Nginx errors such as 504 (which occur when Nginx does not receive a response from an upstream service within the allotted time) also harm a company’s online presence and degrade the user experience. Visitors frequently abandon websites with slow page loads and error pages.
Security problems can arise from low performance if malicious users intentionally overload a database, making the entire site unresponsive or completely unavailable.
The Challenge
Project 1
In the initial stages, the project’s owners planned to create a website that would act as a data catalog and a platform for the exchange of services among users.
The server side was written in Node.js, with PostgreSQL as the database and Sequelize as the ORM between them. Over time, new requirements emerged; some did not directly affect performance, while others forced the project owners to make adaptive changes for the sake of optimization:
- For example, it was decided to add tariff plans to the project, which added millions of pages, and a ranking feature that imposed more and more ORDER BY and LIMIT/OFFSET constructs, as well as subqueries. The sheer volume of data negatively affected performance.
- The second decision was to add a filter for the catalog data, which reduced performance even more.
After these changes were implemented, opening the last catalog page took up to 16 seconds. While testing and analyzing the problem, we found missing indexes, a pagination count executed over a table with a very large number of rows, and a poorly chosen set of columns used for sorting and filtering.
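To illustrate the pattern, the sketch below shows the kind of deep-offset query that becomes slow at scale; the CatalogItem model and its columns are hypothetical stand-ins, not the project’s actual schema. Sequelize’s findAndCountAll issues a full COUNT(*) plus a SELECT with ORDER BY, LIMIT, and OFFSET, so the further the page, the more rows PostgreSQL has to scan and discard.

```js
const { Op } = require('sequelize');
const CatalogItem = require('./models/catalog-item'); // hypothetical Sequelize model

// Deep OFFSET pagination: both queries get slower as the table and the page number grow.
async function getCatalogPage(page, pageSize, filters) {
  return CatalogItem.findAndCountAll({
    where: { price: { [Op.gte]: filters.minPrice } },  // filter column without an index
    order: [['rating', 'DESC'], ['id', 'ASC']],        // sort columns without a matching index
    limit: pageSize,
    offset: (page - 1) * pageSize,                     // the last page means an OFFSET of millions of rows
  });
}
```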
As a result of this lengthy investigation and the measures described in the next section, we were able to address these problems successfully.
Project 2
Our second project was an online store featuring a large number of constantly updated products. As in the previous project, the backend was written in Node.js, this time in conjunction with MongoDB and Mongoose.
Our team ran into a problem common to online stores: the more products were added, the lower the response speed became. Filtering and sorting were introduced to address the issue, but performance continued to fall as the number of users increased. We were able to identify and solve this problem as well.
The Solution
Project 1
First of all, we added the missing indexes. This was the simplest and most basic thing we could do, and it brought a significant increase in speed.
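A minimal sketch of how such indexes can be added in a Sequelize migration is shown below; the table and column names continue the hypothetical example above and are not taken from the real project.

```js
// Hypothetical migration adding indexes on the columns used for filtering and sorting.
module.exports = {
  async up(queryInterface) {
    await queryInterface.addIndex('catalog_items', ['price']);          // filter column
    await queryInterface.addIndex('catalog_items', ['rating', 'id']);   // composite index matching the ORDER BY
  },
  async down(queryInterface) {
    await queryInterface.removeIndex('catalog_items', ['rating', 'id']);
    await queryInterface.removeIndex('catalog_items', ['price']);
  },
};
```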
Next, we started refactoring the database structure and normalizing the data in it. This reduced the physical size of the database and increased the speed of working with it. But these steps were not enough.
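As one illustration of what such normalization can look like (the tables and columns are again hypothetical), a repeated text value can be moved into a reference table and replaced with a foreign key:

```js
// Hypothetical normalization step: extract repeated category names into their own table.
module.exports = {
  async up(queryInterface, Sequelize) {
    await queryInterface.createTable('categories', {
      id: { type: Sequelize.INTEGER, primaryKey: true, autoIncrement: true },
      name: { type: Sequelize.STRING, allowNull: false, unique: true },
    });
    await queryInterface.addColumn('catalog_items', 'category_id', {
      type: Sequelize.INTEGER,
      references: { model: 'categories', key: 'id' },
    });
    // Backfilling category_id and dropping the old text column would follow here.
  },
  async down(queryInterface) {
    await queryInterface.removeColumn('catalog_items', 'category_id');
    await queryInterface.dropTable('categories');
  },
};
```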
An effective caching solution was devised using materialized views and temporary tables for the already-formed filters.
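Sequelize has no first-class API for materialized views, so one way to manage them is through raw queries, roughly as sketched below; the view definition, the precomputed position column, and the refresh schedule are assumptions for illustration.

```js
const sequelize = require('./db'); // hypothetical configured Sequelize instance

// Cache the already-filtered catalog in a materialized view with a precomputed position column.
async function createFilteredCatalogView() {
  await sequelize.query(`
    CREATE MATERIALIZED VIEW IF NOT EXISTS filtered_catalog AS
    SELECT id, title, price, rating,
           ROW_NUMBER() OVER (ORDER BY rating DESC, id ASC) AS position
    FROM catalog_items
    WHERE is_active = true;
  `);
  // A unique index is required for REFRESH ... CONCURRENTLY and speeds up position lookups.
  await sequelize.query(
    'CREATE UNIQUE INDEX IF NOT EXISTS filtered_catalog_position_idx ON filtered_catalog (position);'
  );
}

// Refreshed periodically (e.g. from a cron job) instead of being recalculated on every request.
async function refreshFilteredCatalogView() {
  await sequelize.query('REFRESH MATERIALIZED VIEW CONCURRENTLY filtered_catalog;');
}
```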
The most interesting thing we did was to build pagination “blanks” without the LIMIT/OFFSET clause. Instead, we used a WHERE clause on the columns that are generated when the materialized view is refreshed.
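A sketch of that kind of pagination is shown below, assuming the hypothetical filtered_catalog view and its position column from the previous example: the page boundaries go into a WHERE range that an index can answer directly, so the cost no longer grows with the page number.

```js
const sequelize = require('./db'); // hypothetical configured Sequelize instance, as above

// Pagination via a WHERE range on the precomputed position column instead of LIMIT/OFFSET.
async function getCatalogPageFast(page, pageSize) {
  const from = (page - 1) * pageSize;
  const [rows] = await sequelize.query(
    `SELECT id, title, price, rating
     FROM filtered_catalog
     WHERE position > :from AND position <= :to
     ORDER BY position`,
    { replacements: { from, to: from + pageSize } }
  );
  return rows;
}
```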
Project 2
After thorough research, it became clear that when the database was built, its creators had adhered to simple rules characteristic of a regular online store, which is a rather predictable project type.
We added indexes, replaced populates (additional database requests at the ORM level) with aggregations, and began to use projections more often, which resulted in increased response speed.
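The sketch below shows what such a change can look like in Mongoose; the Product model, the brands collection, and the projected fields are illustrative assumptions rather than the store’s real schema.

```js
const Product = require('./models/product'); // hypothetical Mongoose model

// Before: an extra query at the ORM level for the referenced documents.
// const products = await Product.find({ category: categoryId }).populate('brand');

// After: a single aggregation with $lookup and a narrow projection.
async function findProducts(categoryId) {
  return Product.aggregate([
    { $match: { category: categoryId } },
    { $lookup: { from: 'brands', localField: 'brand', foreignField: '_id', as: 'brand' } },
    { $unwind: '$brand' },
    { $project: { title: 1, price: 1, 'brand.name': 1 } }, // return only the fields the page needs
  ]);
}
```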
The main problem boiled down to user sessions (passwords, cookies, cart contents, recommended products) being stored, not in the database, but in express sessions that had never been cleaned. Consequently, data piled up, causing a deterioration in speed. We transferred all sessions to the database and set up periodic cleaning.
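One common way to arrange this, sketched below under the assumption that express-session is paired with the connect-mongo store (the connection string and TTL value are illustrative), is to persist sessions in MongoDB and let the store expire stale ones automatically:

```js
const express = require('express');
const session = require('express-session');
const MongoStore = require('connect-mongo');

const app = express();

app.use(session({
  secret: process.env.SESSION_SECRET,
  resave: false,
  saveUninitialized: false,
  store: MongoStore.create({
    mongoUrl: 'mongodb://localhost:27017/shop', // illustrative connection string
    ttl: 14 * 24 * 60 * 60,                     // stale sessions are removed after 14 days
  }),
}));
```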
Results/Achievements
For Project 1, we reduced the last catalog page’s opening time from 16 seconds to 80 milliseconds.
For Project 2, page opening time was reduced from 10 seconds to 1 second, taking DOM loading into account.
After lengthy research, we determined that the following optimization measures were needed in both cases:
- Use of indexes in places where they carry real weight.
- Data normalization.
- Private cache for individual cases.
- Materialized views.
- Replacing populates with aggregations.
- Applying a WHERE clause instead of the LIMIT/OFFSET clause.
- Periodic scheduled data updates instead of recalculating on demand.
- Paying due attention to the project’s security when implementing the points above: all data passed to the database had to be sanitized, cleaned, and strictly validated.
This article does not cover some obvious optimization methods because of their commonplace nature, short-lived effect, and high cost. For example:
- Increasing server computing power.
- Using balancing and additional servers.
- Software update.
- Explaining to the client that their task extends beyond the project architecture.