We have some tough performance issues that we feel shouldn't exist, but we don't see any way of solving them.
We have a Google Cloud MySQL database with 4 vCPUs and 16 GB of RAM. The full instance holds around 50 GB (on SSD), but our main database is only 15 GB.
We get extreme slowdowns on most pages of our app in the middle of the day, when it is most used, with some pages taking 40 s (or more) to load.
We've extracted the queries and by themselves they take 'only' around 500 ms, but as soon as I throw concurrent queries at them (using JMeter), they balloon: 5 concurrent queries get us to a 3 s average, 10 concurrent to 7 s, 20 to 30 s...
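For what it's worth, here is a minimal sketch of the kind of concurrency measurement I'm doing with JMeter, with a stub standing in for the real MySQL call (`fake_query` and its timing are placeholders, not the actual query):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def time_concurrent(run_query, n_clients):
    """Fire n_clients copies of run_query at once and return the
    average wall-clock latency per call, in seconds."""
    def timed(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        latencies = list(pool.map(timed, range(n_clients)))
    return sum(latencies) / len(latencies)

# Stand-in for the real MySQL call; in the real test this executes
# the COUNT query over a connection per thread.
def fake_query():
    time.sleep(0.05)

avg = time_concurrent(fake_query, 10)
```

In the real test each thread opens its own connection and runs the count query, which is where the average balloons from 500 ms to several seconds.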
We've tried all kinds of index optimizations (an index on every column separately, composites of the columns in the query). In the end I managed to cut the time in half, then to a third, but that just pushes the problem further out.
We haven't increased the database's resources further, because we've replayed the queries on a less powerful database and only saw a linear degradation, so we expect that doubling the resources would just double the gain. Which still doesn't feel like enough (on top of being more expensive).
Here is an example of one of the problematic queries (a count, used for pagination). This is a simplified version, optimized as much as we could, and it still takes around 150 ms, which is not enough to prevent it from slowing down under concurrent load.
Each filter in the WHERE clause is a user input and can't be precomputed. As I said, we have indexes on issue_date, customer_id, status, workflow, and the composite customer_id+status+workflow+issue_date. issue_date is a DATE, customer_id a BIGINT, status and workflow INTEGERs.
SELECT COUNT(i1_0.id)
FROM invoice i1_0
WHERE i1_0.issue_date BETWEEN '2025-01-01' AND '2025-07-30'
  AND i1_0.customer_id = 20
  AND (i1_0.status, i1_0.workflow) IN ((1,1), (2,1), (3,1), (4,1), (5,1), (6,1), (7,1));
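For reference, the indexes described above would have been declared roughly like this (reconstructed from the index names in the EXPLAIN output; the exact declarations are an assumption):

```sql
CREATE INDEX invoice_issue_date_IDX   ON invoice (issue_date);
CREATE INDEX invoice_customer_id_IDX  ON invoice (customer_id);
CREATE INDEX invoice_status_num_IDX   ON invoice (status);
CREATE INDEX invoice_workflow_num_IDX ON invoice (workflow);
CREATE INDEX composite_idx ON invoice (customer_id, status, workflow, issue_date);
```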
That specific query matches around 70k rows out of around 4M without any filter.
We've tried experimenting with read locks but didn't see any improvement.
We don't have a DBA here, so it's possible we're missing something obvious regarding the query, the resources, the locks...
Does anyone have any idea?
Update: Here is the EXPLAIN output for that query.
|id |select_type|table|partitions|type |possible_keys |key |key_len|ref|rows |filtered|Extra |
|---|-----------|-----|----------|-----|---------------------------------------------------------------------------------------------------------|----------|-------|---|-------|--------|------------------------|
|1 |SIMPLE |i1_0 | |range|invoice_issue_date_IDX,invoice_customer_id_IDX,invoice_workflow_num_IDX,invoice_status_num_IDX,composite_idx|composite_idx|23 | |145 523|100 |Using where; Using index|
Update 2: I've upgraded the database to 8 vCPUs and 32 GB of RAM and tried again. The query alone isn't any faster (no surprise), but concurrent queries are obviously better: I had to push to 100 concurrent queries to reach a 1.5 s average, 200 for a 3 s average. That's clearly better, but it doesn't feel like a real solution.
Update 3: Here is the full table definition. I've removed the indexes because, as I stated, I've tried all kinds of them, and while there were some improvements, nothing solved the core problem. (Which, following @Shadow's advice, is probably somewhere else.)
CREATE TABLE `invoice` (
`id` bigint NOT NULL AUTO_INCREMENT,
`invoice_type_code` varchar(10) NOT NULL,
`issue_date` date DEFAULT NULL,
`category_code` varchar(3) DEFAULT NULL,
`document_currency_code` varchar(3) DEFAULT NULL,
`tax_exclusive_amount` decimal(19,6) DEFAULT NULL,
`tax_inclusive_amount` decimal(19,6) DEFAULT NULL,
`payable_amount` decimal(19,6) DEFAULT NULL,
`payment_means_code` varchar(11) DEFAULT NULL,
`datecreated` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`dateupdated` timestamp NULL DEFAULT NULL,
`tax_amount` decimal(19,6) DEFAULT NULL,
`percent` decimal(19,1) DEFAULT NULL,
`supplier_id` bigint DEFAULT NULL,
`customer_id` bigint DEFAULT NULL,
`downloaded` bit(1) DEFAULT NULL,
`status_date` datetime DEFAULT NULL,
`submission_type` varchar(100) DEFAULT NULL,
`tax_type` varchar(100) DEFAULT NULL,
`market_number` varchar(100) DEFAULT NULL,
`obligation_number` varchar(50) DEFAULT NULL,
`comment_status` varchar(2000) DEFAULT NULL,
`invoice_number` varchar(20) NOT NULL,
`flux_id` bigint DEFAULT NULL,
`supplier_service` varchar(100) DEFAULT NULL,
`customer_service` varchar(100) DEFAULT NULL,
`metadata` longtext NOT NULL,
`due_date` date DEFAULT NULL,
`iban` varchar(34) DEFAULT NULL,
`department_id` bigint DEFAULT NULL,
`origin_number` varchar(20) DEFAULT NULL,
`supplier_auxiliary_code` varchar(100) DEFAULT NULL,
`supplier_type` varchar(100) DEFAULT NULL,
`supplier_type_code` varchar(100) DEFAULT NULL,
`accounting_date` date DEFAULT NULL,
`custom_info` varchar(500) DEFAULT NULL,
`payment_date` date DEFAULT NULL,
`invoice_format` enum() NOT NULL,
`portal` enum() DEFAULT NULL,
`is_subscriber_deposit` bit(1) DEFAULT b,
`owner_type` enum() DEFAULT NULL,
`from_emission` bit(1) DEFAULT b,
`extracted_from_document` varchar(255) DEFAULT NULL,
`invoice_number_back_03_25` varchar(20) DEFAULT NULL,
`document_type` bigint NOT NULL,
`purchase_accounting_downloaded` bit(1) DEFAULT b,
`status` int DEFAULT NULL,
`workflow` int DEFAULT NULL,
PRIMARY KEY (`id`),
CONSTRAINT `fk_invoice_to_customer` FOREIGN KEY (`customer_id`) REFERENCES `structure` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `fk_invoice_to_flux` FOREIGN KEY (`flux_id`) REFERENCES `invoice_flux` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT,
CONSTRAINT `fk_invoice_to_supplier` FOREIGN KEY (`supplier_id`) REFERENCES `structure` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT
) ENGINE=InnoDB AUTO_INCREMENT=5104843 DEFAULT CHARSET=utf8mb3
Update 4: Answers to Rick James
Result of SHOW TABLE STATUS LIKE 'invoice':
|Name |Engine|Version|Row_format|Rows |Avg_row_length|Data_length |Max_data_length|Index_length |Data_free|Auto_increment|Create_time |Update_time |Check_time|Collation |Checksum|Create_options|Comment |
|-------|------|-------|----------|---------|--------------|-------------|---------------|-------------|---------|--------------|-------------------|-------------------|----------|------------------|--------|--------------|----------------------------------------------------|
|invoice|InnoDB|10 |Dynamic |4 295 224 |319 |1 370 488 832 |0 |3 908 190 208 |4 194 304|5 109 810 |2025-09-02 02:42:28|2025-09-02 02:54:21| |utf8mb3_general_ci| | |
The metadata column contains strings between 40 and 170 bytes long, and is very often NULL.
On the environment where I'm running my tests:
- innodb_buffer_pool_size: 4 563 402 752 (4.25 GiB)
- RAM: 8 GB (but I raised it to 32 GB during my test before lowering it back)
- Volume on the SSD: 37 GB (shared with other databases)
- Volume of the DB: 16 GB
- Volume of the table: 4 GB
On the actual production environment:
- innodb_buffer_pool_size: 11 811 160 064 (11 GiB)
- RAM: 16 GB
- Total volume on the SSD: 50 GB (shared with other databases)
- Volume of the DB: 16 GB
- Volume of the table: 4 GB
As for the last test, EXPLAIN shows that the count query prefers this index:
(customer_id, workflow_num, status_num, issue_date)
I still need to run EXPLAIN ANALYZE on the query and share the results, so we know whether the indexes are actually used or not.
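That would just be the original query prefixed with EXPLAIN ANALYZE (available in MySQL 8.0+), which executes the query and reports actual row counts and timings per step:

```sql
EXPLAIN ANALYZE
SELECT COUNT(i1_0.id)
FROM invoice i1_0
WHERE i1_0.issue_date BETWEEN '2025-01-01' AND '2025-07-30'
  AND i1_0.customer_id = 20
  AND (i1_0.status, i1_0.workflow) IN ((1,1), (2,1), (3,1), (4,1), (5,1), (6,1), (7,1));
```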