We were approached by a client experiencing a critical issue in their MySQL 8 database system. The primary issue was that many connections were stalled, displaying the status “Waiting for table metadata lock”. Despite no apparent ongoing queries or updates in the process list and the table performance_schena.data_lock_waits being empty, the issue persisted, leading to significant operational disruptions.
To diagnose and resolve this pressing issue we utilized MySQL’s performance schema, a powerful tool for monitoring MySQL server execution at a low level. This, in conjunction with an in-depth examination of the TRANSACTIONS section in SHOW ENGINE INNODB STATUS, enabled us to gather critical insights into the underlying problem.
In the TRANSACTIONS section we were able to clearly identify the locking transaction
TRANSACTION 422015233181904, not started
mysql tables in use 1, locked 1
We than queried the events_transactions_current and events_statements_history_long tables in the performance schema and were able to see the following query executed inside this transaction:
SQL_TEXT: LOCK TABLES table_name WRITE
In MySQL, the implementation of locking mechanisms varies with the storage engine in use. Some engines, like InnoDB, implement row-level locks, allowing for more precise data control by locking only the rows directly involved in a transaction. On the other hand, the LOCK TABLE statement operates at the MySQL server level, enforcing a table-level lock that restricts access to the entire table. The reason the performance_schena.data_lock_waits was empty is that only InnoDB’s row level locks were instrumented, that is why this table did not show any server level table locks.
This understanding of MySQL’s internals was critical for quickly resolving this issue, by interpreting the nuances of MySQL’s metrics and status outputs, we were able to pinpoint the exact cause of the problem, avoiding a trial-and-error approach that could have led to further downtime.
This analysis revealed that the issue stemmed from a bug in the customer's application. Specifically, a LOCK TABLE statement was being executed without a corresponding UNLOCK TABLE command. This oversight resulted in locking conflicts that were responsible for stalling multiple connections, a scenario not immediately evident from the standard diagnostic tables.
Upon identifying the bug, we guided the customer to rectify the application code, ensuring that every LOCK TABLE statement was appropriately paired with UNLOCK TABLE. This intervention rapidly resolved the connection stalling issue, restoring the database system to optimal functionality.
Troubleshooting issues in MySQL can be challenging if you are not well familiar with MySQL's internals, if you need help with solving an issue contact us and one of our database specialists will help you.
Comments