This doc describes how the in-memory gtid_executed list is updated by transactions, when and how that list is flushed to the mysql.gtid_executed table, how persistence relates to the undo purge thread, the table's read/write scenarios, and how InnoDB and RocksDB treat all of this differently.
trx & gtid_executed list in memory
This part is executed only in InnoDB. When a trx is in the prepared stage in InnoDB, an update undo segment is allocated for it even if it is an insert-only transaction. This is because an insert undo segment can be purged immediately (no snapshot is reading it), but the GTID information must be persisted before purging. An update undo segment, by contrast, is purged only after its GTID has been persisted to the gtid_executed table. After allocating the segment, InnoDB writes the GTID version and value to the undo header.
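The allocation rule above can be sketched as a small decision function. This is a simplified illustration rather than InnoDB code: `UndoNeeds` and `undo_needed_at_prepare` are hypothetical names, and the real logic works on undo segment structures, not booleans.

```cpp
#include <cassert>

// Hypothetical summary of which undo segments a transaction ends up with.
struct UndoNeeds {
  bool insert_undo;
  bool update_undo;
};

// Sketch of the rule described above: a transaction that carries a GTID
// gets an update undo segment at prepare even if it only inserted rows,
// so the GTID lives in an undo header that is not purged immediately.
inline UndoNeeds undo_needed_at_prepare(bool did_insert, bool did_update,
                                        bool has_gtid) {
  return UndoNeeds{did_insert, did_update || has_gtid};
}
```

For an insert-only transaction with a GTID, both fields come back `true`; without a GTID, only `insert_undo` is set.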
Later, when the trx is committed in memory, its GTID value is added to the Gitd_info_list that is currently active for running transactions. If the number of GTIDs accumulated in memory exceeds the threshold, a flush is requested by setting m_event, which Clone_persist_gtid::periodic_write() waits on.
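A minimal sketch of this commit-side behaviour, assuming a single-threaded model. `GtidBuffer` and its members are hypothetical stand-ins for the `Clone_persist_gtid` fields named in the comments, and a plain boolean replaces the real `os_event_t` signalling.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the in-memory GTID state.
// Not thread-safe; the real members are atomics guarded by InnoDB logic.
struct GtidBuffer {
  static constexpr int kThreshold = 1024;  // mirrors s_gtid_threshold

  std::array<std::vector<std::string>, 2> lists;  // mirrors m_gtids[2]
  uint64_t active_number = 0;                     // mirrors m_active_number
  int num_in_memory = 0;                          // mirrors m_num_gtid_mem
  bool flush_requested = false;  // stands in for setting m_event

  // Commit path: append to the active list and request a flush once the
  // accumulated count exceeds the threshold.
  void add(const std::string &gtid) {
    lists[active_number % 2].push_back(gtid);
    if (++num_in_memory > kThreshold) flush_requested = true;
  }

  // Background path: switch the active list and hand back the filled one.
  std::vector<std::string> switch_and_take() {
    const uint64_t old = active_number++;
    std::vector<std::string> out;
    out.swap(lists[old % 2]);
    num_in_memory -= static_cast<int>(out.size());
    flush_requested = false;
    return out;
  }
};
```

Keeping two lists lets committing transactions keep appending to one list while the background thread persists the other.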
mysql.gtid_executed persistence
Some important members of Clone_persist_gtid:
- `m_gtid_trx_no`: oldest transaction number for which the GTID is not persisted.
- `m_num_gtid_mem`: number of GTIDs accumulated in memory.
- `Gitd_info_list m_gtids[2]`: two lists of GTIDs. One of them is active, and running transactions add their GTIDs to it. The other list is used to persist GTIDs to the table from time to time. Which list is active depends on `m_active_number`, which is incremented upon flush.
- `std::atomic<uint64_t> m_flush_number`: number up to which GTIDs are flushed. Increased when a list is flushed.
- `const static int s_gtid_threshold = 1024`: number of transactions/GTIDs that triggers a write to the on-disk table.
- `const static int s_max_gtid_threshold = 1024 * 1024`: maximum number of transactions/GTIDs to hold. Transaction commits must wait beyond this point. Not expected to happen, as GTIDs are compressed and written together.
- `os_event_t m_event`: event for the GTID background thread.
- `s_time_threshold`: time threshold (in milliseconds) that triggers persisting GTIDs.
Flush & purge:
Clone_persist_gtid::flush_gtids is the core function for persisting to the gtid_executed table. With RocksDB the ib_clone_gtid thread is also started and this function is also called, but it effectively does nothing: in the previous step nothing is appended to the in-memory GTID list, so m_num_gtid_mem is always 0. The function is called every 100 ms, or earlier when the number of accumulated GTIDs exceeds the threshold (1024).
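The trigger conditions can be written down as two small predicates. This is only a sketch: `should_wake` and `has_work` are hypothetical helpers, and the 100 ms constant is taken from the text above rather than from the source code.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kTimeThresholdMs = 100;  // time threshold quoted in the text
constexpr int kGtidThreshold = 1024;       // mirrors s_gtid_threshold

// True when the background thread should run a flush pass: either the
// timer expired or the in-memory GTID count exceeds the threshold.
inline bool should_wake(int64_t ms_since_last_pass, int num_gtid_mem) {
  return ms_since_last_pass >= kTimeThresholdMs ||
         num_gtid_mem > kGtidThreshold;
}

// True when a flush pass has anything to write. With RocksDB the count
// stays at 0, so every pass is a no-op.
inline bool has_work(int num_gtid_mem) { return num_gtid_mem > 0; }
```

In the RocksDB case `should_wake(100, 0)` is true but `has_work(0)` is false, which matches the "called but does nothing" behaviour described above.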
For InnoDB, if any GTIDs have accumulated in memory, it starts the flush process: it gets the flush list and saves it into the gtid_executed table. It then updates m_gtid_trx_no, the oldest transaction number for which the GTID is not persisted, to the trx_no following the latest trx_no that was just flushed, writes it to the system header page, wakes up the undo purge thread, and compresses the table if necessary.
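The flush sequence can be sketched as follows. `GtidEntry` and `GtidPersister` are hypothetical types: a vector stands in for the gtid_executed table, a flag stands in for waking the purge thread, and the system header page write and compression are left as comments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical pairing of a GTID with the trx_no of its transaction.
struct GtidEntry {
  uint64_t trx_no;
  std::string gtid;
};

// Simplified model of the flush step described above.
struct GtidPersister {
  uint64_t gtid_trx_no = 0;        // mirrors m_gtid_trx_no
  std::vector<std::string> table;  // stands in for mysql.gtid_executed
  bool purge_woken = false;        // stands in for waking the purge thread

  void flush(const std::vector<GtidEntry> &batch) {
    if (batch.empty()) return;  // nothing accumulated (the RocksDB case)

    uint64_t latest = 0;
    for (const GtidEntry &e : batch) {
      table.push_back(e.gtid);  // save into the table
      latest = std::max(latest, e.trx_no);
    }
    gtid_trx_no = latest + 1;  // next trx_no after the latest one flushed
    // Real code would also write this value to the system header page
    // and compress the table if necessary.
    purge_woken = true;
  }
};
```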
The purge thread does not purge an undo log whose GTID has not yet been persisted to the gtid_executed table. This limit is applied in the function that determines the oldest view.
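A minimal sketch of that limit, assuming purge is bounded by a single trx_no. `purge_low_limit` and `can_purge` are hypothetical helpers, not the InnoDB functions themselves; the idea is that the oldest view's limit is lowered to the oldest trx_no whose GTID is not yet persisted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Purge may only process undo logs of transactions with trx_no below
// this value: the smaller of the oldest read view's limit and the oldest
// trx_no whose GTID is not yet persisted (m_gtid_trx_no in the text).
inline uint64_t purge_low_limit(uint64_t oldest_view_limit,
                                uint64_t gtid_oldest_trx_no) {
  return std::min(oldest_view_limit, gtid_oldest_trx_no);
}

inline bool can_purge(uint64_t undo_trx_no, uint64_t oldest_view_limit,
                      uint64_t gtid_oldest_trx_no) {
  return undo_trx_no <
         purge_low_limit(oldest_view_limit, gtid_oldest_trx_no);
}
```

With a view limit of 20 and an unpersisted trx_no of 10, the undo log of trx 12 is held back; once the flush advances that number to 13, the same undo log becomes purgeable.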
Write/read Scenarios
- Write Scenarios:
  - server start
  - trx commit in slave with binlog=OFF
  - committing a statement or transaction, including XA, and also at XA prepare handling
  - server shutdown
  - binlog rotation
  - flush_gtids, called upon server start and on reaching the time threshold (100 ms) or the GTID threshold (1024). This has an effect on InnoDB only.
- Read Scenarios: