Overview
MySQL has buildin partition table support, which can help split data accross multi tables,
and provide a unified query interface as normal tables.
Benefit:
- Easy data management: If we need to archive old data, and our table is partitioned by datetime, we can drop old partition directly.
- Speed up query based on partition key(partitoin pruning)
Limit:
- For partition table, every unique key must use every column in table’s partition expression(include primary key)
- For innodb engine, paritioned table can’t have foreign key,and can’t have columns referenced by foreign keys.
- For MyISAM engine, mysql version <= 5.6.5, DML operation will lock all partition as a whole.
Partition live db table
MySQL have several different partition type: range, list, key, hash…
For my live db table, I just want to partition it by date, one table per month.
So I choose to use range partition by the time_created
column (it’s a bigint).
origin table schema:
CREATE TABLE `Record` (
`id` bigint NOT NULL, AUTO_INCREMENT,
`data` text DEFAULT '',
`time_created` bigint NOT NULL,
PRIMARY KEY (`id`)
)
After partitioning, I only want to keep recent 3 months data. If I delete rows in original table first, it will take a lot of time, also rebuild primary key and partitioning will cost many time as well.
So I decide create a new partitioned table first, and select needed data from original table to it, and do a rename operation at last.
Steps:
- create new partiton table
Record_new
(with corrent primary key and partition info) - insert into
Record_new
select * from Record where id > xxx and id < yyy. (copy needed data to new table, should do it chunk by chunk since the number is huge). - rename table
Record
toRecord_old
,Record_new
toRecord
.
With this flow, we can have minimum impact on live server, and can cancel at any time during copy data.
Create sql for new partition table:
CREATE TABLE `Record` (
`id` bigint NOT NULL, AUTO_INCREMENT,
`data` text DEFAULT '',
`time_created` bigint NOT NULL,
PRIMARY KEY (`id`, `time_created`)
) PARTITION BY RANGE(time_created)
(PARTITION p_2016 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01')),
PARTITION p_2017_01 VALUES LESS THAN (UNIX_TIMESTAMP('2017-02-01')),
PARTITION p_2017_02 VALUES LESS THAN (UNIX_TIMESTAMP('2017-03-01')),
PARTITION p_2017_03 VALUES LESS THAN (UNIX_TIMESTAMP('2017-04-01')),
PARTITION p_2017_04 VALUES LESS THAN (UNIX_TIMESTAMP('2017-05-01')),
PARTITION p_default VALUES LESS THAN MAXVALUE);
p_default
is used to hold data if no suitable partition is available, usually, it should be null. I use a monthly cronjob to reoriganize it to genearate a new partition:
ALTER TABLE Record REORGANIZE p_default into
(PARTITION p_2017_05 VALUES LESS THAN (UNIX_TIMESTAMP('2017-06-01')),
PARTITION p_2017_06 VALUES LESS THAN (UNIX_TIMESTAMP('2017-07-01')));
Partitoin management
Check how many rows every partitoin have:
select PARTITION_NAME,TABLE_ROWS from information_schema.partitions where table_schema='test_db' and table_name='test';
Drop partition:
ALTER TABLE Record DROP PARTITION p_xxx;