Well, the answer I always get is "DROP". Then I would say... "what if you wanted to preserve your data?" It is very common for DBAs to think only in terms of what is obviously available in MySQL, ALTER, than all of the tools that are really there. I have been placed in situations in many companies where ALTER was just not the way to go.
Given the amount of MySQL usage throughout the WEB2.0/IT industry and the fact that data sets seem to keep growing... I have come up with a great solution to ALTERING large data sets. Please keep in mind that this is an old feature of MySQL and was a great "lightning speech" at the Percona conference in 04/08.
-- THE TABLE
mysql> use test;
mysql> show table status like 't1'\\\\G
*************************** 1. row ***************************
Name: t1
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 892130
Avg_row_length: 6544
Data_length: 5838471168
Max_data_length: 0
Index_length: 213909504
Data_free: 9437184
Auto_increment: NULL
Create_time: 2009-09-09 15:50:32
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.35 sec)
mysql> CREATE TABLE `t1` (
`id` varchar(32) NOT NULL,
`id0` varchar(32) DEFAULT NULL,
`id1` varchar(32) DEFAULT NULL,
`intcol1` int(32) DEFAULT NULL,
...
`intcol10` int(32) DEFAULT NULL,
`charcol1` varchar(128) DEFAULT NULL,
...
`charcol30` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id0` (`id0`),
UNIQUE KEY `id1` (`id1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.14 sec)
-- TABLE SIZE
mysql> select ((5838471168+213909504)/1024/1024/1024) as TBL_IN_GB;
+----------------+
| TBL_IN_GB |
+----------------+
| 5.636718750000 |
+----------------+
1 row in set (0.00 sec)
-- TESTING ALTER
mysql> alter table t1 modify charcol1 varchar(255);
Query OK, 800059 rows affected (25 min 42.40 sec)
Records: 800059 Duplicates: 0 Warnings: 0
OBSERVATIONS:
Big suprise, right... the table space doubled and it took a bit of time!
-- TESTING SELECT OUT AND LOAD IN
mysql> select * from test.t1 INTO OUTFILE '/var/lib/mysql/test.t1.out';
Query OK, 800059 rows affected (2 min 26.22 sec)
mysql> drop table test.t1;
Query OK, 0 rows affected (0.45 sec)
mysql> CREATE TABLE `t1` (
`id` varchar(32) NOT NULL,
`id0` varchar(32) DEFAULT NULL,
`id1` varchar(32) DEFAULT NULL,
`intcol1` int(32) DEFAULT NULL,
...
`intcol10` int(32) DEFAULT NULL,
`charcol1` varchar(255) DEFAULT NULL, -- FROM ALTER STATEMENT
...
`charcol30` varchar(255) DEFAULT NULL, -- NEW
PRIMARY KEY (`id`),
UNIQUE KEY `id0` (`id0`),
UNIQUE KEY `id1` (`id1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.14 sec)
mysql> LOAD DATA INFILE '/var/lib/mysql/test.t1.out' INTO TABLE test.t1;
Query OK, 800059 rows affected (12 min 43.13 sec)
Records: 800059 Deleted: 0 Skipped: 0 Warnings: 0
-- TABLE BREAKDOWN
30 VARCHAR COLUMNS
10 INT COLUMNS
1 PRIMARY KEY
2 UNIQUE KEYS
-- CONCLUSIONS
Alter table is slow and a full dump and reload is 47% faster!
-- APPLICATIONS
It can be said that ALTER TABLE is more efficient when dealing with small tables (< 1GB); however, when I think of any large scale data set or larger sharded cluster set, the dump and reload wins every time!
-- EXAMPLES
1. 20GB table per database with 10 databases per server
-- 200GB of the data on the server needs to be modified
-- Now multiply the server count times 20
--- This is 4TB worth of data
2. The examples can go on and on!!!!!!
-- PROCESS
1. If you have a slave, AND YOU SHOULD AT THIS POINT, run stop slave
2. Run the SELECT INTO OUTFILE local, if you have the space, or over nfs (not very fast)
3. DROP or RENAME THE TABLE you are trying to ALTER
-- RENAME only if you have the space and you don't trust THE PROCESS!
4. CREATE the same table with the ALTERATIONS, if you dropped, or a new table with the ALTERATIONS
5. IMPORT the file from step 2
6. Either RENAME the two tables, if you renamed, or start slave
7. Fail over the writes to the newly altered slave
8. Repeat steps 1 - 7 for the old master
-- CONSIDERATIONS
1. How long do you have for the outtage
2. How much downtime can you take
3. Do you have a replica of the data
4. Do you have the space, either local or over nfs
You need to be a member of Everything MySQL to add comments!
Join Everything MySQL