I have a table of unique values (domains_unique), with column domain varchar(255), with more than 20 mil records.
What's the fastest way to insert into the table, by keeping the domain unique constraint?
I've decided that the query should be :
INSERT IGNORE INTO domains_table (domain) VALUE ('domain.com') Should I make domain the primary key or should I make it a unique index?
Primary key method:
CREATE TABLE `domains_unique` ( `domain` varchar(255) NOT NULL PRIMARY KEY (`domain`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE = utf8_general_ci; Unique index method:
CREATE TABLE `domains_unique` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `domain` varchar(255) NOT NULL PRIMARY KEY (`id`), UNIQUE KEY unique_index (`domain`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE = utf8_general_ci; Also how would much would changing the CHARSET and COLLATION to ascii_bin affect performance?
obs:
I would use INSERT DELAYED IGNORE INTO, but I need to know if the specific row (domain) was inserted, that also rules out the bulk insert.
UPDATE:
As promised the benchmarks results:
INSERT 5k unique rows with 4.5k new rows into a 1 mil row table, one row at a time:
- Primary Key method: 5.7 seconds
- Unique Index method: 6.3 seconds
To test scaling I've also tested the Unique Index method on a 40mil rows table and it took 45.8 s