
Can a Unique Index Remove Duplicates in a Table with Existing Duplicates, and How?

Published on 2024-11-08

Removing Duplicates with Unique Index

To prevent duplicate rows from being inserted, an index was created on columns A, B, C, and D, but it was mistakenly created as a normal (non-unique) index. As a result, a table of 20 million records now contains duplicates. The question arises: will adding a unique index on these columns remove the duplicates without damaging the remaining data?
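
Before changing any index, it can help to measure how many duplicate groups actually exist. The query below is a minimal sketch, assuming the table is called mytable (the name used in the example later in this article) and the columns are literally named A, B, C, and D:

```sql
-- List each (A, B, C, D) combination that occurs more than once,
-- together with the number of rows that share it.
SELECT A, B, C, D, COUNT(*) AS copies
FROM mytable
GROUP BY A, B, C, D
HAVING COUNT(*) > 1
ORDER BY copies DESC
LIMIT 20;
```

Note that GROUP BY places rows with NULL in the same column into one group, whereas a MySQL unique index allows multiple NULL values, so rows containing NULL in any of the four columns may appear here without ever conflicting under the unique index.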

Correcting the Index and Handling Duplicates

Adding a unique index with a plain ALTER TABLE statement (no IGNORE modifier) will fail because duplicate records already exist. Using the IGNORE modifier (ALTER IGNORE TABLE) makes MySQL remove the duplicates instead, keeping one row for each combination of values.
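
The two variants can be sketched as follows. The table name mytable and index name myindex are taken from the example later in this article; the second statement is only accepted by MySQL versions earlier than 5.7.4:

```sql
-- Without IGNORE: while duplicates are present, this fails with a
-- duplicate-key error (ER_DUP_ENTRY, error 1062) and adds no index.
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);

-- With IGNORE (MySQL before 5.7.4 only): one row per duplicate
-- group is kept and the conflicting rows are deleted.
ALTER IGNORE TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
```

Because the second form deletes rows, taking a backup of the table first is a sensible precaution.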

Alternative Approach for MySQL Versions 5.7.4 and Above

In MySQL 5.7.4 and above, ALTER TABLE no longer supports the IGNORE modifier (INSERT IGNORE is still available), so a different approach is recommended:

  1. Copy the data to a temporary table.
  2. Truncate the original table.
  3. Create the unique index.
  4. Copy the data back to the original table using INSERT IGNORE, discarding any duplicate rows.

Syntax for Removing Duplicates with INSERT IGNORE

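-- 1. Copy every row into a holding table (column data only; indexes are not copied).
CREATE TABLE tmp_data SELECT * FROM mytable;
-- 2. Empty the original table. TRUNCATE cannot be rolled back, so check the copy first.
TRUNCATE TABLE mytable;
-- 3. Add the unique index while the table is empty.
ALTER TABLE mytable ADD UNIQUE INDEX myindex (A, B, C, D);
-- 4. Copy the rows back; rows that would violate the unique index are skipped.
INSERT IGNORE INTO mytable SELECT * FROM tmp_data;
-- 5. Remove the holding table once the result has been verified.
DROP TABLE tmp_data;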
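
Before running the final DROP TABLE, it may be worth checking the result. The queries below are a sketch of one possible sanity check, using the same mytable and tmp_data names; the aliases rows_before, rows_after, and distinct_keys are illustrative:

```sql
-- Row counts before and after deduplication.
SELECT
  (SELECT COUNT(*) FROM tmp_data) AS rows_before,
  (SELECT COUNT(*) FROM mytable)  AS rows_after;

-- Number of distinct (A, B, C, D) combinations in the copy.
SELECT COUNT(*) AS distinct_keys
FROM (SELECT DISTINCT A, B, C, D FROM tmp_data) AS k;
```

If none of the four columns contains NULL and the table has no other unique keys, rows_after would be expected to equal distinct_keys. With NULL values present, rows_after can be larger, because a unique index does not treat NULLs as equal to each other.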

Additional Considerations

The documentation does not specify which row of each duplicate group is retained when IGNORE is used. It is advisable to test the procedure on a smaller dataset before applying it to a large table.
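
A small test of this kind could look like the sketch below. Everything in it is hypothetical: the scratch tables dup_test and dup_test_copy, the index name uq_abcd, and the sample values exist only for illustration. The ORDER BY clause is an addition that makes the insertion order explicit, so the row with the lowest id in each group should be the one that is kept:

```sql
-- Scratch table with an extra column to tell the duplicates apart.
CREATE TABLE dup_test (
  id INT AUTO_INCREMENT PRIMARY KEY,
  A INT, B INT, C INT, D INT,
  note VARCHAR(20)
);

INSERT INTO dup_test (A, B, C, D, note) VALUES
  (1, 1, 1, 1, 'first'),
  (1, 1, 1, 1, 'second'),
  (2, 2, 2, 2, 'only');

-- Same procedure as for the large table, on three rows.
CREATE TABLE dup_test_copy SELECT * FROM dup_test;
TRUNCATE TABLE dup_test;
ALTER TABLE dup_test ADD UNIQUE INDEX uq_abcd (A, B, C, D);
INSERT IGNORE INTO dup_test SELECT * FROM dup_test_copy ORDER BY id;

-- Inspect which of the two (1, 1, 1, 1) rows survived.
SELECT * FROM dup_test;

-- Clean up the scratch tables.
DROP TABLE dup_test_copy;
DROP TABLE dup_test;
```

If a particular row should win (for example the most recent one), changing the ORDER BY column or direction is one way to express that preference.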
