Apr 27, 2020 3:15 PM

Delete duplicates in MySQL/MariaDB

Delete duplicates in MySQL/MariaDB

In this article I will explain how to remove duplicates in MySQL/MariaDB. If you have already found duplicate records in your database, you will need to delete them, leaving the original record. You can choose which record to leave: with the highest or lowest identifier. Let's look at an example of how to remove duplicate records in a specific table.

First, create a simple `emails` table:

--
-- Table structure for table `emails`
--
CREATE TABLE IF NOT EXISTS `emails` (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
	`email` VARCHAR(255) NOT NULL,
	PRIMARY KEY (`id`)
);

--
-- Dumping data for table `emails`
--
INSERT INTO `emails` (`email`)
VALUES
	('example@gmail.com'),
	('example2@gmail.com'),
	('example3@gmail.com'),
	('example4@gmail.com'),
	('example@gmail.com'),
	('example@gmail.com'),
	('example@gmail.com'),
	('example2@gmail.com'),
	('example2@gmail.com'),
	('example3@gmail.com');

Use the following query to identify duplicates:

SELECT `email`, COUNT(`email`) AS countEmail
FROM `emails`
GROUP BY `email`
HAVING countEmail > 1
ORDER BY countEmail DESC;

If you want to delete duplicate records and leave only the record with the highest identifier, use the following query:

DELETE t
FROM `emails` AS t
INNER JOIN `emails` AS t2
WHERE
	t.`email` = t2.`email` AND
	t.`id` < t2.`id`;

If you want to delete duplicate records and leave only the record with the lowest identifier, use the following query:

DELETE t
FROM `emails` AS t
INNER JOIN `emails` AS t2
WHERE
	t.`email` = t2.`email` AND
	t.`id` > t2.`id`;

In MySQL prior to version 5.7, you could use a simple solution to remove duplicates, but it was removed:

ALTER IGNORE TABLE `emails` ADD UNIQUE INDEX(`email`);

The essence of this solution is that when you add a unique index, duplicates are automatically deleted and only one first record remains.

Other articles