Apr 20, 2020 5:00 PM

Finding duplicates in MySQL/MariaDB

Finding duplicates in MySQL/MariaDB

In this article I will explain how to find duplicates in MySQL/MariaDB. Duplicate records may appear due to incorrect database structure. You must use the UNIQUE index to prevent duplicate entries from being added. Let's look at an example of how to find out how many duplicate records are in a particular table.

First, create a simple `emails` table:

--
-- Table structure for table `emails`
--
CREATE TABLE IF NOT EXISTS `emails` (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
	`email` VARCHAR(255) NOT NULL,
	PRIMARY KEY (`id`)
);

--
-- Dumping data for table `emails`
--
INSERT INTO `emails` (`email`)
VALUES
	('example@gmail.com'),
	('example2@gmail.com'),
	('example3@gmail.com'),
	('example4@gmail.com'),
	('example@gmail.com'),
	('example@gmail.com'),
	('example@gmail.com'),
	('example2@gmail.com'),
	('example2@gmail.com'),
	('example3@gmail.com');

Now run the following query to identify duplicates:

SELECT `email`, COUNT(`email`) AS countEmail
FROM `emails`
GROUP BY `email`
HAVING countEmail > 1
ORDER BY countEmail DESC;

--
--	[output]
--
--	email				countEmail
--	example@gmail.com	4
--	example2@gmail.com	3
--	example3@gmail.com	2

This query calculates the number of records for the specified column using the COUNT() function in combination with GROUP BY. Next is filtering records using HAVING. The result can be sorted in the desired order using ORDER BY.

Other articles