Feb 24, 2020 9:15 AM

Finding missing identifiers in MySQL

Finding missing identifiers in MySQL

In this article I will explain how to find missing identifiers in MySQL. Recently, I was faced with the task of determining missing invoice identifiers in MySQL. I tried to find a ready-made solution, but no suitable solution was found. In many cases, it was not taken into account at all that there could be several missing identifiers one by one. Use my solution to find missing identifiers in MySQL:

SET @minId = 1000;
SELECT @maxId := max(`id`) FROM `invoices`;

CREATE TEMPORARY TABLE IF NOT EXISTS `tmp` (
	`id` INT UNSIGNED NOT NULL
);

DROP PROCEDURE IF EXISTS insertIds;

DELIMITER $$
CREATE PROCEDURE insertIds(startId INT, endId INT)
BEGIN
	SET @i = IFNULL(startId, 1);

	WHILE @i <= endId DO
		INSERT INTO `tmp` VALUES (@i);
		SET @i = @i + 1;
	END WHILE;
END $$
DELIMITER ;

CALL insertIds(@minId, @maxId);

SELECT `tmp`.`id` FROM `tmp`
LEFT JOIN `invoices` AS `t` on `t`.`id` = `tmp`.`id`
WHERE `t`.`id` IS NULL;

I will try to explain my decision to you. Let's first create a simple `invoices` table with missing identifiers:

--
-- Table structure for table `invoices`
--
CREATE TABLE `invoices` (
	`id` INT UNSIGNED NOT NULL
);

--
-- Dumping data for table `invoices`
--
INSERT INTO `invoices` (`id`) VALUES (1000), (1001), (1002), (1005), (1007), (1010);

We set the variable @minId, with which we start the search. Then we determine the maximum value of the identifier and set the variable @maxId:

SET @minId = 1000;
SELECT @maxId := max(`id`) FROM `invoices`;

We need to create a temporary table `tmp` into which we will insert all consecutive identifiers in the range from @minId to @maxId:

CREATE TEMPORARY TABLE IF NOT EXISTS `tmp` (
	`id` INT UNSIGNED NOT NULL
);

Now let's create a stored procedure insertIds that will insert sequential identifiers into the `tmp` table in the range from @minId to @maxId:

DROP PROCEDURE IF EXISTS insertIds;

DELIMITER $$
CREATE PROCEDURE insertIds(startId INT, endId INT)
BEGIN
	SET @i = IFNULL(startId, 1);

	WHILE @i <= endId DO
		INSERT INTO `tmp` VALUES (@i);
		SET @i = @i + 1;
	END WHILE;
END $$
DELIMITER ;

Unfortunately, you cannot create a stored procedure with a check for existence:

CREATE PROCEDURE IF NOT EXISTS generateCodes(startId INT, endId INT, codePrefix VARCHAR(64))

Therefore, at the beginning of creating a stored procedure, we delete it so that the error "PROCEDURE generateCodes already exists" does not occur.

You can also read my article on how to use the WHILE loop in MySQL.

Now we can call the stored procedure insertIds to generate sequential identifiers:

CALL insertIds(@minId, @maxId);

The last step remains, in which we get the missing identifiers using LEFT JOIN:

SELECT `tmp`.`id` FROM `tmp`
LEFT JOIN `invoices` AS `t` on `t`.`id` = `tmp`.`id`
WHERE `t`.`id` IS NULL;

--
--	[output]
--
--	id
--	1003
--	1004
--	1006
--	1008
--	1009

Other articles