How do you find duplicate rows in a table using SQL?

Detailed Answer

Step 1: find which values are duplicated

SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

This tells you which emails are duplicated and how many times, but not the individual row IDs.

Step 2 (if you need the actual duplicate rows): join back

SELECT u.*
FROM users u
JOIN (
    SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1
) dupes ON dupes.email = u.email
ORDER BY u.email;

A cleaner approach with window functions: identify which copy to keep

The most common real task isn't just "find duplicates" — it's "keep one copy and delete the rest." ROW_NUMBER() handles both:

SELECT id, email, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
FROM users;
-- rn = 1 is the row you'd keep (lowest id) per email; rn > 1 are duplicates

Deleting the duplicates, keeping the earliest row per email:

DELETE FROM users
WHERE id IN (
    SELECT id FROM (
        SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
        FROM users
    ) t
    WHERE rn > 1
);

(Some engines, like PostgreSQL, let you write this more directly with a CTE and DELETE ... USING.)

Preventing future duplicates

Finding and cleaning duplicates is a one-time fix — the durable fix is a UNIQUE constraint (or unique index) on the column(s) that must not repeat:

ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);

Note you generally have to clean up existing duplicates before this constraint can be added, since the database will refuse to create a unique index over data that already violates it.

How do you find duplicate rows in a table using SQL?

Quick Answer

Detailed Answer

Step 1: find which values are duplicated

Step 2 (if you need the actual duplicate rows): join back

A cleaner approach with window functions: identify which copy to keep

Preventing future duplicates

Related Resources

Related Questions