Three thoughts on duplicate records

Duplicate records are a reality in a database of any size, so as database managers we always have to deal with them (or should deal with them!). Here are three thoughts on managing duplicate records:

  1. Duplicate records will always exist. Always. Unless you're managing only a tiny number of records, you will always have duplicates. So the goal isn't to eliminate duplicates, but to minimize their number.
  2. Find and fix any processes that tend to create duplicate records. For example, I often hear from my clients, "Our website password recovery process doesn't work, or is too cumbersome, so customers just create new records in order to register for an event quickly." Whatever process you have that leads to duplicate records, fix that process.
  3. Consistently run a process for identifying potential duplicates, and clean them up. On at least a quarterly basis you should run a query that helps you identify potential duplicate records and then take the time to clean up records that are actual duplicates. And of course, clean up duplicate records as you find them in your day-to-day work. But seeking them out and fixing them consistently, over time, is the best way to minimize duplicate records.
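As one way to picture the quarterly check in the third point, here is a minimal sketch in Python. It assumes your records can be exported as a list of dictionaries, and the field names (`id`, `first_name`, `last_name`, `email`) are hypothetical, so substitute whatever your own database uses. It only flags *potential* duplicates that share an email address or a full name; a person still has to review each group and decide which records are actual duplicates.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join((value or "").lower().split())


def find_potential_duplicates(records):
    """Group record ids that share a normalized email or a normalized full name.

    `records` is a list of dicts with the hypothetical keys
    'id', 'first_name', 'last_name', and 'email'.
    Returns a list of (match_type, matched_value, [ids]) tuples,
    one for every group containing more than one record.
    """
    groups = defaultdict(list)
    for rec in records:
        email = normalize(rec.get("email"))
        name = normalize(f"{rec.get('first_name') or ''} {rec.get('last_name') or ''}")
        # Skip blank values so empty fields are not reported as matches.
        if email:
            groups[("email", email)].append(rec["id"])
        if name:
            groups[("name", name)].append(rec["id"])
    return [(kind, value, ids) for (kind, value), ids in groups.items() if len(ids) > 1]
```

Matching on exact normalized values is deliberately conservative. It will miss nicknames and misspellings, and it will flag different people who happen to share a name, which is why the output is a list to review rather than a list to merge.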

Duplicate records are a reality of life. But suffering with an overwhelming number of duplicates is a choice, and something you can fix, if you take the time to do so.
