Three thoughts on duplicate records

Duplicate records are a reality in any database of any size, so as database managers we always have to deal with them (or at least we should!). Here are three thoughts on managing duplicate records:

  1. Duplicate records will always exist. Always. Unless you're managing only a tiny number of records, you will always have duplicates. So the goal isn't to eliminate duplicates, but to minimize their number.
  2. Find and fix any processes that tend to create duplicate records. For example, I often hear from my clients, "Our website password recovery process doesn't work, or is too cumbersome, so customers just create new records in order to register for an event quickly." Whatever process you have that leads to duplicate records, fix that process.
  3. Consistently run a process for identifying potential duplicates, and clean them up. On at least a quarterly basis you should run a query that helps you identify potential duplicate records and then take the time to clean up records that are actual duplicates. And of course, clean up duplicate records as you find them in your day-to-day work. But seeking them out and fixing them consistently, over time, is the best way to minimize duplicate records.
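To make the third point a little more concrete, here is a minimal sketch of what a "potential duplicates" check might look like if you can export your records to a list. Everything specific here is an assumption for illustration: the field names (`id`, `first_name`, `last_name`, `email`), the sample records, and the matching rule (same email or same full name after lowercasing and trimming spaces). Your system will have its own fields and probably its own duplicate-matching tools, and anything this flags is only a candidate that a person still needs to review.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return " ".join((value or "").lower().split())


def find_potential_duplicates(records):
    """Group record ids that share a normalized email or a normalized full name.

    Returns a list of (reason, key, ids) tuples for every group holding more
    than one record. These are candidates only, not confirmed duplicates.
    """
    by_email = defaultdict(list)
    by_name = defaultdict(list)

    for rec in records:
        email = normalize(rec.get("email"))
        name = normalize(f"{rec.get('first_name') or ''} {rec.get('last_name') or ''}")
        if email:
            by_email[email].append(rec["id"])
        if name:
            by_name[name].append(rec["id"])

    candidates = []
    for reason, index in (("email", by_email), ("name", by_name)):
        for key, ids in index.items():
            if len(ids) > 1:
                candidates.append((reason, key, ids))
    return candidates


# Made-up sample records, purely for illustration.
records = [
    {"id": 1, "first_name": "Pat", "last_name": "Lee", "email": "pat.lee@example.org"},
    {"id": 2, "first_name": "pat", "last_name": "lee ", "email": "plee@example.com"},
    {"id": 3, "first_name": "Patricia", "last_name": "Lee", "email": "Pat.Lee@example.org "},
    {"id": 4, "first_name": "Sam", "last_name": "Ortiz", "email": "sam@example.org"},
]

for reason, key, ids in find_potential_duplicates(records):
    print(f"Possible duplicates (same {reason}: {key}): record ids {ids}")
```

The matching rule is deliberately simple. It will miss nicknames and typos, and it will flag different people who happen to share a name, which is exactly why the cleanup step stays a human job.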

Duplicate records are a reality of life. But suffering with an overwhelming number of duplicates is a choice, and something you can fix, if you take the time to do so.
