Three thoughts on duplicate records

Duplicate records are a reality in any database of any size, so as database managers we always have to deal with them (or at least we should!). Here are three thoughts on managing duplicate records:

  1. Duplicate records will always exist. Always. Unless you're managing only a tiny number of records, you will always have duplicates. So the goal isn't to eliminate duplicates, but to minimize their number.
  2. Find and fix any processes that tend to create duplicate records. For example, I often hear from my clients, "Our website password recovery process doesn't work, or is too cumbersome, so customers just create new records in order to register for an event quickly." Whatever process you have that leads to duplicate records, fix that process.
  3. Consistently run a process for identifying potential duplicates, and clean them up. On at least a quarterly basis you should run a query that helps you identify potential duplicate records and then take the time to clean up records that are actual duplicates. And of course, clean up duplicate records as you find them in your day-to-day work. But seeking them out and fixing them consistently, over time, is the best way to minimize duplicate records.
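As a rough illustration of the kind of query described in point 3, here is a minimal Python sketch that flags potential duplicates by matching on a normalized email address or a normalized full name. The field names (`id`, `first_name`, `last_name`, `email`) and the matching rules are assumptions made for this example; your database will have its own fields, and most systems have their own tools for this. The output is only a list of candidates for a person to review, not a list of confirmed duplicates.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(value.lower().split())


def find_potential_duplicates(records):
    """Group record ids that share a normalized email or a normalized full name.

    `records` is a list of dicts with hypothetical keys:
    id, first_name, last_name, email.
    Returns a sorted list of groups; each group is a sorted list of ids.
    Groups may overlap, since a record can match on name and on email.
    """
    buckets = defaultdict(set)
    for rec in records:
        email = normalize(rec.get("email", ""))
        name = normalize(f"{rec.get('first_name', '')} {rec.get('last_name', '')}")
        if email:
            buckets[("email", email)].add(rec["id"])
        if name:
            buckets[("name", name)].add(rec["id"])
    # Keep only buckets with more than one record, and drop identical groups.
    groups = {tuple(sorted(ids)) for ids in buckets.values() if len(ids) > 1}
    return sorted(list(group) for group in groups)


# Made-up sample data, for illustration only.
sample_records = [
    {"id": 1, "first_name": "Pat", "last_name": "Smith", "email": "pat@example.org"},
    {"id": 2, "first_name": "pat", "last_name": "smith ", "email": "PAT@example.org"},
    {"id": 3, "first_name": "Pat", "last_name": "Smith", "email": "p.smith@example.com"},
    {"id": 4, "first_name": "Lee", "last_name": "Jones", "email": "lee@example.org"},
]

for group in find_potential_duplicates(sample_records):
    print("Review these records together:", group)
```

Matching on exact normalized values is deliberately conservative: it will miss misspellings and nicknames, and two different people can share a name. That is why the results still need a human look before anything is merged.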

Duplicate records are a reality of life. But suffering with an overwhelming number of duplicates is a choice, and something you can fix, if you take the time to do so.

