Three thoughts on duplicate records

Duplicate records are a reality in any database of any size, so as database managers we always have to deal with them (or should deal with them!). Here are three thoughts on managing duplicate records:

  1. Duplicate records will always exist. Always. Unless you're managing only a tiny number of records, you will always have duplicates. So the goal isn't to eliminate duplicates, but to minimize their number.
  2. Find and fix any process that tends to create duplicate records. For example, I often hear from my clients, "Our website password recovery process doesn't work, or is too cumbersome, so customers just create new records in order to register for an event quickly." Whatever process you have that leads to duplicate records, fix that process.
  3. Consistently run a process for identifying potential duplicates, and clean them up. On at least a quarterly basis you should run a query that helps you identify potential duplicate records and then take the time to clean up records that are actual duplicates. And of course, clean up duplicate records as you find them in your day-to-day work. But seeking them out and fixing them consistently, over time, is the best way to minimize duplicate records.
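To make the third point a little more concrete, here is a minimal sketch of what a "potential duplicates" check might look like. It assumes records have been exported as simple dictionaries; the field names (`id`, `first_name`, `last_name`, `email`), the sample data, and the matching rules are all hypothetical and chosen only for illustration. Your own database will have its own fields and, most likely, its own built-in duplicate query.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return " ".join(value.lower().split())


def find_potential_duplicates(records):
    """Group records that share a normalized email or a normalized full name.

    Returns a list of (reason, key, [record ids]) for each group of two or more.
    These are only candidates; a person still has to decide which are real duplicates.
    """
    by_email = defaultdict(list)
    by_name = defaultdict(list)
    for rec in records:
        email = normalize(rec.get("email", ""))
        name = normalize(rec.get("first_name", "") + " " + rec.get("last_name", ""))
        if email:
            by_email[email].append(rec["id"])
        if name:
            by_name[name].append(rec["id"])

    groups = []
    for reason, index in (("email", by_email), ("name", by_name)):
        for key, ids in index.items():
            if len(ids) > 1:
                groups.append((reason, key, ids))
    return groups


# Hypothetical sample data, not taken from any real system.
records = [
    {"id": 1, "first_name": "Pat", "last_name": "Lee", "email": "pat.lee@example.org"},
    {"id": 2, "first_name": "pat", "last_name": "lee ", "email": "PLee@example.org"},
    {"id": 3, "first_name": "Patricia", "last_name": "Lee", "email": "Pat.Lee@example.org "},
    {"id": 4, "first_name": "Sam", "last_name": "Ortiz", "email": "sam@example.org"},
]

for reason, key, ids in find_potential_duplicates(records):
    print(reason, key, ids)
```

Note that the check only flags candidates. Records 1 and 3 share an email and records 1 and 2 share a name, but deciding whether they are actual duplicates, and merging them, is still the manual cleanup step described above.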

Duplicate records are a reality of life. But suffering with an overwhelming number of duplicates is a choice, and something you can fix, if you take the time to do so.
