Three thoughts on duplicate records
Duplicate records are a reality in a database of any size, so as database managers we always have to deal with them (or should deal with them!). Here are three thoughts on managing duplicate records:
- Duplicate records will always exist. Always. Unless you're managing only a tiny number of records, you will always have duplicates. So the goal isn't to eliminate duplicates, but to minimize their number.
- Find and fix any process that tends to create duplicate records. For example, I often hear from my clients, "Our website password recovery process doesn't work, or is too cumbersome, so customers just create new records in order to register for an event quickly." Whatever process you have that leads to duplicate records, fix that process.
- Consistently run a process for identifying potential duplicates, and clean them up. At least quarterly, run a query that helps you identify potential duplicate records, then take the time to clean up the records that are actual duplicates. And of course, clean up duplicate records as you find them in your day-to-day work. But seeking them out and fixing them consistently, over time, is the best way to minimize duplicate records.
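As a rough illustration of that quarterly check, here is a minimal Python sketch. It assumes your records can be exported as a list of dictionaries with `id`, `first_name`, `last_name`, and `email` fields. Those field names, the `find_potential_duplicates` helper, and the sample data are all hypothetical, and your own system's query tool may work quite differently. The sketch only flags records that share an email or a full name once case and spacing are ignored; a person still has to decide which ones are actual duplicates.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivially different values match."""
    return " ".join(value.lower().split())


def find_potential_duplicates(records):
    """Return (reason, key, [record ids]) tuples for groups worth reviewing.

    Hypothetical helper: the field names are assumptions about your export.
    """
    by_email = defaultdict(list)
    by_name = defaultdict(list)
    for rec in records:
        email = normalize(rec.get("email") or "")
        first = rec.get("first_name") or ""
        last = rec.get("last_name") or ""
        name = normalize(f"{first} {last}")
        if email:
            by_email[email].append(rec["id"])
        if name:
            by_name[name].append(rec["id"])

    candidates = []
    for reason, groups in (("email", by_email), ("name", by_name)):
        for key, ids in groups.items():
            if len(ids) > 1:  # only groups with more than one record
                candidates.append((reason, key, ids))
    return candidates


# Made-up sample data, for illustration only.
sample = [
    {"id": 1, "first_name": "Pat", "last_name": "Lee", "email": "pat.lee@example.org"},
    {"id": 2, "first_name": "pat", "last_name": "Lee ", "email": "plee@example.com"},
    {"id": 3, "first_name": "Patricia", "last_name": "Lee", "email": "Pat.Lee@example.org"},
    {"id": 4, "first_name": "Sam", "last_name": "Ortiz", "email": "sam@example.org"},
]

for reason, key, ids in find_potential_duplicates(sample):
    print(f"Possible duplicates by {reason} ({key}): records {ids}")
```

Matching on exact (normalized) email or name is deliberately conservative. It will miss nicknames and typos, and it will flag different people who happen to share a name, which is why the output is a review list rather than an automatic merge.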
Duplicate records are a reality of life. But suffering with an overwhelming number of duplicates is a choice, and something you can fix, if you take the time to do so.
