Why duplicate records are costly for your company - and what can be done about it

11th Aug 2015

Customer data is the lifeblood of most businesses – without it where would the marketing, customer service and finance departments be? And yet, there is a big data problem amongst many organisations in the UK. The issue is duplication. In other words, many of their customers appear more than once on a customer database.

It might not sound like a huge challenge but, at a time when most marketers are striving to improve the performance and efficiency of their campaigns, duplicate records have the potential to create real business issues. We find that organisations often only realise this is a problem when they install a CRM system such as Salesforce, and when we are asked to look into it for them we have found that some databases contain up to 30% duplicate records. So what impact can duplicate records have?

The problem with duplicates

Well, the first and most obvious issue is wastage of your marketing budget. If you send five of the same mailpack to one person, that’s like pouring money down the drain: you pay for duplicate print and postage, and you may also end up paying for unnecessary data storage. In addition, it has a negative effect on response rates and so on the overall ROI of your campaign.

Secondly, you can harm your brand’s reputation as a result of bombarding one customer with a multiplicity of mailings – it looks unprofessional and any green credentials a brand has fly straight out the window. Indeed, in analysis it released last year, Gartner quantified the impact of bad data management, identifying that annoying customers in this way can result in a 25% reduction in potential revenue gains. 

There’s also a problem around reporting, as duplicate records make a single customer view unfeasible. This makes it difficult to get a clear picture of your customers and their behaviour, which can lead to bad targeting and decision making. For example, if you can’t tie together one customer’s transactions then you lose the ability to talk relevantly to them in marketing communications or even at point of sale. Duplicates can create problems in other areas of the business too, such as customer service. If someone calls with an issue, it is much harder to resolve if you can’t identify who they are because the customer ID they ordered under is different to the one you find when you search for them on your database.

So what’s the cause?

The main root of the problem is human error. A customer may volunteer their information to your company twice, in slightly different ways. For example:

Tye George, CCR, Unit 4 Minton Distribution Park, London Rd, Amesbury, Wilts, SP4 7RT


Tye G, CCR Data Ltd, London Road, Amesbury, Wiltshire, SP4 7RT

These may look like subtle differences but they’re enough to put the same person on a database twice. Similarly, there may be a keystroke error when data is input manually. Another frequent cause is when a customer rings a call centre to place an order. If the telesales operator can’t find the customer record quickly, they often simply add a new record.

And what’s the remedy?

To counteract the problems outlined, the solution is a process called, unsurprisingly, data deduplication. This is a blend of human insight, data processing and algorithms that uses likelihood scores and common sense to identify where records look like a close match. The process involves analysing your database to identify potential duplicate records, then unravelling these to separate definite duplicates from cases where, for instance, different members of the same family live in the same house.
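As a rough illustration of what a likelihood score can look like, here is a minimal Python sketch that compares the two example records shown earlier. It is not a description of any particular vendor's method: the abbreviation table, the equal weighting of name and address, the rule that a different postcode rules a pair out, and the review threshold are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be built from your own data.
ABBREVIATIONS = {"rd": "road", "wilts": "wiltshire", "ltd": "limited"}


def normalise(text):
    """Lower-case, drop punctuation and expand a few common abbreviations."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def similarity(a, b):
    """Character-level similarity between 0.0 and 1.0 after normalising."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def match_score(record_a, record_b):
    """Likelihood-style score: 0.0 means no match, 1.0 means identical."""
    # Assumption: a different postcode rules the pair out entirely.
    if normalise(record_a["postcode"]) != normalise(record_b["postcode"]):
        return 0.0
    name = similarity(record_a["name"], record_b["name"])
    address = similarity(record_a["address"], record_b["address"])
    # Assumption: name and address carry equal weight.
    return 0.5 * name + 0.5 * address


first = {
    "name": "Tye George",
    "address": "CCR, Unit 4 Minton Distribution Park, London Rd, Amesbury, Wilts",
    "postcode": "SP4 7RT",
}
second = {
    "name": "Tye G",
    "address": "CCR Data Ltd, London Road, Amesbury, Wiltshire",
    "postcode": "SP4 7RT",
}

REVIEW_THRESHOLD = 0.5  # illustrative cut-off, to be tuned on real data
score = match_score(first, second)
needs_review = score >= REVIEW_THRESHOLD
print(f"score={score:.2f}, send to human review: {needs_review}")
```

A score like this only flags candidates. Deciding whether a flagged pair is one person or two family members at the same address is still a job for the human review described above.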

Deduplication rules also need to be implemented, based on your own unique data issues, in order to create a bespoke deduplication strategy. The rules should take into account your decisions about how ‘strict’ you want to be with your deduplication, in terms of maintaining the balance between losing valuable customer data and having a clean database. As part of this, you need to manually scan and review the data to check for any anomalies or duplicates that are obvious to the human eye.

You also need to decide how to manage the duplicates. Depending on factors such as the cost of sending a mailpack or eshot to the same individual twice, you may decide to flag and exclude duplicates from future campaigns, to remove them entirely, or to merge the key information from across all the records into one unique ‘master record’. If you decide on the latter, you will also need to decide which record to keep as the ‘master record’: for example, do you keep the first record created, the most recent one or the one with the highest number or value of purchases?

However you decide, the result will be a clean, manageable and efficient marketing database. 
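The master-record choices described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended implementation: the field names (`created`, `total_spend`, `email`, `phone`), the strategy labels and the sample records are all invented for the example.

```python
from datetime import date


def choose_master(duplicates, strategy="most_recent"):
    """Pick the record to keep from a group of confirmed duplicates."""
    if strategy == "first_created":
        return min(duplicates, key=lambda r: r["created"])
    if strategy == "most_recent":
        return max(duplicates, key=lambda r: r["created"])
    if strategy == "highest_value":
        return max(duplicates, key=lambda r: r["total_spend"])
    raise ValueError(f"unknown strategy: {strategy}")


def merge_into_master(duplicates, strategy="most_recent"):
    """Copy the master, then fill its empty fields from the other records."""
    master = dict(choose_master(duplicates, strategy))
    for record in duplicates:
        for field, value in record.items():
            if master.get(field) in (None, "") and value not in (None, ""):
                master[field] = value
    return master


# Invented sample data: two records for the same customer.
records = [
    {"id": 1, "created": date(2012, 3, 1), "email": "",
     "phone": "01234 567890", "total_spend": 120.0},
    {"id": 2, "created": date(2015, 6, 9), "email": "customer@example.com",
     "phone": "", "total_spend": 45.0},
]

merged = merge_into_master(records, strategy="most_recent")
print(merged["id"], merged["email"], merged["phone"])
```

Here the most recent record is kept as the master and picks up the phone number that only the older record held. Fields such as purchase totals would normally need their own rule (summing, for instance), which this sketch leaves out.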

Keep it clean

Once you have a clean database, it’s important to keep it that way and there are several things you can do. For a start, you should create more rigorous processes up front, such as stricter quality control on data capture or restrictions on who can create new data records. You also need to build duplicate management into your ongoing database strategy, assessing it regularly (and particularly prior to campaigns).
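One way to tighten data capture is to check for likely existing records before a new one is created. The Python sketch below assumes a simple in-memory index keyed on surname and postcode. The class name, the method names and the choice of key are hypothetical, and a real CRM would offer its own duplicate rules in place of this.

```python
import re


def capture_key(surname, postcode):
    """Build a lookup key that ignores case, spacing and punctuation."""
    def clean(text):
        return re.sub(r"[^a-z0-9]", "", text.lower())
    return (clean(surname), clean(postcode))


class CustomerIndex:
    """Hypothetical index consulted before a new record is added."""

    def __init__(self):
        self._ids_by_key = {}

    def add(self, customer_id, surname, postcode):
        key = capture_key(surname, postcode)
        self._ids_by_key.setdefault(key, []).append(customer_id)

    def possible_matches(self, surname, postcode):
        """Return IDs of existing customers sharing surname and postcode."""
        return list(self._ids_by_key.get(capture_key(surname, postcode), []))


index = CustomerIndex()
index.add(101, "Tye", "SP4 7RT")

# An operator keys the details slightly differently on a later call.
matches = index.possible_matches("TYE", "sp47rt")
if matches:
    print("Possible existing customer(s):", matches)
else:
    print("No match found, safe to create a new record")
```

Showing the operator possible matches at this point addresses the call-centre case mentioned earlier, where a new record is added simply because the existing one could not be found quickly.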

Obviously it costs money to deduplicate your database, but it’s worth the investment as the costs are likely to be significantly lower than the money (and brand reputation) that you will lose otherwise. For example, assuming a mailpack cost of 45p, if you are doing a mailing to 100,000 customers of which 20% are duplicates, deduplication will save you £9,000.
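The arithmetic behind that figure is easy to rerun with your own numbers. The short Python sketch below works in whole pence to avoid rounding surprises; the function name is just an illustration.

```python
def wasted_spend_pence(recipients, duplicate_percent, pack_cost_pence):
    """Cost in pence of mailpacks sent to duplicate records."""
    duplicates = recipients * duplicate_percent // 100
    return duplicates * pack_cost_pence


# Figures from the example above: 100,000 customers, 20% duplicates, 45p a pack.
saving_pence = wasted_spend_pence(100_000, 20, 45)
print(f"£{saving_pence / 100:,.0f}")  # prints £9,000
```

That is 20,000 duplicate mailpacks at 45p each, per mailing, so the saving repeats with every campaign sent to the same list.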

George Tye is client services director at CCR.

Replies (1)


By MarcioWilges
13th Sep 2017 04:21

Although it might seem like a good idea to have a number of copies of the same information scattered around your networks, at the storage prices companies are charging these days, it might be a better idea to streamline everything instead. You can probably hire IT people to do that for you and save yourself the trouble if you're willing to pay for the service!
