Any Answers
Cleaning up data
Posted by D Matthews on Fri, 19/06/2009 - 14:00
I've read a bit about cleaning up customer data but can anyone recommend any good products/processes?
Thanks, Dan


Cleaning up data
Hi
There are essentially two aspects to enhancing data quality.
1. Improving the completeness and accuracy of the data.
2. De-duplication of multiple records.
Some typical examples of incomplete or inaccurate data are:
- missing address fields
- incorrect spelling of cities and countries
- inconsistent formatting (e.g. UK, U.K., United Kingdom, England etc.)
- incorrect parsing (e.g. the postcode is in the city field)
- email addresses missing the @ or ending in a full stop
...and many others!
Unless you have really large data volumes, there's a great deal you can do in Excel to fix these types of problems by sorting the data in different ways. For example, by sorting the data by the City field you can identify obvious misspellings, fill in the country field etc. Sort email addresses first by the first character and then by the last character to find and remove stray punctuation marks (a full stop at the end of an email address is a common problem). Use Excel functions to identify email addresses missing the @. Sort the data by postcode so that misplaced values stand out, then cut and paste the city names and postcodes into the correct fields. I'm sure you get the idea.
Do this first, before you turn your attention to duplicates.
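If the data is too large for Excel, or you want the checks to be repeatable, the same ideas can be scripted. The following is only a minimal sketch in Python: the function names and the mapping of country spellings are assumptions made for illustration (including the choice to treat "England" as "United Kingdom"), so adapt them to whatever variants actually appear in your data.

```python
# Hypothetical mapping of country spellings found in the data to one
# canonical form. Extend it with the variants you actually see.
COUNTRY_VARIANTS = {
    "uk": "United Kingdom",
    "u.k.": "United Kingdom",
    "united kingdom": "United Kingdom",
    "unitied kingdom": "United Kingdom",
    "england": "United Kingdom",  # assumption: you want one UK-wide value
}


def normalise_country(value):
    """Return the canonical country name, or the trimmed input if unknown."""
    trimmed = value.strip()
    return COUNTRY_VARIANTS.get(trimmed.lower(), trimmed)


def email_problems(email):
    """List the obvious problems with an email address (empty list if none)."""
    problems = []
    cleaned = email.strip()
    if "@" not in cleaned:
        problems.append("missing @")
    if cleaned and cleaned[-1] in ".,;":
        problems.append("trailing punctuation")
    return problems
```

Running `email_problems` over the email column and reviewing every row that returns a non-empty list is the scripted equivalent of sorting by the last character in Excel. It only catches the two problems mentioned above and does not check that an address is otherwise valid.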
De-duplication work depends to an extent on how your data is currently stored. Many specialist CRM companies (ourselves included) have developed code that allows duplicates to be easily identified when the data is loaded into an SQL database.
Similarly, we work extensively with salesforce.com and have found the DemandTools application to be an excellent way of de-duplicating any type of record. The tool allows you to define the parameters that constitute a duplicate, including the application of fuzzy logic to identify potential duplicates for further investigation. You can find out more about DemandTools here:
http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016b...
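To give a feel for what fuzzy matching means in practice, here is a rough sketch in Python using the standard-library `difflib` module. It is not how DemandTools or any particular CRM product works; the field names (`name`, `postcode`), the function names and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Assumed fields that together define a duplicate; change to suit your data.
MATCH_FIELDS = ("name", "postcode")


def record_key(record):
    """Build a lower-cased comparison string from the matching fields."""
    return " ".join(str(record.get(f, "")).strip().lower() for f in MATCH_FIELDS)


def similarity(a, b):
    """Similarity between two records, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, record_key(a), record_key(b)).ratio()


def potential_duplicates(records, threshold=0.85):
    """Return (index_a, index_b, score) for each pair scoring at or above threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


records = [
    {"name": "Dan Matthews", "postcode": "AB1 2CD"},
    {"name": "Dan Mathews", "postcode": "AB1 2CD"},
    {"name": "Gary Smith", "postcode": "ZZ9 9ZZ"},
]
for i, j, score in potential_duplicates(records):
    print(records[i]["name"], "<->", records[j]["name"], score)
```

The pairs it returns are candidates for a person to review, not records to merge automatically. Comparing every pair like this becomes slow on large lists, so for anything beyond a few thousand records you would normally compare only records that share something cheap to check, such as the same postcode.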
Kind Regards
Gary
www.2020management.com
Free Audits
Dan
Most data suppliers offer some kind of support in this regard. Royal Mail offers a free audit of your data through its Clear Prospects website for databases under 5,000 records. Once registered, you can get a view of the accuracy and overall health of the information you hold, so that you understand what is good and not so good in your data.
www.royalmail.com/clearprospects
It's a great first step and easy to do. For larger databases they offer a similar service through their data services team on 08456 000 098.
They will support this audit with recommendations on how to clean your data, define your customer profile and build your prospect pool.
Hope this is useful