Dirty data negatively affects workflows, marketing efforts, and your customers’ experience. It can even get you into legal trouble.
What is Dirty Data?
How data gets dirty
Duplicate data refers to records that partially or fully share the same information. Duplicates arise when the same information is entered multiple times, sometimes in different formats. A typical example is one customer who exists in your CRM multiple times. This often happens because the customer’s name is written slightly differently each time (Ellie H. Rhodes, Ellie Hannah Rhodes, Eleanor H. Rhodes, Eleanor Hannah Rhodes).
Because customer information is scattered across different records, duplicate customer data leads to:
- Poor customer service
- Incorrect tracking and reporting
- Duplicate marketing targeting
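As a rough illustration of how such duplicates might be surfaced, the sketch below groups customer records by a normalised email address. It assumes each record carries an `email` field and that a shared email means a shared customer. The function name `find_duplicates`, the email addresses, and the third customer are hypothetical, chosen only for the example.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group records that share the same normalised email address."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: trimming whitespace and lowercasing is enough to normalise an email.
        key = record["email"].strip().lower()
        groups[key].append(record)
    # Keep only the emails that appear on more than one record.
    return {email: recs for email, recs in groups.items() if len(recs) > 1}


# Hypothetical sample data: two spellings of one customer plus an unrelated one.
customers = [
    {"name": "Ellie H. Rhodes", "email": "ellie.rhodes@example.com"},
    {"name": "Eleanor Hannah Rhodes", "email": " Ellie.Rhodes@Example.com "},
    {"name": "Sam Ortiz", "email": "sam.ortiz@example.com"},
]

duplicates = find_duplicates(customers)
```

Matching on email sidesteps the name variations, but it will miss duplicates where the same person signed up with different addresses, so real deduplication usually needs more than one matching key.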
Data that is not encrypted or access-controlled is considered insecure. This means that it can be accessed by anyone within your company and, in some cases, even by third parties. Insecure data poses a risk to privacy and can also result in legal issues, as companies may be non-compliant with laws such as GDPR and CCPA.
Incomplete data is missing information that a record should contain. For example, if your newsletter sign-up form has a field for the lead’s first name but the field isn’t required, leads can sign up without leaving their name, which makes your personalised email campaigns less effective.
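One way to catch incomplete sign-ups is to check each submission against a list of required fields before it is saved. This is a minimal sketch, assuming submissions arrive as dictionaries. The function name `missing_required_fields` and the field names `email` and `first_name` are hypothetical.

```python
def missing_required_fields(submission, required=("email", "first_name")):
    """Return the required fields that are absent or blank in a form submission."""
    missing = []
    for field in required:
        # Treat a missing key, None, and whitespace-only text as "not provided".
        value = str(submission.get(field) or "").strip()
        if not value:
            missing.append(field)
    return missing


# Hypothetical submission in which the first name was left blank.
problems = missing_required_fields({"email": "lead@example.com", "first_name": "  "})
```

A form handler could reject the submission, or prompt the lead again, whenever the returned list is not empty.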
Inaccurate data is data that has errors or mistakes. For instance, if a customer makes a typo while entering their last name on one of your forms, the last name you have in your records is inaccurate. This is considered a dirty record.
Outdated data is inaccurate not because it was entered incorrectly but because it used to be accurate and no longer is. For instance, your CRM may still show a customer's old address after they have moved.
Other examples of outdated data are:
- Phone numbers or email addresses that are no longer in use
- Titles of people who have since switched jobs
- Out-of-date email segments
Incorrect data refers to data that does not meet previously defined parameters. It is easier to prevent than to correct. For instance, if customers enter their birthdate through dropdown menus, the system only lets them choose one of 12 months and one of 31 days, and it does not let them select a birth year that would make them older than 120 years.
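The same rules can also be enforced in code when a birthdate arrives from a source other than a dropdown, such as an import file. The sketch below is one possible check, not a definitive implementation. The function name `is_valid_birthdate` and the `today` parameter are hypothetical, and the 120-year limit is taken from the example above.

```python
from datetime import date

MAX_AGE_YEARS = 120  # upper bound from the dropdown example


def is_valid_birthdate(year, month, day, today=None):
    """Check that a birthdate is a real calendar date within the allowed age range."""
    today = today or date.today()
    try:
        # date() rejects impossible combinations such as 30 February.
        born = date(year, month, day)
    except ValueError:
        return False
    if born > today:
        return False  # a birthdate cannot lie in the future
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    return age <= MAX_AGE_YEARS
```

Passing `today` explicitly keeps the check reproducible in tests. Note that this goes a step further than separate month and day dropdowns, which would still accept a date like 30 February.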
Inconsistent data, or data redundancy, arises when companies store the same information in different places without syncing it. For example, a company may keep customer information in both its CRM and its email marketing tool, so an update made in one system never reaches the other.
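A simple consistency check compares the records that two systems hold for the same customer. The sketch below assumes both systems can export records keyed by a shared customer ID. The function name `find_mismatches`, the IDs, and the field names are hypothetical.

```python
def find_mismatches(crm, email_tool, fields=("email", "first_name")):
    """Return, per shared customer ID, the fields whose values differ between systems."""
    mismatches = {}
    # Only customers present in both systems can be compared.
    for customer_id in crm.keys() & email_tool.keys():
        differing = [
            field
            for field in fields
            if crm[customer_id].get(field) != email_tool[customer_id].get(field)
        ]
        if differing:
            mismatches[customer_id] = differing
    return mismatches


# Hypothetical exports: customer c1 changed their email in the CRM only.
crm_records = {
    "c1": {"email": "new@example.com", "first_name": "Ellie"},
    "c2": {"email": "sam@example.com", "first_name": "Sam"},
}
email_tool_records = {
    "c1": {"email": "old@example.com", "first_name": "Ellie"},
    "c2": {"email": "sam@example.com", "first_name": "Sam"},
}

out_of_sync = find_mismatches(crm_records, email_tool_records)
```

A report like this only detects drift. Deciding which system holds the correct value is a separate question, and it is usually answered by naming one system as the source of truth.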
To improve data quality and prevent dirty data, organisations should adopt practices that ensure their data is complete, valid, consistent, and correct.