Organizations can struggle with data quality in virtually any system, and duplicate management is a large piece of that puzzle. Clean, accurate information builds confidence, increases user adoption, and creates opportunities for deeper organizational insights. This holds true in Salesforce, where a core ‘superpower’ of the CRM is automation and process management. With emerging technologies like Salesforce Communities becoming more accessible to small and medium-sized organizations, data cleanliness is as critical as ever. The opportunity to gather and share key information with constituents depends on a strong foundation of accurate data. In this post, we will discuss how to avoid duplicate information and how to leverage duplicate management tools to ensure data quality.
Stop Duplicates Before They Start
While there are plenty of tools to identify, prevent and manage duplicates, organizations should leverage Salesforce settings as a basis for their data management strategy. Salesforce provides granular tools for determining object, record and field-level access. Admins who understand Salesforce sharing settings can ensure users – particularly those entering high volumes of data – have the appropriate level of access in the system. Users cannot prevent duplicates or ensure clean data if they do not have appropriate access to information.
A core strength of Salesforce’s ecosystem is its AppExchange and the third-party applications that integrate with and extend the platform’s capabilities. Integrations should be managed carefully, as they are also a primary source of duplication and data quality issues. Since records are created and updated automatically, pre-defined rules usually determine how these records come into or out of Salesforce. These rules are not always able to identify duplicates, especially in highly customized Salesforce solutions. Creating a shared understanding of outside integrations, such as donor management software and email marketing tools, is integral to the quality of an organization’s data. Understanding what those integrations do and how they work with data in Salesforce can help determine which deduplication tools will work best and how they should be configured.
To stop duplicates before they start, admins should set expectations and goals around data cleanliness. First, understand and communicate that most information systems have some level of duplication and that, as a result, data cleanliness must be actively managed. Then, work with organizational stakeholders to put a plan in place.
- Establish reasonable goals and timelines. While tools can drastically speed this process up, admins must first understand what data is needed to identify duplicates, how data gets into and out of Salesforce and what a clean system-of-record should look like. These are important first steps to selecting the right tool(s) for the job.
- Ensure organizational resources are leveraged appropriately. Data cleanliness requires staff time, resources, tools and active monitoring. Work with leadership to establish goals and communicate the time and resources needed to achieve data quality priorities.
- Understand how duplicates should be identified. Decide which fields should be checked when looking for duplicates. Ideally these are unique identifiers, but combinations of fields should also be considered; a compound key can be particularly helpful for systems that include individuals, such as minors, who do not have unique emails or phone numbers (see the sketch after this list).
- Consider applications and customization. Ask yourself if there are Salesforce apps or customizations in place that change unique identifier fields. For example, NPSP adds email and phone fields that may help identify unique records. Put a plan in place that ensures integrations are checking for existing records when importing data.
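To make the ‘combination of fields’ idea concrete, the Apex sketch below groups Contacts by a compound key built from last name and a digits-only phone number and flags any key that appears more than once. It is a minimal illustration, assuming standard Contact fields and a data set small enough to query in a single transaction; a matching rule or dedicated tool would handle fuzzier comparisons.

```apex
// Minimal sketch: find potential duplicate Contacts using a compound key
// (last name + digits-only phone) rather than relying on email alone.
// Assumes standard Contact fields and a small data set.
Map<String, List<Contact>> contactsByKey = new Map<String, List<Contact>>();

for (Contact c : [SELECT Id, LastName, Phone, Email
                  FROM Contact
                  WHERE Phone != null
                  LIMIT 10000]) {
    String key = c.LastName.trim().toLowerCase() + '|' +
                 c.Phone.replaceAll('[^0-9]', '');
    if (!contactsByKey.containsKey(key)) {
        contactsByKey.put(key, new List<Contact>());
    }
    contactsByKey.get(key).add(c);
}

for (String key : contactsByKey.keySet()) {
    if (contactsByKey.get(key).size() > 1) {
        // These records share a last name and phone number and deserve review.
        System.debug('Potential duplicates for ' + key + ': ' + contactsByKey.get(key));
    }
}
```

The same thinking applies to integrations: importing with an upsert keyed to an external Id field, rather than a plain insert, forces incoming records to match existing ones before new rows are created.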
Staff and other users responsible for entering or managing data in Salesforce should have the tools they need to identify and rectify duplicates. This might be as simple as teaching users to search for existing records before entering a new contact. Another option is to create a custom field that allows users to flag potential duplicates and indicate which record is the master, as in the sketch below.
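For example, a lightweight ‘search before you create’ check might look like the following Apex sketch. The checkbox field Potential_Duplicate__c is hypothetical; substitute whatever flag field your org uses to queue records for review.

```apex
// Sketch of a "search before you create" pattern. Potential_Duplicate__c is
// a hypothetical custom checkbox used to flag records for later review.
String newEmail = 'pat.jones@example.org';

List<Contact> existing = [SELECT Id, Name, Email
                          FROM Contact
                          WHERE Email = :newEmail
                          LIMIT 5];

if (existing.isEmpty()) {
    // No match found, so it is safe to create the new record.
    insert new Contact(FirstName = 'Pat', LastName = 'Jones', Email = newEmail);
} else {
    // A likely match already exists; flag it for review instead of inserting a copy.
    existing[0].Potential_Duplicate__c = true;  // hypothetical custom field
    update existing[0];
}
```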
A Time for Tools
Even after permissions have been established, integrations have been mapped, goals have been outlined, data entry rules have been set, and users have been trained, additional tools may be needed. Salesforce provides several features targeted at preventing and identifying duplicates, in addition to third-party applications that allow for the mass identification and merging of duplicates. These tools can be helpful for addressing legacy data or even for doing periodic ‘sweeps’ of the information.
Salesforce comes with native duplicate management tools that are best used for identifying and preventing duplicates. They also allow for one-off merging of some standard records. The standard Salesforce duplicate management tools include:
- Matching rules. Define the criteria Salesforce uses to decide whether two records match. Matching rules power duplicate rules at the point of entry and can also be run against existing data using Duplicate Jobs.
- Duplicate rules. Determine what happens when a user is about to enter a duplicate. Currently, Salesforce provides options to prevent a record from being entered, alert the user and/or report the duplicate after allowing it to be entered. Reporting duplicates can be helpful if there are integrations creating or modifying data in Salesforce.
- Merging one-offs. Salesforce provides an interface for merging some standard objects (accounts, contacts, and leads) on a one-off basis. This can be helpful if specific users need to merge duplicates on the fly or if there is a low volume of data that requires merge attention. A programmatic equivalent is sketched after this list.
- Duplicate jobs. If integrations or users are importing large volumes of data into Salesforce, it can be helpful to do periodic ‘sweeps’ to identify and address duplicates. Duplicate Jobs allow administrators to identify duplicates in bulk. Note: Duplicate jobs are only available to Performance and Unlimited Salesforce users.
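For admins comfortable with Apex, these native features are also exposed programmatically. The sketch below is a minimal illustration rather than a complete dedupe routine: Datacloud.FindDuplicates runs the org’s active duplicate rules against a record before it is saved, and Database.merge performs the kind of one-off merge described above. The sample email address and record selection are assumptions for demonstration only.

```apex
// Sketch: run the org's active duplicate rules against an unsaved Contact,
// then perform a one-off merge of two confirmed duplicates.
Contact candidate = new Contact(FirstName = 'Pat', LastName = 'Jones',
                                Email = 'pat.jones@example.org');

// FindDuplicates evaluates active duplicate rules (and their matching rules)
// without inserting the record.
List<Datacloud.FindDuplicatesResult> results =
    Datacloud.FindDuplicates.findDuplicates(new List<SObject>{ candidate });

for (Datacloud.FindDuplicatesResult findResult : results) {
    for (Datacloud.DuplicateResult dupResult : findResult.getDuplicateResults()) {
        for (Datacloud.MatchResult matchResult : dupResult.getMatchResults()) {
            for (Datacloud.MatchRecord match : matchResult.getMatchRecords()) {
                System.debug('Existing match: ' + match.getRecord().Id);
            }
        }
    }
}

// One-off merge: the older record survives and the newer one is folded into it,
// keeping related records (activities, opportunities, etc.) attached.
List<Contact> duplicates = [SELECT Id FROM Contact
                            WHERE Email = 'pat.jones@example.org'
                            ORDER BY CreatedDate
                            LIMIT 2];
if (duplicates.size() == 2) {
    Database.MergeResult mergeResult = Database.merge(duplicates[0], duplicates[1].Id);
    System.debug('Merge succeeded? ' + mergeResult.isSuccess());
}
```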
Salesforce Duplicate Management is included with Salesforce; however, Duplicate Jobs are limited to certain license types, and the native tools have other limitations around merging duplicates. Additionally, features like auto-merge are not available, and advanced capabilities like scheduled jobs require elevated licenses.
Duplicate Check is a third-party tool that provides a native Salesforce experience as well as advanced duplicate management capabilities. It can be used to identify and prevent duplicates one record at a time or in bulk through scheduled jobs, and it can automatically merge duplicates based on user-defined criteria.
- Duplicate Check Local. When deduplicating larger data volumes, you can leverage the processing power of your own device and avoid web browser or other limitations by processing jobs locally.
- Duplicate automation. This feature reviews information that enters Salesforce through integrations. Using merge rules, it attempts to clean the data before it is ever saved to Salesforce. Additionally, these rules can be leveraged to automatically merge thousands of found duplicates.
- Duplicate prevention. Much like the Salesforce feature, duplicate prevention warns users if they are about to create a duplicate record based on user-defined criteria.
- Cross-object duplicate checking. This allows administrators to define fields for finding duplicates across objects, including custom ones.
Duplicate Check has free and paid pricing tiers and offers discounts for nonprofit organizations. Some features, such as Duplicate Check Local or access to code-based integrations, require higher-tier licenses. Duplicate Check is a full-featured solution that offers valuable automation and other premium deduplication features, such as API integrations and local applications.
Summary + Conclusion
Duplicate management is not a one-and-done project; it is an initiative that requires ongoing management and organizational resources. Just as Salesforce should be seen as a strategic organizational investment, data quality should be seen as a core component of that investment. Salesforce administrators and database managers should work toward creating shared ownership of data quality. This requires implementing a data management strategy and finding the right tools to create a strong foundation and ongoing support for data quality goals.