Entity resolution and knowledge graph: A powerful duo for faster and clearer contextual insights
Entity resolution and knowledge graphs reinforce one another and, when combined, offer significant advantages for data management and analysis. Entity resolution, also known as data matching, consolidates disparate data sources by identifying and linking records that refer to the same entity, which reduces redundancy and facilitates data integration. Knowledge graphs, on the other hand, provide a structured representation of knowledge, organizing entities and their relationships in a semantically rich format.
In this article, we’ll dive deep into the mechanics of integrating entity resolution with knowledge graphs. We will explore how this integration not only improves data accuracy and clarity but also significantly speeds up the process of deriving actionable contextual insights. Additionally, we'll highlight how advanced technologies such as Neo4j, Senzing and Linkurious make the creation and utilization of Entity Resolved Knowledge Graphs (ERKG) more efficient and effective.
If the data you’re dealing with comes from a single structured data source, there’s probably a good chance that the entities within that data source (e.g. people, companies, products) have been somewhat deduplicated. If you’re dealing with unstructured data or trying to bring together different data sources, the situation may be different. How can you identify that the company called “Acme Inc” within your internal supplier database is the same as the “Acme Inc” from an external database, for example? A simple strategy of importing each dataset within a knowledge graph would result in very messy data that would limit the ability of analysts or investigators to easily and quickly derive key insights.
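To make the “Acme Inc” question concrete, here is a minimal sketch of a naive matching rule in Python. It is a toy illustration only, not how Senzing or any other entity resolution product works: the suffix list, the field names (`name`, `postal_code`) and the sample records are all assumptions made up for this example.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co"}


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def same_entity(a: dict, b: dict) -> bool:
    """Toy rule: normalized names are equal and postal codes agree."""
    return (
        normalize_name(a["name"]) == normalize_name(b["name"])
        and a["postal_code"] == b["postal_code"]
    )


# Made-up records standing in for an internal and an external database.
internal = {"name": "Acme Inc", "postal_code": "89109"}
external = {"name": "ACME, Inc.", "postal_code": "89109"}
other = {"name": "Acme Inc", "postal_code": "10001"}

print(same_entity(internal, external))
print(same_entity(internal, other))
```

A rule this simple breaks down quickly on real data (abbreviations, typos, branch offices, missing fields), which is why a dedicated entity resolution engine is usually a better fit than hand-written matching logic.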
Applying entity resolution technology when building a knowledge graph reconciles disparate data sources. The resulting knowledge graph is more accurate and can lead to deeper, context-based insights and to faster, more effective decision-making.
Paco Nathan, Managing Partner at Derwen.ai, has written a tutorial that showcases the use of Senzing, an entity resolution software, and Neo4j, a leading graph database provider, to construct an entity resolved knowledge graph.
The project consists of building a knowledge graph with 85K records across 3 data sources related to businesses in Las Vegas:
- SafeGraph: data about locations in Las Vegas;
- US Department of Labor: data related to reporting around wage and hour compliance;
- US Small Business Administration: data related to the Paycheck Protection Program (PPP) Loans over $150K.
The code is simple to download and easy to follow and can be tested in your own environment. It’s a great way to get hands-on experience around combining entity resolution techniques and knowledge graph construction.
In this section, we will use the knowledge graph of the Las Vegas businesses. It has been built based on Paco Nathan’s tutorial that leverages entity resolution to bring 3 data sources into a unified knowledge graph. We will use this knowledge graph to show 1) the value of entity resolution and 2) how Linkurious Enterprise helps analyze the data.
Here’s a screenshot showing 9 distinct records coming from SafeGraph (1 record in green) and the US Department of Labor (8 records in purple). Without entity resolution, the knowledge graph doesn’t show any interconnection between the different data points.
Senzing’s entity resolution technology identifies that these records refer to the same entity. In Linkurious Enterprise, this appears as an MGM Grand Hotel node that connects the 9 otherwise unconnected records via the “RESOLVES” relationship.
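The effect of the resolved entity node can be sketched in a few lines of plain Python by counting connected components before and after the “RESOLVES” links are added. The record identifiers and the entity identifier below are hypothetical placeholders, and the relationships are treated as undirected for simplicity; the real graph lives in Neo4j.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical identifiers: 1 SafeGraph record and 8 Department of Labor records.
records = ["safegraph:1"] + [f"dol:{i}" for i in range(1, 9)]

# Without entity resolution there are no links between the records.
before = connected_components(records, [])

# With entity resolution, one entity node is linked to every record.
entity = "entity:mgm-grand-hotel"
resolves_edges = [(entity, record) for record in records]
after = connected_components(records + [entity], resolves_edges)

print(len(before), len(after))
```

Before resolution each record is its own island; afterwards all 9 records and the entity node form a single connected subgraph, which is what the visualization makes visible.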
When looking at a significant sample of records, these interconnected records are easy to spot in Linkurious.
It’s possible to run graph algorithms on the entity resolved knowledge graph to generate further key insights. Because records that refer to the same entity are linked together, the results of those algorithms are more reliable than they would be on the unresolved records.
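As a rough illustration of why resolution matters for graph algorithms, the sketch below ranks nodes by degree (a very simple centrality measure) first on raw records and then on resolved entities. All identifiers, links and the record-to-entity mapping are invented for the example; they are not taken from the Las Vegas dataset.

```python
from collections import Counter

# Hypothetical links between business records and related items
# (for example, a record and a compliance case).
links = [
    ("dol:1", "case:A"),
    ("dol:2", "case:B"),
    ("dol:3", "case:C"),
    ("dol:9", "case:D"),
    ("dol:9", "case:E"),
]

# Hypothetical output of entity resolution: record -> resolved entity.
resolved = {
    "dol:1": "entity:X",
    "dol:2": "entity:X",
    "dol:3": "entity:X",
    "dol:9": "entity:Y",
}


def degree_ranking(pairs):
    """Rank source nodes by how many links they have, highest first."""
    return Counter(source for source, _ in pairs).most_common()


raw_ranking = degree_ranking(links)
resolved_ranking = degree_ranking([(resolved[r], item) for r, item in links])

print(raw_ranking[0])
print(resolved_ranking[0])
```

On the raw records, the links of one business are split across three separate nodes, so a different record ranks first. Once the records are rolled up to their resolved entities, the ranking reflects the business as a whole.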
Finally, we can leverage Linkurious’s alerts to identify subgraphs that are worth inspecting.
Entity resolution is essential for consolidating diverse data sources into a high-quality knowledge graph. By integrating best-of-breed technologies such as Neo4j, Senzing, and Linkurious, you can significantly reduce implementation time and enhance the value delivered to data professionals and business teams. A robust entity resolution process surfaces connections that would otherwise remain hidden, which in turn improves the quality of investigations and analytics. Graph visualization and analytics tools like Linkurious give analysts an easy way to make sense of their entity resolved knowledge graph, ultimately boosting productivity and improving outcomes.
Want to learn more about entity resolution and graphs? Read the white paper "Combining entity resolution and graph technology for cost-effective, advanced decision intelligence."