Supercharging investigations with unstructured data analysis through NLP and graph technology

May 20, 2024

In complex investigations involving networks of data, fragmented or partial views leave investigators with blind spots. Whether they are working in cybersecurity, law enforcement, or intelligence, investigators need to understand the full context around their entities of interest - the relationships, activities, flows of money, communications, and more. 

Some of the most valuable investigative data is unstructured. Emails, documents, social media, instant messages, and other forms of unstructured data hold key insights - but sorting through these massive haystacks to find the critical needles remains a monumental challenge for many organizations.

Analyzing unstructured data and giving it context requires de-siloing data across different systems to gain a unified and complete understanding of these essential connections. It also requires technology that gives investigators the ability to see deep into their data quickly and at scale, giving them faster, easier ways to identify suspicious activity.

Leveraging advanced technologies like natural language processing (NLP) in tandem with graph analytics and visualization lets investigators unlock their unstructured data, while lending power and speed to their investigations. 

Why analyze unstructured data?

The vast majority of the data most of us access and work with is unstructured. This means it has no regular format: text documents, images, videos, social media posts, and so on. IDC estimates that as much as 93% of all data generated is unstructured.

Despite this, most organizations focus their data analytics efforts on structured information (e.g., numerical, categorical, date/time, or geospatial data), which adheres to predefined formats and is relatively easy to analyze and interpret. This means most of the world’s data remains largely untapped.

But trapped inside this data is a treasure trove of hidden insights that could help solve a wide range of investigative challenges. The ability to leverage unstructured data within high-stakes, time-sensitive investigations involving national security or public safety can make all the difference when getting to the bottom of a case.

So why isn’t everyone analyzing their unstructured data for use in their investigations? It’s because doing so comes with many inherent challenges. 

[Figure: Unstructured data key figures]

The challenges of unstructured data analysis

Making sense of unstructured data is costly, complicated, and time-consuming. Interpreting and analyzing it is a major challenge because of its many complexities: ongoing exponential growth, variety and heterogeneity, and the nuance of human language, to name just a few.

Additionally, processing and analyzing unstructured data often requires significant computational resources, advanced algorithms, highly skilled staff, and large volumes of labeled data. While large, well-capitalized businesses can throw money and bodies at the problem, most organizations simply choose to avoid the cost and complexity and optimize their efforts around what they know well: structured data.

But by focusing only on structured data, investigators risk missing essential information, leaving them one step behind bad actors. 

[Figure: Examples of unstructured data formats]

Combining technologies for unstructured data analysis and investigation

Given the stakes at play and the unceasing explosion of unstructured data, now is the time for investigators to embrace new thinking and emerging technologies to address this issue head-on. In our recent white paper, we put forth an approach that blends natural language processing from Nuix, graph analytics from Memgraph, and curated knowledge graph visualization from Linkurious. Together, these technologies quickly transform unstructured data into actionable insights that dramatically accelerate investigations.

Making sense of unstructured data with NLP and graph technology

The result of combining these technologies from Nuix, Linkurious, and Memgraph is a unique joint solution that facilitates and accelerates complex investigations for organizations requiring link analysis of their siloed, unstructured data. 

By integrating advanced NLP and AI with powerful and user-friendly graph technology, the joint solution, called Nuix NLP AI, empowers data-driven investigators to unlock the power of link analysis on complex connected data without the need for extensive resources or highly specialized expertise. This solution is opening new opportunities for teams to easily derive actionable insights from myriad sources of unstructured data and drive more informed decisions across use cases and industries.

One of the common problems organizations face when leveraging graph analytics across unstructured data is the “hairball” problem. When unstructured data is not properly tamed, the graph becomes polluted with false positives and irrelevant information. The number of nodes and edges explodes, and the graph becomes too dense to read or use. The joint solution’s complementary technologies overcome this issue through data enrichment, scoring, minimization, and organization (data structuring), ensuring that the graph data has clear and relevant field definitions.
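To make the hairball problem concrete, here is a minimal Python sketch of scoring and minimization applied before a graph is built. It is an illustration only, not the joint solution's implementation: the sample entities, their confidence scores, the `build_edges` helper, and the thresholds are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: entities found in each document, each with an
# extraction confidence score between 0 and 1 (made-up values).
documents = [
    [("Alice Smith", 0.95), ("Acme Ltd", 0.90), ("Monday", 0.30)],
    [("Alice Smith", 0.92), ("Bob Jones", 0.88), ("Monday", 0.25)],
    [("Bob Jones", 0.91), ("Acme Ltd", 0.85), ("Alice Smith", 0.90)],
]


def build_edges(docs, min_score=0.0, min_weight=1):
    """Link entities that appear in the same document.

    Entities scoring below min_score are dropped before linking, and
    edges observed in fewer than min_weight documents are discarded.
    """
    weights = Counter()
    for doc in docs:
        kept = sorted({name for name, score in doc if score >= min_score})
        for a, b in combinations(kept, 2):
            weights[(a, b)] += 1
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Without filtering, the low-confidence "Monday" entity links to everything.
raw_edges = build_edges(documents)

# With scoring and minimization, only well-supported relationships remain.
pruned_edges = build_edges(documents, min_score=0.8, min_weight=2)

print(len(raw_edges), "edges before pruning")
print(len(pruned_edges), "edges after pruning")
```

Even on three tiny documents, one noisy entity adds several spurious edges. At investigative scale, the same effect is what turns a graph into a hairball, which is why filtering happens before the data reaches the graph.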

[Figure: Linkurious, Nuix, and Memgraph's integrated link analysis solution]

In summary:

  • Nuix AI converts mountains of messy, unstructured data into highly searchable, well-structured data.
  • Memgraph seamlessly stores and analyzes this structured data and its myriad relationships in a scalable, high-performance graph structure.
  • Linkurious enables dynamic, interactive exploration of those relationships, offering meaningful insights into the data that are impossible to achieve with other solutions. 

Put together and enhanced with advanced alerting from Linkurious, the joint solution offers something that goes beyond the capabilities of other solutions on the market. 
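As a rough illustration of the three stages summarized above (structure the text, store the relationships, explore the connections), the following Python sketch uses only the standard library. It does not use or represent the Nuix, Memgraph, or Linkurious APIs: a simple regular expression stands in for NLP entity extraction, an in-memory dictionary stands in for the graph database, and a breadth-first search stands in for interactive exploration. The sample messages and all function names are hypothetical.

```python
import re
from collections import defaultdict, deque

# Stand-in for NLP entity extraction: a simplified email address pattern.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical unstructured input.
messages = [
    "From alice@example.com: please forward the invoice to bob@example.com today.",
    "bob@example.com confirmed the transfer with carol@example.org.",
    "Reminder from dave@example.net about the team lunch.",
]


def extract_entities(text):
    """Stage 1: turn free text into a structured list of entities."""
    return sorted(set(EMAIL.findall(text)))


def build_graph(texts):
    """Stage 2: store entities as nodes and co-mentions as undirected edges."""
    graph = defaultdict(set)
    for text in texts:
        entities = extract_entities(text)
        for entity in entities:
            graph[entity].update(e for e in entities if e != entity)
    return graph


def within_hops(graph, start, max_hops):
    """Stage 3: find every entity reachable from start within max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]
    return distances


graph = build_graph(messages)
print(within_hops(graph, "alice@example.com", 2))
```

In this toy data, alice@example.com never appears in the same message as carol@example.org, yet the two-hop search connects them through bob@example.com. Surfacing that kind of indirect link is the purpose of link analysis.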

Take a deep dive into unstructured data analysis using advanced technology

A white paper written in collaboration with Nuix and Memgraph details how to combine these technologies to make investigations far more efficient. You’ll learn how to interpret and analyze unstructured data in depth, and how to transform it into a resource that anyone, from technical to non-technical users, can use.

Access the white paper to learn in depth how NLP and graph technologies can work together to deliver faster, more efficient investigations.
