The technology behind the FinCEN files investigation
ICIJ’s latest investigation, the FinCEN files, sheds light on how financial criminals use US banks to move money throughout the world. This blog post looks behind the scenes of the investigation to explain how ICIJ used Linkurious Enterprise coupled with the Neo4j graph database and other tools to uncover stories of corruption, fraud and money laundering.
NB: this article is based on the public information available as of September 21, 2020. It may be updated as more information is made available.
The FinCEN files investigation
Financial institutions operating in the US must monitor their clients and alert the United States Treasury Department’s intelligence unit, the Financial Crimes Enforcement Network (FinCEN), when they suspect a client is conducting unusual transactions indicative of money laundering or terrorist financing. Some 2,100 Suspicious Activity Reports (SARs) were obtained by BuzzFeed News and subsequently shared with the International Consortium of Investigative Journalists (ICIJ) and its partners.
The data in the FinCEN files represents more than $2 trillion in transactions between 1999 and 2017 (including $514 billion at JPMorgan and $1.3 trillion at Deutsche Bank). It accounts for less than 0.02% of the more than 12 million SARs that financial institutions filed with FinCEN between 2011 and 2017.
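The "less than 0.02%" figure can be checked with a quick calculation. This is only a rough sketch: the variable names are ours, the 12 million denominator is the lower bound quoted above (so the true share is slightly smaller), and the leaked SARs span 1999 to 2017 while the total covers 2011 to 2017.

```python
# Figures quoted in the article; variable names are illustrative.
sars_in_leak = 2_100
sars_filed_2011_2017 = 12_000_000  # "more than 12 million", so a lower bound

# Share of all filed SARs represented by the leak, as a percentage.
share_percent = sars_in_leak / sars_filed_2011_2017 * 100

print(f"{share_percent:.4f}%")  # 0.0175%, below the 0.02% quoted above
```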
ICIJ and its partners analyzed the data in the FinCEN files and collected additional documents. Together they uncovered evidence of how individuals suspected of financial crime, such as Jho Low, Paul Manafort or Isabel dos Santos, used the global financial system to move money.
How ICIJ used technology (and people) to find insights in the FinCEN files
An SAR sent to FinCEN mixes structured information (e.g. check-boxes), semi-structured information (e.g. addresses) and unstructured information (e.g. a written explanation of the report), all in a single PDF file. An SAR can be sent alongside a spreadsheet with raw transaction data.
How to make sense of 2,100 such files? First, ICIJ made all the records available to its partners via Datashare, its data sharing and analysis platform. Once duplicates had been removed and bank names standardized, investigators could start searching the data for people, companies and more.
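To give a feel for what this kind of clean-up involves, here is a minimal Python sketch. It is not ICIJ's actual pipeline: the alias table, the field names (sar_id, bank) and the sample records are all hypothetical, and a real project would need a far larger mapping plus human review.

```python
import re

# Hypothetical alias table; the mapping ICIJ actually used is not described here.
BANK_ALIASES = {
    "jpmorgan chase bank na": "JPMorgan Chase",
    "jp morgan chase": "JPMorgan Chase",
    "deutsche bank ag": "Deutsche Bank",
}


def normalize_key(name: str) -> str:
    """Lowercase, strip punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", "", name.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def standardize_bank(name: str) -> str:
    """Return the canonical bank name if known, else the trimmed input."""
    return BANK_ALIASES.get(normalize_key(name), name.strip())


def deduplicate(records):
    """Standardize bank names, then keep the first record per (sar_id, bank)."""
    seen = set()
    unique = []
    for rec in records:
        clean = dict(rec, bank=standardize_bank(rec["bank"]))
        fingerprint = (clean["sar_id"], clean["bank"])
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(clean)
    return unique


# Made-up sample records for illustration only.
records = [
    {"sar_id": "A1", "bank": "JPMorgan Chase Bank, N.A."},
    {"sar_id": "A1", "bank": "J.P. Morgan Chase"},
    {"sar_id": "B2", "bank": "Deutsche Bank AG"},
]
cleaned = deduplicate(records)
```

In this sample the first two records collapse into one, because both spellings map to the same canonical name. Standardizing before deduplicating matters: otherwise spelling variants of the same bank would survive as separate entries.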
The biggest challenge was making sense of more than 8,000 pages of narratives. At first, ICIJ’s partner SVT tried using machine learning to extract transactional data from the raw documents, but the variations in language and the complexity of the reports proved too great a challenge. In the end, ICIJ and its partners decided to go manual. For more than a year, 85 journalists in 30 countries reviewed their assigned suspicious activity reports, extracted the transaction information and manually entered it into Excel files, which were then uploaded to ICIJ’s communications platform, the Global iHub.
The effort resulted in 55,000 records of structured data, including details on more than 200,000 transactions flagged by the banks in the SARs. ICIJ built a fact-checking tool to help ensure the accuracy of the extracted data.
The role of Linkurious: Graph-enabled investigation
Over the years, ICIJ has used Linkurious Enterprise in many of its investigations, including the Panama Papers and the Paradise Papers. This time, ICIJ used Linkurious Enterprise and Neo4j to investigate the 400 spreadsheets in the FinCEN files, which contain data on 100,000 transactions.
With Linkurious Enterprise, journalists from ICIJ, BuzzFeed News and their 108 other media partners in 88 countries were able to visually navigate complex transactions to better understand the parties involved and why their transactions were deemed worthy of an SAR.
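To illustrate why a graph view helps with this kind of data, the sketch below treats each party as a node and each transaction as a directed edge, then follows the money outward from one party. It is a plain Python stand-in, not the Linkurious Enterprise or Neo4j API, and the column names (sender, receiver, amount_usd) and sample rows are made up.

```python
from collections import defaultdict, deque

# Made-up rows; the column names are assumptions, not the real spreadsheet schema.
TRANSACTIONS = [
    {"sender": "Company A", "receiver": "Company B", "amount_usd": 1_000_000},
    {"sender": "Company B", "receiver": "Company C", "amount_usd": 950_000},
    {"sender": "Company D", "receiver": "Company C", "amount_usd": 20_000},
]


def build_graph(rows):
    """Map each sender to the list of (receiver, amount) edges leaving it."""
    graph = defaultdict(list)
    for row in rows:
        graph[row["sender"]].append((row["receiver"], row["amount_usd"]))
    return graph


def downstream_parties(graph, start, max_hops):
    """Breadth-first search: parties reachable from start within max_hops transfers."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        party = queue.popleft()
        if hops[party] == max_hops:
            continue  # do not expand beyond the hop limit
        for receiver, _amount in graph.get(party, []):
            if receiver not in hops:
                hops[receiver] = hops[party] + 1
                queue.append(receiver)
    del hops[start]  # report only the other parties
    return hops


graph = build_graph(TRANSACTIONS)
reachable = downstream_parties(graph, "Company A", max_hops=2)
```

Here reachable maps Company B to 1 hop and Company C to 2 hops: money leaving Company A can reach Company C through an intermediary, a connection that is easy to miss when reading spreadsheet rows one at a time. A graph database answers such multi-hop questions directly, without hand-written traversal code.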
Linkurious Enterprise is used throughout the world by financial institutions and government agencies to detect and investigate fraud and money laundering networks. To learn more, please feel free to contact us!