The biggest challenge was making sense of more than 8,000 pages of narratives. At first, ICIJ’s partner, SVT, used machine learning to extract transactional data from the raw documents, but the variations in language and the complexity of the reports proved too great an obstacle. In the end, ICIJ and its partners decided to go manual. For more than a year, 85 journalists in 30 countries reviewed their assigned suspicious activity reports, extracted the transaction information and manually entered it into Excel files, which were then uploaded to ICIJ’s communications platform, the Global iHub.
The effort produced 55,000 records of structured data, with details on more than 200,000 transactions flagged by the banks in the SARs. ICIJ built a fact-checking tool to help ensure the accuracy of the extracted data.