FinCEN Files

The International Consortium of Investigative Journalists (ICIJ) has recently published a new investigation, the FinCEN Files, which exposed a vast network of industrial-scale money laundering running through Western banks and generally ignored by U.S. regulators – and they used Neo4j to help crack the case wide open.

The results draw from more than 2100 suspicious activity reports (SARs) between 1997 to 2017, which accounted for transactions of more than $2 trillion USD in dirty money. These reports were filed by banks and financial firms with the U.S. Department of Treasury’s Financial Crimes Enforcement Network (FinCEN) but were largely ignored or overlooked.

Visual Exploration

If you don’t have so much experience with databases, we recommend you to use the "Neo4j Bloom" App in the Sandbox UI (make sure to copy the password to log in).

There you can explore this data visually by entering search phrases like

Entity Deutsche Bank Filing
Filing Benefits Entity Russia
Entity transferred Entity

And then expand / explore the resulting data visually or inspect the selected data in the bottom left card view.

Data and Data Model

The ICIJ made a subset of the data available, you can download CSV files with the SAR information from their website.

We have imported the Filings, and the money flow between the participating entities and their respective countries into Neo4j.

The resulting data model is straightforward but still allows interesting investigations.

Exploring the Data

You can run these queries by clicking on the code-block and then the triangular run button top-right.

Top Filings and the involved banks

MATCH (f:Filing)
WITH f ORDER BY f.amount DESC LIMIT 10
MATCH (f)-[rel]-(e:Entity)
RETURN *

Entities with highest Transaction Volume

MATCH (e:Entity)--(f:Filing)
WITH e, round(sum(f.amount)) as total
WITH e, total ORDER BY total DESC LIMIT 10
OPTIONAL MATCH (e)-[:COUNTRY]-(c:Country)
RETURN e.name, c.name, total

Money Flow

Looking at the ORIGINATOR and BENEFITS relationships, one can aggregate the transaction amounts flowing to the beneficiaries.

MATCH (from:Entity)<-[:ORIGINATOR]-(f:Filing)-[:BENEFITS]->(to:Entity)
WITH from, to, round(sum(f.amount)) as sum
ORDER BY sum DESC LIMIT 10
RETURN from.name, to.name, sum

We can use the same concept and construct a projected graph of Entity to Entity money transfers to run some graph algorithms on.

MATCH (from:Entity)<-[:ORIGINATOR]-(f:Filing)-[:BENEFITS]->(to:Entity)
WITH from, to, round(sum(f.amount)) as sum
MERGE (from)-[t:TRANSFERRED]->(to) SET t.amount = sum

Graph Data Science - Clustering

First we project our "Transfer" Graph into an in memory model for graph compute, using nodes with the Entity label and the TRANSFERRED relationship with its amount property.

CALL gds.graph.project('transfers',{
  nodeProjection: 'Entity',
  relationshipProjection: {
    relType: {type: 'TRANSFERRED', orientation: 'NATURAL',
      properties: { amount: { property: 'amount', defaultValue: 1.0 }}
    }
  },
});

Now we can run a clustering algorithm like Louvain.

CALL gds.louvain.write('transfers',
{
  relationshipWeightProperty: 'amount',
  includeIntermediateCommunities: false,
  writeProperty: 'cluster'
});

And display the results. I.e. what are the largest clusters found.

MATCH (e:Entity)
RETURN e.cluster as cluster, count(*) as size, collect(e.name)[0..5] as names
ORDER BY size DESC LIMIT 10;

Graph Data Science - MoneyTransfer - PageRank

Then we can find the main recipients of the transitively transferred money using the PageRank algorithm on the same projected graph.

CALL gds.pageRank.write('transfers',
{
  relationshipWeightProperty: 'amount',
  writeProperty: 'pagerank'
});

And again visualize the results, e.g. who is the biggest beneficiary in our largest cluster.

MATCH (e:Entity)
WITH e.cluster as cluster, count(*) as size, collect(e) as entities
ORDER BY size DESC LIMIT 1
UNWIND entities as e
MATCH ()-[trans:TRANSFERRED]->(e)
RETURN e.name, e.pagerank, sum(trans.amount) as total
ORDER BY e.pagerank DESC LIMIT 10;

Resources

Of course there is much more to do with this data, so feel free to play around more, or reach out to us.