Pandora Papers - The Data Connecting Politicians, Criminals and a Rogue Industry that hides their cash

Pandora Papers: Secrets of the Global Elite

About this data

The ICIJ Offshore Leaks database, which you are working with in Neo4j, contains information on almost 800,000 offshore entities that are part of the Pandora, Paradise and Panama Papers and the Offshore Leaks investigations. The data spans many years of activity and links to people and companies in more than 200 countries and territories.

The real value of the database is that it strips away the secrecy that cloaks companies and trusts incorporated in tax havens and exposes the people behind them. This includes, when available, the names of the real owners of those opaque structures. In all, it reveals more than 500,000 names of people and companies behind secret offshore structures. They come from leaked records and not a standardized corporate registry, so there may be duplicates. We suggest you confirm the identities of any individuals or entities located in the database based on addresses or other identifiable information.

There are legitimate uses for offshore companies and trusts. We do not intend to suggest or imply that any persons, companies or other entities included in the database have broken the law or otherwise acted improperly. If you find an error in the database please get in touch with ICIJ.

The Shape of the Data

data model

The Offshore Leaks Database was imported into Neo4j so that journalists and researchers can take advantage of the connections in the data. To the left is a simplified diagram of the basic "property graph" data model, showing how the nodes connect to each other. Each data record is called a "node" and represents an entity, intermediary, officer or address. The nodes are connected to form a "graph" that reveals a complex web of relationships.

These are the types of nodes that you will encounter in the data:

  • Entity - The offshore legal entity. This could be a company, trust, foundation, or other legal entity created in a low-tax jurisdiction.

  • Officer - A person or company who plays a role in an offshore entity, such as beneficiary, director, or shareholder. The relationships shown in the diagram are just a sample of all the existing ones.

  • Intermediary - A go-between for someone seeking an offshore corporation and an offshore service provider, usually a law firm or a middleman that asks an offshore service provider to create an offshore firm.

  • Address - The registered address as it appears in the original databases obtained by ICIJ.

  • Other - Other entities found in the data.

How Graphs Helped our Investigation

The Offshore Leaks data exposes a set of connections between people and offshore entities. Graph databases are the best way to explore the relationships between these people and entities; for this purpose they are much more intuitive to use than a SQL database or other types of NoSQL databases.

For example, let's say we want to discover the shortest paths between two entity officers through a set of Entity or Address nodes. This is quite easy with Cypher, Neo4j's graph query language.

MATCH (a:Officer),(b:Officer)
WHERE a.name CONTAINS 'Ross, Jr' AND a.sourceID STARTS WITH "Pandora Papers"
  AND b.name CONTAINS 'Grant' AND b.sourceID STARTS WITH "Pandora Papers"
MATCH p=allShortestPaths((a)-[:officer_of|intermediary_of|registered_address*..10]-(b))
RETURN p
LIMIT 50

To execute this query, please click on the statement above to put the query in the query editor above.
Hit the triangular button or press Ctrl+Enter to run it and see the resulting visualization.

The resulting graph allows us to explore how these people are connected.

How to ask questions using a query language

Graph Patterns

Neo4j’s query language, Cypher, is centered around graph patterns, which represent entities with parentheses, for example (e:Entity), and connections with arrows, for example -[:intermediary_of]->. Here, :Entity and :intermediary_of are the types of the entity and the connection, respectively.

Here is an example pattern: (:Intermediary)-[:intermediary_of]->(:Entity). These patterns may be found with the MATCH clause.

Other Clauses

The following clauses may follow a MATCH clause. They work with the properties stored on the nodes and relationships that the pattern matched in the graph.

  • filter: WHERE intermediary.name CONTAINS 'MOSSACK'

  • aggregate: WITH e.jurisdiction AS country, COUNT(*) AS frequency

  • return: RETURN country, frequency

  • order: ORDER BY frequency DESC

  • limit: LIMIT 20;
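
The clauses listed above can be chained after a MATCH into a single statement. The following is a minimal sketch that simply assembles those fragments in order; it assumes the same labels, relationship type and properties used elsewhere in this guide, and it applies no sourceID filter, so it searches every dataset in the database.

```cypher
// Sketch: the filter, aggregate, return, order and limit fragments combined.
MATCH (intermediary:Intermediary)-[:intermediary_of]->(e:Entity)
WHERE intermediary.name CONTAINS 'MOSSACK'             // filter
WITH e.jurisdiction AS country, count(*) AS frequency  // aggregate
RETURN country, frequency                              // return
ORDER BY frequency DESC                                // order
LIMIT 20;                                              // limit
```

The WITH step is optional here: putting the aggregation directly in the RETURN clause gives the same result.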

Jurisdiction distribution of intermediaries in the ICIJ offshore leaks DB

MATCH (intermediary:Intermediary)-[:intermediary_of]->(e:Entity)
WHERE intermediary.name CONTAINS 'Cordero' AND e.sourceID STARTS WITH "Pandora Papers"
RETURN e.jurisdiction AS country, COUNT(*) AS frequency
ORDER BY frequency DESC LIMIT 20;

Click on the block to put the query in the query editor at the top of the window. Hit the triangular button or press Ctrl+Enter to run it and see the results.

Querying the Data

This guide covers:

  • Statistics

  • Visual vs Tabular Results

  • Investigating individual people and entities

  • Finding x-degree relationships

  • Finding shortest paths and connections between entities

  • Frequently recurring pairs

This interactive guide will help you explore the Pandora, Paradise and Panama Papers and the Offshore Leaks data using Cypher, Neo4j’s graph query language.

You’ll have the same investigative power we had in order to discover additional stories behind the data.

Be sure to check out the shape of the data section to understand the basics of Cypher and the data model used in the graph database.

Overview


Let’s see what data is in this database.

Run the embedded query to examine the structure of the graph. We see nodes for each type of entity, connected by the relationships they have.

Meta Graph

CALL db.schema.visualization()

To execute this query, please click on the statement above to put the query in the query editor above.
Hit the triangular button or press Ctrl+Enter to run it and see the resulting visualization.

Counts per entity type

We can also check how many entities of each type are in our database.

MATCH (node) WHERE node.sourceID STARTS WITH "Pandora Papers"
RETURN labels(node) AS type, count(*)
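
The same kind of count can be taken per relationship type. This is a sketch, assuming that relationships starting from Pandora Papers nodes are the ones of interest; the aliases relationship and total are illustrative names.

```cypher
// Sketch: count relationships by type, starting from Pandora Papers nodes.
MATCH (node)-[r]->()
WHERE node.sourceID STARTS WITH "Pandora Papers"
RETURN type(r) AS relationship, count(*) AS total
ORDER BY total DESC
```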

Which intermediaries have the most connections to which entities

MATCH (i:Intermediary) WHERE count {  (i)--()  } > 100 AND i.sourceID STARTS WITH "Pandora Papers"
MATCH (i)-[connection]-(entity)
RETURN i.name as intermediary, type(connection) as relationship, head(labels(entity)) as type, count(*) as count
ORDER BY count DESC LIMIT 20

Filter datasource

Note that the database contains the Offshore Leaks, Pandora, Paradise and Panama Papers datasets. To restrict a query to just one source, filter on the sourceID property in a WHERE clause. For example:

MATCH (i:Intermediary) WHERE i.sourceID STARTS WITH "Pandora Papers" AND count {  (i)--()  } > 100
MATCH (i)-[connection]-(entity)
RETURN i.name as intermediary, type(connection) as relationship, head(labels(entity)) as type, count(*) as count
ORDER BY count DESC LIMIT 20

Query for visual results

Querying the graph with Cypher is all about graph pattern matching. To query the graph we define graph patterns to be searched. For example, we can define the pattern "Intermediary connected to an Entity node through the intermediary_of relationship" as:

(i:Intermediary)-[r:intermediary_of]->(e:Entity)

We use the variables i, r and e for later filtering with WHERE and for RETURNing results. These are aliases that can be reused within a single Cypher statement.

Entities registered by an Intermediary

MATCH (i:Intermediary)-[r:intermediary_of]->(e:Entity)
WHERE i.name CONTAINS "Cordero" AND e.sourceID STARTS WITH "Pandora Papers"
RETURN i, r, e LIMIT 100

The results of this query are visualized as a graph. You can switch between the graph visualization and tabular results with the icons on the left side of the results view.
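
Whether a result appears as a graph or a table also depends on what the query returns. As a sketch, the variant below returns properties instead of whole nodes and relationships, so it yields only tabular rows; it assumes Entity nodes carry the name and jurisdiction properties used elsewhere in this guide.

```cypher
// Sketch: same pattern as above, but returning property values only.
MATCH (i:Intermediary)-[:intermediary_of]->(e:Entity)
WHERE i.name CONTAINS "Cordero" AND e.sourceID STARTS WITH "Pandora Papers"
RETURN i.name AS intermediary, e.name AS entity, e.jurisdiction AS jurisdiction
LIMIT 100
```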

Query for tabular results


Let's query for the pattern "Officer node connected to an Entity node through the officer_of relationship":

(o:Officer)-[:officer_of]->(:Entity)

We then count the entities per officer in an aggregation and ORDER BY that count DESCending and return the top-10 results.

Officers with most entities

MATCH (o:Officer)-[:officer_of]->(:Entity) WHERE o.sourceID STARTS WITH "Pandora Papers"
RETURN o.name, count(*) as entities
ORDER BY entities DESC LIMIT 10

Search for Officer nodes by name


Enter any name (e.g. from our published investigations) as the value of the officer parameter, then click on the query and execute it to see if that person appears in the data. Note that this search is case sensitive and matches only names that contain the exact text you entered. We're setting a parameter for the officer which we can reuse later; just click and run the :param block first.

:param officer=>"Smith"

MATCH (o:Officer)
WHERE o.name CONTAINS $officer AND o.sourceID STARTS WITH "Pandora Papers"
RETURN o
LIMIT 100

After running the query, you can double-click on any node to expand connections in the graph around that node.
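
Because CONTAINS is case sensitive, a name stored in different capitalization will not match. One possible workaround, sketched here, is to lower-case both sides with toLower() before comparing. It assumes the $officer parameter has already been set, and it may run slower because the function is applied to every candidate name.

```cypher
// Sketch: case-insensitive variant of the officer search.
MATCH (o:Officer)
WHERE toLower(o.name) CONTAINS toLower($officer)
  AND o.sourceID STARTS WITH "Pandora Papers"
RETURN o
LIMIT 100
```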

Search for an Officer and find the connections


Let’s see which entities our officer was involved with, including first and second degree connections.

1st degree

MATCH (o:Officer)
WHERE o.name CONTAINS $officer AND o.sourceID STARTS WITH "Pandora Papers"
MATCH path = (o)-[r]->(:Entity)
RETURN path LIMIT 100

2nd degree entities

MATCH (o:Officer)
WHERE o.name CONTAINS $officer AND o.sourceID STARTS WITH "Pandora Papers"
MATCH path = (o)-[]->(:Entity)
      <-[]-(:Officer)-[]->(:Entity)
RETURN path LIMIT 100
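
The intermediate step of the second-degree pattern can also be summarized as a table of officers who share entities with the one we searched for. This is a sketch, assuming the $officer parameter is set and that only officer_of relationships matter; co_officer and shared_entities are illustrative aliases.

```cypher
// Sketch: other officers attached to the same entities, ranked by overlap.
MATCH (o:Officer)-[:officer_of]->(e:Entity)<-[:officer_of]-(other:Officer)
WHERE o.name CONTAINS $officer
  AND o.sourceID STARTS WITH "Pandora Papers"
  AND other <> o   // skip the searched officer itself
RETURN other.name AS co_officer, count(DISTINCT e) AS shared_entities
ORDER BY shared_entities DESC
LIMIT 20
```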

Find who is behind an Entity and the roles that they play


:param entity=>"Cordero"

MATCH (e:Entity)-[r]-(o:Officer)
WHERE e.name CONTAINS $entity AND e.sourceID STARTS WITH "Pandora Papers"
RETURN *
LIMIT 100
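
To read the roles as a table rather than from the visualization, the relationship itself can be returned. The sketch below assumes the $entity parameter is set. It lists the relationship type and uses properties(r) to show whatever properties are stored on each relationship, without assuming particular property names.

```cypher
// Sketch: list each officer with the relationship linking it to the entity.
MATCH (e:Entity)-[r]-(o:Officer)
WHERE e.name CONTAINS $entity AND e.sourceID STARTS WITH "Pandora Papers"
RETURN e.name AS entity, o.name AS officer,
       type(r) AS relationship, properties(r) AS details
LIMIT 100
```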

Explore the Pandora, Panama, and Paradise Papers Yourself

Shape of the Data

Understand the data model.

  • What are the nodes?
  • What are the relationships?
  • What are the properties?

Pandora Papers

Explore the latest leak yourself.

  • Cypher query intro
  • Finding companies and individuals
  • Power Players

Paradise Papers

Explore the Appleby data yourself.

  • Cypher query intro
  • Finding companies and individuals
  • Path finding

Panama Papers

Explore the Mossack Fonseca data yourself.

  • Cypher query intro
  • Explore the Panama Law Firm files
  • Connect the dots

Send ICIJ a tip

Help us investigate.

  • Interesting connections
  • Entities that matter to you