Data Import

LOAD CSV is an ETL Power Tool


  • Load CSV data from an HTTP or file URL

  • Create, update or extend graph structures

  • Transform and convert CSV values

  • Import data into our graph model

  • Handles up to about 10M nodes & relationships

This is our CSV Data


movies.csv

title                   released  tagline
The Matrix              1999      Welcome to the Real World
Something’s Gotta Give  1975

people.csv

name            born
Michael Sheen   1969
Jack Nicholson  1937

actors.csv

movie                   roles            person
Something’s Gotta Give  Julian Mercer    Keanu Reeves
Johnny Mnemonic         Johnny Mnemonic  Keanu Reeves

Reading data from CSV with Cypher


LOAD CSV            // load CSV data
WITH HEADERS        // optional: use the first row as keys of the 'row' map
FROM 'url'          // 'file:///data.csv' or 'http://.../data.csv' URL
AS row              // each row is a list of strings, or a map when WITH HEADERS is used
FIELDTERMINATOR ';' // optional: alternative delimiter (the default is ',')

... rest of the Cypher statement ...

If you use file URLs, access is limited to files within $NEO4J_HOME/import.
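As a minimal sketch of the file URL form, assuming a copy of movies.csv has been placed in $NEO4J_HOME/import beforehand (the file name is an assumption for illustration), the path in the URL is resolved relative to that directory:

```cypher
// Assumes movies.csv was copied into $NEO4J_HOME/import beforehand.
LOAD CSV WITH HEADERS FROM 'file:///movies.csv' AS row
RETURN row LIMIT 5;
```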

LOAD CSV - Data Inspection


Row Count

LOAD CSV FROM
 'http://data.neo4j.com/intro/movies/movies.csv' AS row
RETURN count(*);

Row as List Data

LOAD CSV FROM
 'http://data.neo4j.com/intro/movies/movies.csv' AS row
RETURN * LIMIT 5;
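Without WITH HEADERS each row is a list, so columns are addressed by position and the header line comes back as an ordinary first row. A small sketch, assuming the column order title, released, tagline listed for movies.csv:

```cypher
LOAD CSV FROM
 'http://data.neo4j.com/intro/movies/movies.csv' AS row
WITH row SKIP 1              // skip the header line, which is returned as data
RETURN row[0] AS title, row[1] AS released, row[2] AS tagline
LIMIT 5;
```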

LOAD CSV - WITH HEADERS


Row as Map Data

LOAD CSV WITH HEADERS FROM
 'http://data.neo4j.com/intro/movies/movies.csv' AS row
RETURN row, keys(row) LIMIT 5;

Data Conversion

LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/movies.csv' AS row
RETURN row.title AS title, toInteger(row.released) AS released, row.tagline AS tagline
ORDER BY released DESC LIMIT 10;
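CSV fields always arrive as strings, and an empty field (such as a missing tagline) may come back as null or as an empty string depending on how it is written in the file. One possible way to cover both cases, with 'n/a' as an arbitrary placeholder chosen for illustration:

```cypher
LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/movies.csv' AS row
RETURN row.title AS title,
       toInteger(row.released) AS released,   // non-numeric text yields null
       CASE
         WHEN row.tagline IS NULL OR trim(row.tagline) = '' THEN 'n/a'
         ELSE row.tagline
       END AS tagline
LIMIT 10;
```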

Let’s create some nodes & relationships


You know how it works: just use CREATE or MERGE.

What would these statements look like:

CREATE (m:Movie {title:'The Matrix', released: 1999, tagline: '...'});
MERGE (p:Person {name:'Keanu Reeves'}) ON CREATE SET p.born = 1964;
MATCH (p:Person {name:'Keanu Reeves'}), (m:Movie {title:'The Matrix'})
CREATE (p)-[:ACTED_IN {roles:['Neo']}]->(m);

LOAD CSV - Create Nodes


CREATE Movies

LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/movies.csv' AS row
CREATE (m:Movie {title: row.title, released: toInteger(row.released), tagline: row.tagline})
RETURN m;
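Because CREATE always adds new nodes, running the statement above a second time would duplicate every movie. A hedged alternative, assuming the title is unique enough to identify a movie in this dataset, is to MERGE on the title and set the remaining properties only when the node is created:

```cypher
LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/movies.csv' AS row
MERGE (m:Movie {title: row.title})
ON CREATE SET m.released = toInteger(row.released),
              m.tagline  = row.tagline
RETURN m;
```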

MERGE People

LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/people.csv' AS row
MERGE (p:Person {name: row.name}) ON CREATE SET p.born = toInteger(row.born)
RETURN p;
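MERGE has to look up each Person by name, so larger imports usually benefit from a uniqueness constraint created before loading. The syntax depends on the Neo4j version. The sketch below uses the form accepted by Neo4j 4.4 and later, and the constraint name person_name is an arbitrary choice:

```cypher
// Neo4j 4.4+ syntax; older releases use
// CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE;
CREATE CONSTRAINT person_name IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;
```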

LOAD CSV - Create Relationships


LOAD CSV WITH HEADERS FROM
     'http://data.neo4j.com/intro/movies/actors.csv' AS row
FIELDTERMINATOR ','
MATCH  (p:Person {name: row.person })
MATCH  (m:Movie  {title: row.movie})
MERGE (p)-[actedIn:ACTED_IN]->(m)
ON CREATE SET actedIn.roles = split(row.roles,';')
RETURN *;
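To check the result, one option is to read the relationships back and confirm that roles was stored as a list:

```cypher
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p.name AS actor, m.title AS movie, r.roles AS roles
LIMIT 10;
```

Rows whose person or movie is not found by the two MATCH clauses produce no relationship, so a lower count than expected usually points to a name or title mismatch between the files.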

Clean out Database


Click and run to clean out your database

MATCH (n)
DETACH DELETE n;
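A quick way to confirm that the database is empty afterwards:

```cypher
MATCH (n)
RETURN count(n) AS remaining;   // 0 once everything has been deleted
```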

Import our Domain Data as a Script


:play movies