© Copyright 2023-2024 the original authors.

Abstract

This is the manual for the official Neo4j JDBC Driver.

This driver is officially supported and endorsed by Neo4j. It is a standalone driver, independent of and not built on top of the common Neo4j Java Driver. While the latter provides a Neo4j-idiomatic way to access Neo4j from Java, the JDBC driver adheres to JDBC 4.3.

This documentation refers to this driver as the Neo4j JDBC Driver and to the idiomatic Neo4j driver as the common Neo4j Java Driver.

1. Introduction

JDBC stands for "Java Database Connectivity" and is thus not bound exclusively to relational databases. Nevertheless, JDBC’s terms, definitions, and behavior are highly influenced by SQL and relational databases. As Neo4j is a graph database with quite a different paradigm than relational and a non-standardized behaviour in some areas, there might be some details that don’t map 100% in each place, and we make sure to educate you about these in this documentation.

This documentation focuses on install, use, and configure the Neo4j JDBC Driver, as well as discussing the driver’s design choices. While we do provide runnable examples showing how to use JDBC with Neo4j, this is not a documentation about how to correctly use JDBC as an API.

The Neo4j JDBC Driver requires JDK 17 on the client side and Neo4j 5.5+ on the server side. To use it with a Neo4j cluster, server-side routing must be enabled on the cluster.

1.1. Features

  • Fully supports the Java module system

  • Adheres to JDBC 4.3

  • Can run any Cypher statement

  • Implements DatabaseMetaData and ResultSetMetaData as fully as possible with a nearly schemaless database and general very flexible result sets, allowing for automatic metadata retrieval from ETL and ELT tools

  • Provides an SPI to hook in translators from SQL to Cypher

  • Provides an optional default implementation to translate many SQL statements into semantically similar Cypher statements

  • Can be safely used with JDBC connection pools as opposed to the common Neo4j Java Driver or any JDBC driver based on that, as it doesn’t do internal connection pooling and transaction management otherwise than dictated by the JDBC Spec

The absence of any connection pooling and transaction management is an advantage of the Neo4j JDBC Driver over the common Neo4j Java Driver. It allows to pick and choose any database connection pooling system such as HikariCP and transaction management such as Jakarta Transactions.

1.2. Limitations

  • The database metadata is retrieved using Neo4j’s schema methods, such as db.labels, db.schema.nodeTypeProperties(), which may not always be accurate

  • While single label nodes map naturally to table names, nodes with multiple labels don’t

  • There is no reliable way to always determine the datatype for properties on nodes, as it would require reading all of them (which this driver does not do)

  • Some JDBC features are not supported yet (such as the CallableStatement); some feature will never be supported

  • The SQL to Cypher translator supports only a limited subset of clauses and SQL constructs that can be equivalently translated to Cypher (See Supported statements)

  • There is no "right" way to map JOIN statements to relationships, so your mileage may vary

1.3. When to use the Neo4j JDBC Driver?

  • Integration with ETL and ELT tools that don’t offer an integration based on the common Neo4j Java driver

  • An easier on-ramp towards Neo4j for people familiar with JDBC, who want to keep using that API, but with Cypher and Neo4j

  • Integration for ecosystems like Jakarta EE whose transaction management directly supports any JDBC-compliant driver

  • Integration with database migration tools such as Flyway

There is no need to redesign an application that is built on the common Neo4j Java Driver to migrate to this driver. If your ecosystem already provides a higher-level integration based on the common Neo4j Java Driver, such as Spring Data Neo4j (SDN) for Spring, there is no need to switch to something else. In case of Quarkus, the Neo4j JDBC Driver is an option to consider: although we do provide an integration for the common Neo4j Java Driver, this integration does not support Quarkus' transaction systems in contrast to this driver.

As there is little incentive to use this driver with Hibernate (Neo4j-OGM or SDN are the best alternatives for Neo4j), it might be worth giving Spring Data JDBC a try.

1.4. Differences with the previous versions of this driver and other JDBC drivers for Neo4j

Several other JDBC drivers exists for Neo4j, most notably the previous versions 4 and 5 of this driver. Most (if not all) of them wrap the common Neo4j Java Driver and implement the JDBC spec on top of that. This comes with a number of issues:

  • You end up with a pool of connection pools, because the common Neo4j Java Driver manages a connection pool, whereas JDBC drivers delegate this task to dedicated pooling solutions.

  • The transaction management of the common Neo4j Java Driver is not aligned with the way JDBC manages transactions.

  • Older versions of the Neo4j JDBC driver shade a few dependencies, such as Jackson as well as additional logging frameworks. This takes a toll on the classpath and, in case of logging, it leads to runtime problems.

  • Existing drivers with an SQL-to-Cypher translation layer are "read-only" and don’t support write statements, so they cannot be used for ETL use-cases aiming to ingest data into Neo4j.

This driver does not support automatic reshaping or flattening of the result sets, as the previous versions do. If you query for nodes, relationships, paths, or maps, you should use getObject on the result sets and cast them to the appropriate type (you find all of them inside the package org.neo4j.jdbc.values). However, the default SQL-to-Cypher translator will (when connected to a database) figure out what properties nodes have and turn the asterisk (*) into individual columns of nodes and relationships, just like what you would expect when running a SELECT * statement.

For information on upgrade/migration from other drivers to this one, see Neo4j JDBC Driver.

2. Usage

Add the JDBC driver to your application, for example as a Gradle dependency:

Listing 1. Using the full bundle as a runtime dependency inside a Gradle-based project
dependencies {
    runtimeOnly(org.neo4j:neo4j-jdbc-full-bundle:6.1.0) (1)
}
1 The coordinates are the same for a Maven project.

You can then use the Neo4j JDBC driver as you would do with any other JDBC driver:

In case any tooling asks you for the name of the concrete driver class, it is org.neo4j.jdbc.Neo4jDriver.
Listing 2. Acquire a connection and execute a query
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;

public final class Quickstart {

    static void queryWithCypher() {
        var query = """
                MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
                RETURN m.title AS title, collect(p.name) AS actors
                ORDER BY m.title
                """;

        var url = "jdbc:neo4j://localhost:7687";
        var username = "neo4j";
        var password = "verysecret";

        try (var con = DriverManager.getConnection(url, username, password); (1)
                var stmt = con.createStatement();  (2)
                var result = stmt.executeQuery(query)) { (3)

            while (result.next()) { (4)
                var movie = result.getString(1); (5)
                var actors = (List<String>) result.getObject("actors"); (6)
                System.out.printf("%s%n", movie);
                actors.forEach(actor -> System.out.printf("\t * %s%n", actor));
            }
        }
        catch (SQLException ex) {
            throw new RuntimeException(ex);
        }
    }
}
1 Instantiate a JDBC connection. There’s no need to do any class loading beforehand, the driver will be automatically registered
2 Create a (reusable) statement
3 Execute a query
4 Iterate over the results, as with any other JDBC result set
5 JDBC’s indexing starts at 1
6 JDBC also allows retrieval of result columns by name; the Neo4j JDBC driver also supports complex objects, such as lists

In the example above we used Neo4j’s lingua franca, Cypher, to query the database. The Neo4j JDBC Driver has limited support for using SQL as well. It can do so automatically or on a case-by-case basis. To translate a single, call java.sql.Connection#nativeSQL(String) and use the result in your queries. For automatic translation, instantiate the driver setting the optional URL parameter sql2cypher to true. The following example shows how:

Listing 3. Configure the JDBC driver to automatically translate SQL to cypher.
var query = """
        SELECT m.title AS title, collect(p.name) AS actors
        FROM Person as p
        JOIN Movie as m ON (m.id = p.ACTED_IN)
        ORDER BY m.title
        """; (1)

var url = "jdbc:neo4j://localhost:7687?enableSQLTranslation=true";

try (var con = DriverManager.getConnection(url, username, password);
        var stmt = con.createStatement();
        var result = stmt.executeQuery(query)) {

    while (result.next()) {
        var movie = result.getString(1);
        var actors = (List<String>) result.getObject("actors");
        System.out.printf("%s%n", movie);
        actors.forEach(actor -> System.out.printf("\t * %s%n", actor));
    }
}
1 This SQL query will be translated into the same Cypher query of the previous example. The remainder of the method is identical to before.

For more information, see SQL to Cypher translation.

The JDBC specification does not support named parameters, only index-based parameters, starting at 1. So for all PreparedStatement instances you need to specify parameters like this:

Listing 4. Using parameters with a PreparedStatement
var cypher = "CREATE (m:Movie {title: $1})";
try (var con = DriverManager.getConnection(url, username, password);
        PreparedStatement stmt = con.prepareStatement(cypher);) {
    stmt.setString(1, "Test");
    stmt.executeUpdate();
}

This is independent of the SQL-to-Cypher translation mechanism:

Listing 5. Using parameters with a PreparedStatement (SQL variant)
var sql = "INSERT INTO Movie (title) VALUES (?)";
try (var con = DriverManager.getConnection(url + "?enableSQLTranslation=true", username, password);
        PreparedStatement stmt = con.prepareStatement(sql);) {
    stmt.setString(1, "Test");
    stmt.executeUpdate();
}

To use named parameters, downcast the PreparedStatement to Neo4jPreparedStatement.

Listing 6. Using named parameters with a Neo4jPreparedStatement
var match = "MATCH (n:Movie {title: $title}) RETURN n.title AS title";
try (var con = DriverManager.getConnection(url, username, password);
        Neo4jPreparedStatement stmt = (Neo4jPreparedStatement) con.prepareStatement(match);) {
    stmt.setString("title", "Test");
    try (var resultSet = stmt.executeQuery()) {
        while (resultSet.next()) {
            LOGGER.info(resultSet.getString("title"));
        }
    }
}

2.1. Getting a connection via environment variables

If you are happy to depend directly on org.neo4j.jdbc.Neo4jDriver and want to get a connection as easy as possible, you might want to use fromEnv:

Listing 7. Get a connection from environment variables
// import org.neo4j.jdbc.Neo4jDriver;

try (var con = Neo4jDriver.fromEnv().orElseThrow(); (1)
        var stmt = con.createStatement();
        var result = stmt.executeQuery(query)) {

    // Same loop as earlier
}
catch (SQLException ex) {
    throw new RuntimeException(ex);
}
1 Notice how we directly use the concrete driver class here and how the methods return an optional: no connection can be created if the required connection variables are not found.

The fromEnv method looks for a few specific system environment variables and it adheres to the 12 factor app principles:

  • First, it looks in the system environment

  • Second, it looks for a file named .env in the current working directory. There are overloads that let you configure the directory and the filename to look for.

The supported variables are:

NEO4J_URI

The address or URI of the instance to connect to.

NEO4J_USERNAME

(Optional) Username.

NEO4J_PASSWORD

(Optional) Password.

NEO4J_SQL_TRANSLATION_ENABLED

(Optional) Whether to enable full SQL-to-Cypher translation, defaults to false.

Information from both the system environment and the .env files are combined. If for example NEO4J_SQL_TRANSLATION_ENABLED is in the system environment but not in the .env file, it will still be picked up. Given the order of priority, information in the system environment always has precedence over the .env file.

This feature is especially useful with Neo4j AuraDB. When creating a new AuraDB instance, you download a .env file that you can directly use with the Neo4j JDBC Driver:

Listing 8. Using a .env file from AuraDB
try (
    var con = Neo4jDriver.fromEnv("Neo4j-cb3d8b2d-Created-2024-02-14.txt")
        .orElseThrow();
    var stmt = con.createStatement();
    var movies = stmt.executeQuery("MATCH (n:Movie) RETURN n.title AS title")
) {
    while (movies.next()) {
        System.out.println(movies.getString("title"));
    }
}

3. Distribution

3.1. Driver modules and dependencies

The driver consists of 3 modules:

neo4j-jdbc

This is the actual JDBC implementation for Neo4j

neo4j-jdbc-translator-spi

This is the SPI for the default SQL-to-Cypher translation implementation and for any further or custom implementation.

neo4j-jdbc-translator-impl

This is the default and optional implementation of an SQL-to-Cypher translator. It provides a somewhat opinionated approach of translating SQL statements into semantically equivalent Cypher statements.

If you just want to use the Neo4j JDBC Driver to run Cypher statements, you only need the module org.neo4j:neo4j-jdbc. As long as you are developing in an environment with proper dependency management, this is all you need to declare.

While the translator SPI is a required dependency and gets pulled in via Maven or Gradle, the actual implementation is not. This allows for:

  • Opting out of having additional dependencies if you don’t need the SQL-to-Cypher translation

  • Requiring different JDK baselines or licensing modules for our implementations

  • Allowing you to create alternative translators

3.1.1. Dependencies

There are two "dependency-free" modules, which shade everything into one binary artifact each. Those bundles cover a lot of tooling (see Available bundles).

The driver’s direct, compile-time dependencies are listed below:

Listing 9. Dependency tree of the core driver
org.neo4j:neo4j-jdbc
+- io.github.cdimascio:dotenv-java
+- io.netty:netty-handler
|  +- io.netty:netty-common
|  +- io.netty:netty-resolver
|  +- io.netty:netty-buffer
|  +- io.netty:netty-transport
|  +- io.netty:netty-transport-native-unix-common
|  \- io.netty:netty-codec
+- org.neo4j:neo4j-cypher-dsl-schema-name-support
+- org.neo4j:neo4j-jdbc-translator-spi

3.2. Available bundles

All bundles of the Neo4j JDBC Driver are distributed on Maven Central. The bundles have different characteristics: depending on your use-case or your environment, you can pick one or the other bundle, but not both at once.

3.2.1. Individual components

Pick this distribution if you have an application that uses dependencies management in which you don’t have to care about dealing with binary artifacts yourself.

We offer a Maven BOM project, which you should import when using the individual modules, so that you have always a consistent set of versions. The coordinates are org.neo4j:neo4j-jdbc-bom.

For Maven, use this dependency declaration:

Listing 10. Maven dependency for the core Neo4j JDBC Driver artifact
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-jdbc</artifactId>
    <version>6.1.0</version>
</dependency>

And for Gradle you would want to declare the following runtime dependency:

Listing 11. Neo4j JDBC Driver as Gradle runtime dependency
dependencies {
    runtimeOnly 'org.neo4j:neo4j-jdbc:6.1.0'
}

If you want to use the SQL-to-Cypher translation from Neo4j, you need to add the following dependency in your Maven build:

Listing 12. Maven dependency for the default SQL to Cypher translator
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-jdbc-translator-impl</artifactId>
    <version>6.1.0</version>
</dependency>

and the same coordinates, but for Gradle:

Listing 13. Gradle dependency for the default SQL to Cypher translator
dependencies {
    runtimeOnly 'org.neo4j:neo4j-jdbc-translator-impl:6.1.0'
}

3.2.2. Small bundle

Pick this distribution if you work with ETL tools or tooling for relational databases that allow adding the JDBC driver only as single artifacts or that make it unnecessary hard to add additional jars. This bundle does not contain the default SQL-to-Cypher translator! The coordinates of this bundle are org.neo4j:neo4j-jdbc-bundle and you can download it from Maven central:

All bundles can be used as normal project dependencies as well. This might be useful for example if your project depends on a different, potentially conflicting Netty version.

3.2.3. Full bundle

This bundle contains the default SQL-to-Cypher translator. Otherwise, it is identical to the small bundle. Its coordinates are org.neo4j:neo4j-jdbc-full-bundle and you can download it from Maven central here:

3.2.4. text2cypher bundle

We also ship a very experimental bundle that we call text2cypher:

See Neo4j JDBC Driver more information.

4. Configuration

4.1. Driver class name

The Neo4j JDBC Driver is org.neo4j.jdbc.Neo4jDriver. With modern Java tools, you should not need to touch this class directly, but there are some connection pools and front-ends that will ask you for this. The class is public API.

We also provide org.neo4j.jdbc.Neo4jDataSource as javax.sql.DataSource.

4.2. Causal clustering and bookmarks

The Neo4j JDBC Driver uses bookmarks by default to provide causal consistency in all Neo4j deployments. Bookmarks are managed on the driver level itself, not on the connections spawned by an instance of the driver, so all connections spawned by one instance will partake in the same causal chain of transactions. Connections from different driver instances will not use the same set of bookmarks and there is no built-in machinery that would enable this. If you want or need this, you can directly access the Neo4jDriver type to retrieve the current set of known bookmarks and pass them to another driver instance.

4.3. Neo4j transactional metadata

Neo4j supports attaching metadata to transactions, see SHOW TRANSACTIONS. As there is no explicit transaction object in the JDBC specification, the Neo4j JDBC driver needs another mechanism to make these configurable.

The JDBC driver provides the extension interface Neo4jMetadataWriter. Our driver, the connection implementation, and all statement variants can be unwrapped accordingly. The configuration is additive: metadata configured for a driver instance will be used for all connections spawned from that driver, connections can add further metadata, and statements can also add their own metadata. Metadata added on a statement has precedence over connection metadata which in turn has precedence over driver metadata:

Listing 14. Configuring transactional metadata
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.neo4j.jdbc.Neo4jDriver;
import org.neo4j.jdbc.Neo4jMetadataWriter;

public final class TransactionMetadata {

    private static final Logger LOGGER = Logger.getLogger(TransactionMetadata.class.getPackageName());

    public static void main(String... args) throws SQLException {
        var url = "jdbc:neo4j://localhost:7687";

        var driver = (Neo4jDriver) DriverManager.getDriver(url);
        driver.withMetadata(Map.of("md_from_driver", "v1", "will_be_overwritten", "irrelevant"));

        var properties = new Properties();
        properties.put("user", "neo4j");
        properties.put("password", "verysecret");

        try (
            var con = driver.connect(url, properties)
                .unwrap(Neo4jMetadataWriter.class)
                .withMetadata(Map.of("md_from_connection", "v2", "will_be_overwritten", "xxx"))
                .unwrap(Connection.class);
            var statement = con.createStatement()
                .unwrap(Neo4jMetadataWriter.class)
                .withMetadata(Map.of("md_from_stmt", "v3", "will_be_overwritten", "v4"))
                .unwrap(Statement.class)
        ) {
            try (var result = statement.executeQuery("SHOW TRANSACTIONS YIELD metaData")) {
                while (result.next()) {
                    var metaData = result.getObject("metaData", Map.class);
                    LOGGER.log(Level.INFO, "{0}", metaData);
                }
            }
        }
    }
}

The output will be similar to this:

Juli 17, 2024 1:18:16 PM org.neo4j.jdbc.docs.TransactionMetadata main
INFORMATION: {md_from_driver=v1, md_from_connection=v2, md_from_stmt=v3, will_be_overwritten=v4}

4.4. URL and connection properties

The canonical URL format for the Neo4j JDBC Driver is

jdbc:neo4j://<host>:<port>/<database>?param1=value1&param2=value2

The database name and all query parameters are optional and can be omitted. All configuration arguments can be passed either as query parameters or via a java.util.Properties object. The latter is sometimes defined by tooling for you.

With regard to authentication, we recommend to follow the JDBC specification, which discourages using any form of URL authentication. All query parameters must be percent-encoded if they contain special characters, e.g. …​?param1=space%20separated.

The driver supports the following URI schemes, which tweak the security configuration:

  • neo4j - No encryption.

  • neo4j+s - Enables encryption and only accepts SSL certificates from the server that are signed by a known Certificate Authority.

  • neo4j+ssc - Enables encryption and accepts self-signed certificates (which must be added to the certificate store).

bolt URI schemes are not supported.

The driver accepts the following configuration arguments, either as properties or as URL query parameters:

Table 1. Configuration arguments
Name Type Meaning Default

timeout

Integer

Timeout for connection acquisition in milliseconds

1000

agent

String

User agent

neo4j-jdbc

enableSQLTranslation

Boolean

Flag that enables automatic translation from SQL-to-Cypher (requires a translator on the classpath)

false

cacheSQLTranslations

Boolean

Flag that enables caching of translations. SQL translations are not "free": parsing of SQL costs a bit of time, and so does Cypher rendering. In addition, we might up look up metadata to be able to project individual properties. If this takes too long, translations may be cached.

false

rewritePlaceholders

Boolean

Flag that allows you to use ? as placeholder in Cypher statements (as required by JDBC). These will automatically be rewritten into $1, $2 … $n, starting at 1, so that the numbering matches the 1-based JDBC index.

true when enableSQLTranslation is false, false otherwise

ssl

Boolean

Optional flag, alternative to neo4j+s. It can be used for example to programmatically enable the full SSL chain.

null

sslMode

Enum<SSLMode>

Optional configuration for fine-grained control over SSL configuration. Allowed values are disable, require, verify-full. See <ssl_mode, Understanding the SSL mode>>.

null

user

String

The username (principal) to use for authentication. NOT RECOMMENDED as URL query parameter for security reasons.

neo4j

password

String

The password (credentials) to use for authentication. NOT RECOMMENDED as URL query parameter for security reasons.

password

authRealm

String

The realm to use for authentication. NOT RECOMMENDED as URL query parameter for security reasons.

null

authScheme

String

The authentication scheme to use. NOT RECOMMENDED as URL query parameter for security reasons. Currently supported values are:

  • basic (default) for basic authentication.

  • bearer for bearer authentication (SSO). password should be set to the bearer token; user and authRealm have no effect.

  • kerberos for kerberos authentication. Requires password to be set to the kerberos ticket; user and authRealm have no effect.

  • none if authentication is disabled on the server. The properties user, password, authRealm have no effect.

basic

useBookmarks

boolean

Enables bookmark management for full causal cluster support. This is enabled by default and the recommended setting for all scenarios that use a connection pool. If you disable it, it will only be disabled for the specific connection. Other connections retrieved from the driver instance to the same or to other databases are not affected, and the individual connections will still manage their bookmarks.

true

4.5. Getting a driver or a connection instance

This section likely only applies if you use the Neo4j JDBC Driver as part of application development in contrast to using it as part of front-end tool such as DBeaver, DataGrip or UI-based ETL tools.

The easiest way to acquire a connection is directly through the java.sql.DriverManager.

Listing 15. Acquiring a JDBC connection to a Neo4j server
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class Configuration {

    void obtainConnection() throws SQLException {

        var url = "jdbc:neo4j://localhost:7687";
        var username = "neo4j";
        var password = "verysecret";
        var connection = DriverManager.getConnection(url, username, password);
    }

}

While our connection implementation is thread-safe, it allows only one concurrent transaction per connection (as dictated by the JDBC specification). For a multi-thread application, use a connection pool. There’s HikariCP, but usually application server and containers/frameworks bring their own. It’s safe to use any of them, as the Neo4j JDBC Driver does no internal pooling.

If you need access to an instance of the Neo4j driver itself, you can use the following approach:

Listing 16. Acquiring an instance of the Neo4j JDBC Driver
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class Configuration {

    void obtainDriverAndConnection() throws SQLException {

        var url = "jdbc:neo4j://localhost:7687";
        var driver = DriverManager.getDriver(url);

        var properties = new Properties();
        properties.put("username", "neo4j");
        properties.put("password", "verysecret");
        var connection = driver.connect(url, properties);
    }

}

4.6. Securing your connection by using SSL

The Neo4j JDBC Driver supports the same SSL option of the common Java driver, with the same URL protocols, using +s or +ssc as additional indicators for the required level of security.

The same configuration can also be achieved with a URL query parameter or an entry in the properties passed to the DriverManager or driver instance when asking for a connection. As long as you don’t specify contradicting values, it’s fine to combine both approaches.

4.6.1. Understanding the SSL mode

The following list is ordered by ascending security:

  • disable — (default), "I don’t care about security and don’t want to pay the overhead for encryption."

  • require — "I want my data to be encrypted, and I accept the overhead. I trust that the network will make sure I always connect to the server I want." (Server must support encryption, no hostname/CA validation is done. This saves the hassle of proper certificates and is only secure on private networks; it should not really be used over public networks.)

  • verify-full — "I want my data encrypted, and I accept the overhead. I want to be sure that I connect to a server I trust, and that it’s the one I specify."

The Neo4j JDBC Driver does not include revocation checks.

The most secure option can also be enabled by setting ssl=true either as query parameter or as property entry passed to the DriverManager. This option corresponds to neo4j+s. On the other hand, require corresponds to neo4j+ssc.

The additional enum allows us to possibly support additional modes in the future, such as letting the service decide about SSL, or being able to express a preference towards SSL without requiring it.

Neo4j servers can offer both plain Bolt connections and encrypted SSL connection, or just one of them. The fact that you can connect using neo4j+s does not mean that you cannot connect using just neo4j, or viceversa. This is dependent on the server setup. Neo4j Aura, Neo4j’s managed cloud service, only supports encrypted connections, so you must use +s, ssl=true, or sslMode=verify-full.

4.6.2. Valid URLs

The following URLs are all valid:

neo4j+s://xyz.databases.neo4j.io

Use full verification with the xzy instance at AuraDB

neo4j://xyz.databases.neo4j.io?ssl=true

The same, but using the shorthand URL parameter

neo4j://xyz.databases.neo4j.io?sslMode=verfiy-full

The same, but using the explicit mode

neo4j+s://xyz.databases.neo4j.io?ssl=true&sslMode=verify-full

Not more secure, but does not fail on you

neo4j+ssc://this.is.a.trustworthy.instance.for.sure.com

Trust whatever certificate and hostname there is, but do use SSL

neo4j://my-testing-instance.local

Use a plain connection.

The driver only refuses contradicting configurations, such as:

  • +s with ssl=false, or sslMode set to disable

  • +ssc with ssl=false, or any sslmode not equal to require

Basically, you cannot ask to use SSL and not use it at the same time. The driver offers several mechanism so that you can use a fixed URL with dynamic query parameters, or dynamic URLs, or whatever way of configuring you prefer in a programmatic way.

4.6.3. Using .dotenv files

When you create a Neo4j Aura instance, you will be asked to download a text-file named similar to Neo4j-9df57663-Created-2023-06-12.txt. This is essentially a .dotenv file containing the information required to connect to the database.

These files can be directly used via Neo4jDriver.fromEnv() (see Getting a connection via environment variables). This method exists in several overloads, which let you configure both filename and directory. Additionally, the builder lets you configure options that are not contained in the files from Aura.

5. Metadata

5.1. Catalog and schema

All the methods on connection level dealing with metadata and information about the available content provide information in terms that are defined in the SQL standard, including catalogs and schemas.

In most relational databases, a catalog is equivalent to a specific database on a server or cluster, and the schema refers to the collection of tables, views and procedures in that catalog.

From the SQL 1992 standard (find an archived copy here):

(4.12) Catalogs are named collections of schemas in an SQL-environment. An SQL-environment contains zero or more catalogs. A catalog contains one or more schemas, but always contains a schema named INFORMATION_SCHEMA that contains the views and domains of the Information Schema.

This driver does not support catalogs (see discussion 55), so any metadata result set will return literal null when asked for the catalog of a database object. No metadata method supports filtering on a non-null catalog parameter and no catalog specifier can be used in a query. Future developments might use catalogs to describe composite databases, in essence listing the constituents of the composite database defined in the connection.

The same standard defines schemas as follows:

(4.11) An SQL-schema is a persistent descriptor that includes:

[…] the descriptor of every component of the SQL-schema.

In this International Standard, the term "schema" is used only in the sense of SQL-schema. Each component descriptor is either a domain descriptor, a base table descriptor, a view descriptor, an assertion descriptor, a privilege descriptor, a character set descriptor, a collation descriptor, or a translation descriptor. The persistent objects described by the descriptors are said to be owned by or to have been created by the <authorization identifier> of the schema.

We report the literal name public as schema for any result component of a metadata result set. We support querying for objects in any schema, however only literal null or public will potentially produce a result.

Labels will be reported as table objects with the TABLE_TYPE being literal TABLE.

5.1.1. Summary

  • Catalog: Always null; filtering on anything non-null yields no results.

  • Schema: Always public; filtering on public and literal will yield result, anything else won’t.

  • Table descriptors: Reported as TABLE in the TABLE_TYPE column.

5.2. Labels to tables

The CLG and Langstar groups speak about "Node type combinations" and gravitate towards "open node type semantics" in the GQL standard:

Node type combinations is a less permissive form of open node type semantics. The idea of node type combinations is that nodes also conform to a graph type if they are not of one of the node types in the node type set of the graph type, but of a node type that is an intersection of (a subset of) the node type in a node type set of the graph type.

An example for their proposal can be found here.

This driver therefore compute node types in a similar way:

  • Single label nodes will map naturally to a table name, the single label will become the table name

    • The label name will be taken as is and will be case-sensitive. The labels Movie, movie, MOVIE will result in three tables in the metadata

    • This is in line with the default SQL-to-Cypher translation

  • Node type combinations will map to table names composed as label1_label2, sorting the labels alphabetically to make them independent of the order Neo4j returns them

  • Property sets for these node type combinations will then be computed

6. SQL to Cypher translation

6.1. Introduction

The translation of queries from SQL to Cypher is an optional feature of this driver and it consists of two parts:

  • The translator SPI, located in the module org.neo4j:neo4j-jdbc-translator-spi. It consists of two interfaces: SqlTranslatorFactory and the actual SqlTranslator.

  • A concrete implementation of this SPI, published as org.neo4j:neo4j-jdbc-translator-impl.

The latter is covered in "Using the default translator" and available in the "full bundle", described in Available bundles. The former is provided for two reasons: it allows us to distribute the driver with and without the bundled, default translator and allows you to run your custom translator.

Translators can be chained, and there can be as many translators on the classpath as you want. Their precedence is configurable, with our default implementation having the lowest precedence. Thus, you can for example have a custom translator that takes care of a fixed set of queries and, if it receives a query it cannot translate, it will pass it down to our implementation.

Translating arbitrary SQL queries to Cypher is an opinionated task, as there is no right way to map table names to graph objects: a table name can be used as-is as a label, or may be transformed into a singular form, etc. Mapping relationships is even trickier: should relationship types derive from a join table, a join column (in that case, which one?), or a foreign key?

We believe our assumptions are appropriate for various use cases and instead of providing configuration to cater for all scenarios, we offer the possibility to write your own translation layer. The driver will use the standard Java service loader mechanism to find an implementation of the SPI on the module- or classpath.

Some tools (like Tableau) use a class-loader that won’t let the driver use the standard Java service loader mechanism. For these scenarios, we provide an additional configuration property named translatorFactory. Set this to DEFAULT to directly load our default implementation or to a fully-qualified classname for any other factory. Be aware that either our default implementation or your custom one must be on the classpath.

6.2. Translating SQL to Cypher

There’s only one requirement to enable the SQL-to-Cypher translation: you have to have one module implementing the SPI on the classpath. This is automatically the case if you use the full-bundle (org.neo4j:neo4j-jdbc-full-bundle). In that case, you don’t need to add any other dependency. If you use the individual distribution or the "small" bundle org.neo4j:neo4j-jdbc-bundle, you must add the artifact org.neo4j:neo4j-jdbc-translator-impl.

The implementation will be automatically loaded. If you use the translation on a case-by-case basis, it will be lazily loaded (i.e no additional classes are touched or loaded into memory). If you configure automatic translation for all statements, the implementation will be eagerly loaded. There are no further configuration options with regard to loading the implementation.

6.2.1. On a case-by-case basis

The translator can be used on a case-by-case basis through the official JDBC API nativeSQL, which you find on the java.sql.Connection class. With the following imports:

import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;
import java.util.logging.Logger;

You just pass your SQL statement to nativeSQL and you will get Cypher back:

try (var connection = DriverManager.getConnection(url, username, password)) {
    var sql = connection.nativeSQL("SELECT * FROM Movie n");
    assert """
            MATCH (n:Movie)
            RETURN *""".equals(sql);
}

6.2.2. For all queries

If you open the connection to your Neo4j instance using enableSQLTranslation=true either as URL parameter or configuration property, all statements will be translated from SQL to Cypher. If you configure the driver in this way, the translator will be eagerly loaded.

var url = "jdbc:neo4j://localhost:7687?enableSQLTranslation=true";
try (var connection = DriverManager.getConnection(url, username, password);
        var stmnt = connection.createStatement();
        var result = stmnt.executeQuery("SELECT n.title FROM Movie n")) {
    while (result.next()) {
        LOGGER.info(result.getString("n.title"));
    }
}

Sometimes you may need to fall back to Cypher for some statements, either to use constructs that you cannot express with SQL, or because our default translator cannot handle your query. We offer a special comment that you can use as a hint in your statement to stop automatic translation: /*+ NEO4J FORCE_CYPHER */.

var url = "jdbc:neo4j://localhost:7687?enableSQLTranslation=true";
var query = """
        /*+ NEO4J FORCE_CYPHER */
        MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
            ((:Stop)-[:NEXT]->(:Stop)){1,3}
            (a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
        RETURN localtime(d.departs) AS departureTime,
            localtime(a.arrives) AS arrivalTime
        """;
try (var connection = DriverManager.getConnection(url, username, password);
        var stmnt = connection.createStatement();
        var result = stmnt.executeQuery(query)) {
    while (result.next()) {
        LOGGER.info(result.getTime("departureTime").toString());
    }
}

6.2.3. Possible error scenarios

A NoSuchElementException with a message of No SQL translators available will be thrown when there is no implementation of the SQL to Cypher translator available, and you either used java.sql.Connection.nativeSQL or enabled automatic translation. The exception will be thrown when you access the method or eagerly on opening a connection in the latter case.

6.3. Using the default translator

6.3.1. Supported SQL dialects

Our default translator uses the OSS parser from jOOQ, which supports a broad spectrum of SQL dialects already. We picked the generic, default dialect of jOOQ as our default dialect, but you can overwrite this in the SQL to Cypher configuration using the parameter s2c.sqlDialect with one of the supported dialects listed in configuration below. POSTGRES can be a good choice for several integrations.

Bear in mind though that any shortcomings in the translation are probably not due to a lack in the parser, but due to the lack of an obvious, semantically equivalent Cypher construct. That means we might be able to parse a certain piece of SQL, but are unable to translate in into something meaningful that Neo4j can understand without additional, contextual information.

6.3.2. Configuration

The default implementation provides a number of configuration settings. They must be prefixed with s2c in the URL or config options:

Name Meaning Default

parseNameCase

Whether to parse table names as is or not.

true

tableToLabelMappings

A map from table names to labels.

An empty map

joinColumnsToTypeMappings

A map from column names to relationship types.

An empty map

prettyPrint

Whether to format the generated Cypher or not.

true

alwaysEscapeNames

Whether to always escape names.

Unless explicitly configured false when pretty printing is on, otherwise true.

sqlDialect

Which dialect to use when parsing. Supported values are POSTGRES, SQLITE, MYSQL, H2, HSQLDB, DERBY and DEFAULT.

DEFAULT

The next few examples use the properties config to avoid terrible long URLs in this documentation, but all the attributes can be specified via URL as well.

Listing 17. Disable pretty printing; only escape if necessary; configure dedicated table mappings
var properties = new Properties();
properties.put("username", "neo4j");
properties.put("password", "verysecret");
properties.put("enableSQLTranslation", "true");
properties.put("s2c.prettyPrint", "false");
properties.put("s2c.alwaysEscapeNames", "false");
properties.put("s2c.tableToLabelMappings", "people:Person;movies:Movie;movie_actors:ACTED_IN");

var url = "jdbc:neo4j://localhost:7687";
var query = """
        SELECT p.name, m.title
        FROM people p
        JOIN movie_actors r ON r.person_id = p.id
        JOIN movies m ON m.id = r.person_id""";
try (var connection = DriverManager.getConnection(url, properties)) {
    var sql = connection.nativeSQL(query);
    assert "MATCH (p:Person)-[r:ACTED_IN]->(m:Movie) RETURN p.name, m.title".equals(sql);
}
Listing 18. Parse table names into upper case
var properties = new Properties();
properties.put("username", "neo4j");
properties.put("password", "verysecret");
properties.put("enableSQLTranslation", "true");
properties.put("s2c.parseNameCase", "UPPER");

var url = "jdbc:neo4j://localhost:7687";
var query = "SELECT * FROM people";
try (var connection = DriverManager.getConnection(url, properties)) {
    var sql = connection.nativeSQL(query);
    assert """
            MATCH (people:PEOPLE)
            RETURN *""".equals(sql);
}

Named parameter syntax in the SQL parser defaults to :name (such as supported by Oracle, JPA, Spring, a colon followed by a name). The following example changes that prefix to $ (the same prefix that Cypher uses):

Listing 19. Change parameters prefix and add mappings for join columns
var properties = new Properties();
properties.put("username", "neo4j");
properties.put("password", "verysecret");
properties.put("enableSQLTranslation", "true");
properties.put("s2c.parseNamedParamPrefix", "$");
properties.put("s2c.joinColumnsToTypeMappings", "people.movie_id:DIRECTED");

var url = "jdbc:neo4j://localhost:7687";
var query = """
        SELECT *
        FROM people p
        JOIN movies m ON m.id = p.movie_id
        WHERE p.name = $1
        """;
try (var connection = DriverManager.getConnection(url, properties)) {
    var sql = connection.nativeSQL(query);
    assert """
            MATCH (p:people)-[:DIRECTED]->(m:movies)
            WHERE p.name = $1
            RETURN *""".equals(sql);
}

This is helpful when a tool generates names like that and does not allow customization.

6.3.3. Supported statements

The following statements are all under tests and describe what you can expect from the default translation layer:

6.3.4. Translation concepts

Table names to labels

The most simple SELECT statement to translate is one without FROM clause, such as:

SELECT 1

It is equivalent to a Cypher RETURN:

RETURN 1

SELECT statements without JOIN clauses are pretty straightforward to translate. The challenge here is how to map the table name to labels:

  • We parse the SQL statement case-sensitive by default

  • Table names are mapped to node labels

  • Table aliases are used as identifiable symbolic names

SELECT t.a, t.b
FROM My_Table (1)
  AS t (2)
WHERE t.a = 1
1 Will be used as the label to match, as-is, i.e. My_Table
2 The table alias will become the node alias

The whole construct will be translated to

MATCH (t:My_Table)
WHERE t.a = 1
RETURN t.a, t.b

Table aliases are optional, if you omit them, we derive aliases from the labels and types. If you inspect the translated queries, we recommend using aliases, as this makes the queries better readable.

Star-Selects

A star- or *- select comes in different forms:

Unqualified

SELECT * FROM table

Qualified

SELECT t.* FROM table t

And a variant, selecting the relation itself: SELECT t FROM table t.

We make use of this to let you decide whether you want to return Neo4j nodes and relationships as entities, maps, or flattened to individual columns. The latter requires our translator to have access to the schema of the underlying Neo4j database. The following sections describe the use-cases.

Projecting individual properties

Don’t use a star-select but enumerate the properties:

SELECT m.title FROM Movie m

The table alias will be used as a symbolic name:

MATCH (m:Movie)
RETURN m.title;

You can omit the table alias:

SELECT title FROM Movie

The lower-cased table name will be the symbolic name:

MATCH (movie:Movie)
RETURN movie.title;

Accessing JDBC columns by name leads to code that is hard to maintain, as column renamings impact code as well. To avoid this, alias the column:

SELECT title AS title FROM Movie

So that it has a stable, well-known name:

MATCH (movie:Movie)
RETURN movie.title AS title;
Projecting all properties

A SELECT * statement gets translated differently depending on whether the connection to the Neo4j database is available.

SELECT * FROM Person p

If you are offline, you will get the following Cypher statement:

MATCH (p:Person) RETURN *

The above query will return one column (p) containing a Neo4j node. This is usually not what you expect in the relational world. If you run the translation online and Neo4j metadata can be retrieved, you will get a statement that flattens the properties of each node and relationship, plus their element IDs:

In case the Person node has properties born and name,

SELECT * FROM Person p

you will get this Cypher statement

MATCH (p:Person)
RETURN elementId(p) AS `v$id`,
       p.born AS born, p.name AS name

This works well with multiple tables as well (Movie has properties title and released):

SELECT * FROM Person p JOIN Movie m ON m.id = p.acted_in
MATCH (p:Person)-[acted_in:ACTED_IN]->(m:Movie)
RETURN elementId(p) AS `v$id`, p.born AS born, p.name AS name,
       elementId(m) AS `v$id1`, m.title AS title, m.released AS released

We append increasing numbers to column names to clashing ones (ex. with name and remark properties both in Movie and Person):

SELECT * FROM Person p JOIN Movie m ON m.id = p.acted_in

Note the increasing numbers per duplicate name:

MATCH (p:Person)-[acted_in:ACTED_IN]->(m:Movie)
RETURN elementId(p) AS `v$id`,
       p.born AS born, p.name AS name, p.remark AS remark,
       elementId(m) AS `v$id1`,
       m.name AS name1, m.released AS released, m.remark AS remark1

The following example uses a join-table to access relationships (we explain this later in this manual when discussing joins), and the flattening of properties works here as well:

SELECT *
FROM people p
JOIN movie_actors r ON r.person_id = p.id
JOIN movies m ON m.id = r.person_id
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN elementId(p) AS `v$id`,
       p.born AS born, p.name AS name,
       elementId(p) AS `v$person_id`,
       elementId(r) AS `v$id1`, r.role AS role,
       elementId(m) AS `v$movie_id`,
       elementId(m) AS `v$id2`,
       m.title AS title, m.released AS released

Listing 20. Ordering without specifying a table alias
SELECT * FROM Person p ORDER BY name ASC
MATCH (p:Person)
RETURN elementId(p) AS `v$id`,
       p.born AS born, p.name AS name
ORDER BY p.name

A qualified alias can be used as well. If no Neo4j metadata is available, you will get a map of properties of the node/relationship:

SELECT m.*, p.*
FROM Person p
JOIN Movie m ON m.id = p.acted_in

The corresponding columns must be downcast to a map in JDBC:

MATCH (p:Person)-[acted_in:ACTED_IN]->(m:Movie)
RETURN m{.*} AS m, p{.*} AS p

If we add more data (ex. born and name to Person), the qualified star will project all of them (note how we also project one single, known column from the Movie table):

SELECT p.*, m.title AS title
FROM Person p
JOIN Movie m ON m.id = p.acted_in
MATCH (p:Person)-[acted_in:ACTED_IN]->(m:Movie)
RETURN elementId(p) AS `v$id`, p.born AS born, p.name AS name, m.title AS title
Returning nodes and relationships

A statement that projects a table alias such as

SELECT m FROM Movie m

will result in a Cypher statement returning the matched node as node.

MATCH (m:Movie)
RETURN m;

A node can be aliased as well:

SELECT m AS node FROM Movie m
MATCH (m:Movie)
RETURN m AS node;

Un-aliased tables can be used as well:

SELECT movie FROM Movie
MATCH (movie:Movie)
RETURN movie;

Multiple entities are supported, too:

SELECT p, r, m FROM Person p
JOIN ACTED_IN r ON r.person_id = p.id
JOIN Movie m ON m.id = r.movie_id
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie) RETURN p, r, m
Comparing SQL with Cypher examples

The source of the following examples is: Comparing SQL with Cypher.

Find all Products
Select and Return Records

Select everything from the products table.

SELECT p.*
FROM products as p

Similarly, in Cypher, you just MATCH a simple pattern: all nodes with the label Product and RETURN them.

MATCH (p:Product)
RETURN p{.*} AS p

The above query will project all properties of the matched node. If you want to return the node itself, select it without using the asterisk:

SELECT p
FROM products as p
MATCH (p:Product)
RETURN p
Field Access, Ordering and Paging

It is more efficient to return only a subset of attributes, like ProductName and UnitPrice. And while we are at it, let’s also order by price and only return the 10 most expensive items. (Remember that labels, relationship-types and property-names are case sensitive in Neo4j.)

SELECT p.`productName`, p.`unitPrice`
FROM products as p
ORDER BY p.`unitPrice` DESC
LIMIT 10
MATCH (p:Product)
RETURN p.productName, p.unitPrice ORDER BY p.unitPrice DESC LIMIT 10

The default order direction will be translated as is:

SELECT * FROM Movies m ORDER BY m.title
MATCH (m:Movies)
RETURN * ORDER BY m.title
DISTINCT projections

The DISTINCT keyword for projections is handled:

SELECT DISTINCT m.released FROM Movies m
MATCH (m:Movies)
RETURN DISTINCT m.released

It works with * projections as well:

SELECT DISTINCT m.* FROM Movies m
MATCH (m:Movies)
RETURN DISTINCT m {.*} AS m

However, as the qualified asterisks will use metadata if available, the translation with a database connection is different:

SELECT DISTINCT m.* FROM Movies m
MATCH (m:Movies)
RETURN DISTINCT elementId(m) AS `v$id`, m.title AS title, m.released AS released

Note that each row includes the Neo4j element ID, making each row unique. This being said, the DISCTINCT clause is of limited use with the asterisk.

6.3.5. Expressions

Most SQL expressions have corresponding Cypher expressions and can be translated straightforward.

Literal Values

Literal values are 1:1 translations.

SELECT
    1, TRUE, FALSE, NULL, 'a'
RETURN 1, TRUE, FALSE, NULL, 'a'
Arithmetic expressions

Arithmetic expressions are 1:1 translations.

SELECT
    1 + 2,
    1 - 2,
    1 * 2,
    1 / 2,
    square(2)

Note that the underlying tech of the default translator uses Cypher-DSL internally, which will wrap arithmetic (and logical) expressions with parentheses:

RETURN
    (1 + 2),
    (1 - 2),
    (1 * 2),
    (1 / 2),
    (2 * 2)
Functions
Numeric functions

We can translate all numeric functions that are supported by Neo4j’s Cypher implementation: Mathematical functions - Numeric:

SELECT
    abs(1),
    ceil(1),
    floor(1),
    round(1),
    round(1, 1),
    sign(1)

will be translated to

RETURN
    abs(1),
    ceil(1),
    floor(1),
    round(1),
    round(1, 1),
    sign(1)
Logarithmic functions

Neo4j supports a broad range of logarithmic functions.

SELECT
    exp(1),
    ln(1),
    log(2, 1),
    log10(1),
    sqrt(1)

will be translated to

RETURN
    exp(1),
    log(1),
    (log(1) / log(2)),
    log10(1),
    sqrt(1)
Trigonometric functions

Calls to trigonometric functions

SELECT
    acos(1),
    asin(1),
    atan(1),
    atan2(1, 2),
    cos(1),
    cot(1),
    degrees(1),
    pi(),
    radians(1),
    sin(1),
    tan(1)

will be translated to the corresponding Neo4j functions:

RETURN
    acos(1),
    asin(1),
    atan(1),
    atan2(1, 2),
    cos(1),
    cot(1),
    degrees(1),
    pi(),
    radians(1),
    sin(1),
    tan(1)
String functions

The following string manipulations are guaranteed to work:

SELECT
    lower('abc'),
    cast(3 as varchar),
    trim(' abc '),
    length('abc'),
    left('abc', 2),
    ltrim(' abc '),
    replace('abc', 'b'),
    replace('abc', 'b', 'x'),
    reverse('abc'),
    right('abc', 2),
    rtrim(' abc '),
    substring('abc', 2 - 1),
    substring('abc', 2 - 1, 2),
    upper('abc')

and will be translated to Neo4j’s versions:

RETURN
    toLower('abc'),
    toString(3),
    trim(' abc '),
    size('abc'),
    left('abc', 2),
    ltrim(' abc '),
    replace('abc', 'b', NULL),
    replace('abc', 'b', 'x'),
    reverse('abc'),
    right('abc', 2),
    rtrim(' abc '),
    substring('abc', (2 - 1)),
    substring('abc', (2 - 1), 2),
    toUpper('abc')
Scalar functions

The input

SELECT
    coalesce(1, 2),
    coalesce(1, 2, 3),
    nvl(1, 2),
    cast('1' as boolean),
    cast(1 as float),
    cast(1 as double precision),
    cast(1 as real),
    cast(1 as tinyint),
    cast(1 as smallint),
    cast(1 as int),
    cast(1 as bigint)

will be translated to (see Scalar functions):

RETURN
    coalesce(1, 2),
    coalesce(1, 2, 3),
    coalesce(1, 2),
    toBoolean('1'),
    toFloat(1),
    toFloat(1),
    toFloat(1),
    toInteger(1),
    toInteger(1),
    toInteger(1),
    toInteger(1)
Query expressions

Several advanced SQL expressions are supported as well.

CASE simple

The simple CASE expressions

SELECT
    CASE 1 WHEN 2 THEN 3 END,
    CASE 1 WHEN 2 THEN 3 ELSE 4 END,
    CASE 1 WHEN 2 THEN 3 WHEN 4 THEN 5 END,
    CASE 1 WHEN 2 THEN 3 WHEN 4 THEN 5 ELSE 6 END
RETURN CASE 1 WHEN 2 THEN 3 END, CASE 1 WHEN 2 THEN 3 ELSE 4 END, CASE 1 WHEN 2 THEN 3 WHEN 4 THEN 5 END, CASE 1 WHEN 2 THEN 3 WHEN 4 THEN 5 ELSE 6 END

See Cypher → Conditional expressions (CASE) for more information.

CASE advanced

And CASE statement using a search:

SELECT
    CASE WHEN 1 = 2 THEN 3 END,
    CASE WHEN 1 = 2 THEN 3 ELSE 4 END,
    CASE WHEN 1 = 2 THEN 3 WHEN 4 = 5 THEN 6 END,
    CASE WHEN 1 = 2 THEN 3 WHEN 4 = 5 THEN 6 ELSE 7 END

will be translated to

RETURN
    CASE WHEN 1 = 2 THEN 3 END,
    CASE WHEN 1 = 2 THEN 3 ELSE 4 END,
    CASE WHEN 1 = 2 THEN 3 WHEN 4 = 5 THEN 6 END,
    CASE WHEN 1 = 2 THEN 3 WHEN 4 = 5 THEN 6 ELSE 7 END

See Cypher → Conditional expressions (CASE) for more information.

CASE abbreviations (which are not COALESCE or NVL)

The input

SELECT
    nullif(1, 2),
    nvl2(1, 2, 3)

will be translated to

RETURN
    CASE WHEN 1 = 2 THEN NULL ELSE 1 END,
    CASE WHEN 1 IS NOT NULL THEN 2 ELSE 3 END

6.3.6. Predicates

As with expressions a lot of logical SQL expressions and conditions used as predicates can be translated straightforward into Cypher predicates.

Conjunctions and disjunctions

Logical conjunctions and disjunctions are all supported.

SELECT 1 FROM p WHERE 1 = 1 AND 2 = 2 OR 3 = 3
MATCH (p:p)
WHERE ((1 = 1
    AND 2 = 2)
  OR 3 = 3)
RETURN 1

The input

SELECT 1 FROM p WHERE NOT 1 = 1 XOR 2 = 2

will be translated to

MATCH (p:p)
WHERE (NOT (1 = 1)
  XOR 2 = 2)
RETURN 1
Operators
Arithmetic operators

The input

SELECT 1 FROM p WHERE 1 = 1 AND 2 > 1 AND 1 < 2 AND 1 <= 2 AND 2 >= 1 AND 1 != 2

will be translated to

MATCH (p:p)
WHERE (1 = 1
  AND 2 > 1
  AND 1 < 2
  AND 1 <= 2
  AND 2 >= 1
  AND 1 <> 2)
RETURN 1
Between

Between in SQL is inclusive

SELECT 1 FROM p WHERE 2 BETWEEN 1 AND 3

and will be translated to (we can’t generate the shorter form (1 ⇐ 2 ⇐ 3) due to restrictions in the underlying generator):

MATCH (p:p)
WHERE (1 <= 2) AND (2 <= 3)
RETURN 1

SQL has a SYMMETRIC keyword for the BETWEEN clause, to indicate that you do not care which bound of the range is larger:

SELECT 1 FROM p WHERE 2 BETWEEN SYMMETRIC 3 AND 1

We translate this into a disjunction:

MATCH (p:p)
WHERE (3 <= 2) AND (2 <= 1) OR (1 <= 2) AND (2 <= 3)
RETURN 1
Logical row value expressions

The above examples are based on scalar expressions. Row value expressions will be translated as well:

SELECT 1
FROM p
WHERE (1, 2) = (3, 4)
OR (1, 2) < (3, 4)
OR (1, 2) <= (3, 4)
OR (1, 2, 3) <> (4, 5, 6)
OR (1, 2, 3) > (4, 5, 6)
OR (1, 2, 3) >= (4, 5, 6)

Leads to a semantically equivalent cypher:

MATCH (p:p)
WHERE 1 = 3 AND 2 = 4
OR (1 < 3 OR 1 = 3 AND 2 < 4)
OR (1 < 3 OR 1 = 3 AND 2 <= 4)
OR (1 != 4 AND 2 != 5 AND 3 != 6)
OR (1 > 4 OR 1 = 4 AND (2 > 5 OR 2 = 5 AND 3 > 6))
OR (1 > 4 OR 1 = 4 AND (2 > 5 OR 2 = 5 AND 3 >= 6))
RETURN 1
Null Handling
For scalar expressions

The input

SELECT 1 FROM p WHERE 1 IS NULL AND 2 IS NOT NULL

will be translated to

MATCH (p:p)
WHERE (1 IS NULL
  AND 2 IS NOT NULL)
RETURN 1
For row value expressions

The input

SELECT 1 FROM p WHERE (1, 2) IS NULL OR (3, 4) IS NOT NULL

will be translated to

MATCH (p:p)
WHERE
  (1 IS NULL AND 2 IS NULL)
  OR (3 IS NOT NULL AND 4 IS NOT NULL)
RETURN 1
LIKE operator

The LIKE operator

SELECT * FROM movies m WHERE m.title LIKE '%Matrix%'

will be translated into a regular expressions, replacing the % with .*:

MATCH (m:`movies`) WHERE m.title =~ '.*Matrix.*'
RETURN *

6.3.7. Using joins to map relationships

On the surface, joins are relationships materialized in SQL (foreign keys are not). Sadly, it’s not as straightforward to map to Cypher. There are several implementation options:

  • When joining two tables on a column, take the left hand table column, use its name as relationship type, and treat it as outgoing from left to right.

  • When joining two tables with an intersection table (which it’s usually modeled in SQL for m:n relationships with attributes), use the name of that intersection table as relationship type.

We implemented some variants thereof, however we don’t claim their absolute usefulness in all cases.

1:n joins
Natural joins

SQL NATURAL joins are the easiest way to denote relationship names without having to do any mapping. A one-hope NATURAL JOIN will translate to an anonymous, wildcard relationship.

SELECT p, m FROM Person p
NATURAL JOIN Movie m
MATCH (p:Person)-->(m:Movie) RETURN p, m

NATURAL joins can be chained, and the connecting join table does not need to exist. This will be turned into a Neo4j relationship:

SELECT p.name, r.roles, m.* FROM Person p
NATURAL JOIN ACTED_IN r
NATURAL JOIN Movie m
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p.name, r.roles,
       elementId(m) AS `v$id`, m.title AS title, m.released AS released
Simple join

Assume we configured the translator to use the following table mapping:

  • people mapped to label People

  • movies mapped to label Movie

With that in place, we translate

SELECT p.name, m.title
FROM people p
JOIN movies m ON m.id = p.directed

to

MATCH (p:Person)-[directed:DIRECTED]->(m:Movie)
RETURN p.name, m.title

DIRECTED is the uppercase version of the join column in the left table (p.directed).

We can add a join column mapping in the form of people.movie_id:DIRECTED if we have different column names:

SELECT p.name, m.title
FROM people p
JOIN movies m ON m.id = p.movie_id

to

MATCH (p:Person)-[directed:DIRECTED]->(m:Movie)
RETURN p.name, m.title
Using the ON clause

We used backticks here for the table and column names and no mapping.

SELECT p.name, m.title
FROM `Person` as p
JOIN `Movie` as m ON (m.id = p.`DIRECTED`)

The translation is same as before:

MATCH (p:Person)-[directed:DIRECTED]->(m:Movie)
RETURN p.name, m.title
m:n joins

An intersection table is a table that contains references to two other tables in the form of at least two columns. This construct is usually required in the relational model to create an m:n relationship. Such an auxiliary construct is not necessary in Neo4j. We can model as many outgoing and incoming relationships from one label to another, and they can also have properties. We can thus use that construct for our translator.

The following example uses a configured mapping as follows:

  • people mapped to label People

  • movies mapped to label Movie

  • movie_actors mapped to ACTED_IN

SELECT p.name, m.title
FROM people p (1)
JOIN movie_actors r ON r.person_id = p.id (2)
JOIN movies m ON m.id = r.person_id (3)
1 The table from which to map outgoing relationships
2 An intersection table, that is used again in the next JOIN clause
3 The final JOIN clause

We do no semantic analysis: the order of the joins matter, and will lead to the following query:

MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p.name, m.title

Multiple joins will result in a chain of relationships:

SELECT p.name AS actor, d.name AS director, m.title
FROM people p
 JOIN movie_actors r ON r.person_id = p.id
 JOIN movies m ON m.id = r.person_id
 JOIN movie_directors r2 ON r2.movie_id = m.id
 JOIN people d ON r2.person_id = d.id
MATCH (p:`Person`)-[r:`ACTED_IN`]->(m:`Movie`)<-[r2:`DIRECTED`]-(d:`Person`)
RETURN p.name AS actor, d.name AS director, m.title

Notice how the direction of the DIRECTED relationship is defined by the order of the join columns.

6.3.8. DML statements

This section lists the supported Data-Manipulation-Language (DML) statements. Although a SELECT statement is technically DML as well, it is covered in Neo4j JDBC Driver.

Deleting nodes

Nodes can be deleted via the SQL DELETE statement.

For example, to unconditionally delete all person nodes:

DELETE FROM person
MATCH (person:person)
DELETE person

A WHERE clause can be added to prevent this:

DELETE FROM person
WHERE person.id = 1
MATCH (person:person)
WHERE person.id = 1
DELETE person

If you want to delete everything, but your tooling complains, just add a conditions that is always true:

DELETE FROM person
WHERE true
MATCH (person:person)
WHERE true
DELETE person

Of, the condition can also be that always evaluates to false, never deleting anything:

DELETE FROM person
WHERE false
MATCH (person:person)
WHERE false
DELETE person

Tables can be aliased, and the alias will be used in Cypher, too:

DELETE FROM person p
MATCH (p:person)
DELETE p

Aliasing tables is also supported in combination with specifying the label to which the table name is mapped. Using the same query with table_mappings=person:Person configured,

DELETE FROM person p

will be translated to

MATCH (p:Person)
DELETE p

You can use SQL TRUNCATE to DETACH DELETE nodes.

TRUNCATE TABLE people

will be translated to

MATCH (people:Person)
DETACH DELETE people
Inserting data

A single list of values with explicit columns and constant values can be inserted with a simple INSERT statement:

INSERT INTO People (first_name, last_name, born) VALUES ('Helge', 'Schneider', 1955)
CREATE (people:`Person` {first_name: 'Helge', last_name: 'Schneider', born: 1955})

All expressions, including parameters, are supported. Parameters will be named from 1 on upwards in Cypher.

INSERT INTO People (first_name, last_name, born) VALUES (?, ?, ?)
CREATE (people:`Person` {first_name: $1, last_name: $2, born: $3})

If you omit the column names on the insertion target, we generate names:

INSERT INTO People VALUES ('Helge', 'Schneider', 1955)

Note the unknown field xxx property names:

CREATE (people:`Person` {`unknown field 0`: 'Helge', `unknown field 1`: 'Schneider', `unknown field 2`: 1955})

The SQL VALUES clause actually supports lists of values:

INSERT INTO People (first_name, last_name, born) VALUES
    ('Helge', 'Schneider', 1955),
    ('Bela', 'B', 1962)

Those values will be translated into a Cypher array to be unwound in the Cypher statement. This is a great solution for batching inserts:

UNWIND [
  {first_name: 'Helge', last_name: 'Schneider', born: 1955},
  {first_name: 'Bela', last_name: 'B', born: 1962}]
AS properties
CREATE (people:`Person`)
SET people = properties

A returning clause is supported as well:

INSERT INTO People p (name) VALUES (?) RETURNING elementId(p)
CREATE (p:Person {name: $1}) RETURN elementId(p)
Upserts

We support a restricted range of "upserts" via the non-standard but pretty common ON DUPLICATE and ON CONFLICT SQL clauses. Upserts are translated into MERGE statements. While they work without constraints, you should really have uniqueness-constraints on the node properties you merge on, or Neo4j may create duplicates (see Understanding how merge works).

Upserts on all columns can happen via either ON DUPLICATE KEY IGNORE or ON CONFLICT IGNORE. While ON DUPLICATE KEY does offer upgrade options, it assumes the primary (or unique) key being violated to be known. Although this is most certainly the case in a relational system, this translation layer that runs without a database connection doesn’t know.

Listing 21. Upsert with ON DUPLICATE KEY IGNORE
INSERT INTO Movie(title, released) VALUES(?, ?) ON DUPLICATE KEY IGNORE
MERGE (movie:`Movie` {title: $1, released: $2})
Listing 22. Upsert with ON CONFLICT IGNORE
INSERT INTO actors(name, firstname) VALUES(?, ?) ON CONFLICT DO NOTHING
MERGE (actors:`Actor` {name: $1, firstname: $2})

If you want to define an action, you must use ON CONFLICT and specify the key you want to merge on.

INSERT INTO tbl(i, j, k) VALUES (1, 40, 700)
ON CONFLICT (i) DO UPDATE SET j = 0, k = 2 * EXCLUDED.k

Note how the special reference EXCLUDED can be used to refer to the values of columns that have not been part of the key. They will be reused with their values in the ON MATCH SET clause.

MERGE (tbl:`tbl` {i: 1})
ON CREATE SET tbl.j = 40, tbl.k = 700
ON MATCH SET tbl.j = 0, tbl.k = (2 * 700)

This works with parameters, too:

INSERT INTO tbl(i, j, k) VALUES (1, 2, ?)
ON CONFLICT (i) DO UPDATE SET j = EXCLUDED.k
MERGE (tbl:`tbl` {i: 1})
ON CREATE SET tbl.j = 2, tbl.k = $1
ON MATCH SET tbl.j = $1

It’s possible to just specify a concrete merge column instead of merging on all columns as well. It will be translated with ON CREATE:

INSERT INTO tbl(i, j, k) VALUES (1, 40, 700)
ON CONFLICT (i) DO NOTHING
MERGE (tbl:`tbl` {i: 1})
ON CREATE SET tbl.j = 40, tbl.k = 700

Using ON CONFLICT and specifying a key is the only way to insert multiple rows with a MERGE statement:

INSERT INTO People (first_name, last_name, born) VALUES
    ('Helge', 'Schneider', 1955),
    ('Bela', 'B', 1962)
ON CONFLICT(last_name) DO UPDATE SET born = EXCLUDED.born
UNWIND [{first_name: 'Helge', last_name: 'Schneider', born: 1955}, {first_name: 'Bela', last_name: 'B', born: 1962}] AS properties
MERGE (people:`People` {last_name: properties['last_name']})
ON CREATE SET
  people.first_name = properties.first_name,
  people.born = properties.born
ON MATCH SET people.born = properties['born']

7. text2cypher

This is an experimental translator inspired by the Neo4j Labs project text2cypher.

If you add this translator to the classpath or use the text2cypher bundle, all queries that start with

🤖,

will be treated as natural language queries written in plain english. The driver will strip the prefix, and use OpenAI to translate the input into a Cypher statement. The driver will augment the generation of the query by passing the current graph schema along with the input question.

The following data will be sent to an external API:

  • Your database schema, including label names

  • Any natural language query

Don’t use this translator if you don’t want the above, or are not allowed to do so.

This module requires one additional configuration: the OpenAI API key. You can use either a URL parameter, a JDBC property entry, or an environment variable:

  • URL parameter/property name is openAIApiKey

  • Environment variable name is OPEN_AI_API_KEY

Listing 23. Example of a valid URL
jdbc:neo4j://localhost:7687?openAIApiKey=sk-xxx-your-key

Additional configuration properties are

property name default value

openAIBaseUrl

https://api.openai.com/v1 (defined by langchain4j)

openAIModelName

gpt-4-turbo

openAITemperature

0.0

With that in place, a query such as the following can be translated into Cypher:

🤖, How was The Da Vinci Code rated?

The outcome of the LLM is not deterministic and is likely to vary. While you can execute it directly, we strongly recommend to use Connection#nativeSQL to retrieve the Cypher statement, inspect it, and then run it separately. In our test runs, the above questions was most often correctly translated to

MATCH (m:`Movie` {
  title: 'The Da Vinci Code'
})<-[r:`REVIEWED`]-(p:`Person`)
RETURN r.rating AS Rating, p.name AS ReviewerName

Other times the result was a syntactically correct statement, but it would only return the reviewers and the movie itself. Also note that while a human likely recognizes that you are actually thinking about the average rating, the LLM does not infer this. Making the question more explicit gives better results:

🤖, How was The Da Vinci Code rated on average?

is translated more accurately to:

MATCH (m:`Movie` {
  title: 'The Da Vinci Code'
})<-[:`REVIEWED`]-(p:`Person`)
RETURN avg(p.rating) AS AverageRating
Once a natural language query gets translated into Cypher, the result will be cached and further invocations of that query will use the cached result.

All that statements that do not start with 🤖 will be used as-is and treated as Cypher.

Appendix A: Neo4j specific conversions

Neo4j does not support all types used in the relational world. For some of them we offer conversions that are also available in frameworks such as Spring Data Neo4j (SDN). Those conversions are available in this driver as well, so you could use both SDN and this driver interchangeably.

A.1. Data types for fixed-point arithmetic

Neo4j does not support BigInteger and BigDecimal. The only way to store them is as String, and to read them back into the corresponding type. This is in line with SDN and OGM.

Any parameter of those types passed to PreparedStatement or CallableStatement will be stored as String, but can be read back through corresponding methods on the result sets.

A.2. SQL Date, Time and Timestamps

java.sql.Date

Maps to Cypher DATE

java.sql.Time

Maps to Cypher LOCAL TIME

java.sql.Timestamp

Maps to Cypher LOCAL DATETIME

For information on Cypher date types, see Temporal types.

For more precise a mapping, use a Neo4j Value instance with the appropriate type and its methods setObject and getObject.

Appendix B: Neo4j specific syntax for statements

This section deals with conventions for indexing and naming parameters in all types of statements (standard statements, prepared statements and callable statements) and other Neo4j specific syntax.

B.1. Callable statements

B.1.1. General syntax

You can invoke stored procedures as follows:

Common JDBC syntax

{? = call db.index.fulltext.queryNodes(?, ?)}

Standard Neo4j syntax

call db.index.fulltext.queryNodes(?, ?) yield *

Enumerating yielded columns

{$propertyName = call db.schema.nodeTypeProperties()}

Return (only applicable for functions)

RETURN sin(?)

B.1.2. Named parameters

Our callable statement implementation (org.neo4j.jdbc.Neo4jCallableStatement) does support named parameters. As per the JDBC specification, those are not named placeholders in a query- or statement-string, but the actual, formal parameters for the stored procedures to be called. We support both the $ and the colon syntax (i.e. either $name or :name).

The assigment {? = call xxx()} will be rewritten to call xxx() yield *, and {$x = call xxx()} will be rewritten to call xxx() yield x.

B.1.3. Result sets of callable statements

When you execute a callable statement via executeQuery, you must use the result set returned. If you just use execute, we assume that the underlying procedure returns only one row and that you use the getters on the statement itself.

Appendix C: Migrating from older versions or other JDBC drivers for Neo4j

There are some other JDBC drivers for Neo4j, under various licenses and with varying features and capabilities. In the following we outline possible migration processes. The basic usage patterns for all JDBC drivers for Neo4j are very similar: in the end, it’s "just" about using a JDBC compliant driver, and you would use it as described in the original Java tutorial about JDBC.

C.1. Migrating from version 4 or 5 of this driver

Version 5 and 4 of the Neo4j JDBC Driver have been mainly developed by Larus BA, Italy, a certified consulting and integration solutions partner for Neo4j.

The most important change that you need to make is removing the dependency on org.neo4j:neo4j-jdbc-bolt. You need to replace it with org.neo4j:neo4j-jdbc or one of the bundles we provide, see available bundles.

C.1.1. URL format, parameters and main driver class

The previous version mapped the URLs from the common Java driver onto jdbc:, i.e. using jdbc:neo4j:neo4j+s://foobar.io:7687/ for connecting against a database running on host foobar.io on port 7687.

The following URLs (direct bolt connection) are not supported:

  • jdbc:neo4j:bolt://<host>:<port>/

  • jdbc:neo4j:bolt+s://<host>:<port>/

  • jdbc:neo4j:bolt+ssc://<host>:<port>/

The following URLs behave the same but must be rewritten:

  • jdbc:neo4j:neo4j://<host>:<port>/ becomes jdbc:neo4j://<host>:<port>/

  • jdbc:neo4j:neo4j+s://<host>:<port>/ becomes jdbc:neo4j+s://<host>:<port>/

  • jdbc:neo4j:neo4j+ssc://<host>:<port>/ becomes jdbc:neo4j+ssc://<host>:<port>/

The following configuration properties are not supported and have no replacement:

  • leaked.sessions.logging

  • readonly

  • usebookmarks

  • max.transaction.retry.time (this driver has no built-in retry mechanism)

As with any persistent database connection you want to cater for failed transactions. We made good experience with resilience4j which fits in well with common Java frameworks, such as Spring Boot.

The following properties can be achieved with standardized JDBC settings:

  • encryption — Use the appropriate transport scheme (neo4j, neo4j+s or neo4j+ssc)

  • autocommit — Use java.sql.Connection.setAutoCommit

Connection pooling can be achieved with any JDBC-compliant connection pool. The following properties have no effect:

  • max.connection.lifetime

  • max.connection.poolsize

The following properties just have different names:

  • connection.acquisition.timeout is timeout (as query-parameter to the URL or inside the configuration properties)

The following properties can be achieved using a different URL:

  • database is now part of the URL: instead of specifying jdbc:neo4j:neo4j+s://foobar.io:7687?database=abc you would use the database name as path segment in the url, such as: jdbc:neo4j+s://foobar.io:7687/abc

If your tooling requires to use a concrete driver class, this JDBC driver has only org.neo4j.jdbc.Neo4jDriver. If you depend on a javax.sql.DataSource, we provide org.neo4j.jdbc.Neo4jDataSource.

C.1.2. Flattening

While the Neo4j JDBC Driver does not support the flatten option, it can emulate its effect. flatten un-nests returned nodes and relationships by providing all their properties as individual columns.

If you enable automatic SQL to Cypher translation, any *-select will inspect whether it affects nodes or relationships and will un-nest their properties, so that a SELECT * FROM Movie m will effectively become MATCH (m:Movie) RETURN m.title AS title, m.released AS released. For more information, see Star-Selects.

In case you want to access the actual node, return the whole table alias or just use Cypher.

The Neo4j JDBC Driver supports complex object types as return types.