[[ha-setup-tutorial]]
High Availability setup tutorial
================================

This guide shows how to set up a Neo4j HA cluster and run embedded Neo4j or Neo4j Server instances as cluster nodes.

== Background ==

The members of the HA cluster (see <<ha>>) use a Coordinator cluster to manage themselves and 
coordinate lifecycle activity such as electing a master. The Coordinator cluster must therefore 
be installed and configured before working with the Neo4j database HA instances.

[TIP]
Neo4j Server (see <<server>>) and Neo4j Embedded (see <<configuration-introduction>>) can both be used as nodes in the same HA cluster.
This allows for scenarios where one application inserts and updates data via a Java or other JVM language based application, while other instances run Neo4j Server and expose the data via the REST API (<<rest-api>>).

In the steps below, three coordinator instances are set up on a single local machine.

=== Download and unpack Neo4j Enterprise ===

Download and unpack three installations of Neo4j Enterprise 
(called +$NEO4J_HOME1+, +$NEO4J_HOME2+, +$NEO4J_HOME3+) from http://neo4j.org/download[the Neo4j download site].

== Setup and start the Coordinator cluster ==

Now, in the '$NEO4J_HOME1/conf/coord.cfg' file, adjust the coordinator +clientPort+ and list the coordinator cluster members, which here all run on localhost with different ports:

[source]
----
#$NEO4J_HOME1/conf/coord.cfg

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

clientPort=2181
----

The other two config files, in +$NEO4J_HOME2+ and +$NEO4J_HOME3+, each have a different +clientPort+ set, but the other parameters are identical to the first one:

[source]
----
#$NEO4J_HOME2/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2182
----

[source]
----
#$NEO4J_HOME3/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2183
----
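
Since the three fragments differ only in +clientPort+, they can also be generated. The following is a minimal sketch, assuming the port scheme from the listings above; +coord_cfg+ is a hypothetical helper name and not part of the Neo4j distribution. It only prints the settings, so the rest of each 'coord.cfg' file stays untouched.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Neo4j): prints the coordinator
# settings shown above for instance $1 (1, 2 or 3).
coord_cfg() {
    id=$1
    # The server.N lines are identical for every instance.
    for n in 1 2 3; do
        echo "server.$n=localhost:$((2887 + n)):$((3887 + n))"
    done
    # Only the client port differs: 2181, 2182, 2183.
    echo "clientPort=$((2180 + id))"
}

# Print the fragment for each of the three instances.
for id in 1 2 3; do
    echo "# instance $id"
    coord_cfg "$id"
done
```

The printed lines still have to be merged into each 'coord.cfg' by hand.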

Next, create a file called 'myid' in the coordinator data directory of each installation. It contains the id of that server, 
which must equal the number in the matching +server.1+, +server.2+ or +server.3+ entry of the configuration files.

[source,shell]
----
neo4j_home1$ echo '1' > data/coordinator/myid
neo4j_home2$ echo '2' > data/coordinator/myid
neo4j_home3$ echo '3' > data/coordinator/myid
----
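
The same step can be written as a small function, which may be handy when the number of coordinators changes. This is only a sketch: +write_myids+ is a hypothetical helper, and it assumes that the installation directories are passed in the same order as the +server.N+ entries.

```shell
#!/bin/sh
# Hypothetical helper: write_myids DIR1 DIR2 DIR3 ...
# The n-th installation directory receives the id n in data/coordinator/myid.
write_myids() {
    id=1
    for home in "$@"; do
        # The directory exists in a normal installation; create it if missing.
        mkdir -p "$home/data/coordinator"
        echo "$id" > "$home/data/coordinator/myid"
        id=$((id + 1))
    done
}

# Example call, assuming the three variables point at the installations:
# write_myids "$NEO4J_HOME1" "$NEO4J_HOME2" "$NEO4J_HOME3"
```

The example call is left as a comment so that nothing is written unless the three variables are really set.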

We are now ready to start the Coordinator instances:

[source,shell]
----
neo4j_home1$ ./bin/neo4j-coordinator start
neo4j_home2$ ./bin/neo4j-coordinator start
neo4j_home3$ ./bin/neo4j-coordinator start
----

== Start the Neo4j Servers in HA mode ==

In the 'conf/neo4j.properties' file of each of the three installations, enable HA by setting the necessary parameters, giving every instance its own +ha.server_id+ and +ha.server+ port:

[source]
----
#$NEO4J_HOME1/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
----

[source]
----
#$NEO4J_HOME2/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6002

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
----

[source]
----
#$NEO4J_HOME3/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 3

#ip and port for this instance to bind to
ha.server = localhost:6003

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
----
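
The three files follow one pattern: the server id and the +ha.server+ port change per instance, while the coordinator list stays the same. A minimal sketch of that pattern, using a hypothetical helper named +ha_properties+ and assuming the ports used in this tutorial:

```shell
#!/bin/sh
# Hypothetical helper: prints the HA settings shown above for instance $1.
ha_properties() {
    id=$1
    # Unique, non-negative id of this instance.
    echo "ha.server_id = $id"
    # Each instance binds to its own port: 6001, 6002, 6003.
    echo "ha.server = localhost:$((6000 + id))"
    # Every instance points at the same coordinator client ports.
    echo "ha.coordinators = localhost:2181,localhost:2182,localhost:2183"
}

# Show the settings for the second instance.
ha_properties 2
```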

To avoid port clashes when starting the servers, adjust the ports for the REST endpoints in all instances under 'conf/neo4j-server.properties' and enable HA mode:

[source]
----
#$NEO4J_HOME1/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
----

[source]
----
#$NEO4J_HOME2/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
----

[source]
----
#$NEO4J_HOME3/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7476
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
----

To avoid JMX port clashes, adjust the assigned ports for all instances under 'conf/neo4j-wrapper.properties':

[source]
----
#$NEO4J_HOME1/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3637
...
----

[source]
----
#$NEO4J_HOME2/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3638
...
----

[source]
----
#$NEO4J_HOME3/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3639
...
----
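
With this many ports on one machine, a quick check for duplicates can save a failed start. The sketch below lists the ports assigned to each instance in this tutorial and reports any value that occurs more than once. The helper names +instance_ports+ and +all_ports+ are hypothetical, and the check only covers the numbers used here, not ports taken by other programs.

```shell
#!/bin/sh
# Hypothetical helper: prints every port assigned to instance $1 in this
# tutorial, one per line.
instance_ports() {
    id=$1
    echo "$((2180 + id))"   # coordinator clientPort
    echo "$((2887 + id))"   # first port of the server.N entry
    echo "$((3887 + id))"   # second port of the server.N entry
    echo "$((6000 + id))"   # ha.server
    echo "$((7473 + id))"   # org.neo4j.server.webserver.port
    echo "$((3636 + id))"   # com.sun.management.jmxremote.port
}

# All ports of the three instances.
all_ports() {
    for id in 1 2 3; do
        instance_ports "$id"
    done
}

# uniq -d prints only values that occur more than once in the sorted list.
duplicates=$(all_ports | sort -n | uniq -d)
if [ -z "$duplicates" ]; then
    echo "no port clashes"
else
    echo "clashing ports: $duplicates"
fi
```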

Now, start all three server instances.

[source,shell]
----
neo4j_home1$ ./bin/neo4j start
neo4j_home2$ ./bin/neo4j start
neo4j_home3$ ./bin/neo4j start
----

You should now be able to access the three servers (the first one is elected master since it was started first) at 
http://localhost:7474/webadmin/\#/info/org.neo4j/High%20Availability/,
http://localhost:7475/webadmin/\#/info/org.neo4j/High%20Availability/
and
http://localhost:7476/webadmin/#/info/org.neo4j/High%20Availability/
and check the status of the HA configuration.
Alternatively, the REST API exposes JMX, so you can check the HA JMX bean with, for example:

[source,shell]
----
curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' http://localhost:7474/db/manage/server/jmx/query
----

Then look for the following in the response:

[source,javascript]
----
"description" : "Information about all instances in this cluster",
    "name" : "InstancesInCluster",
    "value" : [ {
      "description" : "org.neo4j.management.InstanceInfo",
      "value" : [ {
        "description" : "address",
        "name" : "address"
      }, {
        "description" : "instanceId",
        "name" : "instanceId"
      }, {
        "description" : "lastCommittedTransactionId",
        "name" : "lastCommittedTransactionId",
        "value" : 1
      }, {
        "description" : "machineId",
        "name" : "machineId",
        "value" : 1
      }, {
        "description" : "master",
        "name" : "master",
        "value" : true
      } ],
      "type" : "org.neo4j.management.InstanceInfo"
    }
----

== Start Neo4j Embedded in HA mode ==

If you are using Maven and Neo4j Embedded, simply add the following dependency to your project:

[source,xml]
----
<dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-ha</artifactId>
   <version>${neo4j-version}</version>
</dependency>
----
_Where +$\{neo4j-version}+ is the Neo4j version used._


If you prefer to download the jar files manually, they are included in the http://neo4j.org/download/[Neo4j distribution].

The difference in code when using Neo4j-HA is the creation of the graph database service.

[source,java]
----
GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );
----

The configuration can contain the standard configuration parameters (provided as part of the +config+ above or
in 'neo4j.properties'), but will also have to contain:

[source]
----
#HA instance1
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1331
----

First we need to create a database that can be used for replication. The easiest way is to start a normal embedded graph database, point it at a path, and shut it down again.

[source,java]
----
Map<String,String> config = HighlyAvailableGraphDatabase.loadConfigurations( configFile );
GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );
----


The config file created above sets the server id to 1 and enables the remote shell. The main method is expected to take the path to the database as its first parameter and the configuration file as its second parameter.

It should now be possible to connect to the instance using <<shell>>:

[source,shell]
----
neo4j_home1$ ./bin/neo4j-shell -port 1331
NOTE: Remote Neo4j graph database service 'shell' at port 1331
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently master
Connected slaves:
----

Since it is the first instance to join the cluster, it is elected master. Starting another instance requires a second configuration and another path to the database.

[source]
----
#HA instance2
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6002

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1332
----

Now start the shell connecting to port 1332:

[source,shell]
----
neo4j_home1$ ./bin/neo4j-shell -port 1332
NOTE: Remote Neo4j graph database service 'shell' at port 1332
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently slave
----

