We are setting up a simple two-node Cassandra cluster. I'll skip the installation itself, since plenty of articles already cover it.
The guide below covers configuring the nodes so they are aware of each other, and creating a keyspace that is physically replicated on both nodes. Note that setting up a cluster does not by itself mean your data is duplicated: without replication, if one of the nodes goes down, your data may not be accessible.
Physical redundancy is an important aspect of the services we design: if one of the nodes goes down, the service should keep working without any downtime.
node-1:
IP: 10.3.185.234
node-2:
IP: 10.3.185.239
Keys to configure in cassandra.yaml:
cluster_name:
authenticator:
authorizer:
- seeds:
listen_address:
start_rpc: true
rpc_address: 0.0.0.0
broadcast_rpc_address:
# Make sure to use GossipingPropertyFileSnitch in a production environment, and remember to comment out the default endpoint_snitch: SimpleSnitch line. Otherwise you will have a miserable time setting up authentication.
endpoint_snitch: GossipingPropertyFileSnitch
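Pulling those keys together, a node-1 cassandra.yaml fragment might look like the following. This is a sketch, not a complete config: the cluster name and the choice of PasswordAuthenticator/CassandraAuthorizer are illustrative; the IPs are the node addresses from above.

```yaml
# cassandra.yaml fragment for node-1 (10.3.185.234)
cluster_name: 'MyCluster'            # must be identical on both nodes (name is illustrative)
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # in a two-node cluster, list both nodes as seeds
      - seeds: "10.3.185.234,10.3.185.239"
listen_address: 10.3.185.234         # this node's own IP; 10.3.185.239 on node-2
start_rpc: true
rpc_address: 0.0.0.0                 # accept client connections on all interfaces
broadcast_rpc_address: 10.3.185.234  # the address advertised to clients
# endpoint_snitch: SimpleSnitch      # default, commented out in favour of the line below
endpoint_snitch: GossipingPropertyFileSnitch
```

With GossipingPropertyFileSnitch, each node additionally reads its data center and rack from conf/cassandra-rackdc.properties (e.g. dc=DC1 and rack=RAC1).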
After these configurations are done on both nodes, restart Cassandra and run:
Command: nodetool status
Output:
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  10.200.239.42  9.69 GB   256     ?     c2b53de3-4371-4560-90f8-d5bec61d39d6  RAC1
UN  10.200.239.43  10.06 GB  256     ?     7634ac38-b8a6-4185-b087-6faa13dec19e  RAC1
The output of nodetool status should show both nodes, similar to the above. Our nodes are now aware of each other.
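Once you start monitoring the cluster, it's handy to script this check. Here is a minimal sketch in Python (this parser is my own helper for the guide, not part of Cassandra's tooling) that scans nodetool status output for node lines and confirms every node reports UN (Up/Normal):

```python
# Sketch: parse `nodetool status` output and confirm every node is Up/Normal.

def all_nodes_up_normal(status_output: str) -> bool:
    """Return True if every node line in `nodetool status` output starts with 'UN'."""
    # Node lines begin with a two-letter status/state code: U or D, then N/L/J/M.
    codes = ("UN", "UL", "UJ", "UM", "DN", "DL", "DJ", "DM")
    nodes = [line for line in status_output.splitlines()
             if line.strip()[:2] in codes]
    return bool(nodes) and all(line.strip().startswith("UN") for line in nodes)

sample = """Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  10.200.239.42  9.69 GB   256     ?     c2b53de3-4371-4560-90f8-d5bec61d39d6  RAC1
UN  10.200.239.43  10.06 GB  256     ?     7634ac38-b8a6-4185-b087-6faa13dec19e  RAC1
"""

print(all_nodes_up_normal(sample))  # True when both nodes show UN
```

In practice you would feed it the captured output of `nodetool status` (e.g. via subprocess) and alert when it returns False.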
Setting up a replicated keyspace
Use cqlsh to connect to Cassandra on either node and create a keyspace with the command:
CREATE KEYSPACE my_dataspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2'} AND durable_writes = true;
Notice the following parameters, which enable replication for the keyspace:
{'class': 'NetworkTopologyStrategy', 'DC1': '2'}
'DC1': '2' tells Cassandra to keep two replicas of the data in data center 'DC1', i.e. one full copy on each of our two nodes.
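If you want to double-check the replication settings after creating the keyspace, the schema itself is queryable from cqlsh. The query below assumes Cassandra 3.0+, where keyspace metadata lives in the system_schema tables (older 2.x versions keep it in system.schema_keyspaces instead):

```sql
-- Inspect the keyspace's replication settings from any node.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'my_dataspace';

-- The replication map should show NetworkTopologyStrategy with 'DC1': '2'.
```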
Verifying replication
Ok, so you have created a keyspace on a single node. You can verify replication by logging on to the other node (where you didn't create the keyspace) and checking that the keyspace is indeed defined there as well.
To see all the keyspaces, run:
cqlsh> DESCRIBE KEYSPACES;
Make sure the keyspace exists on the node where you didn't define it.
Now create a table on this node:
CREATE TABLE requests (
    id bigint PRIMARY KEY,
    request text
);
and verify that the other node gets the table definition as well:
cqlsh> DESCRIBE TABLES;
requests
Ok, great: if the other node has the table definition, you have successfully set up Cassandra replication.
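Schema changes propagate through a separate mechanism from data, so as a final sanity check you may want to confirm that rows themselves are replicated. A quick way, assuming the my_dataspace keyspace and requests table above: write a row on one node, then read it back from the other. CONSISTENCY is a cqlsh command; setting it to ALL forces the read to get a response from every replica.

```sql
-- In cqlsh on node-1:
USE my_dataspace;
INSERT INTO requests (id, request) VALUES (1, 'hello from node-1');

-- In cqlsh on node-2:
USE my_dataspace;
CONSISTENCY ALL;   -- require a response from both replicas
SELECT * FROM requests WHERE id = 1;
-- The row coming back at CONSISTENCY ALL confirms both replicas hold it.
```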