Friday, November 21, 2014

Get your Cassandra on with ccm (Cassandra Cluster Manager)

UPDATE! (2014-12-9)  After a bit of experimentation, it seems that running the Spark-Fu examples against Cassandra clusters spun up with CCM may be the cause of the performance issues I have been experiencing.  A single-node Cassandra "cluster" installed from the tarball has given me the kind of performance I would expect for these simple examples on a laptop.  I followed the great tutorial from DataStax Academy to set this up.  You will need to sign up for the Academy, but it's free and has plenty of great content, so I have no issues recommending it.  That said, using CCM to experiment with Cassandra clusters locally otherwise seems wonderful and stable.

To do things with Apache Spark on Cassandra, we first need to have Cassandra.  The best/fastest way (that I know of) to get a Cassandra cluster running locally for prototyping is to use CCM.

What is CCM? 

CCM is the Cassandra Cluster Manager.

What does CCM do?

CCM creates multi-node clusters for development and testing on a local machine.  It is not meant for production use.

How do I get started with all this CCM voodoo?

I am running all of this on CentOS 6.5 - directions will vary for other environments.  I also assume you have Java 7 or higher installed.
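A quick way to check the Java prerequisite from a shell is sketched below. This is only an illustration: the helper name java_major and the variable found are made up here, and the version parsing assumes the usual first line of `java -version` output (for example `java version "1.7.0_71"`).

```shell
# Hypothetical helper: reduce a Java version string to its major number.
# Handles the legacy "1.7.0_71" scheme and the newer "11.0.2" scheme.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: drop the leading "1."
  esac
  echo "${v%%[._-]*}"     # keep everything before the first ., _ or -
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so send stderr into the pipe
  found="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  if [ "$(java_major "$found")" -ge 7 ] 2>/dev/null; then
    echo "Java $found meets the Java 7 or higher requirement"
  else
    echo "Java '$found' may be too old (need 7 or higher)"
  fi
else
  echo "java not found on PATH"
fi
```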

Step 1: Download & install the EPEL release package:

See -> https://fedoraproject.org/wiki/EPEL

Download the package (e.g. CentOS 6.x)

http://mirror.metrocast.net/fedora/epel/6/i386/epel-release-6-8.noarch.rpm

Install:
[bkarels@ahimsa ~]$ sudo rpm -Uvh epel-release-6-8.noarch.rpm

Step 2: Install python-pip:

[bkarels@ahimsa ~]$ sudo yum -y install python-pip

[bkarels@ahimsa ~]$ pip install cql PyYAML

Step 3: Install Apache Ant (CCM depends on ant)

See ant.apache.org for install instructions.
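Before moving on to the ccm install, it can help to confirm that the tools from the steps so far (plus git, which the next step uses) are actually on your PATH. The sketch below is an illustration only; missing_tools is a made-up helper name, not something ccm provides.

```shell
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# ant and pip come from the steps above; git is needed to clone ccm.
missing="$(missing_tools ant pip git)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line
  echo "Missing before installing ccm:" $missing
else
  echo "ant, pip and git are all on PATH"
fi
```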

Step 4: Install ccm (Cassandra Cluster Manager):


[bkarels@ahimsa ~]$ git clone https://github.com/pcmanus/ccm.git
[bkarels@ahimsa ~]$ cd ccm/
[bkarels@ahimsa ccm]$ sudo ./setup.py install

Step 5 (Optional): Get Help

To get help (this is really a great way to dig into ccm):
[bkarels@ahimsa ~]$ ccm --help
[bkarels@ahimsa ~]$ ccm [command] --help

Step 6: Do some stuff

CCM has two primary types of operations:
  1. Cluster commands
  2. Node commands

Cluster commands take the form:
$ ccm [cluster command] [options]

Node commands take the form:
$ ccm [node name] [node command] [options]

So, let's spin up a three-node local cluster real quick like:

[bkarels@ahimsa ~]$ ccm create cluster0 -v 2.0.11
Downloading http://archive.apache.org/dist/cassandra/2.0.11/apache-cassandra-2.0.11-src.tar.gz to /tmp/ccm-bwFLa4.tar.gz (10.836MB)
  11362079  [100.00%]
Extracting /tmp/ccm-bwFLa4.tar.gz as version 2.0.11 ...
Compiling Cassandra 2.0.11 ...
Current cluster is now: cluster0
[bkarels@ahimsa ~]$ ccm list
 *cluster0
[bkarels@ahimsa ~]$ ccm populate --nodes 3
[bkarels@ahimsa ~]$ ccm start
[bkarels@ahimsa ~]$ ccm status
Cluster: 'cluster0'
-------------------
node1: UP
node3: UP
node2: UP
[bkarels@ahimsa ~]$ ccm node2 stop
[bkarels@ahimsa ~]$ ccm status
Cluster: 'cluster0'
-------------------
node1: UP
node3: UP
node2: DOWN
[bkarels@ahimsa ~]$

And just like that you have a three-node cluster on your local machine that you can start to play with.  Of course, this set of instructions barely scratches the surface of what is possible, but our focus is Spark, so this is just to give us something we can read from and write to.
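The walkthrough above can be collapsed into a small script. This is a minimal sketch, not part of ccm itself: the variable names CLUSTER, VERSION, NODES and DRY_RUN and the helpers run, spin_up and down_nodes are made up for illustration, and it only issues ccm commands already shown in the transcript. It defaults to a dry run that just prints each command; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch: replay the walkthrough's ccm commands as one script.
# All variable and function names here are hypothetical.
CLUSTER="${CLUSTER:-cluster0}"
VERSION="${VERSION:-2.0.11}"
NODES="${NODES:-3}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them

# Echo the command, then execute it unless we are in dry-run mode.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Create, populate and start the cluster, stopping at the first failure.
spin_up() {
  run ccm create "$CLUSTER" -v "$VERSION" &&
  run ccm populate --nodes "$NODES" &&
  run ccm start &&
  run ccm status
}

# Count nodes reported as DOWN in `ccm status` output read from stdin.
# grep -c exits non-zero when the count is 0, hence the `|| true`.
down_nodes() {
  grep -c ': DOWN$' || true
}

spin_up
```

Once the cluster is really running, something like `ccm status | down_nodes` gives a number you can test in a script, based on the "nodeN: UP" / "nodeN: DOWN" lines shown in the status output above.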

FIN
