Tuesday, December 16, 2014

Simple Python application on Apache Spark Cluster


As an exercise, I am working on duplicating my previous examples in Python.  It is clear that Python has, and is gaining, traction in the data world.  So, it makes sense to have a working knowledge of it.

As with my other examples, everything will find its way to my GitHub repositories; forks and enhancements are welcome.

This effort is about the simplest possible example of submitting a Python application to a Spark cluster.  But when you move beyond the REPL, you have to start somewhere, right?


Prerequisites

  • Java 1.7+ (Oracle JDK required)
  • A Spark cluster (how to do it here.)
  • git

Clone the Example

Begin by cloning the example project from GitHub (super-simple-spark-python-app), then cd into the project directory.

[bkarels@ahimsa work]$ git clone git@github.com:bradkarels/super-simple-spark-python-app.git
Initialized empty Git repository in /home/bkarels/work/super-simple-spark-python-app/.git/
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 13 (delta 3), reused 6 (delta 1)
Receiving objects: 100% (13/13), done.
Resolving deltas: 100% (3/3), done.
[bkarels@ahimsa work]$ cd super-simple-spark-python-app/

Move the file tenzingyatso.txt to your home directory.
[bkarels@ahimsa super-simple-spark-python-app]$ mv tenzingyatso.txt ~

In simple.py, modify the path to the sample file so that it points to your home directory (save and close).

Change:
file = sc.textFile("/home/bkarels/tenzingyatso.txt")
to:
file = sc.textFile("/home/yourUserNameHere/tenzingyatso.txt")
...or some such similar thing.
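If you would rather not hard-code a user name at all, the path can be built from the current user's home directory. This is only a sketch: the helper name sample_file_path is hypothetical and is not part of the example project, and it assumes the file was moved to your home directory as described above.

```python
import os


def sample_file_path(filename="tenzingyatso.txt"):
    """Return the path of `filename` inside the current user's home directory.

    Hypothetical helper for illustration; not part of the example project.
    """
    return os.path.join(os.path.expanduser("~"), filename)


# In simple.py this could stand in for the hard-coded string, e.g.:
#   file = sc.textFile(sample_file_path())
print(sample_file_path())
```

On a local cluster where the driver and the workers run as the same user on the same machine, this resolves to the same file as the hard-coded path.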

Spark it up (with Python)!

If your local Spark cluster is not up and running, do that now.  If you need to review how to go about that, you can look here.

Make Sparks fly! (i.e. run it)

Since this example does not have a packaged application (e.g. jar, egg, etc.), we can invoke spark-submit with just our simple Python file.

[bkarels@ahimsa super-simple-spark-python-app]$ $SPARK_HOME/bin/spark-submit --master spark:// ./simple.py

The console output should show a line count of 7, wrapped in a nice battery of asterisks, along with the text of the first line of the example file.  If you see that, it worked.
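
For readers who want to see the shape of such a script without cloning the repository, here is a minimal sketch of what a file like simple.py might contain. It is not the actual file from the project: the app name, the format_report helper, and the exact layout of the output are assumptions made for illustration. Only sc.textFile and the sample path come from the steps above.

```python
def format_report(line_count, first_line):
    """Build the console report: a line count and the first line, framed by asterisks.

    Hypothetical helper; the real simple.py may format its output differently.
    """
    border = "*" * 40
    return "\n".join([
        border,
        "Line count: %d" % line_count,
        "First line: %s" % first_line,
        border,
    ])


def main():
    # pyspark is imported here so format_report can be used without Spark installed.
    try:
        from pyspark import SparkConf, SparkContext
    except ImportError:
        print("pyspark not found; run this script through spark-submit")
        return

    # The master URL is supplied by spark-submit (--master), so it is not set here.
    conf = SparkConf().setAppName("super-simple-spark-python-app")  # assumed app name
    sc = SparkContext(conf=conf)
    try:
        file = sc.textFile("/home/yourUserNameHere/tenzingyatso.txt")
        print(format_report(file.count(), file.first()))
    finally:
        sc.stop()


if __name__ == "__main__":
    main()
```

Keeping the formatting in a plain function means the report layout can be checked without a running cluster, while count() and first() are the only calls that actually trigger work on Spark.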

