Apache Zeppelin: Use with remote Spark cluster and Yarn

Apache Zeppelin is pretty usefull for interactive programming using the web browser. It even comes with its own installation of Apache Spark. For further information you can check my earlier post.
But the real power in using Spark with Zeppelin lies in its easy way to connect it to your existing Spark cluster using YARN. The following steps are necessary:

Copy your Hadoop configuration files to your Zeppelin installation under $ZEPPELIN_HOME/conf
Restart your Zeppelin Notebook
Insert the value “yarn-client” into the field master in the spark interpreter, as shown in the picture below.

After these steps you can use your notebooks with spark running on a yarn cluster. So you can make use of all the resources in the queue you assigned spark on you cluster.

Posted

September 29, 2016

Big Data, Tools, Visualization

marc

Tags:

Apache Zeppelin: Use with remote Spark cluster and Yarn

Comments

Leave a Reply Cancel reply