Apache Zeppelin on the Hortonworks 2.3 sandbox

A few notes from playing with Zeppelin on the Hortonworks HDP 2.3 sandbox.


Download the HDP 2.3 sandbox from the Hortonworks download site.

Install Maven first, since we’ll compile Zeppelin from source: download Apache Maven 3.3.3. I just downloaded the binary tar.gz and installed it into /usr/local/bin, since we only need it once.

Most of these instructions came from Introduction to Data Science with Apache Spark:

git clone https://github.com/apache/incubator-zeppelin.git
cd incubator-zeppelin
mvn clean install -DskipTests -Pspark-1.3 -Dspark.version=1.3.1 -Phadoop-2.6 -Pyarn
# go get a coffee, takes about 15 minutes to complete
cd conf
cp /etc/hive/conf/hive-site.xml .
cp zeppelin-env.sh.template zeppelin-env.sh
cp zeppelin-site.xml.template zeppelin-site.xml
vi hive-site.xml
# search for hive.metastore.client.connect.retry.delay
# change <value>5s</value> to <value>5</value>, otherwise you get issue #1 below
# search for hive.metastore.client.socket.timeout
# change to <value>1800</value>
# :wq
vi zeppelin-site.xml
# search for zeppelin.server.port
# change to <value>10008</value> otherwise it conflicts with ambari
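If you'd rather not edit the files by hand in vi, the same three changes can be scripted. This is only a rough sketch: set_xml_value is a helper name I made up, it assumes GNU sed, and it assumes each <value> sits on the line directly after its <name>, which is how these files are usually laid out. Run it from the conf directory after the cp steps.

```shell
#!/bin/sh
# Sketch: replace the <value> that follows a given <name> in a Hadoop-style XML file.
# Assumptions: GNU sed (-i), and <value> is on the line right after <name>.
set_xml_value() {
    # $1 = property name, $2 = new value, $3 = file to edit in place
    name_re=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the regex
    sed -i "/<name>${name_re}<\/name>/{n;s|<value>.*</value>|<value>$2</value>|;}" "$3"
}

# Same edits as the manual vi steps; skipped if the files are not present.
if [ -f hive-site.xml ]; then
    set_xml_value hive.metastore.client.connect.retry.delay 5 hive-site.xml
    set_xml_value hive.metastore.client.socket.timeout 1800 hive-site.xml
fi
if [ -f zeppelin-site.xml ]; then
    set_xml_value zeppelin.server.port 10008 zeppelin-site.xml
fi
```

It is worth diffing the result against the original afterwards, since a property laid out differently would be silently left unchanged.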

Issue #1: java.lang.NumberFormatException: For input string: "5s"

java.lang.NumberFormatException: For input string: "5s"

Fixed by editing conf/hive-site.xml and changing 5s to 5; see https://issues.apache.org/jira/browse/ZEPPELIN-93
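Other properties in the copied hive-site.xml may carry a unit suffix as well and could plausibly fail the same way. A quick way to look for them is sketched below; find_suffixed_values is a hypothetical helper name, and it assumes GNU grep and that the <name> line sits directly above its <value>.

```shell
#!/bin/sh
# Sketch: list <value> entries that look like a number plus a unit suffix (5s, 1800s, ...).
# -B1 also prints the preceding line, assumed here to hold the matching <name>.
find_suffixed_values() {
    # $1 = XML config file to scan
    grep -n -B1 -E '<value>[0-9]+[a-z]+</value>' "$1"
}

# Run from the conf directory; skipped if the file is not present.
if [ -f hive-site.xml ]; then
    find_suffixed_values hive-site.xml || echo "no suffixed values found"
fi
```

Not every hit necessarily needs changing; this only points at candidates to check if a similar NumberFormatException shows up.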

Issue #2: hql interpreter not found

hql interpreter not found
org.apache.zeppelin.notebook.NoteInterpreterLoader.get(NoteInterpreterLoader.java:148)
org.apache.zeppelin.notebook.Note.run(Note.java:267)
org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:534)
org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:119)
org.java_websocket.server.WebSocketServer.onWebsocketMessage(WebSocketServer.java:469)
org.java_websocket.WebSocketImpl.decodeFrames(WebSocketImpl.java:368)
org.java_websocket.WebSocketImpl.decode(WebSocketImpl.java:157)
org.java_websocket.server.WebSocketServer$WebSocketWorker.run(WebSocketServer.java:657)

I was trying to use %hql, but it should have been %hive.hql:

%hive.hql
select * from default.sample_08