
I'm trying to get Hive on Spark working, but it seems like it is not loading hive-exec-2.0.1.jar. Hive on MR works perfectly fine. I'm using Hive 2.0.1 and Spark 1.6.1, and I followed the Hive on Spark tutorial. I set all the necessary properties in hive-site.xml (the relevant entries are sketched at the end of this question), linked the Spark assembly jar into the Hive lib folder, and have all the environment variables set (SPARK_HOME, etc.). I started the Spark master and worker, and also started hiveserver2 at DEBUG log level. I tried to run a simple query ("select count(*)...") and, as far as I can see in the Hive logs, it executes the spark-submit command with all the necessary arguments, including the hive-exec-2.0.1.jar file, but during execution I still get:


16/07/29 18:14:51 [RPC-Handler-3]: WARN rpc.RpcDispatcher: Received error message:io.netty.handler.codec.DecoderException: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:97)
    at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.hive.spark.client.Job
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 39 more
.
16/07/29 18:14:51 [RPC-Handler-3]: WARN client.SparkClientImpl: Client RPC channel closed unexpectedly.

Does anyone have any idea how to solve this issue? I have tried everything: setting the spark.jars property, linking the hive-exec-2.0.1.jar, setting the spark.executor.* properties, and so on.

It seems like it should work without any problems, but for some reason I can't get it to work.
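
In case it helps, here is roughly what the Hive-on-Spark part of my hive-site.xml looks like (a sketch, not my exact values - the master host and Spark home path below are placeholders):

<!-- switch the execution engine to Spark and tell Hive where Spark lives (placeholder host/paths) -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <value>spark://master-host:7077</value>
</property>
<property>
  <name>spark.home</name>
  <value>/opt/spark-1.6.1-bin-hadoop2.6</value>
</property>
<!-- Spark application settings that Hive passes through to the Spark job -->
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.serializer</name>
  <value>org.apache.spark.serializer.KryoSerializer</value>
</property>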

Is there anything else I could try?

  • Do you have a copy of the hive-site.xml file in the $SPARK_HOME/conf dir? Commented Jul 29, 2016 at 21:01
  • Yes, and the HiveContext initializes successfully. Commented Jul 30, 2016 at 4:16
  • It looks like Spark 1.6.1 is compatible with Hive 1.2.1 - you should try downgrading Hive to 1.2.1 - spark.apache.org/docs/1.6.1/… Commented Jul 30, 2016 at 4:38
  • Tried that as well - same problem. It seems that even though spark-submit includes the hive-exec jar, it still isn't being added to the executor classpath... any other ideas? Commented Jul 30, 2016 at 6:57
  • Solved it by adding the spark.driver.extraClassPath property and pointing it at hive-exec-1.2.1.jar. Commented Jul 30, 2016 at 19:06

3 Answers


These are the changes I had to make, along with copying hive-site.xml to the $SPARK_HOME/conf dir, when I set up Spark on the Cloudera VM:

Add these lines to the $SPARK_HOME/conf/classpath.txt file:

/home/cloudera/spark-1.2.1-bin-hadoop2.4/lib/spark-1.2.1-yarn-shuffle.jar
/usr/jars/hive-exec-1.2.1.jar

Add this property to the $SPARK_HOME/conf/spark-defaults.conf file (this points at the same assembly jar you copied to the Hive lib directory - I did not need to copy the assembly jar to the Hive lib myself):

spark.yarn.jar=local:/home/cloudera/spark-1.2.1-bin-hadoop2.4/lib/spark-assembly-1.2.1-hadoop2.4.0.jar

Also check that the Hive jar versions listed in classpath.txt match the version you are actually running, and that each location (absolute path) is valid.




I solved it by adding the spark.driver.extraClassPath property and pointing it at hive-exec-1.2.1.jar.
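
A minimal sketch of what that looks like in hive-site.xml (the jar path below is a placeholder - adjust it to wherever your Hive lib directory actually lives; Hive on Spark passes spark.* properties through to the Spark application):

<!-- make hive-exec visible on the Spark driver classpath (placeholder path) -->
<property>
  <name>spark.driver.extraClassPath</name>
  <value>/usr/local/hive/lib/hive-exec-1.2.1.jar</value>
</property>

The equivalent single line in $SPARK_HOME/conf/spark-defaults.conf would be spark.driver.extraClassPath=/usr/local/hive/lib/hive-exec-1.2.1.jar.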

  • Right after that I hit another issue: executing the query threw a java.lang.AbstractMethodError. In the end, the compatible version pairs turned out to be Spark 1.6.2 with Hive 2.0.1, or Spark 1.4.1 with Hive 1.2.1.

The version of Spark compatible with Hive 1.2 onwards is Spark 1.3.1. That is the only one I have managed to make work, and it is production ready. This is not built on any vendor's version. I presented this in a Hortonworks talk. Details from here

HTH,

Mich

  • Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline.
