
java.lang.NoSuchMethodError: org.apache.spark.sql.internal.SharedState.externalCatalog()Lorg/apache/spark/sql/catalyst/catalog/ExternalCatalog; #3

Open
NithK45 opened this issue Jun 13, 2018 · 19 comments

Comments

@NithK45

NithK45 commented Jun 13, 2018

Describe the bug
I am running standalone Spark 2.3 on an EC2 instance. I also have standalone Hive on the same instance; the ranger-hive-plugin is set up and its policies work fine over a Hive connection.

I carefully followed your instructions to set up Ranger for Spark SQL. The only thing I did not do is modify ExperimentalMethods.scala, presuming it is not required for testing.

Also, I built the spark-authorizer jar using "mvn clean package -Pspark-2.3".

To Reproduce
Steps to reproduce the behavior:

  1. Copied the ranger*.xml conf files to $SPARK_HOME/conf and the ranger-hive*.jars to $SPARK_HOME/jars, along with the spark-authorizer jar. Gave full permissions to all xml and jar files, and modified the conf xml files as advised.
  2. Environment: standalone Spark, Hive, and Ranger, all on EC2.
  3. Ran a "show databases" command in spark-shell.
  4. See error (screenshot of the NoSuchMethodError stack trace)
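For reference, step 1 might look roughly like this on the EC2 instance. This is only a sketch: the exact jar and xml file names depend on the Ranger and spark-authorizer versions, and the globs below are illustrative.

```shell
# Copy the Ranger client configuration next to Spark's own conf files
cp ranger-hive-security.xml ranger-hive-audit.xml "$SPARK_HOME/conf/"

# Copy the Ranger Hive plugin jars and the built spark-authorizer jar
cp ranger-hive-plugin-*.jar spark-authorizer-*.jar "$SPARK_HOME/jars/"

# Make sure the Spark process can read them
chmod a+r "$SPARK_HOME"/conf/ranger-*.xml "$SPARK_HOME"/jars/*.jar
```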

Did I miss anything?

My end goal is for Spark SQL to be used from different SQL clients such as "squirrel sql client" or "cassandra" etc., with Hive policies enforced when they query the data. All clients will connect to Spark SQL using a string that looks like:
jdbc:hive2://hostname:10015/databasename;ssl=true;sslTrustStore=/pathtofile.jks;trustStorePassword=abcd
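A connection with that kind of URL can be smoke-tested from the command line with beeline (shipped with both Hive and Spark). The hostname, port, and truststore values below are the placeholders from the URL above, and the user name is hypothetical:

```shell
# Connect over JDBC and run a single statement, then exit
beeline -u "jdbc:hive2://hostname:10015/databasename;ssl=true;sslTrustStore=/pathtofile.jks;trustStorePassword=abcd" \
  -n someuser \
  -e "show databases;"
```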

@yaooqinn
Owner

Sorry, the master branch is not stable yet; most of the cases I verified are against Spark 2.1.2.
I would love to have it fixed as soon as my vacation ends. For now, you may switch to branch 2.3 or use the package I deployed for 2.3; more details are in the branch-2.3 README.

@yaooqinn
Owner

@NithK45 I have tested this with spark-shell on YARN using mvn clean package -Pspark-2.1, and it works fine with Spark 2.1.2/2.2.1/2.3.0. I don't have a standalone environment for testing, but I guess you are using cluster mode, and standalone mode doesn't have anything like YARN's distributed cache for deploying jars, so they may need to be copied manually to all worker nodes.
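A minimal sketch of that manual distribution for a standalone cluster, assuming hypothetical worker host names worker1 and worker2 and the same install path on every node:

```shell
# Standalone mode has no YARN-style distributed cache, so push the jar
# to the same path on every worker node, then restart the workers.
# Note: $SPARK_HOME expands on the local machine; this assumes the
# remote install path matches.
for host in worker1 worker2; do
  scp spark-authorizer-*.jar "$host:$SPARK_HOME/jars/"
done

# Alternatively, ship it per-application instead of per-cluster:
spark-shell --jars /path/to/spark-authorizer-2.1.1.jar
```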

@NithK45
Author

NithK45 commented Jun 21, 2018

@yaooqinn Thanks for testing that. I am using an EC2 large single node with standalone Spark 2.3 and Hive 2.4. Data is in S3. We don't have YARN or Hadoop set up. I copied your jar into the $SPARK_HOME/jars/ directory and restarted the Spark and Hive services.

I also modified your code to comment out the validation that causes the above error, then rebuilt and tested. It did not throw the error this time, but Ranger policies were not enforced from spark-shell.

I would also like to know whether you have tested this by connecting to Spark SQL through a SQL client (like SQuirreL SQL Client or SQL Developer tools) with a JDBC connection string?

@yaooqinn
Owner

For SQL clients, I use Kyuubi, a multi-tenant JDBC/ODBC server powered by Spark SQL.

@jacibreiro

Hi!

Same error here... NoSuchMethodError :-(. Is there any solution? Thanks!

@yaooqinn
Owner

yaooqinn commented Dec 9, 2018

@jacibreiro please try v2.1.1 and follow the doc https://yaooqinn.github.io/spark-authorizer/docs/install_plugin.html

@jacibreiro

@yaooqinn, thanks for your quick answer. I'm using v2.1.1 (built from the master branch)... I'm also using Hive 2.3.2, Spark 2.4, and Ranger 1.2... Maybe the problem is the Ranger version? Have you tested Ranger versions higher than 0.5?

@jacibreiro

With the pyspark shell I don't get the NoSuchMethodError, but it still doesn't work... I have followed all the steps in the manual, but it seems the plugin is not connecting to Ranger (maybe because of the version issue I mentioned in the previous post). I think it is not connecting because I can't see the cached policies. With Hive there is a script to enable the plugin, but here I don't know when the communication between Spark and Ranger starts... Maybe there is an extra step that is not in the documentation?
By the way, this is my ranger-hive-security.xml:

<property>
    <name>ranger.plugin.hive.policy.rest.url</name>
    <value>http://ranger-admin:6080</value>
</property>

<property>
    <name>ranger.plugin.hive.service.name</name>
    <value>cl1_hive</value>
</property>

<property>
    <name>ranger.plugin.hive.policy.cache.dir</name>
    <value>/tmp/cl1_hive/policycache</value>
</property>

<property>
    <name>ranger.plugin.hive.policy.pollIntervalMs</name>
    <value>5000</value>
</property>

<property>
    <name>ranger.plugin.hive.policy.source.impl</name>
    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
</property>

Do you see anything wrong?

Thanks!

@yaooqinn
Owner

@jacibreiro you are right. Higher versions of Ranger are built with higher Hive client jars than Spark's (1.2.1). We may have it fixed later via https://issues.apache.org/jira/browse/RANGER-2128

@jacibreiro

@yaooqinn I have used an old Ranger version (0.5.3) but it still doesn't work... I don't see anything either in the policy cache dir or in the audit plugins sheet (in Ranger). So it seems to be something related to the communication between Spark and Ranger, because it is not able to load the policies. I have followed all the steps described in https://yaooqinn.github.io/spark-authorizer/docs/install_plugin.html . Looking at my ranger-hive-security.xml (previous post), do you see anything missing?

@yaooqinn
Owner

yaooqinn commented Dec 19, 2018

@jacibreiro could you please give more detail about "still doesn't work..."?

@jacibreiro

Sure @yaooqinn, I mean that I have followed every single step in the manual:

  • installed Ranger 0.5.3
  • installed and configured the Hive plugin in Spark
  • enabled the Hive plugin in spark-defaults.conf

And when I query Hive from Spark using the pyspark shell (following the instructions), nothing related to authorization happens. The behaviour is the same as before applying the plugin; policies are not applied.

@yaooqinn
Owner

Maybe you should first check that the Ranger Admin is reachable, and let's start with the spark-sql script to see if anything goes wrong.
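Those two checks could look like this from the Spark host, reusing the Ranger Admin URL from the ranger-hive-security.xml posted earlier (http://ranger-admin:6080 is that config's value, not a known-good address):

```shell
# Is the Ranger Admin even reachable from this host? Print the HTTP status.
curl -s -o /dev/null -w "%{http_code}\n" http://ranger-admin:6080/

# Then run a simple query through the plain spark-sql script so any
# plugin or connectivity error shows up directly on the console.
"$SPARK_HOME/bin/spark-sql" -e "show databases;"
```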

@alcpinto

alcpinto commented Jan 7, 2019

@yaooqinn

I am getting the same error described in this thread.

(screenshot of the NoSuchMethodError stack trace)

Installation details:

  • Spark 2.4.0 (I am using the built-in Thrift Server)
  • Ranger 0.5.3
  • spark-authorizer 2.1.1

When I add spark-authorizer to the /jars folder or to the application dependencies, I can't connect to the Thrift Server. Even when I test the connection in the Ranger Admin UI I get the same error:

(screenshot of the same error in the Ranger Admin UI)

Without spark-authorizer I can connect without any problem.

I followed your steps but maybe I am missing something... Do you have any advice?

Thanks!

@yaooqinn
Owner

yaooqinn commented Jan 8, 2019

Hi @alcpinto

Spark 2.4 is not supported for now. I proposed pull request #14 to support 2.4.

You can build that branch with the command mvn clean package -Pspark-2.4 and try again.
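Fetching and building that pull request locally could look like this (the local branch name spark-2.4-support is arbitrary; GitHub exposes every pull request under the pull/<number>/head ref):

```shell
git clone https://github.com/yaooqinn/spark-authorizer.git
cd spark-authorizer

# Fetch pull request #14 into a local branch and check it out
git fetch origin pull/14/head:spark-2.4-support
git checkout spark-2.4-support

# Build against the Spark 2.4 profile
mvn clean package -Pspark-2.4
```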

@Neilxzn

Neilxzn commented Mar 5, 2019

Hi @yaooqinn
I am trying it on Spark 2.4 too, but it doesn't work. I use the superuser hadoop to use any database, and it always throws a permission error. On Spark 2.2 it works!

@Neilxzn

Neilxzn commented Mar 5, 2019

Hi @alcpinto,
did you solve the problem with Spark 2.4? If so, how?

@alcpinto

alcpinto commented Mar 11, 2019 via email

@chaogefeng

It works on my side with spark2.4.3, hive2.3.3, and ranger1.1.0. After merging the patch in, this error no longer appears.
