CDH 6.0: pyspark fails at startup with "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder'"






Starting pyspark on a CDH cluster fails with the error "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':". The full output is as follows.

[root@xxxxx:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/bin] # pyspark
Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55)
[GCC 7.2.0] on linux2
Type 'help', 'copyright', 'credits' or 'license' for more information.
Setting default log level to 'WARN'.
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
  File '/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/shell.py', line 45, in
    spark = SparkSession.builder
  File '/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/sql/session.py', line 183, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File '/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py', line 1257, in __call__
  File '/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/sql/utils.py', line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u'Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':'

Cause: pyspark needs to access Hive data when it initializes, and the error occurs because the command is run as the root user.



Workaround: switch to the hive user to run the command.

[root@xxxxx:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/bin] # sudo -u hive pyspark
Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55)
[GCC 7.2.0] on linux2
Type 'help', 'copyright', 'credits' or 'license' for more information.
Setting default log level to 'WARN'.
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/05/23 20:26:01 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
19/05/23 20:26:03 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.2.0-cdh6.0.0
      /_/

Using Python version 2.7.15 (default, May 1 2018 23:32:55)
SparkSession available as 'spark'.
>>>
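For scripted launches, the same workaround can be expressed as a small helper that decides how to invoke pyspark based on who is calling it. This is only a sketch: the function name build_pyspark_command and its parameters are hypothetical, and it assumes that the hive account exists on the host and that sudo is permitted for the caller. It only builds the command and does not run it.

```python
def build_pyspark_command(current_user, run_as="hive"):
    """Return the argv list for launching the pyspark shell.

    When the caller is root, the command is wrapped in `sudo -u <run_as>`,
    mirroring the workaround above. Any other user launches pyspark directly.
    `run_as` defaults to "hive" (assumption: that account exists on the host).
    """
    if current_user == "root":
        return ["sudo", "-u", run_as, "pyspark"]
    return ["pyspark"]


# Print the command rather than executing it, so the sketch has no side effects.
print(" ".join(build_pyspark_command("root")))
```

The resulting list could be handed to subprocess.call if actually starting the shell from a wrapper script is desired.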

シャイリン