GitHub page: exemple-pyspark-read-and-write
Common part
Library dependencies
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession, HiveContext
Set Hive metastore URI
sparkSession = (SparkSession
.builder
.appName('example-pyspark-read-and-write-from-hive')
.enableHiveSupport()
.getOrCreate())
data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
df = sparkSession.createDataFrame(data)
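The builder call above enables Hive support but does not pass a metastore URI itself, so Spark relies on whatever the cluster configuration (typically hive-site.xml) provides. If the URI has to be set in code, a minimal sketch could look like the following. The helper names hive_options and build_hive_session and the address thrift://metastore-host:9083 are hypothetical placeholders, not part of the original example; hive.metastore.uris is the standard Hive property name.

```python
def hive_options(metastore_uri=None):
    """Return the extra builder options needed to reach a given metastore."""
    options = {}
    if metastore_uri:
        # Standard Hive property that points clients at the metastore service.
        options["hive.metastore.uris"] = metastore_uri
    return options


def build_hive_session(app_name, metastore_uri=None):
    """Create (or reuse) a Hive-enabled session, optionally with an explicit URI."""
    # Imported here so hive_options() can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in hive_options(metastore_uri).items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()


# Hypothetical usage; replace the address with your own metastore:
# sparkSession = build_hive_session(
#     'example-pyspark-read-and-write-from-hive',
#     'thrift://metastore-host:9083')
```

Keeping the options in a plain dictionary makes it easy to check what will be sent to the builder before a session is actually started.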
Creating Spark Session
# getOrCreate() reuses the Hive-enabled session built above if one already exists
sparkSession = SparkSession.builder.appName("example-pyspark-read-and-write").getOrCreate()
How to write a table into Hive?
Code example
# Write into Hive
df.write.saveAsTable('example')
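By default, saveAsTable raises an error when the target table already exists, so running the line above a second time fails. One possible way to make the write repeatable is to choose a save mode explicitly. The sketch below wraps that in a helper; the name write_table is hypothetical and only used for illustration.

```python
def write_table(df, table_name, mode="overwrite"):
    """Save df as a table using an explicit save mode.

    mode can be 'append', 'overwrite', 'ignore' or 'error' (the Spark default).
    """
    df.write.mode(mode).saveAsTable(table_name)


# Hypothetical usage with the DataFrame from the common part:
# write_table(df, 'example')            # replaces the table if it exists
# write_table(df, 'example', 'append')  # adds the rows to the existing table
```

Whether 'overwrite' or 'append' is the right choice depends on the job; 'overwrite' drops the existing data, so it is only a safe default for examples like this one.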
How to read a table from Hive?
Code example
This code shows only the first 20 rows of the table, which is the default for show().
# Read from Hive
df_load = sparkSession.sql('SELECT * FROM example')
df_load.show()
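To display a different number of rows, show() accepts a row count, and a table can also be loaded by name without writing SQL. The following is a small sketch under those assumptions; the helper name preview_table is hypothetical.

```python
def preview_table(spark, table_name, n=20):
    """Load a table by name, print its first n rows, and return the DataFrame."""
    df = spark.table(table_name)  # equivalent to SELECT * FROM <table_name>
    df.show(n)                    # prints at most n rows
    return df


# Hypothetical usage with the session from the common part:
# df_load = preview_table(sparkSession, 'example', n=5)
```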
How to use on Saagie?
Please refer to the Python application packaging guidelines.
How to use on Saagie's Jupyter Notebooks?
Before creating the Spark session, add the following snippet:
import os
os.environ["HADOOP_USER_NAME"] = "hdfs"
os.environ["PYTHON_VERSION"] = "3.5.2"
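Because these variables must be set before the session exists, one option is to group both steps in a single notebook cell so they cannot run in the wrong order. The sketch below assumes the same values as the snippet above; the function names configure_notebook_env and create_session are hypothetical.

```python
import os


def configure_notebook_env(hadoop_user="hdfs", python_version="3.5.2"):
    """Set the environment variables expected before the session is created."""
    os.environ["HADOOP_USER_NAME"] = hadoop_user
    os.environ["PYTHON_VERSION"] = python_version


def create_session(app_name="example-pyspark-read-and-write-from-hive"):
    """Configure the environment first, then create the Hive-enabled session."""
    configure_notebook_env()
    # Imported after the environment is set, and only when a session is needed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


# Hypothetical usage in a notebook cell:
# sparkSession = create_session()
```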