If the Datalake you're working with has Kerberos enabled, you will need to adapt your jobs to log in to Kerberos.
Technologies on Saagie have been updated so that you have the required libraries to log in with a kinit before launching your job. To do that, simply call the kinit command with your login and password in the command line of your job, as shown in the following example (here for a Python job):
Note the trick with | that avoids being prompted for your password when logging in:
echo $MY_PASSWORD | kinit $MY_LOGIN
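If you prefer to run the kinit from inside your Python code rather than in the job's command line, a minimal sketch could look like the following (it assumes your login and password are exposed through the MY_LOGIN and MY_PASSWORD environment variables, as above; adapt the names to your setup):

# Minimal sketch: obtaining a Kerberos ticket from inside a Python job.
# MY_LOGIN and MY_PASSWORD are assumed to be environment variables
# provided to the job; adapt the names to your own configuration.
import os
import subprocess

login = os.environ["MY_LOGIN"]
password = os.environ["MY_PASSWORD"]

# Equivalent to: echo $MY_PASSWORD | kinit $MY_LOGIN
# Passing the password on stdin avoids the interactive prompt.
subprocess.run(["kinit", login], input=password.encode(), check=True)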
After this kinit, some technologies (e.g. Spark) work transparently with Kerberos, and no further code changes are required. For others, you'll need to modify your code, for instance to update the WebHDFS port or your Hive JDBC URL. The following articles show you how to do that for each technology (a WebHDFS sketch also follows this list):
WebHDFS
    Read and Write files to/from HDFS
Java/Scala
    Read and Write files to/from HDFS
Python
    Read and Write files to/from HDFS
Talend
R
    Read and Write files to/from HDFS
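To give an idea of the kind of change involved, here is a minimal sketch of reading a file over Kerberos-secured WebHDFS with the Python hdfs package's Kerberos extension (the host name, port, and file path are placeholders; check your cluster's actual secure WebHDFS endpoint):

# Minimal sketch: reading a file over WebHDFS with Kerberos.
# Requires the hdfs package with its Kerberos extra
# (pip install hdfs[kerberos]) and a valid ticket obtained via kinit.
from hdfs.ext.kerberos import KerberosClient

# Placeholder endpoint: replace with your cluster's secure WebHDFS URL.
client = KerberosClient('http://namenode.example.com:50070')

# Placeholder path: replace with a file your user can read.
with client.read('/user/my_user/my_file.csv') as reader:
    content = reader.read()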