I am testing a Spark Java application that uses the Impala JDBC driver to execute queries written in Impala syntax and writes the results to an Excel template file. The application works perfectly in local mode against a kerberized Cloudera cluster.
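For context, the JDBC part of the application connects and runs the queries roughly like this (this is only a sketch: the host, port, realm and query are placeholders, and I am assuming the Cloudera Impala JDBC 4.1 driver class name; the real values come from configuration.properties):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

private void runImpalaQuery(String query) throws Exception {
    // Kerberos-secured Impala connection (AuthMech=1); host and realm are placeholders
    Class.forName("com.cloudera.impala.jdbc41.Driver");
    String url = "jdbc:impala://impala-host.example.com:21050/default;"
            + "AuthMech=1;KrbRealm=EXAMPLE.COM;"
            + "KrbHostFQDN=impala-host.example.com;KrbServiceName=impala";
    try (Connection conn = DriverManager.getConnection(url);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(query)) {
        while (rs.next()) {
            // each row is written into the Excel template here
        }
    }
}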
The problem is that when the application is executed in yarn-cluster mode, it waits indefinitely for the Kerberos password, so I decided to include the necessary files in the spark-submit call:
spark-submit --verbose --master yarn-cluster \
  --files gss-jaas.conf,configuration.properties \
  --principal myPrincipal --keytab myKeytab \
  --name "app" --conf spark.executor.cores=2 \
  --conf spark.executor.memory=8G --conf spark.executor.instances=3 \
  --conf spark.driver.memory=256M \
  --class myClass output.jar queriesConfig.json configuration.properties
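For reference, a Krb5LoginModule entry for keytab-based login of the kind gss-jaas.conf is meant to provide typically looks like this (the entry name, keytab path and principal below are placeholders; in the code the entry name is referenced through the JDBC_DRIVER_JAAS constant):

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="myKeytab"
    principal="myUser"
    doNotPrompt=true
    storeKey=true
    debug=true;
};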
I also modified the Java code to perform a system call to kinit before the Kerberos login, in order to obtain automatic access, as happens when the application is executed in local mode. However, I had no luck with the following code:
System.setProperty("java.security.auth.login.config", prop.getProperty("jdbc.kerberos.jaas"));
System.setProperty("sun.security.jgss.debug", "true");
System.setProperty("sun.security.krb5.debug", "true");
System.setProperty("javax.security.auth.useSubjectCredsOnly", "true");
System.setProperty("java.security.debug", "gssloginconfig,configfile,configparser,logincontext");
System.setProperty("java.security.krb5.conf", prop.getProperty("kerberos.conf"));
if (prop.getProperty("ssl.enabled") != null && "true".equals(prop.getProperty("ssl.enabled"))) {
System.setProperty("javax.net.ssl.trustStore", prop.getProperty("trustStore.path"));
System.setProperty("javax.net.ssl.trustStorePassword", prop.getProperty("trustStore.password"));
}
StringBuffer output = new StringBuffer();
Process p;
try {
final String command = "kinit -k -t myKeytab myUser";
p = Runtime.getRuntime().exec(command);
p.waitFor();
BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
String line = "";
while ((line = reader.readLine())!= null) {
output.append(line + "\n");
}
lc = new LoginContext(JDBC_DRIVER_JAAS, new TextCallbackHandler());
if (lc != null) {
lc.login();
}
} catch (LoginException le) {
LOGGER.error("LoginException . " + le.getMessage(), le);
} catch (SecurityException se) {
LOGGER.error("SecurityException . " + se.getMessage(), se);
} catch (Exception e) {
LOGGER.error("EXCEPTION !!! " + e.getMessage(), e);
}
This produces the following error:
ERROR hive.JDBCHandler: Cannot create LoginContext. Integrity check on decrypted field failed (31) - PREAUTH_FAILED
javax.security.auth.login.LoginException: Integrity check on decrypted field failed (31) - PREAUTH_FAILED
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
What alternatives do I have, given that the queries must stay in Impala syntax? Is it possible to perform the Kerberos login when the application is executed in yarn-cluster mode?
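For example, would something along these lines be the right approach? This is only a sketch: I am assuming the keytab distributed by spark-submit is readable from the container's working directory under its plain file name, and it uses Hadoop's UserGroupInformation API instead of the raw kinit call:

import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

// Assumption: "myKeytab" is available in the container's working directory.
Configuration hadoopConf = new Configuration();
hadoopConf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(hadoopConf);
UserGroupInformation ugi =
        UserGroupInformation.loginUserFromKeytabAndReturnUGI("myUser", "myKeytab");

// Open the Impala JDBC connection inside the logged-in context
// (jdbcUrl is the same Impala JDBC URL used in local mode).
Connection conn = ugi.doAs((PrivilegedExceptionAction<Connection>) () ->
        DriverManager.getConnection(jdbcUrl));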