
I'm trying to get Spark and SparkR running on a small EC2 cluster using the provided scripts and directions. Whenever I ask for an operation that requires computation on an RDD (e.g., collect(), reduce()), I get the error logged below. The workers do appear to start up correctly: if I only call parallelize(), I can see them running via the master's web UI.
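
For concreteness, here is roughly what I am running (a minimal sketch against the 2015-era SparkR API; the master URL is a placeholder):

    library(SparkR)
    sc <- sparkR.init(master = "spark://<master-host>:7077")

    rdd <- parallelize(sc, 1:1000, 4)  # succeeds; workers appear in the web UI
    collect(rdd)                       # this (or reduce(rdd, "+")) triggers the timeout below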

The error is similar to the one in "Intermittent Timeout Exception using Spark", and I've been through all of the solutions there (modifying the conf files for the URLs, disabling the firewall, etc.) with no luck.
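
In case it helps diagnose, these are the kinds of changes I tried (a sketch; the property and variable names are from the Spark 1.x docs, the values and hostnames are illustrative placeholders):

    # conf/spark-env.sh (on every node)
    export SPARK_MASTER_IP=<master-public-dns>
    export SPARK_LOCAL_IP=<this-node-internal-ip>

    # conf/spark-defaults.conf -- raise the Akka timeouts past the default 30s
    spark.akka.timeout                      300
    spark.core.connection.ack.wait.timeout  600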

Here is the error log (thank you in advance for your help):

15/02/17 19:10:22 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/02/17 19:10:22 INFO spark.SecurityManager: Changing view acls to: root,-
15/02/17 19:10:22 INFO spark.SecurityManager: Changing modify acls to: root,-
15/02/17 19:10:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, -); users with modify permissions: Set(root, -)
15/02/17 19:10:23 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/17 19:10:23 INFO Remoting: Starting remoting
15/02/17 19:10:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@-.ec2.internal:60218]
15/02/17 19:10:23 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 60218.
15/02/17 19:10:53 ERROR security.UserGroupInformation: PriviledgedActionException as:- cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1134)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:115)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:161)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:127)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
        ... 7 more

1 Answer


This was ultimately resolved by a combination of:

- updates to SparkR, which have resolved a number of serialization issues;

- recognizing that the spark-ec2 scripts require the control node and the master node to be the same machine; and

- replacing calls to parallelize() with distributing the data via HDFS and then loading it from there (see the sketch below).
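
A sketch of that last workaround, again assuming the SparkR API of the time (the HDFS path and port are placeholders, and the file is staged first with hadoop fs -put data.txt /data.txt):

    library(SparkR)
    sc <- sparkR.init(master = "spark://<master-host>:7077")

    # Instead of shipping a large local object with parallelize(sc, big_vector),
    # let each worker read its own partitions straight from HDFS:
    lines <- textFile(sc, "hdfs://<master-host>:9000/data.txt")
    count(lines)  # actions such as count()/collect()/reduce() now run without the timeout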

I am writing an intro to SparkR for R programmers that I hope will help people with things like this in the future.
