How to fix the "executor.CoarseGrainedExecutorBackend: Driver Disassociated" application failure?

When I run a SQL query through spark-submit and spark-sql, the corresponding Spark application always fails with the following error:

15/03/10 18:50:52 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://[email protected]:60697/user/HeartbeatReceiver
15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:35643] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.

The above is only one of the errors. I ran "yarn logs -applicationId application_1425944520319_8102" and saved the output to application_1425944520319_8102.log to get the whole application log, then grepped out the errors, as shown below:

Line 46: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:55156] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 97: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:32852] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 149: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:45654] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 200: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:45702] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 251: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:21596] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 302: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:58845] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 353: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:1697] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 437: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 481: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 3.0 in stage 0.0 (TID 10)
Line 504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:6289] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:37070] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 607: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:43424] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 658: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:38083] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 710: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:3106] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 761: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:35533] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 812: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:63207] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 863: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:11250] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 910: 15/03/10 18:52:09 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
Line 961: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:26917] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1012: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:3058] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1063: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:1885] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1114: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:14795] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1165: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:39794] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1216: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:19614] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1267: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:38776] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1318: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:19231] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1370: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:18816] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1454: 15/03/10 18:52:06 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.
Line 1498: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.0 in stage 0.0 (TID 18)
Line 1524: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.1 in stage 0.0 (TID 28)
Line 1550: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.2 in stage 0.0 (TID 31)
Line 1576: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.3 in stage 0.0 (TID 32)
Line 1602: 15/03/10 18:52:06 ERROR executor.Executor: Exception in task 1.4 in stage 0.0 (TID 33)
Line 1628: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.5 in stage 0.0 (TID 36)
Line 1654: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.6 in stage 0.0 (TID 37)
Line 1680: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.7 in stage 0.0 (TID 39)
Line 1706: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.8 in stage 0.0 (TID 41)
Line 1732: 15/03/10 18:52:07 ERROR executor.Executor: Exception in task 1.9 in stage 0.0 (TID 42)
Line 1755: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:24322] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1806: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:38508] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1858: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:19707] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1909: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:33683] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 1976: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:18587] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2027: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:64531] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2078: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:23333] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2129: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:61136] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2180: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:25118] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2231: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:16274] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2282: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:1324] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2334: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:51664] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2385: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:38854] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2452: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:30088] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2504: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:30778] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2556: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:52263] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2623: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:17806] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2674: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:3251] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2725: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:17832] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2776: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:11629] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2827: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://[email protected]:22629] -> [akka.tcp://[email protected]:60697] disassociated! Shutting down.
Line 2911: 15/03/10 18:52:07 WARN hdfs.DFSClient: error creating legacy BlockReaderLocal.  Disabling legacy local reads.

If I have not explained this clearly, you can get the log file from https://www.dropbox.com/s/lf50ger18v3ngtb/application_1425944520319_8102.log?dl=0.

The network on slave75 is fine, and the hosts are configured correctly on all nodes. Any answer will help, thanks!

Answer 1

I finally found the cause. YARN kills the executor (container) because the executor uses too much memory. Just increase the value of spark.yarn.driver.memoryOverhead or spark.yarn.executor.memoryOverhead, or both.
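As a rough illustration of how these two settings could be passed, here is a minimal Python sketch that only builds a spark-submit argument list. The helper name, the script name my_query_job.py, and the values 1024 and 512 are assumptions for illustration, not values taken from the question. In the Spark 1.x releases discussed here, both properties take a plain number of megabytes.

```python
def spark_submit_args(app, executor_overhead_mb=None, driver_overhead_mb=None):
    """Build a spark-submit argument list with YARN memory overhead settings.

    Hypothetical helper: it only assembles arguments and does not run Spark.
    """
    args = ["spark-submit"]
    if executor_overhead_mb is not None:
        # Extra off-heap room YARN grants each executor container, in MB.
        args += ["--conf",
                 "spark.yarn.executor.memoryOverhead=%d" % executor_overhead_mb]
    if driver_overhead_mb is not None:
        # Same idea for the driver container, in MB.
        args += ["--conf",
                 "spark.yarn.driver.memoryOverhead=%d" % driver_overhead_mb]
    args.append(app)
    return args


if __name__ == "__main__":
    # Assumed values; tune them to what your containers actually need.
    print(" ".join(spark_submit_args("my_query_job.py", 1024, 512)))
```

The resulting list can be handed to subprocess.call, or the printed line can be pasted into a shell. If the containers are still killed, raising the values further and checking the YARN NodeManager logs for the reported memory limit is a reasonable next step.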

Answer 2

In my case, I resolved this problem by increasing the number of parallel tasks that read the data into the RDD.
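One way this might look in PySpark is sketched below. It assumes the data is read with SparkContext.textFile, whose minPartitions argument asks for at least that many input partitions (and therefore read tasks). The helper names and the 128 MiB target per partition are assumptions for illustration; the answer does not say how the task count was chosen.

```python
import math


def partitions_for(total_bytes, target_bytes_per_partition=128 * 1024 * 1024):
    """Pick a partition count so each read task handles roughly the target size.

    Hypothetical heuristic: the 128 MiB default is an assumed starting point.
    """
    return max(1, int(math.ceil(total_bytes / float(target_bytes_per_partition))))


def read_with_more_tasks(sc, path, total_bytes):
    """Read a text file into an RDD split across more partitions.

    `sc` is an existing pyspark SparkContext; `total_bytes` is the input size,
    which the caller is assumed to know (for example from `hdfs dfs -du`).
    """
    return sc.textFile(path, minPartitions=partitions_for(total_bytes))
```

With more partitions, each task holds less data in memory at once, which can keep an executor under the container limit. For an RDD that already exists, rdd.repartition(n) is another way to raise the task count, at the cost of a shuffle.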