Sqoop export Oozie workflow fails with File Not Found, but works when run from the console

Kirk Allen

I have a Hadoop cluster with 6 nodes. I'm pulling data out of MSSQL and back into MSSQL via Sqoop. Sqoop import commands work fine, and I can run a Sqoop export command from the console (on one of the Hadoop nodes). Here's the shell script I run:

#!/bin/bash
SQLHOST=sqlservermaster.local
SQLDBNAME=db1
HIVEDBNAME=db1
BATCHID=
USERNAME="sqlusername"
PASSWORD="password"

sqoop export \
    --connect "jdbc:sqlserver://$SQLHOST;username=$USERNAME;password=$PASSWORD;database=$SQLDBNAME" \
    --table ExportFromHive \
    --columns col1,col2,col3 \
    --export-dir /apps/hive/warehouse/$HIVEDBNAME.db/hivetablename
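For context, a trimmed-down sketch of the Oozie shell-action workflow I use to launch this script looks roughly like the following (the workflow/action names and the script path are illustrative placeholders, not my exact configuration):

    <!-- workflow.xml: minimal shell-action sketch; names and paths are placeholders -->
    <workflow-app name="sqoop-export-wf" xmlns="uri:oozie:workflow:0.4">
        <start to="export-shell"/>
        <action name="export-shell">
            <shell xmlns="uri:oozie:shell-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>export.sh</exec>
                <!-- Ship the script to the task's working directory -->
                <file>${wfPath}/export.sh#export.sh</file>
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Sqoop export failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
        </kill>
        <end name="end"/>
    </workflow-app>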

When I run this command from an Oozie workflow, passing it the same parameters, I receive the following error (found by digging into the actual job logs from the YARN scheduler screen):

2015-10-01 20:55:31,084 WARN [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job init failed
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1312)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1080)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1519)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563)
    ... 17 more

Has anyone seen this and managed to troubleshoot it? It only happens from the Oozie workflow. There are similar topics out there, but no one seems to have solved this specific problem.

Thanks!

Kirk Allen

I was able to solve this problem by setting the user.name property in the job.properties file for the Oozie workflow to the user yarn:

user.name=yarn

I think the problem was that the job did not have permission to create the staging files under /user/root. Once I changed the running user to yarn, the staging files were created under /user/yarn, which did have the proper permissions.
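For anyone hitting the same thing, the relevant part of my job.properties ended up looking roughly like this (the JobTracker port and application path are placeholders, not my real values; the NameNode address is the one from the error above):

    # job.properties for the Oozie workflow
    nameNode=hdfs://hadoopnode1:8020
    jobTracker=hadoopnode1:8050
    oozie.wf.application.path=${nameNode}/oozie/workflows/sqoop-export
    oozie.use.system.libpath=true

    # Run the workflow as the yarn user so staging files land under /user/yarn
    user.name=yarn

An alternative would presumably be to give the original user a home directory with the right ownership (e.g. having the hdfs superuser run `hdfs dfs -mkdir -p /user/root` and `hdfs dfs -chown root /user/root`), but switching the running user to yarn was simpler in my case.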


