Hadoop Streaming - File Not Found Error
- Author: 一顿吃两碗
- Source: 51数据库
- 2023-01-06
Problem description
I am trying to run a hadoop-streaming Python job:
bin/hadoop jar contrib/streaming/hadoop-0.20.1-streaming.jar -D stream.non.zero.exit.is.failure=true -input /ixml -output /oxml -mapper scripts/mapper.py -file scripts/mapper.py -inputreader "StreamXmlRecordReader,begin=channel,end=/channel" -jobconf mapred.reduce.tasks=0
I made sure mapper.py has all the necessary permissions, but the job fails with:
Caused by: java.io.IOException: Cannot run program "mapper.py":
error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
... 19 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
at java.lang.ProcessImpl.start(ProcessImpl.java:91)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
I also tried copying mapper.py to HDFS and passing hdfs://localhost/mapper.py instead, but that does not work either. Any thoughts on how to fix this?
Recommended answer
Looking at the example on the HadoopStreaming wiki page, it seems that you should change
-mapper scripts/mapper.py -file scripts/mapper.py
to
-mapper mapper.py -file scripts/mapper.py
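since "shipped files go to the working directory". In other words, -file copies scripts/mapper.py into each task's working directory as mapper.py, so the relative path scripts/mapper.py does not exist there. You might also need to specify the Python interpreter directly, quoting the command so that it reaches -mapper as a single argument: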
-mapper "/path/to/python mapper.py" -file scripts/mapper.py
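The rule in this answer can be sketched as a small Python helper that derives the -mapper and -file arguments from the local script path. This is only an illustrative sketch: the function name streaming_mapper_args and its interpreter parameter are hypothetical and not part of Hadoop, and /path/to/python is the placeholder from the answer, assumed to exist on every task node.

```python
import os
import shlex


def streaming_mapper_args(local_script, interpreter=None):
    """Build the -mapper/-file arguments for a script shipped with -file.

    local_script: path to the script on the submitting machine.
    interpreter:  optional interpreter path to prefix (assumption: the same
                  path exists on every task node).
    """
    # -file ships the script into the task's working directory, so the
    # mapper command has to refer to it by base name only.
    shipped_name = os.path.basename(local_script)
    if interpreter is None:
        mapper_cmd = shipped_name
    else:
        mapper_cmd = "%s %s" % (interpreter, shipped_name)
    return ["-mapper", mapper_cmd, "-file", local_script]


if __name__ == "__main__":
    args = streaming_mapper_args("scripts/mapper.py", interpreter="/path/to/python")
    # shlex.quote keeps the two-word mapper command as one shell argument.
    print(" ".join(shlex.quote(a) for a in args))
    # prints: -mapper '/path/to/python mapper.py' -file scripts/mapper.py
```

The printed fragment can be pasted in place of the -mapper and -file options of the original command; the remaining options (-input, -output, -inputreader, and so on) stay unchanged.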
