Custom Input File Formats #85

Open
sv2000 opened this issue May 26, 2014 · 0 comments

Does Dumbo support custom input formats such as WholeFileInputFormat, which treats the entire contents of a file as a single record? I compiled WholeFileInputFormat.java (from Hadoop: The Definitive Guide) and built a custom streaming jar containing WholeFileInputFormat.class alongside the other class files from hadoop-streaming.jar. I then ran the wordcount.py example in Dumbo with the -inputformat option set to WholeFileInputFormat, but the job fails with the following error:
"java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255"

Is there some more work that needs to be done to get custom input formats working in Dumbo?
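
For context, the "code 255" in that message is the exit status of the streaming subprocess, i.e. the Python script itself. One way to narrow this down is to check, outside Hadoop, that a word-count mapper copes with a whole file arriving as a single value. The following is only a minimal local sketch: `run_locally` is a hypothetical helper and not part of Dumbo, and the assumption that the value may arrive as raw bytes is a guess about what WholeFileInputFormat hands to the script.

```python
from collections import defaultdict


def mapper(key, value):
    # Assumption: with a whole-file input format the value is the entire
    # file, possibly as raw bytes rather than a line of text.
    if isinstance(value, bytes):
        value = value.decode("utf-8", errors="replace")
    for word in value.split():
        yield word, 1


def reducer(key, values):
    yield key, sum(values)


def run_locally(records):
    """Hypothetical stand-in for the map/shuffle/reduce cycle (not Dumbo)."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    result = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            result[out_key] = out_value
    return result


if __name__ == "__main__":
    # One record whose value is the full contents of a small "file".
    print(run_locally([(None, b"a b a\nc a")]))
```

If the mapper behaves locally, the failure is more likely in how the records are passed between the custom input format and the script than in the word-count logic itself.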
