Does Dumbo support custom input formats, such as WholeFileInputFormat.class, which treats the entire contents of a file as a single record? I compiled WholeFileInputFormat.java (from Hadoop: The Definitive Guide) and built a custom streaming jar containing WholeFileInputFormat.class alongside the other class files from hadoop-streaming.jar. I then ran the wordcount.py example in Dumbo with the -inputformat option set to WholeFileInputFormat, but I hit the following error:
"java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255"
Is there some more work that needs to be done to get custom input formats working in Dumbo?
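For context, here is a minimal sketch of the kind of word-count mapper and reducer involved. It is not the exact wordcount.py shipped with Dumbo, and it assumes (unverified) that with a whole-file input format the mapper receives the full file contents as one value, either as text or as raw bytes. The bytes handling is an illustrative guess at what might need to change, not a confirmed fix for the error above.

```python
def mapper(key, value):
    # Assumption: `value` is the entire file contents for one input file.
    # It may arrive as raw bytes rather than text, so normalise it first.
    if isinstance(value, bytes):
        value = value.decode("utf-8", errors="replace")
    # Emit one (word, 1) pair per whitespace-separated token.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # Sum the per-word counts emitted by the mapper.
    yield key, sum(values)


# With Dumbo installed, the job would normally be started with:
#     import dumbo
#     dumbo.run(mapper, reducer)
```

Calling the functions directly on a small sample (for example `list(mapper(None, b"a b a"))`) is a quick way to check whether the Python side fails on whole-file input before involving Hadoop.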