eu.dicodeproject.analysis.examples
Class UnquotedArchiveToSequenceFile
java.lang.Object
  eu.dicodeproject.analysis.examples.MailArchiveToSequenceFile
    eu.dicodeproject.analysis.examples.UnquotedArchiveToSequenceFile
- All Implemented Interfaces:
- org.apache.hadoop.fs.PathFilter
public class UnquotedArchiveToSequenceFile
- extends MailArchiveToSequenceFile
Converts mbox archives to sequence files, ignoring all quoted content
to avoid duplicated text.
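This page does not show how quoted content is detected. As a rough illustration of the idea only, the sketch below assumes the common mail convention that quoted reply lines begin with `>`. The class `QuoteStripSketch` and the method `stripQuoted` are hypothetical names and are not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteStripSketch {

    // Hypothetical helper, not part of the documented API.
    // Assumption: a line is "quoted" when its first non-whitespace
    // character is '>' (the usual reply-quote marker in plain-text mail).
    static String stripQuoted(String body) {
        List<String> kept = new ArrayList<>();
        // Split on LF or CRLF; the -1 limit keeps trailing empty lines.
        for (String line : body.split("\\r?\\n", -1)) {
            if (!line.trim().startsWith(">")) {
                kept.add(line);
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String body = "Hi all,\n> earlier message text\n>> older text\nMy reply.";
        // Prints the two unquoted lines: "Hi all," and "My reply."
        System.out.println(stripQuoted(body));
    }
}
```

A filter this simple would also drop mbox-escaped lines such as `>From `, so a real implementation would probably need a more careful rule.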
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
UnquotedArchiveToSequenceFile
public UnquotedArchiveToSequenceFile(org.apache.hadoop.conf.Configuration conf,
String prefix,
org.apache.mahout.text.ChunkedWriter writer,
Charset charset)
throws IOException
- Throws:
IOException
Copyright © 2011. All Rights Reserved.