A B C D E F G H I L M N P R S T U V W

A

accept(Path) - Method in class eu.dicodeproject.analysis.examples.MailArchiveToSequenceFile
Accepts all files in a directory, splits each file into individual mails and concatenates them, each prefixed with the mail's message ID.
ANALYZER_CLASS - Static variable in class eu.dicodeproject.analysis.hbase.HBaseDocumentProcessor
 

B

bytes() - Method in enum eu.dicodeproject.analysis.hbase.TweetCols
 

C

characters(char[], int, int) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
characters(char[], int, int) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
CleansingAnalyzer - Class in eu.dicodeproject.analysis.lucene
In contrast to the Lucene standard analyzer, this one also filters out tokens shorter than the minimum length (default: two characters) and tokens that contain only digits.
CleansingAnalyzer() - Constructor for class eu.dicodeproject.analysis.lucene.CleansingAnalyzer
Initializes the lower bound to its default value of 2.
CleansingAnalyzer(int, boolean) - Constructor for class eu.dicodeproject.analysis.lucene.CleansingAnalyzer
 
cleanup(Reducer<IntWritable, Text, ImmutableBytesWritable, Writable>.Context) - Method in class eu.dicodeproject.analysis.generic.GenericTableReducer
Writes data if below the limit.
createComponents(String, Reader) - Method in class eu.dicodeproject.analysis.lucene.TweetAnalyzer
 
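
The filtering rule described for CleansingAnalyzer can be sketched on plain strings. This is a minimal illustration only, assuming tokens arrive as a list of strings; the class and method names are hypothetical and it does not use the Lucene API that the real analyzer delegates to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the indexed API: shows the two filters only.
final class TokenCleansingSketch {

    // Keeps tokens that have at least minLength characters and are not digits-only.
    static List<String> cleanse(List<String> tokens, int minLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            boolean longEnough = token.length() >= minLength;
            boolean digitsOnly = !token.isEmpty()
                    && token.chars().allMatch(Character::isDigit);
            if (longEnough && !digitsOnly) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With a minimum length of 2, a one-character token and a token such as "2011" are dropped, while a mixed token such as "v2" is kept.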

D

DateHistogramDriver - Class in eu.dicodeproject.analysis.histogram
Generates an HBase table that contains creation date information for all tweets.
DateHistogramDriver() - Constructor for class eu.dicodeproject.analysis.histogram.DateHistogramDriver
 
DateHistogramMapper - Class in eu.dicodeproject.analysis.histogram
Reads creation dates from HBase and emits years, years+months, years+months+days and individual hours separately as keys with value 1.
DateHistogramMapper() - Constructor for class eu.dicodeproject.analysis.histogram.DateHistogramMapper
 
DateHistogramMapper.ErrorCases - Enum in eu.dicodeproject.analysis.histogram
Set of patterns with associated key for later output.
DateHistogramReducer - Class in eu.dicodeproject.analysis.histogram
Sums all the values coming from the mapper.
DateHistogramReducer() - Constructor for class eu.dicodeproject.analysis.histogram.DateHistogramReducer
 
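
As a rough sketch of the keys DateHistogramMapper is described as emitting, the snippet below derives a year, year+month, year+month+day and hour key from one creation date. The key formats and the class name are assumptions made for illustration; the index does not state how the real mapper formats its keys.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the indexed API.
final class DateHistogramKeysSketch {

    // Returns the four keys for one creation date; a mapper would emit each with value 1.
    static List<String> keysFor(LocalDateTime created) {
        String year = String.format("%04d", created.getYear());
        String month = year + "-" + String.format("%02d", created.getMonthValue());
        String day = month + "-" + String.format("%02d", created.getDayOfMonth());
        String hour = String.format("%02d", created.getHour()); // hour of day, assumed format
        return Arrays.asList(year, month, day, hour);
    }
}
```

A reducer that sums the values per key, as DateHistogramReducer is described as doing, would then yield one count per year, month, day and hour of day.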

E

endDocument() - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
endDocument() - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
endElement(String, String, String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
endElement(String, String, String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
endPrefixMapping(String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
endPrefixMapping(String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
eu.dicodeproject.analysis.examples - package eu.dicodeproject.analysis.examples
 
eu.dicodeproject.analysis.export - package eu.dicodeproject.analysis.export
 
eu.dicodeproject.analysis.generic - package eu.dicodeproject.analysis.generic
 
eu.dicodeproject.analysis.hbase - package eu.dicodeproject.analysis.hbase
 
eu.dicodeproject.analysis.histogram - package eu.dicodeproject.analysis.histogram
 
eu.dicodeproject.analysis.lucene - package eu.dicodeproject.analysis.lucene
 
eu.dicodeproject.analysis.twitter - package eu.dicodeproject.analysis.twitter
 
eu.dicodeproject.analysis.util - package eu.dicodeproject.analysis.util
 
eu.dicodeproject.analysis.wordcount - package eu.dicodeproject.analysis.wordcount
 

F

fromCode(String) - Static method in enum eu.dicodeproject.analysis.util.Language
 

G

GenericDriver - Class in eu.dicodeproject.analysis.generic
Simple "word count" for HBase columns: aggregates values from a configurable HBase table and column. TODO: add filters, e.g. for language.
GenericMapper - Class in eu.dicodeproject.analysis.generic
Simple word count mapper. Emits either the complete content of a column or parts of it, split using a configured separator.
GenericMapper() - Constructor for class eu.dicodeproject.analysis.generic.GenericMapper
 
GenericReducer - Class in eu.dicodeproject.analysis.generic
Sums up counts for words and writes the counts multiplied by -1 to HDFS.
GenericReducer() - Constructor for class eu.dicodeproject.analysis.generic.GenericReducer
 
GenericTableDriver - Class in eu.dicodeproject.analysis.generic
Reads text from a configurable HBase table and column, extracts the content (or in some cases single items from a list separated by a separator like '#') and writes the counts to HDFS.
GenericTableReducer - Class in eu.dicodeproject.analysis.generic
Sums up counts for hashtags (or more general: words) and writes the counts to HBase.
GenericTableReducer() - Constructor for class eu.dicodeproject.analysis.generic.GenericTableReducer
 
getCode() - Method in enum eu.dicodeproject.analysis.util.Language
 
getPrefix() - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
getPrefix() - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
getStemmer(Language, Version, TokenStream) - Static method in class eu.dicodeproject.analysis.lucene.StemmerFactory
 
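
The split-then-count behavior described for GenericMapper and GenericReducer can be sketched in plain Java as below. The class and method names are hypothetical, and the in-memory map stands in for the Hadoop map and reduce phases. One plausible reason for negating the counts, which the index does not state, is that an ascending sort on the negated value lists the most frequent items first.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the indexed API.
final class SeparatorCountSketch {

    // Splits column content on the separator, sums a count of 1 per item,
    // then multiplies each sum by -1.
    static Map<String, Integer> negatedCounts(String columnContent, String separator) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String item : columnContent.split(Pattern.quote(separator))) {
            if (!item.isEmpty()) {
                counts.merge(item, 1, Integer::sum); // "map" emits 1, "reduce" sums
            }
        }
        counts.replaceAll((item, sum) -> sum * -1);
        return counts;
    }
}
```

For the content "#hbase#lucene#hbase" and the separator "#", this yields -2 for "hbase" and -1 for "lucene".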

H

hasNext() - Method in class eu.dicodeproject.analysis.lucene.IterableAnalyzer
 
HBaseDocumentProcessor - Class in eu.dicodeproject.analysis.hbase
 
HBaseLuceneTokenizerDriver - Class in eu.dicodeproject.analysis.hbase
Reads text from a configurable HBase table and column, tokenizes it with Lucene and writes the resulting tokens to HDFS for further processing by the Mahout colloc driver.

I

ignorableWhitespace(char[], int, int) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
ignorableWhitespace(char[], int, int) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
IterableAnalyzer - Class in eu.dicodeproject.analysis.lucene
Wraps an analyzer in an iterable for strings.
IterableAnalyzer(Analyzer, String) - Constructor for class eu.dicodeproject.analysis.lucene.IterableAnalyzer
 
iterator() - Method in class eu.dicodeproject.analysis.lucene.IterableAnalyzer
 

L

Language - Enum in eu.dicodeproject.analysis.util
 

M

MailArchiveToSequenceFile - Class in eu.dicodeproject.analysis.examples
Converts a directory containing unzipped mail archives in mbox format to sequence files.
MailArchiveToSequenceFile(Configuration, String, ChunkedWriter, Charset) - Constructor for class eu.dicodeproject.analysis.examples.MailArchiveToSequenceFile
 
MailArchiveToSequenceFile(Configuration, String, ChunkedWriter, Charset, MailContentHandler) - Constructor for class eu.dicodeproject.analysis.examples.MailArchiveToSequenceFile
 
MailContentHandler - Interface in eu.dicodeproject.analysis.examples
TODO add useful comment
MailHandler - Class in eu.dicodeproject.analysis.examples
Accepts events resulting from parsing mbox archives.
MailHandler(ChunkedWriter) - Constructor for class eu.dicodeproject.analysis.examples.MailHandler
 
main(String[]) - Static method in class eu.dicodeproject.analysis.export.TwitterExportDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.generic.GenericDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.generic.GenericTableDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.hbase.HBaseLuceneTokenizerDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.histogram.DateHistogramDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.twitter.SchemaUpdaterDriver
 
main(String[]) - Static method in class eu.dicodeproject.analysis.wordcount.WordCountDriver
 
map(ImmutableBytesWritable, Result, Mapper.Context) - Method in class eu.dicodeproject.analysis.export.TwitterExportMapper
 
map(ImmutableBytesWritable, Result, Mapper.Context) - Method in class eu.dicodeproject.analysis.generic.GenericMapper
 
map(ImmutableBytesWritable, Result, Mapper<ImmutableBytesWritable, Result, Text, IntWritable>.Context) - Method in class eu.dicodeproject.analysis.histogram.DateHistogramMapper
Map
map(ImmutableBytesWritable, Result, Mapper<ImmutableBytesWritable, Result, NullWritable, NullWritable>.Context) - Method in class eu.dicodeproject.analysis.twitter.SchemaUpdaterMapper
 
map(ImmutableBytesWritable, Result, Mapper<ImmutableBytesWritable, Result, Text, IntWritable>.Context) - Method in class eu.dicodeproject.analysis.wordcount.WordCountMapper
 

N

next() - Method in class eu.dicodeproject.analysis.lucene.IterableAnalyzer
 

P

processingInstruction(String, String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
processingInstruction(String, String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 

R

reduce(Text, Iterable<IntWritable>, Reducer<Text, IntWritable, IntWritable, Text>.Context) - Method in class eu.dicodeproject.analysis.generic.GenericReducer
 
reduce(IntWritable, Iterable<Text>, Reducer<IntWritable, Text, ImmutableBytesWritable, Writable>.Context) - Method in class eu.dicodeproject.analysis.generic.GenericTableReducer
Aggregates the top hashtags in JSON format and writes them to HBase.
reduce(Text, Iterable<IntWritable>, Reducer<Text, IntWritable, Text, Writable>.Context) - Method in class eu.dicodeproject.analysis.histogram.DateHistogramReducer
Add up data and write to table.
reduce(Text, Iterable<IntWritable>, Reducer<Text, IntWritable, Text, Writable>.Context) - Method in class eu.dicodeproject.analysis.wordcount.WordCountReducer
 
remove() - Method in class eu.dicodeproject.analysis.lucene.IterableAnalyzer
 
run(String[]) - Method in class eu.dicodeproject.analysis.export.TwitterExportDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.generic.GenericDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.generic.GenericTableDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.hbase.HBaseLuceneTokenizerDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.histogram.DateHistogramDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.twitter.SchemaUpdaterDriver
 
run(String[]) - Method in class eu.dicodeproject.analysis.wordcount.WordCountDriver
 

S

SchemaUpdaterDriver - Class in eu.dicodeproject.analysis.twitter
 
SchemaUpdaterDriver() - Constructor for class eu.dicodeproject.analysis.twitter.SchemaUpdaterDriver
 
SchemaUpdaterMapper - Class in eu.dicodeproject.analysis.twitter
 
SchemaUpdaterMapper() - Constructor for class eu.dicodeproject.analysis.twitter.SchemaUpdaterMapper
 
setDocumentLocator(Locator) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
setDocumentLocator(Locator) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
setPrefix(String) - Method in interface eu.dicodeproject.analysis.examples.MailContentHandler
 
setPrefix(String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
setPrefix(String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
setup(Mapper<ImmutableBytesWritable, Result, Text, IntWritable>.Context) - Method in class eu.dicodeproject.analysis.generic.GenericMapper
Creates the Row Key from current date and query/topic
setup(Reducer<IntWritable, Text, ImmutableBytesWritable, Writable>.Context) - Method in class eu.dicodeproject.analysis.generic.GenericTableReducer
Creates the Row Key from current date and query/topic
setup(Mapper<ImmutableBytesWritable, Result, NullWritable, NullWritable>.Context) - Method in class eu.dicodeproject.analysis.twitter.SchemaUpdaterMapper
 
skippedEntity(String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
skippedEntity(String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
splitExpression - Static variable in class eu.dicodeproject.analysis.histogram.DateHistogramDriver
 
StandardAnalyzerWrapper - Class in eu.dicodeproject.analysis.lucene
Wraps the Lucene StandardAnalyzer: ties it to Lucene version 3.0.2 and provides a no-argument constructor for instantiation in the Mahout collocation analysis.
StandardAnalyzerWrapper() - Constructor for class eu.dicodeproject.analysis.lucene.StandardAnalyzerWrapper
 
startDocument() - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
startDocument() - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
startElement(String, String, String, Attributes) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
startElement(String, String, String, Attributes) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
startPrefixMapping(String, String) - Method in class eu.dicodeproject.analysis.examples.MailHandler
 
startPrefixMapping(String, String) - Method in class eu.dicodeproject.analysis.examples.UnquotedHandler
 
StemmerFactory - Class in eu.dicodeproject.analysis.lucene
 
StemmerFactory() - Constructor for class eu.dicodeproject.analysis.lucene.StemmerFactory
 

T

tokenizeDocuments(String, String, String, Class<? extends Analyzer>, Path) - Static method in class eu.dicodeproject.analysis.hbase.HBaseDocumentProcessor
 
tokenStream(String, Reader) - Method in class eu.dicodeproject.analysis.lucene.CleansingAnalyzer
Delegates most of the analysis to the Lucene standard analyzer, then filters out tokens shorter than the minimum length and tokens that consist only of digits.
tokenStream(String, Reader) - Method in class eu.dicodeproject.analysis.lucene.StandardAnalyzerWrapper
 
TweetAnalyzer - Class in eu.dicodeproject.analysis.lucene
 
TweetAnalyzer(Language, Version, Set<?>) - Constructor for class eu.dicodeproject.analysis.lucene.TweetAnalyzer
 
TweetAnalyzer(Language, Version) - Constructor for class eu.dicodeproject.analysis.lucene.TweetAnalyzer
 
TweetAnalyzer(String, Version, Set<?>) - Constructor for class eu.dicodeproject.analysis.lucene.TweetAnalyzer
 
TweetAnalyzer(String, Version) - Constructor for class eu.dicodeproject.analysis.lucene.TweetAnalyzer
 
TweetCols - Enum in eu.dicodeproject.analysis.hbase
 
TwitterExportDriver - Class in eu.dicodeproject.analysis.export
Reads text from a configurable HBase table and column and writes row key and column content to HDFS.
TwitterExportMapper - Class in eu.dicodeproject.analysis.export
Write column content to HDFS.
TwitterExportMapper() - Constructor for class eu.dicodeproject.analysis.export.TwitterExportMapper
 

U

UnquotedArchiveToSequenceFile - Class in eu.dicodeproject.analysis.examples
Implements converting mbox archives to sequence files ignoring all quoted content to avoid text duplication.
UnquotedArchiveToSequenceFile(Configuration, String, ChunkedWriter, Charset) - Constructor for class eu.dicodeproject.analysis.examples.UnquotedArchiveToSequenceFile
 
UnquotedHandler - Class in eu.dicodeproject.analysis.examples
Stores only content to disk that is not quoted in the original mail.
UnquotedHandler(ChunkedWriter) - Constructor for class eu.dicodeproject.analysis.examples.UnquotedHandler
 
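
The idea behind UnquotedHandler, keeping only text that is not quoted from an earlier mail, can be illustrated with a small line filter. This sketch assumes that quoted lines start with ">", which is a common mail convention but is not confirmed by the index; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the indexed API.
final class UnquotedLinesSketch {

    // Drops every line whose first non-blank character is '>' (assumed quote marker).
    static String stripQuoted(String body) {
        StringBuilder kept = new StringBuilder();
        for (String line : body.split("\n")) {
            if (!line.trim().startsWith(">")) {
                kept.append(line).append('\n');
            }
        }
        return kept.toString();
    }
}
```

Dropping quoted lines before writing the sequence file avoids counting the same text once per reply in a thread.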

V

valueOf(String) - Static method in enum eu.dicodeproject.analysis.hbase.TweetCols
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum eu.dicodeproject.analysis.histogram.DateHistogramMapper.ErrorCases
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum eu.dicodeproject.analysis.util.Language
Returns the enum constant of this type with the specified name.
values() - Static method in enum eu.dicodeproject.analysis.hbase.TweetCols
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum eu.dicodeproject.analysis.histogram.DateHistogramMapper.ErrorCases
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum eu.dicodeproject.analysis.util.Language
Returns an array containing the constants of this enum type, in the order they are declared.

W

WordCountDriver - Class in eu.dicodeproject.analysis.wordcount
Simple "word count" for HBase columns: aggregates values from a configurable HBase table and column. TODO: add filters, e.g. for language.
WordCountDriver() - Constructor for class eu.dicodeproject.analysis.wordcount.WordCountDriver
 
WordCountMapper - Class in eu.dicodeproject.analysis.wordcount
Emits words with count 1, using a Lucene Tokenizer.
WordCountMapper() - Constructor for class eu.dicodeproject.analysis.wordcount.WordCountMapper
 
WordCountReducer - Class in eu.dicodeproject.analysis.wordcount
Writes word counts to a table: the date is the row ID and the word is the column qualifier.
WordCountReducer() - Constructor for class eu.dicodeproject.analysis.wordcount.WordCountReducer
 
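
The table layout described for WordCountReducer (date as row ID, word as column qualifier, count as cell value) can be modeled with nested maps. This is only an in-memory sketch with hypothetical names; it does not use the HBase client API, and the date format is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the indexed API.
final class WordCountTableSketch {

    // table: row ID (date) -> column qualifier (word) -> cell value (summed count)
    static void put(Map<String, Map<String, Integer>> table,
                    String date, String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        table.computeIfAbsent(date, d -> new TreeMap<>()).put(word, sum);
    }
}
```

Reading all counts for one day then amounts to fetching a single row and iterating over its columns.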


Copyright © 2011. All Rights Reserved.