org.apache.hadoop.hive.ql.io
Class BucketizedHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
java.lang.Object
org.apache.hadoop.hive.ql.io.HiveInputFormat<K,V>
org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat<K,V>
- All Implemented Interfaces:
- org.apache.hadoop.mapred.InputFormat<K,V>, org.apache.hadoop.mapred.JobConfigurable
public class BucketizedHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
- extends HiveInputFormat<K,V>
BucketizedHiveInputFormat serves the similar function as hiveInputFormat but
its getSplits() always group splits from one input file into one wrapper
split. It is useful for the applications that requires input files to fit in
one mapper.
|
Field Summary |
static org.apache.commons.logging.Log |
LOG
|
|
Method Summary |
org.apache.hadoop.mapred.RecordReader |
getRecordReader(org.apache.hadoop.mapred.InputSplit split,
org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.mapred.Reporter reporter)
|
org.apache.hadoop.mapred.InputSplit[] |
getSplits(org.apache.hadoop.mapred.JobConf job,
int numSplits)
|
protected org.apache.hadoop.fs.FileStatus[] |
listStatus(org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.fs.Path dir)
Recursively lists status for all files starting from the directory dir |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
LOG
public static final org.apache.commons.logging.Log LOG
BucketizedHiveInputFormat
public BucketizedHiveInputFormat()
getRecordReader
public org.apache.hadoop.mapred.RecordReader getRecordReader(org.apache.hadoop.mapred.InputSplit split,
org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.mapred.Reporter reporter)
throws IOException
- Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>- Overrides:
getRecordReader in class HiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
- Throws:
IOException
listStatus
protected org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.fs.Path dir)
throws IOException
- Recursively lists status for all files starting from the directory dir
- Parameters:
job - dir -
- Returns:
-
- Throws:
IOException
getSplits
public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
int numSplits)
throws IOException
- Specified by:
getSplits in interface org.apache.hadoop.mapred.InputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>- Overrides:
getSplits in class HiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
- Throws:
IOException
Copyright © 2014 The Apache Software Foundation. All rights reserved.