public class CoNLLExporterProperties
extends org.corpus_tools.pepper.modules.PepperModuleProperties
| Modifier and Type | Field and Description |
|---|---|
| static String | COLLAPSE_VALUE |
| static String | MARKER_USE_DEFAULT: This marker is meant to be used when the user does not want to reconfigure a value and prefers to keep the default. |
| static String | PROP_ANNOS_AS_FEATURES: The annotations listed in CSV style will be exported into CoNLL's feature column as "KEY=VALUE" pairs. |
| static String | PROP_COL_CONFIG: In this string the annotation names (and collapse instructions) for the CoNLL columns are encoded. |
| static String | PROP_DISCOURSE_UNIT_ANNO_NAME: Provide an annotation name that marks sentence spans (or another discourse unit used to mark sentences in CoNLL). |
| static String | PROP_SEGMENTATION_NAME: If provided, a specific segmentation is selected rather than all tokens found in a document. |
| static String | PROP_SPAN_ANNOS: This property contains all the annotations that are found on spans over the tokens, but not on the tokens themselves. |

Fields inherited from class org.corpus_tools.pepper.modules.PepperModuleProperties: pepperModuleProperties, PROP_AFTER_ADD_SLAYER, PROP_AFTER_COPY_RES, PROP_AFTER_REMOVE_ANNOTATIONS, PROP_AFTER_RENAME_ANNOTATIONS, PROP_AFTER_REPORT_CORPUSGRAPH, PROP_AFTER_TOKENIZE, PROP_BEFORE_ADD_SLAYER, PROP_BEFORE_READ_META

| Constructor and Description |
|---|
| CoNLLExporterProperties() |

| Modifier and Type | Method and Description |
|---|---|
| Map<ConllDataField,String> | getColumns() |
| String | getDiscourseUnit() |
| List<String> | getFeatureAnnos() |
| String | getSegmentationName() |
| Set<String> | getSpanAnnotations() |

Methods inherited from class org.corpus_tools.pepper.modules.PepperModuleProperties: addProperties, addProperty, checkProperties, checkProperty, getProperties, getProperty, getPropertyDesctriptions, getPropertyNames, removePropertyValue, setPropertyValue, setPropertyValues, setPropertyValues, stringToCharList, toString

public static final String PROP_COL_CONFIG
The CoNLL columns are:

    ID FORM LEMMA CPOS POS FEATS HEAD REL MISC TOK_INFO

The following configuration provides the annotation names for the first 6 columns (ID and FORM not counted) and collapses MISC and TOK_INFO (tokenization information):

    cols=Lemma,_,Pos,Morph,Dep,,

In addition, CPOS is filled with dashes. FORM is always taken from the Salt tokenization, so ID, FORM and HEAD are always omitted from the configuration. To use the default value for a single annotation, use *:

    cols=*,*,*,*,*,*,*

is the trivial case of only using defaults. The following configuration only changes the annotation name for POS and dashes out the last two values:

    cols=*,*,pos,*,*,_,_
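
To make the configuration format above concrete, here is a minimal stand-alone sketch that splits a cols value and pairs each entry with its column. It does not use the Pepper API; the class `ColsConfigSketch`, its `parse` method, and the placeholder strings `<collapse>`, `<default>` and `<dash>` are hypothetical names chosen for illustration. It assumes, based only on the examples above, that the seven entries map in order to LEMMA, CPOS, POS, FEATS, REL, MISC and TOK_INFO, that an empty entry means "collapse", that `*` means "use the default", and that `_` means "fill with dashes". The exporter's real parsing may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the exporter's implementation.
final class ColsConfigSketch {
    // Assumed order of the configurable columns (ID, FORM and HEAD are omitted).
    static final List<String> CONFIGURABLE_COLUMNS =
            Arrays.asList("LEMMA", "CPOS", "POS", "FEATS", "REL", "MISC", "TOK_INFO");

    // Maps each configurable column to an annotation name or a placeholder.
    static Map<String, String> parse(String config) {
        // Accept the value with or without the leading "cols=" key.
        String value = config.startsWith("cols=") ? config.substring("cols=".length()) : config;
        // A limit of -1 keeps trailing empty entries such as the final ",,".
        String[] entries = value.split(",", -1);
        if (entries.length != CONFIGURABLE_COLUMNS.size()) {
            throw new IllegalArgumentException(
                    "Expected " + CONFIGURABLE_COLUMNS.size() + " entries but found " + entries.length);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < entries.length; i++) {
            String entry = entries[i].trim();
            String meaning;
            if (entry.isEmpty()) {
                meaning = "<collapse>";   // assumed: empty entry collapses the column
            } else if (entry.equals("*")) {
                meaning = "<default>";    // assumed: "*" keeps the default annotation name
            } else if (entry.equals("_")) {
                meaning = "<dash>";       // assumed: "_" fills the column with dashes
            } else {
                meaning = entry;          // anything else is taken as an annotation name
            }
            result.put(CONFIGURABLE_COLUMNS.get(i), meaning);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("cols=Lemma,_,Pos,Morph,Dep,,"));
        System.out.println(parse("cols=*,*,pos,*,*,_,_"));
    }
}
```

The sketch returns a LinkedHashMap so that the entries keep the column order when printed or iterated.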
public static final String PROP_SPAN_ANNOS
public static final String COLLAPSE_VALUE
public static final String MARKER_USE_DEFAULT
public static final String PROP_SEGMENTATION_NAME
public static final String PROP_DISCOURSE_UNIT_ANNO_NAME
public static final String PROP_ANNOS_AS_FEATURES
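
The field summary describes PROP_ANNOS_AS_FEATURES as a CSV-style list of annotations that are exported into the CoNLL feature column as "KEY=VALUE" pairs. The following stand-alone sketch shows one way such a column value could be assembled. The class `FeatureColumnSketch` and its `featureColumn` method are hypothetical, and two details are assumptions not stated on this page: pairs are joined with `|` (the usual CoNLL convention) and an empty result is written as `_`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the exporter's implementation.
final class FeatureColumnSketch {
    // Builds a feature column value from a CSV list of annotation names
    // and the annotations found on one token.
    static String featureColumn(String csvAnnoNames, Map<String, String> tokenAnnos) {
        List<String> pairs = new ArrayList<>();
        for (String name : csvAnnoNames.split(",")) {
            String key = name.trim();
            String value = tokenAnnos.get(key);
            // Skip empty names and annotations the token does not carry.
            if (!key.isEmpty() && value != null) {
                pairs.add(key + "=" + value);
            }
        }
        // Assumed: "|" as pair separator and "_" for an empty column.
        return pairs.isEmpty() ? "_" : String.join("|", pairs);
    }

    public static void main(String[] args) {
        Map<String, String> annos = new HashMap<>();
        annos.put("Case", "Nom");
        annos.put("Number", "Sing");
        System.out.println(featureColumn("Case, Number, Tense", annos));
    }
}
```

Pairs are emitted in the order in which the names are listed, so the output does not depend on the iteration order of the annotation map.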
Copyright © 2010–2019 Humboldt-Universität zu Berlin, INRIA. All rights reserved.