public interface TextCorpus
Interface TextCorpus represents TCF TextCorpus annotations and
corresponds to the TCF TextCorpus specification.
These are linguistic annotations on written, connected text.
The annotations are divided into annotation layers, where each layer
represents a specific linguistic aspect. For example, a TextCorpus can
contain a TokensLayer, a PosTagsLayer,
a ConstituentParsingLayer, etc. In a TextCorpus, annotations from any
layer usually annotate (directly or indirectly) Token annotations
from the TokensLayer. An exception is the TextLayer, which is
independent of any other layer.
See also:
TCF Format description.
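The layering described above can be pictured with a minimal sketch. The classes SketchToken, SketchPosTag and SketchCorpus and all of their members are hypothetical stand-ins written for this illustration; they are not the library's types. The sketch only shows the idea that a higher layer points back at tokens instead of copying text, while the text itself stands alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the library's classes: they only model the idea
// that higher layers point back at tokens instead of copying the text.
final class SketchToken {
    final String id;
    final String text;

    SketchToken(String id, String text) {
        this.id = id;
        this.text = text;
    }
}

final class SketchPosTag {
    final String tag;
    final SketchToken token; // reference into the tokens layer

    SketchPosTag(String tag, SketchToken token) {
        this.tag = tag;
        this.token = token;
    }
}

final class SketchCorpus {
    final String text;                                     // text layer: independent of the others
    final List<SketchToken> tokens = new ArrayList<>();    // tokens layer
    final List<SketchPosTag> posTags = new ArrayList<>();  // part-of-speech layer, annotates tokens

    SketchCorpus(String text) {
        this.text = text;
    }

    // Adds a token with a generated ID ("t1", "t2", ...) and returns it.
    SketchToken addToken(String tokenText) {
        SketchToken token = new SketchToken("t" + (tokens.size() + 1), tokenText);
        tokens.add(token);
        return token;
    }

    // Adds a tag that refers to a token already stored in this corpus.
    void addPosTag(String tag, SketchToken token) {
        if (!tokens.contains(token)) {
            throw new IllegalArgumentException("token is not part of this corpus");
        }
        posTags.add(new SketchPosTag(tag, token));
    }

    // Builds a two-token corpus with one tag per token.
    static SketchCorpus example() {
        SketchCorpus corpus = new SketchCorpus("Birds sing");
        corpus.addPosTag("NNS", corpus.addToken("Birds"));
        corpus.addPosTag("VBP", corpus.addToken("sing"));
        return corpus;
    }
}
```

In this sketch a tag can only be attached to a token that already exists, which mirrors the statement that most layers annotate Token annotations directly or indirectly.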
| Method Summary | |
|---|---|
LexicalSemanticsLayer |
createAntonymyLayer()
Creates empty antonymy layer in this TextCorpus. |
ConstituentParsingLayer |
createConstituentParsingLayer(String tagset)
Creates empty ConstituentParsingLayer with the given tagset in
this TextCorpus. |
DependencyParsingLayer |
createDependencyParsingLayer(boolean multipleGovernorsPossible,
boolean emptyTokensPossible)
Creates empty DependencyParsingLayer in this TextCorpus. |
DependencyParsingLayer |
createDependencyParsingLayer(String tagset,
boolean multipleGovernorsPossible,
boolean emptyTokensPossible)
Creates empty DependencyParsingLayer with the given tagset in
this TextCorpus. |
DiscourseConnectivesLayer |
createDiscourseConnectivesLayer()
Creates empty DiscourseConnectivesLayer in this
TextCorpus. |
DiscourseConnectivesLayer |
createDiscourseConnectivesLayer(String typeTagset)
Creates empty DiscourseConnectivesLayer in this
TextCorpus. |
GeoLayer |
createGeoLayer(String source,
GeoLongLatFormat coordFormat)
Creates empty GeoLayer in this TextCorpus. |
GeoLayer |
createGeoLayer(String source,
GeoLongLatFormat coordFormat,
GeoContinentFormat conitentFormat,
GeoCountryFormat countryFormat,
GeoCapitalFormat capitalFormat)
Creates empty GeoLayer in this TextCorpus. |
LexicalSemanticsLayer |
createHyperonymyLayer()
Creates empty hyperonymy layer in this TextCorpus. |
LexicalSemanticsLayer |
createHyponymyLayer()
Creates empty hyponymy layer in this TextCorpus. |
LemmasLayer |
createLemmasLayer()
Creates empty LemmasLayer in this TextCorpus. |
MatchesLayer |
createMatchesLayer(String queryLanguage,
String queryString)
Creates empty MatchesLayer in this TextCorpus, ready to be filled with corpus match annotations. |
MorphologyLayer |
createMorphologyLayer()
Creates empty MorphologyLayer in this TextCorpus. |
MorphologyLayer |
createMorphologyLayer(boolean hasSegmentation)
Creates empty MorphologyLayer in this TextCorpus. |
MorphologyLayer |
createMorphologyLayer(boolean hasSegmentation,
boolean hasCharOffsets)
Creates empty MorphologyLayer in this TextCorpus. |
NamedEntitiesLayer |
createNamedEntitiesLayer(String entitiesType)
Creates empty NamedEntitiesLayer with the given tagset for named
entity types in this TextCorpus. |
OrthographyLayer |
createOrthographyLayer()
Creates empty OrthographyLayer in this TextCorpus. |
PhoneticsLayer |
createPhotenicsLayer(String alphabet)
Creates empty PhoneticsLayer with the given alphabet for phonetic
transcriptions in this TextCorpus. |
PosTagsLayer |
createPosTagsLayer(String tagset)
Creates empty PosTagsLayer with the given tagset in this
TextCorpus. |
ReferencesLayer |
createReferencesLayer(String typetagset,
String reltagset,
String externalReferencesSource)
Creates empty ReferencesLayer in this TextCorpus, ready to be filled with references data. |
RelationsLayer |
createRelationsLayer(String type)
|
SentencesLayer |
createSentencesLayer()
Creates empty SentencesLayer in this TextCorpus. |
SentencesLayer |
createSentencesLayer(boolean hasCharOffsets)
Creates empty SentencesLayer in this TextCorpus. |
LexicalSemanticsLayer |
createSynonymyLayer()
Creates empty synonymy layer in this TextCorpus. |
TextLayer |
createTextLayer()
Creates empty TextLayer in this TextCorpus. |
TextStructureLayer |
createTextStructureLayer()
Creates empty TextStructureLayer in this TextCorpus. |
TokensLayer |
createTokensLayer()
Creates empty TokensLayer in this TextCorpus. |
TokensLayer |
createTokensLayer(boolean hasCharOffsets)
Creates empty TokensLayer in this TextCorpus. |
WordSensesLayer |
createWordSensesLayer(String source)
Creates empty WordSensesLayer in this
TextCorpus. |
WordSplittingLayer |
createWordSplittingLayer(String type)
Creates empty WordSplittingLayer with the given type of the
splitting in this TextCorpus. |
LexicalSemanticsLayer |
getAntonymyLayer()
Gets antonymy layer of this TextCorpus. |
ConstituentParsingLayer |
getConstituentParsingLayer()
Gets constituent parsing layer of this TextCorpus. |
DependencyParsingLayer |
getDependencyParsingLayer()
Gets dependency parsing layer of this TextCorpus. |
DiscourseConnectivesLayer |
getDiscourseConnectivesLayer()
Gets discourse connectives layer of this TextCorpus. |
GeoLayer |
getGeoLayer()
Gets geo layer of this TextCorpus. |
LexicalSemanticsLayer |
getHyperonymyLayer()
Gets hyperonymy layer of this TextCorpus. |
LexicalSemanticsLayer |
getHyponymyLayer()
Gets hyponymy layer of this TextCorpus. |
String |
getLanguage()
Gets the language of the text/tokens in this TextCorpus. |
List<TextCorpusLayer> |
getLayers()
Gets all annotation layers of this TextCorpus. |
LemmasLayer |
getLemmasLayer()
Gets lemmas layer of this TextCorpus. |
MatchesLayer |
getMatchesLayer()
Gets matches layer of this TextCorpus. |
MorphologyLayer |
getMorphologyLayer()
Gets morphology layer of this TextCorpus. |
NamedEntitiesLayer |
getNamedEntitiesLayer()
Gets named entities layer of this TextCorpus. |
OrthographyLayer |
getOrthographyLayer()
Gets orthography layer of this TextCorpus. |
PhoneticsLayer |
getPhoneticsLayer()
Gets phonetics layer of this TextCorpus. |
PosTagsLayer |
getPosTagsLayer()
Gets part-of-speech layer of this TextCorpus. |
ReferencesLayer |
getReferencesLayer()
Gets references layer of this TextCorpus. |
RelationsLayer |
getRelationsLayer()
|
SentencesLayer |
getSentencesLayer()
Gets sentences layer of this TextCorpus. |
LexicalSemanticsLayer |
getSynonymyLayer()
Gets synonymy layer of this TextCorpus. |
TextLayer |
getTextLayer()
Gets text layer of this TextCorpus. |
TextStructureLayer |
getTextStructureLayer()
Gets text structure layer of this TextCorpus. |
TokensLayer |
getTokensLayer()
Gets tokens layer of this TextCorpus. |
WordSensesLayer |
getWordSensesLayer()
Gets word senses layer of this TextCorpus. |
WordSplittingLayer |
getWordSplittingLayer()
Gets word splitting layer of this TextCorpus. |
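As a hedged illustration of the two general accessors in the summary, getLanguage() and getLayers(), the sketch below lists the layers present in a corpus and looks one up by type. MiniLayer and MiniTextCorpus are stand-ins that copy only those two signatures so the example runs without the library; LayerReport, DemoTokens, DemoLemmas and DemoCorpus are hypothetical helpers. Whether the typed getters of the real interface return null for a missing layer is not stated on this page, so the lookup here goes through getLayers() instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for TextCorpusLayer; the real interface is not shown on this page.
interface MiniLayer { }

// Stand-in exposing only two methods listed in the summary above.
interface MiniTextCorpus {
    String getLanguage();
    List<MiniLayer> getLayers();
}

final class LayerReport {
    // Builds a one-line description from the language and the layer class names.
    static String describe(MiniTextCorpus corpus) {
        List<String> names = new ArrayList<>();
        for (MiniLayer layer : corpus.getLayers()) {
            names.add(layer.getClass().getSimpleName());
        }
        return corpus.getLanguage() + ": " + names;
    }

    // Returns the first layer of the requested type, or null if none is present.
    static <T extends MiniLayer> T find(MiniTextCorpus corpus, Class<T> type) {
        for (MiniLayer layer : corpus.getLayers()) {
            if (type.isInstance(layer)) {
                return type.cast(layer);
            }
        }
        return null;
    }
}

// Hypothetical layer types and corpus, used only to exercise the helpers.
final class DemoTokens implements MiniLayer { }

final class DemoLemmas implements MiniLayer { }

final class DemoCorpus implements MiniTextCorpus {
    public String getLanguage() {
        return "de";
    }

    public List<MiniLayer> getLayers() {
        return Arrays.<MiniLayer>asList(new DemoTokens());
    }
}
```

With a real TextCorpus the same loop would run over List<TextCorpusLayer>; the helper names would remain the caller's own.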
| Method Detail |
|---|
String getLanguage()
Gets the language of the text/tokens in this TextCorpus.
List<TextCorpusLayer> getLayers()
Gets all annotation layers of this TextCorpus.
TextLayer getTextLayer()
Gets text layer of this TextCorpus.
TextLayer createTextLayer()
Creates empty TextLayer in this TextCorpus.
TokensLayer getTokensLayer()
Gets tokens layer of this TextCorpus.
TokensLayer createTokensLayer()
Creates empty TokensLayer in this TextCorpus.
TokensLayer createTokensLayer(boolean hasCharOffsets)
Creates empty TokensLayer in this TextCorpus.
hasCharOffsets - true if the Token objects in this TokensLayer will contain character offsets into the text, false otherwise.
LemmasLayer getLemmasLayer()
Gets lemmas layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
LemmasLayer createLemmasLayer()
Creates empty LemmasLayer in this TextCorpus.
PosTagsLayer getPosTagsLayer()
Gets part-of-speech layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
PosTagsLayer createPosTagsLayer(String tagset)
Creates empty PosTagsLayer with the given tagset in this TextCorpus.
tagset - tagset of the part-of-speech annotations.
SentencesLayer getSentencesLayer()
Gets sentences layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
SentencesLayer createSentencesLayer()
Creates empty SentencesLayer in this TextCorpus.
SentencesLayer createSentencesLayer(boolean hasCharOffsets)
Creates empty SentencesLayer in this TextCorpus.
hasCharOffsets - true if the Sentence objects in this SentencesLayer will contain character offsets into the text, false otherwise.
ConstituentParsingLayer getConstituentParsingLayer()
Gets constituent parsing layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
ConstituentParsingLayer createConstituentParsingLayer(String tagset)
Creates empty ConstituentParsingLayer with the given tagset in this TextCorpus.
tagset - tagset of the parsing annotations.
DependencyParsingLayer getDependencyParsingLayer()
Gets dependency parsing layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
DependencyParsingLayer createDependencyParsingLayer(boolean multipleGovernorsPossible,
boolean emptyTokensPossible)
Creates empty DependencyParsingLayer in this TextCorpus.
multipleGovernorsPossible - true if a dependent can be governed by more than one governor, false otherwise.
emptyTokensPossible - true if dependency annotations can contain empty tokens, false otherwise.
DependencyParsingLayer createDependencyParsingLayer(String tagset,
boolean multipleGovernorsPossible,
boolean emptyTokensPossible)
Creates empty DependencyParsingLayer with the given tagset in this TextCorpus.
tagset - tagset of the functions between dependent and governor.
multipleGovernorsPossible - true if a dependent can be governed by more than one governor, false otherwise.
emptyTokensPossible - true if dependency annotations can contain empty tokens, false otherwise.
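The multipleGovernorsPossible flag describes a property of the dependency data itself, so a caller may want to derive it from the parses before creating the layer. The following is a sketch under assumptions: SketchDependency and DependencyFlags are hypothetical helpers written for this illustration, token IDs are modelled as plain strings, and nothing here is part of the TextCorpus interface.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for one dependency edge; token IDs are plain strings here.
final class SketchDependency {
    final String dependentId;
    final String governorId;

    SketchDependency(String dependentId, String governorId) {
        this.dependentId = dependentId;
        this.governorId = governorId;
    }
}

final class DependencyFlags {
    // Returns true if at least one dependent is attached to more than one
    // distinct governor, i.e. the data needs multipleGovernorsPossible = true.
    static boolean needsMultipleGovernors(List<SketchDependency> edges) {
        Map<String, Set<String>> governorsByDependent = new HashMap<>();
        for (SketchDependency edge : edges) {
            Set<String> governors = governorsByDependent
                    .computeIfAbsent(edge.dependentId, key -> new HashSet<>());
            governors.add(edge.governorId);
            if (governors.size() > 1) {
                return true;
            }
        }
        return false;
    }
}
```

The result could then be passed as the first boolean argument of createDependencyParsingLayer; repeated edges between the same dependent and governor do not count as multiple governors in this sketch.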
MorphologyLayer getMorphologyLayer()
Gets morphology layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
MorphologyLayer createMorphologyLayer()
Creates empty MorphologyLayer in this TextCorpus.
MorphologyLayer createMorphologyLayer(boolean hasSegmentation)
Creates empty MorphologyLayer in this TextCorpus.
hasSegmentation - true if morphology annotations contain segmentation analysis.
MorphologyLayer createMorphologyLayer(boolean hasSegmentation,
boolean hasCharOffsets)
Creates empty MorphologyLayer in this TextCorpus.
hasSegmentation - true if morphology annotations contain segmentation analysis.
hasCharOffsets - true if the MorphologyAnalysis objects in this layer will contain character offsets of the segmentation within the token, false otherwise.
NamedEntitiesLayer getNamedEntitiesLayer()
Gets named entities layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
NamedEntitiesLayer createNamedEntitiesLayer(String entitiesType)
Creates empty NamedEntitiesLayer with the given tagset for named entity types in this TextCorpus.
entitiesType - tagset of the named entity annotations.
ReferencesLayer getReferencesLayer()
Gets references layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
ReferencesLayer createReferencesLayer(String typetagset,
String reltagset,
String externalReferencesSource)
Creates empty ReferencesLayer in this TextCorpus, ready to be filled with references data.
typetagset - tagset for the mention type values of the references (should be null if no types are defined).
reltagset - tagset for relation values between the references (should be null if no relations are defined).
externalReferencesSource - name of the external source (should be null if entities from the external source are not referenced).
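Because all three parameters of createReferencesLayer use null to mean "not defined", callers that read tagset names from configuration may want to normalize blank values first. The sketch below is one possible way to do that. ReferencesFactory is a stand-in that copies only the parameter list shown above and simplifies the return type to String so the example runs without the library; ReferencesArgs and the sample values are hypothetical.

```java
// Stand-in exposing only the createReferencesLayer parameter list shown above;
// the return type is simplified to String so the sketch runs without the library.
interface ReferencesFactory {
    String createReferencesLayer(String typetagset,
                                 String reltagset,
                                 String externalReferencesSource);
}

final class ReferencesArgs {
    // Maps null, empty, or whitespace-only input to null, the "not defined" marker.
    static String orNull(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return value.trim();
    }

    // Normalizes all three arguments before delegating to the factory.
    static String create(ReferencesFactory factory,
                         String typetagset,
                         String reltagset,
                         String externalReferencesSource) {
        return factory.createReferencesLayer(
                orNull(typetagset),
                orNull(reltagset),
                orNull(externalReferencesSource));
    }

    // Demonstrates the normalization with a factory that just echoes its arguments.
    static String demo() {
        ReferencesFactory echo = (type, rel, source) -> type + "|" + rel + "|" + source;
        return create(echo, "  ", "myRelTagset", "");
    }
}
```

With a real TextCorpus, the normalized values would be passed straight to createReferencesLayer in the same order.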
RelationsLayer getRelationsLayer()
RelationsLayer createRelationsLayer(String type)
MatchesLayer getMatchesLayer()
Gets matches layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
MatchesLayer createMatchesLayer(String queryLanguage,
String queryString)
Creates empty MatchesLayer in this TextCorpus, ready to be filled with corpus match annotations.
queryLanguage - language of the query used to extract corpus matches from a corpus.
queryString - the query used to extract corpus matches from a corpus.
WordSplittingLayer getWordSplittingLayer()
Gets word splitting layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
WordSplittingLayer createWordSplittingLayer(String type)
Creates empty WordSplittingLayer with the given type of the splitting in this TextCorpus.
type - type of the splitting, e.g. hyphenation.
PhoneticsLayer getPhoneticsLayer()
Gets phonetics layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
PhoneticsLayer createPhotenicsLayer(String alphabet)
Creates empty PhoneticsLayer with the given alphabet for phonetic transcriptions in this TextCorpus.
alphabet - alphabet of the phonetic transcription annotations.
GeoLayer getGeoLayer()
Gets geo layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
GeoLayer createGeoLayer(String source,
GeoLongLatFormat coordFormat)
Creates empty GeoLayer in this TextCorpus.
source - source of the geographical coordinates.
coordFormat - format of the geographical coordinates.
GeoLayer createGeoLayer(String source,
GeoLongLatFormat coordFormat,
GeoContinentFormat conitentFormat,
GeoCountryFormat countryFormat,
GeoCapitalFormat capitalFormat)
Creates empty GeoLayer in this TextCorpus.
source - source of the geographical coordinates.
coordFormat - format of the geographical coordinates.
conitentFormat - format of the continent (should be null if no continent is specified).
countryFormat - format of the country (should be null if no country is specified).
capitalFormat - format of the capital (should be null if no capital is specified).
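OrthographyLayer getOrthographyLayer()
Gets orthography layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
OrthographyLayer createOrthographyLayer()
Creates empty OrthographyLayer in this TextCorpus.
TextStructureLayer getTextStructureLayer()
Gets text structure layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
TextStructureLayer createTextStructureLayer()
Creates empty TextStructureLayer in this TextCorpus.
LexicalSemanticsLayer getSynonymyLayer()
Gets synonymy layer of this TextCorpus. Its annotations refer to Lemma objects from LemmasLayer.
LexicalSemanticsLayer createSynonymyLayer()
Creates empty synonymy layer in this TextCorpus.
LexicalSemanticsLayer getAntonymyLayer()
Gets antonymy layer of this TextCorpus. Its annotations refer to Lemma objects from LemmasLayer.
LexicalSemanticsLayer createAntonymyLayer()
Creates empty antonymy layer in this TextCorpus.
LexicalSemanticsLayer getHyponymyLayer()
Gets hyponymy layer of this TextCorpus. Its annotations refer to Lemma objects from LemmasLayer.
LexicalSemanticsLayer createHyponymyLayer()
Creates empty hyponymy layer in this TextCorpus.
LexicalSemanticsLayer getHyperonymyLayer()
Gets hyperonymy layer of this TextCorpus. Its annotations refer to Lemma objects from LemmasLayer.
LexicalSemanticsLayer createHyperonymyLayer()
Creates empty hyperonymy layer in this TextCorpus.
DiscourseConnectivesLayer getDiscourseConnectivesLayer()
Gets discourse connectives layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
DiscourseConnectivesLayer createDiscourseConnectivesLayer()
Creates empty DiscourseConnectivesLayer in this TextCorpus.
DiscourseConnectivesLayer createDiscourseConnectivesLayer(String typeTagset)
Creates empty DiscourseConnectivesLayer in this TextCorpus.
typeTagset - tagset used to label semantic types of the connectives.
WordSensesLayer getWordSensesLayer()
Gets word senses layer of this TextCorpus. Its annotations refer to Token objects from TokensLayer.
WordSensesLayer createWordSensesLayer(String source)
Creates empty WordSensesLayer in this TextCorpus.
source - source from which the word senses are taken.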