Package io.redlink.nlp.stanfordnlp
Class StanfordNlpPipeline
- java.lang.Object
-
- io.redlink.nlp.stanfordnlp.StanfordNlpPipeline
-
public abstract class StanfordNlpPipeline extends Object
A language-specific configuration for Stanford NLP based named entity recognition (NER). Subclasses need to register themselves as Services so that they can be injected into the StanfordNlpProcessor.
- Author:
- Rupert Westenthaler
-
-
Field Summary
Fields (modifier and type, field)
- protected org.slf4j.Logger log
-
Constructor Summary
Constructors (modifier, constructor)
- protected StanfordNlpPipeline(String name, Locale locale)
-
Method Summary
Methods (modifier and type, method, description)
- void activate(): Activates the component based on its configuration defined in the #
- void deactivate()
- protected void doActivate(): Activation hook
- protected void doDeactivate(): Deactivation hook
- edu.stanford.nlp.pipeline.Annotator getAnnotator(String name): Getter for the Annotator.
- List<String> getAnnotators(): Read-only list of the annotators used by the configured pipeline.
- String getLanguage(): The language supported by the NerModel.
- abstract edu.stanford.nlp.trees.TreebankLanguagePack getLanguagePack(): The Treebank LanguagePack
- Locale getLocale()
- String getName()
- NerTag getNerTag(String tag)
- protected abstract TagSet<NerTag> getNerTagset()
- PhraseTag getPhraseTag(String tag)
- protected abstract TagSet<PhraseTag> getPhraseTagset()
- edu.stanford.nlp.pipeline.AnnotationPipeline getPipeline()
- PosTag getPosTag(String tag)
- protected abstract TagSet<PosTag> getPosTagset()
- Properties getProperties(): Getter for the properties holding the Stanford NLP configuration used by this component
- RelTag getRelationTag(String tag)
- protected abstract TagSet<RelTag> getRelTagset()
- protected void initAnnotatorPool(Properties properties): Initializes the Annotators as referenced by the 'annotators' field of the passed properties.
- boolean isActive()
- boolean isCaseSensitive(): Whether this pipeline uses case-sensitive models
- protected void setAnnotatorImplementation(edu.stanford.nlp.pipeline.AnnotatorImplementations annotatorImplementation): Sets a custom AnnotatorImplementations class.
- protected void setCaseSensitive(boolean caseSensitive): Setter for the caseSensitive state.
- protected void setProperties(Properties properties): The properties used for activate().
- String toString()
-
-
-
Method Detail
-
getPosTagset
protected abstract TagSet<PosTag> getPosTagset()
- Returns:
- the PosTag set, or null if none is available.
-
getNerTagset
protected abstract TagSet<NerTag> getNerTagset()
- Returns:
- the NerTag set, or null if none is available.
-
getPhraseTagset
protected abstract TagSet<PhraseTag> getPhraseTagset()
- Returns:
- the PhraseTag set, or null if none is available.
-
getRelTagset
protected abstract TagSet<RelTag> getRelTagset()
- Returns:
- the RelTag set, or null if none is available.
-
getLanguagePack
public abstract edu.stanford.nlp.trees.TreebankLanguagePack getLanguagePack()
The Treebank LanguagePack.
- Returns:
- the Treebank LanguagePack, or null if none is available
-
setProperties
protected final void setProperties(Properties properties)
The properties used for activate(). These MUST be set before activating the component.
- Parameters:
- properties - the properties
-
setCaseSensitive
protected void setCaseSensitive(boolean caseSensitive)
Setter for the caseSensitive state. If this pipeline uses caseless models, this MUST be set to false so that the extractor parses a lower-case version of the text.
- Parameters:
- caseSensitive - the case-sensitive state
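A minimal sketch of what the caseSensitive flag implies for the text handed to the models, assuming the text is lower-cased with the pipeline's Locale when the flag is false. CaseHandlingSketch and prepareText are hypothetical names used for illustration only; they are not part of StanfordNlpPipeline.

```java
import java.util.Locale;

// Hypothetical helper, not part of StanfordNlpPipeline: shows how text might
// be prepared depending on the caseSensitive state.
final class CaseHandlingSketch {

    private CaseHandlingSketch() {}

    // Returns the text unchanged for case-sensitive models, otherwise a
    // lower-case version built with the given locale's casing rules.
    static String prepareText(String text, boolean caseSensitive, Locale locale) {
        return caseSensitive ? text : text.toLowerCase(locale);
    }

    public static void main(String[] args) {
        System.out.println(prepareText("Rupert lives in Salzburg", false, Locale.ENGLISH));
    }
}
```

Passing the Locale explicitly avoids surprises from the JVM default locale, whose casing rules may differ from those of the language being processed.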
-
getProperties
public Properties getProperties()
Getter for the properties holding the Stanford NLP configuration used by this component.
- Returns:
- the properties, or null if not set.
-
isCaseSensitive
public boolean isCaseSensitive()
Whether this pipeline uses case-sensitive models.
-
getAnnotators
public List<String> getAnnotators()
Read-only list of the annotators used by the configured pipeline. Only available after activation.
- Returns:
- the list of annotator names used in the pipeline
-
getAnnotator
public edu.stanford.nlp.pipeline.Annotator getAnnotator(String name)
Getter for the Annotator. Only available after activation.
- Parameters:
- name - the name of the annotator as configured in the pipeline. Valid strings are members of the getAnnotators() list
- Returns:
- the Annotator
- Throws:
- IllegalStateException - if the passed name is not part of the getAnnotators() list
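The contract of getAnnotators() and getAnnotator(String) can be sketched with a dependency-free stand-in. AnnotatorRegistrySketch and its register method are hypothetical, and the type parameter A merely takes the place of edu.stanford.nlp.pipeline.Annotator; how the real class stores its annotators is not stated in this documentation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the annotator lookup; A stands in for
// edu.stanford.nlp.pipeline.Annotator so the sketch has no dependencies.
final class AnnotatorRegistrySketch<A> {

    // LinkedHashMap keeps the order in which annotators were registered.
    private final Map<String, A> annotators = new LinkedHashMap<>();

    void register(String name, A annotator) {
        annotators.put(name, annotator);
    }

    // Read-only snapshot of the annotator names in registration order.
    List<String> getAnnotators() {
        return Collections.unmodifiableList(new ArrayList<>(annotators.keySet()));
    }

    // Mirrors the documented contract: unknown names raise IllegalStateException.
    A getAnnotator(String name) {
        A annotator = annotators.get(name);
        if (annotator == null) {
            throw new IllegalStateException("Annotator '" + name
                    + "' is not part of the pipeline " + annotators.keySet());
        }
        return annotator;
    }
}
```

A caller that is unsure whether a name is configured can check getAnnotators().contains(name) first instead of catching the exception.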
-
setAnnotatorImplementation
protected final void setAnnotatorImplementation(edu.stanford.nlp.pipeline.AnnotatorImplementations annotatorImplementation)
Sets a custom AnnotatorImplementations class. Needs to be called before activation (typically as part of the constructor of the subclass).
- Parameters:
- annotatorImplementation - the custom AnnotatorImplementations instance
-
activate
public final void activate() throws IOException
Activates the component based on its configuration defined in the #
- Throws:
- IOException
-
doActivate
protected void doActivate()
Activation hook
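The final activate() and deactivate() methods together with the overridable doActivate() and doDeactivate() hooks suggest a template-method lifecycle. The sketch below illustrates that pattern with a stand-in class, not the real StanfordNlpPipeline. Two details are assumptions: that activation fails with an IllegalStateException when no properties were set, and the order in which the hooks run relative to the state change.

```java
import java.io.IOException;
import java.util.Properties;

// Hypothetical stand-in for the lifecycle; it is not the real
// StanfordNlpPipeline and loads no models.
abstract class LifecycleSketch {

    private Properties properties;
    private boolean active;

    protected final void setProperties(Properties properties) {
        this.properties = properties;
    }

    // Template method: checks the configuration, then calls the hook.
    public final void activate() throws IOException {
        if (properties == null) {
            // Assumption: the real class may react differently to missing properties.
            throw new IllegalStateException("properties MUST be set before activation");
        }
        doActivate();
        active = true;
    }

    public final void deactivate() {
        active = false;
        doDeactivate();
    }

    public final boolean isActive() {
        return active;
    }

    // Hooks for subclasses; empty by default.
    protected void doActivate() {}

    protected void doDeactivate() {}
}

// Example subclass that configures itself in the constructor.
final class EnglishLifecycleSketch extends LifecycleSketch {
    EnglishLifecycleSketch() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos");
        setProperties(props);
    }
}
```

Keeping activate() final while exposing hooks lets the base class enforce the "properties first" rule without every subclass having to repeat the check.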
-
initAnnotatorPool
protected void initAnnotatorPool(Properties properties)
Initializes the Annotators as referenced by the 'annotators' field of the passed properties.
NOTE: This will only initialize AnnotatorFactories that are actually used in the configured pipeline (the getAnnotators() list).
- Parameters:
- properties - the properties
- annotatorImplementation - annotator impl instance
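As an illustration of how the 'annotators' field could be turned into the list of annotator names, here is a minimal sketch. AnnotatorNamesSketch and annotatorNames are hypothetical, and the separator handling (commas and/or whitespace) is an assumption about the property format rather than a statement about what initAnnotatorPool does internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: reads the 'annotators' property and returns the
// configured annotator names in the order they are listed.
final class AnnotatorNamesSketch {

    private AnnotatorNamesSketch() {}

    static List<String> annotatorNames(Properties properties) {
        String value = properties.getProperty("annotators", "");
        List<String> names = new ArrayList<>();
        // Assumption: names are separated by commas and/or whitespace.
        for (String name : value.split("[,\\s]+")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return Collections.unmodifiableList(names);
    }
}
```

For example, a Properties object with annotators set to "tokenize, ssplit, pos" yields the three names in that order, and a missing property yields an empty list.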
-
isActive
public final boolean isActive()
-
getLocale
public final Locale getLocale()
-
getLanguage
public final String getLanguage()
The language supported by the NerModel. This method can be called before activation.
- Returns:
- the ISO 639-1 language code (e.g. "en" for English)
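The relation between a Locale and the returned language code can be shown with the standard library alone. That getLanguage() is derived from the Locale passed to the constructor is an assumption; LanguageCodeSketch and languageOf are hypothetical names.

```java
import java.util.Locale;

// Demonstrates how a java.util.Locale maps to a language code.
final class LanguageCodeSketch {

    private LanguageCodeSketch() {}

    // Assumption: the pipeline's language is taken from its Locale.
    static String languageOf(Locale locale) {
        return locale.getLanguage();
    }

    public static void main(String[] args) {
        System.out.println(languageOf(Locale.ENGLISH));                 // en
        System.out.println(languageOf(Locale.forLanguageTag("de-AT"))); // de
    }
}
```

Country and variant information is dropped, so regional locales such as de-AT and de-DE map to the same language code.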
-
getName
public final String getName()
-
getPipeline
public final edu.stanford.nlp.pipeline.AnnotationPipeline getPipeline()
-
getPosTag
public final PosTag getPosTag(String tag)
Uses the #posTagset and adhocPosTags to return existing instances of PosTags. If not present it will create a new one and add it to adhocPosTags.
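The lookup order described for getPosTag (configured tagset first, then ad-hoc tags, otherwise create and remember a new tag) can be sketched as follows. TagSketch and TagLookupSketch are hypothetical stand-ins for PosTag, TagSet and the ad-hoc tag store; plain maps are used so the example runs without the library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for PosTag.
final class TagSketch {
    final String tag;
    final boolean adhoc;

    TagSketch(String tag, boolean adhoc) {
        this.tag = tag;
        this.adhoc = adhoc;
    }
}

// Hypothetical stand-in for the tag lookup of the pipeline.
final class TagLookupSketch {

    private final Map<String, TagSketch> tagset; // may be null if no tagset is available
    private final Map<String, TagSketch> adhocTags = new ConcurrentHashMap<>();

    TagLookupSketch(Map<String, TagSketch> tagset) {
        this.tagset = tagset;
    }

    // Looks in the tagset first, then in the ad-hoc tags; creates and stores
    // a new ad-hoc tag if neither knows the string. Never returns null.
    TagSketch getTag(String tag) {
        TagSketch known = tagset == null ? null : tagset.get(tag);
        if (known != null) {
            return known;
        }
        return adhocTags.computeIfAbsent(tag, t -> new TagSketch(t, true));
    }
}
```

Because ad-hoc tags are cached, repeated lookups of the same unknown string return the same instance, which keeps identity comparisons between tags meaningful.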
-
getNerTag
public final NerTag getNerTag(String tag)
Returns the NerTag for the given tag string. An existing NerTag instance is reused where one is available; if none is present, a new one is created and kept for later lookups.
- Parameters:
- tag - the String NER tag
- Returns:
- the NerTag, guaranteed to be not null
-
getRelationTag
public final RelTag getRelationTag(String tag)
Returns the RelTag for the given tag string. An existing RelTag instance is reused where one is available; if none is present, a new one is created and kept for later lookups.
- Parameters:
- tag - the String relation tag
- Returns:
- the RelTag, guaranteed to be not null
-
getPhraseTag
public final PhraseTag getPhraseTag(String tag)
Uses the getPhraseTagset() and adhocPhraseTags to return existing instances of PhraseTags. If not present it will create a new one and add it to adhocPhraseTags.
- Parameters:
- tag - the String phrase tag as returned by the parser
- Returns:
- the PhraseTag, guaranteed to be not null
-
deactivate
public final void deactivate()
-
doDeactivate
protected void doDeactivate()
Deactivation hook
-
-