public class SpacyParser extends PythonParser
PythonParser.ParseExecutor, PythonParser.PythonDispatcherCallback
Modifier and Type | Field and Description |
---|---|
static String | PIPELINE_PROPERTY_PREFIX: The prefix for properties to be sent to the spaCy pipeline. |
 | REGEXRULE_MARKER |
Constructor and Description |
---|
SpacyParser(): Default constructor for a SpacyParser. |
Modifier and Type | Method and Description |
---|---|
protected void | applyEntityReferences(List<JsonObject> jsonEnts, InformationExtraction ext, InformationExtractionToken[] tokens): Applies any entity references to the specified tokens. |
protected void | applyRegexPatterns(List<JsonObject> jsonEnts, InformationExtraction ext, InformationExtractionToken[] tokens): Applies any REGEX entity type hits to the specified tokens. |
protected void | createInformationExtractionArtifacts(JsonObject jsonTop): Creates the InformationExtraction artifacts from the raw NLP information generated in spaCy and represented in the specified JSON object. |
protected InformationExtractionSentence[] | createSentences(List<JsonObject> jsonSents, List<JsonObject> jsonTokens, InformationExtraction ext, InformationExtractionToken[] tokens): Creates the InformationExtractionSentence objects from the specified JsonObjects. |
protected InformationExtractionTokenList | createTokenList(InformationExtractionToken[] tokens, InformationExtraction ext): Creates a TokenList. |
protected InformationExtractionToken[] | createTokens(List<JsonObject> jsonTokens, InformationExtraction ext): Creates the InformationExtractionToken objects from the specified JsonObjects. |
protected String | getPythonScriptContent(): Gets the Python script to execute. |
protected void | preParse(): Performs tasks prior to executing the parse. |
protected void | processJsonOutputs(JsonObject outputsObj): Processes the outputs produced in Python, from the specified JsonObject. |
protected void | setJsonInputs(JsonObject inputsObj): Sets the inputs to be sent to Python, in the specified JsonObject. |
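
The summary describes applyEntityReferences as applying entity hits to the document's tokens. One common way to do this is to match character offsets, which is how spaCy's `Doc.to_json()` output represents both tokens and entities. The sketch below illustrates that idea only: `Token`, `Entity`, and `applyEntities` are hypothetical stand-ins for InformationExtractionToken, the entity JsonObjects, and the real method, and whether SpacyParser matches on character offsets at all is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these types belong to the documented API.
final class EntitySpanDemo {

    // Stand-in for a token covering the character range [start, end).
    static final class Token {
        final String text;
        final int start;
        final int end;
        String entityLabel; // stays null until an entity is applied

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Stand-in for one entity hit covering the character range [start, end).
    static final class Entity {
        final int start;
        final int end;
        final String label;

        Entity(int start, int end, String label) {
            this.start = start;
            this.end = end;
            this.label = label;
        }
    }

    // Labels every token that lies fully inside an entity's character span.
    static void applyEntities(List<Entity> entities, Token[] tokens) {
        for (Entity ent : entities) {
            for (Token t : tokens) {
                if (t.start >= ent.start && t.end <= ent.end) {
                    t.entityLabel = ent.label;
                }
            }
        }
    }

    public static void main(String[] args) {
        // Offsets for the text "Acme Corp hired Ann".
        Token[] tokens = {
            new Token("Acme", 0, 4),
            new Token("Corp", 5, 9),
            new Token("hired", 10, 15),
            new Token("Ann", 16, 19)
        };
        List<Entity> ents = Arrays.asList(
            new Entity(0, 9, "ORG"),
            new Entity(16, 19, "PERSON"));
        applyEntities(ents, tokens);
        for (Token t : tokens) {
            System.out.println(t.text + " -> " + t.entityLabel);
        }
    }
}
```

The nested loop is quadratic and kept only for clarity; with tokens sorted by offset, a single forward pass would do the same job.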
copyOutputFile, copyTempFile, createJsonResultDocument, determineJsonResultDocumentAcl, determineJsonResultDocumentName, determineJsonResultDocumentParentFolder, getDocument, getInformationExtraction, getPythonCommandList, getPythonScriptUniqueIdentifer, getSettings, getTempFileCounter, initialize, normalizeTempFilePath, parse, parse, setInformationExtraction
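
The hooks preParse, setJsonInputs, getPythonScriptContent, and processJsonOutputs, together with the `parse` entry in the list above, suggest a template-method design in which a base class drives the parse and subclasses fill in the steps. The sketch below shows that pattern with stand-in classes. The call order is an assumption, no Python process is launched, `runScript` is a hypothetical placeholder, and a plain `Map` replaces JsonObject.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the real PythonParser is not shown on this page.
final class HookOrderDemo {

    // Stand-in for PythonParser.
    abstract static class SketchPythonParser {
        final List<String> trace = new ArrayList<>();

        // Template method: fixes the (assumed) order in which the hooks run.
        final void parse() {
            preParse();
            Map<String, Object> inputs = new LinkedHashMap<>();
            setJsonInputs(inputs);
            String script = getPythonScriptContent();
            Map<String, Object> outputs = runScript(script, inputs);
            processJsonOutputs(outputs);
        }

        // Placeholder for launching Python; it only echoes the inputs back.
        Map<String, Object> runScript(String script, Map<String, Object> inputs) {
            trace.add("runScript");
            return new LinkedHashMap<>(inputs);
        }

        protected abstract void preParse();
        protected abstract void setJsonInputs(Map<String, Object> inputsObj);
        protected abstract String getPythonScriptContent();
        protected abstract void processJsonOutputs(Map<String, Object> outputsObj);
    }

    // Stand-in for SpacyParser: each hook records that it was called.
    static final class SketchSpacyParser extends SketchPythonParser {
        @Override protected void preParse() {
            trace.add("preParse");
        }

        @Override protected void setJsonInputs(Map<String, Object> inputsObj) {
            trace.add("setJsonInputs");
            inputsObj.put("text", "Hello world.");
        }

        @Override protected String getPythonScriptContent() {
            trace.add("getPythonScriptContent");
            return "# placeholder script";
        }

        @Override protected void processJsonOutputs(Map<String, Object> outputsObj) {
            trace.add("processJsonOutputs");
        }
    }

    public static void main(String[] args) {
        SketchSpacyParser parser = new SketchSpacyParser();
        parser.parse();
        System.out.println(parser.trace);
    }
}
```

Keeping `parse` final in the base class means a subclass can change what each step does but not the sequence of steps.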
public static final String PIPELINE_PROPERTY_PREFIX
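
PIPELINE_PROPERTY_PREFIX is described as the prefix for properties to be sent to the spaCy pipeline. A minimal sketch of prefix-based selection follows. The prefix value `"spacy.pipeline."`, the helper `selectPipelineProperties`, and the choice to strip the prefix before forwarding are all assumptions made for illustration; this page does not state the constant's actual value or how the parser uses it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of the documented SpacyParser API.
final class PipelinePropertyDemo {

    // Assumed value for illustration only; the real constant's value is not documented here.
    static final String PIPELINE_PROPERTY_PREFIX = "spacy.pipeline.";

    // Returns the entries whose keys start with the prefix, with the prefix removed.
    // Removing the prefix is an assumption about how the properties are forwarded.
    static Map<String, String> selectPipelineProperties(Map<String, String> settings) {
        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PIPELINE_PROPERTY_PREFIX)) {
                selected.put(key.substring(PIPELINE_PROPERTY_PREFIX.length()), entry.getValue());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spacy.pipeline.model", "en_core_web_sm");
        settings.put("parser.timeout", "30");
        // Only the prefixed entry is selected.
        System.out.println(selectPipelineProperties(settings));
    }
}
```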
public SpacyParser()
protected void preParse() throws IfsException
Overridden to pick up settings.
preParse in class PythonParser
IfsException - if the operation fails
protected void setJsonInputs(JsonObject inputsObj) throws IfsException
setJsonInputs in class PythonParser
inputsObj - the JSON inputs
IfsException - if the operation fails
protected String getPythonScriptContent()
getPythonScriptContent in class PythonParser
protected void processJsonOutputs(JsonObject outputsObj) throws IfsException
processJsonOutputs in class PythonParser
outputsObj - the JSON outputs
IfsException - if the operation fails
protected void createInformationExtractionArtifacts(JsonObject jsonTop) throws IfsException
jsonTop - the top-level JSON object from the results file
IfsException - if the operation fails
protected InformationExtractionToken[] createTokens(List<JsonObject> jsonTokens, InformationExtraction ext) throws IfsException
jsonTokens - the JsonObjects representing the tokens
ext - the target InformationExtraction
IfsException - if the operation fails
protected InformationExtractionTokenList createTokenList(InformationExtractionToken[] tokens, InformationExtraction ext) throws IfsException
tokens - the InformationExtractionTokens
ext - the target InformationExtraction
IfsException - if the operation fails
protected InformationExtractionSentence[] createSentences(List<JsonObject> jsonSents, List<JsonObject> jsonTokens, InformationExtraction ext, InformationExtractionToken[] tokens) throws IfsException
jsonSents - the JsonObjects representing the sentences
jsonTokens - the JsonObjects representing the tokens
ext - the target InformationExtraction
tokens - the tokens for the entire document
IfsException - if the operation fails
protected void applyEntityReferences(List<JsonObject> jsonEnts, InformationExtraction ext, InformationExtractionToken[] tokens) throws IfsException
jsonEnts - the JsonObjects representing the entity hits
ext - the target InformationExtraction
tokens - the tokens for the entire document
IfsException - if the operation fails
protected void applyRegexPatterns(List<JsonObject> jsonEnts, InformationExtraction ext, InformationExtractionToken[] tokens) throws IfsException
jsonEnts - the JsonObjects representing the entity type hits
ext - the target InformationExtraction
tokens - the tokens for the entire document
IfsException - if the operation fails
Copyright © 2023. All rights reserved.