Extractor

Inheritance: java.lang.Object, com.groupdocs.search.events.EventHubBase

public class Extractor extends EventHubBase

Represents a tool that extracts data from documents ahead of time, separating the extraction stage from the subsequent fast indexing stage.

Constructors

Constructor Description
Extractor() Initializes a new instance of the Extractor class.

Fields

Field Description
ErrorOccurred Occurs when an error happens during an extractor operation.
ImagePreparing Occurs when an image is going to be prepared for indexing.
PasswordRequired Occurs when a document requires a password to open.

Methods

Method Description
getSettings() Gets the extractor settings.
extract(Document document, ExtractionOptions extractionOptions) Extracts data from a document.
raiseErrorOccurredPublic(String message, boolean isCritical)
raiseImagePreparingPublic(String documentKey, String[] innerPath, int imageIndex, ImageFrame[] frames, InputStream stream)
raisePasswordRequiredPublic(String filePath)

Extractor()

public Extractor()

Initializes a new instance of the Extractor class.

ErrorOccurred

public final Event<EventHandler<IndexErrorEventArgs>> ErrorOccurred

Occurs when an error happens during an extractor operation.

ImagePreparing

public final Event<EventHandler<ImagePreparingEventArgs>> ImagePreparing

Occurs when an image is going to be prepared for indexing.

PasswordRequired

public final Event<EventHandler<PasswordRequiredEventArgs>> PasswordRequired

Occurs when a document requires a password to open.
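The three events are exposed as public final fields that handlers subscribe to. The sketch below models that pattern with small stand-in types so it runs without the library on the classpath. Every `Sketch*` name is hypothetical, and the assumption that a password handler supplies the password through the event arguments is not stated on this page; it only mirrors the shape of `raisePasswordRequiredPublic(String filePath)`, which returns a `String`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler shape (sender, args); not the library's EventHandler type.
interface SketchHandler<T> {
    void invoke(Object sender, T args);
}

// Hypothetical stand-in for an event: an ordered list of subscribed handlers.
final class SketchEvent<T> {
    private final List<SketchHandler<T>> handlers = new ArrayList<>();

    void add(SketchHandler<T> handler) {
        handlers.add(handler);
    }

    void raise(Object sender, T args) {
        for (SketchHandler<T> handler : handlers) {
            handler.invoke(sender, args);
        }
    }
}

// Hypothetical argument types, reduced to the fields this sketch needs.
final class SketchErrorArgs {
    final String message;

    SketchErrorArgs(String message) {
        this.message = message;
    }
}

final class SketchPasswordArgs {
    final String filePath;
    String password; // set by a handler; stays null if no handler knows it

    SketchPasswordArgs(String filePath) {
        this.filePath = filePath;
    }
}

// Hypothetical hub with events as public final fields, as in the Fields table.
final class SketchExtractorEvents {
    public final SketchEvent<SketchErrorArgs> ErrorOccurred = new SketchEvent<>();
    public final SketchEvent<SketchPasswordArgs> PasswordRequired = new SketchEvent<>();

    void raiseErrorOccurred(String message) {
        ErrorOccurred.raise(this, new SketchErrorArgs(message));
    }

    // Assumed behavior: notify handlers, then return whatever password they set.
    String raisePasswordRequired(String filePath) {
        SketchPasswordArgs args = new SketchPasswordArgs(filePath);
        PasswordRequired.raise(this, args);
        return args.password;
    }
}
```

A caller would subscribe before starting extraction, for example `hub.PasswordRequired.add((sender, args) -> args.password = "secret");`, so that the handler is already in place when a protected document is encountered.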

getSettings()

public final ExtractorSettings getSettings()

Gets the extractor settings.

Returns: ExtractorSettings - The extractor settings.

extract(Document document, ExtractionOptions extractionOptions)

public final ExtractedData extract(Document document, ExtractionOptions extractionOptions)

Extracts data from a document.

Parameters:

Parameter Type Description
document Document The document to extract data from, created from the file system, a stream, or a structure.
extractionOptions ExtractionOptions The extraction options.

Returns: ExtractedData - The extracted data of the document.
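To illustrate why extraction is separated from indexing, the sketch below models the two stages with stand-in types that run without the library. `SketchExtractor`, `SketchData` and `SketchIndex` are hypothetical and only mirror the shape of `extract(Document, ExtractionOptions)` returning `ExtractedData`; the tokenizing and the in-memory index are illustrative assumptions, not the library's behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ExtractedData: the reusable result of stage 1.
final class SketchData {
    final String documentKey;
    final List<String> terms;

    SketchData(String documentKey, List<String> terms) {
        this.documentKey = documentKey;
        this.terms = terms;
    }
}

// Hypothetical stand-in for Extractor: does the slow parsing work once.
final class SketchExtractor {
    SketchData extract(String documentKey, String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return new SketchData(documentKey, terms);
    }
}

// Hypothetical stand-in for an index that accepts pre-extracted data.
final class SketchIndex {
    private final Map<String, List<String>> postings = new HashMap<>();

    // Stage 2: no parsing happens here, only bookkeeping on extracted terms.
    void add(SketchData[] batch) {
        for (SketchData data : batch) {
            for (String term : data.terms) {
                List<String> keys = postings.computeIfAbsent(term, k -> new ArrayList<>());
                if (!keys.contains(data.documentKey)) {
                    keys.add(data.documentKey);
                }
            }
        }
    }

    List<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new ArrayList<>());
    }
}
```

Because the extracted data is an ordinary value, stage 1 can run earlier, or in parallel for many documents, and stage 2 can then add the whole batch in one call.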

raiseErrorOccurredPublic(String message, boolean isCritical)

public final void raiseErrorOccurredPublic(String message, boolean isCritical)

Parameters:

Parameter Type Description
message java.lang.String
isCritical boolean

raiseImagePreparingPublic(String documentKey, String[] innerPath, int imageIndex, ImageFrame[] frames, InputStream stream)

public final void raiseImagePreparingPublic(String documentKey, String[] innerPath, int imageIndex, ImageFrame[] frames, InputStream stream)

Parameters:

Parameter Type Description
documentKey java.lang.String
innerPath java.lang.String[]
imageIndex int
frames ImageFrame[]
stream java.io.InputStream

raisePasswordRequiredPublic(String filePath)

public final String raisePasswordRequiredPublic(String filePath)

Parameters:

Parameter Type Description
filePath java.lang.String

Returns: java.lang.String