Redaction basics

GroupDocs.Redaction provides a set of document redaction features. It lets you redact text, metadata, annotations, and images.

A wide range of document formats is supported, including PDF, DOC, DOCX, PPT, PPTX, XLS, and XLSX. See the supported document formats article for the full list.

Redaction types

GroupDocs.Redaction comes with the following redaction types:

| Type | Description | Classes |
| --- | --- | --- |
| Text | Replaces a portion of text in the document body, or hides it with a color block | ExactPhraseRedaction, RegexRedaction |
| Metadata | Replaces metadata values with empty ones or redacts metadata text | EraseMetadataRedaction, MetadataSearchRedaction |
| Annotations | Deletes annotations from the document or redacts their text | DeleteAnnotationRedaction, AnnotationRedaction |
| Images | Replaces a specific area of an image with a colored box | ImageAreaRedaction |
| Pages | Removes a specific range of pages (slides, worksheets, etc.) | RemovePageRedaction |

Apply redaction

Redactions are applied to a document through the Redactor.Apply method. It returns a RedactorChangeLog instance containing a log entry for each redaction applied. Each entry holds a reference to the Redaction instance, including its options, the status of the operation (see below), and a textual description where applicable. If at least one redaction failed, the result's Status is RedactionStatus.Failed:

from groupdocs.redaction import Redactor, RedactionStatus
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import ExactPhraseRedaction, ReplacementOptions

def apply_redaction():
    # Specify the redaction options
    repl_opt = ReplacementOptions("[personal]")
    ex_red = ExactPhraseRedaction("John Doe", repl_opt)

    # Load the document to be redacted
    with Redactor("./sample.docx") as redactor:
        # Apply the redaction
        result = redactor.apply(ex_red)

        if result.status != RedactionStatus.FAILED:
            # Save the redacted document next to the source file
            so = SaveOptions()
            so.add_suffix = True
            so.rasterize_to_pdf = False
            so.redacted_file_suffix = "redacted"
            redactor.save(so)

if __name__ == "__main__":
    apply_redaction()

sample.docx is the sample file used in this example.

All possible statuses of the RedactionStatus enumeration are listed in this table:

| Status | Description | Possible reasons |
| --- | --- | --- |
| Applied | Redaction was fully and successfully applied | All operations within the redaction process were applied successfully |
| PartiallyApplied | Redaction was applied to only some of its matches | 1) Trial limitations for replacements were exceeded; 2) At least one change was rejected by the user |
| Skipped | Redaction was skipped (not applied) | 1) Trial limitations for redactions were exceeded; 2) The redaction cannot be applied to this type of document; 3) All replacements were rejected by the user and no changes were made |
| Failed | Redaction failed with an exception | An exception occurred during the redaction process |

For details, iterate through the log entries in RedactorChangeLog.RedactionLog and check the ErrorMessage property of every entry whose status is not Applied:

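from groupdocs.redaction import Redactor, RedactionStatus
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import ExactPhraseRedaction, ReplacementOptions

# Same redaction as in the previous example
redaction = ExactPhraseRedaction("John Doe", ReplacementOptions("[personal]"))

with Redactor("./sample.docx") as redactor:
    result = redactor.apply(redaction)
    if result.status != RedactionStatus.FAILED:
        # By default, the redacted document is saved in PDF format
        save_options = SaveOptions()
        save_options.add_suffix = True
        save_options.rasterize_to_pdf = True
        save_options.redacted_file_suffix = "redacted"
        result_path = redactor.save(save_options)
        print(f"Document redacted successfully.\nCheck output in {result_path}")
    else:
        # Dump every log entry that was not fully applied
        print("Redaction failed!")
        for log_entry in result.redaction_log:
            if log_entry.result.status != RedactionStatus.APPLIED:
                print(f"Status is {log_entry.result.status}, details: {log_entry.result.error_message}")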
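The log-inspection loop can be factored into a small reusable helper. The sketch below is a hypothetical helper and not part of GroupDocs.Redaction: collect_problems is an invented name, and the code assumes only that each log entry exposes result.status and result.error_message, as in the snippet above. The "applied" status is passed in as an argument so the helper can be tried with stand-in objects, without loading a document.

```python
from types import SimpleNamespace


def collect_problems(redaction_log, applied_status):
    """Return (status, error_message) pairs for entries that were not fully applied.

    Hypothetical helper: it relies only on the `result.status` and
    `result.error_message` attributes used in the snippet above.
    """
    return [
        (entry.result.status, entry.result.error_message)
        for entry in redaction_log
        if entry.result.status != applied_status
    ]


# Stand-in log entries for illustration only. With the real library you would
# pass result.redaction_log and RedactionStatus.APPLIED instead.
fake_log = [
    SimpleNamespace(result=SimpleNamespace(status="APPLIED", error_message=None)),
    SimpleNamespace(
        result=SimpleNamespace(
            status="SKIPPED", error_message="not applicable to this format"
        )
    ),
]

problems = collect_problems(fake_log, "APPLIED")
for status, message in problems:
    print(f"Status is {status}, details: {message}")
```

Because the helper reports every entry that is not Applied, it can also be called when the overall status is not Failed, which may surface PartiallyApplied or Skipped entries that would otherwise go unnoticed.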

Apply multiple redactions

You can apply as many redactions as you need in a single call to the Redactor.Apply method, since its overloads accept an array of redactions or a redaction policy. In this case, redactions are applied in the same order as they appear in the array. As an alternative to specifying redaction sets in your code, you can create an XML file with a redaction policy.

from groupdocs.redaction import Redactor, RedactionStatus
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import (
    ExactPhraseRedaction,
    RegexRedaction,
    ReplacementOptions,
    DeleteAnnotationRedaction,
    EraseMetadataRedaction,
    MetadataFilters,
)
from groupdocs.pydrawing import Color

def apply_multiple_redactions():
    # Define the color of the redaction box
    color = Color.from_argb(255, 220, 20, 60)

    # Provide a list of redactions to apply in order
    redaction_list = [
        ExactPhraseRedaction("John Doe", ReplacementOptions("[Client]")),
        RegexRedaction("Redaction", ReplacementOptions("[Product]")),
        RegexRedaction(r"\d{2}\s*\d{2}[^\d]*\d{6}", ReplacementOptions(color)),
        DeleteAnnotationRedaction(),
        EraseMetadataRedaction(MetadataFilters.ALL),
    ]

    # Load the document to be redacted
    with Redactor("./sample.docx") as redactor:
        # Apply the list of redactions
        result = redactor.apply(redaction_list)

        if result.status != RedactionStatus.FAILED:
            # By default, the redacted document is saved in PDF format
            save_options = SaveOptions()
            save_options.add_suffix = True
            save_options.rasterize_to_pdf = True
            save_options.redacted_file_suffix = "redacted"
            redactor.save(save_options)
        else:
            # Dump all failed or skipped redactions
            print("Redaction failed!")
            for log_entry in result.redaction_log:
                if log_entry.result.status != RedactionStatus.APPLIED:
                    print(f"Status is {log_entry.result.status}, details: {log_entry.result.error_message}")

if __name__ == "__main__":
    apply_multiple_redactions()

sample.docx is the sample file used in this example.
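Before adding a RegexRedaction to the list, it can help to preview what its pattern matches. The sketch below uses Python's standard re module as a stand-in; this is an assumption made for illustration, since the library's own regular expression engine may differ in some details. The pattern is the one from the redaction list above, the sample string is made up, and preview_matches is a hypothetical helper name.

```python
import re

# Same pattern as in the redaction list above, written as a raw string.
PATTERN = re.compile(r"\d{2}\s*\d{2}[^\d]*\d{6}")


def preview_matches(text):
    """Return the substrings of `text` that the pattern matches."""
    return PATTERN.findall(text)


# Made-up sample text, for illustration only.
sample = "Policy 12 34-567890 was issued to John Doe; ref 1234567890."
print(preview_matches(sample))
```

The pattern accepts two digits, optional whitespace, two more digits, any run of non-digit characters, and then six digits, so both the spaced and the unspaced number in the sample are matched.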