
Redacting PDFs

Some documents contain confidential information that is not intended for everyone. On paper documents, you can use a black marker to redact this information. For digital documents, there are software functions that allow you to redact specific words and passages.

In this article, we will show you how to digitally redact documents and protect sensitive information.

What is document redaction?

When we speak about redacting a document, we mean making sensitive information undetectable. Typically, the affected texts, images, or information are covered with a black field so that they are no longer visible. For digital documents, it is also important to remove the underlying text layer.  

An alternative to redacting is pseudonymizing content. For example, personal names are replaced by placeholder names. This article does not cover that topic.

Why are documents redacted?

The purpose of redacting a document is to keep information secret, because not everyone may be authorized to read all of the contents. Redacting a document ensures that only those parts that do not contain secret data are readable.

In this sense, secret data is information that has to be treated confidentially and is only accessible to a specific group of people. For example, confidentiality agreements indicate what information in documents has to be concealed.  

Further examples include:  

  • Trade secrets  
  • Sensitive personal data such as health data  
  • Personal rights, e.g. for victims and perpetrators

Redacting documents: Here’s how it works

There are three ways to redact documents:  

1. Paper and pen  

You can redact paper documents using a felt-tip pen, permanent marker, or any other pen that has an opaque color. Typically, the redacted document is then scanned in order to digitize it. Sometimes, however, certain areas are still readable because the opacity is too low and the text shows through. You can solve this problem by printing the document again, blacking it out and scanning it again.  

2. Redacting image PDFs  

You can redact image files, such as a scanned document, using an image editing program. It is important that the file is an image-only format without any text layer. In this case, the black bar in the image editing program is enough to hide the desired image sections. Don't forget to save your redacted file.

3. Redacting text PDFs  

Redacting text PDF files is not as easy: you need a special program for this, such as Adobe Acrobat. However, and this is important, if you only cover the text in the PDF document with a black field using the free version of the PDF solution, you will still be sharing sensitive data. The text behind the black field remains in the file and can still be copied, so anyone can make it visible with just a few clicks.

The paid version of Adobe Acrobat redacts texts, and it also removes the text layer behind the black field. This ensures the content really cannot be detected.
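
The difference between covering text and actually removing it can be illustrated with a small model. The following Python sketch does not show how Adobe Acrobat or any PDF library works internally. It assumes, for illustration only, that a page is a list of text spans with bounding boxes; `Rect`, `TextSpan`, `Page`, `cover_only`, and `redact` are hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical page coordinates)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: Rect) -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class TextSpan:
    """A piece of text in the text layer together with its position."""
    text: str
    box: Rect


@dataclass
class Page:
    spans: list[TextSpan]
    black_boxes: list[Rect] = field(default_factory=list)

    def extract_text(self) -> str:
        """What copy and paste or a text extractor would return."""
        return " ".join(span.text for span in self.spans)


def cover_only(page: Page, area: Rect) -> None:
    """Draw a black field but leave the text layer untouched."""
    page.black_boxes.append(area)


def redact(page: Page, area: Rect) -> None:
    """Draw a black field and remove every text span underneath it."""
    page.black_boxes.append(area)
    page.spans = [s for s in page.spans if not s.box.intersects(area)]


def sample_page() -> Page:
    return Page(spans=[
        TextSpan("Salary:", Rect(10, 10, 60, 22)),
        TextSpan("85,000 EUR", Rect(65, 10, 140, 22)),
    ])


secret_area = Rect(64, 8, 142, 24)

covered = sample_page()
cover_only(covered, secret_area)

redacted = sample_page()
redact(redacted, secret_area)

print(covered.extract_text())   # the amount is still extractable
print(redacted.extract_text())  # only the unredacted text remains
```

In this model both pages look identical on screen, because both carry the same black field. Only the second one no longer contains the amount in its text layer.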

Redacting documents in the DMS

In addition to Adobe Acrobat, there are many other software solutions that enable users to redact documents – including a document management system (DMS).  

In a DMS, you can manage all of your documents centrally, from document creation to long-term archiving. A DMS is also always a digital archive that ideally supports audit-proof archiving. Companies that use a state-of-the-art DMS such as Doxis can also redact their documents directly in the system before sharing them.

Redaction in Doxis – many roads lead to black

Suppose you want to anonymize a document. In Doxis, you can use the “Redact” function for this. It lets you redact text manually, via search, or with AI support.

Hey Doxi, how does redaction work in Doxis?

  • Manually: Drag a black field over the area you want to redact.
  • Search-based: Combine the search function with the redaction function. Search for a specific term (for example, all entries for Sam Sample) and then click on “Redact found entries”.
  • AI-powered: Proper name recognition provides you with a grouped overview of, for example, all names, numbers, and organizations. With a single click, you can start redacting these elements.

AI-assisted search is particularly recommended for complex use cases, for example when you need to redact all persons or all monetary amounts in a document. In such cases, manual or search-based redaction would be too time-consuming.
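
As a rough illustration of the search-based and the grouped approach, the Python sketch below works on plain text rather than on PDFs. It does not use the Doxis API. `redact_term`, `group_candidates`, and the patterns are assumptions made for this example, and simple regular expressions stand in for the AI-based recognition, which they cannot replace.

```python
from __future__ import annotations

import re


def redact_term(text: str, term: str, mask: str = "X") -> tuple[str, int]:
    """Mask every occurrence of `term`, ignoring case.

    Returns the masked text and the number of occurrences found.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return pattern.subn(lambda m: mask * len(m.group()), text)


# Hypothetical stand-ins for entity recognition: one pattern per group.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "amount": re.compile(r"\b\d{1,3}(?:,\d{3})*(?:\.\d{2})?\s?(?:EUR|USD)\b"),
}


def group_candidates(text: str) -> dict[str, list[str]]:
    """Return a grouped overview of redaction candidates found in `text`."""
    return {
        label: sorted(set(pattern.findall(text)))
        for label, pattern in PATTERNS.items()
    }


sample = (
    "Sam Sample signed on behalf of ACME. "
    "Contact: sam.sample@example.com. Fee: 1,200.00 EUR. "
    "sam sample approved."
)

masked, hits = redact_term(sample, "Sam Sample")
overview = group_candidates(sample)

print(hits)
print(masked)
print(overview)
```

The search-based function only finds what you search for, which is why the email address stays untouched in `masked`. The grouped overview surfaces it as a separate candidate.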

Who sees the redacted or unredacted versions?

In Doxis, whether you see a document in the redacted or original version depends on your access rights. Doxis works with document representations. This means that a content object can hold a document, such as a contract, in multiple versions:

  1. Representation: The contract as a Word document
  2. Representation: The contract document as a PDF file

The PDF version is used, for example, to share the document with third parties. To redact information for outside parties, Doxis creates a third representation. You now have three versions of the same contract document:

  1. Representation: the still modifiable Word document
  2. Representation: the unredacted PDF version of the Word document
  3. Representation: the redacted PDF file

In principle, it is possible to switch between all three representations, but only for authorized employees. The individual representations are linked to authorizations that you store in Doxis. For example, the contracts department may access all three representations, internal employees may only view the unredacted PDF version, and third-party employees may only view the redacted version.
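
The link between representations and authorizations can be pictured as a simple lookup. The Python sketch below mirrors the example roles from the paragraph above. It is not the Doxis permission model; the role names, `Representation`, `visible_representations`, and `open_representation` are hypothetical, and unknown roles are assumed to see nothing.

```python
from __future__ import annotations

from enum import Enum


class Representation(Enum):
    WORD = "modifiable Word document"
    PDF_ORIGINAL = "unredacted PDF"
    PDF_REDACTED = "redacted PDF"


# Hypothetical authorizations, following the example in the text.
ACCESS: dict[str, frozenset[Representation]] = {
    "contracts_department": frozenset(Representation),
    "internal_employee": frozenset({Representation.PDF_ORIGINAL}),
    "third_party": frozenset({Representation.PDF_REDACTED}),
}


def visible_representations(role: str) -> frozenset[Representation]:
    """Representations a role may open; unknown roles get none."""
    return ACCESS.get(role, frozenset())


def open_representation(role: str, wanted: Representation) -> Representation:
    """Return the representation if the role is authorized, else raise."""
    if wanted not in visible_representations(role):
        raise PermissionError(f"{role} may not open the {wanted.value}")
    return wanted


for role in ("contracts_department", "internal_employee", "third_party", "guest"):
    names = sorted(rep.name for rep in visible_representations(role))
    print(role, names)
```

Denying access by default for roles that are not listed keeps a forgotten entry from exposing the unredacted version.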

Is only the data under the black field deleted, or everything?

From the outside, all redacted content looks the same. In the background, there are two ways to hide content in Doxis:

  • Doxis only cuts out the redacted text from the document. The rest of the text layer remains intact and is still full-text searchable for the people you share the document with.
  • Doxis transforms the document into an image page by page, overlaying the sensitive information with black fields (100% opacity). After that, there will be no text layer left in the document.


Redact documents in batches

Ideally, you redact documents in batches so that you don't have to go through document after document manually. You mark the documents you want to redact and Doxis starts the process. The system then automatically redacts the relevant parts of each document in the background without any further action on your part. This option uses AI functionality and is available for selected data points such as names, places, numbers, and email addresses.

However, since errors can creep in here as well, we advise you to check the results of particularly critical documents again before sending them to outside parties.
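
A batch run followed by a review step could look roughly like the Python sketch below. It is a simplified illustration on plain text, not the Doxis batch function: `redact_batch`, the file names, and both patterns are assumptions, and a regular expression for email addresses replaces the AI-based detection. A deliberately looser second pattern flags documents that may still contain an address the first pattern missed.

```python
from __future__ import annotations

import re

# Strict pattern used for the automatic redaction (assumption).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Looser pattern used only to flag documents for a manual check.
SUSPICIOUS = re.compile(r"@|\[at\]|\(at\)", re.IGNORECASE)


def redact_batch(documents: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Redact email addresses in all documents.

    Returns the redacted texts and the names of documents that should be
    checked manually before they are shared.
    """
    redacted: dict[str, str] = {}
    needs_review: list[str] = []
    for name, text in documents.items():
        result = EMAIL.sub("[REDACTED]", text)
        redacted[name] = result
        if SUSPICIOUS.search(result):
            needs_review.append(name)
    return redacted, needs_review


documents = {
    "offer.txt": "Reply to anna@example.com by Friday.",
    "memo.txt": "Write to bob [at] example.com.",
    "note.txt": "No contact data here.",
}

redacted_docs, review_list = redact_batch(documents)

for name, text in redacted_docs.items():
    print(name, "->", text)
print("check manually:", review_list)
```

Here the obfuscated address in `memo.txt` slips past the automatic step, which is exactly the kind of error a final check is meant to catch.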

Redaction protects confidential content

To redact passages in a document, use a permanent marker or digital tools such as Doxis that have a redact function. This function can help you make secret text passages undetectable for unauthorized third parties. It is important that redacted content can no longer be copied. Otherwise, this content is still readable. With Doxis you can manage files securely – whether redacted or in their original form. Everyone only sees the representation of a file to which they have access.

Redacting documents: FAQs

What is the purpose of redacting documents?
The purpose of redacting documents is to protect confidential information from unauthorized access. The redacted areas mean that people only see the information in the document that is shared with them.

How can I redact a document?
You can redact paper documents using an opaque pen that cannot be erased. Digitally, documents can be redacted using programs that remove the text layer and draw a black bar over the text.

Can redacted PDFs be restored?
Properly redacted content is usually not recoverable. Some software has a “remove redaction” function. Treat this with caution: if a document whose redaction can be removed has been shared directly, third parties can read the contents.
