This section of the ISM provides guidance on content filtering.

Content filtering techniques

Content filters reduce the likelihood of unauthorised or malicious content transiting a security domain boundary by assessing data based on defined security policies. The following techniques can assist with assessing the suitability of data to transit a security domain boundary.

Antivirus scan

Scans the data for viruses and other malicious code.

Automated dynamic analysis

Analyses email and web content in a sandbox before delivering it to users.

Data format check

Inspects data to ensure that it conforms to expected and permitted formats.

Data range check

Checks the data in each field to ensure that it falls within the expected and permitted ranges.

Data type check

Inspects each file header to determine the actual file type.

File extension check

Inspects the file name extension to determine the purported file type.
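Taken together, the data type check and the file extension check can verify that a file's purported type matches its actual type. The following is a minimal sketch in Python; the magic-byte table is illustrative only, not an exhaustive or authoritative signature list.

```python
# Sketch of a combined data type check and file extension check: the
# purported type (from the file name extension) must match the actual
# type (from the file header, or "magic bytes"). The signature table
# below is an illustrative assumption, not a complete list.
from typing import Optional

MAGIC_BYTES = {
    "pdf": b"%PDF",
    "png": b"\x89PNG\r\n\x1a\n",
    "zip": b"PK\x03\x04",  # also the container for DOCX/XLSX/PPTX
}

def purported_type(filename: str) -> str:
    """File extension check: the type the file name claims."""
    return filename.rsplit(".", 1)[-1].lower()

def actual_type(header: bytes) -> Optional[str]:
    """Data type check: the type implied by the file header."""
    for ftype, magic in MAGIC_BYTES.items():
        if header.startswith(magic):
            return ftype
    return None

def types_match(filename: str, header: bytes) -> bool:
    """Block the file when purported and actual types disagree."""
    return actual_type(header) == purported_type(filename)
```

For example, a file named report.pdf whose header begins with the DOS executable marker `MZ` would fail this check and be blocked.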

Keyword search

Searches data for keywords or ‘dirty words’ that could indicate the presence of inappropriate or undesirable material.

Metadata check

Inspects files for metadata that should be removed prior to release.

Protective marking check

Validates the protective marking of the data to ensure that it is correct.

Manual inspection

Manually inspects data for suspicious content that an automated system could miss; this is particularly important for the transfer of multimedia or content-rich files.

Verification against file specification

Verifies that the file conforms to the defined file specification and can be effectively processed by subsequent content filters.

Content filtering

Implementing an effective content filter which cannot be bypassed reduces the likelihood of malicious content successfully passing into a security domain. Content filtering is only effective when suitable components are selected and appropriately configured with consideration of an organisation’s business processes and threat environment.

When content filters are protecting classified environments as a component of a CDS, their assurance requirements necessitate rigorous security testing.

Security Control: 0659; Revision: 4; Updated: Sep-18; Applicability: O, P, S, TS
When importing data into a security domain, by any means including a CDS, the data is filtered by a content filter designed for that purpose.

Security Control: 1524; Revision: 1; Updated: Dec-19; Applicability: S, TS
Content filters deployed in a CDS are subject to rigorous security assessment to ensure they mitigate content-based threats and cannot be bypassed.

Active, malicious and suspicious content

Many file types are executable and potentially harmful if executed by a user. Many file type specifications allow active content to be embedded in the file, which increases the attack surface. The definition of suspicious content will depend on the system’s security risk profile and what is considered to be normal system behaviour.

Security Control: 0651; Revision: 4; Updated: Sep-18; Applicability: O, P, S, TS
All suspicious, malicious and active content is blocked from entering a security domain.

Security Control: 0652; Revision: 2; Updated: Sep-18; Applicability: O, P, S, TS
Any data identified by a content filtering process as suspicious is blocked until reviewed and approved for transfer by a trusted source other than the originator.

Automated dynamic analysis

Analysing email and web content in a sandbox is a highly effective strategy to detect suspicious behaviour including network traffic, new or modified files, or other configuration changes.

Security Control: 1389; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Email and web content entering a security domain is automatically run in a dynamic malware analysis sandbox to detect suspicious behaviour.

Content validation

Content validation aims to ensure that the content received conforms to an approved standard. For example, content validation can be used to identify malformed content, thereby allowing potentially malicious content to be blocked.

Examples of content validation include:

  • ensuring numeric fields contain only numeric data
  • ensuring content falls within acceptable length boundaries
  • validating Extensible Markup Language (XML) documents against a strictly defined XML schema.
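The checks above can be sketched as follows. This is a minimal illustration assuming a hypothetical record format with a numeric "quantity" field and a bounded "comment" field; the range and length limits are placeholders, and full schema validation would require a schema-aware library rather than the well-formedness check shown here.

```python
# Sketch of content validation checks. The field names and limits are
# illustrative assumptions, not part of any defined standard.
import xml.etree.ElementTree as ET

def validate_quantity(value: str) -> bool:
    """Numeric field must contain only digits and fall within range."""
    return value.isdigit() and 0 <= int(value) <= 10_000

def validate_comment(value: str) -> bool:
    """Free-text field must fall within acceptable length boundaries."""
    return len(value) <= 256

def validate_xml(data: bytes) -> bool:
    """Malformed XML is blocked. (Validation against a strictly defined
    XML schema would need an external library; well-formedness is
    checked here as a minimal stand-in.)"""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False
```

Content failing any such check would be blocked rather than passed through for repair.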

Security Control: 1284; Revision: 2; Updated: Oct-19; Applicability: O, P, S, TS
Content validation is performed on all data passing through a content filter with content which fails content validation blocked.

Content conversion and transformation

Content conversion or transformation can be an effective method to render potentially malicious content harmless by separating the presentation format from the data. By converting a file to another format, the exploit, active content and/or payload can be removed or disrupted.

Examples of content conversion and transformation to mitigate the threat of content exploitation include:

  • converting a Microsoft Word document to a Portable Document Format (PDF) file
  • converting a Microsoft PowerPoint presentation to a series of image files
  • converting a Microsoft Excel spreadsheet to a comma-separated values file
  • converting a PDF document to a plain text file.

Some file types, such as XML, will not benefit from conversion. Applying the conversion process to any attachments or files contained within other files (e.g. archive files or encoded files embedded in XML) can increase the effectiveness of a content filter.
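In practice, conversion is often delegated to an external renderer running under strict resource limits. The sketch below builds a LibreOffice headless-conversion command for the first example above (Word document to PDF); the `soffice` binary and its flags are an assumption about the deployment environment, and any assessed converter could be substituted.

```python
# Sketch: build the command line for converting an Office document to
# PDF using LibreOffice in headless mode. The soffice path and flags
# are an assumption about the local environment; the command is
# constructed but deliberately not executed here.
from pathlib import Path

def conversion_command(src: Path, outdir: Path) -> list[str]:
    """Command to render an Office document as PDF, separating the
    presentation format from any embedded active content."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(src),
    ]
```

The command would then be run with, for example, `subprocess.run(cmd, check=True, timeout=60)` so that a hostile input cannot stall the filter indefinitely.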

Security Control: 1286; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Content conversion is performed for all ingress or egress data transiting a security domain boundary.

Content sanitisation

Sanitisation is the process of attempting to make potentially malicious content safe to use by removing or altering active content while leaving the original content as intact as possible. Sanitisation is not as secure a method of content filtering as conversion, though the two techniques may be combined. Inspecting and filtering extraneous application and protocol data, including metadata, will assist in mitigating the threat of content exploitation. Examples include:

  • removal of document property information in Microsoft Office documents
  • removal or renaming of JavaScript sections from PDF files
  • removal of metadata from within image files.
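The first example above can be sketched with nothing more than the standard library: a modern Office document (.docx, .xlsx, .pptx) is a ZIP container whose docProps/ entries hold the author, title and similar properties. This is a minimal illustration only; thorough sanitisation would also need to consider custom XML parts and other metadata locations.

```python
# Sketch: remove document property metadata from an Office Open XML
# file by rewriting the ZIP container without its docProps/ entries,
# leaving the remaining content otherwise intact.
import io
import zipfile

def strip_docprops(data: bytes) -> bytes:
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename.startswith("docProps/"):
                continue  # drop document property parts
            dst.writestr(item, src.read(item.filename))
    return out.getvalue()
```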

Security Control: 1287; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Content sanitisation is performed on suitable file types if content conversion is not appropriate for data transiting a security domain boundary.

Antivirus scanning

Antivirus scanning is used to prevent, detect and remove malicious code, including computer viruses, worms, Trojans, spyware and adware.

Security Control: 1288; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Antivirus scanning, using multiple different scanning engines, is performed on all content.

Archive and container files

Archive and container files can be used to bypass content filtering processes if the content filter does not handle the file type and embedded content correctly. If the content filtering process recognises archive and container files, the embedded files they contain can be subjected to the same content filtering measures as un-archived files.

Archive files can be constructed in a manner that poses a denial-of-service risk through processor, memory or disk space exhaustion. To limit the likelihood of such an attack, content filters can specify resource constraints or quotas when extracting these files. If these constraints are exceeded, the inspection is terminated, the content blocked and a security administrator alerted.
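Such quotas can be checked against an archive's declared contents before any extraction begins. The sketch below illustrates this for ZIP files; the quota values are placeholders to be tuned to the environment, and because declared sizes can lie, the extraction step itself should also be capped.

```python
# Sketch: controlled inspection of a ZIP archive with resource quotas,
# to limit the denial-of-service risk from archive bombs. The quota
# values are illustrative assumptions.
import io
import zipfile

MAX_ENTRIES = 1_000
MAX_TOTAL_UNCOMPRESSED = 100 * 1024 * 1024  # 100 MiB
MAX_COMPRESSION_RATIO = 100                 # uncompressed : compressed

def archive_within_quota(data: bytes) -> bool:
    """True if the archive's declared contents fit within the quotas;
    a False result means the content is blocked and an alert raised."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_ENTRIES:
            return False
        if sum(i.file_size for i in infos) > MAX_TOTAL_UNCOMPRESSED:
            return False
        for i in infos:
            if i.compress_size and \
                    i.file_size / i.compress_size > MAX_COMPRESSION_RATIO:
                return False
    return True
```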

Security Control: 1289; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
The contents from archive/container files are extracted and subjected to content filter checks.

Security Control: 1290; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Controlled inspection of archive/container files is performed to ensure that content filter performance or availability is not adversely affected.

Security Control: 1291; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
Files that cannot be inspected are blocked and generate an alert or notification.

Allowing access to specific content types

Creating and enforcing a list of allowed content types, based on business requirements and the results of a risk assessment, is a strong content filtering method that can reduce the attack surface of a system. As a simple example, an email content filter might only allow Microsoft Office documents and PDF files.
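That simple example can be sketched as a default-deny allow list. The set of permitted types below mirrors the example above and is illustrative; in practice the type being checked should come from a data type check on the file header, not merely the file name extension.

```python
# Sketch: enforcing a list of allowed content types. The allow list
# (Office documents and PDF, per the example above) is an illustrative
# assumption; everything not on the list is blocked by default.
ALLOWED_TYPES = {"doc", "docx", "xls", "xlsx", "ppt", "pptx", "pdf"}

def content_type_allowed(filetype: str) -> bool:
    """True only for types on the allow list."""
    return filetype.lower() in ALLOWED_TYPES
```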

Security Control: 0649; Revision: 7; Updated: Apr-20; Applicability: O, P, S, TS
A list of allowed content types is implemented.

Data integrity

Ensuring the authenticity and integrity of content reaching a security domain is a key component in ensuring its trustworthiness. It is also essential that content that has been authorised for release from a security domain is not modified (e.g. by the addition or substitution of information). If content passing through a filter contains a form of integrity protection, such as a digital signature, the content filter needs to verify the content’s integrity before allowing it through. If the content fails these integrity checks it may have been spoofed or tampered with and should be dropped.

Examples of data integrity checks include:

  • an email server or content filter verifying an email protected by DomainKeys Identified Mail
  • a web service verifying the XML digital signature contained within a Simple Object Access Protocol request
  • validating a file against a separately supplied hash
  • checking that data to be exported from a security domain has been digitally signed by a release authority.
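The third example above, validating a file against a separately supplied hash, can be sketched with the standard library. SHA-256 is assumed here, and the expected digest is assumed to arrive over a separate, trusted channel.

```python
# Sketch: validating content against a separately supplied SHA-256
# digest. Content failing the check may have been tampered with and
# is blocked.
import hashlib
import hmac

def integrity_check_passes(data: bytes, expected_sha256_hex: str) -> bool:
    digest = hashlib.sha256(data).hexdigest()
    # constant-time comparison, as is conventional for digest checks
    return hmac.compare_digest(digest, expected_sha256_hex.lower())
```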

Security Control: 1292; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
The integrity of content is verified where applicable and blocked if verification fails.

Security Control: 0677; Revision: 4; Updated: Sep-18; Applicability: S, TS
If data is signed, the signature is validated before the data is exported.

Encrypted data

Encryption can be used to bypass content filtering if encrypted content cannot be subject to the same checks performed on unencrypted content. Organisations should consider the need to decrypt content, depending on the security domain they are communicating with and on whether the need-to-know principle must be enforced.

Choosing not to decrypt content poses a security risk that malicious code’s encrypted communications and data could move between security domains. In addition, encryption could mask information of a higher classification passing into a security domain of a lower classification, which could result in a data spill.

Where a business need exists to preserve the confidentiality of encrypted data, an organisation may consider a dedicated system that allows encrypted content passing through external, boundary or perimeter controls to be decrypted in an appropriately secure environment. In that case, the content should be subject to all applicable content filtering controls after it has been decrypted.

Security Control: 1293; Revision: 1; Updated: Sep-18; Applicability: O, P, S, TS
All encrypted content, traffic and data is decrypted and inspected to allow content filtering.