Body Transform

Syntax

":raw" / ":content" <content-types: string-list> / ":text"

Description

Prior to matching content in a message body, "transformations" can be applied that filter and decode certain parts of the body. These transformations are selected by a "BODY-TRANSFORM" keyword parameter.

The default transformation is :text.

The ":raw" transform matches against the entire undecoded body of a message as a single item.

If the specified body-transform is ":raw", the [MIME] structure of the body is irrelevant. The implementation MUST NOT remove any transfer encoding from the message, MUST NOT refuse to filter messages with syntactic errors (unless the environment it is part of rejects them outright), and MUST treat multipart boundaries or the MIME headers of enclosed body parts as part of the content being matched against, instead of MIME structures to interpret.

If the body transform is ":content", the MIME parts that have the specified content types are matched against independently.

If an individual content type begins or ends with a '/' (slash) or contains multiple slashes, then it matches no content types. Otherwise, if it contains a slash, then it specifies a full <type>/<subtype> pair, and matches only that specific content type. If it is the empty string, all MIME content types are matched. Otherwise, it specifies a <type> only, and any subtype of that type matches it.

The search for MIME parts matching the :content specification is recursive and automatically descends into multipart and message/rfc822 MIME parts. All MIME parts with matching types are searched for the key strings. The test returns true if any combination of a searched MIME part and key-list argument match.

If the :content specification matches a multipart MIME part, only the prologue and epilogue sections of the part will be searched for the key strings, treating the entire prologue and the entire epilogue as separate strings; the contents of nested parts are only searched if their respective types match the :content specification.

If the :content specification matches a message/rfc822 MIME part, only the header of the nested message will be searched for the key strings, treating the header as a single string; the contents of the nested message body parts are only searched if their content type matches the :content specification.

For other MIME types, the entire part will be searched as a single string.

(Matches against container types with an empty match string can be useful as tests for the existence of such parts.)

The ":text" body transform matches against the results of an implementation's best effort at extracting UTF-8 encoded text from a message.

It is unspecified whether this transformation results in a single string or multiple strings being matched against. All the text extracted from a given non-container MIME part MUST be in the same string.

In simple implementations, :text MAY be treated the same as :content "text".

Sophisticated implementations MAY strip mark-up from the text prior to matching, and MAY convert media types other than text to text prior to matching.

(For example, they may be able to convert proprietary text editor formats to text or apply optical character recognition algorithms to image data.)

Module

Body : Requires import of the "body" capability

Resources

RFC5173