textmail - mail filter to replace MS Word/HTML attachments with plain text
usage: textmail [options]
options:
  -h       - Print the help message then exit
  -m       - Print the manpage then exit
  -w       - Print the manpage in html format then exit
  -r       - Print the manpage in nroff format then exit
  -M       - Output in mailbox format (mboxrd)
  -T       - Output in raw mail format (for smtp)
  -W       - Don't replace MS Word attachments with text
  -E       - Don't replace MS Excel attachments with csv
  -H       - Don't replace HTML attachments with text
  -R       - Don't replace RTF attachments with text
  -P       - Don't replace PDF attachments with text
  -U       - Don't translate winmail.dat attachments
  -L       - Don't reduce appledouble attachments
  -I       - Don't delete image attachments
  -A       - Don't delete audio attachments
  -V       - Don't delete video attachments
  -X       - Don't delete MS Windows executable attachments
  -B       - Don't recode text that was base64-encoded
  -S       - Don't replace spaces in filenames with underscores
  -Z       - Do translate signed content (discards signatures)
  -O       - Delete all application/octet-stream attachments
  -!       - Delete all application/* attachments
  -D hdrs  - Delete headers (list of header prefixes and filenames)
  -K types - Keep attachments (list of mimetypes and filenames)
  -f       - On translation error, keep translation, not original
  -?       - Print paths of helper applications then exit
textmail filters a mail message or mbox, replacing MS Word, MS
Excel, HTML, RTF and PDF attachments with the plain text contained therein.
By default, the following attachments are also deleted: image, audio, video
and MS Windows executables. MS winmail.dat attachments are replaced by
any attachments contained therein which are then replaced by text or deleted
in the same fashion. Any of these actions can be suppressed with the command
line options. Mail headers can also be selectively deleted.
This is useful for increasing the accessibility of mail messages (by reducing their dependence on proprietary file formats), for dramatically reducing their size (and the time it takes to download and read them), and for dramatically reducing the risk of mail-borne viruses. Its intended use is as a preprocessor for mailing lists. This is friendlier than a strict "No Attachments" policy.
-h
Print the help message then exit.
-m
Print the manpage then exit. This is like running man textmail,
but this works even when the manpage isn't installed.
-w
Print the manpage in html format then exit. For example:
mkdir -p /usr/local/share/doc/textmail/html && textmail -w > /usr/local/share/doc/textmail/html/textmail.1.html
-r
Print the manpage in nroff format then exit. For example:
textmail -r > /usr/local/share/man/man1/textmail.1
-M
Output in mailbox format (mboxrd). This adds a mailbox
From line at the top if there isn't one already and ensures that there is
a blank line at the bottom of the output. It also performs mailbox quoting
on any lines in the body that look like mailbox From headers. Use this
when the output is to be stored directly in a mailbox file. It is not
necessary when textmail is being used as a mail filter by procmail(1).
-T
Output in raw mail format (for SMTP). This is done by omitting the mailbox
From line and by not performing mailbox quoting. Use this when
the output is to be sent directly to an SMTP server. It is not necessary
when textmail is being used as a mail filter by procmail(1).
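-W
Don't replace MS Word attachments with text.
-E
Don't replace MS Excel attachments with csv.
-H
Don't replace HTML attachments with text.
-R
Don't replace RTF attachments with text.
-P
Don't replace PDF attachments with text.
-U
Don't translate winmail.dat attachments. By default, textmail replaces
winmail.dat attachments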
with the attachments contained therein which are then translated to text as
normal. This option leaves winmail.dat attachments intact. This option,
together with the -! option will cause winmail.dat attachments to be
deleted rather than translated.
-L
Don't reduce appledouble attachments. By default, textmail replaces
multipart/appledouble attachments with
just the data fork attachment contained therein which is then translated to
text as normal. This option leaves appledouble attachments intact. However,
the data fork attachment will still be translated as normal resulting in a
probably inappropriate and possibly broken resource fork attachment.
Therefore, this option should probably only be used in conjunction with
other options that suppress the translation of the data fork attachment.
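-I
Don't delete image attachments.
-A
Don't delete audio attachments.
-V
Don't delete video attachments.
-X
Don't delete MS Windows executable attachments. By default, textmail
deletes application/octet-stream attachments with the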
following filename extensions: com, exe, pif, dll, ocx,
scr, vbs and js. This option leaves MS Windows executable
attachments intact. To delete zip files as well, you could use either the
-O option or the -! option.
-B
Don't recode text that was base64-encoded. By default, when text is
base64-encoded, textmail
will recode it as either 7bit or quoted-printable, whichever is
appropriate. This option suppresses this recoding. Note that if the text is
large enough and contains a high enough proportion of non-ASCII characters,
it will remain base64-encoded to minimise space.
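-S
Don't replace spaces in filenames with underscores.
-Z
Do translate signed content (discards signatures). By default, textmail
does not translate multipart/signed attachments.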
This option causes multipart/signed attachments to be replaced by the
signed attachment contained therein, discarding the signature control data.
The no-longer-signed data is then translated to text as normal. Note that
multipart/encrypted attachments are never translated.
-O
Delete all application/octet-stream attachments, not just MS Windows
executables. Note that this overrides -X but -K overrides this.
-!
Delete all application/* attachments. Note that this overrides -X but
-K overrides this. Also note that translated documents are no longer
application/* attachments so they aren't deleted unless their translation
is suppressed with the appropriate command line option.
-D hdrs
Delete headers (list of header prefixes and filenames). For example,
textmail -DX- deletes all headers whose names begin with X-.
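-K types
Keep attachments (list of mimetypes and filenames). By default, textmail
deletes some attachments, and the
-O and -! options delete even more. This option specifies, by mimetype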
and/or filename extension, a list of attachments not to delete. This
overrides all deletions.
The types argument is a comma separated list of mimetypes and/or filename
extensions and/or the names of files containing mimetypes and/or filename
extensions (blank lines, whitespace and shell style comments are ignored).
Note that the elements are interpreted as a complete mimetype, if they
contain a slash character, or as either the * in application/* or as a
filename extension if they do not contain a slash character. For example,
textmail -Wf!Kdoc deletes all application/* attachments except MS Word
documents.
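-f
On translation error, keep the translation, not the original. By default,
the original attachment is kept when translation fails, for example when
winmail.dat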
attachments are corrupt. This option causes the empty translation to take
the place of the original attachment. Only the name of the attachment is
preserved. This is needed to ensure plain text even in the face of an MS
Word document that contains no text (e.g. only images).
-?
Print paths of helper applications then exit.
A procmail(1) recipe that insists on pure text and no X- headers (with
output in mailbox format):
:0 fw
| textmail -Mf!DX-
Do the same but to an existing mailbox file:
textmail -Mf!DX- < mailbox > mailbox-as-text
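The mailbox quoting that -M performs can be illustrated with a small shell sketch. It assumes the usual mboxrd convention, in which any body line made of zero or more > characters followed by "From " gets one extra > prepended. The function name mboxrd_quote is hypothetical and this is not textmail's own code.

```shell
# Hypothetical helper: apply mboxrd-style quoting to the text on stdin.
# A line starting with zero or more '>' followed by "From " gets one more '>'.
mboxrd_quote() {
    sed 's/^\(>*From \)/>\1/'
}

# Example: the second and third lines are quoted, the first is left alone.
printf 'Hello\nFrom here on it gets odd\n>From already quoted\n' | mboxrd_quote
```

A real mail filter would apply this only to body lines, not to headers or to the mailbox From line itself.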
Delete all application/* attachments except for PostScript and PDF (and
don't translate PDF into text):
textmail -!PKps,pdf
Delete all application/* attachments except for zip files and gzipped tar
files:
textmail -!Ktar.gz,zip
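The way -K list elements are read (a slash means a complete mimetype, no slash means either the * in application/* or a filename extension) can be sketched in shell. This only illustrates the rule stated for the -K option. The function name describe_keep_list is hypothetical, and the sketch ignores the case where an element names a file containing types.

```shell
# Hypothetical helper: say how each element of a -K list would be read,
# following the rule in the -K description. Not textmail's own code.
describe_keep_list() {
    old_ifs=$IFS
    IFS=,
    set -f                      # disable globbing while splitting on commas
    for item in $1; do
        case $item in
            */*) printf '%s: mimetype\n' "$item" ;;
            *)   printf '%s: application/%s or filename extension .%s\n' \
                     "$item" "$item" "$item" ;;
        esac
    done
    set +f
    IFS=$old_ifs
}

describe_keep_list 'ps,pdf,image/png'
```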
A procmail(1) recipe that just unpacks winmail.dat attachments but doesn't translate the attachments contained therein into text and doesn't delete Windows executables (with output in mailbox format):
:0 fw
| textmail -MWEHRPLIAVXS
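For reference, the filename test behind the default deletion of MS Windows executables (the behaviour that -X suppresses) might look like the following sketch. It checks only the extension list given for -X. The function name is hypothetical, matching case-insensitively is an assumption, and the real filter also considers the attachment's mimetype.

```shell
# Hypothetical helper: succeed if a filename has one of the extensions
# listed for -X (com, exe, pif, dll, ocx, scr, vbs, js).
is_windows_executable_name() {
    # Lower-case the name first; case-insensitive matching is an assumption.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $name in
        *.com|*.exe|*.pif|*.dll|*.ocx|*.scr|*.vbs|*.js) return 0 ;;
        *) return 1 ;;
    esac
}

for f in report.doc Setup.EXE script.js; do
    if is_windows_executable_name "$f"; then
        echo "$f: matches the executable extension list"
    else
        echo "$f: not matched"
    fi
done
```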
MS Word and RTF documents are translated into plain text using
antiword(1) or catdoc(1). If textmail can't find antiword(1) or
catdoc(1), then MS Word and RTF attachments are left intact. So make sure
that antiword(1) or catdoc(1) is installed and in the $PATH.
MS Excel documents are translated into csv files using xls2csv(1). If
textmail can't find xls2csv(1), then MS Excel attachments are left
intact. So make sure that xls2csv(1) is installed and in the $PATH.
HTML documents are translated into plain text using lynx(1). If
textmail can't find lynx(1), then HTML attachments are left intact. So
make sure that lynx(1) is installed and in the $PATH.
PDF documents are translated into plain text using pdftotext(1). If
textmail can't find pdftotext(1), then PDF attachments are left
intact. So make sure that pdftotext(1) is installed and in the $PATH.
textmail also requires perl(1), pod2man(1) and pod2html(1) (which come with perl(1)), and mktemp(1).
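Before deploying the filter, it may help to check which helper programs are actually on the $PATH. The sketch below does this with the shell's command -v; the function name check_helpers is hypothetical. When textmail itself is installed, textmail -? prints the helper paths it found.

```shell
# Hypothetical helper: report which of the named programs are on $PATH.
# Returns non-zero if any of them is missing.
check_helpers() {
    missing=0
    for helper in "$@"; do
        if path=$(command -v "$helper" 2>/dev/null); then
            printf 'found:   %s (%s)\n' "$helper" "$path"
        else
            printf 'missing: %s\n' "$helper"
            missing=1
        fi
    done
    return "$missing"
}

check_helpers antiword catdoc xls2csv lynx pdftotext perl pod2man pod2html mktemp ||
    echo 'some helpers are missing'
```

Only one of antiword(1) and catdoc(1) is needed for MS Word and RTF translation, so a single "missing" line for one of them is not necessarily a problem.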
If textmail fails to create a temporary directory, or if it is instructed
to do nothing (i.e. -WEHRPULIAVX), then it degenerates into cat(1).
The latest version of xls2csv(1) at the time of writing (i.e. catdoc-0.93.3) loses data.
If textmail is unable to create a temporary directory (in /tmp), then
it degenerates into cat(1). Without a temporary directory, no attachments
will be translated or deleted no matter what options (even -f) were given
to textmail. So make sure that /tmp is writable. Also make sure that
mktemp(1) is available; otherwise an insecure temporary directory will be
created.
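The fallback described above (no temporary directory means plain pass-through) can be sketched as a shell wrapper. This is an illustration under stated assumptions, not textmail's implementation: the names with_tempdir_or_cat and demo_filter and the directory template are hypothetical.

```shell
# Hypothetical wrapper: run a filter with a private temp dir if one can be
# created with mktemp(1); otherwise copy stdin to stdout like cat(1).
with_tempdir_or_cat() {
    if tmpdir=$(mktemp -d /tmp/textmail-sketch.XXXXXX 2>/dev/null); then
        status=0
        "$@" "$tmpdir" || status=$?
        rm -rf "$tmpdir"        # always clean up the temp dir
        return "$status"
    else
        cat                     # degenerate case: pass the message through
    fi
}

# Stand-in filter: stash stdin in the temp dir ($1), then emit it upper-cased.
demo_filter() {
    cat > "$1/message"
    tr '[:lower:]' '[:upper:]' < "$1/message"
}

printf 'hello\n' | with_tempdir_or_cat demo_filter
```

Using mktemp -d means the directory is created atomically with a name that is hard to predict, which is the property the warning about insecure temporary directories is concerned with.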
procmail(1),
antiword(1),
catdoc(1),
xls2csv(1),
lynx(1),
pdftotext(1),
pod2man(1),
pod2html(1),
http://raf.org/minimail/
20070803 raf <raf@raf.org>
http://raf.org/textmail/