[uug] elinks and character sets
Andrew McNabb
amcnabb at mcnabbs.org
Sat Apr 18 14:55:55 MDT 2009
I use Mutt for email, and I've been bugged for a long time by email
messages that have lots of � characters in them. I've seen this
especially frequently in messages from Gmail, where there's one in every
sentence. Anyway, I've finally tracked down the problem and found a
solution, so I'm posting it here for posterity.
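It turns out that Gmail encodes its messages with the ISO-8859-1
character set. Some people. Anyway, Mutt sends the HTML messages to
elinks to dump to text, and elinks assumes that the character set is
Unicode (presumably UTF-8) instead of using the one the message
declares. A non-breaking space (nbsp) is the single byte 0xA0 in
ISO-8859-1, and that byte is not valid on its own in UTF-8, so all of
the nbsp characters were getting converted to �. You can get Mutt to
pass the charset to elinks with the following mailcap entry, where
%{charset} is filled in from the message's Content-Type header: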
text/html; elinks -dump -eval 'set document.codepage.assume = "%{charset}"' %s; copiousoutput; nametemplate=%s.html
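
For anyone who wants to see the mismatch in isolation, here is a minimal
Python sketch. It is not what elinks does internally; it only assumes
that the wrongly assumed charset is UTF-8 and that invalid bytes are
replaced with U+FFFD. The sample string and the decode() helper are made
up for illustration.

```python
# Sample text containing a non-breaking space (U+00A0), encoded the way
# the message body is: as ISO-8859-1, where nbsp is the single byte 0xA0.
raw = "Hello,\u00a0world".encode("iso-8859-1")


def decode(data: bytes, charset: str) -> str:
    """Decode bytes, substituting U+FFFD for anything invalid in charset."""
    return data.decode(charset, errors="replace")


# Assumption for illustration: the reader guesses UTF-8. A lone 0xA0 is
# not a valid UTF-8 sequence, so it becomes the replacement character.
wrong = decode(raw, "utf-8")

# With the declared charset, the nbsp survives intact.
right = decode(raw, "iso-8859-1")

print(repr(wrong))
print(repr(right))
```

The mailcap entry above has the same effect as the second call: the
reader is told the real charset rather than left to guess.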
Now if only the Gmail people would stop killing kittens and switch to
UTF-8.
--
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868