formataddr() and unicode
I often see code like this:
message["To"] = formataddr((name, email))
This looks like it should work, especially since the docstring of formataddr() says that it will return a string value suitable for a To or Cc header. However, while it works most of the time, it fails if name is a unicode string containing non-ASCII characters. It may look OK if you look at message["To"], but as soon as you convert the message or header to a byte string, you will see the problem.
    >>> from email.Message import Message
    >>> from email.Utils import formataddr
    >>> msg = Message()
    >>> msg["To"] = formataddr((u"Björn", "email@example.com"))
    >>> msg["To"]
    u'Bj\xf6rn <email@example.com>'
    >>> msg.as_string()
    'To: =?utf-8?b?QmrDtnJuIDxlbWFpbEBleGFtcGxlLmNvbT4=?=\n\n'
Most code that uses the To address in this example will fail, since there is no visible e-mail address in it: the whole value, angle brackets included, has been encoded as a single encoded word. The header should look like this, with only the name itself encoded:
To: =?utf-8?b?QmrDtnJu?= <email@example.com>
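To see why this form is the one you want, here is a small sketch that takes the header value apart again with the standard library's parseaddr() and decode_header(). The variable names are just for illustration, and I am assuming the lowercase module names (email.utils, email.header), which should work on both Python 2.7 and Python 3.

```python
from email.header import decode_header
from email.utils import parseaddr

# The desired header value: only the display name is an encoded word.
header_value = "=?utf-8?b?QmrDtnJu?= <email@example.com>"

# parseaddr() splits the value into (display name, address).
# The address is plain text, so it can be read without any decoding.
encoded_name, address = parseaddr(header_value)

# decode_header() returns (raw bytes, charset) pairs for encoded words.
raw_name, charset = decode_header(encoded_name)[0]
name = raw_name.decode(charset)

print(address)     # email@example.com
print(repr(name))  # the decoded name, B-j-o-umlaut-r-n
```

With the fully encoded header from the session above, the same parseaddr() call has no angle-bracket address to find, which is the failure described here.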
I wish Python would handle this better. I usually end up writing a helper function like this for projects I work on:
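    from email.Header import Header
    from email.Utils import formataddr

    def format_address(name, email):
        email = str(email)
        if not name:
            return email
        # Encode only the display name, so the address stays readable.
        name = str(Header(name))
        return formataddr((name, email))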
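As a rough sketch of how this behaves, here is a variant that uses the lowercase module names and Header.encode() instead of str(), which as far as I can tell gives the same result on Python 2.7 and Python 3. The name format_address_portable is made up for this example; it is not part of the standard library.

```python
from email.header import Header
from email.utils import formataddr


def format_address_portable(name, email):
    """Hypothetical variant of format_address(): encode only the name."""
    if not name:
        return email
    # Header() falls back to UTF-8 when the name is not plain ASCII;
    # encode() returns the RFC 2047 form as an ASCII-only string.
    encoded_name = Header(name).encode()
    return formataddr((encoded_name, email))


result = format_address_portable(u"Bj\xf6rn", "email@example.com")
print(result)  # =?utf-8?b?QmrDtnJu?= <email@example.com>

# A plain ASCII name passes through unencoded.
print(format_address_portable("John", "john@example.com"))
```

The address part never goes through Header, so it stays readable for any code that looks for an e-mail address in the header.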