
Fixing Python Unicode Errors

Python can be awkward when handling Unicode, apparently in part for backwards-compatibility reasons (see https://en.wikipedia.org/wiki/History_of_Python).

One common error looks like this:

Traceback (most recent call last):
  File "D:\Software\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 777-778: character maps to <undefined>
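
The error can be reproduced without a Windows console by encoding to cp1252 explicitly, which is roughly what happens behind the scenes when that is the default encoding. This is only a minimal sketch; the helper name `try_encode` is hypothetical and exists purely for illustration.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc

russia = '\u0420\u043e\u0441\u0441\u0438\u044f'  # the Russian word for "Russia"

# cp1252 has no Cyrillic letters, so this attempt fails.
result_cp1252 = try_encode(russia, 'cp1252')

# UTF-8 can represent every Unicode character, so this attempt succeeds.
result_utf8 = try_encode(russia, 'utf-8')

print(type(result_cp1252).__name__)
print(len(result_utf8))
```

The same string encodes fine as UTF-8, which suggests the problem is the target encoding rather than the text itself.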

If this occurs while you are handling the response to an HTTP request, decode the response bytes explicitly before parsing them:

import json
from urllib.request import urlopen

httpresponse = urlopen(url).read().decode('utf8')
response = json.loads(httpresponse)
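
The decode-then-parse step can be tried without a network connection. The sketch below assumes the body is JSON encoded as UTF-8; the helper `parse_json_bytes` and the sample payload are hypothetical stand-ins for what `urlopen(url).read()` would return.

```python
import json

def parse_json_bytes(raw, charset='utf-8'):
    """Decode raw response bytes explicitly, then parse the text as JSON."""
    return json.loads(raw.decode(charset))

# Stand-in for urlopen(url).read(): the UTF-8 bytes of a small JSON document.
raw = '{"country": "\u0420\u043e\u0441\u0441\u0438\u044f"}'.encode('utf-8')

data = parse_json_bytes(raw)
print(sorted(data.keys()))
```

If the server declares a different charset, it is usually available from `response.headers.get_content_charset()`, which may return None when no charset is declared.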

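If this occurs while you are reading a file with Beautiful Soup, open the file in binary mode (the 'rb' argument) so that Beautiful Soup receives raw bytes instead of text decoded with the platform default encoding: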

from bs4 import BeautifulSoup

soup = BeautifulSoup(open(file, 'rb'), 'html.parser')
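
To see what binary mode changes, here is a standard-library-only sketch that does not use Beautiful Soup at all. It assumes the file is UTF-8; the file name `page.html` and the sample markup are made up for the example.

```python
import os
import tempfile

html = '<p>\u0420\u043e\u0441\u0441\u0438\u044f</p>'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'page.html')

    # Write the sample page as UTF-8 bytes.
    with open(path, 'wb') as f:
        f.write(html.encode('utf-8'))

    # Binary mode: Python does no decoding, so no codec is involved here.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode with an explicit encoding avoids the platform default
    # (cp1252 on many Windows installations).
    with open(path, encoding='utf-8') as f:
        text = f.read()

print(type(raw).__name__, type(text).__name__)
```

Passing an explicit `encoding` to `open` is an alternative when you know the file's encoding and want text rather than bytes.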

If this occurs when printing, like so, the console encoding (cp1252 in the traceback above) cannot represent the characters, and you may need to start logging to a file instead.

print('\u0420\u043e\u0441\u0441\u0438\u044f')
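
One way to log to a file is the standard `logging` module with a handler that writes UTF-8. This is a sketch rather than a recommended setup; the logger name `unicode_demo` and the file name `output.log` are arbitrary choices for the example.

```python
import logging
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, 'output.log')

logger = logging.getLogger('unicode_demo')
logger.setLevel(logging.INFO)

# The explicit encoding means the file does not depend on the console code page.
handler = logging.FileHandler(log_path, encoding='utf-8')
logger.addHandler(handler)

logger.info('\u0420\u043e\u0441\u0441\u0438\u044f')

# Close the handler so the text is flushed and the file is released.
handler.close()
logger.removeHandler(handler)

with open(log_path, encoding='utf-8') as f:
    logged = f.read()
```

Reading the file back with the same encoding returns the original text, which the console could not display.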

If you're looking for a Python book, Natural Language Processing with Python is a great way to learn the language while building some really interesting projects.
