Determining encoding of text in Python
Last updated: Aug 10, 2023
Tags: Python
We can use the third-party chardet module (installable with pip install chardet) to help us determine the encoding of text in Python.
Examples
Let us assume we have a file sample.csv whose encoding we want to check. We can do this using chardet.detect(~) as follows:
import chardet

# check the first five thousand bytes to guess the encoding
with open("sample.csv", 'rb') as text:
    encoding = chardet.detect(text.read(5000))

# check what the character encoding might be
print(encoding)
{'encoding': 'UTF-8', 'confidence': 0.95, 'language': ''}
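chardet only makes a best guess at the encoding, so the result may not always be correct. The confidence value in the output (0.95 here) is a number between 0 and 1 that indicates how sure chardet is of its guess.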
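Because the guess can be wrong, it may help to sanity-check it before relying on it. The following is a minimal sketch, not part of chardet itself: the helper name pick_encoding, the 0.8 confidence threshold, and the "utf-8" fallback are all assumptions chosen for illustration. It takes a dictionary shaped like the output above and accepts the guess only if the confidence is high enough and the bytes actually decode with that encoding.

```python
def pick_encoding(raw, guess, min_confidence=0.8, fallback="utf-8"):
    """Return the guessed encoding if it looks trustworthy, else the fallback.

    raw   -- the bytes that were inspected (assumed to be the complete contents)
    guess -- a dict shaped like the result of chardet.detect(~)
    The threshold and fallback are illustrative assumptions, not chardet defaults.
    """
    name = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0

    if name and confidence >= min_confidence:
        try:
            # A wrong guess often fails here, although not always.
            raw.decode(name)
            return name
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are invalid in that encoding.
            # LookupError: Python has no codec with that name.
            pass
    return fallback


# Example with a dictionary like the one printed above
raw = "café,1\n".encode("utf-8")
guess = {'encoding': 'UTF-8', 'confidence': 0.95, 'language': ''}
print(pick_encoding(raw, guess))
```

Note that a sample cut off in the middle of a multi-byte character can fail to decode even when the guess is right, so this sketch assumes raw holds the whole file rather than only its first few thousand bytes. A successful decode also does not prove the guess is correct, since some encodings accept almost any byte sequence.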
Published by Arthur Yanagisawa