search
Search
Login
Unlock 100+ guides
menu
menu
web
search toc
close
Comments
Log in or sign up
Cancel
Post
account_circle
Profile
exit_to_app
Sign out
What does this mean?
Why is this true?
Give me some examples!
search
keyboard_voice
close
Searching Tips
Search for a recipe:
"Creating a table in MySQL"
Search for an API documentation: "@append"
Search for code: "!dataframe"
Apply a tag filter: "#python"
Useful Shortcuts
/ to open search panel
Esc to close search panel
to navigate between search results
d to clear all current filters
Enter to expand content preview
icon_star
Doc Search
icon_star
Code Search Beta
SORRY NOTHING FOUND!
mic
Start speaking...
Voice search is only supported in Safari and Chrome.
Navigate to
chevron_leftCreating DataFrames Cookbook
Combining multiple Series into a DataFrameCombining multiple Series to form a DataFrameConverting a Series to a DataFrameConverting list of lists into DataFrameConverting list to DataFrameConverting percent string into a numeric for read_csvConverting scikit-learn dataset to Pandas DataFrameConverting string data into a DataFrameCreating a DataFrame from a stringCreating a DataFrame using listsCreating a DataFrame with different type for each columnCreating a DataFrame with empty valuesCreating a DataFrame with missing valuesCreating a DataFrame with random numbersCreating a DataFrame with zerosCreating a MultiIndex DataFrameCreating a Pandas DataFrameCreating a single DataFrame from multiple filesCreating empty DataFrame with only column labelsFilling missing values when using read_csvImporting DatasetImporting tables from PostgreSQL as Pandas DataFramesInitialising a DataFrame using a constantInitialising a DataFrame using a dictionaryInitialising a DataFrame using a list of dictionariesInserting lists into a DataFrame cellKeeping leading zeroes when using read_csvParsing dates when using read_csvPreventing strings from getting parsed as NaN for read_csvReading data from GitHubReading file without headerReading large CSV files in chunksReading n random lines using read_csvReading space-delimited filesReading specific columns from fileReading tab-delimited filesReading the first few lines of a file to create DataFrameReading the last n lines of a fileReading URL using read_csvReading zipped csv file as a DataFrameRemoving Unnamed:0 columnResolving ParserError: Error tokenizing dataSaving DataFrame as zipped csvSkipping rows without skipping header for read_csvSpecifying data type for read_csvTreating missing values as empty strings rather than NaN for read_csv
check_circle
Mark as learned
thumb_up
4
thumb_down
1
chat_bubble_outline
0
Comment
auto_stories Bi-column layout
settings

Resolving ParserError: Error tokenizing data in Pandas

schedule Aug 10, 2023
Last updated
local_offer
PythonPandas
Tags
mode_heat
Master the mathematics behind data science with 100+ top-tier guides
Start your free 7-days trial now!

Common reasons for ParserError: Error tokenizing data when initiating a Pandas DataFrame include:

  • Using the wrong delimiter

  • Number of fields in certain rows do not match with header

To resolve the error we can try the following:

  • Specifying the delimiter through sep parameter in read_csv(~)

  • Fixing the original source file

  • Skipping bad rows

Examples

Specifying sep

By default the read_csv(~) method assumes sep=",". Therefore when reading files that use a different delimiter, make sure to explicitly specify the delimiter to use.

Consider the following slash-delimited file called test.txt:

col1/col2/col3
1/A/4
2/B/5
3/C,D,E/6

To initialize a DataFrame using default sep=",":

import pandas as pd
df = pd.read_csv('test.txt')
ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 3

An error is raised as the first line in the file does not contain any commas, so read_csv(~) expects all lines in the file to only contain 1 field. However, line 4 has 3 fields as it contains two commas, which results in the ParserError.

To initialize the DataFrame by correctly specifying slash (/) as the delimiter:

df = pd.read_csv('test.txt', sep='/')
df
col1 col2 col3
0 1 A 4
1 2 B 5
2 3 C,D,E 6

We can now see the DataFrame is initialized as expected with each line containing the 3 fields which were separated by slashes (/) in the original file.

Fixing original source file

Consider the following comma-separated file called test.csv:

col1,col2,col3
1,A,4
2,B,5,
3,C,6

To initialize a DataFrame using the above file:

import pandas as pd
df = pd.read_csv('test.csv')
ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4

Here there is an error on the 3rd line as 4 fields are observed instead of 3, caused by the additional comma at the end of the line.

To resolve this error, we can correct the original file by removing the extra comma at the end of line 3:

col1,col2,col3
1,A,4
2,B,5
3,C,6

To now initialize the DataFrame again using the corrected file:

df = pd.read_csv('test.csv')
df
col1 col2 col3
0 1 A 4
1 2 B 5
2 3 C 6

Skipping bad rows

Consider the following comma-separated file called test.csv:

col1,col2,col3
1,A,4
2,B,5,
3,C,6

To initialize a DataFrame using the above file:

import pandas as pd
df = pd.read_csv('test.csv')
ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4

Here there is an error on the 3rd line as 4 fields are observed instead of 3, caused by the additional comma at the end of the line.

To skip bad rows pass on_bad_lines='skip' to read_csv(~):

df = pd.read_csv('test.csv', on_bad_lines='skip')
df
col1 col2 col3
0 1 A 4
1 3 C 6

Notice how the problematic third line in the original file has been skipped in the resulting DataFrame.

WARNING

This should be your last resort as valuable information could be contained within the problematic lines. Skipping these rows means you lose this information. As much as possible try to identify the root cause of the error and fix the underlying problem.

robocat
Published by Arthur Yanagisawa
Edited by 0 others
Did you find this page useful?
thumb_up
thumb_down
Comment
Citation
Ask a question or leave a feedback...
thumb_up
4
thumb_down
1
chat_bubble_outline
0
settings
Enjoy our search
Hit / to insta-search docs and recipes!