Difference between Series and DataFrame in Pandas
You can think of a DataFrame data structure as a standard table that is composed of rows and columns. Each column is represented by a Series data structure and a DataFrame (table) is simply a container that holds many Series objects (columns) together.
Generally speaking, the APIs available for DataFrame and Series objects overlap considerably. The main difference is that DataFrame APIs cater for multi-column operations, while Series APIs cater for only a single column.
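To make the overlap concrete, here is a minimal sketch using a small hypothetical table (the `scores` DataFrame and its column names are made up for illustration and do not appear elsewhere in this article). It shows that a column pulled out of a DataFrame is a Series, that the same method name can exist on both objects, and that a row-wise operation across columns needs the DataFrame:

```python
import pandas as pd

# Hypothetical two-column table used only for illustration.
scores = pd.DataFrame({"math": [70, 85, 90], "art": [60, 95, 80]})

# Each column of a DataFrame is a Series.
math_col = scores["math"]
print(type(math_col).__name__)

# The same method name is available on both objects.
print(math_col.max())  # one value for the single column
print(scores.max())    # one value per column, returned as a Series

# A multi-column operation: add up the values across columns for each row.
row_totals = scores.sum(axis=1)
print(row_totals)
```

Calling `max()` on the Series gives a single value, while calling it on the DataFrame gives one result per column.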
DataFrame
DataFrames can be used to represent the following table that has 3 rows and 3 columns:
|   | Name | Age | Class |
|---|---|---|---|
| 0 | Alex | 16 | A |
| 1 | Cathy | 17 | B |
| 2 | Bob | 17 | A |
To create a DataFrame representing this table in Pandas, use the DataFrame constructor:
    Name  Age Class
0   Alex   16     A
1  Cathy   17     B
2    Bob   17     A
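The constructor call that produces this output is not shown above. A minimal sketch, assuming the data is passed as a dictionary that maps each column label to a list of values (one of several input forms the constructor accepts), might look like this:

```python
import pandas as pd

# Assumed construction: a dict mapping column labels to lists of values.
# The default index (0, 1, 2) is generated automatically.
df = pd.DataFrame({
    "Name": ["Alex", "Cathy", "Bob"],
    "Age": [16, 17, 17],
    "Class": ["A", "B", "A"],
})

print(df)
print(df.shape)  # (number of rows, number of columns)
```

Since no index is supplied, Pandas labels the rows with consecutive integers starting at 0, which matches the leftmost column of the table.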
Series
A Series is a data structure representing a single row or column of a DataFrame.
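Columns are covered next. For the row case, here is a minimal sketch, assuming the table from the DataFrame section is rebuilt from a dictionary of lists (an assumed construction) and that the row is looked up by its label with `loc`:

```python
import pandas as pd

# Rebuild the example table (construction form is an assumption).
df = pd.DataFrame({
    "Name": ["Alex", "Cathy", "Bob"],
    "Age": [16, 17, 17],
    "Class": ["A", "B", "A"],
})

# Selecting a single row by its label also returns a Series.
first_row = df.loc[0]
print(type(first_row).__name__)

# The column labels of the DataFrame become the index of the row Series.
print(list(first_row.index))
print(first_row["Name"])
```

A row Series is indexed by the column labels, whereas a column Series is indexed by the row labels.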
To access a particular column of a DataFrame, use the [] notation with the column label like so:
df["Name"]
0     Alex
1    Cathy
2      Bob
Name: Name, dtype: object
Here, we are accessing the Name column, and the return type is Series. Because this Series carries the default integer index (0, 1, 2), you can access individual values using integer indices, just as you would for standard arrays:
col_name = df["Name"]  # col_name is a Series
col_name[1]
'Cathy'
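With the default index, the label 1 and the position 1 happen to point at the same value. When the index labels differ from the positions, it can help to be explicit about which one is meant. A minimal sketch, using a hypothetical Series whose labels 10, 20 and 30 are made up for illustration:

```python
import pandas as pd

# Hypothetical Series with non-default integer labels.
names = pd.Series(["Alex", "Cathy", "Bob"], index=[10, 20, 30], name="Name")

# loc looks a value up by its index label.
by_label = names.loc[20]

# iloc looks a value up by its integer position (0-based).
by_position = names.iloc[1]

print(by_label)
print(by_position)
```

Both lookups return the second value here, but only `iloc` behaves like plain array indexing regardless of what the index labels are.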