Pandas | DataFrame constructor

Last updated: Mar 10, 2022
Tags: Python, Pandas

Pandas' DataFrame(~) constructor is used to initialise a new DataFrame.

Parameters

1. data | scalar or 2D ndarray or iterable or dict or DataFrame

The data used to populate the DataFrame. If data is a dict, it can contain scalars and array-like objects such as lists, Series and NumPy arrays.

2. index | Index or array-like | optional

The index to use for the DataFrame. By default, if index is not passed and data provides no index, then integer indices will be used.

3. columns | Index or array-like | optional

The column labels to use for the DataFrame. By default, if columns is not passed and data provides no column labels, then integer indices will be used.

4. dtype | dtype | optional

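The data type to use for the DataFrame if possible. Only one type is allowed. In the pandas version this page describes, no error is thrown if type conversion is unsuccessful; newer releases (pandas 2.0 onwards, as far as we are aware) raise an error instead. By default, dtype=None, that is, the data type is inferred.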

5. copy | boolean | optional

This parameter is only relevant if data is a DataFrame or a 2D ndarray.

  • If True, then a new DataFrame is returned. Modifying this returned DataFrame will not affect data, and vice versa.

  • If False, then modifying the returned DataFrame will also mutate the original data, and vice versa.

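By default, copy=False for these inputs. Since pandas 1.3 the documented default is copy=None, which behaves like copy=False for a DataFrame or 2D ndarray in the 1.x and 2.x releases.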
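The effect of copy can be checked with a small experiment. The following is a minimal sketch, assuming a 2D NumPy array as data; the variable names are illustrative. Only the copy=True case is asserted here, because whether copy=False actually shares memory can depend on the pandas version and its copy-on-write settings.

```python
import numpy as np
import pandas as pd

arr = np.array([[3, 4], [5, 6]])

# copy=True: the DataFrame gets its own data, independent of arr
df_copy = pd.DataFrame(arr, copy=True)
df_copy.iloc[0, 0] = 100
print(arr[0, 0])  # 3, the source array is unchanged

# copy=False: pandas is allowed to reuse the memory of arr.
# Whether it does can be inspected rather than assumed:
df_view = pd.DataFrame(arr, copy=False)
print(np.shares_memory(arr, df_view.to_numpy()))
```

If independence from the source data matters, passing copy=True explicitly is the safer choice.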

Return value

A DataFrame object.

Examples

Using a dictionary of arrays

To create a DataFrame using a dictionary of arrays:

import pandas as pd

df = pd.DataFrame({"A":[3,4], "B":[5,6]})
df
   A  B
0  3  5
1  4  6

Here, each key-value pair of the dictionary is interpreted as follows:

  • key: column label

  • value: values of that column

Also, since the data does not contain any index (i.e. row labels), the default integer indices are used.
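The values of the dictionary do not all have to be lists. As a small sketch (the column names and values are made up for illustration), the following mixes a list, a NumPy array, a Series and a scalar in one dictionary:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [3, 4],             # list
    "B": np.array([5, 6]),   # NumPy array
    "C": pd.Series([7, 8]),  # Series with the default integer index
    "D": 9,                  # scalar, repeated for every row
})
print(df)
```

The scalar is broadcast to match the length of the other columns. If every value in the dictionary is a scalar, pandas cannot infer the number of rows, so an index has to be passed explicitly.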

Using a nested dictionary

To create a DataFrame using a nested dictionary:

col_one = {"a":3,"b":4}
col_two = {"a":5,"b":6}
df = pd.DataFrame({"A":col_one, "B":col_two})
df
   A  B
a  3  5
b  4  6

Here, the keys of the inner dictionaries col_one and col_two (a and b) are used as the index (i.e. row labels).
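When the inner dictionaries do not share the same keys, the resulting index covers all the keys and the missing entries are filled with NaN. A minimal sketch, using a hypothetical variation of the example above in which col_two has no "a" but has an extra "c":

```python
import pandas as pd

col_one = {"a": 3, "b": 4}
col_two = {"b": 5, "c": 6}  # hypothetical variation: no "a", extra "c"
df = pd.DataFrame({"A": col_one, "B": col_two})
print(df)

# Row "a" has no value for column B, and row "c" has none for column A
print(df.isna())
```

Because NaN is a floating-point value, columns that contain missing entries are typically stored as floats rather than integers.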

Using a Series

To create a DataFrame using a Series:

s_one = pd.Series([3,4], index=["a","b"])
s_two = pd.Series([5,6], index=["a","b"])
df = pd.DataFrame({"A":s_one, "B":s_two})
df
   A  B
a  3  5
b  4  6
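When data is a dictionary of Series, the index parameter selects rows by label rather than relabelling them. The sketch below reuses the two Series from the example and requests the labels "b" and "z"; the label "z" is a made-up one that appears in neither Series.

```python
import pandas as pd

s_one = pd.Series([3, 4], index=["a", "b"])
s_two = pd.Series([5, 6], index=["a", "b"])

# Rows are looked up by label: "b" exists, "z" does not
df = pd.DataFrame({"A": s_one, "B": s_two}, index=["b", "z"])
print(df)
```

Row "b" keeps its values from the Series, while row "z" is filled with NaN because neither Series has that label.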

Using a 2D array

We can pass in a 2D list or 2D NumPy array like so:

df = pd.DataFrame([[3,4],[5,6]])
df
   0  1
0  3  4
1  5  6

Notice how the default row and column labels are integer indices.
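The same works with a 2D NumPy array instead of a nested list. As a brief sketch (arr is an illustrative name), each inner array becomes one row, and both axes receive default integer labels:

```python
import numpy as np
import pandas as pd

arr = np.array([[3, 4], [5, 6]])
df = pd.DataFrame(arr)

print(df.shape)          # (2, 2)
print(list(df.index))    # [0, 1]
print(list(df.columns))  # [0, 1]
```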

Using a constant

To initialise a DataFrame using a single constant, we need to specify parameters columns and index so as to define the shape of the DataFrame:

pd.DataFrame(2, index=["a","b"], columns=["A","B","C"])
   A  B  C
a  2  2  2
b  2  2  2

Specifying column labels and index

To explicitly set the column labels and index (i.e. row labels):

df = pd.DataFrame([[3,4],[5,6]], columns=["A","B"], index=["a","b"])
df
   A  B
a  3  4
b  5  6
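With list or array data, columns simply assigns labels. With dictionary data, which already carries column labels, columns instead selects among the keys. The following sketch illustrates this with a hypothetical request for columns "B" and "C", where "C" is not a key of the dictionary:

```python
import pandas as pd

data = {"A": [3, 4], "B": [5, 6]}

# "B" is picked from the dict, "A" is left out, "C" has no data
df = pd.DataFrame(data, columns=["B", "C"])
print(df)
```

Column "A" is dropped because it was not requested, and column "C" is created but filled with NaN.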

Specifying dtype

To set a preference for the type of all columns:

df = pd.DataFrame([["3",4],["5",6]], dtype=float)
df
     0    1
0  3.0  4.0
1  5.0  6.0

Notice how the string "3" was cast to a float.

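Note that, in the pandas version this page describes, no error is thrown even if the type conversion is unsuccessful (newer releases are stricter and may raise an error instead). For instance: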

df = pd.DataFrame([["3@@@",4],["5",6]], dtype=float)
df
      0    1
0  3@@@  4.0
1     5  6.0

Here, the dtypes of the columns are as follows:

0     object
1    float64
dtype: object
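How the constructor treats values that cannot be converted may differ between pandas versions, so it can be safer not to rely on it. One alternative, which is not part of the original example, is to build the DataFrame without dtype and then convert each column with pd.to_numeric, using errors="coerce" to turn unconvertible values into NaN. A minimal sketch:

```python
import pandas as pd

# Build the DataFrame first, without forcing a dtype
df = pd.DataFrame([["3@@@", 4], ["5", 6]])

# Convert column by column; values that cannot be parsed become NaN
converted = df.apply(pd.to_numeric, errors="coerce")
print(converted)
print(converted.isna())
```

Here "3@@@" becomes NaN while "5" is parsed as a number, which makes the failed conversion explicit instead of leaving the column as strings.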
Published by Isshin Inada