Pandas DataFrame | info method
Pandas' DataFrame.info(~)
method outputs a brief summary of the DataFrame, which includes information such as the data-types and memory consumption.
Parameters
1. verbose
| boolean
| optional
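Whether or not to output a detailed summary. By default, this follows the pandas option display.max_info_columns: the detailed summary is shown unless the DataFrame has more columns than that setting.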
2. buf
| writable buffer
| optional
The location to output. By default, buf=sys.stdout
(the standard output).
3. max_cols
| int
| optional
The maximum number of columns for which the detailed summary is output. If the number of columns in the source DataFrame exceeds this value, then the truncated summary is shown instead of the per-column details. By default, the value of the pandas option display.max_info_columns is used.
4. memory_usage
| string
or boolean
| optional
Whether or not to show the memory usage of each column:
Value | Description |
---|---|
True | Show memory usage. For DataFrames that contain object types (e.g. strings), the reported memory usage will not be accurate, because the method only makes a crude estimate of the memory consumed by object types. |
False | Do not show memory usage. |
"deep" | Perform some heavy lifting to calculate the actual memory usage of object types, and show the memory usage. |
By default, this follows the pandas option display.memory_usage.
5. null_counts
| boolean
| optional
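Whether or not to show the number of non-null values in each column. By default, the counts are shown only if the DataFrame is smaller than the pandas options display.max_info_rows and display.max_info_columns. Note that null_counts was deprecated in pandas 1.2 in favor of show_counts, and was removed in pandas 2.0.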
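The defaults described above come from pandas display options rather than from the machine itself. As a minimal sketch, assuming a standard pandas installation, the current values can be inspected with pd.get_option. The dictionary name current is only used for illustration here.

```python
import pandas as pd

# Options that info() consults when the matching argument is left unset.
option_names = [
    "display.max_info_columns",  # threshold used by verbose and max_cols
    "display.max_info_rows",     # threshold for showing non-null counts
    "display.memory_usage",      # default for memory_usage
]

# Read the current value of each option.
current = {name: pd.get_option(name) for name in option_names}
for name, value in current.items():
    print(f"{name} = {value}")
```

Changing one of these with pd.set_option should change what info() prints when the corresponding parameter is not passed explicitly.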
Return Value
Nothing is returned since all we're doing here is printing a summary of the DataFrame.
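To illustrate the point about the return value, here is a small sketch that captures the result of the call. The two-column DataFrame is a made-up example, and the summary is sent to an in-memory buffer only to keep the screen clear.

```python
import io

import pandas as pd

# A small, hypothetical DataFrame used only for this check.
df = pd.DataFrame({"A": [3, 4], "B": [5.0, 6.0]})

buffer = io.StringIO()        # the summary is written here instead of the screen
result = df.info(buf=buffer)  # info() writes the summary and returns nothing

print(result is None)
```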
Examples
Consider the following DataFrame:
df
   A    B      C   D
0  3  5.0   True   K
1  4  6.0  False  KK
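The code that builds this DataFrame is not shown above. One way to construct it, assuming the column values can be read directly off the printed table, might be:

```python
import pandas as pd

# Column values taken from the printed table above.
df = pd.DataFrame({
    "A": [3, 4],
    "B": [5.0, 6.0],
    "C": [True, False],
    "D": ["K", "KK"],
})

print(df)
```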
Basic usage
Calling info()
without any parameters:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 4 columns):
A    2 non-null int64
B    2 non-null float64
C    2 non-null bool
D    2 non-null object
dtypes: bool(1), float64(1), int64(1), object(1)
memory usage: 178.0+ bytes
Here, my machine has the following default options:
verbose=True
memory_usage=True
null_counts=True
Setting verbose=False
If you do not need information about each column, set verbose=False
like so:
df.info(verbose=False)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Columns: 4 entries, A to D
dtypes: bool(1), float64(1), int64(1), object(1)
memory usage: 178.0+ bytes
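The max_cols parameter can produce the same truncated summary without setting verbose. The sketch below assumes verbose is left unset and rebuilds the example DataFrame so that it runs on its own. Since the DataFrame has 4 columns and max_cols=2, the per-column details should be dropped.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "A": [3, 4],
    "B": [5.0, 6.0],
    "C": [True, False],
    "D": ["K", "KK"],
})

buffer = io.StringIO()
# 4 columns exceeds max_cols=2, so the truncated summary is used.
df.info(max_cols=2, buf=buffer)
summary = buffer.getvalue()

print(summary)
```

Passing verbose=True alongside max_cols would override this and show the detailed summary again.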
Writing to an external file
Instead of showing the output on the screen, we can write the output to an external file by using the buf
parameter.
import io
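# Write the summary into an in-memory text buffer instead of the screen
buffer = io.StringIO()
df.info(buf=buffer)
str_summary = buffer.getvalue()
# Save the captured summary to a text file
with open("df_summary.txt", "w") as file:
    file.write(str_summary)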
This will create a file called "df_summary.txt" in the current working directory, which is typically the directory you run your Python script from. The content of this file is the same as what you would have seen on the screen.
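Since buf accepts any writable text buffer, an open file handle can also be passed directly, which skips the intermediate string. This is a sketch under that assumption. The temporary directory and the variable names out_dir, path and written are hypothetical, chosen only to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "A": [3, 4],
    "B": [5.0, 6.0],
    "C": [True, False],
    "D": ["K", "KK"],
})

# Hypothetical output location: a fresh temporary directory.
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "df_summary.txt")

# Pass the open file handle straight to info().
with open(path, "w", encoding="utf-8") as file:
    df.info(buf=file)

# Read the file back to confirm what was written.
with open(path, encoding="utf-8") as file:
    written = file.read()

print(written)
```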
Setting memory_usage=deep
Since our DataFrame contains a column of data-type object (column D), the value reported with memory_usage=True will be off. The + after the figure signals that the true memory usage may be higher:
df.info()
<class 'pandas.core.frame.DataFrame'>
...
memory usage: 178.0+ bytes
To get a more accurate representation of the memory consumed by the DataFrame, set memory_usage="deep"
:
df.info(memory_usage="deep")
<class 'pandas.core.frame.DataFrame'>
...
memory usage: 287.0 bytes
We see that our DataFrame actually occupies 287 bytes, so the estimate was off by 109 bytes.
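The same comparison can be made numerically with DataFrame.memory_usage, which returns the bytes used by the index and by each column. The sketch below forces column D to object dtype, which is an assumption made so that the behaviour matches the example on newer pandas versions. The exact byte counts depend on the pandas and Python versions, so none are shown.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [3, 4],
    "B": [5.0, 6.0],
    "C": [True, False],
    # Force object dtype so the strings are stored as Python objects.
    "D": pd.Series(["K", "KK"], dtype=object),
})

# Shallow: object columns are counted by pointer size only.
shallow = df.memory_usage(deep=False).sum()
# Deep: the Python string objects themselves are measured as well.
deep = df.memory_usage(deep=True).sum()

print(shallow, deep, deep - shallow)
```

The difference between the two totals is the memory held by the string objects that the shallow estimate leaves out.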