Pandas DataFrame | align method
Pandas DataFrame.align(~) method ensures that two DataFrames have the same column and/or row labels.
Parameters
1. other | DataFrame or Series
The DataFrame or Series that you want to align with.
2. join | string | optional
The type of join to perform:
"outer"
"inner"
"left"
"right"
By default, join="outer". See the examples below for clarification.
3. axis | None or int or string | optional
The axis along which to perform the alignment:
Axis | Description |
---|---|
0 or "index" | Align using row labels |
1 or "columns" | Align using column labels |
By default, axis=None, which aligns both the row labels and the column labels.
4. level | int or string | optional
The level to target. This is only relevant for MultiIndex DataFrames. By default, level=None.
5. copy | boolean | optional
Whether to return a new copy. If copy=False and no reindexing is performed, then the original DataFrames/Series will be returned. By default, copy=True.
6. fill_value | scalar | optional
The value used to fill the missing values (NaN) introduced by the alignment. By default, fill_value=np.nan, that is, the missing values are left as is.
7. method | None or string | optional
The method by which to fill missing values:
Method | Description |
---|---|
"pad" or "ffill" | Fill using the previous valid observation |
"backfill" or "bfill" | Fill using the next valid observation |
By default, method=None.
8. limit | int | optional
The maximum number of consecutive fills allowed. For instance, if you have 3 consecutive NaNs and you set limit=2, then only the first two NaNs will be filled, and the third will be left as is. By default, limit=None.
9. fill_axis | int or string | optional
Whether to apply the fill method horizontally or vertically:
Axis | Description |
---|---|
0 or "index" | Filling is applied vertically. |
1 or "columns" | Filling is applied horizontally. |
By default, fill_axis=0.
10. broadcast_axis | int or string | optional
The axis along which to perform broadcasting:
Axis | Description |
---|---|
0 or "index" | Broadcast along the index axis. |
1 or "columns" | Broadcast along the columns axis. |
By default, broadcast_axis=None. This is only relevant when the source DataFrame and other have different dimensions.
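As a minimal sketch of the fill_value parameter described above, the snippet below aligns two small DataFrames and fills the newly created columns with 0 instead of NaN. The names left and right are hypothetical and are used for illustration only.

```python
import pandas as pd

# Hypothetical DataFrames that share only column "A"
left = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
right = pd.DataFrame({"A": [5, 6], "C": [7, 8]})

# Align the column labels; columns that had to be created are filled with 0
a_left, a_right = left.align(right, join="outer", axis=1, fill_value=0)

print(a_left)   # has columns A, B, C, where C holds only 0
print(a_right)  # has columns A, B, C, where B holds only 0
```

A scalar fill like this is convenient when the aligned DataFrames are about to be added or compared and NaN would otherwise propagate.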
Return value
A tuple of two elements: the aligned source DataFrame, followed by the aligned other (a DataFrame or a Series).
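The following sketch illustrates the shape of the return value when other is a Series. The names df and s are hypothetical; note that an explicit axis is needed when a DataFrame is aligned with a Series.

```python
import pandas as pd

# Hypothetical inputs: the row labels only partially overlap
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])
s = pd.Series([10, 20], index=["y", "z"])

# axis=0 aligns the DataFrame's row labels with the Series' index
result = df.align(s, axis=0)
a_df, a_s = result  # unpack the two aligned objects

print(type(result))   # a tuple
print(list(a_df.index))
print(list(a_s.index))
```

Both returned objects end up with the row labels x, y and z, and the second element stays a Series rather than becoming a DataFrame.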
Examples
Specifying the join type
Consider the following two DataFrames:
import pandas as pd
df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])
[df_one]
   A  B  C
0  1  3  5
1  2  4  6
[df_two]
   A   E   B
a  7   9  11
b  8  10  12
Full outer join
To align the two DataFrames via a full outer join:
a_one, a_two = df_one.align(df_two, axis=1) # join="outer"
[a_one]
   A  B  C   E
0  1  3  5 NaN
1  2  4  6 NaN
[a_two]
   A   B   C   E
a  7  11 NaN   9
b  8  12 NaN  10
Here, note the following:
By default, join="outer", which means that the resulting DataFrames will have every column label that appears in either of the input DataFrames. This is why we see column label E in a_one, and column label C in a_two.
The axis=1 parameter tells Pandas to perform the alignment column-wise.
The newly added columns do not hold any actual values; they are filled with NaN.
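The points above can be checked programmatically. The sketch below rebuilds the two DataFrames from this section and inspects the labels of the aligned results.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

a_one, a_two = df_one.align(df_two, join="outer", axis=1)

# Both results carry the union of the column labels
print(list(a_one.columns))  # ['A', 'B', 'C', 'E']
print(list(a_two.columns))  # ['A', 'B', 'C', 'E']

# The row labels are left untouched because only axis=1 was aligned
print(list(a_one.index), list(a_two.index))

# The columns that had to be created contain only NaN
print(a_one["E"].isna().all(), a_two["C"].isna().all())
```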
Inner join
To align via an inner join:
a_one, a_two = df_one.align(df_two, join="inner", axis=1)
[a_one]
   A  B
0  1  3
1  2  4
[a_two]
   A   B
a  7  11
b  8  12
We obtain this result because only the column labels "A" and "B" are present in both DataFrames; all other columns are dropped.
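An inner join is handy when only the overlapping part of two DataFrames matters. The sketch below uses hypothetical DataFrames left and right, and aligns both axes at once (the default axis=None) so that only the shared rows and columns remain.

```python
import pandas as pd

# Hypothetical inputs: rows "y" and "z" and column "B" are shared
left = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])
right = pd.DataFrame({"B": [10, 20], "C": [30, 40]}, index=["y", "z"])

# Keep only the labels present in both DataFrames, on both axes
a_left, a_right = left.align(right, join="inner")

print(a_left)
print(a_right)

# The aligned results have identical labels, so element-wise arithmetic yields no NaN
print(a_left + a_right)
```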
Left join
To align via a left join:
a_one, a_two = df_one.align(df_two, join="left", axis=1)
[a_one]
   A  B  C
0  1  3  5
1  2  4  6
[a_two]
   A   B   C
a  7  11 NaN
b  8  12 NaN
By performing a left join, we ensure that the other DataFrame has exactly the column labels of the source DataFrame. This is why column C appears in a_two, while column E, which exists only in df_two, is dropped.
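The remaining option, join="right", is the mirror image of the left join: the source DataFrame takes on the column labels of other. A minimal sketch using the same two DataFrames:

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

# The column labels of df_two decide the result
a_one, a_two = df_one.align(df_two, join="right", axis=1)

print(a_one)  # columns A, E, B, where E is all NaN and C is dropped
print(a_two)  # same content as df_two
```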
Specifying the axis
Once again, suppose we have the following two DataFrames:
df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])
[df_one]
   A  B  C
0  1  3  5
1  2  4  6
[df_two]
   A   E   B
a  7   9  11
b  8  10  12
axis=0
a_one, a_two = df_one.align(df_two, axis=0)
[a_one]
     A    B    C
0  1.0  3.0  5.0
1  2.0  4.0  6.0
a  NaN  NaN  NaN
b  NaN  NaN  NaN
[a_two]
     A     E     B
0  NaN   NaN   NaN
1  NaN   NaN   NaN
a  7.0   9.0  11.0
b  8.0  10.0  12.0
By setting axis=0, we are telling Pandas to align the row labels, that is, to give both resulting DataFrames the exact same row labels. Notice that the column labels of each DataFrame are kept intact. The values are shown as floats because the newly added rows hold NaN, which cannot be stored in an integer column.
axis=1
a_one, a_two = df_one.align(df_two, axis=1)
[a_one]
   A  B  C   E
0  1  3  5 NaN
1  2  4  6 NaN
[a_two]
   A   B   C   E
a  7  11 NaN   9
b  8  12 NaN  10
By setting axis=1, we are telling Pandas to align the column labels, that is, to give both resulting DataFrames the exact same column labels. Notice that the row labels of each DataFrame are kept intact.
axis=None
The default parameter value is axis=None:
a_one, a_two = df_one.align(df_two) # axis=None
[a_one]
     A    B    C   E
0  1.0  3.0  5.0 NaN
1  2.0  4.0  6.0 NaN
a  NaN  NaN  NaN NaN
b  NaN  NaN  NaN NaN
[a_two]
     A     B   C     E
0  NaN   NaN NaN   NaN
1  NaN   NaN NaN   NaN
a  7.0  11.0 NaN   9.0
b  8.0  12.0 NaN  10.0
Setting axis=None combines axis=0 and axis=1, that is, the resulting DataFrames share the same row labels as well as the same column labels.
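A quick way to confirm this is to compare the labels of the two results directly. The sketch below reuses the DataFrames from this section.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

a_one, a_two = df_one.align(df_two)  # axis=None aligns rows and columns

# After aligning on both axes, the labels match on both axes
print(a_one.index.equals(a_two.index))
print(a_one.columns.equals(a_two.columns))

# 4 row labels (0, 1, a, b) by 4 column labels (A, B, C, E)
print(a_one.shape, a_two.shape)
```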
Performing filling
Consider the same DataFrames we had before:
df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])
[df_one]
   A  B  C
0  1  3  5
1  2  4  6
[df_two]
   A   E   B
a  7   9  11
b  8  10  12
Aligning the column labels using a full outer join yields:
a_one, a_two = df_one.align(df_two, axis=1) # join="outer"
[a_one]
   A  B  C   E
0  1  3  5 NaN
1  2  4  6 NaN
[a_two]
   A   B   C   E
a  7  11 NaN   9
b  8  12 NaN  10
Notice how we end up with missing values here since no filling is performed by default.
To fill the NaNs, we can specify the method parameter and, optionally, fill_axis:
a_one, a_two = df_one.align(df_two, axis=1, method="ffill", fill_axis=1)
[a_one]
     A    B    C    E
0  1.0  3.0  5.0  5.0
1  2.0  4.0  6.0  6.0
[a_two]
     A     B     C     E
a  7.0  11.0  11.0   9.0
b  8.0  12.0  12.0  10.0
Here, note the following:
method="ffill" applies a forward-fill, meaning NaNs are filled using the previous valid observation.
fill_axis=1 performs the forward-fill horizontally.
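Newer Pandas releases deprecate the method, limit and fill_axis keywords of align (from version 2.1 onward, as far as we are aware). As a hedged alternative, the same result can be sketched by aligning first and then calling ffill on each result; this approach is an assumption on our part and not part of the original example.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

# Step 1: align the column labels without any filling
a_one, a_two = df_one.align(df_two, axis=1)

# Step 2: forward-fill horizontally, so each NaN takes the value to its left
f_one = a_one.ffill(axis=1)
f_two = a_two.ffill(axis=1)

print(f_one)  # column E takes the values of column C
print(f_two)  # column C takes the values of column B
```

Splitting the alignment and the filling into two steps also makes it easy to use a different fill strategy for each DataFrame.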