If True, then the index of the resulting DataFrame will be reset to 0,1,...,n-1 where n is the number of rows of the DataFrame. By default, ignore_index=False.

5. keys | sequence | optional

Used to construct a hierarchical index. By default, keys=None.

6. levels | list<sequence> | optional

The levels used to construct a MultiIndex. By default, levels=None, in which case the levels are inferred from keys.

7. names | list<string> | optional

The labels assigned to the levels in the resulting hierarchical index. By default, names=None.

8. verify_integrity | boolean | optional

If True, then an error will be thrown if the resulting Series/DataFrame contains duplicate index or column labels. This checking process may be computationally expensive. By default, verify_integrity=False.

9. sort | boolean | optional

Whether or not to sort the non-concatenation axis. This is only applicable for join="outer", and not for join="inner". By default, sort=False.

10. copy | boolean | optional

Whether to return a new Series/DataFrame or reuse the provided objs if possible. By default, copy=True.

Return Value

The return type depends on the following parameters:

When axis=0 and concatenation is between Series, then a Series is returned.
When the concatenation involves at least one DataFrame, then a DataFrame is returned.
When axis=1, then a DataFrame is returned.
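
As a quick check of the rules above, the following sketch inspects the type returned in each case. The variable names (stacked, side_by_side, mixed) are illustrative only and are not part of the pandas API.

```python
import pandas as pd

s1 = pd.Series(["a", "b"])
s2 = pd.Series(["c", "d"])
df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

# axis=0 with Series only: the result is a Series
stacked = pd.concat([s1, s2])

# axis=1: the result is a DataFrame, even when every input is a Series
side_by_side = pd.concat([s1, s2], axis=1)

# At least one DataFrame among the inputs: the result is a DataFrame
mixed = pd.concat([df, pd.Series([6, 7], name="A")])

print(type(stacked).__name__)       # Series
print(type(side_by_side).__name__)  # DataFrame
print(type(mixed).__name__)         # DataFrame
```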

Examples

Consider the following DataFrames:


        
        
            
                
                
import pandas as pd

df = pd.DataFrame({"A":[2,3],"B":[4,5]})
df_other = pd.DataFrame({"A":[6,7],"B":[8,9]})

   A  B  |     A  B
0  2  4  |  0  6  8
1  3  5  |  1  7  9

Concatenating multiple DataFrames vertically

To concatenate multiple DataFrames vertically:


        
        
            
                
                
pd.concat([df, df_other])   # axis=0

   A  B
0  2  4
1  3  5
0  6  8
1  7  9

Concatenating multiple DataFrames horizontally

To concatenate multiple DataFrames horizontally, pass in axis=1 like so:


        
        
            
                
                
pd.concat([df, df_other], axis=1)

   A  B  A  B
0  2  4  6  8
1  3  5  7  9

Specifying join

Consider the following DataFrames:


        
        
            
                
                
df = pd.DataFrame({"A":[2],"B":[3]})
df_other = pd.DataFrame({"B":[4],"C":[5]})

   A  B  |     B  C
0  2  3  |  0  4  5

Here, both DataFrames have column B.

Outer join

By default, join="outer", which means that all columns will appear in the resulting DataFrame, and the columns with the same label will be stacked:


        
        
            
                
                
pd.concat([df,df_other], join="outer")

     A  B    C
0  2.0  3  NaN
0  NaN  4  5.0

The reason why we get NaN for some entries is that, since column B is shared between the DataFrames, the values get stacked for B, but columns A and C only have a single value, so NaN must be inserted as a filler.

Inner join

To perform an inner join instead, set join="inner" like so:


        
        
            
                
                
pd.concat([df,df_other], join="inner")

   B
0  3
0  4

Here, only columns that appear in all the DataFrames will appear in the resulting DataFrame. Since only column B is shared between df and df_other, we only see column B in the output.
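
To make the difference concrete, here is a minimal sketch that compares the column labels produced by the two join modes on the same inputs. The names outer and inner are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [2], "B": [3]})
df_other = pd.DataFrame({"B": [4], "C": [5]})

# Outer join keeps the union of the column labels
outer = pd.concat([df, df_other], join="outer")
print(list(outer.columns))   # ['A', 'B', 'C']

# Inner join keeps only the labels shared by every input
inner = pd.concat([df, df_other], join="inner")
print(list(inner.columns))   # ['B']
print(inner["B"].tolist())   # [3, 4]
```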

Concatenating Series

Concatenating Series works in the same way as concatenating DataFrames.

To concatenate two Series vertically:


        
        
            
                
                
s1 = pd.Series(['a','b'])
s2 = pd.Series(['c','d'])
pd.concat([s1, s2])         # returns a Series

0    a
1    b
0    c
1    d
dtype: object

To concatenate two Series horizontally:


        
        
            
                
                
s1 = pd.Series(['a','b'])
s2 = pd.Series(['c','d'])
pd.concat([s1, s2], axis=1)   # returns a DataFrame

   0  1
0  a  c
1  b  d

Specifying ignore_index

By default, ignore_index=False, which means the original indexes of the inputs will be preserved:


        
        
            
                
                
s1 = pd.Series([3,4], index=["a","b"])
s2 = pd.Series([5,6], index=["c","d"])
pd.concat([s1, s2])

a    3
b    4
c    5
d    6
dtype: int64

To reset the index to the default integer indices:


        
        
            
                
                
s1 = pd.Series([3,4], index=["a","b"])
s2 = pd.Series([5,6], index=["c","d"])
pd.concat([s1, s2], ignore_index=True)

0    3
1    4
2    5
3    6
dtype: int64
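
The same parameter works for DataFrames. The sketch below, using illustrative variable names, suggests that ignore_index resets the labels along the concatenation axis: the row labels when axis=0, and the column labels when axis=1.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})
df_other = pd.DataFrame({"A": [6, 7], "B": [8, 9]})

# axis=0 (default): the row labels are replaced by 0..n-1
rows_reset = pd.concat([df, df_other], ignore_index=True)
print(rows_reset.index.tolist())     # [0, 1, 2, 3]

# axis=1: the column labels are replaced instead, and the row labels are kept
cols_reset = pd.concat([df, df_other], axis=1, ignore_index=True)
print(cols_reset.columns.tolist())   # [0, 1, 2, 3]
print(cols_reset.index.tolist())     # [0, 1]
```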

Specifying keys

To form a multi-index, specify the keys parameter:


        
        
            
                
                
s1 = pd.Series(["a","b"])
s2 = pd.Series(["c","d"])
pd.concat([s1, s2], keys=["A","B"])

A  0    a
   1    b
B  0    c
   1    d
dtype: object

To add more levels, pass a tuple like so:


        
        
            
                
                
s1 = pd.Series(["a","b"])
s2 = pd.Series(["c","d"])
pd.concat([s1, s2], keys=[("A","B"),("C","D")])

A  B  0    a
      1    b
C  D  0    c
      1    d
dtype: object
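
One practical use of keys is retrieving each original input from the combined result. The following sketch, with illustrative variable names, counts the index levels and looks up one input by its key using .loc.

```python
import pandas as pd

s1 = pd.Series(["a", "b"])
s2 = pd.Series(["c", "d"])

# One key per input adds a single outer level to the index
two_level = pd.concat([s1, s2], keys=["A", "B"])
print(two_level.index.nlevels)               # 2
print(two_level.loc["B"].tolist())           # ['c', 'd']

# Tuple keys add one outer level per tuple element
three_level = pd.concat([s1, s2], keys=[("A", "B"), ("C", "D")])
print(three_level.index.nlevels)             # 3
print(three_level.loc[("C", "D")].tolist())  # ['c', 'd']
```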

Specifying names

The names parameter is used to assign a label to the index of the resulting Series/DataFrame:


        
        
            
                
                
s1 = pd.Series(["a","b"])
s2 = pd.Series(["c","d"])
pd.concat([s1, s2], keys=["A","B"], names=["Groups"])

Groups
A       0    a
        1    b
B       0    c
        1    d
dtype: object

Here, the label "Groups" is assigned to the index of the Series.
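
The assigned labels can be inspected through the names attribute of the resulting index. In this sketch, the second label "Position" is a hypothetical name chosen for illustration; supplying it names the inner level as well.

```python
import pandas as pd

s1 = pd.Series(["a", "b"])
s2 = pd.Series(["c", "d"])

# Only the outer level (built from keys) receives a name
named = pd.concat([s1, s2], keys=["A", "B"], names=["Groups"])
print(list(named.index.names))        # ['Groups', None]

# Passing two names labels both levels of the MultiIndex
fully_named = pd.concat([s1, s2], keys=["A", "B"], names=["Groups", "Position"])
print(list(fully_named.index.names))  # ['Groups', 'Position']
```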

Specifying verify_integrity

By default, verify_integrity=False, which means that duplicate indexes and column labels are allowed:


        
        
            
                
                
s1 = pd.Series(["a","b"])
s2 = pd.Series(["c","d"])
pd.concat([s1, s2])         # verify_integrity=False

0    a
1    b
0    c
1    d
dtype: object

Notice how we have overlapping indexes 0 and 1.

Setting verify_integrity=True will throw an error in such cases:


        
        
            
                
                
s1 = pd.Series(["a","b"])
s2 = pd.Series(["c","d"])
pd.concat([s1, s2], verify_integrity=True)

ValueError: Indexes have overlapping values: Int64Index([0, 1], dtype='int64')

If you want to ensure that the resulting Series/DataFrame has a unique index, consider setting ignore_index=True.
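
A minimal sketch of both approaches follows: catching the ValueError raised by verify_integrity=True, and falling back to ignore_index=True to obtain a unique index. The flag overlap_detected is an illustrative name, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

s1 = pd.Series(["a", "b"])
s2 = pd.Series(["c", "d"])

overlap_detected = False
try:
    # Both inputs use the labels 0 and 1, so the check fails
    pd.concat([s1, s2], verify_integrity=True)
except ValueError as err:
    overlap_detected = True
    print(err)

# Discarding the original labels guarantees a unique index
safe = pd.concat([s1, s2], ignore_index=True)
print(safe.index.is_unique)   # True
```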

Specifying sort

By default, sort=False, which means that the resulting column labels or indexes will not be sorted:


        
        
            
                
                
df = pd.DataFrame({"C":[2,3],"B":[4,5]})
df_other = pd.DataFrame({"A":[6,7],"D":[8,9]})
pd.concat([df, df_other])      # axis=0

     C    B    A    D
0  2.0  4.0  NaN  NaN
1  3.0  5.0  NaN  NaN
0  NaN  NaN  6.0  8.0
1  NaN  NaN  7.0  9.0

Notice how the columns are not sorted by column labels.

When axis=0 and sort=True, the columns will be sorted by column labels:


        
        
            
                
                
df = pd.DataFrame({"C":[2,3],"B":[4,5]})
df_other = pd.DataFrame({"A":[6,7],"D":[8,9]})
pd.concat([df, df_other], sort=True)

     A    B    C    D
0  NaN  4.0  2.0  NaN
1  NaN  5.0  3.0  NaN
0  6.0  NaN  NaN  8.0
1  7.0  NaN  NaN  9.0

When axis=1 and sort=True, the rows will be sorted by row labels:


        
        
            
                
                
df = pd.DataFrame({"C":[2,3],"B":[4,5]}, index=[3,2])
df_other = pd.DataFrame({"A":[6,7],"D":[8,9]}, index=[1,4])
pd.concat([df, df_other], axis=1, sort=True)

     C    B    A    D
1  NaN  NaN  6.0  8.0
2  3.0  5.0  NaN  NaN
3  2.0  4.0  NaN  NaN
4  NaN  NaN  7.0  9.0

Published by Isshin Inada

Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html
