Pandas DataFrame.append() Function

Suraj Joshi Feb 16, 2024

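pandas.DataFrame.append() takes a DataFrame as input and appends its rows to the rows of the DataFrame calling the method, returning a new DataFrame. The caller itself is not modified. If the input DataFrame has a column that is not present in the caller DataFrame, that column is added to the result, and the missing values are set to NaN. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the examples on this page require an earlier pandas version; pd.concat() is the replacement in newer versions.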

Syntax of `pandas.DataFrame.append()` Method:

DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)

Parameters


`other`	DataFrame, Series, or Python dictionary-like object (or a list of these) whose rows are to be appended.
`ignore_index`	Boolean. If `True`, the index labels of the original DataFrames are ignored and the result is labeled 0, 1, ..., n-1. The default value is `False`, which means the original index labels are kept.
`verify_integrity`	Boolean. If `True`, raise a `ValueError` when the resulting index would contain duplicates. The default value is `False`.
`sort`	Boolean. If `True`, sort the columns of the result when the columns of the caller and the other DataFrame are not aligned. The default value is `False`.
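
Since `append()` is not available in pandas 2.0 and later, it can help to see how the same parameters carry over to `pd.concat()`. The following is only a minimal sketch: the helper name `append_rows` is hypothetical (it is not part of pandas), and it assumes `other` is a DataFrame.

```python
import pandas as pd


def append_rows(df, other, ignore_index=False, verify_integrity=False, sort=False):
    """Hypothetical helper: append the rows of `other` to `df` with pd.concat()."""
    return pd.concat(
        [df, other],
        ignore_index=ignore_index,
        verify_integrity=verify_integrity,
        sort=sort,
    )


df_1 = pd.DataFrame({"Name": ["Hisila", "Brian"], "Salary": [23, 30]})
df_2 = pd.DataFrame({"Name": ["Ram"], "Salary": [22]})

# ignore_index=True relabels the rows of the result as 0, 1, 2
result = append_rows(df_1, df_2, ignore_index=True)
print(result)
```

Like `append()`, the helper returns a new DataFrame and leaves `df_1` and `df_2` unchanged.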

Example Codes: Append Two DataFrames With `pandas.DataFrame.append()`

import pandas as pd

names_1=['Hisila', 'Brian','Zeppy']
salary_1=[23,30,21]

names_2=['Ram','Shyam',"Hari"]
salary_2=[22,23,31]

df_1 = pd.DataFrame({'Name': names_1, 'Salary': salary_1})
df_2 = pd.DataFrame({'Name': names_2, 'Salary': salary_2})


merged_df = df_1.append(df_2)

print(df_1)
print(df_2)
print(merged_df)

Output:

     Name  Salary
0  Hisila      23
1   Brian      30
2   Zeppy      21
    Name  Salary
0    Ram      22
1  Shyam      23
2   Hari      31
     Name  Salary
0  Hisila      23
1   Brian      30
2   Zeppy      21
0     Ram      22
1   Shyam      23
2    Hari      31

It appends the rows of df_2 to the end of df_1 and returns the combined rows as merged_df. Each row keeps the index label it had in its parent DataFrame, so the labels 0, 1, and 2 each appear twice in merged_df.
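
One consequence of keeping the parent indices is that a label lookup can match more than one row. The sketch below illustrates this with `pd.concat()`, used here as a stand-in because it keeps the indices in the same way and also runs on pandas 2.0 and later, where `append()` is not available.

```python
import pandas as pd

df_1 = pd.DataFrame({"Name": ["Hisila", "Brian", "Zeppy"], "Salary": [23, 30, 21]})
df_2 = pd.DataFrame({"Name": ["Ram", "Shyam", "Hari"], "Salary": [22, 23, 31]})

# Without ignore_index, every row keeps its original label
merged_df = pd.concat([df_1, df_2])

# The index now contains 0, 1, 2 twice, so it is no longer unique
print(merged_df.index.is_unique)

# Looking up label 0 returns one row from each parent DataFrame
rows_at_0 = merged_df.loc[0]
print(rows_at_0)
```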

Example Codes: Append DataFrames and Ignore the Index With `pandas.DataFrame.append()`

import pandas as pd

names_1=['Hisila', 'Brian','Zeppy']
salary_1=[23,30,21]

names_2=['Ram','Shyam',"Hari"]
salary_2=[22,23,31]

df_1 = pd.DataFrame({'Name': names_1, 'Salary': salary_1})
df_2 = pd.DataFrame({'Name': names_2, 'Salary': salary_2})

merged_df = df_1.append(df_2, ignore_index=True)

print(df_1)
print(df_2)
print(merged_df)

Output:

     Name  Salary
0  Hisila      23
1   Brian      30
2   Zeppy      21
    Name  Salary
0    Ram      22
1  Shyam      23
2   Hari      31
     Name  Salary
0  Hisila      23
1   Brian      30
2   Zeppy      21
3     Ram      22
4   Shyam      23
5    Hari      31

It appends df_2 to the end of df_1, and because we pass ignore_index=True to the append() method, merged_df gets a completely new index running from 0 to 5.
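
For comparison, here is a small sketch of two ways to get the same fresh index without `append()`, assuming pandas 2.0 or later where only `pd.concat()` is available: passing `ignore_index=True` directly, or concatenating first and then calling `reset_index(drop=True)`.

```python
import pandas as pd

df_1 = pd.DataFrame({"Name": ["Hisila", "Brian", "Zeppy"], "Salary": [23, 30, 21]})
df_2 = pd.DataFrame({"Name": ["Ram", "Shyam", "Hari"], "Salary": [22, 23, 31]})

# Option 1: ask concat to discard the original labels
merged_a = pd.concat([df_1, df_2], ignore_index=True)

# Option 2: keep the labels during concat, then replace them afterwards
merged_b = pd.concat([df_1, df_2]).reset_index(drop=True)

print(list(merged_a.index))
print(list(merged_b.index))
```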

Set `verify_integrity=True` in `DataFrame.append()` Method

If we set verify_integrity=True in the append() method, a ValueError is raised when the resulting DataFrame would contain duplicate index labels.

import pandas as pd

names_1=['Hisila', 'Brian','Zeppy']
salary_1=[23,30,21]

names_2=['Ram','Shyam',"Hari"]
salary_2=[22,23,31]

df_1 = pd.DataFrame({'Name': names_1, 'Salary': salary_1})
df_2 = pd.DataFrame({'Name': names_2, 'Salary': salary_2})

merged_df = df_1.append(df_2, verify_integrity=True)

print(df_1)
print(df_2)
print(merged_df)

Output:

ValueError: Indexes have overlapping values: Int64Index([0, 1, 2], dtype='int64')

It raises a ValueError because df_1 and df_2 both use the default index labels 0, 1, and 2. To avoid the error, keep the default verify_integrity=False, or make sure the two DataFrames have non-overlapping index labels.
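
The check can also be handled explicitly rather than switched off. The following sketch assumes `pd.concat()`, which accepts the same `verify_integrity` argument, in place of `append()`; it catches the error and then retries after giving df_2 non-overlapping labels. The labels 3, 4, and 5 are chosen only for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({"Name": ["Hisila", "Brian", "Zeppy"], "Salary": [23, 30, 21]})
df_2 = pd.DataFrame({"Name": ["Ram", "Shyam", "Hari"], "Salary": [22, 23, 31]})

error_message = None
try:
    # Both frames use labels 0, 1, 2, so the integrity check fails
    pd.concat([df_1, df_2], verify_integrity=True)
except ValueError as err:
    error_message = str(err)
    print("Rejected:", error_message)

# Give df_2 labels that do not collide with those of df_1, then retry
df_2.index = [3, 4, 5]
merged_df = pd.concat([df_1, df_2], verify_integrity=True)
print(merged_df)
```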

Example Codes: Append DataFrame With Different Column(s)

If we append a DataFrame that has a column the caller does not have, this column is added to the resulting DataFrame, and the cells for rows that come from the DataFrame lacking that column are set to NaN.

import pandas as pd

names_1=['Hisila', 'Brian','Zeppy']
salary_1=[23,30,21]

names_2=['Ram','Shyam',"Hari"]
salary_2=[22,23,31]
Age=[30,31,33]

df_1 = pd.DataFrame({'Name': names_1, 'Salary': salary_1})
df_2 = pd.DataFrame({'Name': names_2, 'Salary': salary_2,"Age":Age})

merged_df = df_1.append(df_2, sort=False)

print(df_1)
print(df_2)
print(merged_df)

Output:

     Name  Salary
0  Hisila      23
1   Brian      30
2   Zeppy      21
    Name  Salary  Age
0    Ram      22   30
1  Shyam      23   31
2   Hari      31   33
     Name  Salary   Age
0  Hisila      23   NaN
1   Brian      30   NaN
2   Zeppy      21   NaN
0     Ram      22  30.0
1   Shyam      23  31.0
2    Hari      31  33.0

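Here, the rows of df_1 get NaN values for the Age column because the Age column is present only in df_2. Since NaN is a floating-point value, the Age column in merged_df is converted to float, which is why the ages print as 30.0, 31.0, and 33.0.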

We also pass sort=False explicitly. In pandas versions before 1.0, this silences the FutureWarning about the default sorting behavior changing; since pandas 1.0, sort=False is the default.
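
To make the effect of `sort` and the NaN filling easier to inspect, here is a brief sketch that again assumes `pd.concat()` as the stand-in for `append()` on newer pandas versions. It compares the column order under both `sort` settings and checks the dtype of the Age column.

```python
import pandas as pd

df_1 = pd.DataFrame({"Name": ["Hisila", "Brian", "Zeppy"], "Salary": [23, 30, 21]})
df_2 = pd.DataFrame(
    {"Name": ["Ram", "Shyam", "Hari"], "Salary": [22, 23, 31], "Age": [30, 31, 33]}
)

# sort=False keeps the columns in order of first appearance
merged_df = pd.concat([df_1, df_2], sort=False)
print(list(merged_df.columns))

# sort=True orders the non-aligned columns alphabetically
sorted_df = pd.concat([df_1, df_2], sort=True)
print(list(sorted_df.columns))

# The NaN values filled in for df_1 turn Age into a float column
print(merged_df.dtypes)
```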


Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
