Applying Pandas Function with Corresponding Cell Values from Two Different DataFrames

===========================================================

In this article, we will explore how to apply a function using corresponding cell values from two different pandas dataframes. We’ll discuss the use of vectorization in pandas and show examples of how to achieve this without using loops.

Introduction


Pandas is a powerful library for data manipulation and analysis in Python. One of its key features is the ability to perform element-wise operations on DataFrames, which can be very useful in a variety of scenarios. In this article, we’ll focus on applying a function using corresponding cell values from two different DataFrames.

Problem Statement


Given two pandas DataFrames df1 and df2, where df1 contains the statistical data and df2 contains the percentage data, we want to apply a function that divides each value in df1 by its corresponding value in df2.
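To make the problem concrete, here is a minimal sketch of two such DataFrames. The column names (Name, Round 1 to Round 4) follow the code used later in this article; the player names and every number are invented for illustration and are not taken from any real dataset.

```python
import pandas as pd

# Hypothetical sample data: names and numbers are made up for illustration.
df1 = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Round 1": [10, 20, 30],
    "Round 2": [12, 18, 33],
    "Round 3": [9, 21, 27],
    "Round 4": [15, 24, 36],
})

# Percentage data with the same row index and column labels as df1.
df2 = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Round 1": [0.50, 0.80, 0.60],
    "Round 2": [0.60, 0.90, 0.55],
    "Round 3": [0.45, 0.70, 0.90],
    "Round 4": [0.75, 0.80, 0.60],
})

print(df1)
print(df2)
```

The goal is a result in which each Round cell holds the df1 value divided by the df2 value at the same row and column.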

Inefficient Solution using Loops


One possible approach is to use loops to iterate over the rows of df1 and divide each value by its corresponding value in df2. This can be achieved as follows:

import pandas as pd

# Assumes df1 and df2 were loaded without a header, so row 0 holds the column names
stat_rows = df1.values.tolist()
header = stat_rows[0]                # saved to re-attach after the calculations
stat_rows = stat_rows[1:]            # data rows only
perc_rows = df2.values.tolist()[1:]  # data rows only

adjusted = []
for stat_row, perc_row in zip(stat_rows, perc_rows):
    player = []
    for stat, perc in zip(stat_row, perc_row):
        # Divide integer stats only; names and other values pass through unchanged
        if isinstance(stat, int):
            player.append(stat / perc)
        else:
            player.append(stat)
    adjusted.append(player)

adjusted.insert(0, header)
finaldf = pd.DataFrame(adjusted)
print(finaldf)

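This approach is inefficient because it copies both DataFrames into nested Python lists and then performs one interpreted operation per cell. It also discards the index and column dtypes, and it matches cells purely by position.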

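Efficient Solution using Vectorized Division


A more efficient way is to divide the two DataFrames directly. Arithmetic operators such as / work element-wise on DataFrames and pair up cells by row index and column label, so no explicit loop is needed. Note that applymap is not the right tool here: it calls a function on each cell of a single DataFrame and has no access to the matching cell of a second one (it is also deprecated in favor of DataFrame.map since pandas 2.1). We can use the following code: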

columns = ['Round 1', 'Round 2', 'Round 3', 'Round 4']
final = df1[columns] / df2[columns]

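This runs in compiled code over whole columns instead of one Python-level operation per cell, and it avoids the intermediate nested lists. Because cells are matched by label rather than by position, both DataFrames need the same row index and column names; a label that exists in only one of them produces NaN in the result.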
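The following sketch illustrates the label matching behind the division. The data is hypothetical, and the percentages are chosen as 0.5 and 0.25 only so that the quotients are exact. It also shows DataFrame.div, the method form of the / operator.

```python
import pandas as pd

# Hypothetical data for illustration.
df1 = pd.DataFrame({"Name": ["A", "B"], "Round 1": [10, 20], "Round 2": [30, 40]})
df2 = pd.DataFrame({"Name": ["A", "B"], "Round 1": [0.5, 0.25], "Round 2": [0.5, 0.25]})

columns = ["Round 1", "Round 2"]

# Operator form: each cell of df1 is divided by the cell of df2
# that has the same row label and column label.
final = df1[columns] / df2[columns]

# Method form of the same operation.
same = df1[columns].div(df2[columns])

# Reversing the row order of df2 keeps its index labels (1, 0),
# so pandas still pairs row 0 with row 0 and row 1 with row 1.
shuffled = df2.iloc[::-1]
aligned = df1[columns] / shuffled[columns]

print(final)
print(aligned)
```

If the two DataFrames should instead be matched by position, one option is to call reset_index(drop=True) on both before dividing.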

Getting Names Back


The result contains only the numeric columns that were divided, so the Name column is missing. To add the names back to the resulting DataFrame, we can use the following code:

final['Name'] = df1['Name']
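As an alternative sketch (not part of the original solution), the names can be moved into the index before dividing, so that rows are paired by player name rather than by row number. This assumes every name is unique and appears in both DataFrames; the data below is hypothetical.

```python
import pandas as pd

# Hypothetical data; note that df2 lists the players in a different order.
df1 = pd.DataFrame({"Name": ["A", "B"], "Round 1": [10, 20], "Round 2": [30, 40]})
df2 = pd.DataFrame({"Name": ["B", "A"], "Round 1": [0.25, 0.5], "Round 2": [0.25, 0.5]})

# With Name as the index, the division pairs rows by player name.
final = df1.set_index("Name") / df2.set_index("Name")

# Turn the index back into an ordinary column.
final = final.reset_index()

print(final)
```

This keeps the names attached throughout and does not depend on both DataFrames listing the players in the same order.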

Using filter


Another way to select the columns is the filter method. With like='Round' it returns a new DataFrame containing every column whose label includes the substring Round, so the column list does not have to be typed out. We can use the following code:

final = df1.filter(like='Round') / df2.filter(like='Round')

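The division is the same vectorized operation as before; filter only changes how the columns are chosen, which keeps the code short when there are many Round columns.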
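One caveat is that like matches any label containing the substring. The sketch below uses a hypothetical extra column, Rounds Played, to show the difference between like and a stricter regex pattern; both the column and the data are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; "Rounds Played" is an invented column that also contains "Round".
df1 = pd.DataFrame({
    "Name": ["A", "B"],
    "Round 1": [10, 20],
    "Round 2": [30, 40],
    "Rounds Played": [2, 2],
})
df2 = pd.DataFrame({
    "Name": ["A", "B"],
    "Round 1": [0.5, 0.25],
    "Round 2": [0.5, 0.25],
})

# like='Round' keeps every label that contains the substring "Round".
loose = df1.filter(like="Round")
print(list(loose.columns))

# regex keeps only labels that match the pattern "Round <digits>".
strict = df1.filter(regex=r"^Round \d+$")
print(list(strict.columns))

# Dividing the strictly filtered frames avoids a NaN-only "Rounds Played" column.
final = strict / df2.filter(regex=r"^Round \d+$")
print(final)
```

When the column names are this regular, either form works; the regex form is the safer choice once other columns share the same prefix.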

Conclusion


In this article, we discussed how to combine corresponding cell values from two different pandas DataFrames. We showed an inefficient solution using loops and then presented vectorized alternatives: dividing the DataFrames directly with the / operator, with the columns chosen either by an explicit list or by filter. Both vectorized versions avoid per-cell Python loops and intermediate lists, and both match cells by row index and column label.


Last modified on 2023-05-19