Introduction
The pandas DataFrame cumsum() method calculates the cumulative (running) sum along an axis of a DataFrame or Series. It returns a DataFrame or Series of the same size, in which each element is the sum of all values up to and including that position along the chosen axis.
Import the required libraries
To use the cumsum() method, we first need to import the pandas library.
import pandas as pd
Create a DataFrame
Next, we need to create a DataFrame using the data we want to perform the cumulative sum on. For example:
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
Calculate the cumulative sum over the index axis
To calculate the cumulative sum over the index axis, we can use the cumsum() method on the DataFrame. The index axis is axis 0, which is also the default, so the running total moves down the rows within each column. Passing axis=0 explicitly is optional, but it makes the intent clear.
result = df.cumsum(axis=0)
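As a quick check of what this step produces, here is a minimal sketch that reuses the example DataFrame from above. The variable name by_rows is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Running total down the rows, computed separately for each column.
by_rows = df.cumsum(axis=0)
print(by_rows)
# Column A becomes 1, 3, 6, 10 and column B becomes 5, 11, 18, 26.

# axis=0 is the default, so df.cumsum() gives the same result.
print(by_rows.equals(df.cumsum()))
```

Each value is the sum of that row's entry and every entry above it in the same column.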
Calculate the cumulative sum over the column axis
To calculate the cumulative sum over the column axis, we can again use the cumsum() method on the DataFrame, but this time we specify the axis as 1. The running total then moves from left to right across the columns within each row.
result = df.cumsum(axis=1)
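The following sketch shows the row-wise result for the same example DataFrame. The variable name by_columns is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Running total across the columns, computed separately for each row.
by_columns = df.cumsum(axis=1)
print(by_columns)
# Column A is unchanged (1, 2, 3, 4); column B becomes A + B (6, 8, 10, 12).
```

The first column is always unchanged here because there is nothing to its left to add.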
Handling null values in the DataFrame
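If the DataFrame contains null values, by default (skipna=True) the cumsum() method skips them: the result is still NaN at each null position, but the running total continues with the next valid value. We can change this behavior by specifying skipna=False. In that case a null value is not skipped, so every value from the first null onward along that axis becomes NaN.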
result = df.cumsum(axis=0, skipna=False)
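The example DataFrame above has no null values, so the two settings give identical results on it. The sketch below uses a hypothetical DataFrame, df_nan, that is not part of the original example, to make the difference visible.

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
nan = float("nan")
df_nan = pd.DataFrame({"A": [1.0, nan, 3.0, 4.0], "B": [5.0, 6.0, nan, 8.0]})

# Default behavior: NaN stays NaN in place, the total carries on afterwards.
skipped = df_nan.cumsum(axis=0)  # same as skipna=True
print(skipped)
# A: 1.0, NaN, 4.0, 8.0    B: 5.0, 11.0, NaN, 19.0

# skipna=False: everything from the first NaN onward becomes NaN.
propagated = df_nan.cumsum(axis=0, skipna=False)
print(propagated)
# A: 1.0, NaN, NaN, NaN    B: 5.0, 11.0, NaN, NaN
```

Which setting is appropriate depends on whether a missing value should be treated as "no contribution" or as making the later totals unknown.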
Summary
The cumsum() method in pandas calculates the cumulative sum along an axis of a DataFrame or Series. It can accumulate over the index axis (axis=0, the default) or over the column axis (axis=1). The skipna parameter controls how null values are handled: they are skipped by default, and with skipna=False they turn all later values along that axis into NaN.