Python's Higher-Order Functions

This tutorial is from the open-source community.

Introduction

In this lab, you will learn about higher-order functions in Python. Higher-order functions can accept other functions as arguments or return functions as results. This concept is crucial in functional programming and enables you to write more modular and reusable code.

You will understand what higher-order functions are, create one that takes a function as an argument, refactor existing functions to use a higher-order function, and utilize Python's built-in map() function. The reader.py file will be modified during the lab.

Understanding Code Duplication

Let's start by looking at the current code in the reader.py file. Examining existing code is an important step in understanding how things work and identifying areas for improvement. Open reader.py in the WebIDE, either by clicking the file in the file explorer or by running the following commands in the terminal. They change to the project directory and then print the contents of reader.py.

cd ~/project
cat reader.py

When you look at the code, you'll notice there are two functions. Functions in Python are blocks of code that perform a specific task. Here are the two functions and what they do:

  1. csv_as_dicts(): This function takes CSV data and converts it into a list of dictionaries. A dictionary in Python is a collection of key-value pairs, which is useful for storing data in a structured way.
  2. csv_as_instances(): This function takes CSV data and converts it into a list of instances. An instance is an object created from a class, which is a blueprint for creating objects.

Now, let's take a closer look at these two functions. You'll see that they are quite similar. Both functions follow these steps:

  • First, they initialize an empty records list. A list in Python is a collection of items that can be of different types. Initializing an empty list means creating a list with no items in it, which will be used to store the processed data.
  • Then, they use csv.reader() to parse the input. Parsing means analyzing the input data to extract meaningful information. In this case, csv.reader() helps us read the CSV data row by row.
  • They handle the headers in the same way. Headers in a CSV file are the first row that usually contains the names of the columns.
  • After that, they loop through each row in the CSV data. A loop is a programming construct that allows you to execute a block of code multiple times.
  • For each row, they process it to create a record. This record can be either a dictionary or an instance, depending on the function.
  • They append the record to the records list. Appending means adding an item to the end of the list.
  • Finally, they return the records list, which contains all the processed data.
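The contents of reader.py are not reproduced in this lab text, so the following is only a sketch of what two such functions might look like, written from the steps listed above. The function bodies are assumptions; the `types` list and the `from_row()` class method follow the way the functions are called in this lab.

```python
import csv

def csv_as_dicts(lines, types, *, headers=None):
    """Sketch: convert CSV lines into a list of dictionaries."""
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)          # first row holds the column names
    for row in rows:
        # The only step that differs between the two functions
        record = {name: func(val) for name, func, val in zip(headers, types, row)}
        records.append(record)
    return records

def csv_as_instances(lines, cls, *, headers=None):
    """Sketch: convert CSV lines into a list of cls instances."""
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)          # first row holds the column names
    for row in rows:
        # The only step that differs between the two functions
        record = cls.from_row(row)
        records.append(record)
    return records
```

Apart from the single line that builds `record`, the two bodies are identical.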

This duplication of code is a problem for several reasons. When code is duplicated:

  • It becomes harder to maintain. Any change has to be made in every copy, which takes more time and effort.
  • The copies can drift apart. If you forget to update one of them, the functions start to behave inconsistently.
  • It increases the chance of introducing bugs, because every copy is another place where an error can slip in.

The only real difference between these two functions is how they convert a row into a record. This is a classic situation where a higher-order function is useful. A higher-order function is a function that takes another function as an argument, returns a function as its result, or both.
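To make the definition concrete, here is a minimal sketch unrelated to CSV data. The names `apply_to_each` and `make_multiplier` are made up for illustration: the first takes a function as an argument, the second returns a function.

```python
def apply_to_each(func, items):
    """Takes a function as an argument and applies it to every item."""
    return [func(item) for item in items]

def make_multiplier(factor):
    """Returns a new function that multiplies its argument by factor."""
    def multiply(value):
        return value * factor
    return multiply

double = make_multiplier(2)                       # double is itself a function
print(apply_to_each(double, [1, 2, 3]))           # [2, 4, 6]
print(apply_to_each(str.upper, ['aa', 'ibm']))    # ['AA', 'IBM']
```

The caller decides what happens to each item, while `apply_to_each` owns the looping. The CSV functions can be split along the same line.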

Let's look at some sample usage of these functions to better understand how they work. The following code shows how to use csv_as_dicts() and csv_as_instances():

## Example of using csv_as_dicts
with open('portfolio.csv') as f:
    portfolio = csv_as_dicts(f, [str, int, float])
print(portfolio[0])  ## {'name': 'AA', 'shares': 100, 'price': 32.2}

## Example of using csv_as_instances
class Stock:
    @classmethod
    def from_row(cls, row):
        return cls(row[0], int(row[1]), float(row[2]))

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

with open('portfolio.csv') as f:
    portfolio = csv_as_instances(f, Stock)
print(portfolio[0].name, portfolio[0].shares, portfolio[0].price)  ## AA 100 32.2

In the next step, we'll create a higher-order function to eliminate this code duplication. This will make the code more maintainable and less error-prone.

Creating a Higher-Order Function

Passing a function as an argument lets one function handle the shared work while the caller supplies the part that varies. Let's create a higher-order function called convert_csv() that does this. It will handle the common operations of processing CSV data, while allowing you to customize how each row of the CSV is converted into a record.

Open the reader.py file in the WebIDE. We're going to add a function that will take an iterable of CSV data, a conversion function, and optionally, column headers. The conversion function will be used to transform each row of the CSV into a record.

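Here's the code for the convert_csv() function. Copy and paste it into your reader.py file. It relies on the csv module, which reader.py must already import because the existing functions call csv.reader():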

def convert_csv(lines, conversion_func, *, headers=None):
    '''
    Convert lines of CSV data using the provided conversion function

    Args:
        lines: An iterable containing CSV data
        conversion_func: A function that takes headers and a row and returns a record
        headers: Column headers (optional). If None, the first row is used as headers

    Returns:
        A list of records as processed by conversion_func
    '''
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    for row in rows:
        record = conversion_func(headers, row)
        records.append(record)
    return records

Let's break down what this function does. First, it initializes an empty list called records to store the converted records. Then, it uses the csv.reader() function to read the lines of CSV data. If no headers are provided, it takes the first row as the headers. For each subsequent row, it applies the conversion_func to convert the row into a record and adds it to the records list. Finally, it returns the list of records.

Now, we need a simple conversion function to test our convert_csv() function. This function will take the headers and a row and convert the row into a dictionary using the headers as keys.

Here's the code for the make_dict() function. Add this function to your reader.py file as well:

def make_dict(headers, row):
    '''
    Convert a row to a dictionary using the provided headers
    '''
    return dict(zip(headers, row))

The make_dict() function uses the zip() function to pair each header with its corresponding value in the row, and then creates a dictionary from these pairs.
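The pairing step can be seen in isolation with a hand-written header list and row. The values below are taken from the first row of the sample output in this lab; the short row is an invented case, included to show that zip() stops at the shorter input.

```python
headers = ['name', 'shares', 'price']
row = ['AA', '100', '32.20']

# zip() pairs items by position
pairs = list(zip(headers, row))
print(pairs)   # [('name', 'AA'), ('shares', '100'), ('price', '32.20')]

# dict() turns the (key, value) pairs into a dictionary
record = dict(pairs)
print(record)  # {'name': 'AA', 'shares': '100', 'price': '32.20'}

# zip() stops at the shorter input, so a short row silently yields fewer keys
short = dict(zip(headers, ['IBM', '50']))
print(short)   # {'name': 'IBM', 'shares': '50'}
```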

Let's test these functions. Open a Python shell by running the following commands in the terminal:

cd ~/project
python3 -i reader.py

The -i option tells python3 to run the reader.py file and then stay in interactive mode instead of exiting, so the functions we just defined are available at the prompt.

In the Python shell, run the following code to test our functions:

## Open the CSV file and convert it to a list of dictionaries.
## The with statement closes the file when the block ends.
with open('portfolio.csv') as f:
    result = convert_csv(f, make_dict)

## Print the result
print(result)

This code opens the portfolio.csv file, uses the convert_csv() function with the make_dict() conversion function to convert the CSV data into a list of dictionaries, and then prints the result.

You should see output similar to the following:

[{'name': 'AA', 'shares': '100', 'price': '32.20'}, {'name': 'IBM', 'shares': '50', 'price': '91.10'}, {'name': 'CAT', 'shares': '150', 'price': '83.44'}, {'name': 'MSFT', 'shares': '200', 'price': '51.23'}, {'name': 'GE', 'shares': '95', 'price': '40.37'}, {'name': 'MSFT', 'shares': '50', 'price': '65.10'}, {'name': 'IBM', 'shares': '100', 'price': '70.44'}]

This output shows that our higher-order function convert_csv() works correctly. Note that every value is still a string, because make_dict() performs no type conversion. We've successfully created a function that takes another function as an argument, which gives us the ability to easily change how the CSV data is converted.
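As a rough illustration of that flexibility, the sketch below repeats convert_csv() so that it runs on its own, then passes it different conversion functions. `make_tuple` and the lambda are hypothetical converters, and io.StringIO stands in for an open file so that no portfolio.csv is needed.

```python
import csv
import io

def convert_csv(lines, conversion_func, *, headers=None):
    """Same logic as the convert_csv() defined in this step."""
    records = []
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    for row in rows:
        records.append(conversion_func(headers, row))
    return records

def make_tuple(headers, row):
    """Hypothetical converter: ignore the headers and keep the row as a tuple."""
    return tuple(row)

data = "name,shares,price\nAA,100,32.20\nIBM,50,91.10\n"

as_tuples = convert_csv(io.StringIO(data), make_tuple)
print(as_tuples)   # [('AA', '100', '32.20'), ('IBM', '50', '91.10')]

# A lambda works too, as long as it accepts (headers, row)
names = convert_csv(io.StringIO(data), lambda headers, row: row[0])
print(names)       # ['AA', 'IBM']

# For data without a header row, supply the headers explicitly
no_header = "AA,100,32.20\n"
with_headers = convert_csv(io.StringIO(no_header),
                           lambda headers, row: dict(zip(headers, row)),
                           headers=['name', 'shares', 'price'])
print(with_headers)  # [{'name': 'AA', 'shares': '100', 'price': '32.20'}]
```

The loop, header handling, and list building never change; only the function passed in does.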

To exit the Python shell, you can type exit() or press Ctrl+D.

Refactoring Existing Functions

We now have a higher-order function named convert_csv(). In this section, we will use it to refactor the original functions csv_as_dicts() and csv_as_instances(). Refactoring is the process of restructuring existing code without changing its external behavior, with the aim of improving its internal structure, for example by eliminating code duplication.

Let's start by opening the reader.py file in the WebIDE. We will update the functions as follows:

  1. First, we'll replace the csv_as_dicts() function. This function is used to convert lines of CSV data into a list of dictionaries. Here's the new code:
def csv_as_dicts(lines, types, *, headers=None):
    '''
    Convert lines of CSV data into a list of dictionaries
    '''
    def dict_converter(headers, row):
        return {name: func(val) for name, func, val in zip(headers, types, row)}

    return convert_csv(lines, dict_converter, headers=headers)

In this code, we define an inner function dict_converter that takes headers and row as arguments. It uses a dictionary comprehension to create a dictionary where the keys are the header names, and the values are the result of applying the corresponding type conversion function to the values in the row. Then, we call the convert_csv() function with the dict_converter function as an argument.

  2. Next, we'll replace the csv_as_instances() function. This function is used to convert lines of CSV data into a list of instances of a given class. Here's the new code:
def csv_as_instances(lines, cls, *, headers=None):
    '''
    Convert lines of CSV data into a list of instances
    '''
    def instance_converter(headers, row):
        return cls.from_row(row)

    return convert_csv(lines, instance_converter, headers=headers)

In this code, we define an inner function instance_converter that takes headers and row as arguments. It calls the from_row class method of the given class cls to create an instance from the row. Then, we call the convert_csv() function with the instance_converter function as an argument.
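Both inner functions use a variable from the enclosing function (`types` in one case, `cls` in the other) even though convert_csv() only ever passes them `headers` and `row`. The sketch below isolates that idea; `make_dict_converter` is a hypothetical helper, not part of reader.py.

```python
def make_dict_converter(types):
    """Hypothetical helper: build a converter that remembers types."""
    def dict_converter(headers, row):
        # types is not a parameter here; it is captured from the enclosing call
        return {name: func(val) for name, func, val in zip(headers, types, row)}
    return dict_converter

converter = make_dict_converter([str, int, float])

# The converter is later called with only headers and a row
record = converter(['name', 'shares', 'price'], ['AA', '100', '32.20'])
print(record)   # {'name': 'AA', 'shares': 100, 'price': 32.2}
```

This is what lets csv_as_dicts() hand convert_csv() a two-argument function while still applying the caller's type conversions.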

After refactoring these functions, we need to test them to make sure they still work as expected. To do this, we'll run the following commands in a Python shell:

cd ~/project
python3 -i reader.py

The cd ~/project command changes the current working directory to the project directory. The python3 -i reader.py command runs the reader.py file in interactive mode, which means we can continue to execute Python code after the file has finished running.

Once the Python shell is open, we'll run the following code to test the refactored functions:

## Define a simple Stock class for testing
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    @classmethod
    def from_row(cls, row):
        return cls(row[0], int(row[1]), float(row[2]))

    def __repr__(self):
        return f'Stock({self.name}, {self.shares}, {self.price})'

## Test csv_as_dicts
with open('portfolio.csv') as f:
    portfolio_dicts = csv_as_dicts(f, [str, int, float])
print("First dictionary:", portfolio_dicts[0])

## Test csv_as_instances
with open('portfolio.csv') as f:
    portfolio_instances = csv_as_instances(f, Stock)
print("First instance:", portfolio_instances[0])

In this code, we first define a simple Stock class for testing. The __init__ method initializes the attributes of a Stock instance. The from_row class method creates a Stock instance from a row of CSV data. The __repr__ method provides a string representation of the Stock instance.

Then, we test the csv_as_dicts() function by opening the portfolio.csv file and passing it to the function along with a list of type conversion functions. We print the first dictionary in the resulting list.

Finally, we test the csv_as_instances() function by opening the portfolio.csv file and passing it to the function along with the Stock class. We print the first instance in the resulting list.

If everything is working correctly, you should see output similar to the following:

First dictionary: {'name': 'AA', 'shares': 100, 'price': 32.2}
First instance: Stock(AA, 100, 32.2)

This output indicates that our refactored functions are working correctly. We've successfully eliminated the code duplication while maintaining the same functionality.

To exit the Python shell, you can type exit() or press Ctrl+D.

Using the map() Function

Python's built-in map() function is a good example of a higher-order function. It applies a given function to each item of an iterable, such as a list or a tuple, and returns an iterator over the results. This makes map() well suited to processing streams of data, like the rows of a CSV file.

The basic syntax of the map() function is as follows:

map(function, iterable, ...)

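Here, function is the operation you want to perform on each item, and iterable is any object you can loop over, such as a list, a tuple, or a csv.reader object. The trailing ... means that map() also accepts additional iterables. In that case, function is called with one item from each of them, and map() stops when the shortest iterable is exhausted.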

Let's look at a simple example. Suppose you have a list of numbers, and you want to square each number in that list. You can use the map() function to achieve this. Here's how you can do it:

numbers = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x * x, numbers))
print(squared)  ## Output: [1, 4, 9, 16, 25]

In this example, we first define a list called numbers. Then, we use the map() function. The lambda function lambda x: x * x is the operation we want to perform on each item in the numbers list. The map() function applies this lambda function to each number in the list. Since map() returns an iterator, we convert it to a list using the list() function. Finally, we print the squared list, which contains the squared values of the original numbers.
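Two properties of map() are easy to miss: the iterator it returns can be consumed only once, and it can walk several iterables in parallel. The short sketch below shows both with made-up numbers.

```python
import operator

# The iterator returned by map() produces values on demand
squares = map(lambda x: x * x, [1, 2, 3])
first = next(squares)     # computes only the first result
rest = list(squares)      # consumes the remaining results
again = list(squares)     # the iterator is now exhausted

print(first)   # 1
print(rest)    # [4, 9]
print(again)   # []

# With two iterables, the function receives one item from each
sums = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
print(sums)    # [11, 22, 33]
```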

Now, let's take a look at how we can use the map() function to modify our convert_csv() function. Previously, we used a for loop to iterate over the rows in the CSV data. Now, we'll replace that for loop with the map() function.

def convert_csv(lines, conversion_func, *, headers=None):
    '''
    Convert lines of CSV data using the provided conversion function
    '''
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)

    ## Use map to apply conversion_func to each row
    records = list(map(lambda row: conversion_func(headers, row), rows))
    return records

This updated version of the convert_csv() function does exactly the same thing as the previous version, but it uses the map() function instead of a for loop. The lambda function inside the map() takes each row from the CSV data and applies the conversion_func to it, along with the headers.
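The lambda is needed only because conversion_func takes two arguments while map() supplies one row at a time. Two other spellings give the same result and are shown here as sketches, not as the lab's solution: functools.partial to fix the `headers` argument in advance, and a list comprehension. The name `convert_csv_comprehension` is made up to keep the two variants apart.

```python
import csv
import io
from functools import partial

def convert_csv(lines, conversion_func, *, headers=None):
    """Variant using functools.partial instead of a lambda."""
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    # partial() fixes the first positional argument (headers);
    # map() then supplies each row as the second argument
    return list(map(partial(conversion_func, headers), rows))

def convert_csv_comprehension(lines, conversion_func, *, headers=None):
    """Variant using a list comprehension instead of map()."""
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)
    return [conversion_func(headers, row) for row in rows]

def make_dict(headers, row):
    return dict(zip(headers, row))

data = "name,shares,price\nAA,100,32.20\n"
print(convert_csv(io.StringIO(data), make_dict))
print(convert_csv_comprehension(io.StringIO(data), make_dict))
# Both print [{'name': 'AA', 'shares': '100', 'price': '32.20'}]
```

Which form reads best is largely a matter of taste; many Python programmers reach for the comprehension when the result is going to be a list anyway.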

Let's test this updated function to make sure it works correctly. First, open your terminal and navigate to the project directory. Then, start the Python interactive shell with the reader.py file.

cd ~/project
python3 -i reader.py

Once you're in the Python shell, run the following code to test the updated convert_csv() function:

## Test the updated convert_csv function
with open('portfolio.csv') as f:
    result = convert_csv(f, make_dict)
print(result[0])  ## Should print the first dictionary

## Test that csv_as_dicts still works
with open('portfolio.csv') as f:
    portfolio = csv_as_dicts(f, [str, int, float])
print(portfolio[0])  ## Should print the first dictionary with converted types

After running this code, you should see output similar to the following:

{'name': 'AA', 'shares': '100', 'price': '32.20'}
{'name': 'AA', 'shares': 100, 'price': 32.2}

This output shows that the updated convert_csv() function using the map() function works correctly, and the functions that rely on it also continue to work as expected.

Using the map() function has several advantages:

  1. It can be more concise than a for loop. Instead of writing multiple lines of code for a for loop, you can achieve the same result with a single line using map().
  2. It clearly communicates your intent to transform each item in a sequence. When you see map(), you immediately know that you're applying a function to each item in an iterable.
  3. It can be more memory-efficient because it returns an iterator. An iterator generates values on-the-fly, which means it doesn't store all the results in memory at once. In our example, we converted the iterator returned by map() to a list, but in some cases, you can work directly with the iterator to save memory.
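To see what working directly with the iterator might look like, here is a sketch of a lazy variant. `iter_csv` is a hypothetical name, not part of the lab's reader.py, and io.StringIO stands in for an open file.

```python
import csv
import io

def make_dict(headers, row):
    return dict(zip(headers, row))

def iter_csv(lines, conversion_func, *, headers=None):
    """Hypothetical lazy variant: return the map iterator, not a list."""
    rows = csv.reader(lines)
    if headers is None:
        headers = next(rows)   # the header row is read immediately
    # Data rows are converted one at a time, only when requested
    return map(lambda row: conversion_func(headers, row), rows)

data = io.StringIO("name,shares,price\nAA,100,32.20\nIBM,50,91.10\nCAT,150,83.44\n")

# Only one record exists in memory at any moment during the sum
total_shares = sum(int(rec['shares']) for rec in iter_csv(data, make_dict))
print(total_shares)   # 300
```

One caveat with this approach: when reading from a real file, the file has to stay open until the iterator has been fully consumed, so the loop over the records belongs inside the with block.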

To exit the Python shell, you can type exit() or press Ctrl+D.

Summary

In this lab, you have learned about higher-order functions in Python and how they contribute to writing more modular and maintainable code. First, you identified code duplication in two similar functions. Then, you created a higher-order function convert_csv() that accepts a conversion function as an argument and refactored the original functions to use it. Finally, you updated the higher-order function to utilize Python's built-in map() function.

These techniques are powerful assets in a Python programmer's toolkit. Higher-order functions promote code reuse and separation of concerns, while passing functions as arguments allows for more flexible and customizable behavior. Functions like map() offer concise ways to transform data. Mastering these concepts enables you to write Python code that is more concise, maintainable, and less error-prone.