Definitional Aspects of Functions


Introduction

In this lab, you will explore the fundamental aspects of Python functions and methods, and you will make functions more flexible by designing their parameters effectively.

Moreover, you will implement type hints to enhance code readability and safety, which is crucial for writing high-quality Python code.


Skills Graph

This lab draws on the following Python skills: Variables and Data Types, Function Definition, Arguments and Return Values, Default Arguments, Classes and Objects, Opening and Closing Files, Reading and Writing Files, File Operations, and Data Collections.

Understanding the Context

In previous exercises, you may have encountered code that reads CSV files and stores the data in various data structures. The purpose of this code is to take raw text data from a CSV file and convert it into more useful Python objects, such as dictionaries or class instances. This conversion is essential because it allows us to work with the data in a more structured and meaningful way within our Python programs.

The typical pattern for reading CSV files often follows a specific structure. Here is an example of a function that reads a CSV file and converts each row into a dictionary:

import csv

def read_csv_as_dicts(filename, types):
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val)
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

Let's break down how this function works. The code first imports the csv module, which provides functionality for working with CSV files in Python. The function takes two parameters: filename, the name of the CSV file to read, and types, a list of functions used to convert the data in each column to the appropriate data type.

Inside the function, it initializes an empty list called records to store the dictionaries representing each row of the CSV file. It then opens the file using the with statement, which ensures that the file is properly closed after the block of code is executed. The csv.reader function is used to create an iterator that reads each row of the CSV file. The first row is assumed to be the headers, so it is retrieved using the next function.

Next, the function iterates over the remaining rows in the CSV file. For each row, it creates a dictionary using a dictionary comprehension. The keys of the dictionary are the column headers, and the values are the result of applying the corresponding type conversion function from the types list to the value in the row. Finally, the dictionary is added to the records list, and the function returns the list of dictionaries.
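To see this conversion step in isolation, here is a small sketch using made-up header, converter, and row values that mirror one iteration of the loop:

```python
# Hypothetical headers, converters, and raw row mirroring one loop iteration
headers = ['name', 'shares', 'price']
types = [str, int, float]
row = ['AA', '100', '32.2']          # every CSV field starts out as a string

record = {name: func(val)
          for name, func, val in zip(headers, types, row)}
print(record)  # {'name': 'AA', 'shares': 100, 'price': 32.2}
```

Note how zip pairs each header with its converter function and raw value, so 'shares' ends up as an int and 'price' as a float.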

Now, let's look at a similar function that reads data from a CSV file into class instances:

def read_csv_as_instances(filename, cls):
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records

This function is similar to the previous one, but instead of creating dictionaries, it creates instances of a class. The function takes two parameters: filename, which is the name of the CSV file to read, and cls, which is the class whose instances will be created.

Inside the function, it follows a similar structure as the previous function. It initializes an empty list called records to store the class instances. It then opens the file, reads the headers, and iterates over the remaining rows. For each row, it calls the from_row method of the class cls to create an instance of the class using the data from the row. The instance is then added to the records list, and the function returns the list of instances.

In this lab, we will refactor these functions to make them more flexible and robust. We will also explore Python's type hinting system, which allows us to specify the expected types of the parameters and return values of our functions. This can make our code more readable and easier to understand, especially for other developers who may be working with our code.

Let's start by creating a reader.py file and adding these initial functions to it. Make sure to test these functions to ensure they work properly before moving on to the next steps.

Creating the Basic CSV Reader Functions

Let's start by creating a reader.py file with two basic functions for reading CSV data. These functions will help us handle CSV files in different ways, such as converting the data into dictionaries or class instances.

First, we need to understand what a CSV file is. CSV stands for Comma-Separated Values. It's a simple file format used to store tabular data, where each line represents a row, and the values in each row are separated by commas.

Now, let's create the reader.py file. Follow these steps:

  1. Open up the code editor and create a new file called reader.py in the /home/labex/project directory. This is where we'll write our functions to read CSV data.

  2. Add the following code to reader.py:

## reader.py

import csv

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion

    Args:
        filename: Path to the CSV file
        types: List of type conversion functions for each column

    Returns:
        List of dictionaries with data from the CSV file
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = { name: func(val)
                       for name, func, val in zip(headers, types, row) }
            records.append(record)
    return records

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of class instances

    Args:
        filename: Path to the CSV file
        cls: Class to create instances from

    Returns:
        List of class instances with data from the CSV file
    '''
    records = []
    with open(filename) as file:
        rows = csv.reader(file)
        headers = next(rows)
        for row in rows:
            record = cls.from_row(row)
            records.append(record)
    return records

In the read_csv_as_dicts function, we first open the CSV file using the open function. Then, we use the csv.reader to read the file line by line. The next(rows) statement reads the first line of the file, which usually contains the headers. After that, we iterate over the remaining rows. For each row, we create a dictionary where the keys are the headers and the values are the corresponding values in the row, with optional type conversion using the types list.

The read_csv_as_instances function is similar, but instead of creating dictionaries, it creates instances of a given class. It assumes that the class provides a from_row method (typically a class method) that can create an instance from a row of data.
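The stock module is provided by the lab environment and is not shown here. For reference, here is a minimal sketch of what such a Stock class might look like; the attribute names and the from_row alternate constructor (written here as a classmethod, one common way to implement such a constructor) are assumptions based on how the class is used in this lab:

```python
# stock.py -- a minimal sketch of the Stock class the test script imports.
# The real lab provides its own version; the details here are assumptions.

class Stock:
    types = (str, int, float)  # converters for name, shares, price

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    @classmethod
    def from_row(cls, row):
        # Convert each raw string in the row before constructing the instance
        values = [func(val) for func, val in zip(cls.types, row)]
        return cls(*values)

    def __repr__(self):
        return f'Stock({self.name!r}, {self.shares}, {self.price})'
```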

  3. Let's test these functions to make sure they work correctly. Create a new file called test_reader.py with the following code:
## test_reader.py

import reader
import stock

## Test reading CSV as dictionaries
portfolio_dicts = reader.read_csv_as_dicts('portfolio.csv', [str, int, float])
print("First portfolio item as dictionary:", portfolio_dicts[0])
print("Total items:", len(portfolio_dicts))

## Test reading CSV as class instances
portfolio_instances = reader.read_csv_as_instances('portfolio.csv', stock.Stock)
print("\nFirst portfolio item as Stock instance:", portfolio_instances[0])
print("Total items:", len(portfolio_instances))

In the test_reader.py file, we import the reader module that we just created and the stock module. We then test the two functions by calling them with a sample CSV file named portfolio.csv. We print the first item and the total number of items in the portfolio to verify that the functions are working as expected.

  4. Run the test script from the terminal:
python test_reader.py

The output should look similar to this:

First portfolio item as dictionary: {'name': 'AA', 'shares': 100, 'price': 32.2}
Total items: 7

First portfolio item as Stock instance: Stock('AA', 100, 32.2)
Total items: 7

This confirms that our two functions are working correctly. The first function converts CSV data into a list of dictionaries with proper type conversion, and the second function creates class instances by calling the from_row method on the provided class.

In the next step, we'll refactor these functions to make them more flexible by allowing them to work with any iterable source of data, not just filenames.


Making Functions More Flexible

Currently, our functions are limited to reading from files specified by a filename. This restricts their usability. In programming, it's often beneficial to make functions more flexible so that they can handle different types of input. In our case, it would be great if our functions could work with any iterable that produces lines, such as file objects or other sources. This way, we can use these functions in more scenarios, like reading from compressed files or other data streams.
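The refactoring works because csv.reader itself already accepts any iterable of strings, not just an open file object. A quick sketch:

```python
import csv

# csv.reader accepts any iterable of strings, such as a plain list
lines = ['name,shares,price', 'AA,100,32.2']
rows = list(csv.reader(lines))
print(rows)  # [['name', 'shares', 'price'], ['AA', '100', '32.2']]
```

Anything that yields lines when iterated -- an open file, a gzip file opened in text mode, a list, a generator -- can serve as the source.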

Let's refactor our code to enable this flexibility:

  1. Open the reader.py file. We're going to modify it to include some new functions. These new functions will allow our code to work with different types of iterables. Here's the code you need to add:
## reader.py

import csv

def csv_as_dicts(lines, types):
    '''
    Parse CSV data from an iterable into a list of dictionaries

    Args:
        lines: An iterable producing CSV lines
        types: List of type conversion functions for each column

    Returns:
        List of dictionaries with data from the CSV lines
    '''
    records = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = { name: func(val)
                  for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

def csv_as_instances(lines, cls):
    '''
    Parse CSV data from an iterable into a list of class instances

    Args:
        lines: An iterable producing CSV lines
        cls: Class to create instances from

    Returns:
        List of class instances with data from the CSV lines
    '''
    records = []
    rows = csv.reader(lines)
    headers = next(rows)
    for row in rows:
        record = cls.from_row(row)
        records.append(record)
    return records

def read_csv_as_dicts(filename, types):
    '''
    Read CSV data into a list of dictionaries with optional type conversion

    Args:
        filename: Path to the CSV file
        types: List of type conversion functions for each column

    Returns:
        List of dictionaries with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_dicts(file, types)

def read_csv_as_instances(filename, cls):
    '''
    Read CSV data into a list of class instances

    Args:
        filename: Path to the CSV file
        cls: Class to create instances from

    Returns:
        List of class instances with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_instances(file, cls)

Let's take a closer look at how we've refactored the code:

  1. We've created two more generic functions, csv_as_dicts() and csv_as_instances(). These functions are designed to work with any iterable that produces CSV lines. This means they can handle different types of input sources, not just files specified by a filename.

  2. We've reimplemented read_csv_as_dicts() and read_csv_as_instances() to use these new functions. This way, the original functionality of reading from a file by filename is still available, but now it's built on top of the more flexible functions.

  3. This approach maintains backward compatibility with existing code. That means any code that was using the old functions will still work as expected. At the same time, our library becomes more flexible because it can now handle different types of input sources.

  4. Now, let's test these new functions. Create a file called test_reader_flexibility.py and add the following code to it. This code will test the new functions with different types of input sources:

## test_reader_flexibility.py

import reader
import stock
import gzip

## Test opening a regular file
with open('portfolio.csv') as file:
    portfolio = reader.csv_as_dicts(file, [str, int, float])
    print("First item from open file:", portfolio[0])

## Test opening a gzipped file
with gzip.open('portfolio.csv.gz', 'rt') as file:  ## 'rt' means read text
    portfolio = reader.csv_as_instances(file, stock.Stock)
    print("\nFirst item from gzipped file:", portfolio[0])

## Test backward compatibility
portfolio = reader.read_csv_as_dicts('portfolio.csv', [str, int, float])
print("\nFirst item using backward compatible function:", portfolio[0])
  5. After creating the test file, we need to run the test script from the terminal. Open your terminal and navigate to the directory where the test_reader_flexibility.py file is located. Then run the following command:
python test_reader_flexibility.py

The output should look similar to this:

First item from open file: {'name': 'AA', 'shares': 100, 'price': 32.2}

First item from gzipped file: Stock('AA', 100, 32.2)

First item using backward compatible function: {'name': 'AA', 'shares': 100, 'price': 32.2}

This output confirms that our functions now work with different types of input sources while maintaining backward compatibility. The refactored functions can process data from:

  • Regular files opened with open()
  • Compressed files opened with gzip.open()
  • Any other iterable object that produces lines of text

This makes our code much more flexible and easier to use in different scenarios.
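To illustrate the last point, even an in-memory io.StringIO buffer works as a source of lines. The sketch below inlines the csv_as_dicts logic so it runs standalone:

```python
import csv
import io

def csv_as_dicts(lines, types):
    # Same parsing logic as in reader.py, inlined so this sketch is standalone
    rows = csv.reader(lines)
    headers = next(rows)
    return [{name: func(val) for name, func, val in zip(headers, types, row)}
            for row in rows]

# No file on disk: the CSV source is an in-memory text buffer
buffer = io.StringIO('name,shares,price\nAA,100,32.2\n')
portfolio = csv_as_dicts(buffer, [str, int, float])
print(portfolio[0])  # {'name': 'AA', 'shares': 100, 'price': 32.2}
```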


Handling CSV Files Without Headers

In the world of data processing, not all CSV files come with headers in their first row. Headers are the names given to each column in a CSV file, which help us understand what kind of data each column holds. When a CSV file lacks headers, we need a way to handle it properly. In this section, we'll modify our functions to allow the caller to provide the headers manually, so we can work with CSV files both with and without headers.

  1. Open the reader.py file and update it to include header handling:
## reader.py

import csv

def csv_as_dicts(lines, types, headers=None):
    '''
    Parse CSV data from an iterable into a list of dictionaries

    Args:
        lines: An iterable producing CSV lines
        types: List of type conversion functions for each column
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of dictionaries with data from the CSV lines
    '''
    records = []
    rows = csv.reader(lines)

    if headers is None:
        ## Use the first row as headers if none provided
        headers = next(rows)

    for row in rows:
        record = { name: func(val)
                  for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

def csv_as_instances(lines, cls, headers=None):
    '''
    Parse CSV data from an iterable into a list of class instances

    Args:
        lines: An iterable producing CSV lines
        cls: Class to create instances from
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of class instances with data from the CSV lines
    '''
    records = []
    rows = csv.reader(lines)

    if headers is None:
        ## Skip the first row if no headers provided
        next(rows)

    for row in rows:
        record = cls.from_row(row)
        records.append(record)
    return records

def read_csv_as_dicts(filename, types, headers=None):
    '''
    Read CSV data into a list of dictionaries with optional type conversion

    Args:
        filename: Path to the CSV file
        types: List of type conversion functions for each column
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of dictionaries with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_dicts(file, types, headers)

def read_csv_as_instances(filename, cls, headers=None):
    '''
    Read CSV data into a list of class instances

    Args:
        filename: Path to the CSV file
        cls: Class to create instances from
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of class instances with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_instances(file, cls, headers)

Let's understand the key changes we've made to these functions:

  1. We've added a headers parameter to all functions, and we've set its default value to None. This means that if the caller doesn't provide any headers, the functions will use the default behavior.

  2. In the csv_as_dicts function, we use the first row as headers only if the headers parameter is None. This allows us to handle files with headers automatically.

  3. In the csv_as_instances function, we skip the first row only if the headers parameter is None. This is because if we're providing our own headers, the first row of the file is actual data, not headers.

  4. Let's test these modifications with a CSV file that has no header row. Create a file called test_headers.py:

## test_headers.py

import reader
import stock

## Define column names for the file without headers
column_names = ['name', 'shares', 'price']

## Test reading a file without headers
portfolio = reader.read_csv_as_dicts('portfolio_noheader.csv',
                                     [str, int, float],
                                     headers=column_names)
print("First item from file without headers:", portfolio[0])
print("Total items:", len(portfolio))

## Test reading the same file as instances
portfolio = reader.read_csv_as_instances('portfolio_noheader.csv',
                                        stock.Stock,
                                        headers=column_names)
print("\nFirst item as Stock instance:", portfolio[0])
print("Total items:", len(portfolio))

## Verify that original functionality still works
portfolio = reader.read_csv_as_dicts('portfolio.csv', [str, int, float])
print("\nFirst item from file with headers:", portfolio[0])

In this test script, we first define the column names for the file without headers. Then we test reading the file without headers as a list of dictionaries and as a list of class instances. Finally, we verify that the original functionality still works by reading a file with headers.
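The lab environment normally provides portfolio_noheader.csv. If you need to create it yourself, the sketch below writes one using illustrative sample rows (the same assumed data as portfolio.csv, minus the header line):

```python
# Create portfolio_noheader.csv from illustrative sample rows; these values
# are assumptions for this sketch, matching the outputs shown in this lab.
rows = [
    'AA,100,32.2',
    'IBM,50,91.1',
    'CAT,150,83.44',
    'MSFT,200,51.23',
    'GE,95,40.37',
    'MSFT,50,65.1',
    'IBM,100,70.44',
]

with open('portfolio_noheader.csv', 'w') as f:
    f.write('\n'.join(rows) + '\n')
```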

  5. Run the test script from the terminal:
python test_headers.py

The output should look similar to:

First item from file without headers: {'name': 'AA', 'shares': 100, 'price': 32.2}
Total items: 7

First item as Stock instance: Stock('AA', 100, 32.2)
Total items: 7

First item from file with headers: {'name': 'AA', 'shares': 100, 'price': 32.2}

This output confirms that our functions can now handle CSV files both with and without headers. The user can provide column names when needed, or rely on the default behavior of reading headers from the first row.

By making this modification, our CSV reader functions are now more versatile and can handle a wider range of file formats. This is an important step in making our code more robust and useful in different scenarios.

Adding Type Hints

Python 3.5 and later versions support type hints. Type hints indicate the expected data types of variables, function parameters, and return values. They don't change how the code runs, but they make the code more readable and can help catch certain kinds of errors before the code is executed. Now, let's add type hints to our CSV reader functions.
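If type hints are new to you, here is a minimal example unrelated to the reader code; note that the annotations are documentation for readers and tools, not runtime checks:

```python
from typing import List

def average_price(prices: List[float]) -> float:
    # The annotations state intent; Python does not enforce them at runtime
    return sum(prices) / len(prices)

print(average_price([10.0, 20.0]))  # 15.0
```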

  1. Open the reader.py file and update it to include type hints:
## reader.py

import csv
from typing import List, Callable, Dict, Any, Type, Optional, TextIO, Iterator, TypeVar

## Define a generic type for the class parameter
T = TypeVar('T')

def csv_as_dicts(lines: Iterator[str],
                types: List[Callable[[str], Any]],
                headers: Optional[List[str]] = None) -> List[Dict[str, Any]]:
    '''
    Parse CSV data from an iterable into a list of dictionaries

    Args:
        lines: An iterable producing CSV lines
        types: List of type conversion functions for each column
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of dictionaries with data from the CSV lines
    '''
    records: List[Dict[str, Any]] = []
    rows = csv.reader(lines)

    if headers is None:
        ## Use the first row as headers if none provided
        headers = next(rows)

    for row in rows:
        record = { name: func(val)
                  for name, func, val in zip(headers, types, row) }
        records.append(record)
    return records

def csv_as_instances(lines: Iterator[str],
                    cls: Type[T],
                    headers: Optional[List[str]] = None) -> List[T]:
    '''
    Parse CSV data from an iterable into a list of class instances

    Args:
        lines: An iterable producing CSV lines
        cls: Class to create instances from
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of class instances with data from the CSV lines
    '''
    records: List[T] = []
    rows = csv.reader(lines)

    if headers is None:
        ## Skip the first row if no headers provided
        next(rows)

    for row in rows:
        record = cls.from_row(row)
        records.append(record)
    return records

def read_csv_as_dicts(filename: str,
                     types: List[Callable[[str], Any]],
                     headers: Optional[List[str]] = None) -> List[Dict[str, Any]]:
    '''
    Read CSV data into a list of dictionaries with optional type conversion

    Args:
        filename: Path to the CSV file
        types: List of type conversion functions for each column
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of dictionaries with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_dicts(file, types, headers)

def read_csv_as_instances(filename: str,
                         cls: Type[T],
                         headers: Optional[List[str]] = None) -> List[T]:
    '''
    Read CSV data into a list of class instances

    Args:
        filename: Path to the CSV file
        cls: Class to create instances from
        headers: Optional list of column names. If None, first row is used as headers

    Returns:
        List of class instances with data from the CSV file
    '''
    with open(filename) as file:
        return csv_as_instances(file, cls, headers)

Let's understand the key changes we've made in the code:

  1. We imported several names from the typing module. This module provides a set of types that we can use to define type hints, such as List, Dict, and Optional.

  2. We added a generic type variable T to represent the class type. A generic type variable allows us to write functions that can work with different types in a type-safe way.

  3. We added type hints to all function parameters and return values. This makes it clear what types of arguments a function expects and what type of value it returns.

  4. We used appropriate container types like List, Dict, and Optional. List represents a list, Dict represents a dictionary, and Optional indicates that a parameter can either have a certain type or be None.

  5. We used Callable for the type conversion functions. Callable is used to indicate that a parameter is a function that can be called.

  6. We used the generic T to express that csv_as_instances returns a list of instances of the class passed in. This helps the IDE and other tools understand the type of the returned objects.

  7. Now, let's create a simple test file to ensure everything still works properly:

## test_types.py

import reader
import stock

## The functions should work exactly as before
portfolio = reader.read_csv_as_dicts('portfolio.csv', [str, int, float])
print("First item:", portfolio[0])

## But now we have better type checking and IDE support
stock_portfolio = reader.read_csv_as_instances('portfolio.csv', stock.Stock)
print("\nFirst stock:", stock_portfolio[0])

## We can see that stock_portfolio is a list of Stock objects
## This helps IDEs provide better code completion
first_stock = stock_portfolio[0]
print(f"\nName: {first_stock.name}")
print(f"Shares: {first_stock.shares}")
print(f"Price: {first_stock.price}")
print(f"Value: {first_stock.shares * first_stock.price}")
  8. Run the test script from the terminal:
python test_types.py

The output should look similar to:

First item: {'name': 'AA', 'shares': 100, 'price': 32.2}

First stock: Stock('AA', 100, 32.2)

Name: AA
Shares: 100
Price: 32.2
Value: 3220.0

Type hints don't change how the code runs, but they provide several benefits:

  1. They offer better IDE support with code completion. When you use an IDE like PyCharm or VS Code, it can use the type hints to suggest the correct methods and attributes for your variables.
  2. They provide clearer documentation about expected parameter and return types. Just by looking at the function definition, you can tell what types of arguments it expects and what type of value it returns.
  3. They allow you to use static type checkers like mypy to catch errors early. Static type checkers analyze your code without running it and can find type-related errors before you run the code.
  4. They improve code readability and maintainability. When you or other developers come back to the code later, it's easier to understand what the code is doing.

In a large codebase, these benefits can significantly reduce bugs and make the code easier to understand and maintain.
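To make the benefit of the generic T more concrete, here is a small standalone sketch; a static type checker infers the return type of first from its argument, just as it infers a list of Stock objects from csv_as_instances(lines, stock.Stock):

```python
from typing import List, TypeVar

T = TypeVar('T')

def first(items: List[T]) -> T:
    # A checker like mypy infers first([1, 2]) -> int and first(['a']) -> str
    return items[0]

print(first([1, 2, 3]))   # 1
print(first(['a', 'b']))  # a
```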

Note: Type hints are optional in Python, but they're increasingly used in professional code. The Python standard library and many popular third-party packages now include extensive type hints.

Summary

In this lab, you have learned several key aspects of function design in Python. First, you practiced basic function design by writing functions that process CSV data into various data structures. You also explored function flexibility by refactoring functions to work with any iterable source, enhancing code versatility and reusability.

Moreover, you mastered adding optional parameters to handle different use cases, such as CSV files with or without headers, and using Python's type hinting system to improve code readability and maintainability. These skills are crucial for writing robust Python code, and as your programs become more complex, these design principles will keep your code organized and understandable. The techniques apply beyond CSV processing, making them a valuable addition to your Python programming toolkit.