Review Basic File I/O

This tutorial is from the open-source community.

Introduction

In this lab, you will review basic file input and output operations in Python. You'll create a Python program that reads stock portfolio data from a file and calculates the portfolio's total cost.

The objectives of this lab include learning how to open and read files in Python, processing data from files line by line, performing calculations on the data, and outputting the results. The file you will create is pcost.py.


Skills Graph

This lab draws on the following Python skills: Conditional Statements, For Loops, Function Definition, Catching Exceptions, Opening and Closing Files, Reading and Writing Files, and Data Collections.

Understanding the Problem

In this step, we'll define the problem to solve and look at the data we'll be working with. This is an important first step in any programming task because it clarifies the goal and the resources available.

In your project directory, there's a file named portfolio.dat. This file stores information about a portfolio of stocks. A portfolio is a collection of stocks that an investor owns. Each line in this file represents a single stock purchase, in the following format:

[Stock Symbol] [Number of Shares] [Price per Share]

The stock symbol is a short code that represents a particular company's stock. The number of shares tells us how many units of that stock were bought, and the price per share is the cost of one unit of that stock.

Let's take a look at an example. Consider the first line of the file:

AA 100 32.20

This line indicates that 100 shares of the stock with the symbol "AA" were purchased. Each share cost $32.20.
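As a quick illustration of this format, the sample line can be taken apart by hand. This is only a minimal sketch: the variable names are chosen here for illustration and are not part of the lab's code.

```python
# Parse the sample record by hand (variable names are illustrative).
line = "AA 100 32.20"

# split() with no arguments breaks the line on runs of whitespace.
symbol, shares_text, price_text = line.split()
shares = int(shares_text)      # number of shares as an integer
price = float(price_text)      # price per share as a float

cost = shares * price          # cost of this single purchase
print(f"{symbol}: {shares} shares at ${price:.2f} = ${cost:.2f}")
# prints: AA: 100 shares at $32.20 = $3220.00
```

The cost is formatted with :.2f because floating-point multiplication can leave tiny rounding errors in the last digits.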

If you want to see what's inside the portfolio.dat file, you can run the following command in the terminal. The cat command is a useful tool in the terminal that allows you to view the contents of a file.

cat ~/project/portfolio.dat

Now, your task is to create a Python program named pcost.py. This program will perform three main tasks:

  1. First, it needs to open and read the portfolio.dat file. Opening a file in Python allows our program to access the data stored inside it.
  2. Then, it has to calculate the total cost of all the stock purchases in the portfolio. To do this, for each line in the file, we need to multiply the number of shares by the price per share. After getting these values for each line, we sum them all up. This gives us the total amount of money spent on all the stocks in the portfolio.
  3. Finally, the program should output the total cost. This way, we can see the result of our calculations.

Let's begin by opening the pcost.py file in the editor. It was already created for you during the setup step, and it is where you will write the Python code to solve the problem we just discussed.

Opening and Reading the File

In this step, we're going to learn how to open and read a file in Python. File input/output (I/O) is a fundamental concept in programming. It allows your program to interact with external files, like text files, CSV files, and more. In Python, one of the most common ways to work with files is by using the open() function.

The open() function opens a file in Python. Its two most important arguments are the name of the file to open and the mode in which to open it. To read a file, use the mode 'r', which is also the default. This tells Python that you only want to read the contents of the file and not make any changes to it.
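To make the mode argument concrete, here is a small self-contained sketch. It assumes nothing about the lab environment: it writes a throwaway file (the name example.txt and its contents are made up for illustration) and then reads it back twice, once with an explicit 'r' and once with no mode at all.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# The file name and contents are made up for illustration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")
with open(path, "w") as f:            # 'w' opens the file for writing
    f.write("first line\nsecond line\n")

# Open for reading with an explicit mode...
with open(path, "r") as f:
    explicit = f.read()

# ...and again without a mode argument, which also opens for reading.
with open(path) as f:
    default = f.read()

print(explicit == default)            # prints: True
```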

Now, let's add some code to the pcost.py file to open and read the portfolio.dat file. Open the pcost.py file in your code editor and add the following code:

# pcost.py
# Calculate the total cost of a portfolio of stocks

def portfolio_cost(filename):
    """
    Computes the total cost (shares*price) of a portfolio file
    """
    total_cost = 0.0

    # Open the file
    with open(filename, 'r') as file:
        # Read all lines in the file
        for line in file:
            print(line)  # Just for debugging, to see what we're reading

    # Return the total cost
    return total_cost

# Call the function with the portfolio file
total_cost = portfolio_cost('portfolio.dat')
print(f'Total cost: ${total_cost}')

Let's break down what this code does:

  1. First, we define a function named portfolio_cost(). This function takes a filename as an input parameter. The purpose of this function is to calculate the total cost of a portfolio of stocks based on the data in the file.
  2. Inside the function, we use the open() function to open the specified file in read mode. The with statement is used here to ensure that the file is properly closed after we're done reading it. This is a good practice to avoid resource leaks.
  3. We then use a for loop to read the file line by line. For each line in the file, we print it. This is just for debugging purposes, so we can see what data we're reading from the file.
  4. After reading the file, the function returns the total cost. Currently, the total cost is set to 0.0 because we haven't implemented the actual calculation yet.
  5. Outside the function, we call the portfolio_cost() function with the filename 'portfolio.dat'. This means we're asking the function to calculate the total cost based on the data in the portfolio.dat file.
  6. Finally, we print the total cost using an f-string.
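The claim that the with statement closes the file can be checked directly through the file object's closed attribute. The following sketch is an illustration only; it creates its own temporary file (demo.dat is a made-up name) rather than relying on portfolio.dat.

```python
import os
import tempfile

# Write a one-line throwaway file (the name demo.dat is illustrative).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.dat")
with open(path, "w") as f:
    f.write("AA 100 32.20\n")

with open(path, "r") as file:
    inside = file.closed              # False: the file is open in the block
    lines = [line for line in file]   # iterate over the file line by line

outside = file.closed                 # True: closed when the block exited

print(inside, outside)                # prints: False True
print(lines)                          # prints: ['AA 100 32.20\n']
```

The file is closed even if an exception is raised inside the block, which is the main reason to prefer with over calling close() by hand.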

Now, let's run this code to see what it does. Because the script opens 'portfolio.dat' using a relative path, run the following command from the ~/project directory:

python3 ~/project/pcost.py

When you run this command, you should see each line of the portfolio.dat file printed on the terminal, with a blank line after each one, followed by the line Total cost: $0.0. The total is 0.0 because the calculation is not implemented yet. This output helps you verify that the file is being read correctly.
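The debug output is double-spaced because each line read from a file keeps its trailing newline, and print() adds one more. The sketch below shows this with io.StringIO standing in for an open file; the two records simply repeat the sample line from earlier and are not the real file contents.

```python
import io

# io.StringIO behaves like an open text file; the data is illustrative.
text = "AA 100 32.20\nAA 100 32.20\n"
lines = list(io.StringIO(text))

for line in lines:
    print(repr(line))        # repr() makes the trailing '\n' visible
    print(line, end="")      # end="" stops print() adding a second newline
```

Passing end="" is one option; stripping the line first, as a later step does, is another.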

Processing the Data

Now that we've learned how to read a file, the next step is to process each line of the file to calculate the cost of each stock purchase. This is an important part of working with data in Python, as it allows us to extract meaningful information from the file.

Each line in the file follows a specific format: [Stock Symbol] [Number of Shares] [Price per Share]. To calculate the cost of each stock purchase, we need to extract the number of shares and the price per share from each line. Then, we multiply these two values together to get the cost of that particular stock purchase. Finally, we add this cost to our running total to find the overall cost of the portfolio.

Let's modify the portfolio_cost() function in the pcost.py file to achieve this. Here's the modified code:

def portfolio_cost(filename):
    """
    Computes the total cost (shares*price) of a portfolio file
    """
    total_cost = 0.0

    # Open the file
    with open(filename, 'r') as file:
        # Read all lines in the file
        for line in file:
            # Strip any leading/trailing whitespace
            line = line.strip()

            # Skip empty lines
            if not line:
                continue

            # Split the line into fields
            fields = line.split()

            # Extract the relevant data
            # fields[0] is the stock symbol (which we don't need for the calculation)
            shares = int(fields[1])  # Number of shares (second field)
            price = float(fields[2])  # Price per share (third field)

            # Calculate the cost of this stock purchase
            cost = shares * price

            # Add to the total cost
            total_cost += cost

            # Print some debug information
            print(f'{fields[0]}: {shares} shares at ${price:.2f} = ${cost:.2f}')

    # Return the total cost
    return total_cost

Let's break down what this modified function does step by step:

  1. Strips whitespace: We use the strip() method to remove any leading or trailing whitespace, including the newline at the end of each line. A line that contained only whitespace becomes an empty string, which makes the next check simple.
  2. Skips empty lines: If the stripped line is empty, we use the continue statement to skip it. Without this check, split() would return an empty list for a blank line, and accessing fields[1] would raise an IndexError.
  3. Splits the line into fields: We use the split() method to split each line into a list of fields based on whitespace. This allows us to access each part of the line separately.
  4. Extracts relevant data: We extract the number of shares and the price per share from the list of fields. The number of shares is the second field, and the price per share is the third field. We convert these values to the appropriate data types (int for shares and float for price) so that we can perform arithmetic operations on them.
  5. Calculates the cost: We multiply the number of shares by the price per share to calculate the cost of this stock purchase.
  6. Adds to the total: We add the cost of this stock purchase to the running total cost.
  7. Prints debug information: We print some information about each stock purchase to help us see what's happening. This includes the stock symbol, the number of shares, the price per share, and the total cost of the purchase.
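The parsing steps can be tried in isolation on a few strings. In the sketch below, parse_line() is a hypothetical helper written for this illustration (the lab's code does the same work inline), and the XYZ record is made-up data.

```python
def parse_line(line):
    """Return (symbol, shares, price), or None for a blank line.

    Hypothetical helper for illustration; not part of pcost.py.
    """
    line = line.strip()                # drop the newline and outer spaces
    if not line:                       # an empty string is falsy
        return None
    fields = line.split()              # split on runs of whitespace
    return fields[0], int(fields[1]), float(fields[2])

print(parse_line("AA 100 32.20\n"))            # ('AA', 100, 32.2)
print(parse_line("   \n"))                     # None
print(parse_line("  XYZ   50   10.50  \n"))    # ('XYZ', 50, 10.5)

# A whitespace-only line splits into an empty list, so indexing it
# with fields[1] would fail if the blank-line check were missing.
print("   \n".split())                         # []
```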

Now, let's run the code to see if it works. Open your terminal and run the following command:

python3 ~/project/pcost.py

After running the command, you should see detailed information about each stock purchase, followed by the total cost of the portfolio. This output will help you verify that the function is working correctly and that you've calculated the total cost accurately.

Finalizing the Program

Now, we're going to clean up our code and create the final version of the pcost.py program. Cleaning up the code means removing any unnecessary parts and making sure the output looks good. This is an important step in programming because it makes our code more professional and easier to understand.

We'll start by removing the debug print statements. These statements are used during development to check the values of variables and the flow of the program, but they're not needed in the final version. Then, we'll ensure that the final output is formatted nicely.

Here's the final version of the pcost.py code:

# pcost.py
# Calculate the total cost of a portfolio of stocks

def portfolio_cost(filename):
    """
    Computes the total cost (shares*price) of a portfolio file
    """
    total_cost = 0.0

    try:
        # Open the file
        with open(filename, 'r') as file:
            # Read all lines in the file
            for line in file:
                # Strip any leading/trailing whitespace
                line = line.strip()

                # Skip empty lines
                if not line:
                    continue

                # Split the line into fields
                fields = line.split()

                # Extract the relevant data
                # fields[0] is the stock symbol (which we don't need for the calculation)
                shares = int(fields[1])  # Number of shares (second field)
                price = float(fields[2])  # Price per share (third field)

                # Calculate the cost of this stock purchase and add to the total
                total_cost += shares * price

    except FileNotFoundError:
        print(f"Error: Could not find file '{filename}'")
        return 0.0
    except Exception as e:
        print(f"Error processing file: {e}")
        return 0.0

    # Return the total cost
    return total_cost

# Main block to run when the script is executed directly
if __name__ == '__main__':
    # Call the function with the portfolio file
    total_cost = portfolio_cost('portfolio.dat')
    print(f'Total cost: ${total_cost:.2f}')

This final version of the code has several improvements:

  1. Error handling: We've added code to catch two kinds of errors. A FileNotFoundError is raised when the specified file doesn't exist; in that case the program prints an error message and returns 0.0. The Exception block catches any other error that occurs while processing the file, such as a ValueError from a field that is not a valid number, and also returns 0.0. This keeps the program from crashing with a traceback, but note that a returned 0.0 then means the total could not be computed.
  2. Proper formatting: The total cost is formatted to two decimal places using the :.2f format specifier in the f-string. This makes the output look more professional and easier to read.
  3. __name__ == '__main__' check: This is a common Python idiom. It ensures that the code inside the if block only runs when the script is executed directly. If the script is imported as a module into another script, this code won't run. This gives us more control over how our script behaves.
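The error handling above gives up on the whole file at the first bad line. One possible variation, which is not part of the lab's solution, catches errors per line and skips only the records that cannot be parsed. In this sketch the function name portfolio_cost_lenient is hypothetical, the function takes any iterable of lines so it can run without a file, and the malformed XYZ records are made up for illustration.

```python
def portfolio_cost_lenient(lines):
    """Sum shares * price over lines, skipping lines that cannot be parsed.

    Hypothetical variation for illustration; not the lab's portfolio_cost().
    """
    total_cost = 0.0
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:                 # blank line: nothing to do
            continue
        try:
            shares = int(fields[1])    # IndexError if the field is missing
            price = float(fields[2])   # ValueError if it is not a number
        except (IndexError, ValueError) as e:
            print(f"Line {lineno}: skipping bad record ({e})")
            continue
        total_cost += shares * price
    return total_cost

# Illustrative data: one good record, one blank line, two bad records.
sample = ["AA 100 32.20\n", "\n", "XYZ ten 5.00\n", "XYZ 10\n"]
total = portfolio_cost_lenient(sample)
print(f"Total cost: ${total:.2f}")     # prints: Total cost: $3220.00
```

Because an open file is itself an iterable of lines, the same function could be called with a file object inside a with block. Whether skipping bad records is preferable to stopping depends on how much the caller needs to trust the total.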

Now, let's run the final code. Open your terminal and enter the following command:

python3 ~/project/pcost.py

When you run this command, the program will read the portfolio.dat file, calculate the total cost of the portfolio, and print the result. You should see the total cost of the portfolio, which should be $44671.15.

Congratulations! You've successfully created a Python program that reads data from a file, processes it, and calculates a result. This is a great achievement, and it shows that you're on your way to becoming a proficient Python programmer.

Summary

In this lab, you have learned how to perform basic file I/O operations in Python. You can open and read files using the open() function and a context manager, process data line by line, parse text data, perform calculations, handle errors, and structure a complete Python program with functions and a main block.

These skills are fundamental for many Python programs and are useful in various applications, such as data analysis and configuration management. You can further enhance the program by adding command-line arguments, handling different file formats, improving error checking, and creating more detailed reports.
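As a starting point for the command-line enhancement, the file name could be taken from sys.argv with a fallback to portfolio.dat. This is a rough sketch under that assumption; get_filename() is a hypothetical helper and is not part of the lab's code.

```python
import sys

def get_filename(argv, default="portfolio.dat"):
    """Return the file name given on the command line, or the default.

    Hypothetical helper for illustration; argv[0] is the script name.
    """
    if len(argv) > 1:
        return argv[1]
    return default

if __name__ == "__main__":
    filename = get_filename(sys.argv)
    print(f"Would read: {filename}")
```

In pcost.py, the returned name would be passed to portfolio_cost() in place of the hard-coded 'portfolio.dat'.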