File and Folder Manipulation


Introduction

This lab is designed to help you understand how to manipulate files and folders using Python. We will be using the os and glob modules, which provide a way to interact with the file system and perform common operations such as creating, deleting, and moving files and directories.

Achievements

  • os module
  • glob module

Creating Folders and Files

Here's some Python code that creates multiple groups of sample folders and files:

import os

## Create multiple groups of sample folders and files
for i in range(3):
    ## Create a new folder for each group
    folder_name = "group_" + str(i)
    os.makedirs(folder_name, exist_ok=True)  ## exist_ok=True lets the script be run more than once

    ## Create sample files within each folder
    for j in range(3):
        file_name = "file_" + str(j) + ".txt"
        file_path = os.path.join(folder_name, file_name)
        with open(file_path, "w") as file:
            file.write("This is a sample file.")

You can open create_samples.py in the editor and run the code to see the results.

os.makedirs(path) is a function from the os module that creates a directory at the specified path, along with any missing parent directories. Here the path is folder_name, built by concatenating the string "group_" with the current value of i from the outer for loop, which gives "group_0", "group_1", and "group_2". By default the function raises FileExistsError if the directory already exists; passing exist_ok=True suppresses that error.
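To make the behavior of os.makedirs() concrete, here is a minimal sketch. It works inside a temporary directory, and the nested folder names ("reports", "2024") are made up for illustration; they are not part of the lab files.

```python
import os
import tempfile

# Work inside a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    # Hypothetical nested path; "reports" and "2024" are illustrative names.
    nested = os.path.join(base, "group_0", "reports", "2024")

    # os.makedirs creates "group_0" and "reports" on the way to "2024".
    os.makedirs(nested)
    created = os.path.isdir(nested)

    # A second call on an existing path raises FileExistsError by default...
    try:
        os.makedirs(nested)
        raised = False
    except FileExistsError:
        raised = True

    # ...unless exist_ok=True is passed, which turns the call into a no-op.
    os.makedirs(nested, exist_ok=True)

print(created, raised)  # True True
```

Passing exist_ok=True is usually the simpler choice for scripts that may be run several times against the same folder.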

os.path.join(path1, path2, ...) is a function from the os.path module that joins one or more path components. Here path1 is folder_name and path2 is file_name, built by concatenating "file_" with the current value of j from the inner for loop and the ".txt" suffix, which gives "file_0.txt", "file_1.txt", and "file_2.txt". The joined result, such as "group_0/file_0.txt" on Linux, is the path where the file is created inside its folder.

By using the os.path.join function, the code ensures that the correct separator for the current operating system is used to join the folder and file names, regardless of whether the code is run on Windows, Linux, or macOS.
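The separator handling can be checked directly. The sketch below reuses the folder and file names from the script above and compares the joined path against os.sep, the separator of the platform running the code.

```python
import os

folder_name = "group_0"
file_name = "file_0.txt"

# Join the two components with the separator of the current platform.
file_path = os.path.join(folder_name, file_name)

# os.sep is "/" on Linux and macOS and "\\" on Windows.
expected = folder_name + os.sep + file_name
print(file_path == expected)  # True

# os.path.split reverses the join into (directory, file name).
print(os.path.split(file_path))  # ('group_0', 'file_0.txt')
```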

More on the os Module

The os module in Python provides a way to interact with the operating system, allowing you to perform various tasks such as creating and deleting directories, reading and writing files, and executing commands.

Open up a new Python interpreter.

python3

Here are a few examples of other useful functions provided by the os module:

  • os.listdir(path): Returns a list of all files and directories in the specified directory.

    os.listdir('.') ## returns a list of all files and directories in the current directory
  • os.remove(path): Deletes the file at the specified path.

    os.remove('file.txt') ## deletes the file named 'file.txt'
  • os.rmdir(path): Deletes the empty directory at the specified path.

    os.rmdir('folder') ## deletes the empty folder named 'folder'
  • os.rename(src, dst): Renames a file or directory from the src path to the dst path.

    os.rename('file1.txt', 'file2.txt') ## renames the file 'file1.txt' to 'file2.txt'
  • os.chdir(path): Changes the current working directory to the specified path.

    os.chdir('/home/user/documents') ## changes the current working directory to '/home/user/documents'
  • os.getcwd(): Returns the current working directory.

    os.getcwd() ## returns the current working directory, e.g. '/home/user/documents'

Please note that most of the above functions raise an exception (FileNotFoundError, PermissionError, or another OSError subclass) if the specified file or directory does not exist or you don't have the necessary permissions. os.rmdir() also raises OSError if the directory is not empty.
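One way to handle these errors is to catch them explicitly. The following is a minimal sketch, not the only approach: remove_if_present is a hypothetical helper written for this example, and everything runs inside a temporary directory.

```python
import os
import tempfile


def remove_if_present(path):
    """Delete path and report whether a file was actually removed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "file.txt")
    with open(target, "w") as handle:
        handle.write("This is a sample file.")

    first = remove_if_present(target)   # the file exists, so it is deleted
    second = remove_if_present(target)  # already gone, the error is caught

    # os.rmdir refuses to delete a directory that still has entries.
    folder = os.path.join(base, "folder")
    os.makedirs(folder)
    with open(os.path.join(folder, "keep.txt"), "w") as handle:
        handle.write("still here")
    try:
        os.rmdir(folder)
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

print(first, second, rmdir_failed)  # True False True
```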

Walking a Directory Tree

os.walk(top, topdown=True, onerror=None, followlinks=False) is a function from the os module that walks a directory tree, either top-down (the default) or bottom-up. For each directory in the tree rooted at top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames): dirpath is the path to the directory, dirnames lists the names of its subdirectories, and filenames lists the names of its non-directory entries.

Here's an example of how you can use os.walk() to print all the files in a directory and its subdirectories:

import os

## Print all files in a directory and its subdirectories
for root, dirs, files in os.walk('.'):
    for file in files:
        print(os.path.join(root, file))

This code starts at the current directory (indicated by '.') and recursively walks through all the subdirectories, printing the path of each file it encounters. Because the walk starts at '.', the printed paths are relative, for example ./group_0/file_0.txt on Linux.

Here's another example that uses os.walk() to search a directory tree for files with a specific extension:

import os

def search_file(directory, file_extension):
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith(file_extension):
                print(os.path.join(root, file))

search_file('.','.txt')

This finds every file whose name ends with '.txt' under the current directory and prints its path.

os.walk() is a powerful function that can be used for many tasks such as searching for files, analyzing directory structures, and more.

It's worth noting that os.walk is a generator, which means it generates the values on the fly, rather than keeping them all in memory. This makes it efficient for handling large directory trees.
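The lazy behavior can be seen by pulling a single result with next(). The sketch below also shows a documented feature of top-down walks: editing the dirnames list in place prunes the walk. The folder name "skip_me" and the file "hidden.txt" are made up for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build a tiny tree: group_0/file_0.txt and group_0/skip_me/hidden.txt
    os.makedirs(os.path.join(base, "group_0", "skip_me"))
    for parts in (("group_0", "file_0.txt"), ("group_0", "skip_me", "hidden.txt")):
        with open(os.path.join(base, *parts), "w") as handle:
            handle.write("sample")

    # next() pulls only the first tuple, which describes the top directory.
    root, dirs, files = next(os.walk(base))
    first_dirs = list(dirs)

    found = []
    for root, dirs, files in os.walk(base):
        # In top-down mode, editing dirs in place stops os.walk from
        # descending into the removed directories.
        dirs[:] = [d for d in dirs if d != "skip_me"]
        found.extend(files)

print(first_dirs)  # ['group_0']
print(found)       # ['file_0.txt']
```

Note the slice assignment dirs[:] = ...; rebinding the name with dirs = ... would not affect the walk.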

The glob Module

The glob module in Python provides a way to search for files and directories using wildcard characters. glob.glob(pathname) returns a list of file paths that match the specified pathname pattern.

Here's an example of how you can use glob.glob() to find all files with the ".txt" extension in the current directory:

import glob

txt_files = glob.glob('*.txt')
print(txt_files)

This code will search for all files with the ".txt" extension in the current directory and return a list of file paths that match the pattern.

Unlike os.walk(), this pattern does not search subdirectories. It only matches files in the current directory.

Here's an example of how you can use glob.glob() to find all files with the ".txt" extension in the current directory and all of its subdirectories:

import glob

txt_files = glob.glob('**/*.txt', recursive=True)
print(txt_files)

This code will search for all files with the ".txt" extension in the current directory and all subdirectories.

In general, glob.glob() is more convenient when you only need the paths that match a pattern. os.walk() is more flexible: it visits the tree directory by directory, so you can filter with arbitrary logic, skip whole subtrees, or analyze the structure of the tree.

Unlike os.walk(), glob.glob() is not a generator: it builds and returns the complete list of matches. When the number of matches is large, glob.iglob() accepts the same pattern and returns an iterator that yields the matches one at a time instead of keeping them all in memory.
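For reference, glob.glob() returns a list, while glob.iglob() returns an iterator over the same matches. The sketch below compares the two on a small temporary tree; the file names "top.txt" and "notes.md" are made up for this example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build a tiny tree with two .txt files and one file that should not match.
    os.makedirs(os.path.join(base, "group_0"))
    for parts in (("top.txt",), ("group_0", "file_0.txt"), ("group_0", "notes.md")):
        with open(os.path.join(base, *parts), "w") as handle:
            handle.write("sample")

    pattern = os.path.join(base, "**", "*.txt")

    # glob.glob builds the complete list before returning it.
    as_list = glob.glob(pattern, recursive=True)

    # glob.iglob returns an iterator that yields the same matches lazily.
    lazy = glob.iglob(pattern, recursive=True)
    as_iter = list(lazy)

    names = sorted(os.path.basename(p) for p in as_list)

print(isinstance(as_list, list))  # True
print(names)                      # ['file_0.txt', 'top.txt']
```

With recursive=True, the "**" component matches zero or more directories, which is why top.txt in the base directory is included.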

Summary

In this lab, we learned how to use Python to manipulate files and folders with the os and glob modules. We covered how to create folders and write files into them, list, rename, and delete files and directories, change the current working directory, walk a directory tree with os.walk(), and find files that match a pattern with glob.
