Count Lines of Code with CLOC


Introduction

CLOC (Count Lines of Code) is a command-line tool that scans the files in a directory and reports, for each programming language it recognizes, the number of code lines, comment lines, and blank lines. Developers use it to understand the composition of their projects, track productivity, or estimate project complexity.


Understanding CLOC and Its Basic Usage

In this step, you will run CLOC with its basic syntax to analyze a project.

Basic CLOC Command

The basic syntax for using CLOC is:

cloc [options] <file/directory>

Let's use CLOC to analyze the Flask project that was cloned during setup:

  1. Open the terminal by clicking on the terminal icon in the taskbar
  2. Navigate to the Flask project directory:
cd ~/project/flask
  3. Run the basic CLOC command to analyze the entire project:
cloc .

The . represents the current directory, so this command tells CLOC to analyze all files in the current directory and its subdirectories.
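If the shell reports that cloc cannot be found, the tool is probably not installed. The following is a minimal sketch of a check you could run first. The variable name cloc_status is only illustrative, and the hint in the message assumes a Debian or Ubuntu system.

```shell
# Report whether cloc is available before trying to use it.
if command -v cloc >/dev/null 2>&1; then
    cloc_status="installed"
    cloc --version        # prints the installed version number
else
    cloc_status="missing"
    echo "cloc not found; on Debian/Ubuntu it is packaged as 'cloc'" >&2
fi
echo "cloc is $cloc_status"
```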

Understanding the Output

After running the command, CLOC will display a table with the following information:

  • Language: Programming languages detected in the project
  • Files: Number of files for each language
  • Blank: Number of blank lines
  • Comment: Number of comment lines
  • Code: Number of code lines

The output should look something like this:

      56 text files.
      56 unique files.
      16 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.11 s (428.1 files/s, 72093.6 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Python                           41           3061           2088           7012
Markdown                          5            175              0            314
YAML                              2             10              3             84
make                              1             21             30             46
TOML                              1              6              0             19
--------------------------------------------------------------------------------
SUM:                             50           3273           2121           7475
--------------------------------------------------------------------------------

This output gives you a comprehensive overview of the Flask project's codebase composition.
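One way to put these columns to work is to derive a comment density per language, that is, comment lines as a percentage of comment plus code lines. The sketch below is not a CLOC feature. It feeds the sample rows from the table above to awk, and it assumes single-word language names (a name such as "Bourne Shell" would be printed by its first word only).

```shell
# Sample rows copied from the table above; in practice, pipe `cloc .` into awk instead.
cloc_table='Language                      files          blank        comment           code
Python                           41           3061           2088           7012
Markdown                          5            175              0            314
YAML                              2             10              3             84
make                              1             21             30             46
TOML                              1              6              0             19
SUM:                             50           3273           2121           7475'

# Keep only rows whose files and code fields are numeric, then print
# comment lines as a percentage of comment + code lines.
density=$(printf '%s\n' "$cloc_table" | awk '
    NF >= 5 && $(NF-3) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
        comment = $(NF-1); code = $NF
        if (comment + code > 0)
            printf "%s %.1f%%\n", $1, 100 * comment / (comment + code)
    }')
printf '%s\n' "$density"
```

With the sample figures, the Python row works out to 22.9% and the SUM row to 22.1%.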

Analyzing Specific File Types with CLOC

CLOC allows you to analyze specific file types or exclude certain files from your analysis. This is particularly useful when working with large projects containing multiple programming languages.

Analyzing Specific File Extensions

To analyze only files with specific extensions, you can use the --include-ext option followed by a comma-separated list of file extensions:

  1. Navigate to the Flask project directory if you're not already there:
cd ~/project/flask
  2. Run CLOC to analyze only Python files in the project:
cloc --include-ext=py .

The output will now only show information about Python files in the project:

      41 text files.
      41 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.05 s (886.0 files/s, 264066.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          41           3061           2088           7012
-------------------------------------------------------------------------------
SUM:                            41           3061           2088           7012
-------------------------------------------------------------------------------
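To sanity-check the files column, you can count the matching files yourself with find. The sketch below builds a small throwaway directory with hypothetical file names so that it runs anywhere. In the lab you would point find at the Flask directory instead. Because CLOC skips duplicate files, its count can be lower than what find reports.

```shell
# Build a throwaway tree so the example runs anywhere (hypothetical file names).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src" "$demo_dir/docs"
touch "$demo_dir/src/app.py" "$demo_dir/src/cli.py" "$demo_dir/docs/index.md"

# Count regular files ending in .py, recursively.
py_count=$(find "$demo_dir" -type f -name '*.py' | wc -l | tr -d ' ')
echo "Python files: $py_count"

# Clean up the temporary tree.
rm -rf "$demo_dir"
```

For the Flask project, the equivalent check would be find . -type f -name '*.py' | wc -l, run from ~/project/flask.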

Excluding Specific Directories

You can also exclude specific directories from your analysis using the --exclude-dir option:

cloc --exclude-dir=tests .

This command analyzes the Flask project but skips every directory named tests (the option matches directory names, not paths). The output shows statistics for the remaining files:

      34 text files.
      34 unique files.
      14 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.07 s (372.3 files/s, 45941.8 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Python                           25           1546           1103           3421
Markdown                          5            175              0            314
YAML                              2             10              3             84
make                              1             21             30             46
TOML                              1              6              0             19
--------------------------------------------------------------------------------
SUM:                             34           1758           1136           3884
--------------------------------------------------------------------------------

By using these filtering options, you can focus your analysis on specific parts of the codebase that are most relevant to your needs.
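Comparing a full run with an excluded run also tells you how large the excluded part is. As a rough worked example, the sketch below subtracts the Python figures of the --exclude-dir=tests run from those of the full run, using the sample numbers shown in this lab. The variable names are illustrative.

```shell
# Python figures taken from the two sample tables in this lab.
all_files=41;   all_code=7012     # cloc .
rest_files=25;  rest_code=3421    # cloc --exclude-dir=tests .

# Whatever the excluded run no longer counts must live in tests/.
tests_files=$(( all_files - rest_files ))
tests_code=$(( all_code - rest_code ))
tests_share=$(( 100 * tests_code / all_code ))   # integer percent, rounded down

echo "tests/: $tests_files Python files, $tests_code code lines (~$tests_share% of Python code)"
```

With these figures, roughly half of the Python code in the sample output belongs to the test suite.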

Comparing Projects and Generating Reports

CLOC provides features for comparing different projects and generating reports in various formats. These capabilities are particularly useful for tracking changes over time or comparing different codebases.

Comparing Two Directories

Let's create a simple project to compare with Flask:

  1. Navigate to the project directory:
cd ~/project
  2. Create a new directory for a simple Python project:
mkdir sample_project
cd sample_project
  3. Create a few Python files with some code:
echo 'def hello_world():
    """
    A simple function that prints Hello World
    """
    print("Hello, World!")

if __name__ == "__main__":
    hello_world()' > main.py
echo 'class Calculator:
    """A simple calculator class"""
    
    def add(self, a, b):
        """Add two numbers"""
        return a + b
        
    def subtract(self, a, b):
        """Subtract b from a"""
        return a - b' > calculator.py
  4. Now, let's compare this sample project with the Flask project using CLOC's diff feature:
cd ~/project
cloc --diff flask sample_project

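The output summarizes how the two trees differ. CLOC's diff report is normally broken down per language into same, modified, added, and removed counts, so the layout you see may differ from this sample: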

       2 text files.
       2 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.01 s (195.2 files/s, 1756.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           2              3              4             11
-------------------------------------------------------------------------------
SUM:                             2              3              4             11
-------------------------------------------------------------------------------

Diff by file type:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          39           3058           2084           7001
Markdown                         5            175              0            314
YAML                             2             10              3             84
make                             1             21             30             46
TOML                             1              6              0             19
-------------------------------------------------------------------------------
SUM:                            48           3270           2117           7464
-------------------------------------------------------------------------------

Generating Reports in Different Formats

CLOC can generate reports in various formats, including CSV and XML. Let's create a CSV report for the Flask project:

  1. Navigate to the Flask project:
cd ~/project/flask
  2. Generate a CSV report:
cloc --csv --out=flask_stats.csv .
  3. View the contents of the generated report:
cat flask_stats.csv

You should see the CLOC analysis in CSV format:

files,language,blank,comment,code,"github.com/AlDanial/cloc v 1.90 T=0.09 s (571.3 files/s, 96263.8 lines/s)"
41,Python,3061,2088,7012
5,Markdown,175,0,314
2,YAML,10,3,84
1,make,21,30,46
1,TOML,6,0,19
50,SUM,3273,2121,7475

This CSV format is particularly useful for importing into spreadsheets or other data analysis tools.
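The CSV report is also easy to process directly in the shell. As a minimal sketch, the awk script below computes each language's share of the total code lines. It uses the sample rows from this lab with the header shortened to its column names, and it assumes the report ends with a SUM row, as in the sample.

```shell
# Sample report copied from the CSV above (header shortened to its column names).
csv_report='files,language,blank,comment,code
41,Python,3061,2088,7012
5,Markdown,175,0,314
2,YAML,10,3,84
1,make,21,30,46
1,TOML,6,0,19
50,SUM,3273,2121,7475'

# Remember each language row, use the SUM row as the total, then print shares.
shares=$(printf '%s\n' "$csv_report" | awk -F, '
    NR == 1        { next }                 # skip the header line
    $2 == "SUM"    { total = $5; next }     # total code lines
                   { n++; lang[n] = $2; code[n] = $5 }
    END {
        for (i = 1; i <= n; i++)
            printf "%s %.1f%%\n", lang[i], 100 * code[i] / total
    }')
printf '%s\n' "$shares"
```

To run the same calculation on the real report, pass flask_stats.csv to awk as a file argument instead of piping in the sample text.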

Similarly, you can generate an XML report:

cloc --xml --out=flask_stats.xml .

These reporting capabilities make CLOC a versatile tool for code analysis and project management.

Advanced CLOC Options for Detailed Analysis

CLOC offers several advanced options for more detailed and customized analysis. These options allow you to gain deeper insights into your codebase.

Showing Progress During Analysis

When analyzing large projects, it's helpful to see progress. Use the --progress-rate option to display progress updates:

cd ~/project/flask
cloc --progress-rate=10 .

This command will display a progress update after every 10 files are processed.

Analyzing Files Matching a Pattern

You can analyze files that match a specific pattern using the --match-f option:

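cloc --match-f='^test_.*\.py$' .

This command counts lines of code only in files whose names start with "test_" and end with ".py". The pattern is a regular expression applied to the file name, and the ^ and $ anchors tie it to the start and end of the name. The output will show statistics for test files in the Flask project: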

      15 text files.
      15 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.03 s (541.8 files/s, 155547.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          15           1337            736           3193
-------------------------------------------------------------------------------
SUM:                            15           1337            736           3193
-------------------------------------------------------------------------------

This feature is particularly useful when you want to focus on specific file patterns within a large project.
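Because --match-f takes a regular expression, it can help to try a pattern on a few file names before handing it to CLOC. The sketch below uses grep -E as a stand-in for CLOC's own matching, which behaves the same for a simple pattern like this one. The file names are hypothetical. It shows how the ^ anchor changes the result.

```shell
# Hypothetical file names used only to exercise the patterns.
names='test_app.py
conftest.py
my_test_util.py
test_data.json'

# Unanchored: matches "test_" anywhere in the name.
loose=$(printf '%s\n' "$names" | grep -E 'test_.*\.py$')
# Anchored with ^: the name must start with "test_".
strict=$(printf '%s\n' "$names" | grep -E '^test_.*\.py$')

echo "unanchored: $(echo $loose)"
echo "anchored:   $(echo $strict)"
```

The unanchored pattern also accepts my_test_util.py, while the anchored one keeps only test_app.py.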

Displaying Results by File

To see a breakdown of the analysis by individual files rather than just by language, use the --by-file option:

cloc --by-file --include-ext=py src/

This command analyzes Python files in the src directory and displays results for each file individually:

    7 text files.
    7 unique files.
    0 files ignored.

github.com/AlDanial/cloc v 1.90  T=0.01 s (648.3 files/s, 180431.1 lines/s)
-------------------------------------------------------------------------------
File                          blank        comment           code
-------------------------------------------------------------------------------
src/flask/ctx.py                123            193            539
src/flask/app.py                284            490            999
src/flask/blueprints.py         100            191            421
src/flask/cli.py                126            188            557
src/flask/helpers.py            136            227            538
src/flask/templating.py          28             60            123
src/flask/globals.py             22             63             74
-------------------------------------------------------------------------------
SUM:                            819           1412           3251
-------------------------------------------------------------------------------

This detailed view helps you identify which specific files contribute most to your codebase size.
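To pull out the largest files from a per-file listing, you can sort the rows on the code column. The sketch below works on the sample rows shown above rather than on live CLOC output, and it assumes file paths without spaces so that the code count is always the fourth field.

```shell
# Per-file rows copied from the sample output above.
by_file='src/flask/ctx.py                123            193            539
src/flask/app.py                284            490            999
src/flask/blueprints.py         100            191            421
src/flask/cli.py                126            188            557
src/flask/helpers.py            136            227            538
src/flask/templating.py          28             60            123
src/flask/globals.py             22             63             74'

# Sort numerically on the code column (field 4), largest first, and keep the top three.
top3=$(printf '%s\n' "$by_file" | sort -k4,4nr | head -n 3 | awk '{ print $4, $1 }')
printf '%s\n' "$top3"
```

For the sample data this lists app.py, cli.py, and ctx.py, in that order.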

These advanced options make CLOC a versatile tool for comprehensive code analysis in various scenarios.

Summary

In this lab, you have learned how to use CLOC (Count Lines of Code) to analyze codebases and gain valuable insights into project composition. Here's a summary of what you've accomplished:

  1. Basic Usage: You learned how to use the basic cloc command to analyze directories and count lines of code, comments, and blank lines across different programming languages.

  2. Filtering and Targeting: You explored how to focus your analysis on specific file types using --include-ext and how to exclude directories with --exclude-dir.

  3. Comparison and Reporting: You learned how to compare different codebases using the --diff option and how to generate reports in various formats like CSV and XML.

  4. Advanced Analysis: You discovered advanced options that provide more detailed insights, such as progress tracking with --progress-rate, pattern matching with --match-f, and file-by-file analysis with --by-file.

CLOC is a powerful tool for developers, project managers, and code auditors to quantify the size and complexity of codebases. By using CLOC effectively, you can:

  • Understand the composition of your projects
  • Track coding productivity
  • Estimate project complexity
  • Make informed decisions about code architecture and maintenance
  • Generate reports for documentation and analysis

These capabilities make CLOC an essential tool in any developer's toolkit for code analysis and project management.