Development of Golang Caching Component

Introduction

In this project, we will learn about the principles and significance of caching, and then we will design and implement a caching component using the Go language.

Caching is a widely used technique in computer systems to improve performance by storing frequently accessed data in memory. This allows for faster retrieval and reduces the need to access slower data sources, such as databases or remote services.

The caching component will support storing cached data, managing expired data items, importing and exporting data, and CRUD (Create, Read, Update, Delete) operations.

By completing this project, you will gain knowledge and skills in caching principles, data structures, and Go programming. This will enable you to build efficient and high-performance software systems that make effective use of caching techniques.

🎯 Tasks

In this project, you will learn:

  • How to understand the principles and significance of caching
  • How to design a caching system to store and manage data in memory
  • How to implement CRUD operations and expiration management for the caching system
  • How to add functionality for importing and exporting data from the caching system

🏆 Achievements

After completing this project, you will be able to:

  • Explain the principles and benefits of caching
  • Design a caching system based on sound design principles
  • Implement efficient data structures and algorithms for cache management
  • Develop CRUD operations in Go for the caching system
  • Serialize and deserialize data for import and export operations

What Is a Cache?

Caches are common in computer hardware. For example, CPUs have first-level, second-level, and often third-level caches. When the CPU needs to read data, it first looks for it in the cache. If the data is found, it is used directly; if not, it is read from main memory. Because the CPU cache is much faster than main memory, caching speeds up processing.

Caching exists in software systems as well. In web systems, caches are found on servers, clients, and proxy servers, and the widely used CDN (Content Delivery Network) can be seen as one huge caching system. Caching in web systems brings many benefits, such as reducing network traffic, lowering client access latency, and reducing server load.
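The "check the cache first, fall back to the slower source on a miss" pattern described above can be sketched in a few lines of Go. The names slowSource and lookup are hypothetical and exist only for this illustration; they are not part of the component built later in this project.

```go
package main

import "fmt"

// slowSource stands in for a slower backing store (main memory, a database,
// a remote service). It counts how often it is consulted.
type slowSource struct {
	data  map[string]string
	reads int
}

func (s *slowSource) fetch(key string) (string, bool) {
	s.reads++
	v, ok := s.data[key]
	return v, ok
}

// lookup checks the cache first and falls back to the slow source on a miss,
// storing the result so that the next lookup for the same key is a hit.
func lookup(cache map[string]string, src *slowSource, key string) (string, bool) {
	if v, ok := cache[key]; ok {
		return v, true // cache hit
	}
	v, ok := src.fetch(key) // cache miss
	if ok {
		cache[key] = v
	}
	return v, ok
}

func main() {
	src := &slowSource{data: map[string]string{"a": "alpha"}}
	cache := map[string]string{}

	lookup(cache, src, "a") // miss: reads from the source
	lookup(cache, src, "a") // hit: served from the cache
	fmt.Println("source reads:", src.reads) // prints "source reads: 1"
}
```

The second lookup never touches the slow source, which is where the performance gain comes from.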

Many high-performance caching systems are available today, such as Memcached and Redis. Redis in particular is widely used in web services. With such feature-rich systems available, why implement our own? There are two main reasons. First, implementing a cache ourselves is a classic way to understand how caching systems work. Second, systems like Redis run as separate servers. For a simple application, deploying and operating a separate Redis server may be unnecessarily complex. In that case, a software package that implements the caching functionality in-process is a better fit: by simply importing the package, we get caching without a separate server.

Design of Cache System

In a cache system, cached data is usually stored in memory. Therefore, the cache system we design should manage the data in memory in a certain way. If the system shuts down, won't the data be lost? In fact, in most cases, the cache system also supports writing the data in memory to a file. When the system restarts, the data in the file can be loaded back into memory. This way, even if the system shuts down, the cache data will not be lost.

The cache system also provides a mechanism for cleaning up expired data. Each data item in the cache has a lifetime, and once an item expires it is deleted from memory. As a result, hot data stays in the cache, while cold data, which no longer needs to be cached, is eventually removed.

The cache system also needs to provide interfaces for external operations so that other components of the system can make use of the cache. Typically, the cache system needs to support CRUD operations, which include creation (adding), reading, updating, and deletion.

Based on the above analysis, we can summarize that the cache system needs to have the following functionalities:

  • Storage of cached data
  • Management of expired data items
  • Importing and exporting of data from memory
  • Provision of CRUD interfaces

Development Preparation

First, create a working directory and set the GOPATH environment variable:

cd ~/project/
mkdir -p golang/src
export GOPATH=~/project/golang

In the above steps, we create the ~/project/golang directory and set it as the GOPATH for the subsequent experiments.

Basic Structure of a Cache System

Cached data must be kept in memory so that it can be accessed quickly. Which data structure should hold the data items? A hash table is the usual choice, because lookups, insertions, and deletions by key take constant time on average. In Go we don't need to implement our own hash table, because the built-in map type already is one. We can therefore store the cache data items directly in a map.
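As a small illustration of using a map as the store, the sketch below shows the "comma ok" lookup form, which distinguishes a missing key from a stored zero value. The store type and describe function are hypothetical names used only here.

```go
package main

import "fmt"

// store is a hypothetical stand-in for the cache's underlying map.
type store map[string]interface{}

// describe reports whether key is present, using the comma-ok form so that
// a stored zero value can be told apart from a missing key.
func describe(s store, key string) string {
	v, ok := s[key]
	if !ok {
		return key + ": missing"
	}
	return fmt.Sprintf("%s: %v", key, v)
}

func main() {
	s := store{"count": 0, "name": "labex"}
	fmt.Println(describe(s, "count")) // count: 0
	fmt.Println(describe(s, "other")) // other: missing

	delete(s, "name") // built-in delete removes a key
	fmt.Println(describe(s, "name")) // name: missing
}
```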

Since the cache system also supports cleaning expired data, the cache data items should have a lifespan. This means that each data item needs to be wrapped in a structure before it is saved in the cache system. To do this, we first implement the cache data item. Create a new directory named cache in the $GOPATH/src directory, and create a source file cache.go in it:

package cache

import (
	"encoding/gob"
	"fmt"
	"io"
	"os"
	"sync"
	"time"
)

type Item struct {
    Object     interface{} // The actual data item
    Expiration int64       // Expiration time in Unix nanoseconds; 0 means never expires
}

// Check if the data item has expired
func (item Item) Expired() bool {
    if item.Expiration == 0 {
        return false
    }
    return time.Now().UnixNano() > item.Expiration
}

In the above code, we define an Item structure with two fields. Object stores a data object of any type, and Expiration stores the expiration time of the data item as a Unix timestamp in nanoseconds; a value of 0 means the item never expires. The Expired() method returns a boolean indicating whether the data item has expired. How do we determine whether a data item has expired? We record the expiration time of each data item, and the cache system periodically checks every item. If an item's expiration time is earlier than the current time, the item is removed from the cache system. To run this check periodically, we'll use the time package.

With this, we can now implement the framework of the cache system. The code is as follows:

const (
    // Flag for no expiration time
    NoExpiration time.Duration = -1

    // Default expiration time
    DefaultExpiration time.Duration = 0
)

type Cache struct {
    defaultExpiration time.Duration
    items             map[string]Item // Store cache data items in a map
    mu                sync.RWMutex    // Read-write lock
    gcInterval        time.Duration   // Expiration data cleaning interval
    stopGc            chan bool
}

// Clean expired cache data items
func (c *Cache) gcLoop() {
    ticker := time.NewTicker(c.gcInterval)
    for {
        select {
        case <-ticker.C:
            c.DeleteExpired()
        case <-c.stopGc:
            ticker.Stop()
            return
        }
    }
}

In the above code, we have implemented the Cache structure, which represents the cache system. The items field is a map that stores the cache data items. We have also implemented the gcLoop() method, which uses a time.Ticker to run the DeleteExpired() method periodically. A ticker created with time.NewTicker() delivers a value on its ticker.C channel once every gcInterval, and each time a value arrives we call DeleteExpired().
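The ticker pattern can be tried on its own. The following sketch runs a task on every tick until it is told to stop. It is a variation rather than the code above: tickUntilStopped and runDemo are hypothetical names, and the stop signal is given by closing a chan struct{} instead of sending a bool.

```go
package main

import (
	"fmt"
	"time"
)

// tickUntilStopped runs task on every tick and returns once stop is closed.
func tickUntilStopped(interval time.Duration, task func(), stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			task()
		case <-stop:
			return
		}
	}
}

// runDemo starts the loop, waits for n ticks, stops the loop, and reports
// whether the loop goroutine exited within one second.
func runDemo(n int) bool {
	stop := make(chan struct{})
	ticks := make(chan struct{}, 1)
	finished := make(chan struct{})

	go func() {
		tickUntilStopped(5*time.Millisecond, func() {
			select {
			case ticks <- struct{}{}:
			default: // drop the tick if nobody is waiting for it
			}
		}, stop)
		close(finished)
	}()

	for i := 0; i < n; i++ {
		<-ticks
	}
	close(stop)

	select {
	case <-finished:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("loop stopped:", runDemo(3))
}
```

Stopping the ticker when the loop exits releases its resources, which is why the loop needs an explicit exit path.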

To ensure that the gcLoop() function can end normally, we listen for data from the c.stopGc channel. If there is data sent to this channel, we stop the execution of gcLoop(). Also note that we define the NoExpiration and DefaultExpiration constants, where the former represents a data item that never expires and the latter represents a data item with a default expiration time. How do we implement DeleteExpired()? See the code below:

// Delete a cache data item
func (c *Cache) delete(k string) {
    delete(c.items, k)
}

// Delete expired data items
func (c *Cache) DeleteExpired() {
    now := time.Now().UnixNano()
    c.mu.Lock()
    defer c.mu.Unlock()

    for k, v := range c.items {
        if v.Expiration > 0 && now > v.Expiration {
            c.delete(k)
        }
    }
}

As you can see, the DeleteExpired() method is quite simple. We just need to iterate through all the data items and delete the expired ones.
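Deleting entries from a map while ranging over it is permitted in Go, which is what makes this loop so short. A minimal standalone sketch, using a hypothetical purgeExpired function over a map of deadlines and an explicit "now" value:

```go
package main

import "fmt"

// purgeExpired removes every entry whose deadline is set (non-zero) and
// earlier than now, and returns the number of entries removed.
func purgeExpired(deadlines map[string]int64, now int64) int {
	removed := 0
	for k, d := range deadlines {
		if d > 0 && now > d {
			delete(deadlines, k) // deleting during range is allowed in Go
			removed++
		}
	}
	return removed
}

func main() {
	deadlines := map[string]int64{"a": 100, "b": 200, "c": 0}
	removed := purgeExpired(deadlines, 150)
	fmt.Println(removed, len(deadlines)) // prints "1 2"
}
```

Only "a" is removed: "b" has not reached its deadline yet, and "c" has a deadline of 0, meaning it never expires.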

Implementing CRUD Interface for Cache System

Now, we can implement the CRUD interface for the cache system. We can add data to the cache system using the following interfaces:

// Set cache data item, overwrite if the item exists
func (c *Cache) Set(k string, v interface{}, d time.Duration) {
    var e int64
    if d == DefaultExpiration {
        d = c.defaultExpiration
    }
    if d > 0 {
        e = time.Now().Add(d).UnixNano()
    }
    c.mu.Lock()
    defer c.mu.Unlock()
    c.items[k] = Item{
        Object:     v,
        Expiration: e,
    }
}

// Set data item without lock operation
func (c *Cache) set(k string, v interface{}, d time.Duration) {
    var e int64
    if d == DefaultExpiration {
        d = c.defaultExpiration
    }
    if d > 0 {
        e = time.Now().Add(d).UnixNano()
    }
    c.items[k] = Item{
        Object:     v,
        Expiration: e,
    }
}

// Get data item, also need to check if item has expired
func (c *Cache) get(k string) (interface{}, bool) {
    item, found := c.items[k]
    if !found {
        return nil, false
    }
    if item.Expired() {
        return nil, false
    }
    return item.Object, true
}

// Add data item, returns error if item already exists
func (c *Cache) Add(k string, v interface{}, d time.Duration) error {
    c.mu.Lock()
    _, found := c.get(k)
    if found {
        c.mu.Unlock()
        return fmt.Errorf("Item %s already exists", k)
    }
    c.set(k, v, d)
    c.mu.Unlock()
    return nil
}

// Get data item
func (c *Cache) Get(k string) (interface{}, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()

    item, found := c.items[k]
    if !found {
        return nil, false
    }
    if item.Expired() {
        return nil, false
    }
    return item.Object, true
}

In the above code, we have implemented the Set() and Add() interfaces. The main difference between the two is that Set() overwrites the data item if it already exists, while Add() returns an error in that case, which prevents cached data from being overwritten by mistake. We have also implemented the Get() method, which retrieves a data item from the cache system. Note that a cache data item counts as existing only if it is present in the map and has not expired.
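The difference between Set() and Add(), and the rule that an expired item counts as absent, can be seen in a reduced model. The mini type below is a hypothetical, single-goroutine stand-in for Cache: it has no locking and uses a manually advanced clock (the now field) so the behavior can be shown without sleeping.

```go
package main

import (
	"errors"
	"fmt"
)

type entry struct {
	value    string
	deadline int64 // 0 means no deadline
}

// mini is a simplified stand-in for Cache with an adjustable clock.
type mini struct {
	items map[string]entry
	now   int64
}

// live reports whether k is present and has not expired.
func (m *mini) live(k string) bool {
	e, ok := m.items[k]
	return ok && (e.deadline == 0 || m.now <= e.deadline)
}

// Set always writes, overwriting any existing entry.
func (m *mini) Set(k, v string, deadline int64) {
	m.items[k] = entry{value: v, deadline: deadline}
}

// Add writes only when no live entry exists under k.
func (m *mini) Add(k, v string, deadline int64) error {
	if m.live(k) {
		return errors.New("item " + k + " already exists")
	}
	m.Set(k, v, deadline)
	return nil
}

func main() {
	m := &mini{items: map[string]entry{}, now: 10}
	m.Set("k", "v1", 20)

	// k is still live, so Add refuses to overwrite it.
	fmt.Println(m.Add("k", "v2", 20)) // item k already exists

	// Move the clock past the deadline: the expired entry counts as absent.
	m.now = 30
	fmt.Println(m.Add("k", "v3", 0)) // <nil>
	fmt.Println(m.items["k"].value)  // v3
}
```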

Next, we can implement the delete and update interfaces.

// Replace an existing data item
func (c *Cache) Replace(k string, v interface{}, d time.Duration) error {
    c.mu.Lock()
    _, found := c.get(k)
    if !found {
        c.mu.Unlock()
        return fmt.Errorf("Item %s doesn't exist", k)
    }
    c.set(k, v, d)
    c.mu.Unlock()
    return nil
}

// Delete a data item
func (c *Cache) Delete(k string) {
    c.mu.Lock()
    c.delete(k)
    c.mu.Unlock()
}

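Replace() updates a data item only if it already exists and has not expired, and returns an error otherwise. Delete() removes the data item with the given key; deleting a key that is not present does nothing.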

Import and Export of Cache System

Previously, we mentioned that the cache system supports exporting data to a file and loading data back from a file. Now let's implement this functionality.

// Write cache data items to io.Writer
func (c *Cache) Save(w io.Writer) (err error) {
    enc := gob.NewEncoder(w)
    defer func() {
        if x := recover(); x != nil {
            err = fmt.Errorf("Error registering item types with Gob library")
        }
    }()
    c.mu.RLock()
    defer c.mu.RUnlock()
    for _, v := range c.items {
        gob.Register(v.Object)
    }
    err = enc.Encode(&c.items)
    return
}

// Save data items to a file
func (c *Cache) SaveToFile(file string) error {
    f, err := os.Create(file)
    if err != nil {
        return err
    }
    if err = c.Save(f); err != nil {
        f.Close()
        return err
    }
    return f.Close()
}

// Read data items from io.Reader
func (c *Cache) Load(r io.Reader) error {
    dec := gob.NewDecoder(r)
    items := map[string]Item{}
    err := dec.Decode(&items)
    if err == nil {
        c.mu.Lock()
        defer c.mu.Unlock()
        for k, v := range items {
            ov, found := c.items[k]
            if !found || ov.Expired() {
                c.items[k] = v
            }
        }
    }
    return err
}

// Load cache data items from a file
func (c *Cache) LoadFile(file string) error {
    f, err := os.Open(file)
    if err != nil {
        return err
    }
    if err = c.Load(f); err != nil {
        f.Close()
        return err
    }
    return f.Close()
}

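In the above code, the Save() method encodes the cache data into binary form using the encoding/gob package and writes it to an object that implements the io.Writer interface. Because Object can hold a value of any type, Save() first registers the concrete type of each value with gob.Register(). Conversely, the Load() method reads binary data from an io.Reader and deserializes it with gob; a loaded item is stored only if the cache has no unexpired item under the same key. In short, we are serializing and deserializing the cache data.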
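The gob round trip can be tried in isolation with an in-memory buffer instead of a file. In this sketch, Item mirrors the shape of the cache's Item type, while User and roundTrip are hypothetical names introduced for the example. The User type is registered in init() because gob needs to know the concrete types that are stored behind interface{} values.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Item mirrors the shape of the cache's Item type.
type Item struct {
	Object     interface{}
	Expiration int64
}

// User is a hypothetical application type stored in the cache.
type User struct {
	Name string
	Age  int
}

func init() {
	// Concrete types stored behind interface{} must be registered with gob.
	gob.Register(User{})
}

// roundTrip encodes items into an in-memory buffer and decodes them again.
func roundTrip(items map[string]Item) (map[string]Item, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(items); err != nil {
		return nil, err
	}
	out := map[string]Item{}
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	in := map[string]Item{
		"greeting": {Object: "hello"},
		"user":     {Object: User{Name: "ann", Age: 30}, Expiration: 42},
	}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out["greeting"].Object, out["user"].Object, out["user"].Expiration)
}
```

Because bytes.Buffer implements both io.Writer and io.Reader, the same approach works with the Save() and Load() methods when no file is needed.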

Other Interfaces of the Cache System

At this point, the core functionality of the cache system is complete. Let's wrap up with a few remaining methods.

// Return the number of cached data items
func (c *Cache) Count() int {
    c.mu.RLock()
    defer c.mu.RUnlock()
    return len(c.items)
}

// Clear the cache
func (c *Cache) Flush() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.items = map[string]Item{}
}

// Stop cleaning expired cache
func (c *Cache) StopGc() {
    c.stopGc <- true
}

// Create a new cache system
func NewCache(defaultExpiration, gcInterval time.Duration) *Cache {
    c := &Cache{
        defaultExpiration: defaultExpiration,
        gcInterval:        gcInterval,
        items:             map[string]Item{},
        stopGc:            make(chan bool),
    }
    // Start the expiration cleanup goroutine
    go c.gcLoop()
    return c
}

In the above code, we have added several methods. Count() returns the number of data items held in the cache (including expired items that have not yet been cleaned up), Flush() clears the entire cache, and StopGc() stops the cache system from cleaning expired data items. Finally, we can create a new cache system using the NewCache() function, which also starts the cleanup goroutine.

Up to now, the entire cache system has been completed. It is quite simple, isn't it? Now let's perform some testing.

Testing the Cache System

We will write a sample program whose source code is located at ~/project/golang/src/cache/sample/sample.go. The content of the program is as follows:

package main

import (
    "cache"
    "fmt"
    "time"
)

func main() {
    // Default lifetime of half an hour; expired items are cleaned every 3 seconds
    c := cache.NewCache(30*time.Minute, 3*time.Second)

    k1 := "hello labex"

    // Store k1 with a lifetime of 5 seconds
    c.Set("k1", k1, 5*time.Second)
    if v, found := c.Get("k1"); found {
        fmt.Println("Found k1:", v)
    } else {
        fmt.Println("Not found k1")
    }
    // Pause for 10 seconds
    time.Sleep(10 * time.Second)
    // Now k1 should have been cleared
    if v, found := c.Get("k1"); found {
        fmt.Println("Found k1:", v)
    } else {
        fmt.Println("Not found k1")
    }
}

The sample code is very simple. We create a cache system using the NewCache() function, with a cleaning interval of 3 seconds for expired data and a default expiration time of half an hour. We then set a data item "k1" with an expiration time of 5 seconds. After setting the data item, we immediately retrieve it and then pause for 10 seconds. When we retrieve "k1" again, it should have been cleared. The program can be executed using the following commands:

cd ~/project/golang/src/cache
go mod init
go run sample/sample.go

The output will be as follows:

Found k1: hello labex
Not found k1

Summary

In this project, we have developed a cache system that supports adding, deleting, replacing, and querying data objects, as well as deleting expired data and saving the cache to and loading it from a file. If you are familiar with Redis, you may know that it can increment a numeric value associated with a key. For instance, if the value of "key" is 20, passing 2 to the Increment interface increases it to 22. Would you like to try implementing this functionality in our cache system?
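As a starting point for that exercise, here is one possible sketch, not a reference solution. It uses a hypothetical counterCache type that stores plain interface{} values and ignores expiration, and it assumes that numeric values are stored as int. A fuller version would work on the Cache type, respect item expiration, and perhaps support other numeric types.

```go
package main

import (
	"fmt"
	"sync"
)

// counterCache is a hypothetical, stripped-down cache used only to
// illustrate Increment; it has no expiration handling.
type counterCache struct {
	mu    sync.Mutex
	items map[string]interface{}
}

// Increment adds n to the value stored under k and returns the new value.
// It assumes numeric values are stored as int; anything else is an error.
func (c *counterCache) Increment(k string, n int) (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	v, found := c.items[k]
	if !found {
		return 0, fmt.Errorf("item %s doesn't exist", k)
	}
	cur, ok := v.(int)
	if !ok {
		return 0, fmt.Errorf("item %s is not an int", k)
	}
	cur += n
	c.items[k] = cur
	return cur, nil
}

func main() {
	c := &counterCache{items: map[string]interface{}{"key": 20, "name": "labex"}}

	v, err := c.Increment("key", 2)
	fmt.Println(v, err) // 22 <nil>

	_, err = c.Increment("name", 1)
	fmt.Println(err) // item name is not an int
}
```

The read, the addition, and the write all happen while the lock is held, so two goroutines incrementing the same key cannot lose an update.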
