ZFS Backup Tool Part 6

Now that I can read and write a snapshot, how do I process a list of snapshots in a useful manner? First, let me define what I mean by a useful manner. I want the tool to keep a copy on the destination tree of every automatic snapshot on the source ZFS tree. As an automatic snapshot is aged off of the source, it needs to be aged off of the destination as well. The tool will transfer snapshots one at a time instead of transferring all of the intermediate snapshots at once; that is, it will use the zfs send ‘-i’ option rather than the ‘-I’ option.
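To make the ‘-i’ choice concrete, here is a minimal sketch of how the one-at-a-time transfers could be laid out. The function name incrementalSendArgs is hypothetical and not part of the tool, and the sketch assumes the snapshot names are already sorted oldest to newest and that the oldest one already exists on the destination.

```go
package main

// incrementalSendArgs is a hypothetical helper, not part of the tool.
// Given snapshot names ordered oldest to newest, it returns one
// "zfs send -i <previous> <current>" argument list per adjacent pair,
// so each transfer carries exactly one snapshot.
func incrementalSendArgs(snapshots []string) [][]string {
	var cmds [][]string
	for i := 1; i < len(snapshots); i++ {
		cmds = append(cmds, []string{"zfs", "send", "-i", snapshots[i-1], snapshots[i]})
	}
	return cmds
}
```

With ‘-I’, a single command would cover the whole range between the first and last snapshot. Sending pair by pair costs more commands but lets the tool stop or retry at any snapshot boundary.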

The best data structure for this is a tree or graph. The tree starts with a list of yearly snapshots. Every snapshot has two slices of children: one for the child-frequency snapshots older than it and one for those younger than it. The younger slice will be populated only if the current snapshot is the youngest child at its frequency stratum. A picture demonstrating my idea follows this paragraph. I will delve into implementation details in the next part of the ZFS Backup Tool series.

A diagram of a snapshot tree
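Until the implementation post arrives, the shape of such a node could look roughly like the sketch below. The type snapshotNode, its field names, and the walk function are all assumptions made for illustration, and the oldest-to-youngest visiting order is only one plausible way to traverse the tree.

```go
package main

import "time"

// snapshotNode is a hypothetical tree node: one snapshot plus the two
// slices of child-frequency snapshots described above.
type snapshotNode struct {
	Name      string
	Interval  uint64 // 0 = yearly ... 5 = frequent, as in Part 3
	TimeStamp time.Time
	Older     []*snapshotNode // children older than this snapshot
	Younger   []*snapshotNode // children younger than this snapshot
}

// walk visits the older children, then the node itself, then the
// younger children, which yields an oldest-to-youngest ordering.
func walk(n *snapshotNode, visit func(*snapshotNode)) {
	if n == nil {
		return
	}
	for _, c := range n.Older {
		walk(c, visit)
	}
	visit(n)
	for _, c := range n.Younger {
		walk(c, visit)
	}
}
```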

ZFS Backup Tool Part 5

Now that I can read a list of snapshots, I need to read a snapshot and transfer it to the destination. The three functions that allow me to do that are the StdinPipe() and StdoutPipe() methods of exec.Cmd and io.CopyBuffer().

The process consists of the following steps:

  1. Create an exec.Cmd representing the zfs send command.
  2. Call StdoutPipe() on the command created in step 1 to connect a pipe to its output.
  3. Create an exec.Cmd representing the zfs receive command.
  4. Call StdinPipe() on the command created in step 3 to connect a pipe to its input.
  5. Start both commands.
  6. Use io.CopyBuffer() to copy from the send pipe to the receive pipe.
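The steps above can be sketched as follows. This is not the tool's actual code: pipeCommands is a hypothetical name, it takes plain argument lists so it can be tried with any two commands, and it assumes both lists are non-empty. The buffer size is arbitrary.

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
)

// pipeCommands is a hypothetical helper. It runs sendArgv and recvArgv as
// child processes, copies the sender's standard output into the receiver's
// standard input, and returns the number of bytes copied.
func pipeCommands(sendArgv, recvArgv []string) (int64, error) {
	send := exec.Command(sendArgv[0], sendArgv[1:]...)
	sendOut, err := send.StdoutPipe()
	if err != nil {
		return 0, fmt.Errorf("stdout pipe: %w", err)
	}
	recv := exec.Command(recvArgv[0], recvArgv[1:]...)
	recvIn, err := recv.StdinPipe()
	if err != nil {
		return 0, fmt.Errorf("stdin pipe: %w", err)
	}
	if err := recv.Start(); err != nil {
		return 0, fmt.Errorf("start receiver: %w", err)
	}
	if err := send.Start(); err != nil {
		recvIn.Close() // let the receiver see end of input and exit
		recv.Wait()
		return 0, fmt.Errorf("start sender: %w", err)
	}
	buf := make([]byte, 1<<20)
	n, copyErr := io.CopyBuffer(recvIn, sendOut, buf)
	if copyErr != nil {
		// Stop reading so a blocked sender sees a broken pipe and exits.
		sendOut.Close()
	}
	// Closing stdin tells the receiver that the stream has ended.
	closeErr := recvIn.Close()
	sendErr := send.Wait()
	recvErr := recv.Wait()
	for _, e := range []error{copyErr, closeErr, sendErr, recvErr} {
		if e != nil {
			return n, e
		}
	}
	return n, nil
}
```

For a dry run without touching a pool, calling pipeCommands with []string{"echo", "hello"} and []string{"cat"} exercises the same plumbing. For a real transfer the two lists would hold the zfs send and zfs receive command lines.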

You can view the code here.

Self Hosting a Git Server

Which software to use?

With the ZFS backup tool, I want to host the code for it here on my website instead of GitHub. What options are available? If I only want to host the bare repo, I can use SSH for write access and add a virtual host to Apache so you can have read access. If I want a nice web interface, though, I need a different setup.

A bit of online searching shows four major self-hosted Git web frontends. They are GitLab, Gitea, GitBucket, and Gogs. GitLab and GitBucket are out because they require a lot of extra software to support the service. GitLab could almost qualify as its own Linux distro with a bit more work. GitBucket is nearly as bad. That leaves the two clones, Gogs and Gitea. Gitea is a fork of Gogs with more maintainers. The larger pool of maintainers gives Gitea faster issue resolution, so I chose it.

System requirements

Gitea has modest system requirements: Golang, about 256MB of RAM, and optionally MariaDB, MySQL, or PostgreSQL. An external database is recommended for large sites. I will use MariaDB because I am already using it and have a working scheduled backup of my entire database server.

Installation

Since Ubuntu doesn’t have a current package for Gitea, I followed the From binary instructions on docs.gitea.io. I followed the MySQL portion of the Database preparation page to create the needed MariaDB database. I followed the Using Apache HTTPD as a reverse proxy section of the Reverse Proxies page to finish the setup.

The manual setup was quicker than the Docker setup I played with on my lab network.

You can explore my repositories by clicking the My Git Repositories link in the header menu on desktop or the dropdown menu on mobile.

Mustie1: Good Small Engine Channel

Mustie1 does videos of small engine repair. Most of his videos start with something simple that someone overlooked with the “dead” engine. He fixes that and usually cleans the engine as well.

Here are three videos where he fixed a forklift that someone abandoned because two previous mechanics wouldn’t follow their troubleshooting workflow to the end.

ZFS Backup Tool Part 4

Welcome to Part 4 of my series on my tool for backing up ZFS Snapshots to an external device. In this part, I am discussing how to exec a command and read its output.

To deal with external commands in Go, you use the os/exec package. The primary pieces of the package that I need for now are exec.Command() and CombinedOutput(). exec.Command() sets up the Command structure with the command and any arguments that I am passing to it.

var listCommand = exec.Command("zfs", "list", "-Hrt", "snapshot", "dpool")

That code creates a variable called listCommand, which is ready to run the command zfs with list, -Hrt, snapshot, and dpool as individual arguments.

var snapList, err = listCommand.CombinedOutput()

That line of code runs the command I previously prepared and puts both its standard output and standard error into a slice of bytes. If the command fails to run or exits with a code other than 0, CombinedOutput sets err to a non-nil value. snapList will still contain whatever the command wrote to standard error, so printing snapList’s contents will be useful for debugging.

var snapScanner = bufio.NewScanner(bytes.NewReader(snapList))
if err != nil {
	fmt.Println(listCommand)
	fmt.Println("Error trying to list snapshots:", err.Error())
	for snapScanner.Scan() {
		fmt.Println(snapScanner.Text())
	}
}

I will need to use the more complicated IO redirection tools provided in the os/exec package for the zfs send and zfs receive commands. However, for a test run today, I can use a modification of the loop I used to print the output from zfs if it errored.

for snapScanner.Scan() {
	if snapshotLineRegex.MatchString(snapScanner.Text()) {
		var temp = strings.SplitN(snapScanner.Text(), "\t", 2)
		var snapshot = ParseSnapshot(temp[0])
		if snapshot != nil {
			fmt.Println("I found snapshot", snapshot.Name(), "at", snapshot.Path())
		}
	}
}
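For readers who want to try the splitting step without the rest of the tool, here is a self-contained sketch. The function snapshotNames is hypothetical, and it swaps the regular expression check for a simple test that the first column contains an "@". It relies on zfs list -H printing tab-separated columns with the name first.

```go
package main

import (
	"bufio"
	"bytes"
	"strings"
)

// snapshotNames is a hypothetical helper. It takes the raw bytes printed by
// "zfs list -H -t snapshot" and returns the first tab-separated column of
// every line that looks like a snapshot name (contains "@").
func snapshotNames(output []byte) []string {
	var names []string
	scanner := bufio.NewScanner(bytes.NewReader(output))
	for scanner.Scan() {
		fields := strings.SplitN(scanner.Text(), "\t", 2)
		if strings.Contains(fields[0], "@") {
			names = append(names, fields[0])
		}
	}
	return names
}
```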

ZFS Backup Tool Part 3

Today’s project is parsing a snapshot into a custom datatype that gives us more accessible options to manipulate snapshots. First, the regular expression strings need to be moved into separate files so I can reference them across other files.

The essential parts of a snapshot are:

  • the pool name
  • the filesystem tree
  • the Interval
  • the TimeStamp

To parse a snapshot out of a string:

  • Confirm that the entire string matches our regular expression for snapshots, with nothing before or after the match.
  • If it does not, return an error; otherwise, continue.
  • Split the input string into the path and the snapshot name
  • Parse the snapshot name into the interval and timestamp fields
  • Split the path into the pool name and any filesystem tree portions.

Below is a function that implements the listed requirements.

/*
ParseSnapshot parses a string into a Snapshot.

It returns nil on error.
*/
func ParseSnapshot(input string) *Snapshot {
	var snapshotOnly, err = regexp.Compile("^" + PoolNameRegex + "@" + ZfsSnapshotNameRegex + "$")
	if err != nil {
		return nil
	}
	if !snapshotOnly.MatchString(input) {
		return nil
	}
	var snapshotPieces []string = snapshotOnly.FindStringSubmatch(input)
	var theSnapshot = Snapshot{}
	theSnapshot.Interval = intervalStringToUInt(snapshotPieces[1])
	var year, month, day, hour, minute int
	year, err = strconv.Atoi(snapshotPieces[2])
	if err != nil {
		return nil
	}
	month, err = strconv.Atoi(snapshotPieces[3])
	if err != nil {
		return nil
	}
	day, err = strconv.Atoi(snapshotPieces[4])
	if err != nil {
		return nil
	}
	hour, err = strconv.Atoi(snapshotPieces[5])
	if err != nil {
		return nil
	}
	minute, err = strconv.Atoi(snapshotPieces[6])
	if err != nil {
		return nil
	}
	theSnapshot.TimeStamp = time.Date(year, time.Month(month), day, hour, minute, 0, 0, time.UTC)
	var splitInput []string = strings.Split(input, "@")
	if len(splitInput) != 2 {
		return nil
	}
	var paths []string = strings.Split(splitInput[0], "/")
	theSnapshot.pool = paths[0]
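	if len(paths) > 1 {
		// copy into a nil slice copies nothing, so append the tree components instead
		theSnapshot.fsTree = append(theSnapshot.fsTree, paths[1:]...)
	}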
	return &theSnapshot
}

func intervalStringToUInt(input string) uint64 {
	switch input {
	case "yearly":
		return 0
	case "monthly":
		return 1
	case "weekly":
		return 2
	case "daily":
		return 3
	case "hourly":
		return 4
	}
	return 5
}
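PoolNameRegex and ZfsSnapshotNameRegex live in another file and are not shown here. As a rough illustration of what an anchored pattern with capture groups might look like, here is a sketch. The pattern, the variable snapshotPattern, and the function splitSnapshot are my assumptions, and the group numbering differs from the snapshotPieces indexes used above because this pattern also captures the path.

```go
package main

import "regexp"

// snapshotPattern is a hypothetical stand-in for the tool's regular
// expressions. Groups: 1 path, 2 interval, 3 year, 4 month, 5 day,
// 6 hour, 7 minute. It is compiled once at package level instead of on
// every call.
var snapshotPattern = regexp.MustCompile(
	`^([^@/]+(?:/[^@/]+)*)@zfs-auto-snap_(frequent|hourly|daily|weekly|monthly|yearly)-(\d{4})-(\d{2})-(\d{2})-(\d{2})(\d{2})$`)

// splitSnapshot returns the full match followed by the seven groups,
// or nil if input is not an automatic snapshot name.
func splitSnapshot(input string) []string {
	return snapshotPattern.FindStringSubmatch(input)
}
```

Because FindStringSubmatch returns nil when nothing matches, a single call can serve as both the match test and the field extraction.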

Now that I can create a Snapshot structure, I need some utility methods for it.

  • Read the pool name and file system tree as a single string.
  • Compare two snapshots by date and interval
  • Get the snapshot name
  • Get the full snapshot string. <path>@<snapshot name>

The following code will implement those utility methods.

/*
Path returns a string containing the path of the snapshot
*/
func (s Snapshot) Path() string {
	var temp strings.Builder
	temp.WriteString(s.pool)
	if len(s.fsTree) > 0 {
		for _, v := range s.fsTree {
			temp.WriteString("/" + v)
		}
	}
	return temp.String()
}

/*
Name returns a string containing the full name of snapshot
*/
func (s Snapshot) Name() string {
	var temp strings.Builder
	temp.WriteString("zfs-auto-snap_")
	temp.WriteString(intervalUIntToString(s.Interval) + "-")
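	// zero-pad every field so the name matches the form zfs-auto-snapshot writes, e.g. 2020-01-05-0817
	fmt.Fprintf(&temp, "%04d-%02d-%02d-%02d%02d", s.TimeStamp.Year(), s.TimeStamp.Month(), s.TimeStamp.Day(), s.TimeStamp.Hour(), s.TimeStamp.Minute())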
	return temp.String()
}

func intervalUIntToString(x uint64) string {
	switch x {
	case 0:
		return "yearly"
	case 1:
		return "monthly"
	case 2:
		return "weekly"
	case 3:
		return "daily"
	case 4:
		return "hourly"
	}
	return "frequent"
}

/*
String returns a string equal to s.Path() + "@" + s.Name() for Snapshot s
*/
func (s Snapshot) String() string {
	return s.Path() + "@" + s.Name()
}

/*
CompareSnapshotDates returns -2 if x occurred before y and would include y in its interval
returns -1 if x occurred before y
returns 0 if x and y are the same snapshot
returns +1 if x occurred after y
err is non-nil if the snapshots do not have the same path
*/
func CompareSnapshotDates(x Snapshot, y Snapshot) (int, error) {
	if x.Path() != y.Path() {
		return 0, errors.New("Can only compare snapshots with the same path")
	}
	if x.Interval == y.Interval {
		if x.TimeStamp.Equal(y.TimeStamp) {
			return 0, nil
		}
		if x.TimeStamp.Before(y.TimeStamp) {
			return -1, nil
		}
		return 1, nil
	}
	if x.Interval < y.Interval { // y is from a more frequent backup interval than x
		var interval time.Time
		switch x.Interval {
		case 0:
			interval = x.TimeStamp.AddDate(-1, 0, 0)
		case 1:
			interval = x.TimeStamp.AddDate(0, -1, 0)
		case 2:
			interval = x.TimeStamp.AddDate(0, 0, -7)
		case 3:
			interval = x.TimeStamp.AddDate(0, 0, -1)
		case 4:
			interval = x.TimeStamp.Add(time.Hour * -1)
		case 5:
			interval = x.TimeStamp.Add(time.Minute * -15)
		}
		if x.TimeStamp.Before(y.TimeStamp) {
			return 1, nil
		}
		if interval.Before(y.TimeStamp) {
			return -2, nil
		}
		return -1, nil
	}
	// y is from a less frequent backup interval than x
	if x.TimeStamp.Before(y.TimeStamp) {
		return -1, nil
	}
	if x.TimeStamp.After(y.TimeStamp) {
		return 1, nil
	}
	return 0, nil
}
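An alternative to converting each date field by hand is to let the time package do the work with a reference layout. The sketch below assumes the timestamp portion of a snapshot name looks like 2020-01-05-0817. The names timestampLayout, parseSnapshotTime, and snapshotName are hypothetical and not part of the tool.

```go
package main

import "time"

// timestampLayout describes the assumed timestamp portion of a snapshot
// name (year-month-day-hourminute) using Go's reference time.
const timestampLayout = "2006-01-02-1504"

// parseSnapshotTime converts a string such as "2020-01-05-0817" into a
// time.Time in UTC, replacing five separate integer conversions.
func parseSnapshotTime(s string) (time.Time, error) {
	return time.Parse(timestampLayout, s)
}

// snapshotName rebuilds the snapshot name from an interval label and a
// timestamp; Format zero-pads every field.
func snapshotName(interval string, ts time.Time) string {
	return "zfs-auto-snap_" + interval + "-" + ts.Format(timestampLayout)
}
```

Using the same layout constant for both directions means a parsed name formats back to the identical string.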

You can get the entire source code for the tool below.