ZFS Backup Tool Part 6

Now that I can read and write a snapshot, how do I process a list of snapshots in a useful manner? First, let me define what I mean by a useful manner. I want the tool to keep a copy of all automatic snapshot on the source ZFS tree on the destination tree as an automatic snapshot is aged off of the source it needs to be aged off of the destination as well. It will transfer snapshots one at a time instead of transferring all of the intermediate snapshots at the same time; the ZFS send ‘-i’ option versus the ‘-I’ option.

The best data structure for this is a tree or graph. The tree starts with a list of yearly snapshots. Every snapshot has two slices of children—one for the child frequency snapshots older than it. The younger slice will be populated only if the current snapshot is the youngest child at its frequency strata. A picture demonstrating my idea follows this paragraph. I will delve into implementation details in the next part of the ZFS Backup Tool series.

A diagram of a snapshot tree

ZFS Backup Tool Part 5

Now that I can read a list of snapshots, I need to read a snapshot and transfer it to the destination. The three functions that allow me to do that are exec.StdinPipe(), exec.StdoutPipe(), and io.CopyBuffer().

The process consists of the following steps:

  1. Create an exec.Cmd representing the zfs send command
  2. Use exec.StdoutPipe() to connect a pipe to the output of the command created in step 1.
  3. Create an exec.Cmd representing the zfs receive command
  4. Use exec.StdinPipe() to connect a pipe to the input of the command created in step 3.
  5. Start both commands
  6. Use io.CopyBuffer() to read from the snapshot to the receiver.

You can view the code here.

ZFS Backup Tool Part 4

Welcome to Part 4 of my series on my tool for backing up ZFS Snapshots to an external device. In this part, I am discussing how to exec a command and read its output.

To deal with external commands in Go, you use the os/exec package. The primary pieces of the package that I need for now are exec.Command() and CombinedOutput(). exec.Command() sets up the Command structure with the command and any arguments that I am passing to it.

var listCommand = exec.Command("zfs", "list", "-Hrt", "snapshot", "dpool")

That code creates a variable called listCommand, which is ready to run the command zfs with the arguments list, -Hrt, and snapshot as individual arguments.

var snapList, err = listCommand.CombinedOutput()

That line of code runs the command I previously prepared, puts both its Standard Output and Standard Error in a slice of bytes. If the command exited with an error code other than 0, CombinedOutput sets err to a non-nil value. snapList will have the Standard Error of the executed command, so printing snapList’s contents will be useful for debugging.

var snapScanner = bufio.NewScanner(bytes.NewReader(snapList))
	if err != nil {
		fmt.Println("Error trying to list snapshots:", err.Error())
		for snapScanner.Scan() {

I will need to use the more complicated IO redirection tools provided in the os/exec package for the zfs send and zfs receive commands. However, for a test run today, I can use a modification of the loop I used to print the output from zfs if it errored.

for snapScanner.Scan() {
		if snapshotLineRegex.MatchString(snapScanner.Text()) {
			var temp = strings.SplitN(snapScanner.Text(), "\t", 2)
			var snapshot = ParseSnapshot(temp[0])
			if snapshot != nil {
				fmt.Println("I found snapshot", snapshot.Name(), "at", snapshot.Path())

ZFS Backup Tool Part 3

Today’s project is parsing a snapshot into a custom datatype that gives us more accessible options to manipulate snapshots. First, the regular expression strings need to be moved into separate files so I can reference them across other files.

The essential parts of a snapshot are:

  • the pool name
  • the filesystem tree
  • the Interval
  • the TimeStamp

To parse a snapshot out of a string.

  • Confirm that the string matches our regular expression for snapshots and only contains the regular expression for a snapshot.
  • If it does not return an error otherwise continue
  • Split the input string into the path and the snapshot name
  • Parse the snapshot name into the interval and timestamp fields
  • Split the path into the pool name and any filesystem tree portions.

Below is a function that implements the listed requirements.

ParseSnapshot parses a string into a Snapshot.

It returns nil on error.
func ParseSnapshot(input string) *Snapshot {
	var snapshotOnly, err = regexp.Compile("^" + PoolNameRegex + "@" + ZfsSnapshotNameRegex + "$")
	if err != nil {
		return nil
	if !snapshotOnly.MatchString(input) {
		return nil
	var snapshotPieces []string = snapshotOnly.FindStringSubmatch(input)
	var theSnapshot = Snapshot{}
	theSnapshot.Interval = intervalStringToUInt(snapshotPieces[1])
	var year, month, day, hour, minute int
	year, err = strconv.Atoi(snapshotPieces[2])
	if err != nil {
		return nil
	month, err = strconv.Atoi(snapshotPieces[3])
	if err != nil {
		return nil
	day, err = strconv.Atoi(snapshotPieces[4])
	if err != nil {
		return nil
	hour, err = strconv.Atoi(snapshotPieces[5])
	if err != nil {
		return nil
	minute, err = strconv.Atoi(snapshotPieces[6])
	if err != nil {
		return nil
	theSnapshot.TimeStamp = time.Date(year, time.Month(month), day, hour, minute, 0, 0, time.UTC)
	var splitInput []string = strings.Split(input, "@")
	if len(splitInput) != 2 {
		return nil
	var paths []string = strings.Split(splitInput[0], "/")
	theSnapshot.pool = paths[0]
	if len(paths) > 1 {
		copy(theSnapshot.fsTree, paths[1:])
	return &theSnapshot

func intervalStringToUInt(input string) uint64 {
	switch input {
	case "yearly":
		return 0
	case "monthly":
		return 1
	case "weekly":
		return 2
	case "daily":
		return 3
	case "hourly":
		return 4
	return 5

Now that I can create a Snapshot structure I need some utility methods for them.

  • Read the pool snapshot and file system tree as a single string.
  • Compare two snapshots by date and interval
  • Get the snapshot name
  • Get the full snapshot string. <path>@<snapshot name>

The following code will implement those utility methods.

Path returns a string containing the path of the snapshot
func (s Snapshot) Path() string {
	var temp strings.Builder
	if len(s.fsTree) > 0 {
		for _, v := range s.fsTree {
			temp.WriteString("/" + v)
	return temp.String()

Name returns a string containing the full name of snapshot
func (s Snapshot) Name() string {
	var temp strings.Builder
	temp.WriteString(intervalUIntToString(s.Interval) + "-")
	fmt.Fprintf(&temp, "%d-%d-%d-%d%d", s.TimeStamp.Year(), s.TimeStamp.Month(), s.TimeStamp.Day(), s.TimeStamp.Hour(), s.TimeStamp.Minute())
	return temp.String()

func intervalUIntToString(x uint64) string {
	switch x {
	case 0:
		return "yearly"
	case 1:
		return "monthly"
	case 2:
		return "weekly"
	case 3:
		return "daily"
	case 4:
		return "hourly"
	return "frequent"

String returns a string equal to s.Path() + "@" + s.Name() for Snapshot s
func (s Snapshot) String() string {
	return s.Path() + "@" + s.Name()

CompareSnapshotDates returns -2 if x occured before y and would include y in its interval
returns -1 if x occured before y
returns 0 if x and y are the same snapshot
returns +1 if y occured after x
err is non nill if the snapshots do not have the same path
func CompareSnapshotDates(x Snapshot, y Snapshot) (int, error) {
	if x.Path() != y.Path() {
		return 0, errors.New("Can only compare snapshots with the same path")
	if x.Interval == y.Interval {
		if x.TimeStamp.Equal(y.TimeStamp) {
			return 0, nil
		if x.TimeStamp.Before(y.TimeStamp) {
			return -1, nil
		return 1, nil
	if x.Interval < y.Interval { // y is from a more frequent backup interval than x
		var interval time.Time
		switch x.Interval {
		case 0:
			interval = x.TimeStamp.AddDate(-1, 0, 0)
		case 1:
			interval = x.TimeStamp.AddDate(0, -1, 0)
		case 2:
			interval = x.TimeStamp.AddDate(0, 0, -7)
		case 3:
			interval = x.TimeStamp.AddDate(0, 0, -1)
		case 4:
			interval = x.TimeStamp.Add(time.Hour * -1)
		case 5:
			interval = x.TimeStamp.Add(time.Minute * -15)
		if x.TimeStamp.Before(y.TimeStamp) {
			return 1, nil
		if interval.Before(y.TimeStamp) {
			return -2, nil
		return -1, nil
	// y is from a less frequent backup interval than x
	if x.TimeStamp.Before(y.TimeStamp) {
		return -1, nil
	if x.TimeStamp.After(y.TimeStamp) {
		return 1, nil
	return 0, nil

You can get the entire source code for the tool below.

ZFS Backup Tool Part 2

Recognizing a snapshot made by zfs-auto-snapshot.

First, what does a list of these snapshots look like?

robert@mars:~/src/go/zfs_backup$ zfs list -Hrt snapshot dpool
dpool@zfs-auto-snap_monthly-2020-05-12-1245     96K     -       148K    -
dpool@zfs-auto-snap_monthly-2020-06-11-1248     8K      -       23.3G   -
dpool@zfs-auto-snap_monthly-2020-07-11-1245     0B      -       23.3G   -
dpool@zfs-auto-snap_weekly-2020-07-26-1242      0B      -       30.5G   -
dpool@zfs-auto-snap_daily-2020-07-27-1238       4.74G   -       31.3G   -
dpool@zfs-auto-snap_daily-2020-08-02-1235       0B      -       143G    -
dpool@zfs-auto-snap_weekly-2020-08-02-1240      0B      -       143G    -
dpool@zfs-auto-snap_hourly-2020-08-03-1117      0B      -       143G    -
dpool@zfs-auto-snap_daily-2020-08-03-1236       0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2030    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2045    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2100    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2115    0B      -       143G    -
dpool@zfs-auto-snap_hourly-2020-08-04-2117      0B      -       143G    -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1717 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1817 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1917 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-2017 0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2030       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2045       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2100       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2115       0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-2117 0B      -       96K     -
dpool/home/robert@zfs-auto-snap_frequent-2020-08-04-2115        0B      -       69.1G   -
dpool/home/robert@zfs-auto-snap_hourly-2020-08-04-2117  0B      -       69.1G   -
dpool/plex@snap1        116G    -       442G    -
dpool/plex@zfs-auto-snap_monthly-2020-05-12-1245        8K      -       344G    -

I trimmed the previous list down a bit. So what is a regular expression that will match this? The first question is which regular expression library am I using? I am writing this tool in Go. Thus I will use the regexp Go package. Go’s regexp package is based on Google’s RE2 library. The syntax for it is here.

I will start with the snapshot names. The part after the @. Those start with zfs-auto-snap so "zfs-auto-snap" will match it.

The next section is which timer made the snapshot. This section can also be called the increment. The valid timers are yearly, monthly, weekly, daily, hourly, and frequent for a default install of zfs-auto-snapshot. The regex "yearly|monthly|weekly|daily|hourly|frequent" will match these timers. However, I would like to get which timer created the snapshot without further parsing. That is the perfect job for a capturing sub match. After adding the capturing sub match, the regex looks like "(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)".

The final section is the timestamp of the snapshot. Like with the timer section, it is useful not to have to parse this data a second time. With the sub matches "(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})" will work.

With the snapshot names completed, I need to capture the zfs tree structure before the @ symbol. I haven’t found a reliable regular expression that will capture that tree but "(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*" will recognize a subset of all valid zfs trees. Avoid using anything it won’t recognize, or you may end up with inaccessible files.

Including some test code the tool’s source code looks like this so far.

package main

import (

const zfsRegexStart string = "zfs-auto-snap"
const zfsRegexIncrement string = "(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)"
const zfsRegexDateStamp string = "(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})"

var zfsRegex = regexp.MustCompile(zfsRegexStart + "_" + zfsRegexIncrement + "-" + zfsRegexDateStamp)

func testSnapshot(possible string, increment string) (bool, bool) {
	var matches = zfsRegex.FindStringSubmatch(possible)
	if matches == nil {
		return false, false
	var isASnapshot = true
	if matches[1] == increment {
		return isASnapshot, true
	return isASnapshot, false

func isAYearlySnapshot(possible string) bool {
	_, isYearly := testSnapshot(possible, "yearly")
	return isYearly

func isAMonthlySnapshot(possible string) bool {
	_, isMonthly := testSnapshot(possible, "monthly")
	return isMonthly

func isAWeeklySnapshot(possible string) bool {
	_, isWeekly := testSnapshot(possible, "weekly")
	return isWeekly

func isADailySnapshot(possible string) bool {
	_, isDaily := testSnapshot(possible, "daily")
	return isDaily

func isAnHourlySnapshot(possible string) bool {
	_, isHourly := testSnapshot(possible, "hourly")
	return isHourly

func isAFrequentSnapshot(possible string) bool {
	_, isFrequent := testSnapshot(possible, "frequent")
	return isFrequent

const poolNameRegex string = "(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*"

var snapshotLineRegex = regexp.MustCompile("^" + poolNameRegex + "@" + zfsRegex.String() + ".*$")

func main() {
	input := bufio.NewScanner(os.Stdin)
	for input.Scan() {
		if snapshotLineRegex.MatchString(input.Text()) {
		} else {
			fmt.Printf("%s\t%s\n", input.Text(), "Is not a snapshot.")
	if err := input.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading Standard Input:", err)

I will continue this tomorrow. See you then!

ZFS Backup Tool Part 1

I haven’t seen a lot of tools that are designed to backup ZFS snapshots to removable media. So, I am writing my own. I am going to document the process here.

The basic loop for a backup tool is

  1. Read a list of snapshots on the source
  2. Read a list of snapshots on the destination
  3. Find the list of snapshots on the source that are not on the destination. These are the non backed up snapshots.
  4. Find the list of snapshots on the destination that are not on the source. These are the aged out snapshots.
  5. Copy all non backed up snapshots to the destination, preferably one at a time to make recovery from IO failure easier.
  6. Remove the aged out snapshots.

I am designing this tool to only backup snapshots taken by zfs-auto-snapshot. These are named <pool|filesystem>@zfs-auto-snap_<time interval>-<year>-<month>-<day>-<hour><minute>. The command zfs -Hrt snapshot <source poolname> will generate a list of all snapshots in a pool in a machine parseable format.

Issuing the command zfs send -ci <old snapshot> <pool|filesystem>@<new snapshot> will send an incremental snapshot from old to new to the commands standard output. I can estimate the amount of data to be transferred by replacing -ci with -cpvni in the zfs send command.

Issuing the command zfs receive -u <backup location> will store a snapshot from its standard input to the backup location.

Snapshots are removed by zfs destroy -d <pool|filesystem>@<snapshot name>. The snapshot name is the portion of the snapshot pattern mentioned above after the @ symbol.