ZFS Backup Tool Part 2

Written by: Robert R. Russell on Wednesday, August 5, 2020.

Recognizing a snapshot made by zfs-auto-snapshot.

First, what does a list of these snapshots look like?

robert@mars:~/src/go/zfs_backup$ zfs list -Hrt snapshot dpool
dpool@zfs-auto-snap_monthly-2020-05-12-1245     96K     -       148K    -
dpool@zfs-auto-snap_monthly-2020-06-11-1248     8K      -       23.3G   -
dpool@zfs-auto-snap_monthly-2020-07-11-1245     0B      -       23.3G   -
dpool@zfs-auto-snap_weekly-2020-07-26-1242      0B      -       30.5G   -
dpool@zfs-auto-snap_daily-2020-07-27-1238       4.74G   -       31.3G   -
dpool@zfs-auto-snap_daily-2020-08-02-1235       0B      -       143G    -
dpool@zfs-auto-snap_weekly-2020-08-02-1240      0B      -       143G    -
dpool@zfs-auto-snap_hourly-2020-08-03-1117      0B      -       143G    -
dpool@zfs-auto-snap_daily-2020-08-03-1236       0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2030    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2045    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2100    0B      -       143G    -
dpool@zfs-auto-snap_frequent-2020-08-04-2115    0B      -       143G    -
dpool@zfs-auto-snap_hourly-2020-08-04-2117      0B      -       143G    -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1717 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1817 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-1917 0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-2017 0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2030       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2045       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2100       0B      -       96K     -
dpool/home@zfs-auto-snap_frequent-2020-08-04-2115       0B      -       96K     -
dpool/home@zfs-auto-snap_hourly-2020-08-04-2117 0B      -       96K     -
dpool/home/robert@zfs-auto-snap_frequent-2020-08-04-2115        0B      -       69.1G   -
dpool/home/robert@zfs-auto-snap_hourly-2020-08-04-2117  0B      -       69.1G   -
dpool/plex@snap1        116G    -       442G    -
dpool/plex@zfs-auto-snap_monthly-2020-05-12-1245        8K      -       344G    -

I trimmed the previous list down a bit. So what regular expression will match these lines? The first question is which regular expression library I am using. I am writing this tool in Go, so I will use Go’s regexp package. It accepts the same syntax as Google’s RE2 library. The syntax for it is here.

I will start with the snapshot names, the part after the @. Those start with zfs-auto-snap, so the literal “zfs-auto-snap” will match that prefix.

The next section is which timer made the snapshot. This section can also be called the increment. The valid timers are yearly, monthly, weekly, daily, hourly, and frequent for a default install of zfs-auto-snapshot. The regex “yearly|monthly|weekly|daily|hourly|frequent” will match these timers. However, I would like to get which timer created the snapshot without further parsing. That is the perfect job for a capturing sub match. After adding the capturing sub match, the regex looks like “(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)”.
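
As a minimal sketch of how that named sub match can be read back, the snippet below looks the group up by name instead of by position. The helper name incrementOf is my own for illustration, and Regexp.SubexpIndex needs Go 1.15 or newer.

```go
package main

import (
	"fmt"
	"regexp"
)

// incrementRegex combines the literal prefix with the named group from the text.
var incrementRegex = regexp.MustCompile(`zfs-auto-snap_(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)`)

// incrementOf (hypothetical helper) returns the timer captured from a
// snapshot name, or an empty string when the name does not match.
func incrementOf(name string) string {
	m := incrementRegex.FindStringSubmatch(name)
	if m == nil {
		return ""
	}
	// Look the group up by name so the code survives reordering of groups.
	return m[incrementRegex.SubexpIndex("increment")]
}

func main() {
	fmt.Println(incrementOf("dpool@zfs-auto-snap_weekly-2020-07-26-1242")) // weekly
	fmt.Printf("%q\n", incrementOf("dpool/plex@snap1"))                     // ""
}
```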

The final section is the timestamp of the snapshot. As with the timer section, it is useful not to have to parse this data a second time. With a capturing sub match for each field, “(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})” will work.

With the snapshot names completed, I need to capture the zfs tree structure before the @ symbol. I haven’t found a reliable regular expression that will capture that tree, but “(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*” will recognize a subset of all valid zfs trees. Avoid using anything it won’t recognize, or you may end up with inaccessible files.
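
To get a feel for which names fall inside that subset, here is a small sketch that anchors the expression with ^ and $ and tries a few names. The helper name recognized and the sample names are made up for illustration. ZFS allows a colon in a dataset name, which this expression does not accept.

```go
package main

import (
	"fmt"
	"regexp"
)

// datasetRegex is the tree expression from the text, anchored so that it
// must match the whole name rather than any part of it.
var datasetRegex = regexp.MustCompile(`^(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*$`)

// recognized (hypothetical helper) reports whether the expression accepts name.
func recognized(name string) bool {
	return datasetRegex.MatchString(name)
}

func main() {
	names := []string{"dpool", "dpool/home/robert", "dpool/my-data.v2", "dpool/my:data", "dpool/"}
	for _, name := range names {
		fmt.Printf("%-20s %v\n", name, recognized(name))
	}
}
```

The first three names are accepted. The name with a colon and the name with a trailing slash are not.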

Including some test code, the tool’s source code looks like this so far.

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

const zfsRegexStart string = "zfs-auto-snap"
const zfsRegexIncrement string = "(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)"
const zfsRegexDateStamp string = "(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})"

var zfsRegex = regexp.MustCompile(zfsRegexStart + "_" + zfsRegexIncrement + "-" + zfsRegexDateStamp)

// testSnapshot reports whether possible contains a zfs-auto-snapshot name
// and, if it does, whether the given timer (increment) created it.
func testSnapshot(possible string, increment string) (bool, bool) {
	matches := zfsRegex.FindStringSubmatch(possible)
	if matches == nil {
		return false, false
	}
	// matches[1] is the "increment" capture group.
	return true, matches[1] == increment
}

func isAYearlySnapshot(possible string) bool {
	_, isYearly := testSnapshot(possible, "yearly")
	return isYearly
}

func isAMonthlySnapshot(possible string) bool {
	_, isMonthly := testSnapshot(possible, "monthly")
	return isMonthly
}

func isAWeeklySnapshot(possible string) bool {
	_, isWeekly := testSnapshot(possible, "weekly")
	return isWeekly
}

func isADailySnapshot(possible string) bool {
	_, isDaily := testSnapshot(possible, "daily")
	return isDaily
}

func isAnHourlySnapshot(possible string) bool {
	_, isHourly := testSnapshot(possible, "hourly")
	return isHourly
}

func isAFrequentSnapshot(possible string) bool {
	_, isFrequent := testSnapshot(possible, "frequent")
	return isFrequent
}

const poolNameRegex string = "(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*"

var snapshotLineRegex = regexp.MustCompile("^" + poolNameRegex + "@" + zfsRegex.String() + ".*$")

func main() {
	//fmt.Println(snapshotLineRegex.MatchString("dpool/www@zfs-auto-snap_frequent-2020-08-04-1830\t0B\t-\t201M\t-"))
	input := bufio.NewScanner(os.Stdin)
	for input.Scan() {
		if snapshotLineRegex.MatchString(input.Text()) {
			fmt.Println(snapshotLineRegex.FindStringSubmatch(input.Text()))
			fmt.Println(snapshotLineRegex.SubexpNames())
		} else {
			fmt.Printf("%s\t%s\n", input.Text(), "Is not a snapshot.")
		}
	}
	if err := input.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading Standard Input:", err)
	}
}
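
The test program prints the sub matches and the group names as two separate slices. One possible way to pair them up is sketched below. The helper name namedGroups is hypothetical and not part of the tool yet.

```go
package main

import (
	"fmt"
	"regexp"
)

// snapRegex is the snapshot-name expression assembled in the listing above.
var snapRegex = regexp.MustCompile(`zfs-auto-snap_(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)-(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})`)

// namedGroups (hypothetical helper) returns a map from group name to the
// captured text, or nil when s does not match.
func namedGroups(re *regexp.Regexp, s string) map[string]string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			groups[name] = m[i]
		}
	}
	return groups
}

func main() {
	g := namedGroups(snapRegex, "dpool@zfs-auto-snap_daily-2020-08-03-1236\t0B\t-\t143G\t-")
	fmt.Println(g["increment"], g["year"], g["month"], g["day"], g["hour"], g["minute"])
	// daily 2020 08 03 12 36
}
```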

I will continue this tomorrow. See you then!

©2020 Robert R. Russell — All rights reserved


ZFS Backup Tool Part 1

Written by: Robert R. Russell on Tuesday, August 4, 2020.

I haven’t seen a lot of tools that are designed to back up ZFS snapshots to removable media. So, I am writing my own. I am going to document the process here.

The basic loop for a backup tool is

  1. Read a list of snapshots on the source.
  2. Read a list of snapshots on the destination.
  3. Find the list of snapshots on the source that are not on the destination. These are the non backed up snapshots.
  4. Find the list of snapshots on the destination that are not on the source. These are the aged out snapshots.
  5. Copy all non backed up snapshots to the destination, preferably one at a time to make recovery from IO failure easier.
  6. Remove the aged out snapshots.
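
Steps 3 and 4 of the loop are the same set difference run in opposite directions. Here is a minimal sketch, assuming the snapshots are compared by the part of the name after the @ because the source and destination pools have different names. The helper name missingFrom and the sample data are made up for illustration.

```go
package main

import "fmt"

// missingFrom (hypothetical helper) returns the entries of a that do not
// appear in b, keeping the order of a.
func missingFrom(a, b []string) []string {
	seen := make(map[string]struct{}, len(b))
	for _, s := range b {
		seen[s] = struct{}{}
	}
	var out []string
	for _, s := range a {
		if _, ok := seen[s]; !ok {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	// Sample data only: snapshot names with the pool prefix stripped.
	source := []string{"zfs-auto-snap_daily-2020-08-02-1235", "zfs-auto-snap_daily-2020-08-03-1236"}
	destination := []string{"zfs-auto-snap_daily-2020-07-27-1238", "zfs-auto-snap_daily-2020-08-02-1235"}

	fmt.Println("not backed up:", missingFrom(source, destination))
	fmt.Println("aged out:", missingFrom(destination, source))
}
```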

I am designing this tool to only back up snapshots taken by zfs-auto-snapshot. These are named <pool|filesystem>@zfs-auto-snap_<time interval>-<year>-<month>-<day>-<hour><minute>. The command zfs list -Hrt snapshot <source poolname> will generate a list of all snapshots in a pool in a machine parseable format.
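
With the -H flag the listing has no header and separates its fields with a single tab, so splitting a line is straightforward. The sketch below shows one possible way to pull the dataset and the snapshot name out of a line. The helper name parseLine is hypothetical, and strings.Cut needs Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLine (hypothetical helper) splits one line of the tab-separated
// listing into the dataset and the snapshot name. ok is false when the
// first column does not contain an @.
func parseLine(line string) (dataset, snapshot string, ok bool) {
	// The first column is the full name, <pool|filesystem>@<snapshot>.
	name, _, _ := strings.Cut(line, "\t")
	return strings.Cut(name, "@")
}

func main() {
	dataset, snapshot, ok := parseLine("dpool/home@zfs-auto-snap_hourly-2020-08-04-2117\t0B\t-\t96K\t-")
	fmt.Println(dataset, snapshot, ok)
	// dpool/home zfs-auto-snap_hourly-2020-08-04-2117 true
}
```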

Issuing the command zfs send -ci <old snapshot> <pool|filesystem>@<new snapshot> will send an incremental snapshot from old to new to the command’s standard output. I can estimate the amount of data to be transferred by replacing -ci with -cpvni in the zfs send command.

Issuing the command zfs receive -u <backup location> will store a snapshot from its standard input to the backup location.

Snapshots are removed by zfs destroy -d <pool|filesystem>@<snapshot name>. The snapshot name is the portion of the snapshot pattern mentioned above after the @ symbol.



Recommended Analyses of Star Wars the Force Awakens

Written by: Robert R. Russell on Sunday, August 2, 2020.

I have watched Mauler’s videos analyzing Star Wars: The Force Awakens. I recommend his detail-oriented and rational critique.

Part 1

Part 2

Part 3



ZFS-auto-snapshot

Written by: Robert R. Russell on Saturday, August 1, 2020.

If you use ZFS and you don’t have an auto snapshot tool installed, you need to install one. This tool will make backups a lot easier.



Forager 001

Written by: Robert R. Russell on Friday, July 31, 2020.



Today's Video Is Delayed

Written by: Robert R. Russell on Friday, July 31, 2020.

I am planning on a video for today’s post. It is processing right now, and that will take some time. Hopefully, it will be prepared and uploaded before 17:00 tonight.



No Post Today

Written by: Robert R. Russell on Thursday, July 30, 2020.

There will not be a new post today. I have been running into several problems trying to get a video done for the Friday post. The Saturday post may be a rant about getting OBS to record in a better intermediate format than x264.



NPM Encourages Abandonware

Written by: Robert R. Russell on Wednesday, July 29, 2020.

I am using Gulp.js, or just Gulp, to automate the compilation of CSS stylesheets for the upcoming custom WordPress theme for this blog. In the process of getting that automation set up, I have concluded that NPM’s extremely lax requirements for adding a package to their servers have resulted in an explosion of abandonware.

One of Gulp’s useful advantages over a more traditional solution like make is its choice to pass virtual files around between stages of the processing chain. Parts of the Gulp chain can modify the contents of a file and pass on those modifications without writing them to disk and creating dozens or more temporary files that require exclusion from git and other tools. The most constructive use of this ability I have found so far is a tool that can replace strings in files based on variables I set up.

NPM manages Gulp’s dependencies and plugins. Since creating a new public NPM package is pretty easy, sharing a plugin you wrote doesn’t take any time. That all sounds great until you end up trying to use a plugin and find that no one has updated it for one or two major versions of Gulp. Worse yet is the situation where some dependency is several versions behind, and either is a security vulnerability itself or requires another dependency that is.

I don’t have time to maintain a public NPM package. I may fix one or two outdated plugins, but I am probably not going to share those fixes on NPM.



Current Opinions on Web Design

Written by: Robert R. Russell on Tuesday, July 28, 2020.

I have been designing websites as a side gig for about a year now. Most of that design work has been CSS modifications to existing themes. Since January, I have needed to do more extensive design changes. That work culminated in a scratch-made WordPress theme designed for people using WordPress as a CMS, not a blog platform.

During that time, I have begun preferring Gutenberg’s design philosophy over Elementor’s. Gutenberg does a better job of separating content and content-specific layout from the general theme layout than Elementor does. Gutenberg also seems less opinionated about its block styling than Elementor.

The downside to flexibility is a lack of capability to micromanage the layout. I don’t see the appeal of complicated website designs that demand pixel alignment from individual paragraphs or, worse, letters. I tend towards a utility-first approach. That utility-first approach doesn’t mean that I do not appreciate any artistry. I cannot entirely agree with form over function.

I like the mobile-first approach to design. However, I do get frustrated at the limitations of mobile devices because they can complicate the implementation of proper form and function. A navigable mobile interface for tabular data is one example.



Programming Languages and Nonoptimum Tool Use

Written by: Robert R. Russell on Monday, July 27, 2020.

Some backstory

In my post about OpenWRT on x86 hardware, I mentioned that I considered balenaEtcher’s size to be a negative trait. I also complained in Yarn versus NPM about the dependency management for Node.js projects.

Both complaints stem from a similar problem: using a tool without respect for its constraints.

Constraints?

All tools have constraints. For physical tools like a hammer or saw, they have to do with how the tool’s materials, size, and shape affect whether it scratches a surface on impact or how smooth the cut is. Digital tools like software also have constraints. C, for example, treats nearly all data structures as merely a location in memory. This provides excellent flexibility at the cost of making it easy to create memory leaks and buffer overruns.

The creators of a programming language have both a problem domain and several other constraints to juggle when creating the language. The original problem domain especially restricts the general utility of the language.

What does that have to do with the examples in the backstory?

All of those programs use the JavaScript programming language. Netscape originally designed JavaScript to make small dynamic changes to a browser’s DOM. Combined with a preference for fast iteration, JavaScript ended up without a robust type system and with little attention to modularity.

You can see the attachment to a browser’s DOM in Electron, the programming environment that balenaEtcher uses. It takes two hundred and four megabytes, installed, for a program whose task is to order a dozen or so command-line tools around.

The failure to update transitive dependencies that I ran into using Yarn is a side effect of how Yarn deals with JavaScript’s lack of module separation. The original program I wanted to use didn’t help matters by choosing broad semantic version ranges for its dependencies.

Conclusion

Like physical tools, software, including programming languages, has constraints. Strive to be a polyglot so you can use the appropriate tool for the job.
