Written by: Robert R. Russell on Wednesday, August 5, 2020.
First, what does a list of these snapshots look like?
[email protected]:~/src/go/zfs_backup$ zfs list -Hrt snapshot dpool
dpool@zfs-auto-snap_monthly-2020-05-12-1245 96K - 148K -
dpool@zfs-auto-snap_monthly-2020-06-11-1248 8K - 23.3G -
dpool@zfs-auto-snap_monthly-2020-07-11-1245 0B - 23.3G -
dpool@zfs-auto-snap_weekly-2020-07-26-1242 0B - 30.5G -
dpool@zfs-auto-snap_daily-2020-07-27-1238 4.74G - 31.3G -
dpool@zfs-auto-snap_daily-2020-08-02-1235 0B - 143G -
dpool@zfs-auto-snap_weekly-2020-08-02-1240 0B - 143G -
dpool@zfs-auto-snap_hourly-2020-08-03-1117 0B - 143G -
dpool@zfs-auto-snap_daily-2020-08-03-1236 0B - 143G -
dpool@zfs-auto-snap_frequent-2020-08-04-2030 0B - 143G -
dpool@zfs-auto-snap_frequent-2020-08-04-2045 0B - 143G -
dpool@zfs-auto-snap_frequent-2020-08-04-2100 0B - 143G -
dpool@zfs-auto-snap_frequent-2020-08-04-2115 0B - 143G -
dpool@zfs-auto-snap_hourly-2020-08-04-2117 0B - 143G -
dpool/[email protected]_hourly-2020-08-04-1717 0B - 96K -
dpool/[email protected]_hourly-2020-08-04-1817 0B - 96K -
dpool/[email protected]_hourly-2020-08-04-1917 0B - 96K -
dpool/[email protected]_hourly-2020-08-04-2017 0B - 96K -
dpool/[email protected]_frequent-2020-08-04-2030 0B - 96K -
dpool/[email protected]_frequent-2020-08-04-2045 0B - 96K -
dpool/[email protected]_frequent-2020-08-04-2100 0B - 96K -
dpool/[email protected]_frequent-2020-08-04-2115 0B - 96K -
dpool/[email protected]_hourly-2020-08-04-2117 0B - 96K -
dpool/home/[email protected]_frequent-2020-08-04-2115 0B - 69.1G -
dpool/home/[email protected]_hourly-2020-08-04-2117 0B - 69.1G -
dpool/[email protected] 116G - 442G -
dpool/[email protected]_monthly-2020-05-12-1245 8K - 344G -
I trimmed the previous list down a bit. So what regular expression will match this? The first question is which regular expression library I am using. I am writing this tool in Go, so I will use Go’s regexp package. It accepts the same syntax as Google’s RE2 library, and that syntax is documented with Go’s regexp/syntax package.
I will start with the snapshot names, the part after the @. Those start with
zfs-auto-snap, so “zfs-auto-snap” will match that prefix.
The next section is which timer made the snapshot. This section can also be
called the increment. The valid timers are yearly, monthly, weekly, daily,
hourly, and frequent for a default install of zfs-auto-snapshot. The regex
“yearly|monthly|weekly|daily|hourly|frequent” will match these timers. However,
I would like to get which timer created the snapshot without further parsing.
That is the perfect job for a capturing submatch. After adding a named capturing
submatch, the regex looks like
“(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)”.
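A minimal sketch of reading the named submatch back out, using the increment regex on its own. The helper name incrementOf and the sample inputs are made up for illustration and are not part of the tool; SubexpIndex requires Go 1.15 or later.

```go
package main

import (
	"fmt"
	"regexp"
)

// incrementRegex is the increment pattern from the text, compiled on its own.
var incrementRegex = regexp.MustCompile(`(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)`)

// incrementOf is a hypothetical helper. It returns the timer name captured
// by the "increment" group, or "" when no known timer appears in name.
func incrementOf(name string) string {
	matches := incrementRegex.FindStringSubmatch(name)
	if matches == nil {
		return ""
	}
	// Look the group up by name instead of hard-coding its position.
	return matches[incrementRegex.SubexpIndex("increment")]
}

func main() {
	fmt.Println(incrementOf("zfs-auto-snap_weekly-2020-08-02-1240")) // weekly
	fmt.Printf("%q\n", incrementOf("manual-backup"))                 // ""
}
```

Looking the group up by name keeps the code correct if more groups are later added in front of it.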
The final section is the timestamp of the snapshot. As with the timer section,
it is useful not to have to parse this data a second time. One named submatch
per field will work:
“(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})”.
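As a rough sketch of why the per-field submatches are handy, the captured fields can be turned straight into a time.Time. The helper name snapshotTime is hypothetical, and treating the stamp as UTC is an assumption about how the snapshots were labeled on a given system. SubexpIndex requires Go 1.15 or later.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// stampRegex is the timestamp pattern from the text, compiled on its own.
var stampRegex = regexp.MustCompile(`(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})`)

// snapshotTime is a hypothetical helper. It builds a time.Time from the
// named submatches of the first timestamp found in name.
func snapshotTime(name string) (time.Time, error) {
	m := stampRegex.FindStringSubmatch(name)
	if m == nil {
		return time.Time{}, fmt.Errorf("no timestamp in %q", name)
	}
	field := func(group string) int {
		// Atoi cannot fail here: the regex only admits ASCII digits.
		n, _ := strconv.Atoi(m[stampRegex.SubexpIndex(group)])
		return n
	}
	// Assumption: the stamp is in UTC. Note that time.Date normalizes
	// out-of-range values (month 13 rolls over) instead of rejecting them.
	return time.Date(field("year"), time.Month(field("month")), field("day"),
		field("hour"), field("minute"), 0, 0, time.UTC), nil
}

func main() {
	t, err := snapshotTime("zfs-auto-snap_hourly-2020-08-04-2117")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(t.Format("2006-01-02 15:04"))
}
```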
With the snapshot names completed, I need to match the zfs tree structure
before the @ symbol. I haven’t found a reliable regular expression that will
match every such tree, but “(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*” will
recognize a subset of all valid zfs trees. Avoid using anything it won’t
recognize, or you may end up with inaccessible files.
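To see which names fall inside that subset, the pattern can be anchored and tried against a few candidates. This is only a sketch: the helper name recognized and the dataset names are invented for illustration. The colon case is there because ZFS allows colons in dataset names, while the character class above does not include them.

```go
package main

import (
	"fmt"
	"regexp"
)

// datasetRegex anchors the dataset pattern from the text so that it has to
// match the whole name rather than just a substring of it.
var datasetRegex = regexp.MustCompile(`^(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*$`)

// recognized is a hypothetical helper that reports whether the pattern
// accepts the complete dataset name.
func recognized(dataset string) bool {
	return datasetRegex.MatchString(dataset)
}

func main() {
	// All three names are made up for this example.
	for _, name := range []string{"dpool", "dpool/home/user", "dpool/vm:disk0"} {
		fmt.Printf("%-16s %v\n", name, recognized(name))
	}
}
```

The first two names are recognized. The third is not, because of the colon, so a dataset named like that would be reported as not being a snapshot line.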
Including some test code, the tool’s source code looks like this so far.
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

const zfsRegexStart string = "zfs-auto-snap"
const zfsRegexIncrement string = "(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)"
const zfsRegexDateStamp string = "(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})"

var zfsRegex = regexp.MustCompile(zfsRegexStart + "_" + zfsRegexIncrement + "-" + zfsRegexDateStamp)

// testSnapshot reports whether possible is a zfs-auto-snapshot name and
// whether it was created by the given increment (timer).
func testSnapshot(possible string, increment string) (bool, bool) {
	var matches = zfsRegex.FindStringSubmatch(possible)
	if matches == nil {
		return false, false
	}
	var isASnapshot = true
	// matches[1] is the "increment" submatch, the first group in zfsRegex.
	if matches[1] == increment {
		return isASnapshot, true
	}
	return isASnapshot, false
}

func isAYearlySnapshot(possible string) bool {
	_, isYearly := testSnapshot(possible, "yearly")
	return isYearly
}

func isAMonthlySnapshot(possible string) bool {
	_, isMonthly := testSnapshot(possible, "monthly")
	return isMonthly
}

func isAWeeklySnapshot(possible string) bool {
	_, isWeekly := testSnapshot(possible, "weekly")
	return isWeekly
}

func isADailySnapshot(possible string) bool {
	_, isDaily := testSnapshot(possible, "daily")
	return isDaily
}

func isAnHourlySnapshot(possible string) bool {
	_, isHourly := testSnapshot(possible, "hourly")
	return isHourly
}

func isAFrequentSnapshot(possible string) bool {
	_, isFrequent := testSnapshot(possible, "frequent")
	return isFrequent
}

const poolNameRegex string = "(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*"

var snapshotLineRegex = regexp.MustCompile("^" + poolNameRegex + "@" + zfsRegex.String() + ".*$")

func main() {
	//fmt.Println(snapshotLineRegex.MatchString("dpool/ww[email protected]_frequent-2020-08-04-1830\t0B\t-\t201M\t-"))
	input := bufio.NewScanner(os.Stdin)
	for input.Scan() {
		if snapshotLineRegex.MatchString(input.Text()) {
			fmt.Println(snapshotLineRegex.FindStringSubmatch(input.Text()))
			fmt.Println(snapshotLineRegex.SubexpNames())
		} else {
			fmt.Printf("%s\t%s\n", input.Text(), "Is not a snapshot.")
		}
	}
	if err := input.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading Standard Input:", err)
	}
}
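The test code above prints the submatches and their names as two separate slices. One possible way to pair them up is sketched below. It assumes the same line regex as the listing, and the helper name namedGroups and the sample line (dataset tank/data) are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineRegex is assembled from the same pieces as snapshotLineRegex in the
// listing above.
var lineRegex = regexp.MustCompile(
	"^" + "(?:[[:word:]-.]+)+(?:/?[[:word:]-.]+)*" + "@" +
		"zfs-auto-snap" + "_" +
		"(?P<increment>yearly|monthly|weekly|daily|hourly|frequent)" + "-" +
		"(?P<year>[[:digit:]]{4})-(?P<month>[[:digit:]]{2})-(?P<day>[[:digit:]]{2})-" +
		"(?P<hour>[[:digit:]]{2})(?P<minute>[[:digit:]]{2})" + ".*$")

// namedGroups is a hypothetical helper. It returns the named submatches of
// s as a map keyed by group name, or nil when s does not match.
func namedGroups(re *regexp.Regexp, s string) map[string]string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		// Index 0 (the whole match) and unnamed groups have an empty name.
		if name != "" {
			groups[name] = m[i]
		}
	}
	return groups
}

func main() {
	// A made-up line in the tab-separated shape that zfs list -H prints.
	line := "tank/data@zfs-auto-snap_daily-2020-08-03-1236\t0B\t-\t143G\t-"
	fmt.Println(namedGroups(lineRegex, line))
}
```

With a map, later code can ask for groups["increment"] or groups["year"] without caring about the order of the groups in the pattern.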
I will continue this tomorrow. See you then!
©2020 Robert R. Russell — All rights reserved