Massive rewrite incl error handling and formatting

main
capntack 1 year ago
parent 78ac8a36a7
commit 6caa8390fc

@ -4,7 +4,7 @@
<br> <br>
> **Disclaimer:** As with anything to do with your data, you should read and understand what this script does before applying it. I am not responsible for any mishaps. Even if you only replace the variables and run it as I do, me saying "it works on my machine" should not be sufficient. It wouldn't be for me. > **Disclaimer:** As with anything to do with your data, you should read and understand what this script does before applying it. I am not responsible for any mishaps. This script was written for my own personal use case, and I am sharing it in hopes it will help someone craft their own solution. Even if you only replace the variables and run it as I do, me saying "it works on my machine" should not be sufficient. It wouldn't be for me.
<br> <br>
@ -19,60 +19,79 @@ This script assumes you are running Linux, and have at least basic working knowl
1. Install rsync, restic, and dependencies for the script: 1. Install rsync, restic, and dependencies for the script:
```bash ```bash
apt install rsync restic moreutils # moreutils installs the `ts` command for timestamping the logs apt install rsync restic moreutils
``` ```
Moreutils installs the `ts` command for timestamping the logs
<br> <br>
2. Create the directory where rsync will backup to: 2. Ensure restic is up to date, in case the version from your repos is behind:
```bash ```bash
mkdir /path/to/dir/to/backup/to restic self-update
``` ```
<br> <br>
3. Copy the rsync manifest template to the script's root directory, rename it as you like, and then fill it out. This will allow the `--include-from` option to only backup what you want. There is some comments in the template, but the gist of it is that the file is read in order. The initial include, `+ */` includes the `$RSYNC_SOURCE` variable from the script all directories within, recursively. The following lines are where you specify the directories and files you explicitely want to backup. The final line, `- *` excludes everything that wasn't explicitely included prior. This allows you to choose a higher directory, say $HOME, but pick and choose what you want within it instead of rsyncing the whole thing. The script also includes the `--prune-empty-dirs` option, which will prevent it from syncing all the empty directory folders within the directoris along the path to what you actually want at the end of it. 3. Initialize the restic "repo" (what restic calls the backup destination):
```bash
restic init --repo /path/to/repo
```
<br> <br>
4. Ensure restic is up to date, in case the version from your repos are behind: Create your repo password when prompted. Do not lose this, as you will otherwise be unable to access your backups. I would suggest a password manager. And possibly a physical copy stored in a very safe place.
<br>
4. Verify your repo is at least version 2, in order to support compression:
```bash ```bash
restic self-update restic -r path/to/repo cat config
``` ```
If it isn't, you may need to revisit step 2 and figure out why your install isn't up to date. Then recreate the repo (you can just delete the faulty repo directory to get rid of it).
<br> <br>
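If you would rather script the version check than eyeball the JSON, something like the following might work. This is only a sketch: `repo_version` is a helper name I made up, and `sample_config` is an illustrative stand-in for what `restic cat config` prints (real IDs and polynomials will differ, and the exact layout may vary between restic versions).

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical helper: reads restic's config JSON on stdin and prints the "version" number
repo_version() {
    grep -oE '"version": *[0-9]+' | grep -oE '[0-9]+$'
}

# Illustrative sample only, standing in for the output of `restic -r /path/to/repo cat config`
sample_config='{
  "version": 2,
  "id": "0000000000000000000000000000000000000000000000000000000000000000",
  "chunker_polynomial": "00000000000000"
}'

version="$(printf '%s\n' "${sample_config}" | repo_version)"
if [ "${version}" -ge 2 ]; then
    echo "Repo version ${version}: compression is supported"
else
    echo "Repo version ${version}: compression is not supported, see the step above"
fi
```

Against a real repo you would pipe the command itself into the helper: `restic -r /path/to/repo cat config | repo_version`.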
5. Initialize the restic "repo" (what restic calls the backup destination): 5. Create the directory where rsync will backup to:
```bash ```bash
restic init --repo /path/to/repo mkdir -p /path/to/dir/to/backup/to
``` ```
<br> <br>
Create your repo password when prompted. Do not lose this, as you will otherwise be unable to access your backups. I would suggest a password manager. And possibly a physical copy stored in a very safe place. 6. cd to the directory where you want to store the script and clone the repo:
```bash
git clone https://tacksupport.net/git/Tack-Support/Rsync-and-Restic-Backup-Scripts.git
```
<br> <br>
6. Verify your repo is at least version 2, in order to support compression: 7. Run the setup script:
```bash ```bash
restic -p $REPO_PASSWORD -r path/to/repo cat config cd Rsync-and-Restic-Backup-Scripts && sudo chmod +x setup.sh && sudo ./setup.sh
``` ```
If it isn't, you may need to revisit step 4 and figure out why your install isn't up to date. Then recreate the repo (you can just delete the faulty repo directory to get rid of it). > Or, alternatively, read setup.sh and manually perform the steps.
<br> <br>
7. Run the first restic backup. This will take a while, depending on how much data you have. 250 GB took me about an hour and a half. Edit, remove, or add to the tags as desired. Tags can be shared between repos in various combinations. They can be used to search for, query, and prune your backups from their various sources. The `--exclude-caches` option will exclude directories containing the `CACHEDIR.TAG` file. Which isn't all caches, but it's a happy medium between not excluding any, and having to script/search them all out. Pay attention to lack of trailing slashes. 8. Configure the `rsyncManifest`. The `--include-from` option in the script reads this file to back up only what you want. There are some comments in the manifest, but the gist is that the file is read in order. The initial include, `+ */`, includes the directory set in the script's `$RSYNC_SOURCE` variable and all directories within it, recursively. The following lines are where you specify the directories and files you explicitly want to back up. The final line, `- *`, excludes everything that wasn't explicitly included prior. This allows you to choose a higher directory, say `$HOME`, but pick and choose what you want within it instead of rsyncing the entire directory. The script also includes the `--prune-empty-dirs` option, which prevents it from syncing all the empty directories along the path to what you actually want.
<br>
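To see how the manifest ordering plays out, here is a small sketch you can run in a temp directory before pointing rsync at real data. Everything in it is made up for illustration: the directory names, the file names, and the manifest lines are my assumptions, not the contents of the shipped `rsyncManifest`.

```bash
#!/bin/bash
set -euo pipefail

# Skip the demo quietly if rsync is not available
command -v rsync >/dev/null || { echo "rsync is not installed; skipping demo"; exit 0; }

work="$(mktemp -d)"
mkdir -p "${work}/src/Documents" "${work}/src/Downloads" "${work}/src/.config/myapp"
echo "keep" > "${work}/src/Documents/notes.txt"
echo "keep" > "${work}/src/.config/myapp/settings.conf"
echo "skip" > "${work}/src/Downloads/junk.bin"

# Hypothetical manifest: rules are read top to bottom, first match wins
cat > "${work}/manifest" <<'EOF'
+ */
+ /Documents/***
+ /.config/myapp/settings.conf
- *
EOF

# Same filter options the script uses, minus --delete and --link-dest
rsync -a --prune-empty-dirs --include-from="${work}/manifest" \
    "${work}/src/" "${work}/dest"

# List what actually made it to the destination
(cd "${work}/dest" && find . -type f | sort)
```

Only `Documents/notes.txt` and `.config/myapp/settings.conf` should be copied. `Downloads` matches `+ */` as a directory, but nothing inside it is included, so `--prune-empty-dirs` drops it.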
> Note: if you ever run this command as sudo, whether in your terminal or as a cronjob or any other way, you must always run it and other commands against that repo as sudo. So make your choice now. 9. Run the first restic backup. This will take a while, depending on how much data you have. 250 GB took me about an hour and a half. Edit, remove, or add to the tags as desired. Tags can be shared between repos in various combinations. They can be used to search for, query, and prune your backups from their various sources. The `--exclude-caches` option will exclude directories containing the `CACHEDIR.TAG` file. Which isn't all caches, but it's a happy medium between not excluding any, and having to script/search them all out. Pay attention to lack of trailing slashes.
> Note: if you ever run this command as sudo or root, whether in your terminal or as a cronjob or any other way, you must always run it and other commands against that repo as sudo/root. So make your choice now.
```bash ```bash
restic backup --verbose --compression max \ restic backup --verbose --compression max \
-p $REPO_PASSWORD \
-r /path/to/repo \ -r /path/to/repo \
--tag $TAG1 --tag $TAG2 \ --tag $TAG1 --tag $TAG2 \
--exclude-caches \ --exclude-caches \
@ -81,57 +100,53 @@ restic backup --verbose --compression max \
<br> <br>
8. Verify your backup by first fetching the snapshot ID: 10. Verify your backup by first fetching the snapshot ID:
```bash ```bash
restic -p $REPO_PASSWORD -r /path/to/repo snapshots restic -r /path/to/repo snapshots
``` ```
Then list the files within to verify everything is there: Then list the files within to verify everything is there:
```bash ```bash
restic ls -p $REPO_PASSWORD -r /path/to/repo --long $SNAPSHOT_ID restic ls -r /path/to/repo --long $SNAPSHOT_ID
``` ```
Then compare the backup size to the size of the source. This will retrieve the uncompressed size of the repo, and it won't perfectly align. But it should give you an idea. Then compare the backup size to the size of the source. This will retrieve the uncompressed size of the repo, and it won't perfectly align. But it should give you an idea.
```bash ```bash
restic -p $REPO_PASSWORD -r /path/to/repo stats $SNAPSHOT_ID restic -r /path/to/repo stats $SNAPSHOT_ID
``` ```
And finally, check the integrity of the repo: And finally, check the integrity of the repo:
```bash ```bash
restic -p $REPO_PASSWORD -r /path/to/repo check restic -r /path/to/repo check
``` ```
<br> <br>
9. Copy the restic password template to the script's root directory, rename it as you like, and replace all text within it with just the password. Then secure the file: 11. Replace all text in `resticPassword` with just the password.
```bash
sudo chmod 600 /path/to/restic/password/.file
```
<br> <br>
10. Copy the restic excludes template to the script's root directory, rename it as you like, and replace the `/path/to/restic/password/.file` line with the path to your restic password file. You can also add any other excludes you would like. 12. Replace the `/path/to/restic/password/.file` line in `resticExcludes` with the path to your restic password file. You can also add any other excludes you would like.
<br> <br>
11. Copy the script template to the script's root directory, rename as you like, and then fill out the variables the comments call out. Pay attention to where leading/trailing slashes are omitted. That is on purpose. I find it's best to use absolute paths, that way if you every move the script to a different directory, it won't break. A few notes and definitions above and beyond the comments in the script: 13. Configure `backups.sh` by filling out the variables the comments call out. Pay attention to where leading/trailing slashes are omitted. That is on purpose. I find it's best to use absolute paths, that way if you ever move the script to a different directory, it won't break. A few notes and definitions above and beyond the comments in the script:
a, The script dumps a log of its output into a directory of your choosing (the first variable in the script). There's a directory in script's root directory for that, but feel free to put them wherever you like. a. The script dumps a log of its output into the `backupLogs` directory.
b. The script includes variables and scripts for both a second rsync and a second restic source/destination. You can add more or remove them as you like. Just note that each rsync really should have a separate source, destination, and manifest. While restic can have multiple sources syncing to the same repo, which also increases the benefit from its deduplication. You can also mix and match tags (though I would advise against using the exact same set of tags on two different sources). And while you can use the same password for each source, maybe don't? b. If you want to run multiple rsync or restic sources/destinations on the same host, copy the relevant section and increment the variables (i.e. "01" to "02"). Just note that each rsync really should have a separate source, destination, and manifest. While restic can have multiple sources syncing to the same repo, which also increases the benefit from its deduplication. You can also mix and match tags (though I would advise against using the exact same set of tags on two different sources). And while you can use the same password for each source, maybe don't?
c. By default, rsync will backup incrementally, but not track version history. The script gets around this by putting each new backup into its own dated directory, and then hardlinking to the inodes of already backed up files, and only backing up new files. The `--delete` option in this case simply doesn't backup a file instead of deleting it at the destination. A "latest" folder is also created for both the script to check against and for ease of finding the latest backup. This leads us to... c. By default, rsync will backup incrementally, but not track version history. This script gets around this by putting each new backup into its own dated directory, and then hardlinking to the inodes of already backed up files, and only backing up new files. The `--delete` option in this case simply doesn't backup a file instead of deleting it at the destination. A `latest` folder is also created for both the script to check against and for ease of finding the latest backup. This leads us to...
d. The rsync script also allows for days of retention. After which older backup directories are deleted. And, thanks to hardlinking, files that were initially backed up in it are not deleted if they are hardlinked in any subsequent backup. `$RSYNC_RETENTION_DAYS` variables are calculated thusly: # of days wanted (i.e. 7) + the latest directory (1) + 1. So in this case, to keep 7 days worth of versioning, you would use a 9 for this variable. d. The rsync script also allows for days of retention. After which older backup directories are deleted. And, thanks to hardlinking, files that were initially backed up in it are not deleted if they are hardlinked in any subsequent backup. `$RSYNC_RETENTION_DAYS` variables are calculated thusly: # of days wanted (i.e. 7) + the latest directory (1) + 1. So in this case, to keep 7 days worth of versioning, you would use a 9 for this variable.
e. The rsync script includes a hacky fix for an issue I ran into rsyncing to an NFS destination. After backing up to the new directory as desired and updating the `latest` hardlink, the timestamps of both would change to the most recent date for the timestamp of 21:20. I have no idea why. And that would mess with the retention if I ran the backup multiple times in a day. As they would all have the same timestamp. So in between updating the `latest` hardlink and running the retention policy, the script runs a `touch` on a `timestamp.fix` file within the `$RSYNC_DEST_PATH`, which fixes the timestamps. If you aren't backing up to an NFS destination, you likely don't need this. And if you know why this is happening, please let me know. Or clone the repo, patch it, and do a pull request so that your fix can be tested and included. e. The rsync script includes a hacky fix for an issue I ran into rsyncing to an NFS destination. After backing up to the new directory as desired and updating the `latest` hardlink, the timestamps of both would change to the most recent date for the timestamp of 21:20. I have no idea why. And that would mess with the retention if I ran the backup multiple times in a day. As they would all have the same timestamp. So in between updating the `latest` hardlink and running the retention policy, the script runs a `touch` on a `timestamp.fix` file within the `$RSYNC_DEST_PATH`, which fixes the timestamps. If you aren't backing up to an NFS destination, you likely don't need this. And if you know why this is happening, please let me know. Or clone the repo, patch it, and do a pull request so that your fix can be tested and included.
f. Pay attention to the restic tags in the script. When the script runs the forget and prune commands, it will run that against the entire repo. So you want to ensure the tags in that command match the backups you want it to actually affect. I would suggest, after running the initial backup in step 7 and then have the script ready, run it and then run the verification steps from step 8 again. Just to be sure you have it right. And if you have multiple sources going to the same repo, do the same. You can also perform [dry runs](https://restic.readthedocs.io/en/latest/060_forget.html#removing-snapshots-according-to-a-policy) on removal policies (and [on backups](https://restic.readthedocs.io/en/latest/040_backup.html#dry-runs) too, btw) to sanity check yourself before accidentally nuking your repo. See the disclaimer at the start of this README. f. Pay attention to the restic tags in the script. When the script runs the forget and prune commands, it will run that against the entire repo. So you want to ensure the tags in that command match the backups you want it to actually affect. I would suggest that, after running the initial backup in step 9 and getting the script ready, you run it and then repeat the verification steps from step 10. Just to be sure you have it right. And if you have multiple sources going to the same repo, do the same. You can also perform [dry runs](https://restic.readthedocs.io/en/latest/060_forget.html#removing-snapshots-according-to-a-policy) on removal policies (and [on backups](https://restic.readthedocs.io/en/latest/040_backup.html#dry-runs) too, btw) to sanity check yourself before accidentally nuking your repo. See the disclaimer at the start of this README.
g. Regarding the [compression level](https://restic.readthedocs.io/en/latest/047_tuning_backup_parameters.html?highlight=compress#compression) of the restic backup, you can choose `off`, `auto`, or `max`. I ran a super scientific one run each on my backup source and got the following results: g. Regarding the [compression level](https://restic.readthedocs.io/en/latest/047_tuning_backup_parameters.html?highlight=compress#compression) of the restic backup, you can choose `off`, `auto`, or `max`. I ran a super scientific one run each on my backup source and got the following results:
@ -145,13 +160,9 @@ sudo chmod 600 /path/to/restic/password/.file
h. At the end of the script, just prior to the completion of the log file, there is a line that will delete logs older than (by default) 14 days. Feel free to remove this or to edit the retention variable at the top of the script to your liking. h. At the end of the script, just prior to the completion of the log file, there is a line that will delete logs older than (by default) 14 days. Feel free to remove this or to edit the retention variable at the top of the script to your liking.
<br> i. If needed, you can debug the script by uncommenting the 21st line in `backups.sh`, which prints each command to the log so you can see which command ran last. Be sure to comment it back out afterwards so your logs aren't bloated.
12. Make the script executable: j. I self-host an [ntfy](https://ntfy.sh) server to receive notifications on my homelab. (Boilerplate can be found [here](https://tacksupport.net/git/Tack-Support/Boilerplates/src/branch/main/docker-compose/ntfy/docker-compose.yml).) The commented out sections from lines 23 to 33 notify me in case the script fails and lines 125 to 131 notify me if it succeeds. Both also attach the log file. Delete, use, or modify to your own use case.
```bash
sudo chmod u+x /path/to/script.sh
```
<br> <br>
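The hardlinking in note c and the retention arithmetic in note d are easier to trust once you have watched them work. Below is a rough sketch for a throwaway temp directory. The dates, the file name, and `days_wanted` are made up, and it imitates what `--link-dest` does for unchanged files with a plain `ln` instead of calling rsync. It relies on GNU `touch -d`, in line with the Linux assumption of this README.

```bash
#!/bin/bash
set -euo pipefail

days_wanted=7                          # hypothetical: keep 7 days of versions
retention=$(( days_wanted + 1 + 1 ))   # + the "latest" entry, + 1 because tail -n +N starts AT line N

demo_dir="$(mktemp -d)"
cd "${demo_dir}"

# Day 1 holds the real file; every later day hardlinks to the same inode
mkdir 2024-01-01
echo "unchanged data" > 2024-01-01/file.txt
for day in $(seq 2 10); do
    dir="2024-01-$(printf '%02d' "${day}")"
    mkdir "${dir}"
    ln 2024-01-01/file.txt "${dir}/file.txt"
done

# Backdate the directories so ls -t sorts them as if one was made per day
for day in $(seq 1 10); do
    touch -d "$(( 11 - day )) days ago" "2024-01-$(printf '%02d' "${day}")"
done
ln -s 2024-01-10 latest

# Same listing the script uses to pick what to prune
to_prune="$(ls -t | tail -n +"${retention}")"
echo "Pruning:"
echo "${to_prune}"
rm -rf ${to_prune}

remaining_dirs="$(find . -mindepth 1 -maxdepth 1 -type d | wc -l)"
echo "Dated directories left: ${remaining_dirs}"

# The original copy in 2024-01-01 is gone, but the data survives via the hardlinks
cat 2024-01-10/file.txt
```

With a value of 9, `ls -t` lists `latest` plus the seven newest dated directories in its first eight lines, and `tail -n +9` hands everything from the ninth line onward to `rm`. The file is still readable afterwards because a hardlinked inode is only freed once its last link is removed.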
@ -161,14 +172,15 @@ sudo chmod u+x /path/to/script.sh
crontab -e crontab -e
``` ```
> Run as sudo if your restic repo requires it.
Paste something like the following to the end of your crontab: Paste something like the following to the end of your crontab:
```bash ```bash
0 0 * * * cd /path/to/script/dir/ && ./script.sh PATH=/absolute/path/to/script/dir
0 0 * * * /absolute/path/to/script/dir/backups.sh
``` ```
You can avoid having to have crontab cd into the script's directory if you place it somewhere in your path. If you do, I would suggest copying the script you just edited to said path folder. That way you can fiddle with and test it without messing with your production script. Then replace the prod script once you have any tweaks figured out.
<br> <br>
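Since the script now runs unattended from cron, it helps to know how its error handling behaves. The failure notification in `backups.sh` hangs off bash's `ERR` trap combined with `set -eEuo pipefail`. Here is a minimal sketch of that pattern, assuming a plain `echo` in place of the ntfy `curl` call. The inner script and its messages are made up for illustration.

```bash
#!/bin/bash
set -euo pipefail

# Write a tiny stand-in for backups.sh to a temp file so it can fail without killing this shell
demo_script="$(mktemp)"
cat > "${demo_script}" <<'EOF'
#!/bin/bash
# -e: exit on error, -E: functions inherit the ERR trap,
# -u: unset variables are errors, -o pipefail: a pipeline fails if any stage fails
set -eEuo pipefail

# Hypothetical stand-in for the curl call that posts the log to ntfy
err_report() {
    echo "backup failed at line $1 with exit code $2"
}
trap 'err_report "${LINENO}" "$?"' ERR

echo "step 1 ok"
false                 # simulates a failing rsync or restic command
echo "never reached"
EOF

# Run it and capture both the output and the exit status
status=0
output="$(bash "${demo_script}")" || status=$?
echo "${output}"
echo "exit status: ${status}"
```

The trap fires once for the failing command, then `set -e` stops the script, so nothing after the failure runs and cron sees a non-zero exit status.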
### Sources, Inspiration, and Further Reading ### Sources, Inspiration, and Further Reading

@ -0,0 +1,131 @@
#!/bin/bash
############################
# LOGGING & ERROR HANDLING #
############################
# Ensure you set the SCRIPT_DIR variable correctly as the error handling will not catch it
# Change the LOG_RETENTION if you wish for more or less.
readonly SCRIPT_DIR="/path/to/script/dir"
readonly LOG_DIR="${SCRIPT_DIR}/backupLogs"
readonly DATETIME="$(date '+%Y-%m-%d_%H:%M:%S')"
readonly BACKUP_LOG="${LOG_DIR}/backupLog_${DATETIME}.log"
readonly LOG_RETENTION="14"
exec 3<&1 4<&2
trap "exec 2<&4 1<&3" 0 1 2 3
exec > >(tee >(ts "%Y-%m-%d_%H:%M:%S" > "${BACKUP_LOG}")) 2>&1
set -eEuo pipefail
# Uncomment the below to debug the script
# set -x
# trap 'err_report' ERR
# function err_report() {
# sleep 5
# curl \
# -T "${BACKUP_LOG}" \
# -H "Filename: backupLog_"${DATETIME}".log" \
# -H prio:high \
# -H "Title: Backup Failed on ${HOSTNAME}" \
# ntfyUser:ntfyPassword@ntfyDomain/ntfyTopic
# }
################
# RSYNC SCRIPT #
################
# Configure variables from here...
readonly RSYNC_SOURCE_01="/path/to/dir/to/backup-01"
readonly RSYNC_DEST_01="/path/to/dir/to/backup/to-01"
readonly RSYNC_MANIFEST_01="${SCRIPT_DIR}/rsyncManifest"
readonly RSYNC_RETENTION_DAYS_01="9"
# ...to here
readonly RSYNC_DEST_PATH_01="${RSYNC_DEST_01}/${DATETIME}"
readonly RSYNC_LATEST_LINK_01="${RSYNC_DEST_01}/latest"
# Creates the backup directory
mkdir -p "${RSYNC_DEST_01}"
# -avP will tell rsync to run in archive mode, be verbose, keep partial files if interrupted, and show progress
rsync -avP --delete --prune-empty-dirs --include-from="${RSYNC_MANIFEST_01}" \
"${RSYNC_SOURCE_01}/" \
--link-dest "${RSYNC_LATEST_LINK_01}" \
"${RSYNC_DEST_PATH_01}"
# This will update the latest symlink (created with ln -s below)
rm -rf "${RSYNC_LATEST_LINK_01}"
ln -s "${RSYNC_DEST_PATH_01}" "${RSYNC_LATEST_LINK_01}"
# The hacky fix for the NFS destination timestamp bug
touch "${RSYNC_DEST_PATH_01}"/timestamp.fix
# This will prune excess version folders.
cd "${RSYNC_DEST_01}"
rm -rf `ls -t | tail -n +"${RSYNC_RETENTION_DAYS_01}"`
#################
# RESTIC SCRIPT #
#################
# Configure all but first and last accordingly.
readonly RESTIC_PASSWORD_01="${SCRIPT_DIR}/.resticPassword"
readonly RESTIC_SOURCE_01="/path/to/dir/to/backup-01"
readonly RESTIC_REPO_01="/path/to/restic/repo-01"
readonly RESTIC_RETENTION_DAYS_01="7"
readonly RESTIC_RETENTION_WEEKS_01="4"
readonly RESTIC_RETENTION_MONTHS_01="6"
readonly RESTIC_RETENTION_YEARS_01="1"
# If you prefer a keep last retention policy, comment out the above 4 and uncomment the below and configure
# readonly RESTIC_RETENTION_KEEP_LAST_01="2"
readonly RESTIC_TAG_01="tag01"
readonly RESTIC_TAG_02="tag02"
readonly RESTIC_EXCLUDES_01="${SCRIPT_DIR}/resticExcludes"
# -p points to the password file, -r points to the restic repo path
restic backup --verbose \
-p "${RESTIC_PASSWORD_01}" \
-r "${RESTIC_REPO_01}" \
--tag "${RESTIC_TAG_01}" --tag "${RESTIC_TAG_02}" \
--exclude-caches \
--exclude-file="${RESTIC_EXCLUDES_01}" \
"${RESTIC_SOURCE_01}"
# Now we forget snapshots and prune data for the same tags in the repo
restic forget --prune --verbose --tag "${RESTIC_TAG_01}","${RESTIC_TAG_02}" \
-p "${RESTIC_PASSWORD_01}" \
-r "${RESTIC_REPO_01}" \
--keep-daily "${RESTIC_RETENTION_DAYS_01}" \
--keep-weekly "${RESTIC_RETENTION_WEEKS_01}" \
--keep-monthly "${RESTIC_RETENTION_MONTHS_01}" \
--keep-yearly "${RESTIC_RETENTION_YEARS_01}"
# If using a keep last retention policy, comment out the above forget command and uncomment the below
# restic forget --prune --verbose --tag "${RESTIC_TAG_01}","${RESTIC_TAG_02}" \
# -p "${RESTIC_PASSWORD_01}" \
# -r "${RESTIC_REPO_01}" \
# --keep-last "${RESTIC_RETENTION_KEEP_LAST_01}"
# Finally, we verify the integrity of the repo
restic check \
-p "${RESTIC_PASSWORD_01}" \
-r "${RESTIC_REPO_01}"
##############
# TIDYING UP #
##############
# Clean up log files older than LOG_RETENTION days (14 by default)
# find "${LOG_DIR}" -mtime +"${LOG_RETENTION}" -type f -delete
find "${LOG_DIR}"/*.log -mtime +"${LOG_RETENTION}" -type f -delete
# End of script message in log
echo > >(tee >(echo "$(ts "%Y-%m-%d_%H:%M:%S") Backup Script Complete" >> "${BACKUP_LOG}"))
# sleep 5
# curl \
# -T "${BACKUP_LOG}" \
# -H "Filename: backupLog_"${DATETIME}".log" \
# -H prio:low \
# -H "Title: Backup Succeeded on ${HOSTNAME}" \
# ntfyUser:ntfyPassword@ntfyDomain/ntfyTopic