Introducing shroudage
tl;dr
shroudage is a shell wrapper around age that sets up transparent encryption of
content in a git repository, allowing you to have plaintext in your working
copy while keeping the contents encrypted in the repository.
The only dependency is age itself.
Quick overview
Running shroudage init will configure a repository to use the git filters.
There are two pieces - the in-repository files, and the repository
configuration files.
The in-repository files are populated on the first run - a
passphrase-encrypted age key (created for you if it doesn’t exist), a
.gitattributes file to define which files should use which git filters, and a
copy of the shroudage script so you don’t need to install it separately in
future.
The repository configuration files actually configure how the filters work,
and this step needs to happen in every clone of the repository. If you don’t
do it, you’ll just see the encrypted contents. The filters are clean (working
copy to index/commit) and smudge (commit to working copy).
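As a rough sketch of how those two pieces fit together, the script below wires a clean/smudge pair into a throwaway repository. The filter name demo and the rot13 stand-in are assumptions made purely for illustration; they are not what shroudage configures, and rot13 is of course not encryption. It only shows where each piece lives and in which direction each filter runs.

```shell
#!/bin/sh
# Sketch: wire up clean/smudge filters in a throwaway repository.
# ASSUMPTION: the filter name "demo" and the rot13 stand-in are illustrative
# only; shroudage would run age-based commands here instead.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Repository configuration: stored in .git/config, which is never cloned,
# so every clone has to set this up again.
git config filter.demo.clean  "tr 'A-Za-z' 'N-ZA-Mn-za-m'"   # working copy -> index
git config filter.demo.smudge "tr 'A-Za-z' 'N-ZA-Mn-za-m'"   # index -> working copy
git config filter.demo.required true                          # fail loudly if a filter breaks

# In-repository part: .gitattributes says which paths use the filter.
printf 'content/** filter=demo\n' > .gitattributes

mkdir content
printf 'hello\n' > content/entry.txt
git add .gitattributes content/entry.txt
git commit -q -m "add entry"

stored=$(git cat-file blob HEAD:content/entry.txt)   # bytes held by the repository
working=$(cat content/entry.txt)                     # bytes in the working copy
printf 'stored:  %s\nworking: %s\n' "$stored" "$working"
```

The repository ends up holding the filtered text (uryyb) while the working copy still reads hello, which is the same split shroudage aims for with real encryption.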
By default, the content folder will be transparently encrypted for you.
You can use git diff --staged --no-textconv to ensure that when you add a
file to the staging area, it is being encrypted as you expect. Similarly, if
you have a commit, you can use git show --no-textconv to verify it. Then
you can push it upstream.
Otherwise, git diff and git show will show you plaintext changes.
On a new clone, you’ll need to run git restore content to decrypt the files
after running .scripts/shroudage init.
The full story
Run-up
I’ve been wanting to try out diary-style journaling to augment some other types of journaling I’ve been doing lately, and had some requirements that were hard to find in existing tools. I wanted something low-friction, with the ability to synchronize across devices (and not just Apple devices), while also being encrypted.
Ideally, it would be text-based, easy to search and navigate, and so forth.
I did look through a bunch of options, but they all had some non-trivial irritations. At the end of the day, what I wanted was basically the environment I already use today - text files, version control, and so forth, just with encryption.
I don’t know where the idea of git filters came from exactly, but I’ve been
researching what’s changed in the security space since I last spent time
in it, and I’d encountered age along the way, including the use of
age-encrypted files in public repositories. Somewhere in there, the idea of
using age with git filters to transparently encrypt content in a git
repository appeared.
I quickly found git-crypt and transcrypt, which looked like they would work,
but for whatever reason I wanted to look at age, and continued until I found
git-agecrypt.
At this point, I almost committed, but then I ran into a blog post about
using the age command line directly as a filter, and I tried this out. And it
worked-ish! Nice and simple use of an existing tool without much overhead.
Easy to understand. No unnecessary dependencies.
This allowed me to start that journaling.
The issue
This worked-ish, as I said, and the -ish was a bit annoying - in git status,
other checkouts of the repository would show the files as different from what
is in the repository.
The short version is that encrypting the same content twice won’t generate the same encrypted content. In these cases git considered the on-disk files changed and re-ran the filter on them, which encrypted them again, so the result was shown as different from what was committed.
The solution
git-agecrypt had a solution for this - it maintains its own cache mapping a
hash of the plaintext content to the encrypted content, and could consult that
and return the cached encrypted content if the plaintext didn’t change.
Again, probably easiest to move over to it?
I then realized that this cached copy wasn’t really even needed - git already has the encrypted content in the repository, and I could just fetch it from there. It would mean moving away from the simple command line, though…
So I needed a name for the script (the most important thing, obviously), and
I quickly created a wrapper around those commands. Then I updated the clean
subcommand to query the git repository itself for HEAD’s copy of the file,
and reuse the encrypted content if the plaintext hasn’t changed.
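The reuse idea can be sketched roughly as follows. This is not the shroudage implementation: the function names are hypothetical, and encrypt/decrypt are deliberately insecure stand-ins (a random nonce line in front of the plaintext) so the example runs without age or a key. In a real clean filter they would be calls to age, and the old ciphertext would come from something like git cat-file blob "HEAD:$path", with git passing the path to the filter command via %f.

```shell
#!/bin/sh
# Sketch of "reuse the committed ciphertext when the plaintext is unchanged".
# ASSUMPTION: encrypt/decrypt are nondeterministic stand-ins, NOT age and NOT
# secure. All function names here are hypothetical.
set -eu

# Stand-in "encryption": a random first line, then the plaintext as-is.
encrypt() {
    printf 'nonce-%s\n' "$(od -An -N4 -tu4 /dev/urandom | tr -d ' ')"
    cat
}
# Stand-in "decryption": drop the nonce line.
decrypt() { sed 1d; }

# clean_with_reuse OLD_CIPHERTEXT_FILE
# Reads new plaintext on stdin and writes ciphertext to stdout.
clean_with_reuse() {
    old=$1
    new_plain=$(mktemp)
    old_plain=$(mktemp)
    cat > "$new_plain"
    if [ -s "$old" ] && decrypt < "$old" > "$old_plain" \
        && cmp -s "$new_plain" "$old_plain"; then
        cat "$old"                # unchanged: emit the bytes already committed
    else
        encrypt < "$new_plain"    # new or changed file: encrypt afresh
    fi
    rm -f "$new_plain" "$old_plain"
}

# Demo: pretend "$work/committed" is what HEAD holds for this file.
work=$(mktemp -d)
printf 'dear diary\n' | encrypt > "$work/committed"
printf 'dear diary\n' | clean_with_reuse "$work/committed" > "$work/again"
printf 'dear diary, more\n' | clean_with_reuse "$work/committed" > "$work/changed"

cmp -s "$work/committed" "$work/again" && echo "unchanged plaintext: ciphertext reused"
cmp -s "$work/committed" "$work/changed" || echo "changed plaintext: re-encrypted"
```

Because the unchanged file produces byte-identical output, git sees nothing to report, which is the property the plain age filter was missing.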
Scaling
Git appears to do a good job of avoiding work it doesn’t need to do. So,
running strace -f -e trace=%process git diff (and similar operations) shows
that it only ends up calling shroudage for changed content. I wasn’t
exactly planning on doing this with tens of thousands of files, but even a
hundred or two files will eventually be slow if some operation has to run on
all of them.
Anyway, nice to know it’ll support a few years of daily journaling.
Limitations and concerns
Probably the biggest limitation right now is that it requires vigilance to validate that it is operating the way you expect, and the commands to do this are a little esoteric. The feeling that something might silently fail to run, and suddenly there’s plaintext in the repository, isn’t all that reassuring.
Maybe something like a pre-commit hook would help to make this less fraught?
And maybe some check target that actually stages a change to validate things
are working?
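One possible shape for such a hook, as a sketch only and not something shroudage ships: look at the raw blobs in the index and refuse the commit if a staged file under content does not start with an age header. The function names and the content default are assumptions for illustration. The check relies on age files beginning with the line age-encryption.org/v1, or with -----BEGIN AGE ENCRYPTED FILE----- when ASCII-armored.

```shell
#!/bin/sh
# Sketch of a pre-commit check that staged files are age-encrypted.
# ASSUMPTION: function names and the "content" default are illustrative;
# this is not part of shroudage.

# is_age_blob: succeeds if stdin starts with an age header line.
is_age_blob() {
    first_line=$(head -n 1 | tr -d '\0')
    case $first_line in
        age-encryption.org/v1|"-----BEGIN AGE ENCRYPTED FILE-----") return 0 ;;
        *) return 1 ;;
    esac
}

# check_staged [DIR]: inspect added/copied/modified files staged under DIR.
# "git cat-file blob :PATH" prints the index copy without running any filter.
# Paths containing newlines or quotes are not handled in this sketch.
check_staged() {
    dir=${1:-content}
    git -c core.quotePath=false diff --cached --name-only --diff-filter=ACM -- "$dir" |
    {
        status=0
        while IFS= read -r path; do
            if ! git cat-file blob ":$path" | is_age_blob; then
                echo "not encrypted in index: $path" >&2
                status=1
            fi
        done
        exit "$status"
    }
}

# Quick self-check of the header test (no repository needed).
# In .git/hooks/pre-commit the last line would instead be: check_staged content
printf 'age-encryption.org/v1\n-> X25519 abc\n' | is_age_blob && echo "ciphertext: ok"
printf 'dear diary\n' | is_age_blob || echo "plaintext: would be rejected"
```

A hook like this only checks the header, so it catches the "filter silently didn't run" case rather than proving the content decrypts correctly.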