
October 18, 2023

Over-engineering a Blog with Bear and Astro Part 1

I use Bear to take notes and Astro to publish my blog. In fact, I’m writing this in Bear right now as I figure out a workflow for promoting a note in Bear into a blog post on my site, which is built with Astro.

Note: I’m not going to cover setting up an Astro project with automated publishing in CI, but that’s ultimately where I want to end up.

What am I even doing here?

I’m new to Bear and haven’t built anything significant with Astro, so I’ve got some exploring to do first. I know I’ll need to figure out a few things to tie it all together:

  1. Figure out how to query Bear’s SQLite database for the relevant data: notes matching some tag(s)
  2. Write a script that runs this query and generates files I can transform into Astro posts (markdown with frontmatter); JSON is a suitable intermediate format
  3. Write a script that takes this JSON and emits the markdown/frontmatter files in the proper location so Astro can pick them up
  4. Automatically commit those changes to git and push the commit

Working with Bear’s local data

Bear’s data is located at ~/Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite (I don’t think this is likely to change in the future, but just in case: I’m currently using Bear Version 1.9.6 (9044)).

Bear recommends copying the SQLite database to avoid corrupting Bear’s data.

To be extra careful, I’ll make a temporary copy each time the query/publish workflow runs. In general, it should be safe to read directly from the original if you’d rather take that approach.

To get started working with the data, I need to understand Bear’s database schema, which I’m exploring with the sqlite3 CLI program.

First I’ll make a copy of the database (in this case, into my home directory) just in case:

$ cp "~/Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite" ~/database.sqlite

Then crack open the CLI:

$ sqlite3 ~/database.sqlite
SQLite version 3.37.0 2021-12-09 01:34:53
Enter ".help" for usage hints.
sqlite>

Bear’s Database Schema

To get a sense for how Bear structures things, I start by examining the full database schema with the .schema command, which outputs a whole bunch of SQL CREATE statements. The main ones I’m interested in are the ZSFNOTE table, which contains the Note data; the ZSFNOTETAG table, which contains the Tag data; and the Z_7TAGS table, which joins Notes and Tags in a many-to-many relationship (the following is a subset of the full schema, formatted to make it a bit easier to read):

CREATE TABLE ZSFNOTE (
	Z_PK INTEGER PRIMARY KEY,
	Z_ENT INTEGER,
	Z_OPT INTEGER,
	ZARCHIVED INTEGER,
	ZENCRYPTED INTEGER,
	ZHASFILES INTEGER,
	ZHASIMAGES INTEGER,
	ZHASSOURCECODE INTEGER,
	ZLOCKED INTEGER,
	ZORDER INTEGER,
	ZPERMANENTLYDELETED INTEGER,
	ZPINNED INTEGER,
	ZSHOWNINTODAYWIDGET INTEGER,
	ZSKIPSYNC INTEGER,
	ZTODOCOMPLETED INTEGER,
	ZTODOINCOMPLETED INTEGER,
	ZTRASHED INTEGER,
	ZFOLDER INTEGER,
	ZPASSWORD INTEGER,
	ZSERVERDATA INTEGER,
	ZARCHIVEDDATE TIMESTAMP,
	ZCONFLICTUNIQUEIDENTIFIERDATE TIMESTAMP,
	ZCREATIONDATE TIMESTAMP,
	ZLOCKEDDATE TIMESTAMP,
	ZMODIFICATIONDATE TIMESTAMP,
	ZORDERDATE TIMESTAMP,
	ZPINNEDDATE TIMESTAMP,
	ZTRASHEDDATE TIMESTAMP,
	ZCONFLICTUNIQUEIDENTIFIER VARCHAR,
	ZENCRYPTIONUNIQUEIDENTIFIER VARCHAR,
	ZLASTEDITINGDEVICE VARCHAR,
	ZSUBTITLE VARCHAR,
	ZTEXT VARCHAR,
	ZTITLE VARCHAR,
	ZUNIQUEIDENTIFIER VARCHAR,
	ZENCRYPTEDDATA BLOB,
	ZVECTORCLOCK BLOB
);
CREATE TABLE Z_7TAGS (
	Z_7NOTES INTEGER,
	Z_14TAGS INTEGER,
	PRIMARY KEY (Z_7NOTES, Z_14TAGS)
);
CREATE TABLE ZSFNOTETAG (
	Z_PK INTEGER PRIMARY KEY,
	Z_ENT INTEGER,
	Z_OPT INTEGER,
	ZMODIFICATIONDATE TIMESTAMP,
	ZTITLE VARCHAR
);

To view the schema for an individual table you can run .schema <table name>.

What I really need is to query for Notes with some Tags but not others and aggregate Tags into a comma-separated list:

SELECT
    n.ZTITLE as title,
    n.ZTEXT as markdown,
    datetime(n.ZMODIFICATIONDATE, 'unixepoch', '+31 years') as modifiedAt,
    json_group_array(t.ZTITLE) AS tags
  FROM ZSFNOTE n
    JOIN Z_7TAGS as nt ON nt.Z_7NOTES = n.Z_PK
    JOIN ZSFNOTETAG as t ON nt.Z_14TAGS = t.Z_PK
  GROUP BY n.ZTITLE, n.ZTEXT
  HAVING tags LIKE '%blog/solomonhawk%';

Here I’m querying for Notes that are tagged with #blog/solomonhawk. This is based on how I chose to structure my Note Tags in Bear.

Note: I had some trouble formatting the timestamp values Bear stores in ZMODIFICATIONDATE. Eventually I got the result I was looking for by using SQLite’s datetime function with the 'unixepoch' modifier and adding 31 years. The offset makes sense in hindsight: Bear is a Core Data app, and Core Data stores dates as seconds since 2001-01-01 rather than the Unix epoch of 1970-01-01.

I’m also using json_group_array to aggregate the tags into an array of strings. This comes from SQLite’s JSON1 extension, which is enabled in most builds and has been compiled into SQLite core by default since version 3.38.0.

To get data out of SQLite programmatically, rather than interactively, one option is to execute SQL from a file on disk. This file can also contain sqlite3 dot-commands intermixed with SQL statements, which is useful for configuring the output mode (csv, json, etc.), the output destination (defaults to stdout), and which database to query.

The following file will query the database.sqlite in the home directory and output JSON to a file named blog_posts.json:

query.sql
.mode json
.output ~/blog_posts.json
.open ~/database.sqlite

SELECT ...;

This query can be invoked by “shoveling” the file into sqlite3:

$ sqlite3 < query.sql

Or alternatively, using pipes (although shellcheck indicates that this is a useless cat):

$ cat query.sql | sqlite3

If for some reason you’d prefer to keep the file as SQL only, you can accomplish the same thing using pipes and CLI arguments:

$ sqlite3 database.sqlite -cmd ".mode json" < query.sql > blog_posts.json

Building the Import Workflow

Since the purpose of this workflow is to pull blog posts written in Bear into my Astro site, I’m going to keep the scripts in that repo, starting with an executable script at ./scripts/import:

./scripts/import
#!/usr/bin/env bash

set -e

cd "$(dirname "$0")/.."

# create temporary directory
TMPFOLDER=$(mktemp -d)
TMPFILE="$TMPFOLDER/bear-database.sqlite"

# copy the database
cp "$HOME/Library/Group Containers/9K33E3U3T4.net.shinyfrog.bear/Application Data/database.sqlite" "$TMPFILE"

# query for posts and tags, convert JSON to markdown
sqlite3 "$TMPFILE" -cmd ".mode json" < ./scripts/import-from-bear/query-posts-tags.sql | ts-node ./scripts/import-from-bear

This script will:

  1. Set up a temporary directory
  2. Copy Bear’s database into that directory
  3. Execute the SQL to query tagged posts and emit them as JSON
  4. Feed the resulting JSON into a TS/Node script

Next I’ll need to create ./scripts/import-from-bear/query-posts-tags.sql and ./scripts/import-from-bear. I’ve opted to do the rest of my scripting in TypeScript/Node.js for two reasons:

  1. I’m working with JSON data, so manipulating it will be pretty straightforward
  2. The rest of my Astro site is also written in TypeScript, and I want to share the schema type for Posts between the two

To get that up and running I just needed to install ts-node and @types/node. To give myself a bit of extra type safety, I’m also going to pull in io-ts, a runtime type-checking library, and fp-ts, a general-purpose functional programming library that I’ve been itching to play with.

$ npm i -D ts-node @types/node fp-ts io-ts

io-ts lets me write out a schema for Posts that I can use to validate the data I’m pulling in from SQLite, so that if something changes I’ll get slightly better error messages and know where to look. It’s also super simple to have TypeScript infer the type of that schema for use elsewhere in the system.
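
As a rough sketch (the file name is a placeholder and the fields simply mirror the aliases in the SELECT above), the codec might look something like this. One wrinkle: since json_group_array returns TEXT, the tags column comes through the JSON output as a JSON-encoded string rather than a nested array, so I model it as a string here and parse it separately:

./scripts/import-from-bear/schema.ts
import * as t from 'io-ts';

// One row of the JSON emitted by query-posts-tags.sql (field names match the SELECT aliases)
export const Post = t.type({
  title: t.string,
  markdown: t.string,
  modifiedAt: t.string,
  // json_group_array() returns TEXT, so tags arrive as a JSON-encoded string like '["blog/solomonhawk"]'
  tags: t.string,
});

// The query result is an array of rows
export const Posts = t.array(Post);

// Let TypeScript infer the static type from the codec for reuse elsewhere in the site
export type Post = t.TypeOf<typeof Post>;

Exporting both the codec and the inferred Post type is what makes it easy to share the schema with the rest of the Astro codebase later.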

Note: I also had to add a tsconfig.json to my ./scripts/import-from-bear/ folder so that the TypeScript compiler (tsc) knows that the .ts file therein is a standalone node script (and to avoid tsc picking up the Astro-specific TypeScript configuration in the root tsconfig.json).

./scripts/import-from-bear/tsconfig.json
{
  "extends": "ts-node/node16/tsconfig.json",
  "ts-node": {
    "transpileOnly": true,
    "files": true,
    "compilerOptions": {
      "strict": true
    }
  }
}

The SQL file at ./scripts/import-from-bear/query-posts-tags.sql just contains the query we looked at above for pulling out Notes tagged with #blog/solomonhawk.
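
For reference, here’s a rough sketch of what the entry script could look like, assuming the codec above lives next to it in schema.ts and that the generated posts should land in src/content/blog (both paths are placeholders rather than the final layout):

./scripts/import-from-bear/index.ts
import * as fs from 'fs';
import * as path from 'path';
import { isLeft } from 'fp-ts/Either';
import { Posts } from './schema';

// Read the JSON that sqlite3 pipes in on stdin
async function readStdin(): Promise<string> {
  const chunks: Buffer[] = [];
  for await (const chunk of process.stdin) {
    chunks.push(chunk as Buffer);
  }
  return Buffer.concat(chunks).toString('utf8');
}

async function main() {
  const decoded = Posts.decode(JSON.parse(await readStdin()));

  if (isLeft(decoded)) {
    throw new Error('Query output did not match the expected Post schema');
  }

  // Placeholder output directory; ./scripts/import cds to the repo root before running this
  const outDir = path.join(process.cwd(), 'src', 'content', 'blog');
  fs.mkdirSync(outDir, { recursive: true });

  for (const post of decoded.right) {
    // Derive a simple slug from the note title
    const slug = post.title
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-')
      .replace(/(^-|-$)/g, '');

    // tags is a JSON-encoded string (see schema.ts), so parse it here
    const tags: string[] = JSON.parse(post.tags);

    const frontmatter = [
      '---',
      `title: ${JSON.stringify(post.title)}`,
      `modifiedAt: ${JSON.stringify(post.modifiedAt)}`,
      `tags: ${JSON.stringify(tags)}`,
      '---',
      '',
    ].join('\n');

    fs.writeFileSync(path.join(outDir, `${slug}.md`), frontmatter + post.markdown);
  }
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});

Once this is wired up, running ./scripts/import should drop one markdown file per tagged Note into the content directory.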

While I’m getting stuff set up, I’ll also add an npm script, npm run import, to my package.json to fire off the import.

package.json
{
  ...
  "scripts": {
    "dev": "astro dev",
    "start": "astro dev",
    "build": "astro check && tsc && astro build",
    "preview": "astro preview",
+    "import": "./scripts/import"
  },
  ...
}

That’s as far as I’ve taken this for now, and there are still a lot of fun things left to figure out. Notably:

  • Automating the publish flow with a GitHub workflow
  • Handling embedded media (like images) from blog Notes
  • Adding additional functionality to the markdown parser, like labels for code blocks
  • Watching for changes and re-running the import in dev mode

More on that to come. If you made it this far, thanks for sticking around! I hope you learned something along with me. :)