Developing Loaders
Introduction
Loaders are the programs that pull voter file data from local governments and transform it into Specification-compliant data.
Each loader covers a single political division (e.g. a county) and lives as a TypeScript module automatically loaded from the packages/core/src/divisions/ directory tree at the path corresponding to the OCD Division ID.
e.g. packages/core/src/divisions/country-us/state-tx/county/travis.ts
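To make the path convention concrete, here is a minimal sketch of how an OCD Division ID could be turned into a loader path. The helper name `ocdIdToLoaderPath` is hypothetical and not part of the repository; the rule that the last segment becomes `type/name.ts` is an assumption inferred from the Travis County path above.

```typescript
// Hypothetical helper, not part of the Protovoters codebase.
// Assumption: every segment except the last becomes a "type-name" directory,
// and the last segment becomes "type/name.ts" (inferred from the example path).
function ocdIdToLoaderPath(
  ocdId: string,
  root = "packages/core/src/divisions",
): string {
  const prefix = "ocd-division/";
  if (!ocdId.startsWith(prefix)) {
    throw new Error(`Not an OCD Division ID: ${ocdId}`);
  }

  // Split "country:us/state:tx/county:travis" into { type, name } pairs.
  const parts = ocdId
    .slice(prefix.length)
    .split("/")
    .map((segment) => {
      const [type, name] = segment.split(":");
      if (!type || !name) {
        throw new Error(`Malformed segment: "${segment}"`);
      }
      return { type, name };
    });

  const last = parts.pop();
  if (!last) {
    throw new Error("OCD Division ID has no segments");
  }

  const dirs = parts.map((p) => `${p.type}-${p.name}`);
  return [root, ...dirs, last.type, `${last.name}.ts`].join("/");
}

const travisPath = ocdIdToLoaderPath(
  "ocd-division/country:us/state:tx/county:travis",
);
// → "packages/core/src/divisions/country-us/state-tx/county/travis.ts"
```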
Loaders must export an object that complies with the LoaderSpec interface:
Properties
| Property | Type | Defined in |
|---|---|---|
| bbox | Envelope | utils/codec.ts:91 |
| codec | object | utils/codec.ts:90 |
| codec.safeDecode | { data: infer; success: true; } \| { error: unknown; success: false; } | utils/codec.ts:90 |
| division | DivisionShape | utils/codec.ts:92 |
| metadata? | string | utils/codec.ts:94 |
The codec must be a Zod codec, and metadata, if provided, should be a JSON string.
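The safeDecode result listed in the table is a union discriminated on `success`. The sketch below shows one way a consumer might narrow that union and count failures. `DecoderLike`, `decodeRows`, and `toyCodec` are illustrative stand-ins, not names from the core library, and this is not how the core library is known to process rows.

```typescript
// Result shape taken from the Properties table; the generic parameter is an assumption.
type SafeDecodeResult<T> =
  | { success: true; data: T }
  | { success: false; error: unknown };

// Minimal stand-in for the `codec` property: anything with a safeDecode method.
interface DecoderLike<T> {
  safeDecode(input: unknown): SafeDecodeResult<T>;
}

// Hypothetical helper: decode a batch of rows, keeping successes and counting failures.
function decodeRows<T>(
  codec: DecoderLike<T>,
  rows: unknown[],
): { records: T[]; skipped: number } {
  const records: T[] = [];
  let skipped = 0;
  for (const row of rows) {
    const result = codec.safeDecode(row);
    if (result.success) {
      records.push(result.data); // narrowed to the success branch
    } else {
      skipped += 1; // result.error is available here for logging
    }
  }
  return { records, skipped };
}

// Toy codec for demonstration only: accepts numeric strings, rejects everything else.
const toyCodec: DecoderLike<number> = {
  safeDecode(input) {
    const n = typeof input === "string" ? Number(input) : NaN;
    return Number.isNaN(n)
      ? { success: false, error: new Error(`not numeric: ${String(input)}`) }
      : { success: true, data: n };
  },
};

const outcome = decodeRows(toyCodec, ["1", "x", "3"]);
// outcome.records is [1, 3] and outcome.skipped is 1
```

Counting failures like this during development can help reveal a mismatch between your input schema and the real file.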
Getting Started
Prerequisites
- Node.js
- Yarn
- A local clone of the Protovoters repository
- A publicly available voter file data source for the division you want to support
Create the loader
Create a new .ts file at the appropriate path under packages/core/src/divisions/. The path should mirror the OCD Division ID hierarchy of the division you're targeting, using - instead of :.
For example, to create a loader for Queens County, NY:
packages/core/src/divisions/country-us/state-ny/county/queens.ts
Define the bounding box
Export a bbox constant with the geographic bounding box of the division. This is used for spatial queries (e.g. forBbox() and forPolygon()).
export const bbox = {
  // Placeholder values; replace them with your division's extent.
  xMin: 0,
  yMin: 0,
  xMax: 0,
  yMax: 0,
} as const;
Then include the bbox in your final default export.
You can find the bbox for your division using a bbox tool.
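If you already have boundary coordinates for the division, the bbox can also be derived from them. The following is a minimal sketch: the `Envelope` interface mirrors the field names used above, `envelopeOf` is a hypothetical helper, and treating x as longitude and y as latitude is an assumption. The sample coordinates are made up and are not a real boundary.

```typescript
// Mirrors the field names of the bbox constant; the real Envelope type may differ.
interface Envelope {
  xMin: number;
  yMin: number;
  xMax: number;
  yMax: number;
}

// Assumption: x is longitude and y is latitude.
type Position = [x: number, y: number];

// Hypothetical helper: smallest axis-aligned rectangle containing all points.
function envelopeOf(points: Position[]): Envelope {
  if (points.length === 0) {
    throw new Error("Cannot compute a bounding box from zero points");
  }
  let xMin = Infinity;
  let yMin = Infinity;
  let xMax = -Infinity;
  let yMax = -Infinity;
  for (const [x, y] of points) {
    xMin = Math.min(xMin, x);
    yMin = Math.min(yMin, y);
    xMax = Math.max(xMax, x);
    yMax = Math.max(yMax, y);
  }
  return { xMin, yMin, xMax, yMax };
}

// Made-up outline, for illustration only.
const outline: Position[] = [
  [1, 2],
  [3, -1],
  [-2, 5],
];
const derivedBbox = envelopeOf(outline);
// → { xMin: -2, yMin: -1, xMax: 3, yMax: 5 }
```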
Define the codec
Define a Zod codec and input schema to transform the fetched voter objects into VoterRecords. If you include an address column, we will geocode it from an OpenStreetMap Slice of your bbox.
import z from "zod";
import { voterId, iri, literal, VoterRecord } from "#utils/codec.js";

// Shape of one raw row from the voter file.
const InputVoterShape = z.object({
  voter_id: z.coerce.string(),
  full_name: z.string(),
  status: z.enum(["ACTIVE", "INACTIVE"]),
  address: z.string(),
  precinct: z.coerce.number(),
});

// Maps source status values to Specification IRIs.
const STATUS_MAP: Record<string, string> = {
  ACTIVE: "https://protovoters.org/spec#Active",
  INACTIVE: "https://protovoters.org/spec#Inactive",
};

const codec = z.codec(
  InputVoterShape,
  VoterRecord,
  {
    decode(row) {
      return {
        address: row.address,
        voter: {
          // BASE_URI is defined in "Create the root division" below.
          "@id": `${BASE_URI}/voter:${row.voter_id}`,
          $type: "VoterShape",
          voterID: [
            voterId("https://protovoters.org/spec#StateVoterRegistrationId", row.voter_id),
          ],
          name: [literal(row.full_name, "en-US")],
          registrationStatus: iri(STATUS_MAP[row.status]),
          residentialAddress: iri(row.address),
          divisionMembership: [iri(`${BASE_URI}/precinct:${row.precinct}`)],
        },
      };
    },
    encode() {
      throw new Error("Encode not supported");
    },
  },
);
If your input data has dynamic columns (e.g. election history columns that vary per file), use .passthrough() on your input schema so unknown keys survive parsing, then access them in the decode function. In Zod 4, .passthrough() is deprecated; z.looseObject() is the equivalent.
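As an illustration of reading dynamic columns inside decode, the sketch below collects extra keys from a row that kept its unknown columns. The `election_` prefix, the `LooseRow` type, and the `electionHistory` helper are all hypothetical; adjust the matching rule to whatever your source file actually uses.

```typescript
// Hypothetical row shape: known columns plus arbitrary extra keys kept by a loose schema.
type LooseRow = { voter_id: string } & Record<string, unknown>;

// Assumption: election history columns share a common prefix such as "election_".
const ELECTION_PREFIX = "election_";

// Collect non-empty election history columns, keyed by the part after the prefix.
function electionHistory(row: LooseRow): Record<string, string> {
  const history: Record<string, string> = {};
  for (const [key, value] of Object.entries(row)) {
    if (
      key.startsWith(ELECTION_PREFIX) &&
      typeof value === "string" &&
      value !== ""
    ) {
      history[key.slice(ELECTION_PREFIX.length)] = value;
    }
  }
  return history;
}

const sampleRow: LooseRow = {
  voter_id: "123",
  election_2020_general: "Y",
  election_2022_primary: "",
  precinct: "12",
};
const participation = electionHistory(sampleRow);
// → { "2020_general": "Y" }
```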
Create the root division
Define a DivisionShape for the division your loader covers:
import { DivisionShape } from "#types/codegen/schema.js";
import { DataFactory } from "n3";

const BASE_URI = "https://protovoters.org/divisions/country-us/state-ny/county-queens";

const rootDivision = DivisionShape.$create({
  $identifier: BASE_URI,
  ocdID: "ocd-division/country:us/state:ny/county:queens",
  divisionName: [DataFactory.literal("Queens County", "en-US")],
  isAtomic: false,
});
Export the loader
Export a default object that satisfies the LoaderSpec interface. This is where you include your data stream; the core library handles routing it asynchronously through Zod. Records that fail validation are silently skipped.
import * as csv from "#utils/csv.js";
import type { LoaderSpec } from "#utils/codec.js";

const VOTERFILE_URL = "https://example.gov/voterfile.csv";

export default {
  codec,
  bbox,
  division: rootDivision,
  source: () => csv.stream(VOTERFILE_URL, { header: true }).pipeThrough(
    new TransformStream({
      transform(chunk, controller) {
        // Forward only the parsed rows from each chunk.
        controller.enqueue(chunk.data as unknown[]);
      },
    }),
  ),
} satisfies LoaderSpec;
Utilities
Import utilities to build your loaders from #utils. For example:
import * as csv from "#utils/csv.js";
codec
- voterId(type, value): Creates a VoterIdShape.$Json entry
- iri(value): Creates a JSON-LD IRI reference ({ "@id": value })
- literal(value, language?): Creates a JSON-LD literal ({ "@value": value, "@language": language })
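To show the object shapes these helpers are described as producing, here is a small stand-in sketch. `iriSketch` and `literalSketch` are hypothetical re-implementations for illustration, not the functions exported from #utils/codec.js; in particular, omitting "@language" when no language is passed is an assumption.

```typescript
// Shapes taken from the helper descriptions above.
interface IriRef {
  "@id": string;
}
interface LangLiteral {
  "@value": string;
  "@language"?: string;
}

// Hypothetical stand-in for iri(value).
function iriSketch(value: string): IriRef {
  return { "@id": value };
}

// Hypothetical stand-in for literal(value, language?).
// Assumption: the "@language" key is left out when no language is given.
function literalSketch(value: string, language?: string): LangLiteral {
  return language === undefined
    ? { "@value": value }
    : { "@value": value, "@language": language };
}

const statusRef = iriSketch("https://protovoters.org/spec#Active");
// → { "@id": "https://protovoters.org/spec#Active" }
const nameLiteral = literalSketch("Queens County", "en-US");
// → { "@value": "Queens County", "@language": "en-US" }
```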
csv
- csv.stream(url, options): Returns a ReadableStream that fetches a CSV and emits chunks of parsed rows. Use this for voter file data.
- csv.fetch(url, options): Fetches and parses an entire CSV into memory. Use this for smaller reference data.
Both accept { header: true } as options to treat the first row as column headers.
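Conceptually, treating the first row as column headers means each later row becomes an object keyed by column name, which is what lets an input schema refer to fields such as voter_id. The sketch below illustrates that idea only; `applyHeader` is a hypothetical function, and the csv utility's actual parsing behaviour (for example, how it handles short rows) is not shown in this guide.

```typescript
// Hypothetical illustration of header handling; not the csv utility's implementation.
function applyHeader(rows: string[][]): Record<string, string>[] {
  const [header, ...body] = rows;
  if (!header) {
    return []; // no rows at all, so nothing to map
  }
  return body.map((cells) => {
    const record: Record<string, string> = {};
    header.forEach((column, index) => {
      // Assumption: missing cells are filled with an empty string.
      record[column] = cells[index] ?? "";
    });
    return record;
  });
}

const parsedRows = applyHeader([
  ["voter_id", "full_name"],
  ["1", "Ada Example"],
]);
// → [{ voter_id: "1", full_name: "Ada Example" }]
```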
Testing
After implementing your loader, you can test it with the CLI:
yarn @protovoters/core build
yarn protovoters build
node packages/cli/dist/protovoters.mjs build ocd-division/country:us/state:ny/county:queens
Verify that:
- The output file is a valid FlatGeobuf
- The schema follows the Protovoters Specification
- Voter records contain the expected fields
- Coordinates fall within the bounding box
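The last check can be scripted once you have extracted coordinates from the output. Below is a minimal sketch under stated assumptions: the `Envelope` interface mirrors the bbox field names, `isInside` and `findOutliers` are hypothetical helpers, points are [x, y] pairs, and reading coordinates out of the FlatGeobuf file is left out.

```typescript
// Mirrors the field names of the loader's bbox; the real Envelope type may differ.
interface Envelope {
  xMin: number;
  yMin: number;
  xMax: number;
  yMax: number;
}

// True when the point lies inside the box or on its edge.
function isInside(bbox: Envelope, x: number, y: number): boolean {
  return x >= bbox.xMin && x <= bbox.xMax && y >= bbox.yMin && y <= bbox.yMax;
}

// Hypothetical helper: return every point that falls outside the box.
function findOutliers(
  bbox: Envelope,
  points: Array<[number, number]>,
): Array<[number, number]> {
  return points.filter(([x, y]) => !isInside(bbox, x, y));
}

// Made-up values, for illustration only.
const testBbox: Envelope = { xMin: 0, yMin: 0, xMax: 10, yMax: 10 };
const outliers = findOutliers(testBbox, [
  [5, 5],
  [10, 0],
  [11, 3],
]);
// → [[11, 3]]
```

An empty result means every coordinate falls within the bounding box.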
Example
For a complete example, see the Travis County loader.