Database Schema

folksbase uses Neon (serverless Postgres) with Drizzle ORM. The schema lives in packages/db/src/schema/ and is shared across the API and background jobs. This page documents every table, their relationships, and the design decisions behind the data model.

Entity Relationships

Every table (except workspace_settings, which uses workspace_id as its primary key) has a UUID primary key generated by Postgres. All data is scoped to a workspace.

Multi-Tenancy Model

folksbase uses workspace-scoped multi-tenancy. This means:

Every table has a workspace_id foreign key pointing to workspaces
All queries filter by workspace_id to isolate data between workspaces
Data isolation happens at the query level, not at the database level (no separate schemas or databases per tenant)

Why query-level isolation?

It’s the simplest approach that works for the scale folksbase targets. There’s no need for separate Postgres schemas or row-level security policies — every repository method accepts workspaceId as a parameter and includes it in the WHERE clause. This keeps the code straightforward and the database simple.

// Every repository query includes workspace_id
export async function findById(workspaceId: string, id: string) {
  return db.query.contacts.findFirst({
    where: and(eq(contacts.id, id), eq(contacts.workspace_id, workspaceId)),
  });
}

Tables

workspaces

The top-level tenant. Every other table references this.

Column	Type	Constraints	Description
`id`	`uuid`	PK, auto-generated	Workspace identifier
`name`	`varchar(200)`	NOT NULL	Display name
`slug`	`varchar(100)`	UNIQUE, NOT NULL	URL-friendly identifier
`created_at`	`timestamp`	DEFAULT now()	Creation timestamp

contacts

The core entity. Stores contact information with support for custom fields.

Column	Type	Constraints	Description
`id`	`uuid`	PK, auto-generated	Contact identifier
`workspace_id`	`uuid`	FK → workspaces (CASCADE), NOT NULL	Owning workspace
`email`	`varchar(254)`	NOT NULL	Email address
`first_name`	`varchar(100)`	nullable	First name
`last_name`	`varchar(100)`	nullable	Last name
`phone`	`varchar(50)`	nullable	Phone number
`company`	`varchar(200)`	nullable	Company name
`avatar_url`	`text`	nullable	Gravatar URL (auto-fetched)
`notes`	`text`	nullable	Free-text notes
`custom_fields`	`jsonb`	nullable, GIN indexed	Arbitrary key-value pairs
`is_unsubscribed`	`boolean`	NOT NULL, DEFAULT false	Unsubscribe flag
`import_id`	`uuid`	nullable	Which import created this contact
`created_at`	`timestamp`	DEFAULT now()	Creation timestamp
`updated_at`	`timestamp`	DEFAULT now()	Last update timestamp

Indexes:

contacts_email_workspace_unique — UNIQUE on (email, workspace_id). Prevents duplicate emails within a workspace. Used by upsert during CSV import.
contacts_workspace_id_idx — on workspace_id. Speeds up workspace-scoped queries.
contacts_email_idx — on email. Speeds up email lookups.
contacts_custom_fields_gin_idx — GIN index on custom_fields. Enables fast JSONB queries.
contacts_created_at_idx — on created_at. Speeds up the 30-day growth query.
contacts_workspace_created_idx — on (workspace_id, created_at). Speeds up workspace-scoped time-range queries.

The custom_fields JSONB Column

The custom_fields column stores arbitrary key-value pairs as JSONB. When a CSV file has columns that don’t map to standard contact fields (email, name, phone, etc.), those columns are stored here. For example, if a CSV has a “Department” column that the user doesn’t map to any standard field, it becomes:

{ "Department": "Engineering", "Employee ID": "12345" }

A GIN index on this column enables fast queries against custom field values. GIN (Generalized Inverted Index) is Postgres’s way of indexing structured data inside JSONB — it creates an index entry for each key-value pair, so queries like “find all contacts where custom_fields contains key X” are fast regardless of how many different custom fields exist.

Column	Type	Constraints	Description
`id`	`uuid`	PK, auto-generated	Tag identifier
`workspace_id`	`uuid`	FK → workspaces (CASCADE), NOT NULL	Owning workspace
`name`	`varchar(100)`	NOT NULL	Tag name
`color`	`varchar(20)`	nullable	Display color
`emoji`	`varchar(10)`	nullable	Display emoji

contact_tags

Junction table for the many-to-many relationship between contacts and tags. A contact can have multiple tags, and a tag can be applied to multiple contacts.

Column	Type	Constraints	Description
`contact_id`	`uuid`	FK → contacts (CASCADE), NOT NULL	The contact
`tag_id`	`uuid`	FK → tags (CASCADE), NOT NULL	The tag

Primary key: Composite on (contact_id, tag_id). This prevents duplicate tag assignments and serves as the index for lookups in both directions. Both foreign keys use CASCADE delete — when a contact or tag is deleted, the junction rows are automatically cleaned up.

csv_imports

Tracks CSV file uploads and their processing status.

Column	Type	Constraints	Description
`id`	`uuid`	PK, auto-generated	Import identifier
`workspace_id`	`uuid`	FK → workspaces (CASCADE), NOT NULL	Owning workspace
`filename`	`varchar(500)`	NOT NULL	Original filename (sanitized)
`blob_url`	`text`	nullable	Vercel Blob storage URL
`status`	`varchar(50)`	DEFAULT ‘pending’	One of: `pending`, `processing`, `completed`, `failed`
`total_rows`	`integer`	DEFAULT 0	Total rows in the CSV
`processed_rows`	`integer`	DEFAULT 0	Rows successfully inserted
`updated_rows`	`integer`	DEFAULT 0	Rows updated (duplicates)
`failed_rows`	`integer`	DEFAULT 0	Rows that failed validation
`column_mapping`	`jsonb`	nullable	User-confirmed column mapping
`error_log`	`jsonb`	nullable	Error samples from failed rows
`ai_suggestions`	`jsonb`	nullable	AI-generated column mapping suggestions
`summary`	`text`	nullable	AI-generated import summary
`created_at`	`timestamp`	DEFAULT now()	Upload timestamp
`completed_at`	`timestamp`	nullable	Processing completion timestamp

csv_exports

Tracks CSV export jobs and their output files.

Column	Type	Constraints	Description
`id`	`uuid`	PK, auto-generated	Export identifier
`workspace_id`	`uuid`	FK → workspaces (CASCADE), NOT NULL	Owning workspace
`filename`	`varchar(500)`	NOT NULL	Output filename
`blob_url`	`text`	nullable	Vercel Blob storage URL
`status`	`varchar(50)`	DEFAULT ‘pending’	One of: `pending`, `processing`, `completed`, `failed`
`total_rows`	`integer`	DEFAULT 0	Total rows exported
`processed_rows`	`integer`	DEFAULT 0	Rows written so far
`tag_ids`	`jsonb`	nullable	Tag filter applied to the export
`error_message`	`text`	nullable	Error message if failed
`created_at`	`timestamp`	DEFAULT now()	Export start timestamp
`completed_at`	`timestamp`	nullable	Export completion timestamp

workspace_settings

Per-workspace configuration. Uses workspace_id as the primary key (one-to-one with workspaces).

Column	Type	Constraints	Description
`workspace_id`	`uuid`	PK, FK → workspaces (CASCADE)	Owning workspace
`resend_api_key`	`text`	nullable	Encrypted Resend API key (AES-256-GCM)
`notify_on_import_complete`	`boolean`	DEFAULT true	Email on import completion
`notify_on_export_complete`	`boolean`	DEFAULT true	Email on export completion
`notify_weekly_digest`	`boolean`	DEFAULT true	Weekly digest email

Naming Conventions

Element	Convention	Example
Table names	`snake_case` plural	`contacts`, `csv_imports`, `contact_tags`
Column names	`snake_case`	`workspace_id`, `first_name`, `created_at`
Index names	`{table}_{column}_idx`	`contacts_workspace_id_idx`
Unique constraints	`{table}_{columns}_unique`	`contacts_email_workspace_unique`

Why These Choices?

Why UUIDs for primary keys?

UUIDs are generated client-side (or by Postgres), so there’s no need to query the database to get the next ID. This matters for batch inserts during CSV import — you can generate all IDs upfront without round-trips.

Why JSONB for custom fields?

Custom fields are inherently schema-less — every CSV file can have different columns. JSONB lets you store arbitrary key-value pairs without altering the table schema. The GIN index keeps queries fast even as the variety of custom fields grows.

Why CASCADE deletes?

When a workspace is deleted, all its data should go with it. CASCADE deletes handle this automatically at the database level, so the application code doesn’t need to manually clean up related records.

Getting Started

Features

Architecture

Background Jobs

API

Deployment

Contributing

Entity Relationships

Multi-Tenancy Model

Why query-level isolation?

Tables

workspaces

contacts

The custom_fields JSONB Column

tags

contact_tags

csv_imports

csv_exports

workspace_settings

Naming Conventions

Why These Choices?

Why UUIDs for primary keys?

Why JSONB for custom fields?

Why CASCADE deletes?

What’s Next?

Backend Architecture

Streaming Architecture

Getting Started

Features

Architecture

Background Jobs

API

Deployment

Contributing

Documentation Index

​Entity Relationships

​Multi-Tenancy Model

​Why query-level isolation?

​Tables

​workspaces

​contacts

​The custom_fields JSONB Column

​tags

​contact_tags

​csv_imports

​csv_exports

​workspace_settings

​Naming Conventions

​Why These Choices?

​Why UUIDs for primary keys?

​Why JSONB for custom fields?

​Why CASCADE deletes?

​What’s Next?

Backend Architecture

Streaming Architecture

Entity Relationships

Multi-Tenancy Model

Why query-level isolation?

Tables

workspaces

contacts

The custom_fields JSONB Column

tags

contact_tags

csv_imports

csv_exports

workspace_settings

Naming Conventions

Why These Choices?

Why UUIDs for primary keys?

Why JSONB for custom fields?

Why CASCADE deletes?

What’s Next?