
Individual Jobs

folksbase has four background jobs. This page walks through each one — what triggers it, what steps it runs, and how it handles errors. For the general Inngest patterns (step.run, retry isolation, error handling), see the Overview.

process-csv

Trigger: import/csv.confirmed
Retries: 3
Source: apps/api/src/jobs/process-csv.ts

This is the most complex job in the codebase. It takes a CSV file that a user uploaded, parses it into contacts, and upserts them into the database — potentially hundreds of thousands of rows.

Why chunked processing?

A naive approach would parse the entire CSV and insert all rows in one step. That breaks down at scale for two reasons:
  1. Memory. Loading 100K rows into memory at once risks OOM on the server.
  2. Retries. If the job fails at row 80,000, Inngest retries the entire step — re-parsing and re-inserting all 80,000 rows that already succeeded.
The chunked approach solves both problems. The CSV is parsed in a streaming fashion and split into batches of 500 rows, each stored temporarily in Redis. Then each batch is processed as its own Inngest step. If batch 12 fails, only batch 12 retries.

Steps

| Step | What it does |
| --- | --- |
| fetch-import | Loads the import record from the database. Throws NonRetriableError if the record doesn’t exist or isn’t in processing state. |
| capture-start-time | Captures Date.now() so the finalize step can calculate total duration. |
| download-and-chunk | Streams the CSV from Vercel Blob, parses it with csv-parse, and stores batches of 500 rows in Redis with a 1-hour TTL. Returns the chunk count. |
| process-batch-{i} | Reads chunk i from Redis, maps columns using the user’s column mapping, validates emails with Zod, fetches Gravatar avatars (concurrency-limited to 10), deduplicates by email within the batch, and upserts into the database in sub-batches of 50. Deletes the Redis key after success. |
| finalize | Marks the import as completed, invalidates stats caches, generates an AI summary via Claude Haiku (with graceful fallback), and sends a completion email. If the failure rate exceeds 5%, also sends an error report email. |
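
The 5% error-report threshold in the finalize step can be expressed as a tiny predicate. This is a sketch; the function name and the zero-row guard are assumptions, not the real code.

```typescript
// Decide whether the finalize step should also send an error report email.
// Assumption: a zero-row import never triggers a report.
function shouldSendErrorReport(failed: number, total: number): boolean {
  if (total === 0) return false;
  return failed / total > 0.05; // strictly more than 5% failures
}
```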

Chunking in detail

The downloadAndChunk function streams the CSV — it never loads the full file into memory. As rows are parsed, they accumulate in a buffer. Every 500 rows, the buffer is flushed to Redis:
import:{importId}:chunk:0  →  [500 rows as JSON]  (TTL: 1 hour)
import:{importId}:chunk:1  →  [500 rows as JSON]  (TTL: 1 hour)
...
Redis writes happen concurrently (promises are collected and awaited together after parsing completes). The 1-hour TTL ensures chunks are cleaned up even if the job fails before processing them all. Successfully processed chunks are deleted immediately.
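
The buffer-and-flush logic can be sketched as a pure function. The names here are illustrative, not the real downloadAndChunk implementation; in the real job the flush callback is a Redis write with a 1-hour TTL rather than a plain function call.

```typescript
// Accumulate rows in a buffer; every `chunkSize` rows, hand the buffer to
// `flush` (a stand-in for the Redis SET) and start a new one. Returns the
// chunk count, matching what download-and-chunk returns to the next steps.
function chunkRows<T>(
  rows: Iterable<T>,
  chunkSize: number,
  flush: (chunk: T[], index: number) => void,
): number {
  let buffer: T[] = [];
  let chunkCount = 0;
  for (const row of rows) {
    buffer.push(row);
    if (buffer.length === chunkSize) {
      flush(buffer, chunkCount++);
      buffer = [];
    }
  }
  if (buffer.length > 0) flush(buffer, chunkCount++); // trailing partial chunk
  return chunkCount;
}
```

For example, 1,201 rows with a chunk size of 500 produce three chunks of 500, 500, and 201 rows.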

Batch processing details

Each process-batch-{i} step handles its chunk independently:
  1. Column mapping — applies the user’s CSV header → contact field mapping
  2. Email validation — rows without a valid email are counted as failures (up to 10 error samples are collected across all batches)
  3. Gravatar lookup — concurrent HEAD requests with p-limit(10) and a 2-second timeout per request
  4. Deduplication — if the same email appears multiple times in a batch, the last row wins
  5. Cross-chunk deduplication — emails already processed by earlier chunks are skipped (see below)
  6. Upsert — inserts in sub-batches of 50, using onConflictDoUpdate on (email, workspace_id). Returns separate counts for inserted vs updated rows
  7. Progress tracking — importsRepo.incrementProgress() is called after each sub-batch so the frontend can show real-time progress
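
Two of the details above, last-row-wins deduplication and splitting into sub-batches of 50, can be sketched as pure functions. Type and function names are assumptions for illustration only.

```typescript
// Minimal row shape for the sketch; the real rows carry mapped contact fields.
type Row = { email: string; [key: string]: unknown };

// Within-batch dedup: a Map keyed by email means later rows overwrite
// earlier ones, so the last occurrence of an email wins.
function dedupeLastWins(rows: Row[]): Row[] {
  const byEmail = new Map<string, Row>();
  for (const row of rows) byEmail.set(row.email, row);
  return [...byEmail.values()];
}

// Split a deduplicated batch into sub-batches of 50 for the upsert.
function subBatches<T>(rows: T[], size = 50): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) out.push(rows.slice(i, i + size));
  return out;
}
```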

Cross-chunk email deduplication

Each chunk runs as a separate Inngest step, so in-memory deduplication only catches duplicates within a single chunk. If the same email appears in chunk 0 (rows 1–500) and chunk 3 (rows 1501–2000), the second chunk’s upsert would hit ON CONFLICT DO UPDATE in Postgres. The xmax = 0 check used to distinguish inserts from updates would then report it as an “update” — inflating the updated count even on a first-ever import. To prevent this, the job maintains a Redis set (import:{importId}:seen-emails) that tracks every email processed across all chunks:
  1. Before upserting, the batch checks SMISMEMBER against the set to identify emails already handled by earlier chunks
  2. Those emails are filtered out — they won’t be sent to the database at all
  3. After upserting, newly processed emails are added to the set via SADD
  4. The set uses the same 1-hour TTL as the chunk data and is cleaned up in both the finalize step and the error handler
import:{importId}:seen-emails  →  Redis SET of all processed emails  (TTL: 1 hour)
This ensures that inserted vs updated counts are accurate regardless of how duplicates are distributed across chunks.
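
The filtering step can be sketched with an in-memory Set standing in for the Redis seen-emails set (the real job uses SMISMEMBER to check and SADD to record). The function name is illustrative.

```typescript
// Partition a batch's emails into fresh (never seen by any earlier chunk)
// and skipped (already handled), then record the fresh ones as seen.
// In the real job, recording happens via SADD after the upsert succeeds.
function filterSeen(
  emails: string[],
  seen: Set<string>,
): { fresh: string[]; skipped: string[] } {
  const fresh: string[] = [];
  const skipped: string[] = [];
  for (const email of emails) {
    (seen.has(email) ? skipped : fresh).push(email);
  }
  for (const email of fresh) seen.add(email);
  return { fresh, skipped };
}
```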

AI summary

The finalize step asks Claude Haiku to generate a one-sentence summary of the import results. This call has a 10-second timeout via AbortSignal.timeout(10_000). If it fails for any reason — API error, timeout, unexpected response — a plain fallback string is used instead:
Import completed: 1,234 contacts added, 56 updated, 3 failed.
The AI failure never blocks the import from completing.
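
The fallback string shown above can be produced deterministically. This sketch pins the locale to en-US for the thousands separators; the function name and exact formatting approach are assumptions.

```typescript
// Build the plain fallback summary used when the Claude Haiku call fails.
function fallbackSummary(inserted: number, updated: number, failed: number): string {
  const fmt = new Intl.NumberFormat("en-US"); // "1234" -> "1,234"
  return `Import completed: ${fmt.format(inserted)} contacts added, ` +
    `${fmt.format(updated)} updated, ${fmt.format(failed)} failed.`;
}
```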

Error handling

The entire job is wrapped in a try/catch. On failure:
  1. The import status is set to failed with the error message in error_log
  2. Stats caches are invalidated
  3. A best-effort failure notification email is sent (wrapped in its own try/catch so email errors don’t mask the original)
  4. The error is re-thrown so Inngest can track it
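
The shape of that handler can be sketched with injected callbacks standing in for the real repositories and email service (all names here are assumptions):

```typescript
// Mark the import failed, invalidate caches, send a best-effort email,
// then re-throw so Inngest records the failure.
async function handleJobFailure(
  error: Error,
  deps: {
    markFailed: (message: string) => Promise<void>;
    invalidateCaches: () => Promise<void>;
    sendFailureEmail: () => Promise<void>;
  },
): Promise<never> {
  await deps.markFailed(error.message);
  await deps.invalidateCaches();
  try {
    await deps.sendFailureEmail();
  } catch {
    // Best-effort: an email failure must not mask the original error.
  }
  throw error;
}
```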

process-export

Trigger: export/csv.confirmed
Retries: 2
Source: apps/api/src/jobs/process-export.ts

Exports contacts to a CSV file in Vercel Blob storage. Supports optional tag filtering — only export contacts with specific tags.

Why streaming?

Exports can be large. A workspace with 200K contacts would produce a CSV file that’s too big to build in memory. Instead, the job streams contacts in cursor-based batches directly into a CSV stream that pipes to Vercel Blob.

Steps

| Step | What it does |
| --- | --- |
| resolve-tag-filter | If tagIds are provided, fetches the matching contact IDs. Returns null if no filter. |
| upload-empty | Short-circuit: if the tag filter matched zero contacts, uploads an empty CSV and marks the export as completed. |
| resolve-export-metadata | Discovers all distinct custom field keys across the contacts being exported, and counts total rows. Both queries run in parallel. |
| stream-export | Fetches contacts in cursor-based batches of 1,000, pipes them through csv-stringify’s streaming API, and uploads directly to Vercel Blob. |
| send-notification | Checks workspace settings for notify_on_export_complete. If enabled, looks up the user’s email via Supabase and sends a completion email. |

Two-pass approach

The export needs to know all custom field keys before it can write the CSV header row. That’s why resolve-export-metadata runs first — it discovers the full set of custom field keys across all contacts being exported, so the CSV header includes every possible column. The actual data streaming is a single pass: contacts are fetched in batches of 1,000 using cursor-based pagination, transformed to CSV rows, and piped directly to blob storage. No intermediate file is created.
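
The cursor-based batching can be sketched as a generator over an in-memory array standing in for the real paginated query. Names and the numeric id cursor are illustrative assumptions.

```typescript
type Contact = { id: number; email: string };

// Yield batches of up to `batchSize` rows, advancing a keyset cursor past
// the last row of each batch. A short batch means we've reached the end.
function* cursorBatches(
  contacts: Contact[],
  batchSize: number,
): Generator<Contact[]> {
  let cursor = 0; // "rows with id greater than cursor"
  while (true) {
    const batch = contacts.filter((c) => c.id > cursor).slice(0, batchSize);
    if (batch.length === 0) return;
    yield batch;
    cursor = batch[batch.length - 1].id;
    if (batch.length < batchSize) return;
  }
}
```

Unlike OFFSET-based paging, the cursor makes each batch query cheap and stable even while the table is large.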

Error handling

On failure, the export status is set to failed with the error message, and the error is re-thrown for Inngest tracking.

send-welcome

Trigger: user/signed.up
Retries: 2
Source: apps/api/src/jobs/send-welcome.ts

The simplest job. Sends a welcome email to a new user after they confirm their email address.

Steps

| Step | What it does |
| --- | --- |
| send-welcome | Calls emailService.sendWelcome() with the user’s email. Throws if the email service returns success: false. |

Event data

{
  data: {
    email: string;   // User's email address
    userId: string;  // Supabase user ID
  }
}
The event is sent from the auth webhook handler when a user confirms their email. The job doesn’t need workspaceId — it only needs the email address to send the welcome message.
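
A narrow runtime guard for the payload above might look like this. It is a sketch; the real handler may instead rely on Inngest's typed event schemas.

```typescript
// Check that an unknown value matches the user/signed.up event data shape.
function isWelcomeEventData(data: unknown): data is { email: string; userId: string } {
  return (
    typeof data === "object" && data !== null &&
    typeof (data as Record<string, unknown>).email === "string" &&
    typeof (data as Record<string, unknown>).userId === "string"
  );
}
```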

Why a background job?

Email delivery is a side effect that shouldn’t block the auth webhook response. If Resend is slow or temporarily down, the webhook would time out. Running it as a background job with 2 retries means the welcome email will be delivered even if the first attempt fails.

weekly-digest

Trigger: Cron — 0 8 * * 1 (every Monday at 8:00 AM UTC)
Retries: 1
Source: apps/api/src/jobs/weekly-digest.ts

Sends a weekly activity summary email to each workspace owner. Only sends if there was activity during the past week.

Steps

| Step | What it does |
| --- | --- |
| get-workspaces | Fetches all active workspace IDs from the database. |
| digest-{workspaceId} | For each workspace: checks if digest is enabled in settings, gathers stats for the past 7 days, and sends the email if there was activity. |

Per-workspace logic

Each workspace gets its own step (so a failure in one workspace doesn’t block others). Inside the step:
  1. Check settings — if notify_weekly_digest is false, skip this workspace
  2. Look up user email — via supabaseAdmin.auth.admin.getUserById(workspaceId)
  3. Gather stats — four queries run in parallel:
    • Total contacts in the workspace
    • New contacts added in the last 7 days
    • Imports completed in the last 7 days
    • Top 3 email domains from new contacts
  4. Send or skip — if newThisWeek > 0, send the digest email. Otherwise, skip (no point emailing “nothing happened this week”)
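
The "top 3 email domains" stat can be expressed as a pure function. This is illustrative only; the real job may compute it in SQL rather than in application code.

```typescript
// Count domains across the new contacts' emails and return the most
// frequent ones, capped at `limit`.
function topDomains(emails: string[], limit = 3): string[] {
  const counts = new Map<string, number>();
  for (const email of emails) {
    const domain = email.split("@")[1]?.toLowerCase();
    if (!domain) continue; // skip malformed addresses
    counts.set(domain, (counts.get(domain) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([domain]) => domain);
}
```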

Error isolation

Each workspace step has its own try/catch. If gathering stats or sending the email fails for one workspace, the error is logged and the job continues to the next workspace. This prevents one broken workspace from blocking digests for everyone else.
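
The isolation pattern can be sketched with a callback standing in for the per-workspace work (the function and callback names are assumptions):

```typescript
// Process each workspace in its own try/catch so one failure doesn't stop
// the loop; the real job logs the error before moving on.
async function runDigests(
  workspaceIds: string[],
  sendDigest: (id: string) => Promise<void>,
): Promise<{ sent: string[]; failed: string[] }> {
  const sent: string[] = [];
  const failed: string[] = [];
  for (const id of workspaceIds) {
    try {
      await sendDigest(id);
      sent.push(id);
    } catch {
      failed.push(id);
    }
  }
  return { sent, failed };
}
```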

Quick Reference

| Job | Event / Trigger | Retries | Key pattern |
| --- | --- | --- | --- |
| process-csv | import/csv.confirmed | 3 | Chunked step processing (500 rows/chunk) |
| process-export | export/csv.confirmed | 2 | Cursor-based streaming to blob |
| send-welcome | user/signed.up | 2 | Single step, simple email send |
| weekly-digest | Cron: Monday 8am UTC | 1 | Per-workspace step isolation |