Scope of ownership
Huy works in staging only and does not touch the production environment. For every task below, Huy implements and verifies the change on staging and produces a written handover guideline. Quang owns production and applies the change there by following the handover doc. This split applies to all three tasks — staging implementation + guideline by Huy, production execution by Quang.
Huy — staging
- Implement code changes
- Apply on Azure App Service staging
- Verify & smoke-test
- Write the handover guideline
Quang — production
- Read the handover guideline
- Provision / configure production infra
- Execute the migration / swap in prod
- Own ongoing production operation
Summary
| # | Task | Outcome | Estimate |
|---|---|---|---|
| T1 | Postgres on staging + production handover guide | Staging matches prod stack; Quang has a ready-to-follow checklist | ~10h |
| T2 | Master-data GET APIs: replace Payload hydration step + unit tests & deploy gate | Removes ORM hydration overhead; fixes random 500s; tests block any deploy on failure | ~12h |
| T3 | Azure deployment-slot swap research | Two scenarios — see §T3 | 4h – ~14h |
Estimates are for a single engineer. T3 starts with a 4h research spike whose outcome decides whether the implementation half is needed.
T1. PostgreSQL on staging + production handover guide
Outcome
Staging runs on Postgres so every change is validated against the production engine. Huy does all the work on staging only and produces a written procedure; Quang later applies that procedure in production. Huy does not log into the production database or production servers.
Work items
- Swap the database adapter in the app from SQLite to Postgres.
- Generate an initial Postgres schema + migration.
- Port the master-data search backend to Postgres. src/search/searchService.ts (~250 lines) and src/search/sqliteClient.ts are SQLite-specific raw SQL (json_extract, @libsql/client), so they stop working the moment the adapter switches. Rewrite the JSON-path SQL with Postgres JSONB operators and replace the libsql client with a pg wrapper. Without this, the master-data routes are broken on staging.
- Provision a Postgres instance for the staging Azure App Service (e.g. Azure Database for PostgreSQL Flexible Server), configure the App Service's outbound access to it, and set DATABASE_URI with sslmode=require in the staging slot's app settings.
- Run migrations and re-import master data on staging (the existing test/migrate-to-pg.ts script already covers this).
- Smoke-test landing, admin, and the master-data endpoints on staging.
- Produce a written production handover document for Quang covering: Postgres provisioning, network access, env-var configuration, migration command to run, data restore from the existing backup scripts, smoke checklist, and rollback path.
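To make the search-backend port more concrete, here is a minimal sketch of how a SQLite json_extract path could be translated into a Postgres JSONB expression. The helper name jsonPathToPg and the column name used in the usage note are hypothetical and are not taken from searchService.ts. It only handles simple dotted keys and numeric array indexes. Note that json_extract returns typed values while the #>> operator returns text, so numeric comparisons would likely need an explicit cast after the port.

```typescript
// Hypothetical helper (not part of the existing codebase).
// Converts a SQLite JSON path such as "$.meta.codes[0]" into a Postgres
// JSONB text-extraction expression using the #>> operator.
// Assumption: `column` is a trusted identifier chosen by the developer,
// never user input.
function jsonPathToPg(column: string, sqlitePath: string): string {
  if (!sqlitePath.startsWith("$")) {
    throw new Error(`Unsupported JSON path: ${sqlitePath}`);
  }

  const segments: string[] = [];
  // Each segment is either ".key" or "[123]".
  const pattern = /\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]/g;
  let consumed = 1; // skip the leading "$"
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(sqlitePath)) !== null) {
    // Reject paths with characters between two recognised segments.
    if (match.index !== consumed) {
      throw new Error(`Unsupported JSON path: ${sqlitePath}`);
    }
    const segment = match[1] ?? match[2];
    if (segment === undefined) {
      throw new Error(`Unsupported JSON path: ${sqlitePath}`);
    }
    segments.push(segment);
    consumed = pattern.lastIndex;
  }

  // Reject trailing characters and the bare "$" path.
  if (consumed !== sqlitePath.length || segments.length === 0) {
    throw new Error(`Unsupported JSON path: ${sqlitePath}`);
  }

  return `${column} #>> '{${segments.join(",")}}'`;
}
```

For example, jsonPathToPg("data", "$.meta.codes[0]") yields the expression data #>> '{meta,codes,0}', which can replace json_extract(data, '$.meta.codes[0]') in a WHERE clause when a text comparison is acceptable.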
Deliverables
- App running on Postgres in staging.
- Initial Postgres migration committed to the repo.
- docs/postgres-production-handover.md: the production checklist for Quang.
Estimate — Task 1
| Work item | Estimate |
|---|---|
| App-side adapter swap + initial migration | 1.5h |
| Port searchService.ts + sqliteClient.ts to Postgres (JSONB operators, pg client) | 5h |
| Postgres install + configuration on staging | 1.5h |
| Migrate + re-import master data + smoke test | 1h |
| Write production handover doc | 1h |
| Total | ~10h |
T2. Master-data GET APIs: replace the Payload hydration step, with unit tests & deploy gate
Outcome
The expensive part of the master-data search route is not the WHERE/COUNT query (that is already raw SQL in searchService.ts). It is the payload.find({ where: { id: { in: ids } }, depth: 1 }) hydration step in src/app/api/master-data/search/route.ts:55-61, which runs relationship resolution and field hooks for every paged row. Replace that one call with direct SQL hydration so the route does no Payload-ORM work on the hot path. A unit-test suite is added and made a mandatory gate on every deploy, so no path to staging or production can skip the tests.
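As a rough illustration of what direct SQL hydration could look like, the sketch below builds one parameterised query per collection that loads the paged rows together with their depth-1 relations, then restores the page order in application code. Everything named here is an assumption for illustration: the RelationSpec shape, the table and column names, and the idea that each relation is a single foreign key on the base table. The real relation fields come from masterDataCatalog and may need a different join shape.

```typescript
// Hypothetical description of one depth-1 relation on a collection.
interface RelationSpec {
  field: string; // key expected in the API response
  fkColumn: string; // foreign-key column on the base table (assumed)
  targetTable: string; // table of the related collection (assumed)
}

interface HydrationQuery {
  text: string;
  values: unknown[];
}

// Identifiers cannot be bound as parameters, so only plain names are allowed.
function quoteIdent(name: string): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) {
    throw new Error(`Unsafe identifier: ${name}`);
  }
  return `"${name}"`;
}

// Builds a single SELECT that returns the base row and each related row as JSONB.
function buildHydrationQuery(
  table: string,
  relations: RelationSpec[],
  ids: number[],
): HydrationQuery {
  const columns = ["to_jsonb(b) AS doc"];
  const joins: string[] = [];

  relations.forEach((rel, i) => {
    const alias = `r${i}`;
    columns.push(`to_jsonb(${alias}) AS ${quoteIdent(rel.field)}`);
    joins.push(
      `LEFT JOIN ${quoteIdent(rel.targetTable)} ${alias} ` +
        `ON ${alias}.id = b.${quoteIdent(rel.fkColumn)}`,
    );
  });

  const text = [
    `SELECT ${columns.join(", ")}`,
    `FROM ${quoteIdent(table)} b`,
    ...joins,
    "WHERE b.id = ANY($1)",
  ].join("\n");

  // The whole id list is bound as one array parameter.
  return { text, values: [ids] };
}

// Restores the order of the paged ids, since the SQL above does not sort.
function orderByIds<T extends { id: number }>(rows: T[], ids: number[]): T[] {
  const byId = new Map<number, T>();
  for (const row of rows) {
    byId.set(row.id, row);
  }
  const ordered: T[] = [];
  for (const id of ids) {
    const row = byId.get(id);
    if (row !== undefined) {
      ordered.push(row);
    }
  }
  return ordered;
}
```

Keeping the query builder separate from the database client makes it straightforward to unit-test the generated SQL for each of the catalogued collections without a live database.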
Endpoints in scope (read paths only)
- GET /api/master-data/search
- GET /api/master-data/versions
- GET /api/master-data/facet-values
Writes (admin edits, imports) stay on Payload — they need hooks and validation.
Work items
- Replace the payload.find({ where: { id: { in: ids } }, depth: 1 }) hydration step in search/route.ts with direct SQL hydration that loads the row + its depth: 1 relations per collection. There are 18 master-data collections in masterDataCatalog, each with different relation fields, so the hydration logic has to cover all of them.
- Preserve the existing response shape so WPA and the admin UI need no changes.
- Keep the existing auth check on each route.
- Set up a real unit-test layer, which is currently absent. The repo's test/ folder is ad-hoc benchmark scripts; vitest's include points at tests/int/**/*.int.spec.ts, which does not exist yet.
- One spec per route covering: response shape, filter / pagination correctness, auth rejection (401), and input validation (400).
- Add a shared fixture loader that seeds a minimal master-data dataset (one coding system + a handful of rows across the catalogued collections). Payload boot is slow, so the loader needs to run once per suite, not per test.
- Add a mandatory test gate to the Azure App Service deploy pipeline: run pnpm test:int before the build/publish step. A non-zero exit aborts the deploy and the previous slot keeps serving traffic.
- Apply the same gate to the future production deploy path documented in T1.
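The "once per suite, not per test" requirement for the fixture loader can be met with a small memoisation wrapper. The sketch below is one possible shape, not the planned implementation: createFixtureLoader and SeededFixtures are hypothetical names, and the seed function stands in for whatever boots Payload and inserts the minimal dataset.

```typescript
// Hypothetical result of seeding; the real loader would return whatever
// identifiers the route specs need.
interface SeededFixtures {
  codingSystemId: number;
  rowIds: number[];
}

// Wraps an expensive seed function so it runs at most once.
// Every caller receives the same promise, so concurrent callers
// do not trigger a second seed.
function createFixtureLoader(
  seed: () => Promise<SeededFixtures>,
): () => Promise<SeededFixtures> {
  let cached: Promise<SeededFixtures> | undefined;

  return () => {
    if (cached === undefined) {
      // First caller starts the seed; a rejection is also cached,
      // so a failed seed fails every spec that depends on it.
      cached = seed();
    }
    return cached;
  };
}
```

Each spec would then await the loader from a beforeAll hook. Because the cache lives in module scope, this gives one seed per module instance; if the test runner isolates spec files from each other, a global setup step may be needed to seed only once for the whole run.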
Deliverables
- Hydration step in search/route.ts rewritten as direct SQL, covering all 18 catalogued collections; same response contract.
- Unit-test suite for the three master-data routes, runnable via pnpm test:int.
- Updated deploy path that fails fast on a red test run.
Estimate — Task 2
| Work item | Estimate |
|---|---|
| Replace hydration step in search route (joins covering all 18 collections + their relations) | 5h |
| Verify versions + facet-values against the ported search backend (already raw SQL in searchService.ts) | 1h |
| Unit-test scaffold + fixture loader + 3 route specs | 4h |
| Wire test gate into staging deploy | 1h |
| Manual verification on staging | 1h |
| Total | ~12h |
T3. Research: Azure App Service deployment-slot swap feasibility
Outcome
A clear yes/no on whether this app can use Azure deployment-slot swap for near-zero-downtime releases, plus the artifacts needed to act on the answer. As with the other tasks, Huy researches and prepares the app on staging only; the actual swap setup in the production Azure subscription is owned by Quang, following the runbook Huy writes.
Precondition to confirm before starting
Slot swap is an App Service-only feature. Staging already runs on App Service so research can happen there directly, but the earlier CR proposal (CR-001) suggested moving production off App Service onto self-managed VMs. If that direction is final for prod, T3 is moot — confirm the production target (App Service vs VM) before starting Phase 1.
Phase 1 — Research spike (always happens)
- Audit every place the app holds state on local disk (DB file, media uploads, generated caches, sitemap, logs).
- Audit env-var usage to identify which values must be slot-pinned vs. swappable.
- Review startup behavior against App Service's warmup window.
- Produce a written feasibility verdict.
Phase 1 estimate: 4h
Scenario A
Verdict: cannot use swap
Likely blockers we are looking for:
- State on the slot's local disk (DB file, cache).
- Other files the app writes to local disk at runtime.
- Startup that the warmup probe cannot satisfy, for example a long Payload boot that exceeds the swap warmup window.
Deliverable: written feasibility note explaining the blocker(s) and recommending an alternative release strategy (blue/green via two VMs, or controlled restart with brief downtime).
Total: 4h (research only — stop here)
Scenario B
Verdict: can use swap
Phase 2 work items:
- Ensure no runtime-generated files live on the slot's local disk (the app will not be using media uploads).
- Mark slot-pinned env vars as App Service slot settings.
- Add a warmup endpoint that boots Payload + touches the DB once.
- Write a production swap runbook for Quang: create the staging slot, deploy, migrate, warm up, swap, monitor, and how to swap back.
Deliverable: swap-safe code + docs/azure-slot-swap-runbook.md.
Total: ~14h (4h research + ~10h implementation & docs)
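For the warmup endpoint in Scenario B, the core logic could look like the following sketch. It is written against injected dependencies so that it does not assume any particular Payload or database API: bootPayload and pingDatabase are hypothetical callbacks that the real route would supply, and the result shape is illustrative. A successful warm-up is memoised so repeated probes do not hit the database again, while a failed one is retried on the next probe.

```typescript
// Hypothetical dependencies supplied by the real route handler.
interface WarmupDeps {
  bootPayload: () => Promise<unknown>; // would wrap Payload initialisation
  pingDatabase: () => Promise<unknown>; // would run a trivial query
}

interface WarmupResult {
  status: number;
  body: { ready: boolean; error?: string };
}

// Returns a probe function: 200 once Payload has booted and the DB answered,
// 503 while that has not succeeded.
function createWarmupCheck(deps: WarmupDeps): () => Promise<WarmupResult> {
  let warm: Promise<void> | undefined;

  return async () => {
    if (warm === undefined) {
      // Boot first, then touch the database once.
      warm = deps
        .bootPayload()
        .then(() => deps.pingDatabase())
        .then(() => undefined);
    }
    try {
      await warm;
      return { status: 200, body: { ready: true } };
    } catch (error) {
      warm = undefined; // let the next probe retry from scratch
      const message = error instanceof Error ? error.message : String(error);
      return { status: 503, body: { ready: false, error: message } };
    }
  };
}
```

The route would map WarmupResult onto its HTTP response. On the App Service side, the swap warm-up can be pointed at a custom path through the WEBSITE_SWAP_WARMUP_PING_PATH app setting; whether the Payload boot fits inside the warm-up window is exactly what the Phase 1 spike has to confirm.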