Architecture
Project structure, deep link flows, database schema, and dual DB adapter pattern.
Project Structure
deeplink-server/
├── src/
│ ├── index.js # Entry point — starts Express server
│ ├── secrets.js # GCP Secret Manager resolver
│ ├── utils.js # detectPlatform, buildFingerprint, sha256, getClientIp
│ ├── db/
│ │ ├── index.js # DB factory — selects adapter at startup
│ │ ├── sqlite.js # SQLite adapter (node:sqlite, Node 22+)
│ │ └── postgres.js # PostgreSQL adapter (pg.Pool)
│ └── routes/
│ ├── links.js # CRUD links, bulk, templates
│ ├── redirect.js # GET /:slug — smart redirect logic
│ ├── analytics.js # Overview, realtime, per-link, query engine
│ ├── match.js # POST /match — deferred deep link resolution
│ ├── resolve.js # GET /resolve/:tenant/:slug — Universal Link tracking
│ ├── events.js # Custom events + impressions
│ ├── auth.js # Login, refresh, logout, me, change-password
│ ├── users.js # User CRUD (admin only)
│ ├── apikeys.js # API key management
│ ├── tenants.js # Tenant settings + super-admin routes
│ ├── notifications.js # Notifications CRUD
│ ├── audit.js # Audit log
│ ├── privacy.js # GDPR purge + retention
│ ├── dpa.js # DPA template
│ ├── diagnostic.js # Link tester
│ ├── search.js # Global search
│ ├── export.js # GCS / S3 stats export
│ ├── skan.js # SKAdNetwork postbacks
│ ├── banner.js # Smart Banner config + JS
│ ├── funnel.js # Conversion funnel
│ └── wellknown.js # AASA + assetlinks.json
├── public/
│ ├── dashboard.html # SPA dashboard
│ ├── docs/ # Multi-page documentation (this site)
│ └── *.webp / *.svg # Logos + assets
├── schema.sql # PostgreSQL schema (also used for reference)
├── .env.example
├── docker-compose.yml
└── package.json
Direct Deep Link Flow
When a user clicks a short link from a device where the app is installed, the server serves a lightweight HTML page that attempts to open the URI scheme and detects failure via visibilitychange.
Platform Routing
| Platform | App installed | App absent |
|---|---|---|
| iOS | ios_uri_scheme | ios_store_url → web_url |
| Android | android_uri_scheme via intent:// | android_store_url → web_url |
| Desktop | — | web_url |
Deferred Deep Link Flow
The deferred deep link solves the problem of users who don't have the app at the time of the click. The server remembers the intent and returns it to the app at first launch.
Fingerprinting Algorithm
// src/utils.js — buildFingerprint()
function buildFingerprint(ip, userAgent, date) {
const raw = `${ip}:${userAgent}:${date}`; // "1.2.3.4:Mozilla/5.0...:2026-04-11"
return crypto.createHash('sha256').update(raw).digest('hex'); // 64-char hex
}
// Match tries both today and yesterday to handle midnight timezone edge cases
const todayFp = buildFingerprint(ip, ua, todayUTC);
const yesterdayFp= buildFingerprint(ip, ua, yesterdayUTC);
// → LEFT JOIN clicks ON fingerprint IN (todayFp, yesterdayFp)
Health checks
État réel 2026-05-02 : un seul endpoint health-check unifié. Pas de
séparation classique /healthz (liveness) / /readyz
(readiness) / /livez à ce jour. Source :
server.js:289.
Réponse 200 OK (état nominal)
{
"server": "ok",
"uptime": 12345.67, // secondes depuis boot
"database": "ok",
"db_latency_ms": 3, // SELECT 1 round-trip
"background": { ... }, // stats jobs background (cf. bg-stats.js)
"status": "ok" // overall
}
Réponse 503 Service Unavailable (DB down)
{
"server": "ok",
"uptime": ...,
"database": "error",
"db_error": "ECONNREFUSED 127.0.0.1:5432",
"background": { ... },
"status": "degraded"
}
Configuration probes Cloud Run / Kubernetes
Comme un seul endpoint existe, readiness probe = liveness probe = startup probe
pointent tous vers /api/v1/health. Pour un découpage plus fin
(probe rapide sans DB pour liveness), il faudrait créer un /livez
dédié — chantier hors scope DOC-INFRA-04 (doc-only).
# Cloud Run YAML
spec:
template:
spec:
containers:
- name: deeplink-server
startupProbe:
httpGet:
path: /api/v1/health
port: 8080
failureThreshold: 30
periodSeconds: 10
livenessProbe:
httpGet:
path: /api/v1/health
port: 8080
periodSeconds: 30
failureThreshold: 3
# Kubernetes Deployment équivalent
livenessProbe:
httpGet: { path: /api/v1/health, port: 8080 }
initialDelaySeconds: 15
periodSeconds: 30
readinessProbe:
httpGet: { path: /api/v1/health, port: 8080 }
initialDelaySeconds: 5
periodSeconds: 5
Limite — pas de découpage liveness/readiness
Conséquence : si la DB devient indispo, l'instance est marquée NOT READY
ET unhealthy → Cloud Run/K8s la kille au lieu de la sortir du LB. Pour
une app qui peut servir du contenu cached même DB-down, c'est sub-optimal.
Roadmap : créer GET /livez (process-only, ignore DB)
+ GET /readyz (vérifie aussi quotas, secrets) en complément
de /api/v1/health.
Runbooks opérationnels
Procédures step-by-step pour les incidents/maintenance courantes. Hébergés
dans pilotage/runbooks/ du repo (Markdown). Consultables via
GitHub directement — pas d'inclusion HTML pour ne pas faire un build
pipeline juste pour ça.
| Runbook | Cas d'usage | Lien GitHub |
|---|---|---|
db-down.md |
Cloud SQL injoignable (maintenance, incident GCP, exhaustion connexions) | → ouvrir |
incident-securite.md |
Procédure NIST 800-61 : containment → eradication → recovery → post-mortem (cf. SEC-024) | → ouvrir |
scheduler-stuck.md |
Le cron scheduler.js n'avance plus / doublons de jobs / lock orphelin |
→ ouvrir |
secret-rotation.md |
Rotation manuelle d'un secret GCP Secret Manager (JWT, OAuth tokens, SA keys) | → ouvrir |
Cf. pilotage/runbooks/ pour la liste complète (s'enrichit régulièrement).
Database Schema
14 tables, compatible with both SQLite and PostgreSQL. The adapter layer handles dialect differences automatically.
Core Tables
| Table | Purpose | Key Columns |
|---|---|---|
| links | Short links with all routing rules | id, slug (UNIQUE), ios_uri_scheme, android_uri_scheme, web_url, custom_data (JSON), targeting_rules (JSON), expires_at, max_clicks, active |
| clicks | Every click event (+ deferred installs) | link_id (FK → links), platform, ip_hash, fingerprint (indexed), country, matched, source, created_at |
| events | Custom SDK events (purchase, signup…) | link_id (FK), name, properties (JSON), revenue, currency, created_at |
| impressions | View-through attribution | link_id (FK), placement, ip_hash, fingerprint, created_at |
Auth & Multi-Tenancy Tables
| Table | Purpose | Key Columns |
|---|---|---|
| tenants | Tenant isolation | slug (UNIQUE), name, plan, active, settings (JSON) |
| users | Dashboard users (local + OIDC) | email, password_hash, role (admin|user), oidc_sub, disabled, tenant_id |
| api_keys | API key credentials | name, key_hash, key_prefix, scope (full|read), tenant_id |
| sessions | JWT deny list (logout) | jti, expires_at |
Feature Tables
| Table | Purpose | Key Columns |
|---|---|---|
| templates | Link preset templates | name, description, preset (JSON), tenant_id |
| notifications | In-app notifications | type, message, read, user_id, created_at |
| audit_logs | Mutation journal | action, resource_type, resource_id, user_id, details (JSON), created_at |
| skan_postbacks | SKAdNetwork postbacks | version, campaign_id, conversion_value, transaction_id (UNIQUE) |
| settings | Tenant key-value config (export creds, etc.) | key, value (encrypted at rest), tenant_id |
Key Indexes
-- Performance-critical indexes
CREATE INDEX idx_links_slug ON links(slug);
CREATE INDEX idx_links_active ON links(active);
CREATE INDEX idx_clicks_fingerprint ON clicks(fingerprint); -- deferred match
CREATE INDEX idx_clicks_link_id ON clicks(link_id);
CREATE INDEX idx_clicks_created_at ON clicks(created_at); -- analytics queries
CREATE INDEX idx_events_link_id ON events(link_id);
Dual DB Adapter
The module src/db/index.js automatically selects the adapter at startup based on DATABASE_URL. Routes use an identical interface regardless of the backend.
Unified Interface
// Same call in any route, regardless of DB backend:
const link = await db.queryOne(
'SELECT * FROM links WHERE slug = ?',
[slug]
);
await db.run(
'UPDATE links SET active = ? WHERE id = ?',
[0, link.id]
);
const rows = await db.query(
`SELECT ${db.sqlDate('created_at')} as date, COUNT(*) as clicks
FROM clicks WHERE link_id = ?
GROUP BY date ORDER BY date`,
[link.id]
);
db.sqlDate(col) is the only dialect-specific method. It generates strftime('%Y-%m-%d', col) for SQLite and TO_CHAR(col AT TIME ZONE 'UTC', 'YYYY-MM-DD') for PostgreSQL. All other SQL is portable between the two databases.GCP Secret Manager Integration
Any environment variable can be replaced with a gcp-sm:// reference. The server resolves all references at startup before any routes are initialized.
Reference Syntax
| Form | Example | Effect |
|---|---|---|
| Short | gcp-sm://secret-name | Latest version from $GCP_PROJECT_ID |
| Short + version | gcp-sm://secret-name@3 | Specific version — useful for pinning |
| Full resource path | gcp-sm://projects/my-proj/secrets/name/versions/latest | Explicit resource (ignores $GCP_PROJECT_ID) |
gcp-sm://... as an API key.Stats Export (GCS / S3)
The server can export the links and clicks tables to Google Cloud Storage and/or Amazon S3, triggered by an external cron calling POST /api/v1/export/run.
Export Formats
| Format | Files | Use Case |
|---|---|---|
| JSON | export-HHMMSS.json — single file with { exported_at, links, clicks } |
BigQuery, Snowflake, programmatic ingestion |
| CSV | links-HHMMSS.csv + clicks-HHMMSS.csv |
Athena, DuckDB, Excel |
Migrations history (recent)
Numbered, idempotent migrations in src/migrations.js applied at boot. Table _migrations tracks applied IDs. Never DROP COLUMN/DROP TABLE without explicit spec.
| # | Date | Chantier | Description |
|---|---|---|---|
| #60 | 2026-04-21 | FNL-UX | Tables funnels (saved + shared) + funnel_steps + index tenant_id/shared_with |
| #61 | 2026-04-23 | PAYOUT | Tables payout_batches, payout_items + columns stripe_transfer_id, payout_status on referrals |
| #62 | 2026-04-25 | AD-SYNC | Tables ad_accounts (Google Ads, TikTok, Meta) + ad_costs_daily (pre-aggregated rollup) |
| #63 | 2026-04-26 | TENANT-BACKFILL | Backfill tenant_settings for legacy tenants (pre-new-format) |
| #64 | 2026-04-28 | OG-AUTO-FETCH | Columns og_title_auto, og_description_auto, og_image_auto, og_fetched_at on links |
| #65 | 2026-04-30 | MSG-RE | Rename slack_url → slack_webhook_url + teams_url → teams_webhook_url (redact consistency) |
| #66 | 2026-05-01 | IDG-LIVE | Indexes events(visitor_id, timestamp) + events(device_anchor_id, timestamp) for live identity resolve |
Backend Jobs & Scheduled Tasks
Cloud Run scales to 0 and restarts instances: a naive setInterval only runs when the instance is alive (or runs multiple times if multiple instances). DeepLink uses a dual strategy:
- Prod (Cloud Run + Cloud Scheduler) — Cloud Scheduler calls
POST /api/v1/internal/tickevery minute (shared secret token). One execution guaranteed regardless of instance count. - Dev / self-hosted — if
CLOUD_SCHEDULER_TICK_TOKENis absent, server starts a localsetInterval(tick, 60_000). Simple, no external dependency.
Internal tick endpoint
- Route:
POST /api/v1/internal/tick(headerX-Tick-Token: <CLOUD_SCHEDULER_TICK_TOKEN>) - Security: 401 if invalid token. Cloud Scheduler configured with OIDC or shared secret.
- Action: invokes
runScheduledJobs()which dispatches to all jobs ready (next_run_at <= NOW()).
Distributed lock (anti-collision multi-instance)
-- Pseudo-code of the wrapper running each job
BEGIN;
SELECT * FROM scheduled_jobs
WHERE name = 'identity-merge'
AND next_run_at <= NOW()
FOR UPDATE SKIP LOCKED;
-- If row is locked by another instance → SKIP, do nothing
-- Otherwise: execute the job, then:
UPDATE scheduled_jobs
SET last_run_at = NOW(),
next_run_at = NOW() + interval '1 hour',
last_status = 'success'
WHERE id = $1;
COMMIT;
SELECT FOR UPDATE SKIP LOCKED is the standard Postgres pattern for distributed locks without Redis.
Jobs runtime
| Job | File | Frequency | Role |
|---|---|---|---|
identity-merge-job | src/jobs/identity-merge.js | hourly | Nightly merge of identity_aliases: for each (visitor_id ↔ user_id) alias since last run, propagates user_id to all events sharing the visitor_id (retro-attribution). See spec IDG-LIVE for live version. |
onboarding-reminder | src/jobs/onboarding-reminder.js | every 6h | Detects invited users who haven't completed onboarding for ≥ 48h, sends reminder email (max 2 reminders per user). |
webhook-retry | src/jobs/webhook-retry.js | every minute | Retries webhook deliveries with next_retry_at <= NOW(). Backoff: 1m/5m/30m/2h/12h. Marks abandoned after 5 fails. |
scheduled-reports-runner | src/jobs/scheduled-reports.js | hourly | Detects scheduled_reports due (cron parsed), generates PDF/CSV via services/report-generator.js, sends by email/Slack. |
audit-log-cleanup | src/jobs/audit-cleanup.js | daily 03h UTC | Deletes audit logs > tenant retention (default 90d, configurable via audit_retention_days setting). |
metrics-aggregator | src/jobs/metrics-aggregator.js | every 15min | Pre-aggregates raw events into events_daily (rollup for dashboard) — reduces analytics query cost. |
Cloud Scheduler setup (prod)
gcloud scheduler jobs create http deeplink-tick \
--location=europe-west1 \
--schedule="* * * * *" \
--uri="https://api.deeplink.com/api/v1/internal/tick" \
--http-method=POST \
--headers="X-Tick-Token=${CLOUD_SCHEDULER_TICK_TOKEN}" \
--attempt-deadline=30s
Manual admin trigger
- Endpoint:
POST /api/v1/admin/jobs/run/:job_name(super-admin) - Dashboard page:
Settings → Background Jobs— table jobs withname | last_run | next_run | last_status | last_duration_ms+ "Run now" button per row - Use case: force re-run after bug fix, debug staging, post-restore DB
Monitoring
- Each job logs
{ severity: "INFO", job_name, duration_ms, status, items_processed }→ Cloud Logging - Metrics exported to Cloud Monitoring:
deeplink/job/duration_ms,deeplink/job/last_status(gauge 0=fail, 1=success) - Recommended alert:
last_status == 0 for > 30minorduration_ms > p95 historical × 3
Identity lifecycle — chronology of attribution
How DeepLink reconciles a user's actions across multiple sessions, devices, and identification states. Concrete walkthrough with "Alice" example: clicked an ad on day J-3, opened the app, signed up on day J-1, purchased on day J0 — how do we attribute the purchase to the original click?
Identity cascade (priority order)
user_id > device_anchor_id > sdk_instance_id > visitor_id > fingerprint > ip_hash
↓ stronger ID wins. COALESCE() over the cascade in queries.
| ID | Source | Persistence | Cross-device? |
|---|---|---|---|
user_id | Set by identify() SDK call (your DB user PK) | Forever | ✅ Yes |
device_anchor_id | Mobile SDK — IDFV (iOS), Android ID hashed | Until app uninstall | ❌ No (per device) |
sdk_instance_id | UUID generated at SDK init | Per app install (resets on reinstall) | ❌ No |
visitor_id | Web cookie / localStorage | 1 year (cookie default) | ❌ No (per browser) |
fingerprint | SHA-256(IP + UA family + date) | Probabilistic, 2-day window | ~ partial |
ip_hash | SHA-256(IP) | Per request | ~ very partial |
Alice's chronology
J-3 Alice clicks an ad on Facebook (mobile web, anonymous)
→ click row: { visitor_id: V1, fingerprint: F1, user_id: NULL }
J-3 Alice opens the app for the first time (deferred deep link)
→ SDK calls /resolve with fingerprint
→ server matches click via fingerprint → app_open event
→ app_open row: { sdk_instance_id: S1, device_anchor_id: D1,
visitor_id: V1 (carried over), user_id: NULL }
→ identity_alias created: V1 ↔ S1 ↔ D1
J-1 Alice signs up — SDK identify("alice@acme.com" → user_id=U42)
→ all subsequent events tagged user_id=U42
→ identity_alias updated: V1 ↔ S1 ↔ D1 ↔ U42
J0 Alice purchases — purchase event with user_id=U42
→ purchase row: { user_id: U42, sdk_instance_id: S1 }
Attribution query: "which click drove this purchase?"
→ resolve U42 backwards through identity_aliases
→ V1 → fingerprint F1 → original click (J-3, Facebook ad)
→ ✅ purchase attributed to that click 3 days later
Live vs nightly resolve
- Nightly merge job (
identity-merge-job) — cascades alias creation to historical events. Latency: up to 24h. - Live resolve (IDG-LIVE chantier, in progress) — funnel/cohort/LTV routes call
identity-graph.jsin-flight, no delay. Cf.specs/active/identity-graph-live-resolve.md.
Service domain-resolver.js
Express middleware mounted very early in the pipeline (server.js), resolves req.headers.host → tenant_id for tenants with a custom domain configured (e.g. links.acme.com → tenant_id=acme).
Mechanics
- On every incoming request, reads
req.headers.host(lowercase, strip port) - If matches server's
BASE_URL→ bypass (admin/global dashboard route) - Otherwise: lookup LRU cache (max 500 entries, TTL 60s)
- Cache miss → query
SELECT id FROM tenants WHERE custom_domain = $1 LIMIT 1 - Found →
req.tenant_idinjected + cache populated. Not found → negative cache TTL 30s + 404 if route requires tenant
Cache invalidation
No explicit event — the 60s TTL covers normal latency of a custom_domain change. For immediate purge (rare), admin endpoint POST /api/v1/admin/cache/domain-resolver/flush (super-admin).
Logging
INFO log on each resolution with { host, tenant_id, cache_hit }. Cloud Monitoring counter deeplink/domain-resolver/lookups{cache_hit=true|false} to track hit ratio (target > 95%).
Service cloud-logging-reader.js
Wrapper of @google-cloud/logging SDK enabling super-admins to consult prod logs from the dashboard without GCP console access.
Implementation
- Lib:
const { Logging } = require('@google-cloud/logging') - Lazy init on first call — no boot-time connection (logs aren't hot path)
- Auth: ADC (Application Default Credentials) — on Cloud Run, the revision's SA is used directly
- Project ID:
process.env.GOOGLE_CLOUD_PROJECT(mandatory if viewer enabled)
Main method
async function readLogs({ severity, minutes, query, limit = 200, pageToken = null }) {
const filter = [
`resource.type="cloud_run_revision"`,
`timestamp >= "${isoNow(-minutes * 60_000)}"`,
severity ? `severity = ${severity}` : null,
query || null,
].filter(Boolean).join(' AND ');
const [entries, nextPageToken] = await logging.getEntries({
filter, pageSize: limit, pageToken, orderBy: 'timestamp desc'
});
return { entries: entries.map(serializeEntry), pageToken: nextPageToken };
}
Serialization
Strips verbose GCP metadata, keeps { timestamp, severity, jsonPayload, httpRequest, resource.labels } to limit response size.
Throttle & cost
Dedicated rate-limit wrapper: max 1 req/sec/user. Cloud Logging bills reads at $0.50/GB scanned — the cap on limit ≤ 1000 + minutes ≤ 1440 protects against runaway costs.
Dev fallback
If GOOGLE_CLOUD_PROJECT is absent → endpoint returns 503 { error: "cloud_logging_not_configured" } rather than crashing. Local dev uses the in-memory ring buffer of src/logger.js.
URL validator & safe-fetch (SSRF protection)
All outbound HTTP from the server (webhooks, OpenGraph fetch, ad-platforms) goes through a hardened wrapper that blocks SSRF attacks (Server-Side Request Forgery against internal infrastructure).
src/utils/url-validator.js
Schema validation applied via zod refines on user-supplied URL fields (web_url, *_store_url, image_url, og_image).
- Allowed schemes:
httpsstrict in prod,httpdev-only - Blocked IPs (RFC 1918 private):
10.*,172.16-31.*,192.168.* - Blocked localhost:
127.*,::1,0.0.0.0 - Blocked cloud metadata:
169.254.169.254(AWS/GCP metadata service),metadata.google.internal - Image extension whitelist (on
image_url/og_image):png,jpg,jpeg,gif,webp,svg
src/utils/safe-fetch.js
Wrapper around fetch() for all outbound tenant calls. Re-validates each redirect (3xx) against URL validator (no bypass through redirect chain).
- Cap response body: 5 MB
- Timeout: 10s
- Max redirects: 5
- Content-type check: rejects if expected image, returned HTML
- No-bypass enforcement: raw
fetch()banned insrc/services/via ESLintno-restricted-imports
Architecture diagrams (ASCII)
Request lifecycle (Cloud Run)
┌─────────────────┐
Internet ────────▶│ Cloud Load │
│ Balancer │
└────────┬────────┘
│ HTTPS (TLS termination)
┌────────▼────────┐
│ Cloud Run │ N instances (autoscale 0→100)
│ ┌───────────┐ │
│ │ Express │ │
│ └─────┬─────┘ │
└────────┼────────┘
│
┌───────────┬─────────┼─────────┬────────────┐
▼ ▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐
│ Cloud │ │ Postgres│ │ GCS │ │ Pub/Sub│ │ Cloud │
│ SQL │ │ (primary│ │bucket │ │webhooks│ │ Scheduler│
│ │ │ DB) │ │exports │ │delivery│ │ tick 1/min│
└────────┘ └────────┘ └────────┘ └────────┘ └──────────┘
Multi-tenant routing
HTTP request → host header
│
├── BASE_URL (api.deeplink.com)
│ └── routes /api/v1/* → JWT extracts tenant_id from claim
│
└── custom domain (links.acme.com)
└── domain-resolver.js → SELECT tenants WHERE custom_domain=$1
└── inject req.tenant_id → routes / → public redirect
Identity resolve (live, IDG-LIVE)
Funnel/cohort/LTV query
│
▼
identity-graph.js resolveIdentities(events)
│
├── extract distinct visitor_id, sdk_instance_id, device_anchor_id
├── SELECT identity_aliases WHERE any_id IN (...)
├── BFS over the graph, build COALESCE map
│
▼
events with rewritten user_id (where alias known)
│
▼
GROUP BY user_id, dimensions → KPIs
Distributed lock for jobs
Instance A Instance B
│ │
├── tick @ T0 ├── tick @ T0
│ │
├── BEGIN ├── BEGIN
├── SELECT FOR UPDATE ├── SELECT FOR UPDATE
│ SKIP LOCKED ✅ │ SKIP LOCKED → SKIPPED ❌
│ │
├── run identity-merge └── COMMIT (no-op)
├── UPDATE next_run_at
└── COMMIT