Designing a Payment Service Under PCI DSS Constraints
This is microservice 4 of 8 in our digital wallet platform. Unlike Wallet, Transaction, or Ledger services, Payment Service operates under a fundamentally different constraint: PCI DSS compliance. That single requirement shaped every design decision.
The Core Constraint: Card Data Isolation
Payment Service is the only component that ever touches tokenised card data. Raw card data (PAN, CVV) never enters the service, not even in logs. The client submits card details directly to the Stripe/Razorpay SDK, which returns an opaque token. Payment Service stores that token and nothing more.
Every other service in the platform only sees a paymentId (our UUID). This hard boundary means Payment Service has a stricter security perimeter than anything else in the system.
Two Flows, One Outcome
Card top-ups follow two fundamentally different paths that converge at the same point.
Synchronous (no 3DS):
Client → POST /payments/topup (with token)
→ Fraud pre-check (BIN, velocity, device trust)
→ Persist PENDING + idempotency key
→ Charge gateway → receive charge ID
→ Mark CAPTURED → publish PAYMENT_CAPTURED
Async (3DS required):
Client → POST /payments/topup
→ Gateway returns 3DS redirect URL
→ Record enters PENDING_3DS, URL returned to client
→ Client completes bank auth in browser
→ Gateway posts webhook → signature verified
→ Mark CAPTURED → publish PAYMENT_CAPTURED
→ Scheduler expires PENDING_3DS older than 15 min
Both paths end with the same Kafka event: PAYMENT_CAPTURED on the payment.events topic, which triggers Wallet Service to credit the balance.
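The convergence of the two paths can be sketched in a few lines. This is a minimal illustration, not the production handler: `PaymentService`, `GatewayResult`, the `gateway.charge` signature, the in-memory dict standing in for the payments table, and the `publish` callable standing in for a Kafka producer are all assumptions made for the example. It also assumes the gateway hands back a charge ID alongside the 3DS redirect URL, which real gateways may model differently.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewayResult:
    """Hypothetical gateway response; real SDK objects look different."""
    charge_id: str
    redirect_url: Optional[str] = None  # set only when 3DS is required


@dataclass
class Payment:
    payment_id: str
    idempotency_key: str
    status: str = "PENDING"
    charge_id: Optional[str] = None


class PaymentService:
    def __init__(self, gateway, publish):
        self.gateway = gateway  # object with charge(token, amount) -> GatewayResult
        self.publish = publish  # callable(topic, event), stand-in for a Kafka producer
        self.by_key = {}        # in-memory stand-in for the payments table

    def topup(self, token, amount, idempotency_key):
        # A replayed request returns the existing record instead of charging again.
        if idempotency_key in self.by_key:
            return self.by_key[idempotency_key], None

        # The fraud pre-check (BIN, velocity, device trust) would run here.
        payment = Payment(str(uuid.uuid4()), idempotency_key)
        self.by_key[idempotency_key] = payment  # persist PENDING before charging

        result = self.gateway.charge(token, amount)
        payment.charge_id = result.charge_id
        if result.redirect_url:
            # Async path: park the record and hand the URL back to the client.
            payment.status = "PENDING_3DS"
            return payment, result.redirect_url
        self._capture(payment)  # sync path
        return payment, None

    def on_webhook_captured(self, charge_id):
        # Assumed to be called only after the webhook signature was verified.
        for payment in self.by_key.values():
            if payment.charge_id == charge_id and payment.status == "PENDING_3DS":
                self._capture(payment)
                return payment
        return None

    def _capture(self, payment):
        # Single exit point shared by both flows.
        payment.status = "CAPTURED"
        self.publish(
            "payment.events",
            {"type": "PAYMENT_CAPTURED", "paymentId": payment.payment_id},
        )
```

Routing both paths through one `_capture` method is what keeps the downstream contract identical: Wallet Service cannot tell which flow produced the event.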
Webhook Idempotency at the DB Level
Stripe and Razorpay can deliver the same webhook more than once. Between network hiccups and their own retry policies, you will see duplicates. Our solution:
ALTER TABLE payments
ADD COLUMN gateway_event_id VARCHAR(255) UNIQUE;
When the second delivery hits the UNIQUE constraint, we return 200 OK to the gateway and skip processing. No application-level dedup logic, no Redis locks: the database enforces it.
We also return 200 to the gateway immediately and process the event asynchronously. Gateways retry aggressively on non-200 responses, so slow processing must never block the HTTP response.
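As a rough sketch of the dedup behaviour, the snippet below uses Python's built-in `sqlite3` in place of MySQL so that it runs anywhere. It is a simplified variation on the approach described here: the event ID is inserted into a hypothetical `webhook_events` table rather than stored on `payments`, because an INSERT is the simplest way to show a UNIQUE violation. The `handle_webhook` function and the `process` callback are illustrative names, not part of the service.

```python
import sqlite3


def make_db():
    # In-memory SQLite stands in for MySQL 8 in this sketch.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE webhook_events (gateway_event_id TEXT NOT NULL UNIQUE)"
    )
    return db


def handle_webhook(db, event_id, process):
    """Return the HTTP status code to send back to the gateway."""
    try:
        with db:  # commits on success, rolls back if the INSERT fails
            db.execute(
                "INSERT INTO webhook_events (gateway_event_id) VALUES (?)",
                (event_id,),
            )
    except sqlite3.IntegrityError:
        # Duplicate delivery: acknowledge it so the gateway stops retrying.
        return 200
    # First delivery. A real handler would enqueue this for a background
    # worker and return at once; it is called inline here for brevity.
    process(event_id)
    return 200
```

Both the first and the duplicate delivery get a 200, but `process` runs only once per event ID.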
What Payment Service Does NOT Do
This was equally important to define:
- No Kafka consumption. Unlike Transaction Service which consumes wallet events, Payment Service is purely reactive to HTTP — client requests and webhook posts. It only publishes.
- No Cassandra. That's Ledger Service's concern. We use MySQL 8 with three tables: payments, payment_methods, and withdrawal_requests.
- No raw card storage. The payment_methods table stores userId, gateway, tokenRef, and card metadata (last4, brand, expiry). That's it.
The DB Schema
payments: one row per payment attempt. Status machine: PENDING → CAPTURED (no 3DS) or PENDING → PENDING_3DS → CAPTURED (3DS required); an attempt can also end in FAILED or EXPIRED.
payment_methods: tokenised cards and bank accounts per user. Soft-deleted, never hard-deleted (audit trail).
withdrawal_requests: tracks NEFT/IMPS reference, bank account, and settlement status separately from card payments.
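The status machine on the payments table, together with the 15-minute expiry of PENDING_3DS records, might look roughly like this. The exact transition table is an assumption inferred from the two flows (for example, whether a plain PENDING row can expire is not stated), and `transition` and `expire_stale` are hypothetical helper names.

```python
from datetime import datetime, timedelta

# Allowed moves per status; terminal states map to an empty set.
ALLOWED = {
    "PENDING": {"PENDING_3DS", "CAPTURED", "FAILED"},
    "PENDING_3DS": {"CAPTURED", "FAILED", "EXPIRED"},
    "CAPTURED": set(),
    "FAILED": set(),
    "EXPIRED": set(),
}

PENDING_3DS_TTL = timedelta(minutes=15)


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


def expire_stale(payments, now: datetime):
    """Expire PENDING_3DS rows older than the TTL; return the expired rows.

    `payments` is a list of dicts with 'status' and 'created_at' keys,
    standing in for rows selected by the scheduler.
    """
    expired = []
    for row in payments:
        if row["status"] == "PENDING_3DS" and now - row["created_at"] > PENDING_3DS_TTL:
            row["status"] = transition(row["status"], "EXPIRED")
            expired.append(row)
    return expired
```

Keeping the table explicit means a late webhook for an already EXPIRED payment fails loudly instead of silently flipping the row to CAPTURED.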
External Integrations
| Integration | Purpose |
|---|---|
| Stripe / Razorpay | Card charges, refunds, 3DS auth, webhooks |
| Bank rails (UPI and IMPS via NPCI, NEFT via RBI) | Bank transfers, withdrawals, settlement |
Webhook security uses HMAC-SHA256 signature verification for both gateways. Signature mismatch = immediate rejection.
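A generic HMAC-SHA256 check can be written with Python's standard library, as sketched below. The function names and the hex encoding of the signature are assumptions for illustration. Each gateway defines its own header name and exactly which bytes are signed, so in practice the gateway's official verification helper is usually the safer route.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Return True only if the received signature matches the body.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature were correct.
    """
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_hex.encode())
```

The check has to run against the raw bytes of the request body, before any JSON parsing, because re-serialising the payload can change whitespace or key order and break the match.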
Key Takeaways
- PCI scope drives architecture — identify your compliance boundary first, then design around it
- Idempotency belongs in the database, not just application code — UNIQUE constraints are your friend
- Return 200 to webhooks immediately — process async, or gateways will flood you with retries
- Define what a service does NOT do — Payment Service not consuming Kafka was a deliberate choice, not a gap
- Two flows, one event: sync and 3DS paths converge at PAYMENT_CAPTURED, keeping downstream services simple