Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiggs5252iajjpi2hrv6uvp32najf6aa2a3cj2jdh3j7uuepdt3e3q",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mppahdc5o6e2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreifkqwhnmao7ezlde6kmz2bmikinzldcgnlabi5zbplcfi4bkagkwi"
    },
    "mimeType": "image/webp",
    "size": 81098
  },
  "path": "/tundeoladejo/designing-a-reliable-wallet-engine-event-driven-architecture-with-kafka-and-typescript-jjp",
  "publishedAt": "2026-07-02T23:23:57.000Z",
  "site": "https://dev.to",
  "tags": [
    "architecture",
    "fintech",
    "systemdesign",
    "typescript",
    "LinkedIn",
    "GitHub"
  ],
  "textContent": "##  The Problem\n\nIf you've ever built a fintech product, you know this truth: **the hardest part isn't moving money — it's making sure you never lose it.**\n\nA wallet engine sounds simple on the surface. Credit. Debit. Balance. But the moment you introduce concurrency, network failures, and the need for auditability, it becomes one of the most demanding distributed systems problems you can tackle.\n\nI built a wallet engine that needed to:\n\n  * Handle credits, debits, and wallet-to-wallet transfers\n  * Guarantee that no money is created or destroyed (even during failures)\n  * Process concurrent transactions without race conditions\n  * Maintain a full audit trail for every operation\n  * Notify downstream services reliably via events\n\n\n\nHere's how I approached it.\n\n##  Architecture Overview\n\nThe system is built on **NestJS** (TypeScript), **PostgreSQL** , **Apache Kafka** , and **Redis**. Each plays a distinct role:\n\nComponent | Responsibility\n---|---\nPostgreSQL | Source of truth — wallets, ledger entries, transactions\nKafka | Asynchronous event distribution to downstream consumers\nRedis | Idempotency locks and duplicate request detection\nNestJS | Application layer — orchestrates business logic\n\n\n    ┌──────────────┐        ┌──────────────┐        ┌──────────────┐\n    │   Client     │───────▶│  Wallet API  │───────▶│  PostgreSQL  │\n    │  (REST)      │        │  (NestJS)    │        │  (Source of  │\n    └──────────────┘        └──────┬───────┘        │   Truth)     │\n                                   │                └──────┬───────┘\n                                   │                       │\n                            ┌──────▼───────┐        ┌──────▼───────┐\n                            │    Redis     │        │ Outbox Table  │\n                            │ (Idempotency │        │ (Pending      │\n                            │   Locks)     │        │  Events)      │\n                            └──────────────┘        └──────┬───────┘\n                                                           │\n                                                    ┌──────▼───────┐\n                                                    │ Outbox Relay  │\n                                                    │ (Polls every  │\n                                                    │  2 seconds)   │\n                                                    └──────┬───────┘\n                                                           │\n                                                    ┌──────▼───────┐\n                                                    │    Kafka     │\n                                                    │  (Events)    │\n                                                    └──────┬───────┘\n                                                           │\n                                             ┌─────────────┼─────────────┐\n                                             │             │             │\n                                      ┌──────▼──┐   ┌─────▼───┐   ┌────▼────┐\n                                      │Webhooks │   │Analytics │   │ Notif.  │\n                                      │Consumer │   │Consumer  │   │Consumer │\n                                      └─────────┘   └─────────┘   └─────────┘\n\n\nThe key insight: **the database transaction is the boundary of correctness, and Kafka is the boundary of communication.** We never publish to Kafka directly from within business logic — we use the Transactional Outbox pattern to guarantee delivery.\n\n##  Double-Entry Ledger: The Financial Correctness Foundation\n\nThe most critical design decision was implementing a **double-entry bookkeeping system**. Every financial operation produces exactly two ledger entries: one debit and one credit.\n\n\n\n    // Every transfer creates a paired debit + credit\n    const debitEntry = manager.create(LedgerEntry, {\n      walletId: sourceWalletId,\n      transactionId,\n      type: EntryType.DEBIT,   // money leaves this wallet\n      amount: amount.toString(),\n      runningBalance: newSourceBalance.toString(),\n    });\n\n    const creditEntry = manager.create(LedgerEntry, {\n      walletId: destinationWalletId,\n      transactionId,\n      type: EntryType.CREDIT,  // money enters this wallet\n      amount: amount.toString(),\n      runningBalance: newDestBalance.toString(),\n    });\n\n\n**Why this matters:** The sum of all ledger entries for any transaction is always zero. This is a provable invariant — if it's ever violated, something has gone seriously wrong. I built a verification method that can audit any transaction:\n\n\n\n    async verifyTransactionBalance(transactionId: string): Promise<boolean> {\n      const entries = await this.ledgerRepo.find({ where: { transactionId } });\n      const net = entries.reduce((sum, entry) => {\n        const amount = BigInt(entry.amount);\n        return entry.type === EntryType.CREDIT ? sum + amount : sum - amount;\n      }, 0n);\n      return net === 0n; // Must ALWAYS be zero\n    }\n\n\nEach entry also carries a `runningBalance` — the wallet's balance at that point in time. This gives us a full historical trail: you can reconstruct any wallet's balance at any moment by looking at its ledger entries.\n\n##  Preventing Race Conditions with Pessimistic Locking\n\nWhen two transfers hit the same wallet simultaneously, you can get phantom reads or double-spending. The naive approach — read balance, check if sufficient, then debit — breaks under concurrency.\n\nI solved this with **pessimistic row-level locking** (`SELECT ... FOR UPDATE`):\n\n\n\n    // Lock both wallets within the same DB transaction\n    const [sourceWallet, destWallet] = await Promise.all([\n      manager.findOne(Wallet, {\n        where: { id: sourceWalletId },\n        lock: { mode: 'pessimistic_write' },\n      }),\n      manager.findOne(Wallet, {\n        where: { id: destinationWalletId },\n        lock: { mode: 'pessimistic_write' },\n      }),\n    ]);\n\n    const sourceBalance = BigInt(sourceWallet.balance);\n    if (sourceBalance < amount) {\n      throw new BadRequestException('Insufficient balance');\n    }\n\n\nThe `pessimistic_write` lock ensures that while one transaction is modifying a wallet's balance, all other transactions on that wallet wait. No race conditions. No double-spending.\n\n**Trade-off:** This serializes operations on the same wallet. For a wallet processing thousands of transactions per second, you'd need a different approach (like balance reservation or saga patterns). For the volume this system handles, the simplicity of pessimistic locking wins.\n\n##  Idempotency: Handling Duplicates Without Losing Money\n\nNetwork timeouts, client retries, message redelivery — in distributed systems, you will receive the same request multiple times. The system must handle this gracefully without duplicating financial operations.\n\nEvery request carries an `idempotency_key`. The flow:\n\n  1. Check Redis for an existing result with this key\n  2. If found and completed → return the cached result (no-op)\n  3. If found and processing → reject (another instance is handling it)\n  4. If not found → acquire a distributed lock, execute the operation, cache the result\n\n\n\n\n    async wrap<T>(idempotencyKey: string, operation: () => Promise<T>): Promise<T> {\n      const existing = await this.getExistingResult(idempotencyKey);\n      if (existing?.status === IdempotencyStatus.COMPLETED) {\n        return existing.result as T; // Safe replay\n      }\n      if (existing?.status === IdempotencyStatus.PROCESSING) {\n        throw new ConflictException('Request already being processed');\n      }\n\n      const locked = await this.acquireLock(idempotencyKey);\n      if (!locked) {\n        throw new ConflictException('Duplicate request detected');\n      }\n\n      await this.markProcessing(idempotencyKey);\n      try {\n        const result = await operation();\n        await this.markCompleted(idempotencyKey, result);\n        return result;\n      } catch (error) {\n        await this.markFailed(idempotencyKey, error.message);\n        throw error;\n      }\n    }\n\n\nA subtle detail: the lock has a TTL of 60 seconds, with automatic extension every 15 seconds while processing. This handles the case where a database transaction takes longer than expected — the lock won't expire mid-operation and allow a duplicate through.\n\n##  The Transactional Outbox Pattern: Reliable Event Publishing\n\nThis is where Kafka enters the picture. The challenge: after completing a transfer, we need to notify downstream services (webhooks, analytics, notifications). But publishing to Kafka directly from within a database transaction creates a dangerous coupling — if Kafka is down, the transaction fails even though the business operation was valid.\n\n**The Transactional Outbox solves this cleanly:**\n\n  1. Within the same database transaction as the business operation, write events to an `outbox_events` table\n  2. A separate relay process polls the outbox every 2 seconds and publishes pending events to Kafka\n  3. Once published, mark the event as delivered\n\n\n\n\n    // Inside the same DB transaction as the transfer\n    await this.outboxService.write(\n      {\n        aggregateId: completedTxn.id,\n        aggregateType: 'transaction',\n        eventType: 'transaction.completed',\n        kafkaTopic: 'transaction.completed',\n        payload: {\n          transactionId: completedTxn.id,\n          type: 'transfer',\n          sourceWalletId,\n          destinationWalletId,\n          amount,\n          currency: completedTxn.currency,\n          timestamp: new Date().toISOString(),\n        },\n      },\n      manager, // Same EntityManager = same DB transaction\n    );\n\n\n**Guarantees:**\n\n  * If the business transaction commits, the outbox event is guaranteed to exist\n  * If the transaction rolls back, the outbox event disappears with it\n  * The relay delivers at-least-once — Kafka consumers must be idempotent (which they are, using the same idempotency pattern)\n\n\n\nThe relay itself is simple but robust:\n\n\n\n    private async relay() {\n      const events = await this.outboxService.getPendingEvents(50);\n      for (const event of events) {\n        try {\n          await this.kafkaProducer.publish(event.kafkaTopic, event.payload, event.aggregateId);\n          await this.outboxService.markPublished(event.id);\n        } catch (error) {\n          await this.outboxService.markFailed(event.id, error.message, event.retryCount + 1);\n        }\n      }\n    }\n\n\nAfter 5 failed attempts, an event is marked as permanently failed — this triggers alerting and manual investigation.\n\n##  Atomic Transfers: Everything or Nothing\n\nA wallet-to-wallet transfer involves multiple operations: creating the transaction record, writing ledger entries, updating both wallet balances, logging the audit trail, and writing outbox events. **All of this happens inside a single PostgreSQL transaction:**\n\n\n\n    const transaction = await this.dataSource.transaction(async (manager) => {\n      // 1. Create transaction record\n      const savedTxn = await manager.save(Transaction, txn);\n\n      // 2. Audit log\n      await this.auditService.logWithManager({ ... }, manager);\n\n      // 3. Double-entry ledger (with pessimistic locks)\n      await this.ledgerService.createDoubleEntry({ ... }, manager);\n\n      // 4. Settle entries\n      await this.ledgerService.settleEntries(savedTxn.id, manager);\n\n      // 5. Mark complete\n      savedTxn.status = TransactionStatus.COMPLETED;\n      const completedTxn = await manager.save(Transaction, savedTxn);\n\n      // 6. Outbox events for Kafka\n      await this.outboxService.write({ ... }, manager);\n\n      return completedTxn;\n    });\n\n\nIf any step fails — insufficient balance, database error, constraint violation — **everything rolls back**. No partial state. No orphaned ledger entries. No wallet with a deducted balance but no corresponding credit.\n\n##  Reversals: Undoing Without Destroying History\n\nWhen a transfer needs to be reversed, I don't delete or modify existing entries. Instead, the system creates a **new transaction in the opposite direction** :\n\n\n\n    async reverseEntries(originalTransactionId, reversalTransactionId, manager) {\n      const originalEntries = await manager.find(LedgerEntry, {\n        where: { transactionId: originalTransactionId, status: EntryStatus.SETTLED },\n      });\n\n      // Reversal swaps the direction\n      return this.createDoubleEntry({\n        transactionId: reversalTransactionId,\n        sourceWalletId: creditEntry.walletId,      // original recipient sends back\n        destinationWalletId: debitEntry.walletId,   // original sender receives\n        amount: BigInt(debitEntry.amount),\n        description: `Reversal of transaction ${originalTransactionId}`,\n      }, manager);\n    }\n\n\nThe original transaction is marked as `REVERSED`, and the new reversal transaction is its own complete record. The audit trail is preserved end-to-end — you can always trace what happened, when, and why.\n\n##  Webhook Delivery with Retry Logic\n\nDownstream services need to be notified of completed transactions. The webhook system consumes from Kafka and delivers HTTP payloads with exponential backoff:\n\n  * Attempt 1: immediate\n  * Attempt 2: after 5 seconds\n  * Attempt 3: after 30 seconds\n  * After 3 failures: mark as permanently failed\n\n\n\nEach delivery attempt is tracked with response codes, error messages, and duration — full observability into integration health.\n\nThe consumer uses **manual offset commits** — it only advances the Kafka offset after successfully processing a message. If processing fails, the message will be redelivered on the next consumer restart.\n\n##  What I'd Do Differently at Scale\n\nThis architecture works well for its current volume, but at 10x or 100x scale, several things would change:\n\n  1. **Replace polling-based outbox relay with CDC** (Change Data Capture via Debezium). Eliminates the 2-second polling delay and reduces database load.\n\n  2. **Partition Kafka topics by wallet ID**. This ensures all events for a given wallet are processed in order, enabling parallel consumption without ordering violations.\n\n  3. **Add a balance reservation step** for high-throughput wallets. Instead of locking the row for the entire transaction, reserve the amount upfront and settle asynchronously.\n\n  4. **Introduce a dead-letter queue** for permanently failed events. Currently failed outbox events just sit — at scale, they need automated alerting and recovery workflows.\n\n  5. **Move to event sourcing** for the ledger. Instead of storing the current balance as a field, derive it entirely from the ledger entries. This eliminates any possibility of balance drift.\n\n\n\n\n##  Key Takeaways\n\nBuilding a wallet engine taught me that **financial systems demand a different mindset**. You can't think in terms of \"usually works\" — you have to think in terms of invariants that must never be violated.\n\nThe patterns that made this system reliable:\n\n  * **Double-entry bookkeeping** — provable correctness via zero-sum invariant\n  * **Pessimistic locking** — eliminates race conditions at the database level\n  * **Transactional Outbox** — decouples event publishing from business logic\n  * **Idempotency everywhere** — safe retries at every layer\n  * **Atomic transactions** — no partial state, ever\n  * **Append-only audit trail** — nothing is deleted, everything is traceable\n\n\n\nThese aren't novel ideas. They're proven patterns from decades of financial systems engineering. The craft is in applying them correctly, understanding their trade-offs, and building them into a cohesive system that a team can maintain and evolve.\n\n_Built with NestJS, TypeScript, PostgreSQL, Apache Kafka, and Redis._\n\n_Have questions about the implementation? Find me on LinkedIn or GitHub._",
  "title": "Designing a Reliable Wallet Engine: Event-Driven Architecture with Kafka and TypeScript"
}