Designing a Reliable Wallet Engine: Event-Driven Architecture with Kafka and TypeScript

DEV Community [Unofficial] July 2, 2026

The Problem

If you've ever built a fintech product, you know this truth: the hardest part isn't moving money — it's making sure you never lose it.

A wallet engine sounds simple on the surface. Credit. Debit. Balance. But the moment you introduce concurrency, network failures, and the need for auditability, it becomes one of the most demanding distributed systems problems you can tackle.

I built a wallet engine that needed to:

  • Handle credits, debits, and wallet-to-wallet transfers
  • Guarantee that no money is created or destroyed (even during failures)
  • Process concurrent transactions without race conditions
  • Maintain a full audit trail for every operation
  • Notify downstream services reliably via events

Here's how I approached it.

Architecture Overview

The system is built on NestJS (TypeScript), PostgreSQL, Apache Kafka, and Redis. Each plays a distinct role:

Component    Responsibility
PostgreSQL   Source of truth: wallets, ledger entries, transactions
Kafka        Asynchronous event distribution to downstream consumers
Redis        Idempotency locks and duplicate request detection
NestJS       Application layer: orchestrates business logic

┌──────────────┐        ┌──────────────┐        ┌──────────────┐
│   Client     │───────▶│  Wallet API  │───────▶│  PostgreSQL  │
│  (REST)      │        │  (NestJS)    │        │  (Source of  │
└──────────────┘        └──────┬───────┘        │   Truth)     │
                               │                └──────┬───────┘
                               │                       │
                        ┌──────▼───────┐        ┌──────▼───────┐
                        │    Redis     │        │ Outbox Table  │
                        │ (Idempotency │        │ (Pending      │
                        │   Locks)     │        │  Events)      │
                        └──────────────┘        └──────┬───────┘
                                                       │
                                                ┌──────▼───────┐
                                                │ Outbox Relay  │
                                                │ (Polls every  │
                                                │  2 seconds)   │
                                                └──────┬───────┘
                                                       │
                                                ┌──────▼───────┐
                                                │    Kafka     │
                                                │  (Events)    │
                                                └──────┬───────┘
                                                       │
                                         ┌─────────────┼─────────────┐
                                         │             │             │
                                  ┌──────▼──┐   ┌─────▼───┐   ┌────▼────┐
                                  │Webhooks │   │Analytics │   │ Notif.  │
                                  │Consumer │   │Consumer  │   │Consumer │
                                  └─────────┘   └─────────┘   └─────────┘

The key insight: the database transaction is the boundary of correctness, and Kafka is the boundary of communication. We never publish to Kafka directly from within business logic — we use the Transactional Outbox pattern to guarantee delivery.

Double-Entry Ledger: The Financial Correctness Foundation

The most critical design decision was implementing a double-entry bookkeeping system. Every financial operation produces exactly two ledger entries: one debit and one credit.

// Every transfer creates a paired debit + credit
const debitEntry = manager.create(LedgerEntry, {
  walletId: sourceWalletId,
  transactionId,
  type: EntryType.DEBIT,   // money leaves this wallet
  amount: amount.toString(),
  runningBalance: newSourceBalance.toString(),
});

const creditEntry = manager.create(LedgerEntry, {
  walletId: destinationWalletId,
  transactionId,
  type: EntryType.CREDIT,  // money enters this wallet
  amount: amount.toString(),
  runningBalance: newDestBalance.toString(),
});

Why this matters: for any transaction, the credits minus the debits across its ledger entries always net to zero. This is a checkable invariant; if it's ever violated, something has gone seriously wrong. I built a verification method that can audit any transaction:

async verifyTransactionBalance(transactionId: string): Promise<boolean> {
  const entries = await this.ledgerRepo.find({ where: { transactionId } });
  const net = entries.reduce((sum, entry) => {
    const amount = BigInt(entry.amount);
    return entry.type === EntryType.CREDIT ? sum + amount : sum - amount;
  }, 0n);
  return net === 0n; // Must ALWAYS be zero
}

Each entry also carries a runningBalance — the wallet's balance at that point in time. This gives us a full historical trail: you can reconstruct any wallet's balance at any moment by looking at its ledger entries.
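
As a rough illustration of that point-in-time lookup, here is a minimal in-memory sketch. The `LedgerRow` shape and its `createdAt` field are assumptions for the example (the article does not show the entity's timestamp column), and the consistency check assumes every wallet starts from a zero balance.

```typescript
type EntryType = "DEBIT" | "CREDIT";

// Hypothetical, simplified ledger row; `createdAt` is an assumed column.
interface LedgerRow {
  walletId: string;
  type: EntryType;
  amount: string;         // minor units, stored as a string
  runningBalance: string; // wallet balance right after this entry
  createdAt: Date;
}

// Balance at `at`: the runningBalance of the latest entry created at or
// before that moment (0 if the wallet had no entries yet).
function balanceAt(entries: LedgerRow[], walletId: string, at: Date): bigint {
  let latest: LedgerRow | undefined;
  for (const e of entries) {
    if (e.walletId !== walletId || e.createdAt.getTime() > at.getTime()) continue;
    if (!latest || e.createdAt.getTime() >= latest.createdAt.getTime()) latest = e;
  }
  return latest ? BigInt(latest.runningBalance) : 0n;
}

// Cross-check: replaying amounts from zero must reproduce every runningBalance.
function runningBalancesConsistent(entries: LedgerRow[], walletId: string): boolean {
  const own = entries
    .filter((e) => e.walletId === walletId)
    .sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime());
  let balance = 0n;
  for (const e of own) {
    balance += e.type === "CREDIT" ? BigInt(e.amount) : -BigInt(e.amount);
    if (balance !== BigInt(e.runningBalance)) return false;
  }
  return true;
}

const history: LedgerRow[] = [
  { walletId: "w1", type: "CREDIT", amount: "1000", runningBalance: "1000", createdAt: new Date("2026-01-01T10:00:00Z") },
  { walletId: "w1", type: "DEBIT", amount: "300", runningBalance: "700", createdAt: new Date("2026-01-01T11:00:00Z") },
  { walletId: "w1", type: "CREDIT", amount: "50", runningBalance: "750", createdAt: new Date("2026-01-01T12:00:00Z") },
];

const balanceAtHalfPastEleven = balanceAt(history, "w1", new Date("2026-01-01T11:30:00Z"));
```

Storing the running balance makes the lookup a single-row read, while the replay check gives a cheap way to detect drift between the stored values and the entries themselves.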

Preventing Race Conditions with Pessimistic Locking

When two transfers hit the same wallet simultaneously, you can get lost updates or double-spending. The naive approach (read balance, check if sufficient, then debit) breaks under concurrency.
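
To make the failure concrete, the following sketch replays the interleaving deterministically against an in-memory map. Everything here (`balances`, `readBalance`, `debitFromSnapshot`) is hypothetical and stands in for the real read and write queries; it is not the article's code.

```typescript
// In-memory stand-in for a wallets table (illustration only).
const balances = new Map<string, bigint>([["w1", 100n]]);

function readBalance(walletId: string): bigint {
  return balances.get(walletId) ?? 0n;
}

// The naive flow, split so the check and write operate on a previously read snapshot.
function debitFromSnapshot(walletId: string, snapshot: bigint, amount: bigint): boolean {
  if (snapshot < amount) return false;       // check uses the (possibly stale) snapshot
  balances.set(walletId, snapshot - amount); // write overwrites any concurrent change
  return true;
}

// Two requests interleave: both read before either writes.
const snapshotA = readBalance("w1");
const snapshotB = readBalance("w1");
const okA = debitFromSnapshot("w1", snapshotA, 80n);
const okB = debitFromSnapshot("w1", snapshotB, 80n);

// Both debits were accepted (160 paid out of 100), yet the stored balance is 20.
const naiveResult = { okA, okB, finalBalance: readBalance("w1") };

// Serialized flow: read, check, and write happen as one uninterrupted step per
// request, which is the ordering a row lock held for the whole transaction enforces.
balances.set("w1", 100n);
function debitSerialized(walletId: string, amount: bigint): boolean {
  return debitFromSnapshot(walletId, readBalance(walletId), amount);
}
const serializedResult = {
  first: debitSerialized("w1", 80n),
  second: debitSerialized("w1", 80n),
  finalBalance: readBalance("w1"),
};
```

In the serialized run the second debit is rejected, because it sees the balance left behind by the first.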

I solved this with pessimistic row-level locking (SELECT ... FOR UPDATE):

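// Lock both wallets within the same DB transaction, always in ascending ID
// order, so that concurrent A→B and B→A transfers cannot deadlock each other.
const [firstId, secondId] = [sourceWalletId, destinationWalletId].sort();
const first = await manager.findOne(Wallet, {
  where: { id: firstId },
  lock: { mode: 'pessimistic_write' },
});
const second = await manager.findOne(Wallet, {
  where: { id: secondId },
  lock: { mode: 'pessimistic_write' },
});
if (!first || !second) {
  throw new NotFoundException('Wallet not found');
}
const sourceWallet = first.id === sourceWalletId ? first : second;
const destWallet = first.id === sourceWalletId ? second : first;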

const sourceBalance = BigInt(sourceWallet.balance);
if (sourceBalance < amount) {
  throw new BadRequestException('Insufficient balance');
}

The pessimistic_write lock ensures that while one transaction is modifying a wallet's balance, all other transactions on that wallet wait. No race conditions. No double-spending.

Trade-off: This serializes operations on the same wallet. For a wallet processing thousands of transactions per second, you'd need a different approach (like balance reservation or saga patterns). For the volume this system handles, the simplicity of pessimistic locking wins.

Idempotency: Handling Duplicates Without Losing Money

Network timeouts, client retries, message redelivery — in distributed systems, you will receive the same request multiple times. The system must handle this gracefully without duplicating financial operations.

Every request carries an idempotency_key. The flow:

  1. Check Redis for an existing result with this key
  2. If found and completed → return the cached result (no-op)
  3. If found and processing → reject (another instance is handling it)
  4. If not found → acquire a distributed lock, execute the operation, cache the result

async wrap<T>(idempotencyKey: string, operation: () => Promise<T>): Promise<T> {
  const existing = await this.getExistingResult(idempotencyKey);
  if (existing?.status === IdempotencyStatus.COMPLETED) {
    return existing.result as T; // Safe replay
  }
  if (existing?.status === IdempotencyStatus.PROCESSING) {
    throw new ConflictException('Request already being processed');
  }

  const locked = await this.acquireLock(idempotencyKey);
  if (!locked) {
    throw new ConflictException('Duplicate request detected');
  }

  await this.markProcessing(idempotencyKey);
  try {
    const result = await operation();
    await this.markCompleted(idempotencyKey, result);
    return result;
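  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    await this.markFailed(idempotencyKey, message);
    throw error;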
  }
}

A subtle detail: the lock has a TTL of 60 seconds, with automatic extension every 15 seconds while processing. This handles the case where a database transaction takes longer than expected — the lock won't expire mid-operation and allow a duplicate through.
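
The effect of the renewal can be sketched with an in-memory lock store and an injected clock. `LeaseStore`, the key `idem:abc`, and the instance names are hypothetical; in Redis the acquire step is commonly implemented with `SET key value NX PX ttl`, but the article does not show its actual lock code, so treat this purely as a model of the timing.

```typescript
// Hypothetical in-memory lock store; `now` is passed in (milliseconds) so the
// timeline can be simulated without real timers.
class LeaseStore {
  private leases = new Map<string, { owner: string; expiresAt: number }>();

  acquire(key: string, owner: string, ttlMs: number, now: number): boolean {
    const current = this.leases.get(key);
    if (current && current.expiresAt > now) return false; // still held by someone
    this.leases.set(key, { owner, expiresAt: now + ttlMs });
    return true;
  }

  // Extend only if the caller still owns a live lease.
  extend(key: string, owner: string, ttlMs: number, now: number): boolean {
    const current = this.leases.get(key);
    if (!current || current.owner !== owner || current.expiresAt <= now) return false;
    current.expiresAt = now + ttlMs;
    return true;
  }
}

const TTL_MS = 60_000;
const EXTEND_EVERY_MS = 15_000;

// Simulate an operation that runs for `durationMs`, optionally renewing its lease,
// and report whether a second instance manages to take the lock while it runs.
function duplicateSlipsThrough(durationMs: number, renew: boolean): boolean {
  const store = new LeaseStore();
  store.acquire("idem:abc", "instance-1", TTL_MS, 0);
  for (let t = EXTEND_EVERY_MS; t < durationMs; t += EXTEND_EVERY_MS) {
    if (renew) store.extend("idem:abc", "instance-1", TTL_MS, t);
    if (store.acquire("idem:abc", "instance-2", TTL_MS, t)) return true;
  }
  return false;
}

const slowWithoutRenewal = duplicateSlipsThrough(90_000, false); // lease lapses at 60s
const slowWithRenewal = duplicateSlipsThrough(90_000, true);     // lease keeps moving forward
```

Checking ownership in `extend` matters: an instance whose lease has already lapsed must not be able to revive it after another instance has taken over.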

The Transactional Outbox Pattern: Reliable Event Publishing

This is where Kafka enters the picture. The challenge: after completing a transfer, we need to notify downstream services (webhooks, analytics, notifications). But publishing to Kafka directly from within a database transaction creates a dangerous coupling — if Kafka is down, the transaction fails even though the business operation was valid.

The Transactional Outbox solves this cleanly:

  1. Within the same database transaction as the business operation, write events to an outbox_events table
  2. A separate relay process polls the outbox every 2 seconds and publishes pending events to Kafka
  3. Once published, mark the event as delivered

// Inside the same DB transaction as the transfer
await this.outboxService.write(
  {
    aggregateId: completedTxn.id,
    aggregateType: 'transaction',
    eventType: 'transaction.completed',
    kafkaTopic: 'transaction.completed',
    payload: {
      transactionId: completedTxn.id,
      type: 'transfer',
      sourceWalletId,
      destinationWalletId,
      amount,
      currency: completedTxn.currency,
      timestamp: new Date().toISOString(),
    },
  },
  manager, // Same EntityManager = same DB transaction
);

Guarantees:

  • If the business transaction commits, the outbox event is guaranteed to exist
  • If the transaction rolls back, the outbox event disappears with it
  • The relay delivers at-least-once — Kafka consumers must be idempotent (which they are, using the same idempotency pattern)
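
Because delivery is at-least-once, a consumer will occasionally see the same event twice. A minimal sketch of the deduplication step follows; `eventId` and `NotificationConsumer` are hypothetical names, and the in-memory `Set` stands in for whatever durable store (Redis, in the article's case) holds processed keys.

```typescript
interface TransactionCompletedEvent {
  eventId: string;  // hypothetical unique ID per outbox event
  walletId: string;
  amount: number;   // minor units
}

class NotificationConsumer {
  private processed = new Set<string>(); // stand-in for a durable idempotency store
  sent: string[] = [];

  handle(event: TransactionCompletedEvent): "processed" | "skipped" {
    if (this.processed.has(event.eventId)) return "skipped"; // redelivered duplicate
    this.sent.push(`wallet ${event.walletId}: ${event.amount}`);
    this.processed.add(event.eventId);
    return "processed";
  }
}

const consumer = new NotificationConsumer();
const event: TransactionCompletedEvent = { eventId: "evt-1", walletId: "w1", amount: 500 };

const firstDelivery = consumer.handle(event);
const secondDelivery = consumer.handle(event); // same event delivered again
```

In a real consumer the side effect and the "processed" marker cannot be updated atomically unless they share a store, so the side effect itself should also tolerate being repeated.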

The relay itself is simple but robust:

private async relay() {
  const events = await this.outboxService.getPendingEvents(50);
  for (const event of events) {
    try {
      await this.kafkaProducer.publish(event.kafkaTopic, event.payload, event.aggregateId);
      await this.outboxService.markPublished(event.id);
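    } catch (error) {
      // Narrow `unknown` before reading `.message`
      const message = error instanceof Error ? error.message : String(error);
      await this.outboxService.markFailed(event.id, message, event.retryCount + 1);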
    }
  }
}

After 5 failed attempts, an event is marked as permanently failed — this triggers alerting and manual investigation.
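
One way the retry cap could work is sketched below against an in-memory outbox. The status names, `MAX_ATTEMPTS`, and the synchronous `publish` callback are assumptions made so the example runs without a broker; the article's `markFailed` implementation is not shown.

```typescript
type OutboxStatus = "PENDING" | "PUBLISHED" | "FAILED";

interface OutboxEvent {
  id: string;
  kafkaTopic: string;
  payload: unknown;
  status: OutboxStatus;
  retryCount: number;
  lastError?: string;
}

const MAX_ATTEMPTS = 5;

// One polling pass. `publish` is injected (and synchronous) to keep the sketch self-contained.
function relayOnce(outbox: OutboxEvent[], publish: (e: OutboxEvent) => void, batchSize = 50): void {
  const pending = outbox.filter((e) => e.status === "PENDING").slice(0, batchSize);
  for (const event of pending) {
    try {
      publish(event);
      event.status = "PUBLISHED";
    } catch (error) {
      event.retryCount += 1;
      event.lastError = error instanceof Error ? error.message : String(error);
      // Stay PENDING for the next poll until the attempt budget is used up.
      if (event.retryCount >= MAX_ATTEMPTS) event.status = "FAILED";
    }
  }
}

const outbox: OutboxEvent[] = [
  { id: "e1", kafkaTopic: "transaction.completed", payload: {}, status: "PENDING", retryCount: 0 },
  { id: "e2", kafkaTopic: "transaction.completed", payload: {}, status: "PENDING", retryCount: 0 },
];

// Broker stub: publishing e2 always fails.
const flakyPublish = (e: OutboxEvent): void => {
  if (e.id === "e2") throw new Error("broker unavailable");
};

for (let poll = 0; poll < MAX_ATTEMPTS; poll++) relayOnce(outbox, flakyPublish);
```

After five polls the first event is published and the second has moved to the terminal failed state, where it no longer competes for relay capacity.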

Atomic Transfers: Everything or Nothing

A wallet-to-wallet transfer involves multiple operations: creating the transaction record, writing ledger entries, updating both wallet balances, logging the audit trail, and writing outbox events. All of this happens inside a single PostgreSQL transaction:

const transaction = await this.dataSource.transaction(async (manager) => {
  // 1. Create transaction record
  const savedTxn = await manager.save(Transaction, txn);

  // 2. Audit log
  await this.auditService.logWithManager({ ... }, manager);

  // 3. Double-entry ledger (with pessimistic locks)
  await this.ledgerService.createDoubleEntry({ ... }, manager);

  // 4. Settle entries
  await this.ledgerService.settleEntries(savedTxn.id, manager);

  // 5. Mark complete
  savedTxn.status = TransactionStatus.COMPLETED;
  const completedTxn = await manager.save(Transaction, savedTxn);

  // 6. Outbox events for Kafka
  await this.outboxService.write({ ... }, manager);

  return completedTxn;
});

If any step fails — insufficient balance, database error, constraint violation — everything rolls back. No partial state. No orphaned ledger entries. No wallet with a deducted balance but no corresponding credit.
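
The all-or-nothing behavior can be modeled in memory with a copy-on-write helper. This is only an analogy for what PostgreSQL's commit and rollback provide; `WalletState`, `atomically`, and `transfer` are hypothetical names.

```typescript
interface WalletState {
  balances: Map<string, bigint>;
  ledger: string[];
}

// Run `work` against a copy; the copy replaces the state only if `work` does not throw.
function atomically(state: WalletState, work: (draft: WalletState) => void): WalletState {
  const draft: WalletState = { balances: new Map(state.balances), ledger: [...state.ledger] };
  work(draft); // a throw here leaves `state` untouched
  return draft;
}

function transfer(draft: WalletState, from: string, to: string, amount: bigint): void {
  const fromBalance = draft.balances.get(from) ?? 0n;
  if (fromBalance < amount) throw new Error("Insufficient balance");
  // Debit first: without rollback, a failure below would leave the money deducted.
  draft.balances.set(from, fromBalance - amount);
  draft.ledger.push(`DEBIT ${from} ${amount}`);
  if (!draft.balances.has(to)) throw new Error(`Unknown wallet ${to}`);
  draft.balances.set(to, (draft.balances.get(to) ?? 0n) + amount);
  draft.ledger.push(`CREDIT ${to} ${amount}`);
}

let state: WalletState = { balances: new Map([["a", 100n], ["b", 0n]]), ledger: [] };
state = atomically(state, (d) => transfer(d, "a", "b", 40n));

let secondTransferFailed = false;
try {
  // Fails after the debit step, so the draft is discarded.
  state = atomically(state, (d) => transfer(d, "a", "missing", 10n));
} catch {
  secondTransferFailed = true;
}
```

The failed transfer leaves both the balances and the ledger exactly as the first transfer committed them.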

Reversals: Undoing Without Destroying History

When a transfer needs to be reversed, I don't delete or modify existing entries. Instead, the system creates a new transaction in the opposite direction:

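async reverseEntries(originalTransactionId: string, reversalTransactionId: string, manager: EntityManager) {
  const originalEntries = await manager.find(LedgerEntry, {
    where: { transactionId: originalTransactionId, status: EntryStatus.SETTLED },
  });

  // Pick out the settled debit and credit legs of the original transfer
  const debitEntry = originalEntries.find((e) => e.type === EntryType.DEBIT);
  const creditEntry = originalEntries.find((e) => e.type === EntryType.CREDIT);
  if (!debitEntry || !creditEntry) {
    throw new Error(`Transaction ${originalTransactionId} has no settled debit/credit pair`);
  }

  // Reversal swaps the direction
  return this.createDoubleEntry({
    transactionId: reversalTransactionId,
    sourceWalletId: creditEntry.walletId,      // original recipient sends back
    destinationWalletId: debitEntry.walletId,   // original sender receives
    amount: BigInt(debitEntry.amount),
    description: `Reversal of transaction ${originalTransactionId}`,
  }, manager);
}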

The original transaction is marked as REVERSED, and the new reversal transaction is its own complete record. The audit trail is preserved end-to-end — you can always trace what happened, when, and why.
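
A small in-memory model shows what the append-only reversal does to balances and history. The `Entry` shape and helper names are hypothetical simplifications of the article's ledger.

```typescript
type Direction = "DEBIT" | "CREDIT";

interface Entry {
  transactionId: string;
  walletId: string;
  type: Direction;
  amount: bigint;
}

const ledger: Entry[] = []; // append-only: entries are never edited or removed

function postTransfer(transactionId: string, from: string, to: string, amount: bigint): void {
  ledger.push({ transactionId, walletId: from, type: "DEBIT", amount });
  ledger.push({ transactionId, walletId: to, type: "CREDIT", amount });
}

// Reverse by posting a new transfer in the opposite direction.
function reverse(originalId: string, reversalId: string): void {
  const debit = ledger.find((e) => e.transactionId === originalId && e.type === "DEBIT");
  const credit = ledger.find((e) => e.transactionId === originalId && e.type === "CREDIT");
  if (!debit || !credit) throw new Error(`No entries for ${originalId}`);
  postTransfer(reversalId, credit.walletId, debit.walletId, debit.amount);
}

function balanceOf(walletId: string): bigint {
  return ledger
    .filter((e) => e.walletId === walletId)
    .reduce((sum, e) => (e.type === "CREDIT" ? sum + e.amount : sum - e.amount), 0n);
}

postTransfer("txn-1", "alice", "bob", 250n);
reverse("txn-1", "rev-1");
```

Both wallets end up where they started, while the ledger holds four entries: the original pair and the reversal pair.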

Webhook Delivery with Retry Logic

Downstream services need to be notified of completed transactions. The webhook system consumes from Kafka and delivers HTTP payloads, retrying on an increasing backoff schedule:

  • Attempt 1: immediate
  • Attempt 2: after 5 seconds
  • Attempt 3: after 30 seconds
  • After 3 failures: mark as permanently failed

Each delivery attempt is tracked with response codes, error messages, and duration — full observability into integration health.
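
The schedule and the per-attempt tracking could be modeled roughly as follows. The type and function names are hypothetical, and treating any 2xx response as success is an assumption; the article does not say which status codes count as delivered.

```typescript
// Wait before each attempt, taken from the schedule above (attempt numbers are 1-based).
const RETRY_DELAYS_MS: readonly number[] = [0, 5_000, 30_000];

interface DeliveryAttempt {
  attempt: number;
  statusCode: number | null; // null when no HTTP response was received
  error?: string;
  durationMs: number;
}

// Returns the wait before `attempt`, or null once the attempt budget is exhausted.
function delayBeforeAttempt(attempt: number): number | null {
  if (!Number.isInteger(attempt) || attempt < 1) throw new RangeError("attempt must be >= 1");
  return RETRY_DELAYS_MS[attempt - 1] ?? null;
}

type DeliveryStatus = "DELIVERED" | "RETRYING" | "PERMANENTLY_FAILED";

// Derive the delivery status from the recorded attempts (assumes 2xx means success).
function statusAfter(attempts: DeliveryAttempt[]): DeliveryStatus {
  const last = attempts[attempts.length - 1];
  if (last && last.statusCode !== null && last.statusCode >= 200 && last.statusCode < 300) {
    return "DELIVERED";
  }
  return attempts.length >= RETRY_DELAYS_MS.length ? "PERMANENTLY_FAILED" : "RETRYING";
}

const attempts: DeliveryAttempt[] = [
  { attempt: 1, statusCode: 503, durationMs: 120 },
  { attempt: 2, statusCode: null, error: "timeout", durationMs: 10_000 },
];
const statusSoFar = statusAfter(attempts);
```

Keeping the schedule in one array means the attempt limit and the delays cannot drift apart.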

The consumer uses manual offset commits: it only advances the Kafka offset after successfully processing a message. If processing fails, the offset is not committed, so the message is redelivered after the next consumer restart or partition rebalance.
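
The commit-after-processing rule can be illustrated with a tiny simulated partition. `PartitionSim` is hypothetical and each `poll` call models a consumer (re)starting from the committed offset; it is not a Kafka client.

```typescript
// Simulated partition: an ordered log plus the consumer group's committed offset.
class PartitionSim {
  committedOffset = 0;
  private readonly log: string[];

  constructor(log: string[]) {
    this.log = log;
  }

  // Read from the committed offset; stop at the first failure without committing it.
  poll(handler: (message: string) => void): void {
    for (const message of this.log.slice(this.committedOffset)) {
      try {
        handler(message);
      } catch {
        return; // offset not advanced: this message is read again on the next poll
      }
      this.committedOffset += 1; // commit only after successful processing
    }
  }
}

const partition = new PartitionSim(["m1", "m2", "m3"]);
const delivered: string[] = [];
let endpointUp = false;

const deliver = (message: string): void => {
  if (message === "m2" && !endpointUp) throw new Error("webhook endpoint down");
  delivered.push(message);
};

partition.poll(deliver); // m1 succeeds, m2 fails, nothing past m1 is committed
const offsetAfterFailure = partition.committedOffset;

endpointUp = true;
partition.poll(deliver); // resumes at m2, then m3
```

No message is skipped, at the cost of stalling the partition behind a failing message until it succeeds or is routed elsewhere.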

What I'd Do Differently at Scale

This architecture works well for its current volume, but at 10x or 100x scale, several things would change:

  1. Replace polling-based outbox relay with CDC (Change Data Capture via Debezium). Eliminates the 2-second polling delay and reduces database load.

  2. Partition Kafka topics by wallet ID. This ensures all events for a given wallet are processed in order, enabling parallel consumption without ordering violations.

  3. Add a balance reservation step for high-throughput wallets. Instead of locking the row for the entire transaction, reserve the amount upfront and settle asynchronously.

  4. Introduce a dead-letter queue for permanently failed events. Today, permanently failed outbox events stay in the outbox table awaiting manual investigation; at scale, they need an automated recovery workflow.

  5. Move to event sourcing for the ledger. Instead of storing the current balance as a field, derive it entirely from the ledger entries. This eliminates any possibility of balance drift.
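
To illustrate item 2, here is a sketch of key-based partition assignment. The hash is FNV-1a over UTF-16 code units, chosen only because it is short; the Java client's default partitioner hashes the key bytes with murmur2, and the only property this example relies on is that the mapping is deterministic. `partitionFor` is a hypothetical helper, and which wallet's ID becomes the key for a two-wallet transfer is a design decision the article does not cover.

```typescript
// Stand-in hash (FNV-1a, 32-bit). Not the hash Kafka uses.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Same wallet ID always maps to the same partition for a fixed partition count.
function partitionFor(walletId: string, partitionCount: number): number {
  if (!Number.isInteger(partitionCount) || partitionCount < 1) {
    throw new RangeError("partitionCount must be a positive integer");
  }
  return fnv1a(walletId) % partitionCount;
}

// Group a stream of events by the partition their wallet ID maps to.
const events = [
  { walletId: "wallet-a", seq: 1 },
  { walletId: "wallet-b", seq: 1 },
  { walletId: "wallet-a", seq: 2 },
  { walletId: "wallet-a", seq: 3 },
];

const byPartition = new Map<number, { walletId: string; seq: number }[]>();
for (const e of events) {
  const p = partitionFor(e.walletId, 6);
  byPartition.set(p, [...(byPartition.get(p) ?? []), e]);
}
```

Because every event for a wallet lands in one partition, a single consumer sees them in order, while different wallets can be processed in parallel. Changing the partition count later remaps keys, so ordering guarantees hold only while the count stays fixed.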

Key Takeaways

Building a wallet engine taught me that financial systems demand a different mindset. You can't think in terms of "usually works" — you have to think in terms of invariants that must never be violated.

The patterns that made this system reliable:

  • Double-entry bookkeeping — provable correctness via zero-sum invariant
  • Pessimistic locking — eliminates race conditions at the database level
  • Transactional Outbox — decouples event publishing from business logic
  • Idempotency everywhere — safe retries at every layer
  • Atomic transactions — no partial state, ever
  • Append-only audit trail — nothing is deleted, everything is traceable

These aren't novel ideas. They're proven patterns from decades of financial systems engineering. The craft is in applying them correctly, understanding their trade-offs, and building them into a cohesive system that a team can maintain and evolve.

Built with NestJS, TypeScript, PostgreSQL, Apache Kafka, and Redis.

Have questions about the implementation? Find me on LinkedIn or GitHub.
