System Design: Designing a Digital Wallet and Ledger System

Case Study: Designing a Digital Wallet and Ledger System

Mental Model

A digital wallet ledger is not a single database column showing a current balance, but an immutable, append-only chronological stream of balanced debits and credits that guarantees zero financial discrepancy.

Designing a digital wallet (like PayPal, Venmo, Stripe, or AliPay) is a masterclass in data integrity. In fintech, "eventual consistency" is not an option. You need 100% mathematical accuracy, strict ACID compliance, and a verifiable audit trail for every single cent moved.

Requirements & Core Constraints

To build a reliable digital wallet system, we must clarify our functional goals and strict scaling limits.

Functional Constraints

Wallet Balances: Provide users with a highly available, real-time snapshot of their active balances across multiple currencies.
Transactional Transfers: Enable instantaneous transfer of funds between any two user wallets (sender and receiver).
Immutable Ledger Logging: Guarantee that all money movements are logged in an immutable, append-only ledger following strict double-entry bookkeeping rules.
Idempotency Gating: Guarantee that no payment request can be processed more than once, completely eliminating double-charging bugs.
Multi-Currency Support: Support currency exchange conversions and fee deductions per transaction.

Non-Functional SLAs

Strict Data Consistency (CP): The system must enforce strong consistency for financial ledgers. We prioritize correctness over availability under partition scenarios.
Ultra-Low Latency: P99 balance lookups must be under 30ms. P99 transfer executions must be under 100ms.
High Throughput Scaling: The system must scale to handle 1 Billion transactions per day globally, with a peak transaction capacity of 30,000 transfers per second.
Auditable Traceability: Every transaction must balance to zero, providing an automated audit trail for auditing and compliance teams.

Back-of-the-Envelope Estimates

Let's calculate the hardware, network, and storage capacity needed to support 1 Billion daily transactions.

1. Ingestion Bandwidth & Throughput

Total Daily Transactions: $1\text{ Billion}$
Average Transaction QPS: $1\text{B} / 86,400\text{s} \approx 11,574\text{ QPS}$
Peak Transaction QPS (3x Average): $\approx 35,000\text{ QPS}$
Size per Ledger Entry: Average of $200\text{ bytes}$ per record (including identifiers, transaction amounts, timestamps, and metadata).
Daily Ledger Data Ingress: $1\text{B} \times 200\text{ bytes} = 200\text{ Gigabytes (GB)}$ of raw transaction data per day.
Annual Storage Capacity (with Indexing Overhead & Replication):
- Raw storage per year: $200\text{ GB} \times 365 \approx 73\text{ Terabytes (TB)}$
- With database index overhead and $3\times$ replication: $73\text{ TB} \times 4 \approx 292\text{ TB}$ of high-speed persistent SSD storage per year.
Average Inbound Bandwidth: $11,574\text{ QPS} \times 200\text{ bytes} \approx 2.3\text{ MB/sec}$
Peak Inbound Bandwidth: $35,000\text{ QPS} \times 200\text{ bytes} \approx 7.0\text{ MB/sec}$

API Design & Core Contracts

The system exposes high-performance HTTP/REST gateways for transaction orchestration. All write requests must incorporate an idempotency key to prevent double charging.

1. Wallet Transfer Execution

Initiates a secure funds transfer between two internal accounts.

POST /api/v1/wallet/transfer

Headers:

Idempotency-Key: idem_uuid_8374982143 (Strict UUIDv4 format, checked at the API gateway layer)

Request Payload:

{
  "sender_account_id": "acc_sender_98314",
  "receiver_account_id": "acc_receiver_10923",
  "amount": "150.00",
  "currency": "USD",
  "description": "Payment for consult services"
}

Response Payload (Success):

{
  "status": "success",
  "data": {
    "transaction_id": "tx_ledger_89231478",
    "sender_account_id": "acc_sender_98314",
    "receiver_account_id": "acc_receiver_10923",
    "amount_transferred": "150.00",
    "fee_applied": "0.00",
    "currency": "USD",
    "status": "SETTLED",
    "timestamp": 1779435420000
  }
}

Response Payload (Failure / Conflict):

{
  "status": "error",
  "error": {
    "code": "INSUFFICIENT_FUNDS",
    "message": "Sender account balance is insufficient to cover the transaction amount",
    "details": {
      "available_balance": "42.50",
      "attempted_amount": "150.00"
    }
  }
}

High-Level Design (HLD)

To scale a digital wallet ledger, we split our system into a High-Speed Idempotency Gating Layer and a Sharded Multi-Account Consensus Ledger Engine.

1. Ingestion & Idempotency Gating Flow

This pipeline validates client requests, registers idempotency keys, and schedules transfer jobs to prevent duplicate processing.

graph TD
    Client[Mobile/Web Client] -->|POST Transfer Request| Gateway[API Gateway / Load Balancer]
    Gateway -->|Verify Idempotency Key| IdemCache[Redis Idempotency Store]
    
    IdemCache -->|Key Exists - Return Cache| Client
    IdemCache -->|New Key - Acquire Lock| Orchestrator[Wallet Transfer Orchestrator]
    
    Orchestrator -->|State: PENDING| DB[(Primary PostgreSQL Shard Cluster)]
    Orchestrator -->|Trigger Job| Executor[Ledger Execution Engine]
    
    Executor -->|Update Balance Cache| BalCache[(Redis Balance Cache)]
    Executor -->|Acknowledge Completion| Orchestrator

2. Multi-Shard Saga Orchestration Transaction Flow

When transferring funds between accounts located on different database shards, the system uses a Saga Orchestrator to coordinate execution and compensation.

graph TD
    Orchestrator[Saga Transaction Orchestrator] -->|1. Reserve Balance| ShardA[(PostgreSQL Shard A: Sender)]
    
    ShardA -->|Success: Reserved| Orchestrator
    ShardA -->|Failure: Abort| Compensate[Trigger Compensation Rollback]
    
    Orchestrator -->|2. Credit Balance| ShardB[(PostgreSQL Shard B: Receiver)]
    
    ShardB -->|Success: Credited| Orchestrator
    ShardB -->|Failure: Timeout/Error| Compensate
    
    Orchestrator -->|3. Commit Transaction & Log Ledger| ShardA
    Orchestrator -->|Update Cache| RedisCache[(Redis Balance Cache)]

Low-Level Design (LLD) & Data Models

We model our data using relational databases (PostgreSQL sharded by account_id) to leverage strict ACID transaction boundary features.

Database Selection & Schema DDL

1. Immutable Ledger Schema Design

We use three core tables: accounts (user balances), transactions (transaction metadata), and entries (individual debits and credits).

Following Double-Entry Bookkeeping rules, a transaction consists of at least two entries: one debit (negative) and one credit (positive). The sum of all entries for any transaction must equal zero.

CREATE TYPE account_type AS ENUM ('ASSET', 'LIABILITY', 'EQUITY', 'REVENUE', 'EXPENSE');
CREATE TYPE transaction_status AS ENUM ('PENDING', 'SETTLED', 'FAILED');
CREATE TYPE entry_direction AS ENUM ('DEBIT', 'CREDIT');

CREATE TABLE accounts (
    account_id VARCHAR(64) PRIMARY KEY,
    user_id VARCHAR(64) NOT NULL,
    currency VARCHAR(3) NOT NULL,
    type account_type NOT NULL,
    balance NUMERIC(18, 4) NOT NULL DEFAULT 0.0000,
    version INT NOT NULL DEFAULT 0, -- Used for Optimistic Concurrency Control
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE transactions (
    transaction_id VARCHAR(64) PRIMARY KEY,
    idempotency_key VARCHAR(128) UNIQUE NOT NULL,
    description TEXT,
    status transaction_status NOT NULL DEFAULT 'PENDING',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE entries (
    entry_id VARCHAR(64) PRIMARY KEY,
    transaction_id VARCHAR(64) REFERENCES transactions(transaction_id),
    account_id VARCHAR(64) REFERENCES accounts(account_id),
    direction entry_direction NOT NULL,
    amount NUMERIC(18, 4) NOT NULL, -- Always positive value
    signed_amount NUMERIC(18, 4) NOT NULL, -- Negative for Debit, Positive for Credit
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Indexing for fast balances lookup and audit verification
CREATE INDEX idx_accounts_user_id ON accounts(user_id);
CREATE INDEX idx_entries_account_id ON entries(account_id);
CREATE INDEX idx_entries_transaction_id ON entries(transaction_id);

Core Ledger Transaction & Optimistic Locking Implementation

When thousands of users execute transactions concurrently, multiple threads may attempt to read and write to the same account balance simultaneously. We implement Optimistic Concurrency Control (OCC) using a version column to prevent race conditions without causing database locks.

Below is a production-grade Java Spring-style implementation of a thread-safe transaction execution service.

package com.codesprintpro.wallet.service;

import java.math.BigDecimal;
import java.util.UUID;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

public class LedgerTransactionService {

    // Simulated repository endpoints
    public interface AccountRepository {
        Account findById(String accountId);
        int updateBalance(String accountId, BigDecimal newBalance, int currentVersion);
    }

    public interface TransactionRepository {
        void saveTransaction(WalletTransaction tx);
        void saveEntry(LedgerEntry entry);
    }

    public static class Account {
        public String accountId;
        public BigDecimal balance;
        public int version;

        public Account(String accountId, BigDecimal balance, int version) {
            this.accountId = accountId;
            this.balance = balance;
            this.version = version;
        }
    }

    public static class WalletTransaction {
        public String transactionId;
        public String idempotencyKey;
        public String status;

        public WalletTransaction(String transactionId, String idempotencyKey, String status) {
            this.transactionId = transactionId;
            this.idempotencyKey = idempotencyKey;
            this.status = status;
        }
    }

    public static class LedgerEntry {
        public String entryId;
        public String transactionId;
        public String accountId;
        public BigDecimal signedAmount;

        public LedgerEntry(String entryId, String transactionId, String accountId, BigDecimal signedAmount) {
            this.entryId = entryId;
            this.transactionId = transactionId;
            this.accountId = accountId;
            this.signedAmount = signedAmount;
        }
    }

    private final AccountRepository accountRepo;
    private final TransactionRepository transactionRepo;

    public LedgerTransactionService(AccountRepository accountRepo, TransactionRepository transactionRepo) {
        this.accountRepo = accountRepo;
        this.transactionRepo = transactionRepo;
    }

    @Transactional(rollbackFor = Exception.class)
    public boolean executeTransfer(
            String idempotencyKey, 
            String senderId, 
            String receiverId, 
            BigDecimal amount
    ) throws Exception {
        
        if (amount.compareTo(BigDecimal.ZERO) <= 0) {
            throw new IllegalArgumentException("Transfer amount must be positive");
        }

        // Fetch both accounts from persistent sharded storage
        Account sender = accountRepo.findById(senderId);
        Account receiver = accountRepo.findById(receiverId);

        if (sender == null || receiver == null) {
            throw new IllegalArgumentException("Invalid sender or receiver account identifier");
        }

        // Enforce business rules at the application layer
        if (sender.balance.compareTo(amount) < 0) {
            throw new IllegalStateException("Insufficient funds in sender account");
        }

        // Compute new balance configurations
        BigDecimal newSenderBalance = sender.balance.subtract(amount);
        BigDecimal newReceiverBalance = receiver.balance.add(amount);

        // Perform atomic update using Optimistic Locking version checks
        int senderRows = accountRepo.updateBalance(sender.accountId, newSenderBalance, sender.version);
        if (senderRows == 0) {
            // Concurrent update detected, abort and retry
            throw new ConcurrentModificationException("Sender account state changed during execution");
        }

        int receiverRows = accountRepo.updateBalance(receiver.accountId, newReceiverBalance, receiver.version);
        if (receiverRows == 0) {
            // Concurrent update detected, abort and retry
            throw new ConcurrentModificationException("Receiver account state changed during execution");
        }

        // Log transaction meta log
        String txId = UUID.randomUUID().toString();
        WalletTransaction tx = new WalletTransaction(txId, idempotencyKey, "SETTLED");
        transactionRepo.saveTransaction(tx);

        // Create balanced Ledger Debit and Credit entries following double-entry rules
        LedgerEntry debitEntry = new LedgerEntry(
            UUID.randomUUID().toString(), txId, sender.accountId, amount.negate()
        );
        LedgerEntry creditEntry = new LedgerEntry(
            UUID.randomUUID().toString(), txId, receiver.accountId, amount
        );

        transactionRepo.saveEntry(debitEntry);
        transactionRepo.saveEntry(creditEntry);

        return true;
    }
}

// Exception definition
class ConcurrentModificationException extends Exception {
    public ConcurrentModificationException(String message) {
        super(message);
    }
}

Trade-offs & Architecture Decisions

Enforcing strict financial correctness requires careful trade-offs. Below are the key design choices made in this system architecture.

1. Database Selection: Sharded PostgreSQL vs. Google Spanner vs. Cassandra

Cassandra (NoSQL): Highly available and horizontally scalable, but lacks native ACID transactions across multiple partitions. Enforcing double-entry ledger balance constraints at the application layer introduces dangerous race conditions and is highly prone to ledger drift.
Google Spanner: Offers global ACID transactions and horizontal scalability, but comes with high cloud vendor lock-in and high cost.
Sharded PostgreSQL (Selected): Enforces ACID guarantees at the database engine level, is open-source, and has robust tooling. Sharding by account_id enables horizontal scaling, while we manage cross-shard transfers via Saga orchestration or two-phase commits.

2. Concurrency Strategy: Optimistic Concurrency Control (OCC) vs. Pessimistic Locking

Pessimistic Locking (SELECT FOR UPDATE):
- Pros: Prevents double-spending by blocking concurrent transactions on the same account until the lock is released.
- Cons: Highly prone to database connection pool starvation under hot-spot conditions (e.g., a highly popular merchant account receiving thousands of transactions per second).
Optimistic Concurrency Control (OCC) (Selected):
- Pros: Non-blocking reads and high throughput. Transactions fail quickly and retry if another update occurred in the background.
- Cons: High abort rate under extremely high contention. We mitigate this using a randomized exponential backoff retry mechanism at the application layer.

3. Balance Management: Update-in-Place vs. Event-Sourced Ledger

Update-in-Place (Hybrid Selected): Maintaining a materialized balance column in the accounts table enables rapid balance query latency (under 10ms) and easy schema verification.
Event-Sourced Ledger: Re-calculating the user balance by querying and summing every historical ledger entry guarantees perfect auditing but introduces unsustainable latency as a user's transaction history grows.
Our Hybrid Approach: We store the current materialized balance in accounts using OCC, and write each entry to the entries ledger table within the exact same database transaction block.

Failure Scenarios & Mitigation Strategies

Financial platforms must design for failure. The ledger must remain resilient under bad network conditions, hardware outages, and malicious double-spends.

1. The Partial Success Problem (Saga Compensations)

When transferring funds between two separate PostgreSQL shards (Sender on Shard A, Receiver on Shard B), a network failure after deducting funds from Shard A but before crediting Shard B would result in money vanishing.

Mitigation: The Saga Orchestrator manages the workflow. If the credit to Shard B fails, the Orchestrator executes a compensational transaction: a rollback credit entry back to Shard A. Both operations are tracked using a unique saga_id to guarantee trace integrity.

2. High-Frequency Account Deadlocks

If User A transfers money to User B at the exact same millisecond User B transfers money to User A, two concurrent threads could lock Shard A and Shard B in reverse order, creating a classic database deadlock.

Mitigation: In our application code, we always sort the lock acquisition order of accounts by their alphanumeric ID. For instance, if acc_A is alphabetically less than acc_B, any transaction involving both accounts will lock acc_A first, and then lock acc_B, eliminating deadlock loops.

3. Redundant Network Retries (Idempotency Gating)

A client sends a payment request, but their internet drops before receiving the success response. The client app automatically retries the request, risking a double-charge.

Mitigation: The load balancer routes all write requests through a Redis-backed idempotency filter. If the Idempotency-Key exists in Redis, the request is blocked, and either the cached successful response or a "transaction in-progress" state is returned instantly.

4. Background Reconciliation Services

Even with ACID databases, small software bugs or silent disk corruption can cause ledger mismatch.

Mitigation: A daily offline cron job running on Spark/Hadoop scans all shard databases. It sums every entry in the entries table for each account and compares it against the materialized balance in the accounts table. If a mismatch is detected, the account is temporarily frozen, and an alert is dispatched to the security operations center.

Staff Engineer Perspective

As a Staff Engineer building financial ledgers, three golden rules must never be compromised:

Candidate Verbal Script & Mock Interview Guide

Here is a step-by-step walkthrough of how to articulate this design during an actual System Design interview.

1. Requirements Phase (Minutes 0 - 5)

Candidate: "To design a secure digital wallet ledger, I will clarify functional scope. Do we support multiple currencies? Yes. Do we need an immutable ledger? Yes. For non-functional goals, we want strong consistency (CP). I will assume a scale of 1 Billion daily transactions, meaning we need to handle an average of 11,500 QPS, peaking at 35,000 QPS. Our P99 execution latency should remain under 100ms."

2. High-Level Design (Minutes 5 - 15)

Candidate: "I will separate my architecture into three major components:
1. An API Gateway with a Redis-based Idempotency Layer to prevent duplicate charges.
2. A Saga Transaction Orchestrator to coordinate multi-shard money movements.
3. A horizontally sharded PostgreSQL database cluster, sharded by Account ID, to store the balances and entry logs. I will draw a Mermaid flow diagram demonstrating how a client request is first validated, saved in an active transaction store, and committed."

3. Data Schema & Double-Entry Bookkeeping (Minutes 15 - 25)

Candidate: "For data integrity, I will not store balance as a simple mutable cell. I will design a double-entry database schema. The schema consists of accounts, transactions, and entries. Every transaction must contain at least two entries: a debit and a credit. The sum of signed amounts for any transaction must always equal zero. This makes auditing simple and prevents money from appearing or disappearing out of thin air."

4. Concurrency & Hotspots Deep-Dive (Minutes 25 - 40)

Candidate: "At 35,000 QPS, concurrency is our main challenge. If thousands of users pay a single popular merchant account, we will face high transaction collisions. If we use pessimistic database locking, we risk thread pools starving. Instead, I will implement Optimistic Concurrency Control (OCC) using a version column. If two threads read version 5 and try to update the merchant account, only one will succeed, and the other will retry. For cross-shard transactions, I will use a Saga Orchestrator to ensure eventual ledger consistency across separate physical database shards."