The Problem
Our client is a growing Australian payment processor handling thousands of daily card transactions across dozens of merchants — from fashion retail chains to ticketing platforms and trade suppliers. Like every payment processor, they face a constant threat: transaction fraud.
Card testing attacks. Velocity abuse. Refund manipulation. Organised fraud rings cycling through stolen BINs. These aren't theoretical risks — they're daily operational realities that cost the payment industry billions annually. The Australian Payments Network reported $2.2 billion in card fraud losses in 2024 alone.
The conventional solution? Enterprise fraud platforms from companies like NICE Actimize, Featurespace, or SAS — systems that start at six figures annually, require months of integration, and demand dedicated fraud operations teams. For a growing payment processor, that's not a viable starting point.
The processor needed something different: a system that could detect real fraud patterns across their transaction data, run in near-real-time, and cost almost nothing to operate. They needed it built fast, and they needed it to work.
The Approach: Claude Code as the Engineering Team
We built the entire fraud detection system — pipeline, ML models, API, and live dashboard — using Claude Code as the primary development tool. Not as a code autocomplete assistant, but as a full engineering partner: designing the detection architecture, writing the rule engine, implementing the ML models, building the API layer, and creating the dashboard.
The development process looked like this:
1. Requirements gathering — We analysed the processor's transaction data (2.5 million historical transactions) to understand the data shape, identify fraud-prone patterns, and establish detection baselines.
2. Architecture design — Claude Code helped design a layered detection system: rule-based pattern matching for known fraud types, statistical anomaly detection for unknown patterns, and ML-based multivariate outlier detection for complex fraud signals.
3. Implementation — The full system was built iteratively. Each component — database schema, feature engineering, detection rules, scoring engine, API, dashboard — was developed, tested, and refined through conversation with Claude Code.
4. Audit and hardening — After the initial build, we ran a multi-model cross-examination (Claude, Gemini, DeepSeek) that identified 22 issues — false positive sources, miscalibrated thresholds, and edge cases. All were fixed before deployment.
The total development time from first line of code to production deployment was measured in days, not months.
What It Detects: 18 Fraud Patterns Across 6 Categories
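The system runs 18 detection rules across three detection layers (rule-based pattern matching, statistical anomaly detection, and ML-based outlier detection), covering the major fraud attack vectors that payment processors face.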
Card Testing (the #1 attack vector)
When a fraudster obtains a batch of stolen card numbers, the first thing they do is test which ones work. They run micro-charges — $0.01, $0.99, $2.50 — in rapid succession. If the card clears, they know it's live.
Our system catches this with three interlocking rules:
- MICRO_AMOUNT flags any transaction under $3 (the classic card test amount)
- RAPID_FIRE detects 2+ transactions from the same card within 5 minutes
- IDENTICAL_REPEAT catches the same card making the same charge within 2 minutes
A single micro-charge scores 15 points (WATCH tier). But a card testing attack — three $0.99 charges in 4 minutes — stacks to 95 points (HIGH_RISK), triggering an immediate alert.
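The three rules can be sketched in a few lines of Python. This is a minimal illustration, not the production rule engine: the `Txn` class and `card_testing_flags` function are hypothetical names, the thresholds come from the rule list above, and the point values are the ones quoted in this article (15 for MICRO_AMOUNT here, 30 and 20 for the other two in the example alert further down). How the production system stacks flags across a whole attack is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Point values as quoted in the article; names below are illustrative.
MICRO_AMOUNT_POINTS = 15
RAPID_FIRE_POINTS = 30
IDENTICAL_REPEAT_POINTS = 20


@dataclass
class Txn:
    card_id: str
    amount: float
    ts: datetime


def card_testing_flags(history: list[Txn], txn: Txn) -> dict[str, int]:
    """Return {rule_name: points} for `txn`, given earlier transactions in `history`."""
    flags: dict[str, int] = {}
    earlier = [t for t in history if t.card_id == txn.card_id and t.ts <= txn.ts]

    # MICRO_AMOUNT: any transaction under $3.
    if txn.amount < 3.00:
        flags["MICRO_AMOUNT"] = MICRO_AMOUNT_POINTS

    # RAPID_FIRE: this transaction plus at least one earlier one within 5 minutes.
    if any(txn.ts - t.ts <= timedelta(minutes=5) for t in earlier):
        flags["RAPID_FIRE"] = RAPID_FIRE_POINTS

    # IDENTICAL_REPEAT: same card, same amount, within 2 minutes.
    if any(t.amount == txn.amount and txn.ts - t.ts <= timedelta(minutes=2)
           for t in earlier):
        flags["IDENTICAL_REPEAT"] = IDENTICAL_REPEAT_POINTS

    return flags
```

With this sketch, a lone $0.99 charge triggers only MICRO_AMOUNT, while a second $0.99 charge on the same card one minute later triggers all three rules at once.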
Decline Attacks (APD — Advanced Purchase Declining)
Sophisticated fraudsters probe for valid card numbers by submitting rapid authorisation requests, expecting most to decline. They're fishing for the one that clears.
- DECLINE_CLUSTER flags 3+ declines within a 30-minute window
- DECLINE_THEN_RETRY catches an approval immediately following a decline on the same card
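As a rough sketch of how these two rules could be evaluated for a single card, assuming its authorisation attempts are available as time-sorted `(timestamp, approved)` pairs. The function name and the one-minute retry gap are assumptions for illustration (the gap borrows the "within a minute" phrasing used later in this article).

```python
from datetime import datetime, timedelta

CLUSTER_WINDOW = timedelta(minutes=30)
RETRY_GAP = timedelta(minutes=1)  # assumed meaning of "immediately following"


def decline_flags(events: list[tuple[datetime, bool]]) -> set[str]:
    """events: (timestamp, approved) pairs for one card, sorted by timestamp."""
    flags: set[str] = set()
    declines = [ts for ts, approved in events if not approved]

    # DECLINE_CLUSTER: any three consecutive declines that fit inside 30 minutes.
    for i in range(len(declines) - 2):
        if declines[i + 2] - declines[i] <= CLUSTER_WINDOW:
            flags.add("DECLINE_CLUSTER")
            break

    # DECLINE_THEN_RETRY: an approval whose direct predecessor was a recent decline.
    for (prev_ts, prev_ok), (ts, ok) in zip(events, events[1:]):
        if ok and not prev_ok and ts - prev_ts <= RETRY_GAP:
            flags.add("DECLINE_THEN_RETRY")
            break

    return flags
```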
Velocity Abuse (Impossible Travel)
A card used at a Melbourne retail store at 2:00 PM and a Sydney merchant at 2:15 PM? That's physically impossible — the card has been cloned or the number stolen for card-not-present fraud.
- CROSS_MERCHANT_VELOCITY detects cards appearing at 2+ merchants within 30 minutes
- CARD_ROAMING flags cards seen at 3+ distinct merchants within 7 days
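Both checks reduce to scanning one card's merchant visits in time order. The sketch below is an illustration under assumptions: `velocity_flags` is a hypothetical helper, and the input is assumed to be a time-sorted list of `(timestamp, merchant_id)` pairs for a single card.

```python
from datetime import datetime, timedelta


def velocity_flags(visits: list[tuple[datetime, str]]) -> set[str]:
    """visits: (timestamp, merchant_id) pairs for one card, sorted by timestamp."""
    flags: set[str] = set()

    # CROSS_MERCHANT_VELOCITY: two different merchants within 30 minutes.
    for i, (ts_i, merchant_i) in enumerate(visits):
        for ts_j, merchant_j in visits[i + 1:]:
            if ts_j - ts_i > timedelta(minutes=30):
                break  # sorted input: later visits are even further away
            if merchant_j != merchant_i:
                flags.add("CROSS_MERCHANT_VELOCITY")

    # CARD_ROAMING: three or more distinct merchants inside any 7-day span.
    for i, (ts_i, _) in enumerate(visits):
        merchants = {m for ts, m in visits[i:] if ts - ts_i <= timedelta(days=7)}
        if len(merchants) >= 3:
            flags.add("CARD_ROAMING")
            break

    return flags
```

The Melbourne-then-Sydney example above (15 minutes apart, two merchants) would trigger CROSS_MERCHANT_VELOCITY in this sketch, while three different merchants on three consecutive days would trigger only CARD_ROAMING.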
Organised Fraud Rings
The most dangerous fraud isn't one-off — it's systematic. A ring of fraudsters using cards from the same BIN batch, making near-identical purchases across merchants, repeating the pattern monthly.
- BIN_CONCENTRATION detects unusual spikes in specific card BIN usage (3x baseline)
- CROSS_MERCHANT_SIMILAR_AMOUNT catches cards making purchases within 20% of each other at different merchants
- PERIODIC_PATTERN identifies repeat patterns across 90-day windows — the signature of an organised operation
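The first two ring rules are simple ratio checks. A minimal sketch, with hypothetical function names and two labelled assumptions: a BIN with no baseline is never flagged, and "within 20%" is measured relative to the larger of the two amounts.

```python
def bin_concentration(today_count: int, baseline_daily: float,
                      multiple: float = 3.0) -> bool:
    """True when today's usage of a BIN is at least `multiple` times its daily baseline."""
    if baseline_daily <= 0:
        return False  # assumption: no baseline yet means nothing to compare against
    return today_count / baseline_daily >= multiple


def similar_amount(a: float, b: float, tolerance: float = 0.20) -> bool:
    """True when two amounts differ by at most `tolerance`, relative to the larger one."""
    larger = max(abs(a), abs(b))
    return larger > 0 and abs(a - b) / larger <= tolerance
```

For instance, a BIN seen 5 times today against a baseline of 1.4 per day is about 3.6x baseline, so it crosses the 3x threshold; 4 times would not.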
Refund Fraud
"Friendly fraud" — where a customer makes a legitimate purchase, receives the goods, then disputes the charge — costs merchants billions annually. Serial refund abusers leave a trail.
- EXCESSIVE_REFUNDS flags 2+ refunds on the same card in a single day
- CROSS_DAY_REFUND detects cards with refund patterns spanning multiple days over 6 months
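A sketch of both refund checks for a single card, assuming the caller has already restricted the refund dates to the 6-month lookback. The `min_days` threshold for CROSS_DAY_REFUND is a placeholder of ours; the article does not state the real value.

```python
from collections import Counter
from datetime import date


def refund_flags(refund_dates: list[date], min_days: int = 2) -> set[str]:
    """refund_dates: one entry per refund on a single card, within the lookback window."""
    flags: set[str] = set()
    per_day = Counter(refund_dates)

    # EXCESSIVE_REFUNDS: two or more refunds on the same calendar day.
    if any(count >= 2 for count in per_day.values()):
        flags.add("EXCESSIVE_REFUNDS")

    # CROSS_DAY_REFUND: refunds on at least `min_days` distinct days (assumed threshold).
    if len(per_day) >= min_days:
        flags.add("CROSS_DAY_REFUND")

    return flags
```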
Statistical Anomalies (What the Rules Miss)
Rules catch known patterns. ML catches the unknown. Our system runs three statistical models (Z-score, Isolation Forest, and Poisson). The first two work as follows:
- Z-Score Outlier Detection flags transactions more than 3.5 standard deviations from the merchant's 90-day baseline
- Isolation Forest (scikit-learn) identifies multivariate outliers using 5 features: amount, transaction hour, daily card count, time since last transaction, and amount z-score. It flags the top 2% most anomalous transactions.
The Isolation Forest is particularly powerful because it finds fraud patterns that no single rule would catch — a moderately large transaction, at an unusual hour, from a card with no recent history, at an amount that's slightly unusual. Each signal alone is weak; together, they're highly suspicious.
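The z-score check is the simplest of the statistical models to illustrate. The sketch below uses only the standard library and covers the z-score rule alone, not the Isolation Forest. It assumes the merchant's 90-day amounts are already loaded into a list and uses the sample standard deviation; both the function names and that choice are ours.

```python
from statistics import mean, stdev


def amount_zscore(amount: float, baseline_amounts: list[float]) -> float:
    """Z-score of `amount` against a merchant's baseline amounts."""
    if len(baseline_amounts) < 2:
        return 0.0  # not enough history to estimate spread
    sigma = stdev(baseline_amounts)
    if sigma == 0:
        return 0.0  # every baseline amount identical; avoid dividing by zero
    return (amount - mean(baseline_amounts)) / sigma


def is_zscore_outlier(amount: float, baseline_amounts: list[float],
                      threshold: float = 3.5) -> bool:
    """True when the amount sits more than `threshold` standard deviations from the mean."""
    return abs(amount_zscore(amount, baseline_amounts)) > threshold
```

Against a merchant whose typical sales run between $40 and $60, a $1,084.78 charge is flagged immediately, while a $58 charge is not.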
The Scoring System: Explainable by Design
Every flagged transaction gets a score — an additive total of all triggered rules. This was a deliberate architectural choice over a black-box ML classifier.
| Risk Tier | Score | Action |
|---|---|---|
| CLEAR | 0–9 | No action |
| WATCH | 10–24 | Log for pattern analysis |
| SUSPICIOUS | 25–49 | Review recommended |
| HIGH_RISK | 50+ | Manual review mandatory |
A real example from the system:
[HIGH_RISK] Score: 155
Card: 523163-XXXX
RAPID_FIRE (+30) 4 minutes since last transaction
DECLINE_THEN_RETRY (+15) Retry after decline
UNUSUAL_AMOUNT (+20) $1,084.78 — z-score 12.6
BIN_CONCENTRATION (+25) BIN 523163 at 5 txns today (baseline: 1.4)
IDENTICAL_REPEAT (+20) Same amount same day
ZSCORE_OUTLIER (+20) Statistical outlier
ISOLATION_FOREST (+25) Multivariate anomaly
Every point has a reason. Every reason is auditable. When a fraud analyst reviews this alert, they don't see "ML confidence: 87%." They see exactly which patterns triggered and why. This matters for compliance, for training, and for tuning the system over time.
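The additive scoring is small enough to sketch in full. The tier boundaries come from the table above and the point values from the example alert; the function name and data layout are illustrative assumptions.

```python
# (minimum score, tier name), checked from highest to lowest.
TIERS = [(50, "HIGH_RISK"), (25, "SUSPICIOUS"), (10, "WATCH"), (0, "CLEAR")]


def score_transaction(triggered: dict[str, int]) -> tuple[int, str]:
    """Sum the points of every triggered rule and map the total to a risk tier."""
    total = sum(triggered.values())
    tier = next(name for floor, name in TIERS if total >= floor)
    return total, tier


# The example alert above, expressed as {rule_name: points}.
example = {
    "RAPID_FIRE": 30,
    "DECLINE_THEN_RETRY": 15,
    "UNUSUAL_AMOUNT": 20,
    "BIN_CONCENTRATION": 25,
    "IDENTICAL_REPEAT": 20,
    "ZSCORE_OUTLIER": 20,
    "ISOLATION_FOREST": 25,
}
print(score_transaction(example))  # (155, 'HIGH_RISK')
```

Because the dictionary of triggered rules travels with the score, the reasons behind a total are always available to display next to it.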
The Stack: $0 in Software Licensing
The entire system runs on open-source software:
| Component | Technology | Cost |
|---|---|---|
| Database | DuckDB (embedded, columnar) | Free |
| Detection Pipeline | Python + pandas + scikit-learn + scipy | Free |
| API Server | FastAPI + uvicorn | Free |
| Dashboard | Next.js + React + Recharts + Tailwind | Free |
| ML Models | Isolation Forest (scikit-learn) | Free |
| Total software cost | | $0 |
Why DuckDB?
The single most impactful architectural decision was choosing DuckDB over PostgreSQL. DuckDB is an embedded analytical database — think SQLite for analytics. It requires:
- No server process
- No configuration
- No licensing
- No maintenance
It sits as a single file on disk (93 MB for 2.5 million transactions), supports full SQL with window functions and CTEs, and handles concurrent reads while the detection pipeline writes. The API opens the database in read-only mode, so the dashboard stays responsive even while the cron pipeline processes new transactions.
For a system processing 5,000 transactions per day, DuckDB handles everything comfortably. At 50,000 per day, it still fits in memory on a modest machine. You'd need to be processing 500,000+ daily transactions before considering a traditional database — and even then, the migration path is straightforward because the queries are standard SQL.
Infrastructure: $12/Month for Production
The entire production system runs on a single DigitalOcean droplet:
| Resource | Specification | Monthly Cost |
|---|---|---|
| VPS | 1 vCPU, 1GB RAM, 25GB SSD | $6–12 |
| Database | DuckDB file on local SSD | $0 |
| Monitoring | Cron logs + dashboard | $0 |
| Backups | rsync to S3 (optional) | $0.02 |
| Total | | ~$12/month |
Compare this to enterprise fraud detection platforms:
| Solution | Cost | Setup Time |
|---|---|---|
| Enterprise platform (NICE, Featurespace) | $100,000–500,000+ per year | 3–6 months |
| Managed fraud API (Stripe Radar, Sift) | $0.05–0.10 per transaction | 1–2 weeks |
| Our custom system | $144 per year | Built in days |
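At 5,000 transactions per day, a per-transaction fraud API priced at $0.05–0.10 per transaction would cost $250–500 per day, or roughly $7,500–15,000 per month. Our system costs $12 per month. At 50,000 transactions per day, the API approach costs $2,500–5,000 per day (roughly $75,000–150,000 per month). Ours still costs $12.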
The cost advantage compounds as volume grows.
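The per-transaction arithmetic is easy to check. This small sketch assumes a 30-day month and works in whole cents to avoid floating-point rounding; the function name is ours.

```python
def monthly_api_cost(txns_per_day: int, fee_cents: int, days: int = 30) -> float:
    """Monthly cost in dollars of a per-transaction fraud API."""
    return txns_per_day * fee_cents * days / 100


# Fees of 5 and 10 cents per transaction, as in the comparison table.
for volume in (5_000, 50_000):
    low = monthly_api_cost(volume, 5)
    high = monthly_api_cost(volume, 10)
    print(f"{volume:>6} txns/day: ${low:,.0f} to ${high:,.0f} per month")
```

The fixed $12 per month for the droplet does not change with volume, which is where the gap comes from.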
Near-Real-Time Detection: 1-Minute Cycles
The detection pipeline runs as a 1-minute cron job:
1. Ingest any new transaction CSV data
2. Compute features (card velocity, amount baselines, BIN counts)
3. Run all 18 detection rules
4. Score and classify each transaction
5. Persist results to card_risk_history
6. Stream HIGH_RISK alerts to the dashboard via SSE
A lock file prevents concurrent runs. Deduplication (by transaction_id + rule_name) ensures no double-counting across cycles. The dashboard updates every 30 seconds with a fallback polling mechanism if the Server-Sent Events connection drops.
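The lock file and deduplication steps can be sketched with the standard library alone. This is an illustration under assumptions rather than the production code: `pipeline_lock` and `new_flags` are hypothetical names, flags are assumed to be dictionaries with `transaction_id` and `rule_name` keys, and the set of already-seen keys would be loaded from the database in practice.

```python
import os
from contextlib import contextmanager


@contextmanager
def pipeline_lock(path: str):
    """Yield True if this run acquired the lock file, False if another run holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        yield True
    finally:
        os.remove(path)


def new_flags(flags: list[dict], seen: set[tuple[str, str]]) -> list[dict]:
    """Keep only flags whose (transaction_id, rule_name) has not been recorded before."""
    fresh = []
    for flag in flags:
        key = (flag["transaction_id"], flag["rule_name"])
        if key not in seen:
            seen.add(key)
            fresh.append(flag)
    return fresh
```

A cron run would wrap its work in `with pipeline_lock(...) as acquired:` and exit early when `acquired` is False. One design caveat of this simple approach: a run that is killed hard leaves a stale lock file behind, so a real deployment would want a staleness check.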
This isn't true millisecond-latency detection — it's near-real-time batch processing. But for the fraud patterns that matter (card testing runs, velocity abuse, refund patterns), a 1–5 minute detection window is more than sufficient. The fraudster's testing sequence takes minutes; we detect it within the same window.
The Dashboard: From Alert to Action
The live dashboard gives the fraud team everything they need in one view:
- Summary Panel: Total transactions, flagged count, tier breakdown, last pipeline run status. At a glance: "Are we healthy?"
- Flags Table: Paginated, filterable list of flagged transactions. Filter by tier, rule, minimum score, date range. Sort by risk score. Click through to card history.
- Timeline Chart: Hourly transaction volume overlaid with flag density. Spot attack windows visually; a spike in flags at 3 AM is immediately obvious.
- Rules Breakdown: Which detection rules are firing most today? If RAPID_FIRE suddenly dominates, there's likely a card testing attack in progress.
- Live Feed: Real-time SSE stream of HIGH_RISK and SUSPICIOUS alerts. New alerts appear instantly as the pipeline processes them.
- Card Deep-Dive: Click any card ID to see its full flag history: every rule triggered, every merchant visited, every amount flagged, across the full 90-day lookback window.
What Small Financial Businesses Should Know
Fraud detection isn't just for the big banks. If you process payments — even a few hundred transactions per day — you're a target. Card testing attacks are automated; they don't discriminate by company size.
Here's what we learned building this system:
1. Simple Rules Catch Most Fraud
The Isolation Forest and statistical models are impressive, but the rules that catch the most real fraud are embarrassingly simple:
- "Did this card make 2 or more transactions in 5 minutes?" (RAPID_FIRE)
- "Is this amount under $3?" (MICRO_AMOUNT)
- "Did this card get declined and then approved within a minute?" (DECLINE_THEN_RETRY)
These three rules alone, costing nothing to implement, would catch the majority of card testing attacks — which represent the highest volume of payment fraud.
2. You Don't Need Big Data to Detect Fraud
Our system works with a 93 MB database. The Isolation Forest trains on 500 random transactions. The baselines compute from 90-day lookbacks using standard SQL window functions. You don't need a data lake or a Spark cluster. You need clean transaction data and well-calibrated thresholds.
3. Explainability Beats Accuracy
A fraud analyst who sees "RAPID_FIRE: 3 transactions in 4 minutes from card ending 5711" can act immediately. An analyst who sees "ML confidence: 83%" needs to investigate further. In production fraud operations, interpretable rules with clear reasons reduce mean time to decision from minutes to seconds.
4. The Cost of Not Detecting Fraud Is Real
For small payment processors, a single undetected card testing attack can mean:
- Hundreds of fraudulent micro-charges creating operational overhead
- Chargebacks that eat into margins
- Card network penalties for high fraud rates
- Reputational damage with merchants
A $12/month detection system that catches 80% of card testing attacks pays for itself on the first prevented incident.
5. Claude Code Makes This Accessible
The engineering effort to build a system like this traditionally requires:
- A data engineer to design the pipeline
- An ML engineer to build and tune the models
- A backend developer for the API
- A frontend developer for the dashboard
- A DevOps engineer to deploy and monitor
We replaced all five roles with a single developer working with Claude Code. That is not because expertise stopped mattering; Claude Code embodies it. The detection rules, scoring calibration, feature engineering, API design, and dashboard architecture reflect the kind of decisions that experienced fraud engineers make. Claude Code made those decisions accessible to a small team that couldn't afford to hire five specialists.
Results
| Metric | Value |
|---|---|
| Detection rules | 18 (11 per-transaction, 7 cross-session) |
| ML models | 3 (Isolation Forest, Z-score, Poisson) |
| Fraud categories covered | 6 (card testing, APD, velocity, rings, refund, anomaly) |
| Detection latency | 1–5 minutes |
| Monthly infrastructure cost | $12 |
| Software licensing cost | $0 |
| Database size (2.5M transactions) | 93 MB |
| Development time | Days, not months |
| Ongoing maintenance | ~1 hour/month |
The Bottom Line
Fraud detection doesn't have to be expensive, complex, or slow to deploy. With the right tools — DuckDB for zero-cost storage, scikit-learn for ML, FastAPI for the API, Next.js for the dashboard, and Claude Code for the engineering — a small payment processor can have enterprise-grade fraud detection running in production for the cost of a Netflix subscription.
The fraudsters have automated their attacks. It's time the defenders automated their defences.
*Built by Parity AI for the payment processor. Powered by Claude Code.*
*For enquiries: admin@parityai.com.au*