Our Methodology
TL;DR: We don't trust casinos. We don't trust claims. We test raw cryptographic output against government-grade statistical standards — NIST SP 800-22, PractRand, and TestU01 — using 100,000+ rounds per game. Every dataset is published. Every test is reproducible. If you can do math, you can check our work.
Why This Page Exists
I spent 23 years as a pit boss in land-based casinos. I watched players lose money, sure — but they could see the wheel spin. They could watch the cards being dealt. Trust wasn't blind. It was built into the physical process.
Online crypto casinos removed all of that. They replaced it with a promise: "It's provably fair."
Here's the problem — provably fair proves integrity, not fairness. A casino can pass every hash check and still rig outcomes through seed timing exploits. That's not theory. That's a documented vulnerability with working proof-of-concept code.
So we built something different. A testing framework that doesn't care what the casino says. It only cares what the numbers show.
What We Actually Test
Raw Floats, Not Game Results
This is the most important architectural decision we made — and it's the same one used by GLI and eCOGRA, the gold standard in regulated gambling.
Every provably fair game works the same way under the hood:
HMAC-SHA256(server_seed, client_seed:nonce:round) → 32 bytes → float [0,1) → game result
Dice, Crash, Limbo, CoinFlip, Roulette: they all start from the same raw uniform float. The game result is just a deterministic transformation of that float. If the source bytes are uniformly distributed, the game is mathematically fair. Period.
We test the source. Not the transformation.
Why? Because testing game results introduces noise from the transformation function itself. A crash multiplier distribution should look skewed — that's by design. But the underlying bytes should be perfectly uniform. Testing at the byte level is cleaner, more powerful, and catches manipulation that game-level tests would miss.
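Here is a minimal sketch of that pipeline in Python. It assumes the HMAC message is the string `client_seed:nonce:round` and that the float is built from the first four digest bytes. That is a common convention, but casinos differ, so treat both choices and the function name `raw_float` as illustrative assumptions rather than any specific casino's implementation.

```python
import hashlib
import hmac


def raw_float(server_seed: str, client_seed: str, nonce: int, round_: int = 0) -> float:
    """Derive one uniform float in [0, 1) from the seeds (illustrative convention)."""
    message = f"{client_seed}:{nonce}:{round_}".encode()
    # HMAC-SHA256 keyed with the server seed always yields 32 bytes.
    digest = hmac.new(server_seed.encode(), message, hashlib.sha256).digest()
    # Assumption: read the first 4 bytes as base-256 fractional digits.
    return sum(byte / 256 ** (i + 1) for i, byte in enumerate(digest[:4]))


if __name__ == "__main__":
    print(raw_float("example-server-seed", "example-client-seed", nonce=1))
```

The same inputs always give the same float, which is what makes independent re-computation possible. The audit looks at the digest bytes and this float, before any game formula touches them.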
Three Testing Frameworks, One Verdict
We don't rely on a single test or a single framework. We use three independent, complementary test suites: NIST SP 800-22 on every dataset, plus PractRand and TestU01 BigCrush for Deep Audits.
Framework 1: NIST SP 800-22 Rev. 1a — Complete Suite
The National Institute of Standards and Technology published SP 800-22 as the standard for evaluating random and pseudorandom number generators. It's what governments, military contractors, and financial institutions use to certify their cryptographic systems.
We run the complete battery — all 15 tests. Not a subset. Not a "simplified version." The full thing.
| # | Test | NIST Section | What It Catches |
|---|---|---|---|
| 1 | Monobit (Frequency) | § 2.1 | Overall bias — are there more 0s than 1s in the bitstream? |
| 2 | Block Frequency | § 2.2 | Local bias — does the balance hold in smaller sub-sequences? |
| 3 | Runs Test | § 2.3 | Patterns in consecutive values — too many or too few streaks? |
| 4 | Longest Run of Ones | § 2.4 | Suspicious clustering — are the longest streaks within normal range? |
| 5 | Binary Matrix Rank | § 2.5 | Linear dependencies — hidden structure in the bit matrix? |
| 6 | DFT Spectral | § 2.6 | Periodic patterns — Fourier analysis reveals hidden cycles |
| 7 | Non-overlapping Template | § 2.7 | Specific bit patterns appearing too often or too rarely |
| 8 | Overlapping Template | § 2.8 | Same as above, but with overlapping pattern windows |
| 9 | Maurer's Universal | § 2.9 | Compressibility — can the output be compressed? (If yes: not random) |
| 10 | Linear Complexity | § 2.10 | Predictability — could a linear feedback shift register reproduce this? |
| 11 | Serial Test | § 2.11 | Pair and triplet uniformity — are bit combinations evenly distributed? |
| 12 | Approximate Entropy | § 2.12 | Entropy in overlapping patterns — is the output truly unpredictable? |
| 13 | Cumulative Sums | § 2.13 | Drift over time — does the output trend in one direction? |
| 14 | Random Excursions | § 2.14 | Cycle analysis — abnormal patterns in cumulative sum walks |
| 15 | Random Excursions Variant | § 2.15 | State visit frequency — does the random walk visit states evenly? |
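Each test produces a p-value: the probability that a truly random source would produce output at least as extreme as what we observed. We use a significance level of α = 0.01 (99% confidence), so a p-value below 0.01 counts as a failure. A genuinely random source will still fail any single test about 1% of the time, which is one reason we never judge a casino on one test alone.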
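To make the p-value concrete, here is a sketch of the simplest test in the battery, the Monobit (Frequency) test from § 2.1, written from the formula in the NIST document. The function name and the choice to pass bits as a string of `0` and `1` characters are our own assumptions for illustration.

```python
import math

ALPHA = 0.01  # significance level used throughout the audit


def monobit_p_value(bits: str) -> float:
    """NIST SP 800-22 frequency (monobit) test on a string of '0'/'1' characters."""
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1, then sum: a balanced stream sums close to zero.
    s_n = sum(1 if b == "1" else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


if __name__ == "__main__":
    p = monobit_p_value("1011010101")  # 10-bit worked example from the NIST document
    print(f"p = {p:.6f}, pass = {p >= ALPHA}")
```

For this 10-bit input the NIST document reports a p-value of 0.527089, which is a quick way to check that an implementation matches the standard. Real audits run on far longer bitstreams.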
Additional Statistical Tests
Beyond NIST, we run four more tests from standard statistics — different mathematical lenses on the same data:
| # | Test | What It Catches |
|---|---|---|
| 16 | Chi-Square Goodness of Fit | Are outcomes distributed as uniformly as they should be? |
| 17 | Kolmogorov-Smirnov | Does the empirical distribution match the theoretical one? |
| 18 | Serial Correlation (Lag-1) | Can you predict the next value from the previous one? |
| 19 | Runs Up/Down (Wald-Wolfowitz) | Are there suspicious trends — too many ups or downs in a row? |
Framework 2: PractRand
NIST is the industry standard. PractRand is the industry nightmare.
Developed by Chris Doty-Humphrey, PractRand is widely regarded as the most demanding PRNG test suite in existence. Where NIST tests might pass a mediocre generator, PractRand will tear it apart.
PractRand works differently from NIST. It consumes a raw binary stream and runs progressively harder tests at increasing data volumes — from kilobytes to terabytes. It doesn't just check for bias. It hunts for subtle correlations, periodicities, and structural weaknesses that standard tests miss entirely.
If NIST is a medical check-up, PractRand is an MRI. It finds things you didn't know were there.
We convert casino outcomes into raw binary streams and feed them directly into PractRand. A generator that passes both NIST and PractRand is, for all practical purposes, indistinguishable from true randomness.
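The conversion step can be sketched as follows. This version concatenates the raw 32-byte HMAC-SHA256 digests into one binary file. The message format `client_seed:nonce:0` and both function names are hypothetical, and a real audit has to use whatever derivation the casino documents.

```python
import hashlib
import hmac
from typing import Iterator


def digest_stream(server_seed: str, client_seed: str, rounds: int) -> Iterator[bytes]:
    """Yield the raw 32-byte digest for each nonce (hypothetical message format)."""
    for nonce in range(rounds):
        message = f"{client_seed}:{nonce}:0".encode()
        yield hmac.new(server_seed.encode(), message, hashlib.sha256).digest()


def write_stream(path: str, chunks: Iterator[bytes]) -> int:
    """Concatenate the chunks into one binary file; return the bytes written."""
    total = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            total += fh.write(chunk)
    return total
```

The resulting file is a plain byte stream with no headers or separators, which is the form a stream-based tester expects. Note the arithmetic: 100,000 rounds at 32 bytes each is only about 3.2 MB, so the depth a stream test can reach depends directly on how many rounds are available.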
Framework 3: TestU01 (BigCrush)
TestU01 is the academic gold standard, developed at the Université de Montréal. Its BigCrush battery runs 106 statistical tests over 3–4 hours — the most comprehensive single-run analysis of a random number generator that exists in peer-reviewed literature.
Where NIST gives you the government stamp and PractRand hunts for subtle structural flaws, BigCrush throws everything academia has developed over decades at your data. If a generator survives all three, there is no known statistical method that could distinguish it from true randomness.
Audit Tiers
Not every audit needs the same depth. We run two tiers:
Standard Audit (Every Report)
Every published audit report runs through our 25-test battery:
- 15 NIST SP 800-22 tests (complete suite)
- 4 additional statistical tests (Chi-Square, K-S, Serial Correlation, Runs Up/Down)
- 6 game-specific validation tests
This already exceeds what any competitor runs. It covers everything a well-implemented provably fair system should pass.
Deep Audit (On Request)
For casinos that want to prove they’re beyond reproach — or players who need absolute certainty — we go further:
- PractRand — progressive binary stream analysis, from kilobytes to terabytes
- TestU01 BigCrush — 106 academic-grade tests, 3–4 hour runtime
A Deep Audit is available on request. We run it when the stakes are high, the dataset is large, or someone challenges our findings. Three independent scientific frameworks, zero overlap in methodology, one verdict.
If NIST is the medical check-up, PractRand is the MRI, and BigCrush is the full autopsy. Most patients only need the check-up. But we have the operating room ready.
Game-Specific Validation
On top of the raw-float analysis, we run game-specific tests on the actual outcomes. These verify that the transformation from raw float to game result is implemented correctly — a casino could have a perfect RNG but a broken game formula.
| # | Test | Game | What It Verifies |
|---|---|---|---|
| 20 | Crash Instant Rate (Stake) | Crash | ~4.0% of rounds bust at 1.00x (matches Stake's house edge) |
| 21 | Crash Instant Rate (Roobet) | Crash | ~5.95% of rounds bust at 1.00x (matches Roobet's house edge) |
| 22 | Crash Instant Rate (Bustabit) | Crash | ~4.0% of rounds bust at 1.00x (matches Bustabit's house edge) |
| 23 | Coin Fairness | CoinFlip | 50/50 split between heads and tails within expected variance |
| 24 | Roulette Distribution | Roulette | Chi-square across all 37 slots (0–36) |
| 25 | Dice Distribution | Dice | Uniform distribution across the 0–100 range |
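Several of these checks reduce to the same question: does an observed rate match the expected one? A minimal sketch, assuming a two-sided z-test with the normal approximation (reasonable at our sample sizes, though not necessarily the exact procedure used in every report):

```python
import math


def proportion_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided z-test of an observed proportion against the expected rate p0."""
    z = (successes / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Coin fairness: 50,300 heads in 100,000 flips against an expected 50%.
    print(proportion_test(50_300, 100_000, 0.5))
    # Crash instant rate: 4,000 instant busts in 100,000 rounds against 4.0%.
    print(proportion_test(4_000, 100_000, 0.04))
```

The counts in the usage lines are made-up inputs, not audit data. The same function covers the coin split and any of the crash instant rates by changing `p0`.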
Total: 25 individual tests per audit: 15 NIST + 4 additional statistical + 6 game-specific. For Deep Audits, add PractRand and the 106 BigCrush tests from TestU01 on top.
Show me another casino review site that runs even five of these.
Sample Sizes
We don't do spot checks. Our minimum sample size is 100,000 rounds per game. For major audits, we go to 250,000 or more. Our Bustabit audit analyzed 100 million rounds.
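Why does sample size matter? Because small samples hide manipulation. A rigged coin that lands heads 52% of the time looks normal after 100 flips. After 100,000 flips, the bias screams. Statistical power grows with sample size: the standard error of an observed rate shrinks in proportion to 1/√n. At 100,000 rounds, a 50/50 outcome can be pinned down to roughly ±0.4 percentage points at 99% confidence. Detecting deviations as small as 0.1% takes millions of rounds, which is why our largest audits go that far.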
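The coin example is easy to check yourself. This sketch computes how many standard errors a 52% coin sits away from 50% after n flips, using the normal approximation to the binomial. The function name is ours.

```python
import math


def bias_z_score(true_rate: float, n: int, fair_rate: float = 0.5) -> float:
    """Expected z-score of a biased coin after n flips, measured against the fair rate."""
    standard_error = math.sqrt(fair_rate * (1 - fair_rate) / n)
    return (true_rate - fair_rate) / standard_error


if __name__ == "__main__":
    for n in (100, 100_000):
        print(n, round(bias_z_score(0.52, n), 2))
    # prints:
    # 100 0.4
    # 100000 12.65
```

At 100 flips the bias is 0.4 standard errors from fair, which is indistinguishable from luck. At 100,000 flips it is more than 12, far past the 2.576 cut-off that corresponds to α = 0.01.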
Data Integrity
Every audit report includes:
- SHA-256 dataset hash — cryptographic proof that the data hasn't been altered after testing
- Complete seed parameters — server seed, client seed, nonce range
- Reproducibility instructions — step-by-step guide so anyone can regenerate our results
- Raw data download — the actual outcomes as JSON, available for independent verification
We don't ask you to trust us. We give you the tools to verify us. That's the difference between an audit and an opinion.
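Checking a dataset against its published hash takes a few lines. This is a sketch using Python's standard `hashlib`. The file path and the published hex string are whatever the report gives you, and the function names are illustrative.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large datasets never have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the local file's hash with the one printed in the report."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If a single byte of the download differs from what was tested, the digests will not match.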
What We Don't Do
Transparency means being honest about limitations too:
- We can't test live server behavior in real time. We audit historical data. A casino could theoretically behave differently for specific players or time periods. Statistical analysis catches systematic manipulation, not targeted single-round rigging.
- We don't audit smart contracts. On-chain games with published Solidity code are a different beast. Our focus is HMAC-SHA256 based provably fair systems.
- We don't guarantee future fairness. An audit is a snapshot. That's why we advocate for continuous monitoring and regular re-audits.
- We don't test withdrawal speed or customer support. Our scope is mathematical fairness. For business practices, read why math alone isn't enough.
The Scoring System
Each audit produces a FairPlay Score from 0 to 10:
| Score | Rating | Meaning |
|---|---|---|
| 9.0–10.0 | EXCELLENT | All tests passed. No statistical anomalies detected. |
| 7.0–8.9 | GOOD | Minor deviations within acceptable variance. No evidence of manipulation. |
| 5.0–6.9 | MARGINAL | Some tests show borderline results. Warrants closer monitoring. |
| 3.0–4.9 | CONCERNING | Multiple statistical anomalies. Expanded testing recommended. |
| 0.0–2.9 | FAILED | Systematic deviations detected. Data inconsistent with fair RNG. |
The score is calculated from the pass/fail ratio across all applicable tests, weighted by severity. A failed NIST Monobit test (fundamental bias) weighs heavier than a marginal Runs test result.
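To show the shape of that calculation, here is a sketch of a severity-weighted pass ratio scaled to 0–10. The weights below are invented for the example, and the exact formula is an assumption. Only the rating bands come from the table above.

```python
from typing import Mapping

# Hypothetical severity weights, chosen only to illustrate the idea.
EXAMPLE_WEIGHTS = {"monobit": 3.0, "runs": 1.0, "chi_square": 2.0}

# Lower bound of each band, taken from the scoring table.
RATING_BANDS = [
    (9.0, "EXCELLENT"),
    (7.0, "GOOD"),
    (5.0, "MARGINAL"),
    (3.0, "CONCERNING"),
    (0.0, "FAILED"),
]


def fairplay_score(passed: Mapping[str, bool], weights: Mapping[str, float]) -> float:
    """Weighted share of passed tests, scaled to 0-10 and rounded to one decimal."""
    total = sum(weights[name] for name in passed)
    earned = sum(weights[name] for name, ok in passed.items() if ok)
    return round(10 * earned / total, 1)


def rating(score: float) -> str:
    """Map a score to its rating band."""
    return next(label for floor, label in RATING_BANDS if score >= floor)
```

With these example weights, failing only the Runs test gives 8.3 (GOOD), while failing only Monobit gives 5.0 (MARGINAL). That is the severity weighting in action: the more fundamental failure costs more.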
Open Source Commitment
Our testing tools will be published as open source on GitHub. You can read the code. You can run it yourself. You can file issues if you find a bug.
This isn't generosity — it's strategy. Open source means our methodology is under permanent peer review. If our tests are flawed, someone will find it. That pressure keeps us honest. And honestly? That's exactly how it should work.
Not every casino appreciates this level of scrutiny. We’ve documented the seven most common excuses casinos give when asked about independent audits — and why none of them hold up.
Challenge Us
If you think our methodology has a gap, our math is wrong, or our conclusions don't follow from the data — tell us. We publish a standing invitation to challenge any audit we've ever produced. Bring data, not opinions, and we'll respond in kind.
That's the whole point. We're not asking you to believe us. We're asking you to check.