A ranking is only as good as the method behind it.
Every VPN we list goes through the same battery of tests, on the same hardware, every month. This is exactly how we score them — including the parts we got wrong.
The 100-point rubric.
Every overall score is a weighted blend of four sub-scores. The weights are public; the inputs are public; the data is public.
- Speed: 35%
Median throughput across a 5-city panel; protocol-aware; same hardware every month.
- Privacy: 30%
Audit history, jurisdiction, app source code transparency, leak tests, log policy.
- Streaming: 20%
Unblock rate for Netflix, Disney+, BBC iPlayer, and Prime Video across regions.
- Value: 15%
Effective $/mo on the longest plan + a features-per-dollar score.
Probes, panels, and assumptions.
Each pillar broken down into the specific tests, server choices, and tolerances.
The number that gets the most attention is the easiest to fake. Our speed score is the median of the same protocol matrix, on the same hardware, against our own random-byte endpoint.
- WireGuard / NordLynx / Lightway over UDP at 1500 MTU
- OpenVPN UDP at 1500 MTU (baseline)
- IKEv2 / IPsec, where supported, at default MTU
Each provider runs 7 days × 3 protocols × 5 cities = 105 measurements per cycle. We use the median, not the peak, to discourage cherry-picking.
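A minimal sketch of that aggregation, assuming each measurement is stored as a throughput figure in Mbps keyed by day, protocol, and city. The key layout, the protocol labels, and the `speed_median` name are illustrative assumptions; only the counts (7 days, 3 protocols, 5 cities) and the use of the median come from the text above.

```python
from itertools import product
from statistics import median

# Labels are hypothetical; the text fixes only the counts (7 x 3 x 5 = 105).
DAYS = range(1, 8)
PROTOCOLS = ("wireguard_family", "openvpn_udp", "ikev2")
CITIES = ("Frankfurt", "New York", "Singapore", "London", "Sydney")


def speed_median(samples: dict[tuple[int, str, str], float]) -> float:
    """Return the median throughput (Mbps) over one complete measurement cycle."""
    expected = set(product(DAYS, PROTOCOLS, CITIES))
    missing = expected - set(samples)
    if missing:
        # Refuse to score a partial cycle rather than silently skew the median.
        raise ValueError(f"{len(missing)} of {len(expected)} measurements missing")
    return median(samples[key] for key in expected)
```

Because the median is taken over all 105 samples, a single unusually fast run on one server cannot lift the result the way a peak figure would.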
Privacy is a posture, not a feature. The score blends what the provider can prove (audits, code, jurisdiction) with what we can measure (leak rate, kill-switch failure).
- Most recent independent audit — auditor, scope, date, public report
- Jurisdiction and applicable mandatory data-retention law
- App source-code transparency (open-source clients, reproducible builds)
- Log policy: claimed vs. demonstrated (court records, server seizures)
- Kill-switch failure rate measured in the lab
- DNS / WebRTC / IPv6 leak rate measured in the lab
- Warrant canary status and last-update freshness
We test on real subscriptions, not mocked endpoints. A region passes if catalogue items play at full HD without buffering for at least three minutes across two attempts.
- Netflix (US, UK, JP, BR)
- Disney+ (US, UK, IN)
- BBC iPlayer (UK)
- Amazon Prime Video (US, DE)
Streaming unblock degrades over time. Each region is re-checked weekly; the score on the page reflects the most recent four-week rolling pass rate.
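One way to picture the rolling window, as a sketch only: each weekly check is a list of pass/fail results, one per tested region, and the rate is taken over the four most recent weeks. Pooling all regions into a single rate is an assumption made for illustration; the text does not say how regions are combined.

```python
WINDOW_WEEKS = 4  # from the text: the most recent four weekly checks


def rolling_pass_rate(weekly_results: list[list[bool]]) -> float:
    """Share of region checks that passed in the last four weekly runs.

    weekly_results is ordered oldest to newest; each inner list holds one
    boolean per tested region for that week.
    """
    recent = weekly_results[-WINDOW_WEEKS:]
    checks = [passed for week in recent for passed in week]
    if not checks:
        raise ValueError("no weekly checks recorded")
    return sum(checks) / len(checks)
```

A provider that unblocked everything five weeks ago but has been failing since gets no credit for the older result, because it falls outside the window.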
Value is not the lowest sticker price. It blends effective monthly cost, what you get bundled, and what you pay when the introductory term ends.
- Effective $/mo on the longest plan, after introductory discount
- Simultaneous connections allowed
- Money-back guarantee length and verified honour rate
- Bundled extras (password manager, ad blocker, dedicated IP)
- Renewal price vs. introductory price
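The two price inputs can be sketched as simple arithmetic. The function names and the example figures below are hypothetical and do not describe any real provider; how these inputs are weighted inside the Value sub-score is not specified here.

```python
def effective_monthly(total_price: float, term_months: int) -> float:
    """Total paid for the term, after discounts, spread over its length in months."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    return total_price / term_months


def renewal_markup(intro_monthly: float, renewal_monthly: float) -> float:
    """Fractional increase from introductory to renewal price (1.5 means +150%)."""
    if intro_monthly <= 0:
        raise ValueError("intro_monthly must be positive")
    return renewal_monthly / intro_monthly - 1


# Illustrative numbers only: a 24-month plan billed at 96.00 up front,
# renewing at 10.00 per month.
intro = effective_monthly(96.00, 24)
markup = renewal_markup(intro, 10.00)
```

In this made-up case the effective price is 4.00 per month during the introductory term and the renewal price is 150% higher, which is the kind of gap the renewal comparison is meant to surface.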
Each sub-score is graded out of 10 against per-pillar reference points drawn from the top quartile of the testing panel. We blend the four sub-scores by the published weights to produce an overall score out of 10. The number on the review page is the raw output of that calculation: no rounding up to make a pick look better, no rounding down to make a competitor look worse.
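A minimal sketch of the blend, using the published weights (35/30/20/15) and sub-scores on a 0 to 10 scale. The dictionary keys and the `overall_score` name are illustrative assumptions, and the sub-scores in the usage note are invented.

```python
# Published weights, in percentage points; they sum to 100.
WEIGHTS = {"speed": 35, "privacy": 30, "streaming": 20, "value": 15}


def overall_score(sub_scores: dict[str, float]) -> float:
    """Weighted blend of the four sub-scores, each graded out of 10."""
    if set(sub_scores) != set(WEIGHTS):
        raise ValueError(f"expected exactly these pillars: {sorted(WEIGHTS)}")
    for name, score in sub_scores.items():
        if not 0 <= score <= 10:
            raise ValueError(f"{name} sub-score {score} is outside 0-10")
    # No rounding is applied; the caller receives the raw blended value.
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS) / 100
```

For example, invented sub-scores of 8 (speed), 9 (privacy), 6 (streaming), and 7 (value) blend to 0.35 × 8 + 0.30 × 9 + 0.20 × 6 + 0.15 × 7 = 7.75.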
Same hardware every time.
Independence requires consistency. Every test runs on the same physical setup so month-over-month deltas mean something.
Two identical desktops — Intel Core i7-14700, 32GB DDR5, Samsung 990 Pro NVMe — running Windows 11 24H2 and Ubuntu 24.04 LTS in parallel. Provider apps installed from the official channel for each platform.
1 Gbps symmetric fibre to the residence, verified weekly against the unencrypted baseline. Tests pause if the baseline drifts more than 3% from the rolling weekly median.
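The pause rule can be sketched as a relative-drift check. This assumes the rolling weekly median is the median of the baseline readings from the preceding week; the function name and the sample readings are hypothetical.

```python
from statistics import median

DRIFT_TOLERANCE = 0.03  # from the text: pause above 3% drift


def should_pause(baseline_mbps: float, recent_baselines: list[float]) -> bool:
    """True if the current unencrypted baseline drifts more than 3% from the median."""
    if not recent_baselines:
        raise ValueError("need at least one prior baseline reading")
    reference = median(recent_baselines)
    drift = abs(baseline_mbps - reference) / reference
    return drift > DRIFT_TOLERANCE
```

With invented readings whose median is 940 Mbps, a baseline of 915 Mbps (about 2.7% low) would let testing continue, while 900 Mbps (about 4.3% low) would pause it.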
Five edge regions: Frankfurt, New York, Singapore, London, Sydney. Speed-test downloads are incompressible random bytes served from our own endpoint to defeat traffic shaping.
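To see why random bytes matter, the sketch below compares how well a random payload and an all-zero payload compress. It uses Python's standard `os.urandom` and `zlib` purely as an illustration; how the real endpoint generates its payload is not described here.

```python
import os
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (about 1.0 means incompressible)."""
    return len(zlib.compress(payload, level=9)) / len(payload)


random_payload = os.urandom(1_000_000)  # cryptographically random bytes
zero_payload = bytes(1_000_000)         # one million zero bytes

random_ratio = compression_ratio(random_payload)
zero_ratio = compression_ratio(zero_payload)
```

The random payload stays at essentially its full size, while the zero payload shrinks to a tiny fraction. A tunnel or network path that compresses traffic could therefore report inflated throughput on a compressible test file, but not on random bytes.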
Real subscriptions on rotating accounts, refreshed weekly. Each platform tested against four catalogue items chosen for regional sensitivity. No mocked endpoints.
DNS, WebRTC, and IPv6 leaks, plus kill-switch failure under a simulated connection drop. Same protocol matrix as speed tests. Captured at packet level, not just read from in-app indicators.
Every test result is timestamped, hashed, and stored unedited. The raw CSV is what powers the rankings — there is no manual override layer.
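A sketch of what timestamping and hashing a result row could look like. SHA-256, the record layout, and the example row are assumptions for illustration; the text does not name the hash function or the storage format.

```python
import hashlib
from datetime import datetime, timezone


def _digest(timestamp: str, csv_row: str) -> str:
    """SHA-256 over the timestamp and the raw row, so neither can change unnoticed."""
    return hashlib.sha256(f"{timestamp}\n{csv_row}".encode("utf-8")).hexdigest()


def seal_row(csv_row: str, recorded_at: datetime) -> dict[str, str]:
    """Attach a UTC timestamp and a digest to one raw CSV row."""
    timestamp = recorded_at.astimezone(timezone.utc).isoformat()
    return {"timestamp": timestamp, "row": csv_row, "sha256": _digest(timestamp, csv_row)}


def verify_row(record: dict[str, str]) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return _digest(record["timestamp"], record["row"]) == record["sha256"]


# Hypothetical row: provider, protocol, city, Mbps.
record = seal_row(
    "provider_x,wireguard_family,Frankfurt,712.4",
    datetime(2025, 1, 6, 12, 0, tzinfo=timezone.utc),
)
```

Anyone holding the stored record can recompute the digest; if the row or its timestamp were edited after the fact, `verify_row` would return `False`.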
If a provider believes a score is wrong, they are welcome to reproduce the test using the published rig spec and protocol matrix. When their measurements differ substantially from ours, we re-run the test and publish both results, dated. We do not quietly revise scores in response to commercial feedback.
We disclose everything. Including what we get wrong.
Rankings change because the data changes. When we revise a score, the change log is public. When a test methodology shifts, every existing review is re-scored, not grandfathered.