How zkVIPER was built

A privacy-first, on-device age-verification primitive. EdgeXene LLC, Apache-2.0.

The problem

Age verification on the web almost always works by collecting the very thing it is supposed to protect: you hand a site your date of birth, your ID photo, or both, and trust that they store it responsibly. Every one of those collections is a breach waiting to happen and a regulatory liability for whoever holds the data.

zkVIPER takes the opposite approach. It proves a person is 18 or older, to the day, without their date of birth ever leaving the browser. The server receives only a zero-knowledge proof and a yes/no verdict, never the underlying identity data. There is no database of birthdays to leak, because the birthday is never transmitted in the first place.

It is built as an open-source cryptographic primitive, not a hosted service. The whole point is that anyone can read it, run it locally, and verify the privacy claim for themselves rather than taking it on faith.

How the flow works

Every step up to and including proof generation runs in the user's browser. No image or identity data is uploaded; only the finished proof leaves the device:

Document scan. The user photographs their ID.
OCR. The date of birth is read on-device. If OCR cannot recover a full month and day, the flow fails closed: a year-only date can't satisfy a day-accurate proof, so a partial read is rejected rather than guessed.
Liveness. An active gesture check, a randomized smile and head-turn sequence, confirms a live person is present rather than a static image.
Face match. The live selfie is matched against the portrait on the ID.
Anti-spoof. A passive presentation-attack check rejects printed-photo and screen-replay attacks that could pass the gesture check.
Proof. Only after all the above does the device generate a Groth16 zero-knowledge proof that birthDate is on or before today minus 18 years, with the birth date as a private input. Only the boolean result and the public "today" basis are revealed.
Receipt. The proof is verified and a PII-free compliance receipt is returned: the verdict, the policy evaluation, and a deterministic hash of the proof. Nothing in it could identify the person.
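
The ordering above can be read as a chain of gates in which proof generation is simply the last step. The following is a minimal sketch of that idea, not the project's code: the names Gate, GateResult, and runGates are hypothetical, and the gates are synchronous only to keep the example short (real steps would be asynchronous).

```typescript
// Hypothetical types for illustration; not taken from the zkVIPER source.
type GateResult = { ok: true } | { ok: false; reason: string };

interface Gate {
  name: string;
  run: () => GateResult;
}

interface FlowOutcome {
  passed: boolean;
  failedAt?: string;
  reason?: string;
}

// Runs gates in order and stops at the first failure, so later steps
// (including proof generation) never execute after a failed check.
function runGates(gates: Gate[]): FlowOutcome {
  for (const gate of gates) {
    const result = gate.run();
    if (result.ok === false) {
      return { passed: false, failedAt: gate.name, reason: result.reason };
    }
  }
  return { passed: true };
}

// Stub gates standing in for the real steps.
const demoOutcome = runGates([
  { name: "ocr", run: () => ({ ok: true }) },
  { name: "liveness", run: () => ({ ok: false, reason: "gesture timeout" }) },
  { name: "proof", run: () => ({ ok: true }) },
]);
console.log(demoOutcome); // { passed: false, failedAt: "liveness", reason: "gesture timeout" }
```

Placing the proof last in the list means a failed OCR or liveness check can never reach the prover.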

The decisions that shaped it

A day-accurate circuit, not a year approximation

Many "are you 18" checks compare birth year to current year, which is wrong by up to a year at the boundary. zkVIPER's circuit compares full packed YYYYMMDD dates, so someone one day short of 18 correctly fails. The full birth date, month and day included, is a private input: it is used inside the proof but never revealed.
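
The arithmetic behind the packed comparison is easy to check outside the circuit. The sketch below mirrors the rule given in the technical stack notes (birthDate on or before todayDate minus 180000) in plain TypeScript; the function names are illustrative and the real check runs inside the circom circuit, not in code like this.

```typescript
// Pack a calendar date into a YYYYMMDD integer, e.g. 2007-06-15 becomes 20070615.
function packDate(year: number, month: number, day: number): number {
  return year * 10000 + month * 100 + day;
}

// Day-accurate rule: birthDate <= todayDate - 180000.
// Subtracting 180000 from a packed date moves the year back by 18
// and leaves the month and day digits untouched.
function isAdultPacked(birthDate: number, todayDate: number): boolean {
  return birthDate <= todayDate - 180000;
}

// Year-only approximation, shown for contrast.
function isAdultYearOnly(birthYear: number, currentYear: number): boolean {
  return currentYear - birthYear >= 18;
}

const today = packDate(2025, 6, 15);
const turns18Today = packDate(2007, 6, 15);
const turns18Tomorrow = packDate(2007, 6, 16);

console.log(isAdultPacked(turns18Today, today));    // true
console.log(isAdultPacked(turns18Tomorrow, today)); // false
console.log(isAdultYearOnly(2007, 2025));           // true for both people
```

The year-only check accepts the person who turns 18 tomorrow; the packed comparison does not.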

GPL-free at every layer

The runtime prover and verifier are arkworks Groth16, licensed MIT/Apache, not snarkjs, and the circuit's comparator templates are first-party rather than the GPL-licensed circomlib. The circom and snarkjs build tools are used to compile the circuit and run the trusted setup, but they are developer tools. Like compiling with GCC, their license does not attach to the output, and neither is shipped or depended on by the app. The result is an Apache-2.0 primitive an organization can adopt without GPL obligations.

The witness is computed in Rust

Rather than ship a separate circom witness-calculator runtime to the browser, the prover, compiled from Rust to WebAssembly, computes the witness natively. One fewer artifact to download, one fewer moving part.

One shared Web Worker for memory

The heavy engines, namely the document detector, face matcher, anti-spoof model, and the proof prover, all run in a single shared Web Worker. This isolates their large WebAssembly heaps from the UI thread and lets the whole lot be freed in one shot when the proof completes, which matters on memory-constrained phones.
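
A rough sketch of that lifecycle (lazy spawn, reuse through the flow, one terminate at the end) is shown below. The class name and its methods are hypothetical, and the worker is injected through a factory so the sketch can run outside a browser; in a page, the factory would wrap a real Worker constructor.

```typescript
// Minimal slice of the Worker interface this sketch needs, so it can be
// exercised with a fake outside a browser.
interface WorkerLike {
  terminate(): void;
}

// Hypothetical lifecycle holder; not the project's engine-client.
class EngineWorkerHandle {
  private worker: WorkerLike | null = null;
  private readonly spawn: () => WorkerLike;

  constructor(spawn: () => WorkerLike) {
    this.spawn = spawn;
  }

  // Spawn on first use; later calls reuse the same worker.
  get(): WorkerLike {
    if (this.worker === null) {
      this.worker = this.spawn();
    }
    return this.worker;
  }

  // Terminating the worker releases every engine heap at once.
  // Safe to call twice, for example from an unmount backstop.
  release(): void {
    if (this.worker !== null) {
      this.worker.terminate();
      this.worker = null;
    }
  }
}

// Usage with a fake worker that only logs.
const demoHandle = new EngineWorkerHandle(() => ({
  terminate: () => console.log("worker terminated"),
}));
demoHandle.get();     // spawns
demoHandle.get();     // reuses
demoHandle.release(); // logs "worker terminated"
demoHandle.release(); // no-op
```

Because all four engines share one worker, a single terminate call is enough to reclaim their memory; separate workers would need separate teardown paths.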

A stateless verifier

The verification endpoint writes nothing, manages no accounts, and holds no sessions. It checks the proof and hands back a verdict; what an integrating operator does with that verdict is the operator's decision. Keeping it stateless is what makes zkVIPER a primitive rather than an identity service.

What is not production-ready

The bundled trusted setup is development-only

The proving key shipped in the repository came from a single-contributor Groth16 Phase-2 ceremony. If that contributor kept the setup randomness ("toxic waste"), they can forge proofs of false statements that still verify, so soundness is compromised. Anyone deploying for real must run their own multi-party Phase-2 ceremony and generate their own proving and verification keys from it. The setup is sound as long as at least one contributor discards their randomness, so the more independent contributors, the stronger the guarantee.

To be clear about which half of the setup this is: the Phase-1 Powers of Tau is universal and circuit-independent, and zkVIPER reuses the public Hermez ceremony (powersOfTau28_hez_final_12.ptau) for it. You do NOT need to, and should not, regenerate Phase-1 yourself: a large, reputable public ceremony is far more trustworthy than a fresh solo one. What you must redo is the circuit-specific Phase-2, which turns that public Phase-1 plus the age-check circuit into the proving and verification keys.

The ceremony is a short sequence: an initial setup that binds the Phase-1 ptau to this circuit, one entropy contribution per independent party, a final public beacon so the last transform is verifiable, then export of the verification key. The build tool is snarkjs -- a developer tool whose license does not attach to its output, as covered in GPL-free at every layer above. In shape:

snarkjs groth16 setup build/age_check.r1cs powersOfTau28_hez_final_12.ptau age_check_0000.zkey
snarkjs zkey contribute age_check_0000.zkey age_check_0001.zkey --name="contributor 1"   # one per party
snarkjs zkey beacon     age_check_0001.zkey age_check_final.zkey <public-beacon> 10 -n="final beacon"
snarkjs zkey export verificationkey age_check_final.zkey verification_key.json

The full procedure ships in the source you receive: the repository README.md is the produce-your-own walkthrough (rebuild the circuit, then run the ceremony), and circuits/CEREMONY.md records the exact commands, tooling versions, and validation EdgeXene ran for this demo's key -- as a provenance reference, not a substitute for your own multi-party run.

The document detector uses proxy classes

The on-device ID-rectangle locator is a general object detector (EfficientDet-Lite0 trained on COCO), which has no "ID card" class, so it leans on proxy categories like book and cell phone plus a size and aspect plausibility envelope. A purpose-trained detector is the planned improvement.
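
As an illustration of how a proxy-class filter plus plausibility envelope might look, the sketch below uses the proxy labels and bounds listed in the technical stack notes (15-95% of frame area, aspect 1.2-2.5). The Detection shape and function name are hypothetical, and measuring aspect as long edge over short edge is an assumption, not something the source states.

```typescript
// Hypothetical detection shape; box size is in pixels.
interface Detection {
  label: string;
  width: number;
  height: number;
}

// COCO classes used as stand-ins for an ID card.
const PROXY_CLASSES = new Set(["book", "cell phone", "laptop", "tv", "remote", "keyboard"]);

function isPlausibleIdBox(det: Detection, frameWidth: number, frameHeight: number): boolean {
  if (!PROXY_CLASSES.has(det.label)) return false;
  if (det.width <= 0 || det.height <= 0) return false;

  const areaFraction = (det.width * det.height) / (frameWidth * frameHeight);
  // Assumption: aspect is long edge over short edge, so a rotated card still qualifies.
  const aspect = Math.max(det.width, det.height) / Math.min(det.width, det.height);

  return areaFraction >= 0.15 && areaFraction <= 0.95 && aspect >= 1.2 && aspect <= 2.5;
}

// A card-shaped "book" covering 40% of a 1000x1000 frame.
console.log(isPlausibleIdBox({ label: "book", width: 800, height: 500 }, 1000, 1000)); // true
// Right label, but far too small to be the document being scanned.
console.log(isPlausibleIdBox({ label: "book", width: 100, height: 60 }, 1000, 1000));  // false
```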

Thresholds are the operator's to tune

The face-match and anti-spoof thresholds ship at sensible defaults, but false-acceptance and false-rejection rates depend on the real user population and must be tuned against it.
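
One way an operator might approach that tuning is to measure both error rates at a candidate threshold on labeled pairs from their own population. The sketch below assumes a distance-style score where lower means more similar, as with the face-match Euclidean distance; the function name is hypothetical and the sample numbers are made up purely to show the calculation.

```typescript
interface ErrorRates {
  far: number; // false-acceptance rate: impostor pairs wrongly accepted
  frr: number; // false-rejection rate: genuine pairs wrongly rejected
}

// genuine: distances for same-person pairs; impostor: distances for different-person pairs.
// Assumption: a pair is accepted when its distance is at or below the threshold.
function errorRatesAt(threshold: number, genuine: number[], impostor: number[]): ErrorRates {
  const falseAccepts = impostor.filter((d) => d <= threshold).length;
  const falseRejects = genuine.filter((d) => d > threshold).length;
  return {
    far: impostor.length === 0 ? 0 : falseAccepts / impostor.length,
    frr: genuine.length === 0 ? 0 : falseRejects / genuine.length,
  };
}

// Made-up distances, for illustration only.
const genuineDistances = [0.3, 0.4, 0.65, 0.5];
const impostorDistances = [0.7, 0.55, 0.9, 0.8];

console.log(errorRatesAt(0.6, genuineDistances, impostorDistances)); // { far: 0.25, frr: 0.25 }
console.log(errorRatesAt(0.5, genuineDistances, impostorDistances)); // { far: 0, frr: 0.25 }
```

Sweeping the threshold over real captures shows the trade-off directly: lowering it reduces false acceptances at the cost of rejecting more genuine users.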

Hosting zkVIPER carries data-controller obligations

The OCR, liveness, face-match, and anti-spoof steps process biometric identifiers. Anyone who hosts zkVIPER becomes responsible for compliance with the relevant biometric, privacy, and age-assurance law (BIPA, CCPA, GDPR, and so on).

What was hard

The hard part of building this was the on-device computer-vision and crypto plumbing, where small preprocessing mistakes fail silently rather than loudly.

The anti-spoof model was silently blind

The passive anti-spoof model (MiniFASNet) was fed pixel values divided by 255, the usual normalization. This particular ONNX export was trained on raw 0-255 values, so the division collapsed every input -- a genuine face, random noise, a real spoof -- to the same degenerate output. It was not rejecting attackers; it was blind to everything, and it failed real users while looking like it worked. Finding it meant running the shipped model offline against its own ground-truth samples. The fix was raw input plus a verdict that thresholds the live class (a real face scores 0.996, real spoofs 0.001), since a spoof can split its confidence across the print and replay classes so neither alone trips a single-class threshold.
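
Both halves of that fix can be sketched in a few lines. The preprocessing below keeps pixel values in the raw 0-255 range and lays them out as BGR in NCHW order, and the verdict thresholds only the live class at the 0.30 default given in the technical stack notes. The function names are illustrative, and the code assumes the scores are probabilities and that the threshold boundary is inclusive; neither detail is stated in the source.

```typescript
// Class order of the raw model output as described in the stack notes: [print, live, replay].
type SpoofScores = [print: number, live: number, replay: number];

const LIVE_INDEX = 1;
const LIVE_THRESHOLD = 0.30;

// Convert RGBA bytes (the layout of canvas ImageData.data) into a BGR,
// NCHW float tensor. Values stay in the raw 0-255 range: no division by 255.
function toRawBgrNchw(rgba: Uint8ClampedArray, width: number, height: number): Float32Array {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    out[i] = rgba[i * 4 + 2];             // B plane
    out[plane + i] = rgba[i * 4 + 1];     // G plane
    out[2 * plane + i] = rgba[i * 4];     // R plane
  }
  return out;
}

// Pass only when the live class clears the threshold.
// Assumption: scores are probabilities and the boundary is inclusive.
function isLive(scores: SpoofScores): boolean {
  return scores[LIVE_INDEX] >= LIVE_THRESHOLD;
}

// A spoof that splits its confidence: neither spoof class reaches 0.5,
// yet the live class is still well below the threshold.
console.log(isLive([0.45, 0.10, 0.45]));    // false
console.log(isLive([0.002, 0.996, 0.002])); // true
```

Thresholding a single spoof class at 0.5 would have let the first example through, which is the failure mode described above.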

Liveness was tuned for the wrong camera

The gesture thresholds were set for a desktop webcam. On a phone the face fills more of the frame (so a head turn produces a smaller normalized movement), the camera runs slower, and the head-turn baseline was being captured mid-motion. Real people failed the liveness check. The fix was a settled baseline averaged over the first frames, phone-tuned thresholds, and a two-frame hold so a single noisy frame cannot decide it.
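
The shape of that fix can be sketched as a small per-frame state machine. The 0.10 shift threshold and two-frame hold come from the technical stack notes; the five-frame baseline window, the class name, and the use of a normalized nose-tip x coordinate as the only input are assumptions made for illustration.

```typescript
// Hypothetical head-turn detector fed one normalized nose-tip x value per frame.
class HeadTurnDetector {
  private readonly baselineFrames: number;
  private readonly shiftThreshold: number;
  private readonly holdFrames: number;
  private baselineSamples: number[] = [];
  private baseline: number | null = null;
  private consecutiveHits = 0;

  // baselineFrames = 5 is an assumption; the source only says "the first few frames".
  constructor(baselineFrames = 5, shiftThreshold = 0.10, holdFrames = 2) {
    this.baselineFrames = baselineFrames;
    this.shiftThreshold = shiftThreshold;
    this.holdFrames = holdFrames;
  }

  // Returns true once the shift has held for the required number of consecutive frames.
  update(noseX: number): boolean {
    if (this.baseline === null) {
      // Still settling: average the first frames instead of trusting a single one.
      this.baselineSamples.push(noseX);
      if (this.baselineSamples.length === this.baselineFrames) {
        const sum = this.baselineSamples.reduce((a, b) => a + b, 0);
        this.baseline = sum / this.baselineFrames;
      }
      return false;
    }
    const shifted = Math.abs(noseX - this.baseline) > this.shiftThreshold;
    // A frame below the threshold resets the hold, so one noisy frame cannot decide it.
    this.consecutiveHits = shifted ? this.consecutiveHits + 1 : 0;
    return this.consecutiveHits >= this.holdFrames;
  }
}

const detector = new HeadTurnDetector();
const frames = [0.5, 0.5, 0.5, 0.5, 0.5, 0.65, 0.5, 0.65, 0.66];
console.log(frames.map((x) => detector.update(x)));
// [false, false, false, false, false, false, false, false, true]
```

The isolated spike at the sixth frame is ignored because the next frame falls back to the baseline; only the two consecutive shifted frames at the end confirm the turn.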

OCR fabricated a birth year on huge uploads

A full-resolution phone photo (40+ megapixels) OCRs worse, not better, and the document crop additionally upscaled it. On a garbled read the parser had a last-resort fallback that grabbed any four-digit number in the text -- a ZIP code, a license number -- and treated it as a birth year, producing a confidently wrong date. The fix was to downscale before OCR and to remove the guessing fallback entirely: if a date of birth cannot be read near a date-of-birth label, the flow fails honestly rather than inventing one.
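
A sketch of both parts of that fix follows: computing the downscaled size (the 2000px long edge comes from the technical stack notes) and parsing a date only when it sits next to a date-of-birth label. The label patterns, the MM/DD/YYYY field order, and the function names are assumptions for illustration; real documents vary and the project's parser may differ.

```typescript
// Compute target dimensions so the long edge is at most maxLongEdge pixels.
function downscaleDims(
  width: number,
  height: number,
  maxLongEdge = 2000,
): { width: number; height: number } {
  const longEdge = Math.max(width, height);
  if (longEdge <= maxLongEdge) return { width, height };
  const scale = maxLongEdge / longEdge;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Returns a packed YYYYMMDD date, or null when no full, valid date follows a DOB label.
// Assumption: the label is "DOB" or "Date of Birth" and the date is MM/DD/YYYY.
function parseDobNearLabel(text: string): number | null {
  const match = /(?:DOB|DATE OF BIRTH)\D{0,10}(\d{2})[\/\-.](\d{2})[\/\-.](\d{4})/i.exec(text);
  if (match === null) return null; // fail closed: no guessing from stray digits

  const month = Number(match[1]);
  const day = Number(match[2]);
  const year = Number(match[3]);

  // Reject impossible dates such as 13/45/2007 by round-tripping through Date.
  const d = new Date(Date.UTC(year, month - 1, day));
  const valid =
    d.getUTCFullYear() === year && d.getUTCMonth() === month - 1 && d.getUTCDate() === day;
  return valid ? year * 10000 + month * 100 + day : null;
}

console.log(downscaleDims(8000, 6000));                           // { width: 2000, height: 1500 }
console.log(parseDobNearLabel("DOB 06/15/2007 ZIP 90210"));       // 20070615
console.log(parseDobNearLabel("garbled text ZIP 90210 LIC 1987")); // null
```

The second parse returns null even though the text contains plausible four-digit runs, which is the behavior the removed fallback used to violate.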

Verifying claims required real ground truth

A recurring theme: a check that looks like it passes is not the same as a check that works. Each of the above was only caught by running the real model or the real flow against real inputs, not by reading the code or trusting a green result. The thresholds shipped here are honest defaults, not validated guarantees, and still need tuning against a real user population.

Cryptography notice

zkVIPER includes cryptographic software: zero-knowledge proofs (arkworks Groth16 over BN254), SHA-256 hashing, and HMAC-based session signing. The country in which you currently reside may have restrictions on the import, possession, use, and re-export of encryption software. Before using or redistributing it, check the laws of your country.

It relies on standard, publicly available cryptographic primitives and is distributed with full source. In the United States, publicly available encryption source code is generally handled under the Export Administration Regulations; see the U.S. Bureau of Industry and Security (bis.gov).

Technical stack

Setup → document → OCR → liveness → face-match → anti-spoof → proof.

Circom: Circuit language. circuits/age_check.circom proves birthDate is on or before todayDate minus 180000 via LessEqThan(25) over packed YYYYMMDD integers (born on or before today minus 18 years, to the day) in 31 constraints over BN254. birthDate (year + month + day) is a private input; public signals are [ageValid, todayDate]. Comparators come from a first-party Apache-2.0 lib/comparators.circom (LessEqThan / LessThan / Num2Bits), not circomlib.
BN254: Elliptic curve the circuit lives on; also called bn128.
Hermez Powers of Tau: Phase-1 trusted setup. powersOfTau28_hez_final_12.ptau, level 12, supports up to 4096 constraints; the circuit uses 31. The shipped Phase-2 circuit_final.zkey is a single-contributor, development/evaluation-only ceremony; production adopters must re-run a multi-party contribution. README and SECURITY.md warn of the toxic-waste risk.
arkworks: Groth16 prover and verifier (ark-groth16 + ark-bn254 + a vendored circom_compat subset, all MIT OR Apache-2.0). Compiled to WASM via wasm-bindgen: a 350,864-byte web-target prover at public/zkprover/ and a 407,435-byte nodejs-target verifier at lib/zkprover-node/. The witness is computed natively in Rust; no circom witness runtime is shipped.
MediaPipe ObjectDetector: On-device document-rectangle locator. EfficientDet-Lite0 on COCO with book / cell phone / laptop / tv / remote / keyboard as proxy classes since COCO has no "ID card" class; plausibility envelope 15-95% area, aspect 1.2-2.5; known weak spot, pending custom-trained detector. @mediapipe/tasks-vision, Apache-2.0. Runs in the engine Web Worker; WASM self-hosted under /mediapipe-wasm.
Tesseract.js v7: On-device OCR. eng LSTM + simd-lstm wasm core, self-hosted under /tesseract; ~9.4 MB. Apache-2.0. Module is lib/ocr.ts. Not in the shared engine Web Worker: lib/ocr.ts calls Tesseract.js's own createWorker, so recognition runs in Tesseract's internal Web Worker (the lib/ocr.ts shim itself is on the main thread). Large uploads are downscaled to a 2000px long edge before recognition (raw 40+ MP phone/laptop photos OCR worse, not better, and risk mobile OOM). If OCR cannot read a full month + day near a DOB label, the flow fails closed: it never guesses a birth year from a stray 4-digit run (ZIP, license number), and a year-only DOB cannot satisfy the day-accurate circuit.
MediaPipe FaceLandmarker: On-device liveness via blendshapes: smile (sum of mouthSmileLeft/Right > 0.35) + nose-tip x-shift (> 0.10 vs a settled baseline averaged over the first few frames) for head turn; each must hold across 2 frames; gesture order randomized; 15s per-gesture timeout. Thresholds are tuned for phone front cameras (closer framing, slower frame rate) rather than a desktop webcam. Uses no eye/blink signal, so it is glasses-safe. Stays on the main thread, welded to the live webcam video via detectForVideo.
@vladmandic/face-api: On-device face-to-document match. TinyFaceDetector + faceLandmark68TinyNet + faceRecognitionNet, 128-d Euclidean threshold MATCH_THRESHOLD = 0.60; one retry then hard-fail. MIT weights at public/models/face-api/. Runs in the engine Web Worker.
MiniFASNet-V2: On-device passive anti-spoof / presentation-attack detection; complements the active gesture liveness by rejecting screen-replay and printed-photo attacks. Apache-2.0 ONNX, 1,744,116 bytes, at public/models/minifasnet/ with its LICENSE bundled. Input 80x80 BGR raw 0-255 (the model is trained on raw pixel values, not /255) NCHW from a 2.7x face-bbox crop (clamped to the frame so a close phone selfie is not padded with black). Its 3-class output is [print, live, replay] with the live/real class at index 1; we remap to [live, print, replay] for the receipt. The verdict passes when the live class clears 0.30 (a spoof can split confidence across the print and replay classes, so we threshold the live class, not a single spoof class); validated on the model's own ground-truth samples (genuine 0.996, spoofs 0.001). The active gesture liveness remains the primary defense. Runs via onnxruntime-web (MIT, WASM EP self-hosted under /onnx-wasm/) in the engine Web Worker, reusing the face-api detection box. Gates after face-match, before proving: fail-open on infra error, fail-closed on a confident spoof. Threshold tuning on real captures is the operator's responsibility.
Engine lifecycle -- Web Worker isolation: The four heavy compute engines (MediaPipe ObjectDetector, @vladmandic/face-api, MiniFASNet via onnxruntime-web, and the arkworks WASM prover) run in one shared Web Worker (lib/worker/engine.worker.ts), driven by a typed RPC client (engine-client.ts) with a PROTOCOL_VERSION handshake. The four engine lib/ modules are thin shims marshalling image data as transferred ImageBitmaps (zero-copy). worker.terminate() hard-frees the entire WASM/tfjs heap. Lifecycle: lazy spawn, persist through the flow, terminateWorker() after the proof (primary reclaim), unmount backstop. Left on the main thread: liveness (webcam-bound) and Tesseract (own internal worker). Low-memory advisory on the idle step when navigator.deviceMemory is 2 or less.
Stateless verification endpoint: POST /api/verify verifies a posted proof with arkworks and returns a PII-free compliance receipt: { verified, policy_evaluation: { is_adult, freshness_delta_days, basis_date_matched, crypto_valid }, cryptographic_receipt: { proof_hash, verifier_engine: "arkworks-0.6" } }. Policy: ageValid === "1" plus a timezone-safe freshness check on todayDate (decoded to calendar dates, +/-FRESHNESS_DAYS = 2). proof_hash is a sha256 of the canonical proof bytes (deterministic, not JSON.stringify). No DB, no sessions, no account mutation; the operator decides what to do with the verdict.
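
To make the policy step of the verification endpoint concrete, here is a sketch of how the public signals might be evaluated once the cryptographic check has run. The field names follow the receipt shape above; everything else is an assumption: the function names, the decimal-string encoding of todayDate, the sign convention of freshness_delta_days (basis date minus server UTC date), the null value for an undecodable date, and the rule that verified requires all three checks to pass.

```typescript
const FRESHNESS_DAYS = 2;
const MS_PER_DAY = 86_400_000;

interface PolicyEvaluation {
  is_adult: boolean;
  freshness_delta_days: number | null; // null when todayDate is not a real calendar date
  basis_date_matched: boolean;
  crypto_valid: boolean;
}

// Decode packed YYYYMMDD into UTC-midnight milliseconds, or null if it is not a real date.
function decodePackedDate(packed: number): number | null {
  const year = Math.floor(packed / 10000);
  const month = Math.floor(packed / 100) % 100;
  const day = packed % 100;
  const d = new Date(Date.UTC(year, month - 1, day));
  const valid =
    d.getUTCFullYear() === year && d.getUTCMonth() === month - 1 && d.getUTCDate() === day;
  return valid ? d.getTime() : null;
}

// publicSignals is [ageValid, todayDate]; cryptoValid is the verifier's result.
function evaluatePolicy(
  publicSignals: [string, string],
  cryptoValid: boolean,
  nowMs: number,
): PolicyEvaluation {
  const [ageValid, todayDate] = publicSignals;
  const basisMs = decodePackedDate(Number(todayDate));
  // Compare whole calendar days so the time of day does not matter.
  const serverDayMs = Math.floor(nowMs / MS_PER_DAY) * MS_PER_DAY;
  const delta = basisMs === null ? null : Math.round((basisMs - serverDayMs) / MS_PER_DAY);
  return {
    is_adult: ageValid === "1",
    freshness_delta_days: delta,
    basis_date_matched: delta !== null && Math.abs(delta) <= FRESHNESS_DAYS,
    crypto_valid: cryptoValid,
  };
}

// Assumption: the overall verdict requires every check to pass.
function isVerified(p: PolicyEvaluation): boolean {
  return p.crypto_valid && p.is_adult && p.basis_date_matched;
}

const noon = Date.UTC(2025, 5, 15, 12, 0, 0); // 2025-06-15T12:00Z
console.log(evaluatePolicy(["1", "20250616"], true, noon).freshness_delta_days); // 1
console.log(isVerified(evaluatePolicy(["1", "20250610"], true, noon)));          // false
```

Comparing decoded calendar dates with a tolerance, rather than requiring an exact match, is one way to absorb clients whose local date is a day ahead of or behind the server's UTC date.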