forwardlane_v1 — pilot pack — live

Real human data for agentic LLM training.

Ten production-grade slices drawn from a live operational graph — joined, canonicalised, pseudonymised, provenance-back-linked, and shipped with three orthogonal privacy gauges in a signed manifest.

HUBSPOT, SALESFORCE, ASANA, JIRA, DOCFORGE, FL-REPOS, OPENTELEMETRY, FALKORDB, DUCKDB, QWEN3 · MRL, GRAPHITI, MAGMA, PROV-O, FIBO
§ Payload
FOUR SOURCE LAYERS → ONE GRAPH

What rides inside the pack.

Every entity carries a JSON-LD @type array. Every passage carries a PROV-O back-link. Every edge carries (t_valid, t_invalid, t_created, t_expired) — contradictions invalidate rather than overwrite.
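The "invalidate rather than overwrite" rule can be sketched as below. This is a minimal illustration, not the pack's actual schema: the `Edge` class, the `assert_fact` helper, and the assumption that each predicate holds one value at a time are all hypothetical. Only the four timestamp names come from the text above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    subject: str
    predicate: str
    obj: str
    t_valid: datetime                     # when the fact became true in the world
    t_created: datetime                   # when the graph recorded the edge
    t_invalid: Optional[datetime] = None  # when the fact stopped being true
    t_expired: Optional[datetime] = None  # when the graph retired the edge


def assert_fact(edges: List[Edge], new: Edge) -> List[Edge]:
    """Append `new`; close any live edge it contradicts instead of deleting it."""
    out = []
    for e in edges:
        # Assumption: a predicate is single-valued, so a different object
        # on the same (subject, predicate) counts as a contradiction.
        contradicts = (
            e.subject == new.subject
            and e.predicate == new.predicate
            and e.obj != new.obj
            and e.t_expired is None
        )
        if contradicts:
            e = replace(e, t_invalid=new.t_valid, t_expired=new.t_created)
        out.append(e)
    out.append(new)
    return out


def day(d: int) -> datetime:
    return datetime(2025, 1, d, tzinfo=timezone.utc)


edges = assert_fact([], Edge("person:a", "works_at", "org:x", day(1), day(2)))
edges = assert_fact(edges, Edge("person:a", "works_at", "org:y", day(10), day(11)))
```

After the second call the older edge is still present, with `t_invalid` and `t_expired` set, so a query "as of" an earlier date can still recover it.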

01
Operational graph
A Living Operational Data corpus

25.5K contacts · 6.2K companies · 87 deals across HubSpot, Salesforce, Asana and Jira — joined into one canonical knowledge graph with HMAC-keyed pseudonyms.

02
Long-document store
15,341 Rust-parsed docs

~38 GB · ~4.5B markdown characters from Drive and iCloud. 7,326 organisations and 4,044 persons extracted with provenance back to source URIs.

03
Code + ontology
151 ontology-enriched repos

70,019 mapping edges across 15 ontologies — BFO, FIBO, UFO, PROV-O, SDLC, CWE, OWASP — on 110 repos. 43% of latest-revision functions carry ≥ 1 mapping.

04
Agent traces
22,915 OTel spans

6,698 unique trace IDs · 21,117 embedded passages. Every passage carries a PROV-O back-link to its source document URI.

§ Instrumentation

Three orthogonal gauges, one signed manifest.

A reviewer audits any slice without re-running the pipeline: recompute SHA-256 over the bytes, confirm the gauge band matches the value, verify the operator's ed25519 signature against the published fingerprint.
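The first two audit steps might look roughly like the sketch below. The manifest field names (`sha256`, `gauge_value`, `gauge_band`) and the band thresholds are assumptions made for illustration; the real manifest layout is not described here. The ed25519 check is left out because it needs a third-party library (for example the `cryptography` package's `Ed25519PublicKey.verify`).

```python
import hashlib

# Hypothetical band edges; real thresholds would come from the manifest.
BANDS = [("tight", 0.0, 0.1), ("moderate", 0.1, 0.3), ("wide", 0.3, float("inf"))]


def band_for(value: float) -> str:
    for name, lo, hi in BANDS:
        if lo <= value < hi:
            return name
    raise ValueError(f"value out of range: {value}")


def audit_slice(payload: bytes, entry: dict) -> list:
    """Return the list of failed checks; an empty list means both passed."""
    failures = []
    if hashlib.sha256(payload).hexdigest() != entry["sha256"]:
        failures.append("sha256 mismatch")
    if band_for(entry["gauge_value"]) != entry["gauge_band"]:
        failures.append("band does not match value")
    return failures


payload = b"slice bytes"
entry = {
    "sha256": hashlib.sha256(payload).hexdigest(),
    "gauge_value": 0.05,
    "gauge_band": "tight",
}
print(audit_slice(payload, entry))         # no failures
print(audit_slice(payload + b"x", entry))  # digest no longer matches
```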

G·01
ACTIVE
HMAC pseudonymisation

Per-slice keys rotated per release. Canonical URIs never leave the resolver — only HMAC-derived pseudonym IDs ship in the pack.
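A minimal sketch of HMAC-derived pseudonyms, assuming HMAC-SHA-256 and a truncated hex ID. The `pseudonym` function, the `p_` prefix, and the 16-character length are illustrative choices, not the pack's documented format.

```python
import hashlib
import hmac


def pseudonym(canonical_uri: str, slice_key: bytes, length: int = 16) -> str:
    """Derive a stable pseudonym ID from a canonical URI under a per-slice key."""
    digest = hmac.new(slice_key, canonical_uri.encode("utf-8"), hashlib.sha256)
    return "p_" + digest.hexdigest()[:length]


uri = "https://example.invalid/contact/123"  # placeholder URI
same_a = pseudonym(uri, b"release-1-key")
same_b = pseudonym(uri, b"release-1-key")
rotated = pseudonym(uri, b"release-2-key")
```

Within one key the mapping is deterministic, so joins across tables survive. Rotating the key gives the same URI a different ID, which makes pseudonyms from different releases unlinkable without the keys.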

G·02
ACTIVE
Wasserstein-1 leakage

Each slice carries a band — tight / moderate / wide — for distributional distance from the held-out reference. Villani 2009; scipy bootstrap CI.
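As a rough illustration of how a band could be derived, the sketch below computes Wasserstein-1 between two one-dimensional samples and a percentile bootstrap interval. It swaps in a standard-library implementation that assumes equal sample sizes; `scipy.stats.wasserstein_distance` covers the general case. The band thresholds and function names are placeholders.

```python
import random


def wasserstein_1(xs, ys):
    """W1 between two equal-size 1-D samples: mean gap between sorted values."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def bootstrap_ci(xs, ys, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for W1, resampling both samples."""
    rng = random.Random(seed)
    stats = sorted(
        wasserstein_1(rng.choices(xs, k=len(xs)), rng.choices(ys, k=len(ys)))
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def band(upper, tight=0.1, moderate=0.5):
    # Thresholds are placeholders, not the pack's real band edges.
    return "tight" if upper < tight else "moderate" if upper < moderate else "wide"


rng = random.Random(1)
slice_values = [rng.gauss(0.0, 1.0) for _ in range(200)]
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
lo, hi = bootstrap_ci(slice_values, reference)
label = band(hi)  # band on the upper end of the interval
```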

G·03
ACTIVE
LiRA membership inference

Likelihood-ratio attack at FPR = 10⁻³ against a shadow model. We ship the band, not the raw score. Carlini et al. 2022.
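The reporting metric, true-positive rate at a fixed false-positive rate, can be sketched as follows. This covers only the final scoring step, not LiRA itself (no shadow-model training or likelihood-ratio computation); the function name and the input score lists are hypothetical.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=1e-3):
    """TPR at a threshold chosen so that the FPR stays <= target_fpr."""
    neg = sorted(nonmember_scores, reverse=True)
    allowed = int(target_fpr * len(neg))  # non-members we may misclassify
    # A score is flagged "member" only if strictly above this threshold,
    # so at most `allowed` non-members are flagged.
    threshold = neg[allowed]
    return sum(s > threshold for s in member_scores) / len(member_scores)


nonmembers = [i / 1000 for i in range(1000)]  # toy attack scores
members = [0.5, 0.9995, 2.0, 3.0]
rate = tpr_at_fpr(members, nonmembers)
```

With fewer than 1,000 non-member scores, `allowed` is zero at FPR = 10⁻³, so the threshold is the highest non-member score and the estimate is coarse.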

G·04
ACTIVE
Carlini canary exposure

Secret-Sharer canaries planted at controlled rarity; we publish the 95th-percentile exposure as an upper bound. Carlini 2019 + 2021.
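For reference, the exposure metric from the Secret Sharer paper is log2 |R| minus log2 of the canary's rank among all candidates in the randomness space R. The sketch below computes it and a nearest-rank 95th percentile. The ranks and space size are made-up inputs, and the nearest-rank percentile method is an assumption.

```python
import math


def exposure(rank: int, space_size: int) -> float:
    """Secret-Sharer exposure: log2 |R| - log2 rank, where rank 1 is most likely."""
    if not 1 <= rank <= space_size:
        raise ValueError("rank must lie in [1, space_size]")
    return math.log2(space_size) - math.log2(rank)


def percentile_95(values):
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    k = -(-95 * len(ordered) // 100) - 1
    return ordered[k]


space = 1024                       # hypothetical |R|
ranks = [1, 64, 256, 512, 1024]    # hypothetical canary ranks
upper_bound = percentile_95([exposure(r, space) for r in ranks])
```

A canary ranked first in a space of 1,024 has exposure 10 bits; one ranked last has exposure 0.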

§ Lineage

Standing on the shoulders of giants.

FULL BIBLIOGRAPHY IN THE DECK APPENDIX

[01] Rasmussen et al. 2025, Zep / Graphiti, arXiv:2501.13956
[02] MAGMA 2026, arXiv:2601.03236
[03] MemWeaver 2026, arXiv:2601.18204
[04] NLSHBlock 2024 · ComEM 2024
[05] Qwen3-Embedding-0.6B 2025 · Matryoshka 2022
[06] Carlini, Secret Sharer / Extraction / LiRA, 2019 · 2021 · 2022
[07] Sweeney 2002 · Machanavajjhala 2007 · Villani 2009
[08] BFO · UFO · CCO · FIBO · PROV-O · SDLC · CWE · OWASP

Want the full pack?

Sign in with Google to open the interactive deck and the downloadable sample data. Access is allowlisted — if your domain isn't recognised, reach nathan@forwardlane.com.