FORWARDLANE_V1 · v0.1.0 · 10 SLICES · PARQUET·ZSTD + JSONL · SHA-256 · ED25519 · BI-TEMPORAL

forwardlane_v1 — pilot pack — live

Real human data for agentic LLM training.

Ten production-grade slices drawn from a live operational graph — joined, canonicalised, pseudonymised, provenance-back-linked, and shipped with three orthogonal privacy gauges in a signed manifest.

Task slices

Documents

Ontology edges

Trace spans

HUBSPOT◆SALESFORCE◆ASANA◆JIRA◆DOCFORGE◆FL-REPOS◆OPENTELEMETRY◆FALKORDB◆DUCKDB◆QWEN3 · MRL◆GRAPHITI◆MAGMA◆PROV-O◆FIBO

§ Payload

FOUR SOURCE LAYERS → ONE GRAPH

What rides inside the pack.

Every entity carries a JSON-LD @type array. Every passage carries a PROV-O back-link. Every edge carries (t_valid, t_invalid, t_created, t_expired) — contradictions invalidate rather than overwrite.
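
To make "invalidate rather than overwrite" concrete, here is a minimal sketch of a bi-temporal edge. The `Edge` class, the `assert_fact` helper, and the rule that a predicate holds one object at a time are illustrative assumptions, not the pack's actual schema or resolver logic.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    subject: str
    predicate: str
    obj: str
    t_valid: datetime              # when the fact became true in the world
    t_invalid: Optional[datetime]  # when it stopped being true (None = still true)
    t_created: datetime            # when the graph recorded the edge
    t_expired: Optional[datetime]  # when the graph retired the record (None = current)


def assert_fact(edges: List[Edge], new_edge: Edge) -> List[Edge]:
    """Add new_edge; close out contradicted edges instead of deleting them.

    Assumption: a (subject, predicate) pair holds one object at a time,
    so a different object on a still-open edge counts as a contradiction.
    """
    out = []
    for e in edges:
        contradicts = (
            e.subject == new_edge.subject
            and e.predicate == new_edge.predicate
            and e.obj != new_edge.obj
            and e.t_invalid is None
        )
        if contradicts:
            # Keep the old edge, but stamp when it stopped holding.
            e = replace(e, t_invalid=new_edge.t_valid, t_expired=new_edge.t_created)
        out.append(e)
    out.append(new_edge)
    return out


def day(d: int) -> datetime:
    return datetime(2025, 1, d, tzinfo=timezone.utc)


history = assert_fact(
    [Edge("person:a", "works_at", "org:x", day(1), None, day(2), None)],
    Edge("person:a", "works_at", "org:y", day(10), None, day(11), None),
)
```

After the call, `history` still holds the `org:x` edge, now closed at the moment the `org:y` edge took effect, so earlier states of the graph remain queryable.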

Operational graph

A Living Operational Data corpus

25.5K contacts · 6.2K companies · 87 deals across HubSpot, Salesforce, Asana and Jira — joined into one canonical knowledge graph with HMAC-keyed pseudonyms.

Long-document store

15,341 Rust-parsed docs

~38 GB · ~4.5B markdown characters from Drive and iCloud. 7,326 organisations and 4,044 persons extracted with provenance back to source URIs.

Code + ontology

151 ontology-enriched repos

70,019 mapping edges across 15 ontologies — BFO, FIBO, UFO, PROV-O, SDLC, CWE, OWASP — on 110 repos. 43% of latest-revision functions carry ≥ 1 mapping.

Agent traces

22,915 OTel spans

6,698 unique trace IDs · 21,117 embedded passages. Every passage carries a PROV-O back-link to its source document URI.
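
A passage record with a JSON-LD `@type` array and a PROV-O back-link might look roughly like the sketch below. `prov:wasDerivedFrom` is a real PROV-O property; the `ex:` namespace, the identifiers, and the field layout are placeholders invented for illustration, not the pack's schema.

```python
import json

# Hypothetical passage record; only the prov: namespace and property are standard.
passage = {
    "@context": {
        "prov": "http://www.w3.org/ns/prov#",
        "ex": "https://example.org/schema#",  # placeholder namespace
    },
    "@id": "urn:example:passage:0001",
    "@type": ["prov:Entity", "ex:Passage"],
    "ex:text": "Example passage text.",
    "prov:wasDerivedFrom": {"@id": "urn:example:document:0042"},
}


def source_uri(record: dict) -> str:
    """Follow the PROV-O back-link from a passage to its source document."""
    return record["prov:wasDerivedFrom"]["@id"]


serialised = json.dumps(passage, indent=2)
```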

§ Instrumentation

Three orthogonal gauges, one signed manifest.

A reviewer audits any slice without re-running the pipeline: recompute SHA-256 over the bytes, confirm the gauge band matches the value, verify the operator's ed25519 signature against the published fingerprint.
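
The first two audit steps can be sketched with the standard library alone. The manifest field names (`sha256`, `band`, `value`) and the band edges are assumptions made for this example; the real manifest layout and thresholds are defined by the pack. Signature verification is left out because Python's standard library has no Ed25519 verifier, so that step would need a third-party library.

```python
import hashlib

# Hypothetical band edges, for illustration only.
BANDS = {
    "tight": (0.0, 0.05),
    "moderate": (0.05, 0.20),
    "wide": (0.20, float("inf")),
}


def sha256_hex(payload: bytes) -> str:
    """Hex SHA-256 digest over the raw slice bytes."""
    return hashlib.sha256(payload).hexdigest()


def audit_slice(entry: dict, payload: bytes) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if sha256_hex(payload) != entry["sha256"]:
        problems.append("sha256 mismatch")
    lo, hi = BANDS[entry["band"]]
    if not (lo <= entry["value"] < hi):
        problems.append("band does not match value")
    return problems
```

In practice `payload` would be the bytes of a slice file read from disk and `entry` the matching manifest record.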

G·01

ACTIVE

HMAC pseudonymisation

Per-slice keys rotated per release. Canonical URIs never leave the resolver — only HMAC-derived pseudonym IDs ship in the pack.
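
A minimal sketch of HMAC-derived pseudonyms, assuming HMAC-SHA-256, a `pid_` prefix, and a 16-hex-character truncation. The hash choice, prefix, length, and example URI are all illustrative, not the pack's actual parameters.

```python
import hashlib
import hmac


def pseudonym(slice_key: bytes, canonical_uri: str, length: int = 16) -> str:
    """Derive a stable pseudonym ID from a canonical URI under a per-slice key."""
    digest = hmac.new(slice_key, canonical_uri.encode("utf-8"), hashlib.sha256)
    return "pid_" + digest.hexdigest()[:length]


# Same key and URI give the same ID; rotating the key changes every ID.
uri = "crm://contact/12345"  # hypothetical canonical URI
id_release_1 = pseudonym(b"release-1-slice-key", uri)
id_release_2 = pseudonym(b"release-2-slice-key", uri)
```

Because the key stays with the resolver, a recipient who holds only the pseudonym IDs cannot recompute them from guessed URIs, and key rotation prevents IDs from being joined across releases.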

G·02

ACTIVE

Wasserstein-1 leakage

Each slice carries a band — tight / moderate / wide — for distributional distance from the held-out reference. Villani 2009; scipy bootstrap CI.
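
For intuition, the Wasserstein-1 distance between two equal-size one-dimensional samples is the mean gap between their sorted values. The sketch below uses only the standard library as a stand-in for the scipy routines named above (`scipy.stats.wasserstein_distance` and `scipy.stats.bootstrap` cover the same ground); the percentile bootstrap, the equal-size restriction, and all parameter defaults are assumptions of this example.

```python
import random


def wasserstein_1(xs, ys):
    """W1 between two equal-size 1-D samples: mean gap between sorted values."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for W1, resampling each side with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    stats = sorted(
        wasserstein_1(rng.choices(xs, k=n), rng.choices(ys, k=n))
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

A band label would then be assigned by comparing the point estimate, or the interval, against fixed thresholds.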

G·03

ACTIVE

LiRA membership inference

Likelihood-ratio attack at FPR = 10⁻³ against a shadow model. We ship the band, not the raw score. Carlini et al. 2022.
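
Reporting an attack "at FPR = 10⁻³" means fixing the decision threshold so that at most one non-member in a thousand is flagged, then measuring how many true members are caught. A minimal sketch, assuming the per-example attack scores are already computed (the likelihood-ratio scoring itself is out of scope here) and that both score lists are non-empty:

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=1e-3):
    """True-positive rate at the loosest threshold whose FPR stays <= target_fpr.

    An example is predicted "member" when its score is strictly above the threshold.
    """
    neg = sorted(nonmember_scores, reverse=True)
    allowed_fp = int(target_fpr * len(neg))  # false positives we can tolerate
    # Using the (allowed_fp + 1)-th largest non-member score as the threshold
    # leaves at most allowed_fp non-members strictly above it.
    threshold = neg[allowed_fp] if allowed_fp < len(neg) else float("-inf")
    true_positives = sum(1 for s in member_scores if s > threshold)
    return true_positives / len(member_scores)
```

A true-positive rate close to the target FPR itself suggests the attack does little better than chance at that operating point.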

G·04

ACTIVE

Carlini canary exposure

Secret-Sharer canaries planted at controlled rarity; we publish the 95th-percentile exposure as an upper bound. Carlini 2019 + 2021.
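
In the Secret Sharer paper, the exposure of a canary is log₂ of the candidate-space size minus log₂ of the canary's rank among all candidates. A minimal sketch of that formula plus a nearest-rank percentile; the percentile method and the function names are choices made for this example, not necessarily what the pack uses.

```python
import math


def exposure(rank: int, space_size: int) -> float:
    """Exposure = log2(|R|) - log2(rank); rank 1 is the most likely candidate."""
    if not 1 <= rank <= space_size:
        raise ValueError("rank must lie in 1..space_size")
    return math.log2(space_size) - math.log2(rank)


def percentile_nearest_rank(values, q: int = 95):
    """Nearest-rank percentile: smallest value with at least q% of the data at or below it."""
    ordered = sorted(values)
    k = max(1, (q * len(ordered) + 99) // 100)  # integer ceiling of q% of n
    return ordered[k - 1]
```

A canary ranked first out of 1,024 candidates has an exposure of 10 bits, while one ranked last has an exposure of 0.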

§ Lineage

Standing on the shoulders of giants.

FULL BIBLIOGRAPHY IN THE DECK APPENDIX

[01] Rasmussen et al. 2025, Zep / Graphiti, arXiv:2501.13956

[02] MAGMA 2026, arXiv:2601.03236

[03] MemWeaver 2026, arXiv:2601.18204

[04] NLSHBlock 2024 · ComEM 2024

[05] Qwen3-Embedding-0.6B 2025 · Matryoshka 2022

[06] Carlini: Secret Sharer 2019 · Extraction 2021 · LiRA 2022

[07] Sweeney 2002 · Machanavajjhala 2007 · Villani 2009

[08] BFO · UFO · CCO · FIBO · PROV-O · SDLC · CWE · OWASP

Want the full pack?

Sign in with Google to open the interactive deck and the downloadable sample data. Access is allowlisted; if your domain isn't recognised, contact nathan@forwardlane.com.