# ZK Identity Networking App — Feature Specification

*What the app does. Who it's for. How every flow works. Where the revenue comes from. Why it runs with near-zero operational overhead.*

---

## At a Glance

A privacy-first, agent-powered conference networking application. Attendees describe what they're looking for; an on-device AI builds a structured intent profile; Bluetooth mesh surfaces high-signal matches in real time; zero-knowledge proofs reveal identity only after mutual in-person consent. The server is a thin relay, not a brain. The agent lives on the phone. The platform is licensed to conference organizers as a white-label product with no bespoke engineering per deployment.

Three things make it different from what already exists:

1. **The AI does the onboarding.** Users don't fill out forms. They paste a short self-summary, drop a PDF or URL, and answer 3–10 adaptive questions. The agent synthesizes one coherent intent profile.
2. **Credentials reveal only after mutual tap.** Two people meet, both agree to exchange, and a zero-knowledge proof unlocks the credentials. No public profile. No broadcast of who you are.
3. **It runs on the device.** Inference, matching, transcription, proof generation — all local. The server stores nothing sensitive and can be scaled down to a single commodity VM per event.

---

## User Archetypes

Five types of people show up to a conference. Each has a different upload pattern, intent structure, and emotional relationship with the app. The system handles all five without any of them feeling like they're using someone else's tool.

### 1. Founder

**What they have:** A pitch deck. A thesis. Maybe a white paper, 1-pager, or project README. A project with stage, traction, and an ask.

**What they want:** Investors matching their stage and category. Engineers. Other founders to collaborate with or learn from.

**Upload pattern:** Drops a PDF. The on-device model extracts structured JSON. No raw file stored — only the extracted structure. This is the "Pitch Parser" scout at work.

**Extracted JSON shape:**
```json
{
  "thesis": "One-sentence description of what we're building and why",
  "stage": "seed",
  "traction": "One-sentence proof of momentum — users, revenue, partnerships",
  "ask": "What we need from the right investor/collaborator",
  "team": ["name + role", "name + role"],
  "risks": ["risk 1", "risk 2"],
  "key_metric": "The one number that matters most right now"
}
```

**Mode:** Active — broadcasts intent via BLE mesh. Founders are hunting.

---

### 2. Investor / VC

**What they have:** A thesis. Stage preferences. Category focus. Fund size or typical check size. Possibly a track record they don't want to broadcast.

**What they want:** Deal flow matching their thesis. They don't want to be found. They want to find.

**Upload pattern:** Zero documents. Adaptive cards walk them through their criteria. The AI builds the profile entirely from their answers.

**Adaptive card sequence:**
1. "What stages do you invest in?" → multi-select: Pre-seed / Seed / Series A / Series B+
2. "What categories are you active in?" → multi-select: DeFi / ZK / Infrastructure / Consumer / Governance / Other
3. "What's your typical check size range?" → range picker
4. "What are you specifically looking for right now?" → free text, AI-summarized into a vector
5. Cards branch from here based on answers.

**Mode:** Passive only. Receives, never broadcasts. When an Active user's beacon clears a 70% match score, it surfaces as a notification.

---

### 3. Talent / Engineer

**What they have:** A GitHub repo. A portfolio URL. Skills. Past projects. A resume if they're formal about it.

**What they want:** Projects that need their skills. Founders looking to hire. Collaborators.

**Upload pattern:** Pastes a URL or drops a resume PDF. URL triggers the "URL Parser" scout — pulls GitHub README, recent activity, languages, project descriptions, stars/forks. Returns structured JSON only. Raw page content never reaches the core intent model.

**Extracted JSON shape (GitHub):**
```json
{
  "primary_languages": ["Rust", "TypeScript", "Solidity"],
  "active_projects": [
    { "name": "ZK-bridge", "description": "Cross-chain ZK proof relay", "stars": 340 }
  ],
  "recent_activity": "3 commits/day average over last 30 days",
  "skills_inferred": ["smart contracts", "ZK circuits", "cross-chain", "Rust systems"],
  "availability": "open to collaboration"
}
```

**Mode:** Either. Engineers hunting for projects go Active. Engineers open to being found go Passive. One toggle during onboarding.

---

### 4. Business Developer

**What they have:** Knowledge of multiple projects. Relationship context. The ability to make introductions across their network.

**What they want:** Connections that create value across their portfolio. Intros in both directions. Being the person who puts the right people together.

**Upload pattern:** Hybrid. Uploads about a specific project, or describes what they're brokering via adaptive cards. The AI synthesizes everything into a single coherent profile.

**Mode:** Active. BizDev people are hunters.

---

### 5. Solo Operator / Generalist

**What they have:** A vague sense of what they want. Maybe skills, maybe interests, maybe just "I'm here to see what happens."

**What they want:** Opportunities. Projects. Jobs. Interesting conversations.

**Upload pattern:** Minimal. Self-summary does the heavy lifting. Adaptive cards fill the 2–3 highest-impact gaps. Maybe a portfolio link.

**Mode:** Either. Adaptive cards make this easy regardless of upload volume.

---

## Input Channels

The AI builds its understanding of a user from three sources, layered. Each source adds signal. The agent synthesizes across all of them into one coherent intent profile — not three separate profiles.

### Channel 1: Self-Summary (Baseline)

A rich, non-formulaic snapshot of who the user is and what they care about. The intent fingerprint.

**How it works:**
1. User taps "Generate Summary"
2. App copies a prompt to clipboard and opens the user's preferred AI assistant via deep link
3. User pastes the prompt into the AI assistant of choice, gets a paragraph-length summary back
4. User pastes the summary into the app (editable before submit)
5. On-device model parses it locally — extracts vectors for matching, hashes for broadcast
6. Stored encrypted on device

**Why copy/paste and not a direct API:** One tap. No OAuth dance. No storing third-party credentials. The user controls the text.

---

### Channel 2: Document / URL Upload

**PDF (pitch deck, white paper, 1-pager, resume):**
1. User drops file
2. "Pitch Parser" scout activates — receives PDF bytes, locked output schema
3. On-device model reads and extracts structured JSON
4. Scout returns JSON only — raw PDF content never reaches the core intent model
5. PDF is deleted from device after extraction

**URL (GitHub, portfolio, project site):**
1. User pastes URL
2. "URL Parser" scout activates — receives URL string only
3. Scout fetches the page, parses content, extracts structured JSON per its schema
4. Returns JSON only — raw page content never reaches the core intent model
5. Critical for security: a malicious portfolio site could embed prompt injection in its HTML. Because only schema-conformant JSON leaves the scout, injected text has no direct path to the core intent model.

**What's stored:** Only the extracted JSON. Never the raw document or page content.
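
One way to read "locked output schema" is a strict allowlist check applied to whatever a scout emits before the core intent model sees it. The sketch below assumes the founder pitch shape shown earlier; `PITCH_SCHEMA`, `MAX_FIELD_CHARS`, and `validate_locked` are hypothetical names, and the length cap is an added assumption rather than something the spec states.

```python
from typing import Any

# Hypothetical locked schema mirroring the founder pitch shape shown earlier.
PITCH_SCHEMA: dict[str, type] = {
    "thesis": str,
    "stage": str,
    "traction": str,
    "ask": str,
    "team": list,
    "risks": list,
    "key_metric": str,
}

MAX_FIELD_CHARS = 280  # assumed cap on any single string value


def validate_locked(payload: dict[str, Any], schema: dict[str, type]) -> dict[str, Any]:
    """Return a copy of payload if it matches the schema exactly; raise ValueError otherwise."""
    extra = sorted(set(payload) - set(schema))
    missing = sorted(set(schema) - set(payload))
    if extra or missing:
        raise ValueError(f"schema mismatch: extra={extra} missing={missing}")
    clean: dict[str, Any] = {}
    for key, expected in schema.items():
        value = payload[key]
        if not isinstance(value, expected):
            raise ValueError(f"{key}: expected {expected.__name__}")
        # Lists must contain only short strings; scalars are checked the same way.
        items = value if isinstance(value, list) else [value]
        if any(not isinstance(i, str) or len(i) > MAX_FIELD_CHARS for i in items):
            raise ValueError(f"{key}: values must be strings of at most {MAX_FIELD_CHARS} chars")
        clean[key] = list(value) if isinstance(value, list) else value
    return clean
```

Rejecting unknown keys means a page cannot add a field of its own, such as an instruction block, to the scout's output. Type and length checks narrow the channel further, although they do not by themselves prove that a remaining string is benign.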

---

### Channel 3: Adaptive Cards

3–10 dynamically generated questions, tailored to what the AI doesn't already know from the summary and uploads.

**How the AI decides what to ask:**
1. After Channels 1 and 2, compute confidence score on the intent profile
2. Identify the highest-impact gaps — fields where filling them would most change match outcomes
3. Generate cards that target those gaps specifically
4. Stop when confidence reaches 85%
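
The selection loop above can be sketched as follows. The spec does not define how confidence is computed, so this assumes an impact-weighted mean of per-field confidence; `ProfileField`, `profile_confidence`, and `plan_cards` are hypothetical names, and treating the 3 to 10 card range as hard bounds is also an assumption.

```python
from dataclasses import dataclass

CONFIDENCE_TARGET = 0.85  # stop threshold from the spec
MIN_CARDS = 3             # lower bound on cards (assumed to be a hard floor)
MAX_CARDS = 10            # upper bound on cards


@dataclass
class ProfileField:
    name: str
    confidence: float  # 0.0 to 1.0: how sure the agent is about this field (assumed)
    impact: float      # assumed weight: how much this field moves match outcomes


def profile_confidence(fields: list[ProfileField]) -> float:
    """Impact-weighted mean of per-field confidence (an assumed scoring rule)."""
    total = sum(f.impact for f in fields)
    return sum(f.confidence * f.impact for f in fields) / total


def plan_cards(fields: list[ProfileField], answered_confidence: float = 0.95) -> list[str]:
    """Choose which fields to ask about, largest expected gain first."""
    asked: list[str] = []
    while len(asked) < MAX_CARDS and (
        len(asked) < MIN_CARDS or profile_confidence(fields) < CONFIDENCE_TARGET
    ):
        candidates = [f for f in fields if f.name not in asked]
        if not candidates:
            break
        # Expected gain: a high-impact field the agent knows little about ranks first.
        gap = max(candidates, key=lambda f: f.impact * (1.0 - f.confidence))
        gap.confidence = max(gap.confidence, answered_confidence)  # simulate an answer
        asked.append(gap.name)
    return asked
```

In a real flow each answer would feed back into the on-device model and could change the remaining gaps, which is what produces the branching behavior; the fixed `answered_confidence` here only stands in for that step.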

**Branching logic example (Founder path):**
- Card 1: "You mentioned ZK — is your protocol infrastructure or user-facing?"
- If Infrastructure: Card 2: "Are you hiring engineers or looking for a technical co-founder?"
- If Hiring + co-founder: Card 3: "What's your budget range for the first hire?"
- The sequence continues in the same way and stops after 5 cards, once confidence has reached 87% (above the 85% cutoff)

**Key principle:** The AI does the work. The user just confirms or corrects. No long forms. No drop-downs. Onboarding feels like a conversation, not a registration.

---

## Core User Flows

### Flow 1: Onboarding (5–10 min, one-time per event series)

```
Splash → Wallet Creation → Self-Summary → Voice Interview → Document Upload →
ID Verification (optional) → Mode Toggle → Calendar Sync → Permission Bundle → Done
```

1. **Splash.** 30 seconds. Plain language. No jargon.
2. **Wallet Creation.** Sign in with Google, Apple, or email. An embedded smart-wallet provider provisions a smart account in ~200ms. No seed phrase. Keys live in a TEE. The DID is registered on an L2 automatically, gas sponsored (free to user).
3. **Self-Summary.** Copy prompt → paste into AI → paste result back.
4. **Voice Interview.** App reads 3–5 questions aloud. User speaks. OS-native speech recognition transcribes offline. 60–90 seconds. Feels like a quick chat. Builds intent *and* scores credibility depth simultaneously.
5. **Document Upload.** Optional. Pitch deck, resume, or GitHub URL. Skippable.
6. **ID Verification.** Optional. Photo of ID processed on-device, confirms authenticity, then deleted. Only a credibility tier bump survives.
7. **Mode Toggle.** Active (broadcast) or Passive (listen).
8. **Calendar Sync.** OAuth to Google or an event platform. App detects which talks/mixers the user is attending.
9. **Permission Bundle.** One ask covering Bluetooth, Calendar read, Microphone, Camera — with a brief explanation of why each is needed.

---

### Flow 2: Discovery & Matching

The passive, background experience. The app does the work while the user lives their conference life.

**Presence Detection (Calendar-GPS Fusion):**

The app needs to know the user is actually at an event, not just that they said they'd attend. Constant GPS polling would drain the battery, so the calendar acts as the oracle and GPS as a spot-check.

1. App sleeps until an event window opens. Zero battery drain before this.
2. Event window opens. App prompts: "At [Event Name]? Tap to check in."
3. User taps yes. One GPS ping confirms they're inside the venue geofence.
4. Presence flag set with confidence score. Active for the event duration.
5. A low-frequency background check clears the flag automatically if the user leaves the geofence.
6. Overlapping events prompt a manual select.

**Total battery cost for presence:** Under 5% across an 8-hour conference day.
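
The single-ping geofence check in step 3 could look roughly like this. The sketch assumes a circular geofence (center plus radius) and derives the confidence score from how much of the GPS accuracy circle falls inside the fence; the spec defines neither the fence shape nor the scoring rule, so `presence_confidence` is a hypothetical illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def presence_confidence(
    fix: tuple[float, float],
    accuracy_m: float,
    center: tuple[float, float],
    radius_m: float,
) -> float:
    """Score 0.0 to 1.0 for one GPS fix against a circular venue geofence."""
    d = haversine_m(fix[0], fix[1], center[0], center[1])
    if d + accuracy_m <= radius_m:
        return 1.0  # the whole accuracy circle is inside the fence
    if d - accuracy_m >= radius_m:
        return 0.0  # the whole accuracy circle is outside the fence
    # Partial overlap: interpolate linearly across the uncertainty band.
    return (radius_m + accuracy_m - d) / (2.0 * accuracy_m)
```

The presence flag would then be set when the score clears some cutoff chosen per event; that cutoff is not specified here.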

**Active Mode Broadcasting:**

1. Every 5–10 minutes, app broadcasts a BLE beacon packet
2. Packet contains only hashes and metadata — no raw content:
```json
{
  "did_hash": "sha3(did_id)",
  "interests_hash": "sha3(interest_vector_quantized)",
  "pitch_hash": "sha3(pitch_json)",
  "stage": "seed",
  "ask": "zk_cofounder",
  "key_metric_hash": "sha3(key_metric)",
  "presence": true,
  "ttl": 600
}
```
3. BLE mesh relays via nearby phones — extends range without internet
4. Encrypted with a session key derived from the DID
5. Battery cost: ~2mA at 5–10 min intervals
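
A minimal sketch of how the packet fields might be derived on-device, assuming SHA3-256 for every `sha3(...)` entry, 8-bit quantization of the interest vector, and canonical JSON for the pitch hash. None of these choices are stated in the spec, and `quantize`, `canonical_json`, and `build_beacon` are hypothetical helpers.

```python
import hashlib
import json
import struct


def sha3_hex(data: bytes) -> str:
    """Hex-encoded SHA3-256 digest."""
    return hashlib.sha3_256(data).hexdigest()


def quantize(vector: list[float], levels: int = 127) -> bytes:
    """Map floats in [-1, 1] to signed 8-bit integers so near-equal vectors hash the same."""
    clipped = [max(-1.0, min(1.0, x)) for x in vector]
    return struct.pack(f"{len(clipped)}b", *(round(x * levels) for x in clipped))


def canonical_json(obj: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so the hash ignores key order."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_beacon(
    did: str,
    interest_vector: list[float],
    pitch: dict,
    ask_code: str,
    present: bool = True,
    ttl: int = 600,
) -> dict:
    """Assemble the broadcast fields; only hashes and coarse metadata are included."""
    return {
        "did_hash": sha3_hex(did.encode("utf-8")),
        "interests_hash": sha3_hex(quantize(interest_vector)),
        "pitch_hash": sha3_hex(canonical_json(pitch)),
        "stage": pitch["stage"],
        "ask": ask_code,
        "key_metric_hash": sha3_hex(pitch["key_metric"].encode("utf-8")),
        "presence": present,
        "ttl": ttl,
    }
```

A full SHA3-256 digest is 32 bytes, while a legacy BLE advertising payload carries at most 31 bytes, so a real packet would likely need truncated hashes, a compact binary encoding, or extended advertising. The JSON form is kept here only because it mirrors the shape shown above.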

**Passive Mode Listening:**

1. App listens continuously for BLE beacons (BLE is designed for low-power background scanning)
2. Local scout extracts only allowed fields from incoming packets
3. Deterministic matching runs on-device:
   - Cosine similarity on interest vectors
   - Stage and category hard filter
   - 70% threshold
4. If the match clears: ZK proof exchange via Semaphore. The Active user proves specific attribute claims without revealing their full interest set. ~1KB proof, <1s generation, verified locally.
5. Notification surfaces with context: "High match — seed-stage ZK founder. Looking for infrastructure co-founder."
6. 90%+ of beacons are pruned before the notification stage. No flood.
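
The deterministic matching step can be sketched as below. It reads the 70% threshold as cosine similarity of at least 0.70 and applies the stage and category filters first. The beacon itself carries only a hash of the interest vector, so the sketch assumes a comparable vector is made available to the listener by some mechanism not covered here; `is_match` and its parameters are hypothetical.

```python
import math

MATCH_THRESHOLD = 0.70  # the 70% threshold, read as cosine similarity


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_match(
    listener_vec: list[float],
    listener_stages: set[str],
    listener_categories: set[str],
    beacon_vec: list[float],
    beacon_stage: str,
    beacon_category: str,
) -> bool:
    """Hard filters run first so most beacons are dropped before any vector math."""
    if beacon_stage not in listener_stages:
        return False
    if beacon_category not in listener_categories:
        return False
    return cosine_similarity(listener_vec, beacon_vec) >= MATCH_THRESHOLD
```

Running the cheap set-membership filters before the similarity computation is one plausible way to get the pruning described in step 6.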

**Spam Protection:**
- Rate limit: 3 pings/min per source DID hash. Exceeded → dropped.
- Block list: swipe left on any notification → permanent ignore.
- HMAC validation: beacon packets include an HMAC keyed to the DID claim. Spoofed DIDs fail validation.
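
A rough sketch of the rate limit and the HMAC check, using only the Python standard library. The sliding 60-second window and HMAC-SHA256 are assumptions, since the spec names neither the window type nor the hash, and how the verifying key is derived from the DID claim is left out entirely. `BeaconRateLimiter` and `verify_beacon` are hypothetical names.

```python
import hashlib
import hmac
from collections import defaultdict, deque


class BeaconRateLimiter:
    """Sliding-window limiter: at most max_per_window beacons per source DID hash."""

    def __init__(self, max_per_window: int = 3, window_s: float = 60.0) -> None:
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._seen: dict[str, deque] = defaultdict(deque)

    def allow(self, did_hash: str, now: float) -> bool:
        """Return True if this beacon should be processed, False if it should be dropped."""
        timestamps = self._seen[did_hash]
        # Forget beacons that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_s:
            timestamps.popleft()
        if len(timestamps) >= self.max_per_window:
            return False
        timestamps.append(now)
        return True


def beacon_tag(key: bytes, payload: bytes) -> bytes:
    """HMAC-SHA256 tag over the serialized beacon payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_beacon(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Constant-time comparison so timing does not leak how many tag bytes matched."""
    return hmac.compare_digest(beacon_tag(key, payload), tag)
```

Dropped beacons are not recorded, so a source that keeps flooding is throttled to the limit rather than locked out indefinitely; the swipe-left block list covers the permanent case.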

---

### Flow 3: Handshake & Meeting

1. User taps "Interested" on a match notification
2. App pings the Active user via BLE mesh: "Someone matched — send a 15-sec teaser?"
3. Active user agrees: on-device model generates a short audio teaser from the pitch JSON (TTS, ~40KB .opus). Sent via WebRTC P2P. End-to-end encrypted. No server in the middle.
4. Passive user listens. The teaser is the AI's elevator pitch on their behalf.
5. If they want to meet: taps "Meet." App surfaces a location hint — auto-generated from calendar data ("Bar area near main stage") or manually entered ("Table 12").
6. No raw coordinates shared. Coarse zone only.
7. IRL meeting happens. App goes to background.

---

### Flow 4: Post-Meet Contact Swap

*This is the value delivery moment. Everything before this is filtering. This is where the app creates something you keep.*

**Mutual consent gate:**
Both users must tap "Share Contacts." Neither party's data moves until both have agreed.
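
The gate can be modeled as a small piece of state that releases nothing until both participants have consented. This is an illustrative sketch only; `SwapSession` and its methods are hypothetical and say nothing about how consent signals travel between the two phones.

```python
from dataclasses import dataclass, field


@dataclass
class SwapSession:
    """Tracks consent for one two-party contact swap."""

    participants: frozenset
    consented: set = field(default_factory=set)

    def consent(self, user: str) -> None:
        if user not in self.participants:
            raise ValueError(f"{user} is not part of this swap")
        self.consented.add(user)

    def revoke(self, user: str) -> None:
        self.consented.discard(user)

    @property
    def unlocked(self) -> bool:
        """True only when every participant has tapped "Share Contacts"."""
        return self.consented == set(self.participants)

    def release(self, cards: dict) -> dict:
        """Return {recipient: other party's card}; refuse while the gate is closed."""
        if not self.unlocked:
            raise PermissionError("both users must consent before any data moves")
        a, b = sorted(self.participants)
        return {a: cards[b], b: cards[a]}
```

Because `release` is the only path that hands out a card, a one-sided tap leaves both cards where they are.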

**Contact selection:**
Each user picks what to expose independently — email, Telegram, LinkedIn, wallet, GitHub, or a free-text "other." One channel or all of them.

**Voice note (optional):**
- 10-second recording
- On-device transcription + summarization into a digest of up to 3 sentences
- Example: *"Alice — ZK seed, sharp on the privacy angle, wants to DM about her deck."*
- Raw audio is deleted after transcription

**The bundle:**
1. Selected contact channels + AI-generated match context + voice note digest → contact card
2. Encrypted with the recipient's public key (ECDH via their DID)
3. Swapped via WebRTC file transfer — P2P, encrypted, max 200KB
4. Stored locally as an `.enc` file
5. Decrypted only when opened
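
The assembly and size check in steps 1 to 3 might be sketched as follows. Encryption is injected as a callable so the example stays within the standard library; in a real build that callable would wrap an authenticated cipher keyed from the ECDH exchange. The card field names, `ALLOWED_CHANNELS`, and `build_bundle` are assumptions for illustration.

```python
import json
from typing import Callable, Optional

MAX_BUNDLE_BYTES = 200 * 1024  # transfer cap from the spec, read here as 200 KiB
ALLOWED_CHANNELS = frozenset({"email", "telegram", "linkedin", "wallet", "github", "other"})


def build_bundle(
    channels: dict,
    match_context: str,
    voice_digest: Optional[str],
    encrypt: Callable[[bytes], bytes],
) -> bytes:
    """Serialize the contact card, encrypt it, and enforce the transfer size cap."""
    if not channels:
        raise ValueError("select at least one contact channel")
    unknown = set(channels) - ALLOWED_CHANNELS
    if unknown:
        raise ValueError(f"unsupported channels: {sorted(unknown)}")
    card = {
        "channels": channels,
        "match_context": match_context,
        "voice_digest": voice_digest,  # None when no voice note was recorded
    }
    plaintext = json.dumps(card, sort_keys=True, ensure_ascii=False).encode("utf-8")
    sealed = encrypt(plaintext)  # placeholder for the ECDH-keyed cipher
    if len(sealed) > MAX_BUNDLE_BYTES:
        raise ValueError("bundle exceeds the 200KB transfer limit")
    return sealed
```

Checking the size after encryption matters because ciphertext is usually slightly larger than plaintext once a nonce and authentication tag are attached.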

**Optional persistence:**
Opt-in decentralized storage (Ceramic/IDX via free IPFS pinning). Cards persist even if both users uninstall.

---

## The Contact Card (What You Keep)

When two people complete a swap, each receives a **contact card**. This is the artifact that makes the app sticky after the event.

A contact card contains:
1. **Contact channels** — whatever the other person chose to share
2. **AI-generated context** — why you matched, what you have in common, what they're looking for
3. **Voice note digest** — 3-sentence summary of the interaction (if recorded)
4. **Match metadata** — overlap score, triggering categories, event context

Not "here's someone's email." Instead, "here's someone's email, here's why you matched, here's what the AI understood about them, here's what they said in their own words." Pick it up three weeks later and know exactly what to say.

---

## On-Device Architecture & Operational Posture

The single most important technical commitment in this product: **the agent lives on the phone, and the server holds nothing it doesn't need to.** This shapes privacy, cost, and the operational burden of running the platform for a new event.

### What Runs On Device

| Component | Runs On Device | Why It Matters |
|-----------|----------------|----------------|
| Intent extraction (PDF, URL, summary) | Local model inference | Raw documents never leave the phone |
| Adaptive card generation | Local | The AI knows what to ask next without a round trip |
| Voice interview transcription | OS-native speech APIs (iOS Speech / Android SpeechRecognizer) | Offline, no cloud dependency |
| BLE beacon generation + signing | Local | Sub-second response; zero server cost per broadcast |
| Matching (cosine similarity, filters) | Local | No intent vectors ever touch a server |
| ZK proof generation (Semaphore) | Local | ~1KB proofs, <1s on mid-range phones |
| Proof verification | Local | Both parties verify without a trusted third party |
| Voice note transcription + summarization | Local | Raw audio is discarded; only the digest survives |
| Contact card encryption (ECDH) | Local | Server never sees contact payloads |
| Photo ID check | Local | Photo deleted immediately after verification |

### What The Server Does

Deliberately almost nothing. The server is a thin relay plus a handful of infrastructure primitives:

- **BLE fallback relay.** If two phones can't hear each other directly, an event-local relay bridges the beacon. No decryption. No storage.
- **DID registration proxy.** Submits the user's DID claim to the L2 on their behalf, paying gas. Stateless.
- **Event configuration blob.** JSON file per event — venue geofence, branding, allowlisted credential types, sponsor copy. Under 100KB.
- **Sponsor content delivery.** Serves approved sponsor cards and tracks impression counts for billing. No user data attached.
- **Optional IPFS pinning bridge** for users who opt into post-event persistence.
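
As a sketch of what loading the per-event configuration blob could involve, the example below checks the size budget and required sections before building an immutable config object. The key names (`event_name`, `geofence`, `branding`, `credential_types`, `sponsor_cards`) are hypothetical; the spec lists the kinds of content in the blob but not its schema.

```python
import json
from dataclasses import dataclass

MAX_CONFIG_BYTES = 100 * 1024  # "under 100KB" in the spec, read here as 100 KiB
REQUIRED_KEYS = frozenset(
    {"event_name", "geofence", "branding", "credential_types", "sponsor_cards"}
)


@dataclass(frozen=True)
class EventConfig:
    event_name: str
    geofence: dict           # e.g. center and radius, shape assumed
    branding: dict           # colors, logo reference, copy
    credential_types: tuple  # allowlisted credential type identifiers
    sponsor_cards: tuple     # organizer-approved sponsor card definitions


def load_event_config(raw: bytes) -> EventConfig:
    """Parse and validate one event's configuration blob."""
    if len(raw) >= MAX_CONFIG_BYTES:
        raise ValueError("config blob exceeds the size budget")
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return EventConfig(
        event_name=data["event_name"],
        geofence=data["geofence"],
        branding=data["branding"],
        credential_types=tuple(data["credential_types"]),
        sponsor_cards=tuple(data["sponsor_cards"]),
    )
```

Keeping every per-event difference inside this one object is what would let a new deployment ship as a config change rather than a code fork.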

### Operational Implications

Because the agent is local and the server is thin, the cost and maintenance profile of running the platform is closer to a static site than a SaaS product:

- **Per-event server footprint:** One commodity VM, roughly $20–$50/month at baseline, scales modestly with attendee count and sponsor content volume.
- **Scaling behavior:** Matching runs entirely on-device, so adding 1,000 more attendees to an event adds no matching load to the server. It adds load to individual phones, which have headroom.
- **Data breach surface:** Near zero. The server never has raw intent, raw documents, raw audio, contact cards, match results, or identity. A compromise yields the event config file and a list of DID hashes.
- **White-label deployment effort:** A new event stands up with a new config JSON. No code fork. No per-event engineering team. Branding, geofence, credential types, and sponsor slots are configuration, not code.
- **Compliance posture:** Nothing sensitive to retain, nothing to hand over under subpoena, nothing to accidentally log. Simplifies GDPR, CCPA, and enterprise conversations materially.

The product's privacy story and its operating-cost story are the same story. One device-local architecture produces both.

---

## Revenue Model

Two grounded revenue streams, both aligned with how conferences already spend money. The on-device architecture means neither of them requires scaling server capacity proportionally with revenue.

### 1. White-Label Event Licensing (Primary)

Conference organizers license the platform as a branded app for their event. Same codebase, different configuration.

- **Pricing model:** Tiered per-event license based on attendee count. A small regional event (~500 attendees) sits at the low end; a flagship multi-day conference (5,000+ attendees) at the high end. Multi-event contracts (annual event series, tour franchises) are sold as a discounted bundle.
- **What's included:** Branded app, custom geofence and venue map, curated credential types relevant to the audience, sponsor slot management, post-event analytics dashboard (aggregate-only — no individual attendee data).
- **What's not:** Attendee data. There is none to sell. The organizer's value is in running a better event, not harvesting a list.
- **Why organizers pay:** Networking quality is the thing attendees remember and the thing that gets them to buy next year's ticket. Better matching is a direct retention lever. Premium ticket tiers ("includes networking agent access") become a defensible upsell.
- **Unit economics:** Each event deployment is a config change, not a custom build. Incremental cost-of-goods per event is negligible after the first deployment. Margin approaches that of a templated product — typically 85%+ at scale.

### 2. Sponsor Advertising Slots (Secondary)

Sponsors pay to place content in front of relevant attendees. This is the same spend sponsors already make on conference signage, swag, and sponsored sessions — but with meaningful targeting and no wasted impressions.

- **Pricing model:** Flat-fee sponsor packages sold by the event organizer, tiered by slot type (headline sponsor, category sponsor, session sponsor). The platform takes a percentage of sponsor revenue as part of the license terms, or charges the organizer a separate per-impression fee.
- **How placement works:** Sponsors provide a card — logo, tagline, short offer, optional link. The event organizer approves the copy and tags each sponsor card with the attendee archetypes and interest categories it should surface to. The user's agent only shows sponsor cards that match the user's intent profile; everything else is filtered out before it's ever rendered.
- **Why it doesn't annoy attendees:** Relevance is enforced on-device before display. A DeFi founder looking for infrastructure engineers never sees a consumer-wallet ad. A recruiter sees tools and services that help with hiring. Irrelevant content is rejected by the agent, not tolerated by the user.
- **Why sponsors pay:** Conference sponsorship today is a blunt instrument — everyone sees everything, and sponsors pay for a lot of wasted attention. This is the cleanest slice of that spend: verified in-venue presence, archetype-matched targeting, and an auditable impression count per sponsor.
- **Platform take:** Configurable per contract. A typical arrangement is 15–25% of gross sponsor revenue flowing through the platform, with the organizer retaining the rest.
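
The on-device relevance filter described under placement can be sketched as a simple tag match. The rule used here (the user's archetype must be tagged on the card, and at least one interest category must overlap) is an assumption, as are the names `SponsorCard` and `visible_cards`; the spec says only that non-matching cards are filtered before rendering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SponsorCard:
    sponsor: str
    archetypes: frozenset   # archetypes the organizer tagged this card for
    categories: frozenset   # interest categories the organizer tagged this card for


def visible_cards(cards: list, archetype: str, interests: set) -> list:
    """Keep only cards whose tags match the local user's archetype and interests."""
    return [
        card
        for card in cards
        if archetype in card.archetypes and card.categories & interests
    ]
```

For example, a card tagged for founders interested in infrastructure would be returned for a founder whose interests include infrastructure, and dropped for an investor or for a founder with no overlapping category. The filter runs against the local intent profile, so the server never learns which cards a given user was eligible to see.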

### Revenue Composition at Scale

For a healthy conference (2,000 attendees, 30–40 sponsors, 3-day duration), the license fee is the dominant line item and sponsor revenue share is a meaningful secondary. The organizer keeps the majority of sponsor revenue; the platform earns a predictable license plus a participation slice of ad spend.

The server cost for that same event is a single commodity VM and a handful of L2 gas sponsorships. Gross margin on event-level revenue remains well above 90% even with generous customer-success overhead.
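
The arithmetic behind the composition and margin statements is simple enough to write down. The sketch below is parametric: it takes a license fee, gross sponsor revenue, a contract take rate, and direct costs as inputs, and none of those figures come from the spec. `event_revenue` and `gross_margin` are hypothetical helper names.

```python
def event_revenue(license_fee: float, gross_sponsor_revenue: float, take_rate: float) -> dict:
    """Split one event's revenue between platform and organizer."""
    if not 0.0 <= take_rate <= 1.0:
        raise ValueError("take_rate must be a fraction between 0 and 1")
    platform_sponsor_share = gross_sponsor_revenue * take_rate
    return {
        "platform_sponsor_share": platform_sponsor_share,
        "organizer_sponsor_share": gross_sponsor_revenue - platform_sponsor_share,
        "platform_total": license_fee + platform_sponsor_share,
    }


def gross_margin(revenue: float, direct_costs: float) -> float:
    """Gross margin as a fraction: (revenue - direct costs) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - direct_costs) / revenue
```

With direct costs limited to a VM and sponsored gas, the margin is driven almost entirely by how small those costs are relative to `platform_total`.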

---

## Trust & Safety

Covered throughout the flows, but worth consolidating:

- **No raw data leaves the phone without mutual consent.** Beacons are hashes; proofs are zero-knowledge; contact cards only move after both users tap share.
- **Prompt injection defense:** URL parser and PDF parser both emit locked JSON schemas. A malicious portfolio page cannot hijack the core intent model.
- **Sybil resistance:** DID claims anchored to L2; HMAC validation on beacons; Semaphore nullifiers for anonymous group membership without double-counting.
- **Rate limiting + block lists:** Spam is contained at the BLE layer before it reaches the user's notification tray.
- **Optional ID verification:** On-device only. Photo deleted immediately. Only a credibility tier bump is retained.
- **Sponsor content gating:** All sponsor copy is approved by the event organizer before deployment. Agent-side relevance filtering means attendees don't see misaligned ads.

---

## What's Not in v1

- Cloud sync / docking with a personal AI assistant
- Full on-chain Semaphore identity (local for v1)
- On-chain ZK proof verification (local verification for v1)
- LinkedIn integration (no programmatic API access)
- Cross-event identity continuity at scale (the DID is portable; the reputation graph is a post-v1 roadmap item)
- Multi-event-at-once support inside a single app instance (one event at a time for v1)
- Any server-side component handling user data

The v1 product is deliberately local, deliberately minimal, and deliberately light on infrastructure. Everything above is on the roadmap, but the thesis — that a privacy-preserving, agent-powered networking layer can be built almost entirely on the user's device — gets proven first.
