§ 01
Foundations
What the score is, and what it isn’t.
The Cubed Score is a deterministic, weighted, stage-aware composite of 101 underlying factors organised into 9 thematic cubes. It is designed to behave like a venture analyst with a fixed methodology: same inputs produce the same score, every factor must cite evidence, every rating has a defined mathematical impact.
It is not a black-box ML model. The LLM (Gemini 2.5 Flash via the Lovable AI Gateway) is used only to classify each factor as ontrack / attention / critical / unknown and to extract supporting evidence. All arithmetic — weighting, capping, penalties, aggregation, valuation, forecast — is executed in code. The LLM provides perception; the engine provides judgement.
Design principles
- Deterministic math: score is a closed-form function of factor ratings.
- Stage-gated activation: only factors relevant to a company’s maturity are scored; others are excluded, not zeroed.
- Asymmetric penalties: Critical issues cap and penalise; on-track signals never inflate above 1.0×.
- Evidence-bound: every factor must carry a reason and an action; missing evidence collapses to ‘unknown’ rather than being guessed.
- Reproducibility: same scraped corpus + uploads → same score within ±1 pt (temp 0.2).
§ 02
The 9 cubes and their weights
Calibrated for B2B SaaS / Fintech / AI at Seed. The fixed coefficients of the weighted-sum aggregator.
| Cube | Weight (w) | Factors (n) | Focus |
|---|---|---|---|
| Economy | 0.08 | 4 | Macro tailwind/headwind for the sector. |
| Personality | 0.10 | 6 | Founder behavioural traits. |
| Sentiment | 0.08 | 5 | Pattern-match conviction signals. |
| People | 0.12 | 10 | Team quality, depth, alignment (+ A-Player sub-model). |
| Product & Tech | 0.12 | 13 | Fit, differentiation, scalability, delivery. |
| Market | 0.12 | 9 | TAM/SAM, share, timing, defensibility. |
| Revenue Model | 0.13 | 19 | GTM, pipeline, predictability. |
| Financials | 0.13 | 14 | Cash, burn, margin, Rule-of-40, variance. |
| Customers | 0.12 | 21 | NRR, churn, NPS, value realisation. |
Revenue Model and Financials carry the heaviest weight (13% each) because at Seed they are the highest-signal predictors of survival to Series A. People, Product, Market and Customers each carry 12% as the ‘four pillars’ of fundamentals. Personality and Sentiment carry 8–10% — informative but easier to manipulate. Economy is held at 8% because macro is mostly an exogenous filter, not a differentiator.
§ 03
The 101-factor catalogue
Every factor is tagged with one of six stage gates: pre-seed, seed, A (Series A+), Y (Year-2 ops), review (board cadence), H (health / CX). At Seed only pre-seed + seed factors fire; the rest are N/A and excluded from the cube average.
| Detected stage | Active gates | Indicative # active |
|---|---|---|
| Idea / Pre-Seed | pre-seed | ~3 |
| Seed | pre-seed + seed | ~38 |
| Series A | pre-seed + seed + A | ~75 |
| Series B+ | all gates | 101 |
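As a minimal sketch of the gating logic in the table above, the mapping from detected stage to active gates can be written as a lookup. The per-gate counts in `GATE_COUNTS` were tallied by hand from the catalogue listing in this section and are an assumption rather than engine output; `stage_allows` and `active_factor_count` are illustrative names.

```python
# Per-gate factor counts, tallied by hand from the catalogue (assumption).
GATE_COUNTS = {"pre-seed": 3, "seed": 35, "A": 33, "Y": 21, "review": 5, "H": 4}

# Active gates per detected stage, following the stage table.
ACTIVE_GATES = {
    "Idea / Pre-Seed": ["pre-seed"],
    "Seed": ["pre-seed", "seed"],
    "Series A": ["pre-seed", "seed", "A"],
    "Series B+": list(GATE_COUNTS),  # all six gates
}


def stage_allows(tag: str, stage: str) -> bool:
    """Return True if a factor tagged `tag` is scored at `stage`."""
    return tag in ACTIVE_GATES[stage]


def active_factor_count(stage: str) -> int:
    """Number of catalogue factors that fire at `stage`."""
    return sum(GATE_COUNTS[gate] for gate in ACTIVE_GATES[stage])


print(active_factor_count("Seed"))       # 38
print(active_factor_count("Series B+"))  # 101
```

Factors whose gate is inactive are treated as N/A and left out of the cube average, so they never pull a score down.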
Economy · 8% · 4 factors
- Strong (pre-seed)
- Stable (seed)
- Growing (seed)
- Future Forecast Aligned (seed)
Personality · 10% · 6 factors
- Likeable (pre-seed)
- Aligned True North (seed)
- Transparent (seed)
- Stable & Predictable (seed)
- Humble (seed)
- Accountability & Freedom (A)
Sentiment · 8% · 5 factors
- Sentiment Frame (pre-seed)
- Gut feel on Market (seed)
- Gut feel on Founder(s) (seed)
- Gut feel on Product (seed)
- Gut feel on Team (seed)
People · 12% · 10 factors
- 1st-time founder (seed)
- EQ (seed)
- IQ (seed)
- Likeability (seed)
- Organisational Alignment (A)
- Mature Advisors (A)
- Depth of Talent (A)
- Onboarding / Enablement Process (A)
- Communication Cascade (A)
- Employee NPS (A)
Product & Tech · 12% · 13 factors
- Value / fit to market (seed)
- Adoption / Usage / Usecases (seed)
- Differentiation (seed)
- Scalability (seed)
- Hours dev — Change Requests (A)
- Hours dev — Fixes vs roadmap (A)
- % roadmap built vs plan (A)
- % client solution coverage (A)
- % API estate built vs plan (A)
- % P1 resolution within SLAs (A)
- Cost of hosting (Y)
- % salary inflation (Y)
- % CAM expansion via product (Y)
Market · 12% · 9 factors
- $ TAM (seed)
- $ SAM (seed)
- 3 yr CAGR (seed)
- 1st mover advantage (seed)
- % market share vs competitors (A)
- Number of T1 markets (A)
- % market share take 3 mo (A)
- % price premium (A)
- % brand awareness target segment (A)
Revenue Model · 13% · 19 factors
- Forecast Accuracy (seed)
- Magic Number (seed)
- YoY Growth (seed)
- Standardised Value Positioning (seed)
- Standardised Sales Process (A)
- Clean CRM (A)
- Aligned GTM strategy (A)
- 4× pipeline coverage (A)
- In-Year backloading (A)
- In-quarter backloading (review)
- $ pipe gen / month rolling (review)
- % pipe from marketing (review)
- % pipe from partners (review)
- % pipe from direct (review)
- Avg deal lifecycle days (Y)
- % win rate (Y)
- % new logo Y1 revenue (Y)
- % achieve plan last 6 quarters (Y)
- % services revenue rolling Q (Y)
Financials · 13% · 14 factors
- Cash in bank (seed)
- Burn Rate (seed)
- Revenue (seed)
- % gross margin (seed)
- % EBITDA (A)
- Rule of 40 (A)
- % variance vs plan — revenue (A)
- % variance vs plan — cost (A)
- % variance vs plan — cash (A)
- % revenue growth rolling 3m (Y)
- % FCF change rolling (Y)
- % rolling gross margin 3m (Y)
- $ revenue per employee (Y)
- $ CAC (Y)
Customers · 12% · 21 factors
- NRR (seed)
- Churn (seed)
- NPS (seed)
- Value realisation (seed)
- NBEC (A)
- At Risk (A)
- Direct Revenue Contribution (A)
- Time-to-value / Speed of Delivery (A)
- Advocacy (A)
- Differentiation (Y)
- Adoption / users / usage (H)
- Customer health score (H)
- Churn — Price (Y)
- Churn — Product (Y)
- Churn — Customer (Y)
- ROI (Y)
- Support — P1 (H)
- Delivery — Pace (Y)
- Share of wallet / Market (Y)
- Customer Led Improvement (H)
- Segmentation (Y)
§ 04
The scoring math
A closed-form four-step derivation.
Step 1: RAG multiplier
µ(ontrack) = 1.0
µ(attention) = 0.6
µ(critical) = 0.2
µ(unknown) = excluded from denominator
µ(N/A) = excluded (stage-gate inactive)
Step 2: Cube score
S(c) = 100 · ( Σ_{f∈F_c} µ(σ_f) ) / |F_c| # F_c = active factors in cube c with a known rating
if ∃ f ∈ F_c with σ_f = critical → S(c) = min(S(c), 65) # critical cap
Step 3: Overall score
Score_raw = Σ_c w(c) · S(c)
N_crit = total count of critical factors across all cubes
Score = Score_raw − 5 · [ N_crit ≥ 3 ] # systemic-risk penalty
Score = clip(Score, 0, 100)
Step 4: Verdict band
Score ≥ 70 → PROGRESS (Green)
40 ≤ Score < 70 → WATCH (Amber)
Score < 40 → PASS (Red)
Also PASS if the late-stage override fires (see §7).
The 1.0 / 0.6 / 0.2 multipliers step down by 0.4 per band: Attention retains 60% of a factor’s value (solvable), Critical retains only 20% (structurally damaging). The 65-cap prevents ‘one disaster averaged away by nine green lights’. The −5 penalty triggers at the 3-critical concentration threshold, which empirically correlates with failed Series A bridges in our backtest corpus.
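The four steps can be sketched in a few lines of Python. This is an illustrative reading of the formulas in this section, not the engine's code; the function names are hypothetical, and returning `None` for a cube with no scorable factors is an assumption, since that case is not specified here.

```python
MULT = {"ontrack": 1.0, "attention": 0.6, "critical": 0.2}


def cube_score(statuses):
    """Step 2: mean multiplier over scorable factors, times 100, with the critical cap."""
    scored = [s for s in statuses if s in MULT]  # drops unknown and N/A
    if not scored:
        return None  # assumption: nothing scorable means no cube score
    raw = 100 * sum(MULT[s] for s in scored) / len(scored)
    return min(raw, 65.0) if "critical" in scored else raw


def overall_score(cube_scores, weights, n_crit):
    """Step 3: weighted sum, minus 5 when three or more factors are critical, clipped to 0..100."""
    raw = sum(weights[name] * score for name, score in cube_scores.items())
    if n_crit >= 3:
        raw -= 5
    return max(0.0, min(100.0, raw))


def verdict(score):
    """Step 4: verdict band (the late-stage override is applied separately)."""
    if score >= 70:
        return "PROGRESS"
    return "WATCH" if score >= 40 else "PASS"


# Nine green lights and one disaster: the raw average is about 92, but the cap holds it at 65.
print(cube_score(["ontrack"] * 9 + ["critical"]))  # 65.0
```

Keeping the cap inside `cube_score` and the penalty inside `overall_score` mirrors the asymmetry described above: a single Critical limits its own cube, while a cluster of three or more affects the whole company.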
§ 05
Worked examples
Four canonical scenarios.
A · Healthy Seed B2B SaaS
38 active factors · 30 ontrack · 6 attention · 2 critical (both in Customers)
Sum µ = 30·1.0 + 6·0.6 + 2·0.2 = 34.0
Cube avgs: Economy 82 · Personality 78 · Sentiment 75 · People 80
Product 77 · Market 72 · RevenueModel 70 · Financials 74 · Customers 62
Weighted = 0.08·82 + 0.10·78 + 0.08·75 + 0.12·80 + 0.12·77
+ 0.12·72 + 0.13·70 + 0.13·74 + 0.12·62 = 74.0
N_crit = 2 → no systemic penalty
Final Score = 74 → PROGRESS
B · Critical cap kicks in
Same company but Customers cube avg = 78 with one Critical factor
Customers raw = 78 → capped to 65
Weighted impact: 0.12·65 = 7.80 (vs 0.12·78 = 9.36)
Δ = −1.56 points on overall
C · Systemic-risk penalty
4 Critical factors spread across People, Product, Financials, Customers
N_crit = 4 → −5 applied to overall; each affected cube also capped at 65
A 72-pre-penalty company → final 67 → drops to WATCH (below the 70 threshold)
A 44-pre-penalty company → final 39 → PASS
D · Late-stage override
Engine detects 'Series B announced March 2025' → finalStage = 'Series B'
Cubed Fund I mandate = Seed only →
verdict.label = PASS
verdict.headline ≈ 'Outside mandate — Series B'
fund_fit_criteria flagged 'fail — outside Cubed Seed mandate'
Cube scores still computed (used by RevCube / OpCube)
§ 06
RAG semantics & evidence rules
Every factor object carries name · status · reason · action. A factor without a reason is rejected by the salvage layer and re-requested.
| Status | Multiplier | Meaning |
|---|---|---|
| ontrack (green) | 1.0× | Explicit positive evidence found in corpus, uploads, or follow-up answers. |
| attention (amber) | 0.6× | Partial evidence, weak signal, or solvable concern with a clear path to resolution. |
| critical (red) | 0.2× | Direct negative evidence or a structural blocker (tiny TAM, no differentiation, regulatory wall). |
| unknown | excluded | Insufficient evidence at this stage; excluded from the cube average. |
Idea / Pre-Seed clemency rule
For Idea or Pre-Seed companies, absence of data is expected and must not be marked Critical. The engine forces Attention as the floor for missing-data factors at this stage. Critical is reserved for genuine red flags (tiny TAM, no differentiation, wrong sector).
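A minimal sketch of the clemency rule follows. It assumes the engine can tell when a rating stems from absent data rather than negative evidence; the `missing_data` flag and the function name are hypothetical, and only the Critical to Attention lift is modelled.

```python
EARLY_STAGES = {"Idea", "Pre-Seed"}


def apply_clemency(status: str, stage: str, missing_data: bool) -> str:
    """Lift 'critical' to 'attention' when an early-stage rating is driven only by missing data."""
    if stage in EARLY_STAGES and missing_data and status == "critical":
        return "attention"
    return status


print(apply_clemency("critical", "Pre-Seed", missing_data=True))   # attention
print(apply_clemency("critical", "Pre-Seed", missing_data=False))  # critical
```

A genuine red flag (for example a tiny TAM backed by evidence) keeps its Critical rating because `missing_data` is false in that case.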
§ 07
Stage detection & late-stage override
The engine independently detects a company’s true stage from scraped evidence, then rewrites finalStage if it disagrees with the recorded value.
Signals: funding announcements, press releases, Crunchbase / PitchBook mentions, headcount, ARR. If detected stage > recorded stage, finalStage is rewritten.
If detected stage ≥ Series A, the Cubed Fund I mandate fails and the verdict collapses to PASS regardless of score, because late entry violates the fund’s ownership economics (≥20% target at $15–25M post-money).
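The stage rewrite and the override can be sketched as an ordered comparison. The stage ordering below is an assumption pieced together from the stages named in this document, and all three function names are illustrative.

```python
# Assumed ordering of stages, earliest to latest.
STAGE_ORDER = ["Idea", "Pre-Seed", "Seed", "Series A", "Series B", "Series C+"]


def resolve_stage(recorded: str, detected: str) -> str:
    """Return finalStage: the detected stage wins only when it is later than the recorded one."""
    is_later = STAGE_ORDER.index(detected) > STAGE_ORDER.index(recorded)
    return detected if is_later else recorded


def late_stage_override(final_stage: str) -> bool:
    """True when the stage is Series A or later, which fails a Seed-only mandate."""
    return STAGE_ORDER.index(final_stage) >= STAGE_ORDER.index("Series A")


def final_verdict(score_verdict: str, final_stage: str) -> str:
    """Collapse the verdict to PASS when the override fires, else keep the score-based verdict."""
    return "PASS" if late_stage_override(final_stage) else score_verdict


stage = resolve_stage(recorded="Seed", detected="Series B")
print(stage, final_verdict("PROGRESS", stage))  # Series B PASS
```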
§ 08
Valuation model — V_pre from score
Pre-money is bounded by stage and modulated by score.
V_pre(stage, score) = V_low(stage) + ( V_high(stage) − V_low(stage) ) · ( score / 100 )

| Stage | Pre-money band (USD) | Forward ARR multiple |
|---|---|---|
| Pre-Seed | $3M – $8M | 25× |
| Seed | $8M – $25M | 18× |
| Series A | $25M – $80M | 12× |
| Series B | $80M – $250M | 8× |
| Series C+ | $250M – $600M | 6× |
The forecast view applies the stage multiplier to projected ARR to chart enterprise value over 5 years (see CompanyForecastCharts.tsx). Cubed Ventures Fund I writes $3–5M for a minimum 20% ownership, implying a post-money valuation of at most $15–25M (about $12–20M pre-money) at Seed.
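As a worked instance of the interpolation, the sketch below copies the pre-money bands from the table in this section (in USD millions) and evaluates V_pre for a given stage and score. The dictionary and function names are illustrative.

```python
# Pre-money bands in USD millions, copied from the valuation table.
BANDS = {
    "Pre-Seed": (3, 8),
    "Seed": (8, 25),
    "Series A": (25, 80),
    "Series B": (80, 250),
    "Series C+": (250, 600),
}


def v_pre(stage: str, score: float) -> float:
    """Linear interpolation inside the stage band; result in USD millions."""
    low, high = BANDS[stage]
    return low + (high - low) * (score / 100)


# A Seed company scoring 74 lands at 8 + 17 * 0.74 = 20.58, i.e. about $20.6M pre-money.
print(round(v_pre("Seed", 74), 2))  # 20.58
```

A score of 0 returns the bottom of the band and a score of 100 the top, so the stage sets the range and the score only selects a point inside it.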
§ 09
Forecast model — organic vs Cubed uplift
Two trajectories per company.
g(score) = 0.15 + (score/100) · 0.85 # organic CAGR
δ(score) = 0.10 + (score/100) · 0.15 # cubed uplift
ARR_org(t) = ARR_0 · (1 + g(score))^(t/12)
ARR_cubed(t) = ARR_0 · (1 + g(score) + δ(score))^(t/12)
# at year ≥ 2 add +0.05 second-cheque boost to cubed track
# t is measured in months in the two ARR formulas above
Floor ARR: if reported ARR is 0 or unknown, ARR_0 defaults to $100k (Pre-Seed / Idea) or $500k (other stages) so charts remain readable; the value is flagged ‘estimated’ in the UI.
Valuation overlay: V(t) = ARR(t) · M_stage · decay(t), where decay(t) = 1 − 0.03·t (organic) or 1 − 0.02·t (cubed), with t in years for the decay term. The decay reflects multiple compression as companies mature.
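The two trajectories and the valuation overlay can be sketched as follows. Two points are assumptions, not statements from this document: the +0.05 second-cheque boost is applied only to growth after month 24, and the decay term takes time in years. Function names are illustrative.

```python
def g(score):
    """Organic CAGR as a function of score."""
    return 0.15 + (score / 100) * 0.85


def delta(score):
    """Cubed uplift added on top of the organic rate."""
    return 0.10 + (score / 100) * 0.15


def arr_org(arr0, score, months):
    """Organic ARR after `months` months."""
    return arr0 * (1 + g(score)) ** (months / 12)


def arr_cubed(arr0, score, months):
    """Cubed-track ARR; assumes the +0.05 boost compounds only after month 24."""
    rate = g(score) + delta(score)
    first = min(months, 24)
    rest = max(months - 24, 0)
    return arr0 * (1 + rate) ** (first / 12) * (1 + rate + 0.05) ** (rest / 12)


def valuation(arr, multiple, years, cubed=False):
    """V(t) = ARR(t) * M_stage * decay(t); assumes t is in years for the decay term."""
    decay = 1 - (0.02 if cubed else 0.03) * years
    return arr * multiple * decay


# Seed company, score 74, floor ARR of $500k, 12 months out on each track.
print(round(arr_org(500_000, 74, 12)), round(arr_cubed(500_000, 74, 12)))
```

With a score of 74 the organic rate is 0.779 and the uplift 0.211, so the cubed track compounds at 0.99 per year before any second-cheque boost.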
§ 10
Report anatomy — every section in the company PDF
The 13 components composed from a single JSON payload.
| Section | What it is |
|---|---|
| CubeScoreHeader | Identity strip: logo, name, tagline, stage chip, overall score with delta vs previous_score, projected ARR, phase. |
| InvestmentVerdict | PROGRESS / WATCH / PASS badge, one-line headline, bull case, bear case, recommendation, conviction. |
| FundFitAssessment | 8-criterion checklist from Cubed Fund I mandate (sector, stage, geography, ownership, cheque, conflict, ESG…). |
| CompanyNews | Filtered research_insights of type ‘news’ / ‘press’. De-duplicated by URL, sorted by date desc. |
| CompanyForecastCharts | 18-month ARR area chart + 5-year valuation/ARR composed chart. Inputs: projected_arr, overall_score, stage. |
| ValidationTracker | Maps each verdict claim back to specific cube factors and research insights — the ‘show your working’ trail. |
| CubeBreakdownBar | Horizontal bars of the 9 cube scores, coloured by band (≥70 green, 40–69 amber, <40 red). |
| CubeScoreGrid | 9-column grid: each column is a cube; each cell is a factor tile coloured by RAG. Hover reveals reason + action. |
| ResearchNarratives | Per-cube qualitative summary written by the LLM, grounded in the scraped corpus. Each narrative cites its sources. |
| DrivingMetrics | The 12 target KPIs with value, target, RAG, and trend (see §11). |
| ActivityTimeline | Chronological log of analyst actions: created, follow-up calls, uploads, rescore events. |
| OperationalChessboard | 7×12 RAG grid (see §12). |
| SourcesProvenance | Every URL, document, and LLM call that contributed to the score. Click-through to evidence. The auditable trail. |
§ 11
12 Driving Metrics
A separate target-performance layer used by RevCube. Concrete revenue-engine KPIs, each with an explicit target and a RAG.
| # | KPI | Target | Definition |
|---|---|---|---|
| 1 | Forecast Accuracy | ≥80% | Forecast vs actual variance over rolling 4 quarters |
| 2 | ARR Growth | ≥50% | YoY ARR change |
| 3 | Pipeline Coverage | ≥4–5× | Open pipeline / quota for the period |
| 4 | Win Rate on Qualified Deals | ≥50% | Won / (won + lost) after qualification |
| 5 | Annual Increase in Avg Deal Size | ≥10% | ASP YoY |
| 6 | New Revenue from Account Growth | ≥30% | Expansion / total new ARR |
| 7 | New Revenue from Partners | ≥10% | Partner-sourced / total new ARR |
| 8 | Net Retention Rate | ≥125% | (start ARR − churn + expansion) / start ARR |
| 9 | Deals Closed Before Last Week of Quarter | ≥90% | Anti-hockey-stick discipline |
| 10 | Deals Closed Before Q4 | ≥70% | Avoid year-end concentration |
| 11 | Deal Alignment to Strategy & GTM | 100% | ICP / ideal-deal scoring |
| 12 | Quota Carriers Making 80% of Plan | ≥80% | Sales-team health |
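Two of the definitions in the table translate directly into arithmetic. The sketch below implements Net Retention Rate and Win Rate as defined there and compares them with their targets; it only reports whether the target is met, because the amber threshold for these KPIs is not given here. Function names are illustrative.

```python
def net_retention_rate(start_arr, churned_arr, expansion_arr):
    """(start ARR - churn + expansion) / start ARR, as a ratio (1.25 means 125%)."""
    return (start_arr - churned_arr + expansion_arr) / start_arr


def win_rate(won, lost):
    """Won / (won + lost), counted after qualification."""
    return won / (won + lost)


def meets_target(value, target):
    """True when the KPI is at or above its target."""
    return value >= target


nrr = net_retention_rate(1_000_000, 50_000, 300_000)
print(nrr, meets_target(nrr, 1.25))  # 1.25 True
```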
§ 12
Operational Chessboard — the 7×12 grid
84 cells. Rows = operational horizon & discipline. Columns = revenue-engine functions. Each cell carries a short label and a RAG status. The OpCube view: a single screen for where an operator should focus this quarter.
Rows (7): Immediate Focus (3 month) · Mid-Term Focus (Next QTR) · Long-Term Focus (12 months) · People · Growth · Efficiency · Accuracy.
Columns (12): Full Year · Current Quarter · Revenue Protection · Revenue Growth · Strategy · Enabled · Pipeline Gen · Qualification · Deal Growth · Deal Acceleration · Competitor Positioning · Cost.
Scoring: the LLM populates each cell from the same evidence corpus as the cubes. For early-stage companies most cells will be unknown (grey) — correct behaviour, not a flaw. As the company matures and the engine ingests CRM/ERP data, cells flip to RAG.
Reading: ‘Immediate Focus × Pipeline Gen = red’ means pipeline coverage is the single most urgent fix in the next 90 days. ‘Efficiency × Cost = amber’ means cost discipline needs a plan but is not on fire. The chessboard turns the score into action.
Illustrative render. Each cell = one RAG status × label pair in the JSON payload.
§ 13
A-Player sub-model
Inside the People cube, each named executive is scored across 7 traits, each RAG.
The cube-level People score remains weighted at 12%, but the A-Player view feeds OpCube’s ‘who to keep / coach / replace’ recommendation and FundCube’s key-person risk assessment.
§ 14
Fund-fit assessment layer
Sits on top of the universal score.
Cubed Fund I has 8 fixed criteria; each returns pass / attention / fail with a one-line detail. A company can score 78 (PROGRESS) but still fail fund-fit (wrong stage, wrong geography, conflict, ESG exclusion). Fund-fit failure overrides verdict → PASS. This is the bridge between the universal score and a specific mandate.
§ 15
Thesis-Alignment Intelligence (TAI)
The most important extensibility surface — plug any investor mandate in as structured criteria, and the engine re-scores every company against their thesis without retraining.
The 9-cube score is thesis-neutral (universal company quality). The TAI layer is thesis-specific. Both run on the same evidence corpus.
FitScore(company, thesis) =
Σ_k v(k) · m_k(company) # weighted match score
− Σ_h H_h · [ hard_fail_h ] # hard-rule penalties
where
v(k) = investor weight for criterion k (Σ = 100)
m_k = match function ∈ [0, 1] per criterion type
H_h = penalty for breaching a hard rule (default = ∞ → PASS)
Criterion types
| Type | How m_k is computed |
|---|---|
| Categorical | Sector, geography, business model. m_k = 1 if in allow-list else 0. |
| Range | Stage, cheque size, valuation, headcount, ARR. m_k = 1 inside band, linear decay outside, 0 beyond tolerance. |
| Threshold | Ownership ≥ X%, gross margin ≥ Y%. m_k = clip( (value − threshold) / tolerance, 0, 1 ). |
| Score-derived | ‘Must score ≥ 70 on Product cube’. m_k pulls directly from cube outputs. |
| Boolean hard rule | Conflict-of-interest, ESG exclusion, sanctioned jurisdiction. Hard fail → automatic PASS. |
| Semantic | Free-text thesis. LLM scores embedding similarity between thesis statement and company description; m_k ∈ [0, 1]. |
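The numeric criterion types can be sketched as small match functions feeding the FitScore sum. This is one possible reading of the table: the range decay treats the tolerance as a fraction of the nearest band edge (an assumption), the threshold function follows the formula exactly as written, and a breached hard rule is represented as negative infinity to stand in for the default infinite penalty. All names are illustrative.

```python
import math


def clip(x, lo=0.0, hi=1.0):
    return max(lo, min(hi, x))


def m_categorical(value, allow):
    """1 if the value is in the allow-list, else 0."""
    return 1.0 if value in allow else 0.0


def m_threshold(value, threshold, tol):
    """clip((value - threshold) / tolerance, 0, 1); a value exactly at the threshold scores 0."""
    return clip((value - threshold) / tol)


def m_range(value, lo, hi, tol):
    """1 inside [lo, hi]; linear decay to 0 at a distance of tol * edge (assumed, positive edges)."""
    if lo <= value <= hi:
        return 1.0
    edge = lo if value < lo else hi
    return clip(1 - abs(value - edge) / (tol * edge))


def fit_score(matches, weights, hard_fail=False):
    """Weighted match score; any breached hard rule yields -inf, i.e. an automatic PASS."""
    if hard_fail:
        return -math.inf
    return sum(weights[k] * matches[k] for k in weights)


matches = {
    "sector": m_categorical("fintech", ["fintech", "saas", "ai"]),
    "cheque": m_range(6e6, 1e6, 5e6, 0.25),
    "gross_margin": m_threshold(0.78, 0.70, 0.10),
}
print(fit_score(matches, {"sector": 15, "cheque": 10, "gross_margin": 10}))
```

Because the match functions only read values already in the company record, the same company can be re-scored against another thesis by swapping the weights and bands.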
Thesis schema (JSON contract)
{
"thesis_id": "fund_xyz_2026",
"investor": "XYZ Capital",
"criteria": [
{ "k": "sector", "type": "categorical", "allow": ["fintech","saas","ai"], "w": 15 },
{ "k": "stage", "type": "range", "min": "Seed", "max": "Series A", "w": 15 },
{ "k": "geography", "type": "categorical", "allow": ["UK","US","EU"], "w": 10 },
{ "k": "cheque", "type": "range", "min": 1e6, "max": 5e6, "tol": 0.25, "w": 10 },
{ "k": "ownership", "type": "threshold", "gte": 0.10, "tol": 0.05, "w": 10 },
{ "k": "gross_margin", "type": "threshold", "gte": 0.70, "tol": 0.10, "w": 10 },
{ "k": "product_quality", "type": "score", "cube": "Product & Tech", "gte": 70, "w": 10 },
{ "k": "thesis_fit", "type": "semantic", "statement": "AI infra for regulated industries", "w": 20 }
],
"hard_rules": [
{ "k": "sanctions", "type": "boolean", "fail_if": true },
{ "k": "esg_exclude", "type": "categorical", "deny": ["tobacco","weapons","gambling"] }
],
"verdict_bands": { "strong": 75, "watch": 50 }
}
Multi-thesis intelligence
Because the universal score is computed once, an LP can layer N theses and instantly see how the same pipeline ranks under each lens. This enables co-investment matching: ‘which 3 of our 12 LPs would this company suit?’ becomes a sort, not a meeting.
Calibration
Each investor’s weight vector v can be learned from historical decisions: feed in past invest / pass calls, regress on criteria values, and recover the implicit weights. This converts gut-feel committee behaviour into an auditable thesis vector.
§ 16
Cross-product intelligence
One score, three product surfaces, one flywheel.
| Product | Consumes | Surfaces |
|---|---|---|
| FundCube | overall_score · cubes[] · investment_verdict · fund_fit_criteria · V_pre · forecast curves | Pipeline ranking · IC packs · portfolio benchmarking · second-cheque trigger (‘rescore at month 18, if score ↑ ≥ 8 pts then double down’) |
| RevCube | Revenue Model cube · Customers cube · Driving Metrics · Financials cube | Revenue-engine diagnostic · pipeline-coverage simulator · NRR projection · magic-number trend · sales-team capacity model |
| OpCube | Operational Chessboard · People cube + A-Player sub-model · Product cube delivery metrics | Weekly operating cadence · who-to-coach matrix · dependency map · OKR alignment grid · exec talent heatmap |
Cross-product flywheel: each product writes evidence back to the central company record (RevCube updates Forecast Accuracy when actuals load; OpCube updates Employee NPS when a survey closes). The next score recomputation absorbs those updates automatically — the score sharpens as the company is operated through the stack.
§ 17
JSON contracts and data lineage
The data model every product surface consumes.
{
"overall_score": 74,
"projected_arr": "$2.5M",
"detected_stage": "Seed",
"investment_verdict": {
"label": "PROGRESS",
"headline": "Strong Seed-stage B2B AI play, clean GTM",
"bull": "...", "bear": "...", "recommendation": "...",
"fund_fit_criteria": [
{ "criterion": "Stage", "status": "pass", "detail": "Seed" }
]
},
"cubes": [
{ "name": "Economy", "score": 82, "factors": [
{ "name": "Strong", "status": "ontrack",
"reason": "Sector CAGR 22% per S&P 2026 outlook",
"action": "Maintain macro watch" }
]}
],
"research_insights": [
{ "type": "news", "title": "...", "url": "...", "date": "..." }
],
"driving_metrics": [
{ "name": "Forecast Accuracy", "target": "80%+", "status": "unknown", "value": null }
],
"operational_chessboard": [
{ "category": "Immediate Focus (3 month)",
"cells": [ { "label": "Pipeline gap", "status": "attention" } ] }
]
}
Validation rules (enforced server-side)
- cubes.length == 9 and names match the canonical set.
- Each factor.status ∈ {ontrack, attention, critical, unknown}.
- Overall score recomputed from cubes and rejected if delta > 2 from LLM-claimed score.
- operational_chessboard.length == 7, cells.length == 12 per row.
- Every factor must have non-empty reason and action.
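The validation rules can be sketched as a single function that collects violations. The canonical cube names are assumed to be the nine names used in §2, the recomputed score is passed in rather than derived, and `validate` is a hypothetical name.

```python
CANONICAL_CUBES = {
    "Economy", "Personality", "Sentiment", "People", "Product & Tech",
    "Market", "Revenue Model", "Financials", "Customers",
}
STATUSES = {"ontrack", "attention", "critical", "unknown"}


def validate(payload, recomputed_score):
    """Return a list of rule violations; an empty list means the payload passes."""
    errors = []
    cubes = payload.get("cubes", [])
    if len(cubes) != 9 or {c.get("name") for c in cubes} != CANONICAL_CUBES:
        errors.append("cubes must be the 9 canonical cubes")
    for cube in cubes:
        for factor in cube.get("factors", []):
            if factor.get("status") not in STATUSES:
                errors.append(f"bad status on {factor.get('name')}")
            if not factor.get("reason") or not factor.get("action"):
                errors.append(f"missing reason/action on {factor.get('name')}")
    if abs(payload.get("overall_score", 0) - recomputed_score) > 2:
        errors.append("overall_score differs from recomputed score by more than 2")
    board = payload.get("operational_chessboard", [])
    if len(board) != 7 or any(len(row.get("cells", [])) != 12 for row in board):
        errors.append("operational_chessboard must be 7 rows of 12 cells")
    return errors
```

Returning every violation at once, rather than stopping at the first, makes it easier to decide whether a payload should be repaired or re-requested.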
§ 18
LLM pipeline, JSON salvage & determinism
Model: Gemini 2.5 Flash via Lovable AI Gateway. Temperature 0.2. System prompt enforces the full 101-factor catalogue, the activation matrix and the scoring rules verbatim. The LLM’s only job is classification + evidence extraction — never arithmetic.
Pipeline
- Scrape (website, LinkedIn, news) → corpus.
- Merge uploaded docs + follow-up answers.
- Build prompt with corpus + 101-factor instructions.
- LLM call → JSON.
- Salvage layer.
- Validate.
- Server-side recompute of overall_score from cubes (the LLM’s claimed score is a check, not the source of truth).
- Persist.
JSON salvage
LLM output is often truncated or contains stray markdown. The salvage layer strips code fences, balances braces, repairs trailing commas, retries with a stricter system prompt (‘Return ONLY valid minified JSON, no commentary, all strings <200 chars’) if parse fails, and finally falls back to a stub object with exceptions[] populated so the UI can flag ‘partial data’.
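A minimal sketch of the local repair steps (fence stripping, bracket balancing, trailing-comma repair, stub fallback) is shown below. It is not the engine's salvage layer: the retry with a stricter prompt is omitted, the function name is hypothetical, and the trailing-comma regex is a simplification that could also touch a comma inside a string value.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker


def salvage(text):
    """Best-effort repair of LLM output; returns parsed JSON, or a stub with exceptions[]."""
    # 1. Strip code fences, with or without a language tag.
    text = re.sub(re.escape(FENCE) + r"[a-zA-Z]*", "", text).strip()
    # 2. Close any string or bracket left open by truncation (string contents are skipped).
    stack, in_str, escaped = [], False, False
    for ch in text:
        if in_str:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_str = False
        elif ch == '"':
            in_str = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    if in_str:
        text += '"'
    text += "".join(reversed(stack))
    # 3. Remove trailing commas that sit directly before a closing bracket.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        return {"exceptions": [str(exc)]}  # stub so the UI can flag partial data


print(salvage(FENCE + 'json\n{"cubes": [1, 2,\n'))  # {'cubes': [1, 2]}
```

Balancing brackets before removing trailing commas matters: a truncated list such as `[1, 2,` only exposes its trailing comma once the closing bracket has been appended.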
Determinism guarantees
Same corpus + same prompt + temp 0.2 → score reproducible within ±1 pt. The math layer is fully deterministic — drift comes only from LLM classification noise on borderline factors, which are flagged in the audit trail for human review.
§ 19
Calibration, backtesting & the training dataset
Three tables form the ML-ready warehouse.
| Table | Shape | Purpose |
|---|---|---|
| companies_training | Wide per company-quarter | 9 cube scores, 101 factor statuses, V_pre, projected ARR, actual ARR (when known), outcome label. |
| factor_observations | Long-format | (company_id, factor_id, quarter, status, evidence_url). Per-factor calibration — does ‘Magic Number = attention’ actually predict Series A outcomes? |
| outcomes_ledger | Event log | round_raised, valuation, exit, shutdown with timestamps. Used to compute Brier scores and fit the weight vector. |
Backtest metric
Brier(verdict) = mean( ( P(PROGRESS) − 1[outcome = positive] )² )
Cubed Fund I target: Brier ≤ 0.18 on 24-month-forward outcomes.
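As a small illustration of the metric, the Brier score is the mean squared gap between the predicted probability of PROGRESS and the realised 0/1 outcome. The function name and the sample numbers are illustrative only.

```python
def brier(probs, outcomes):
    """Mean of (P(PROGRESS) - outcome)^2, where each outcome is 1 (positive) or 0."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Two companies: one confident call that came good, one low-probability call that did not.
score = brier([0.8, 0.3], [1, 0])
print(round(score, 3), score <= 0.18)  # 0.065 True
```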
Each new outcome event updates the running Brier; the weight vector is reviewed every 25 new outcomes.
Weight learning (when enabled)
w* = argmin_w Σ_i ( Σ_c w(c) · S(c, i) − y_i )²
s.t. Σ_c w(c) = 1, w(c) ≥ 0
Initialised at the curated 8 / 10 / 8 / 12 / 12 / 12 / 13 / 13 / 12 vector.
Drift > 15% on any cube triggers a human-in-the-loop committee review
before deployment; learned weights are never auto-deployed.
§ 20
Failure modes, biases & guardrails
What can go wrong, and what stops it.
| Failure mode | What it is | Mitigation |
|---|---|---|
| Idea-stage false-negative | Engine could mark missing data as Critical. | Stage-aware clemency rule forces Attention floor and excludes inactive factors. |
| Hype inflation | Sentiment-heavy corpora bias the LLM toward ontrack. | Sentiment cube capped at 8% weight; Critical-cap dominates. |
| Single-source dependency | If scraping fails, score collapses to unknown. | SourcesProvenance flags low-coverage; UI shows ‘partial data’ badge; score withheld below evidence threshold. |
| Late-stage misuse | Score is calibrated for Seed; applying to Series C distorts weights. | Late-stage override forces verdict = PASS for Cubed Fund I; other TAI profiles can rebalance weights per stage. |
| LLM hallucinated evidence | Fabricated reason/action strings. | Each evidence claim must carry a source URL traceable in SourcesProvenance; unreferenced reasons are quarantined. |
| Founder gaming | Once thesis is known, founders shape language to match. | Weights are public, but evidence rules + Critical-cap mean shallow language games stay at Attention; only verifiable outcomes flip to green. |
§ A
Appendix · Offline reference implementation
Reproduces the exact score the live engine produces for any JSON payload. Language-agnostic.
from statistics import mean

# Keys are assumed to match the canonical cube names used in the JSON payload.
WEIGHTS = {
    "Economy": 0.08, "Personality": 0.10, "Sentiment": 0.08,
    "People": 0.12, "Product & Tech": 0.12, "Market": 0.12,
    "Revenue Model": 0.13, "Financials": 0.13, "Customers": 0.12,
}
MULT = {"ontrack": 1.0, "attention": 0.6, "critical": 0.2}  # unknown / N/A excluded

# cubes and company come from the JSON payload; stage_allows() and
# late_stage_override() are the engine's stage helpers (see §3 and §7).
for cube in cubes:
    active = [f for f in cube.factors
              if stage_allows(f.tag, company.stage)
              and f.status not in ("unknown", "N/A")]
    # Note: mean() raises if a cube has no scorable factors; that case is not specified here.
    raw = 100 * mean(MULT[f.status] for f in active)
    if any(f.status == "critical" for f in active):
        raw = min(raw, 65)  # critical cap
    cube.score = raw

score = sum(cube.score * WEIGHTS[cube.name] for cube in cubes)
n_crit = sum(1 for c in cubes for f in c.factors if f.status == "critical")
if n_crit >= 3:
    score -= 5  # systemic-risk penalty
score = max(0, min(100, score))  # clip to [0, 100]
verdict = "PROGRESS" if score >= 70 else ("WATCH" if score >= 40 else "PASS")
if late_stage_override(company):
    verdict = "PASS"