Cubed Score Wiki

The Cubed Score, in full detail.

A deterministic, weighted, stage-aware composite of 101 factors organised into 9 thematic cubes — the scoring engine behind FundCube, RevCube and OpCube. Same inputs, same score; every factor must cite evidence; every rating has a defined mathematical impact on the final number.

9 cubes · 101 factors · 4 RAG states · 0–100 score range

§ 01

Foundations

What the score is, and what it isn’t.

The Cubed Score is a deterministic, weighted, stage-aware composite of 101 underlying factors organised into 9 thematic cubes. It is designed to behave like a venture analyst with a fixed methodology: same inputs produce the same score, every factor must cite evidence, every rating has a defined mathematical impact.

It is not a black-box ML model. The LLM (Gemini 2.5 Flash via the Lovable AI Gateway) is used only to classify each factor as ontrack / attention / critical / unknown and to extract supporting evidence. All arithmetic — weighting, capping, penalties, aggregation, valuation, forecast — is executed in code. The LLM provides perception; the engine provides judgement.

Design principles

  • Deterministic math — score is a closed-form function of factor ratings.
  • Stage-gated activation — only factors relevant to a company’s maturity are scored; others are excluded, not zeroed.
  • Asymmetric penalties — Critical issues cap and penalise; on-track signals never inflate above 1.0×.
  • Evidence-bound — every factor must carry a reason and an action; missing evidence collapses to ‘unknown’ rather than guessed.
  • Reproducibility — same scraped corpus + uploads → same score within ±1 pt (temp 0.2).

§ 02

The 9 cubes and their weights

Calibrated for B2B SaaS / Fintech / AI at Seed. The fixed coefficients of the weighted-sum aggregator.

Economy · w = 0.08 · n = 4 · Macro tailwind/headwind for the sector.
Personality · w = 0.10 · n = 6 · Founder behavioural traits.
Sentiment · w = 0.08 · n = 5 · Pattern-match conviction signals.
People · w = 0.12 · n = 10 · Team quality, depth, alignment (+ A-Player sub-model).
Product & Tech · w = 0.12 · n = 13 · Fit, differentiation, scalability, delivery.
Market · w = 0.12 · n = 9 · TAM/SAM, share, timing, defensibility.
Revenue Model · w = 0.13 · n = 19 · GTM, pipeline, predictability.
Financials · w = 0.13 · n = 14 · Cash, burn, margin, Rule-of-40, variance.
Customers · w = 0.12 · n = 21 · NRR, churn, NPS, value realisation.

Revenue Model and Financials carry the heaviest weight (13% each) because at Seed they are the highest-signal predictors of survival to Series A. People, Product, Market and Customers each carry 12% as the ‘four pillars’ of fundamentals. Personality and Sentiment carry 8–10% — informative but easier to manipulate. Economy is held at 8% because macro is mostly an exogenous filter, not a differentiator.
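As a quick sanity check on the coefficients above, the following sketch confirms that the nine weights sum to 1 and the factor counts sum to 101. The dictionary names are illustrative and not taken from the engine.

```python
import math

# Cube weights as listed in this section; the dict names are illustrative.
CUBE_WEIGHTS = {
    "Economy": 0.08, "Personality": 0.10, "Sentiment": 0.08,
    "People": 0.12, "Product & Tech": 0.12, "Market": 0.12,
    "Revenue Model": 0.13, "Financials": 0.13, "Customers": 0.12,
}
# Factor counts (n) per cube, also from this section.
FACTOR_COUNTS = {
    "Economy": 4, "Personality": 6, "Sentiment": 5,
    "People": 10, "Product & Tech": 13, "Market": 9,
    "Revenue Model": 19, "Financials": 14, "Customers": 21,
}

total_weight = sum(CUBE_WEIGHTS.values())
total_factors = sum(FACTOR_COUNTS.values())

# Floating-point sums are compared with a tolerance rather than ==.
assert math.isclose(total_weight, 1.0)
assert total_factors == 101
print(f"weights sum to {total_weight:.2f}, factors total {total_factors}")
```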

§ 03

The 101-factor catalogue

Every factor is tagged with one of six stage gates: pre-seed, seed, A (Series A+), Y (Year-2 ops), review (board cadence), H (health / CX). At Seed only pre-seed + seed factors fire; the rest are N/A and excluded from the cube average.

Detected stage | Active gates | Indicative # active
Idea / Pre-Seed | pre-seed | ~9
Seed | pre-seed + seed | ~38
Series A | pre-seed + seed + A | ~75
Series B+ | all gates | 101

Economy · 8% · 4 factors
  1. Strong (pre-seed)
  2. Stable (seed)
  3. Growing (seed)
  4. Future Forecast Aligned (seed)
Personality · 10% · 6 factors
  1. Likeable (pre-seed)
  2. Aligned True North (seed)
  3. Transparent (seed)
  4. Stable & Predictable (seed)
  5. Humble (seed)
  6. Accountability & Freedom (A)
Sentiment · 8% · 5 factors
  1. Sentiment Frame (pre-seed)
  2. Gut feel on Market (seed)
  3. Gut feel on Founder(s) (seed)
  4. Gut feel on Product (seed)
  5. Gut feel on Team (seed)
People · 12% · 10 factors
  1. 1st-time founder (seed)
  2. EQ (seed)
  3. IQ (seed)
  4. Likeability (seed)
  5. Organisational Alignment (A)
  6. Mature Advisors (A)
  7. Depth of Talent (A)
  8. Onboarding / Enablement Process (A)
  9. Communication Cascade (A)
  10. Employee NPS (A)
Product & Tech · 12% · 13 factors
  1. Value / fit to market (seed)
  2. Adoption / Usage / Usecases (seed)
  3. Differentiation (seed)
  4. Scalability (seed)
  5. Hours dev — Change Requests (A)
  6. Hours dev — Fixes vs roadmap (A)
  7. % roadmap built vs plan (A)
  8. % client solution coverage (A)
  9. % API estate built vs plan (A)
  10. % P1 resolution within SLAs (A)
  11. Cost of hosting (Y)
  12. % salary inflation (Y)
  13. % CAM expansion via product (Y)
Market · 12% · 9 factors
  1. $ TAM (seed)
  2. $ SAM (seed)
  3. 3 yr CAGR (seed)
  4. 1st mover advantage (seed)
  5. % market share vs competitors (A)
  6. Number of T1 markets (A)
  7. % market share take 3 mo (A)
  8. % price premium (A)
  9. % brand awareness target segment (A)
Revenue Model · 13% · 19 factors
  1. Forecast Accuracy (seed)
  2. Magic Number (seed)
  3. YoY Growth (seed)
  4. Standardised Value Positioning (seed)
  5. Standardised Sales Process (A)
  6. Clean CRM (A)
  7. Aligned GTM strategy (A)
  8. 4× pipeline coverage (A)
  9. In-Year backloading (A)
  10. In-quarter backloading (review)
  11. $ pipe gen / month rolling (review)
  12. % pipe from marketing (review)
  13. % pipe from partners (review)
  14. % pipe from direct (review)
  15. Avg deal lifecycle days (Y)
  16. % win rate (Y)
  17. % new logo Y1 revenue (Y)
  18. % achieve plan last 6 quarters (Y)
  19. % services revenue rolling Q (Y)
Financials · 13% · 14 factors
  1. Cash in bank (seed)
  2. Burn Rate (seed)
  3. Revenue (seed)
  4. % gross margin (seed)
  5. % EBITDA (A)
  6. Rule of 40 (A)
  7. % variance vs plan — revenue (A)
  8. % variance vs plan — cost (A)
  9. % variance vs plan — cash (A)
  10. % revenue growth rolling 3m (Y)
  11. % FCF change rolling (Y)
  12. % rolling gross margin 3m (Y)
  13. $ revenue per employee (Y)
  14. $ CAC (Y)
Customers · 12% · 21 factors
  1. NRR (seed)
  2. Churn (seed)
  3. NPS (seed)
  4. Value realisation (seed)
  5. NBEC (A)
  6. At Risk (A)
  7. Direct Revenue Contribution (A)
  8. Time-to-value / Speed of Delivery (A)
  9. Advocacy (A)
  10. Differentiation (Y)
  11. Adoption / users / usage (H)
  12. Customer health score (H)
  13. Churn — Price (Y)
  14. Churn — Product (Y)
  15. Churn — Customer (Y)
  16. ROI (Y)
  17. Support — P1 (H)
  18. Delivery — Pace (Y)
  19. Share of wallet / Market (Y)
  20. Customer Led Improvement (H)
  21. Segmentation (Y)
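
The stage-gate activation described at the top of this section can be sketched as a simple lookup. This is a minimal illustration, assuming stage labels spelled as in the table; the names ACTIVE_GATES and stage_allows are hypothetical here, and the engine's real data structures are not documented in this wiki.

```python
# Gate sets per detected stage, transcribed from the activation table.
ALL_GATES = {"pre-seed", "seed", "A", "Y", "review", "H"}
ACTIVE_GATES = {
    "Idea": {"pre-seed"},
    "Pre-Seed": {"pre-seed"},
    "Seed": {"pre-seed", "seed"},
    "Series A": {"pre-seed", "seed", "A"},
    "Series B+": ALL_GATES,
}

def stage_allows(tag: str, stage: str) -> bool:
    """Return True if a factor tagged `tag` is scored at `stage`."""
    return tag in ACTIVE_GATES[stage]

# The Economy cube as catalogued above: (factor name, gate tag).
economy = [("Strong", "pre-seed"), ("Stable", "seed"),
           ("Growing", "seed"), ("Future Forecast Aligned", "seed")]

for stage in ("Pre-Seed", "Seed"):
    # Inactive factors are dropped entirely rather than scored as zero.
    active = [name for name, tag in economy if stage_allows(tag, stage)]
    print(stage, "->", active)
```

Filtering before scoring, rather than assigning a zero, is what keeps inactive factors out of the cube average.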

§ 04

The scoring math

A closed-form four-step derivation.

Step 1 — RAG multiplier

µ(ontrack)    = 1.0
µ(attention)  = 0.6
µ(critical)   = 0.2
µ(unknown)    = excluded from denominator
µ(N/A)        = excluded (stage-gate inactive)

Step 2 — Cube score

S(c) = 100 · ( Σ_{f∈F_c} µ(σ_f) ) / |F_c|     # F_c = stage-active factors of cube c rated ontrack / attention / critical

if ∃ f ∈ F_c with σ_f = critical  →  S(c) = min(S(c), 65)   # critical cap
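
Steps 1 and 2 can be sketched for a single cube as follows. The function name cube_score is illustrative, and returning None for a cube with no rated factors is an assumption, since the formula above does not cover that case.

```python
from typing import Optional

MULT = {"ontrack": 1.0, "attention": 0.6, "critical": 0.2}
CRITICAL_CAP = 65.0

def cube_score(statuses: list[str]) -> Optional[float]:
    """Score one cube from the statuses of its stage-active factors.

    'unknown' and 'N/A' are dropped from both numerator and denominator.
    Returns None when nothing is rated (an assumption, see lead-in).
    """
    rated = [s for s in statuses if s in MULT]
    if not rated:
        return None
    raw = 100.0 * sum(MULT[s] for s in rated) / len(rated)
    if "critical" in rated:
        raw = min(raw, CRITICAL_CAP)   # critical cap
    return raw

# Unknown is excluded: the average is taken over three factors, not four.
print(cube_score(["ontrack", "ontrack", "attention", "unknown"]))
# Raw average is 80, but a single critical caps the cube at 65.
print(cube_score(["ontrack", "ontrack", "ontrack", "critical"]))
```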

Step 3 — Overall score

Score_raw = Σ_c w(c) · S(c)
N_crit    = total count of critical factors across all cubes
Score     = Score_raw − 5 · [ N_crit ≥ 3 ]               # systemic-risk penalty
Score     = clip(Score, 0, 100)

Step 4 — Verdict band

Score ≥ 70       → PROGRESS (Green)
40 ≤ Score < 70  → WATCH    (Amber)
Score < 40       → PASS     (Red)
Also PASS if the late-stage override fires (see §7).

The 1.0 / 0.6 / 0.2 multipliers step down in equal 0.4 increments: Attention retains 60% of a factor's value (solvable), Critical retains only 20% (structurally damaging). The 65-cap prevents ‘one disaster averaged away by nine green lights’. The −5 penalty triggers at the 3-critical concentration threshold, which empirically correlates with failed Series A bridges in our backtest corpus.

§ 05

Worked examples

Four canonical scenarios.

A · Healthy Seed B2B SaaS

38 active factors · 30 ontrack · 6 attention · 2 critical (both in Customers)
Sum µ = 30·1.0 + 6·0.6 + 2·0.2 = 34.0

Cube avgs:  Economy 82 · Personality 78 · Sentiment 75 · People 80
            Product 77 · Market 72 · RevenueModel 70 · Financials 74 · Customers 62
Weighted = 0.08·82 + 0.10·78 + 0.08·75 + 0.12·80 + 0.12·77
         + 0.12·72 + 0.13·70 + 0.13·74 + 0.12·62 = 74.0
N_crit = 2  →  no systemic penalty
Final Score = 74  →  PROGRESS
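
Example A can be reproduced with a few lines that apply Steps 3 and 4 to the cube averages. This is a sketch only: the function names are illustrative, and the late-stage override is left out.

```python
WEIGHTS = {
    "Economy": 0.08, "Personality": 0.10, "Sentiment": 0.08,
    "People": 0.12, "Product & Tech": 0.12, "Market": 0.12,
    "Revenue Model": 0.13, "Financials": 0.13, "Customers": 0.12,
}

def overall_score(cube_scores: dict[str, float], n_crit: int) -> float:
    """Weighted sum of cube scores, minus 5 when 3+ factors are critical."""
    raw = sum(WEIGHTS[name] * s for name, s in cube_scores.items())
    if n_crit >= 3:
        raw -= 5                      # systemic-risk penalty
    return max(0.0, min(100.0, raw))  # clip to 0-100

def verdict(score: float) -> str:
    """Verdict band from Step 4 (late-stage override not modelled here)."""
    if score >= 70:
        return "PROGRESS"
    return "WATCH" if score >= 40 else "PASS"

example_a = {
    "Economy": 82, "Personality": 78, "Sentiment": 75, "People": 80,
    "Product & Tech": 77, "Market": 72, "Revenue Model": 70,
    "Financials": 74, "Customers": 62,
}
score = overall_score(example_a, n_crit=2)
print(round(score, 1), verdict(score))   # 74.0 PROGRESS
```

Passing n_crit=3 with the same cube averages would subtract 5 and move the result into the WATCH band.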

B · Critical cap kicks in

Same company but Customers cube avg = 78 with one Critical factor
Customers raw = 78  →  capped to 65
Weighted impact: 0.12·65 = 7.80  (vs 0.12·78 = 9.36)
Δ = −1.56 points on overall

C · Systemic-risk penalty

4 Critical factors spread across People, Product, Financials, Customers
N_crit = 4  →  −5 applied to overall; each affected cube also capped at 65
A 72-pre-penalty company → final 67 → drops to WATCH
A 44-pre-penalty company → final 39 → PASS

D · Late-stage override

Engine detects 'Series B announced March 2025'  →  finalStage = 'Series B'
Cubed Fund I mandate = Seed only  →
   verdict.label    = PASS
   verdict.headline ≈ 'Outside mandate — Series B'
   fund_fit_criteria flagged 'fail — outside Cubed Seed mandate'
Cube scores still computed (used by RevCube / OpCube)

§ 06

RAG semantics & evidence rules

Every factor object carries name · status · reason · action. A factor without a reason is rejected by the salvage layer and re-requested.

Status | Multiplier | Meaning
ontrack (green) | 1.0× | Explicit positive evidence found in corpus, uploads, or follow-up answers.
attention (amber) | 0.6× | Partial evidence, weak signal, or solvable concern with a clear path to resolution.
critical (red) | 0.2× | Direct negative evidence or a structural blocker (tiny TAM, no differentiation, regulatory wall).
unknown | excluded | Insufficient evidence at this stage; excluded from the cube average.

Idea / Pre-Seed clemency rule

For Idea or Pre-Seed companies, absence of data is expected and must not be marked Critical. The engine forces Attention as the floor for missing-data factors at this stage. Critical is reserved for genuine red flags (tiny TAM, no differentiation, wrong sector).

§ 07

Stage detection & late-stage override

The engine independently detects a company’s true stage from scraped evidence, then rewrites finalStage if it disagrees with the recorded value.

Signals: funding announcements, press releases, Crunchbase / PitchBook mentions, headcount, ARR. If detected stage > recorded stage, finalStage is rewritten.

If detected stage ≥ Series A, Cubed Fund I mandate fails and verdict collapses to PASS regardless of score: late entry violates ownership economics (≥20% target at $15–25M post-money).
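
The two rules in this section can be sketched as a pair of small functions. The stage ordering list and the function names are assumptions made for illustration; only the comparisons (detected > recorded, stage ≥ Series A) come from the text.

```python
# Assumed ordering of stage labels, earliest to latest.
STAGE_ORDER = ["Idea", "Pre-Seed", "Seed", "Series A", "Series B", "Series C+"]

def resolve_final_stage(recorded: str, detected: str) -> str:
    """The detected stage replaces the recorded one only when it is later."""
    if STAGE_ORDER.index(detected) > STAGE_ORDER.index(recorded):
        return detected
    return recorded

def apply_late_stage_override(verdict: str, final_stage: str) -> str:
    """Collapse the verdict to PASS at Series A or later (Cubed Fund I rule)."""
    if STAGE_ORDER.index(final_stage) >= STAGE_ORDER.index("Series A"):
        return "PASS"
    return verdict

# A company recorded as Seed whose Series B announcement is detected.
final_stage = resolve_final_stage(recorded="Seed", detected="Series B")
print(final_stage, apply_late_stage_override("PROGRESS", final_stage))
```

The cube scores themselves are untouched by the override; only the verdict label changes.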

§ 08

Valuation model — V_pre from score

Pre-money is bounded by stage and modulated by score.

V_pre(stage, score) = V_low(stage) + ( V_high(stage) − V_low(stage) ) · ( score / 100 )

Stage | Pre-money band (USD) | Forward ARR multiple
Pre-Seed | $3M – $8M | 25×
Seed | $8M – $25M | 18×
Series A | $25M – $80M | 12×
Series B | $80M – $250M |
Series C+ | $250M – $600M |

The forecast view applies the stage multiplier to projected ARR to chart enterprise value over 5 years (see CompanyForecastCharts.tsx). Cubed Ventures Fund I writes $3–5M for a minimum 20% ownership. Since ownership = cheque / post-money, that implies roughly $15–25M post-money, or about $12–20M pre-money, at Seed.
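
The interpolation in V_pre is straightforward to evaluate by hand or in code. The sketch below transcribes the pre-money bands from the table; the names PRE_MONEY_BAND and v_pre are illustrative, and the score is assumed to already lie in 0–100.

```python
# (V_low, V_high) per stage in USD, from the table above.
PRE_MONEY_BAND = {
    "Pre-Seed": (3e6, 8e6),
    "Seed": (8e6, 25e6),
    "Series A": (25e6, 80e6),
    "Series B": (80e6, 250e6),
    "Series C+": (250e6, 600e6),
}

def v_pre(stage: str, score: float) -> float:
    """Linear interpolation between the stage's low and high pre-money."""
    low, high = PRE_MONEY_BAND[stage]
    return low + (high - low) * (score / 100.0)

# Seed company scoring 74: 8 + (25 - 8) * 0.74 = 20.58
print(f"${v_pre('Seed', 74) / 1e6:.2f}M")   # $20.58M
```

A score of 0 returns the bottom of the band and a score of 100 the top, so the stage alone bounds the valuation.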

§ 09

Forecast model — organic vs Cubed uplift

Two trajectories per company.

g(score)     = 0.15 + (score/100) · 0.85        # organic CAGR
δ(score)     = 0.10 + (score/100) · 0.15        # cubed uplift
ARR_org(t)   = ARR_0 · (1 + g(score))^(t/12)
ARR_cubed(t) = ARR_0 · (1 + g(score) + δ(score))^(t/12)
# at year ≥ 2 add +0.05 second-cheque boost to cubed track

Floor ARR: if reported ARR is 0 / unknown, ARR_0 defaults to $100k (Pre-Seed / Idea) or $500k (other) so charts remain readable; flagged ‘estimated’ in the UI.

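Valuation overlay: V(t) = ARR(t) · M_stage · decay(t), where decay(t) = 1 − 0.03·t (organic) or 1 − 0.02·t (cubed) reflects multiple compression as companies mature. Here t is presumably measured in years, matching the 5-year chart; with t in months, as in the ARR formulas above, the organic decay term would turn negative before month 34.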
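
The two ARR trajectories and the ARR floor can be sketched as below. This covers only the base formulas: the second-cheque boost and the valuation overlay are omitted because their exact application is not spelled out here. Function names are illustrative.

```python
from typing import Optional

def organic_cagr(score: float) -> float:
    """g(score): organic annual growth rate."""
    return 0.15 + (score / 100.0) * 0.85

def cubed_uplift(score: float) -> float:
    """delta(score): additional annual growth on the Cubed track."""
    return 0.10 + (score / 100.0) * 0.15

def starting_arr(reported: Optional[float], stage: str) -> float:
    """Apply the ARR floor when reported ARR is 0 or unknown (None)."""
    if reported:
        return reported
    return 100_000.0 if stage in ("Idea", "Pre-Seed") else 500_000.0

def arr_organic(arr0: float, score: float, months: float) -> float:
    return arr0 * (1 + organic_cagr(score)) ** (months / 12)

def arr_cubed(arr0: float, score: float, months: float) -> float:
    growth = organic_cagr(score) + cubed_uplift(score)
    return arr0 * (1 + growth) ** (months / 12)

arr0 = starting_arr(None, "Seed")   # falls back to the $500k floor
for months in (12, 18):
    print(months,
          f"organic={arr_organic(arr0, 74, months):,.0f}",
          f"cubed={arr_cubed(arr0, 74, months):,.0f}")
```

At score 74 the organic rate is 0.779 and the uplift 0.211, so after 12 months the floor ARR of $500k grows to about $889.5k organically and about $995k on the Cubed track.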

§ 10

Report anatomy — every section in the company PDF

The 13 components composed from a single JSON payload.

Section | What it is
CubeScoreHeader | Identity strip: logo, name, tagline, stage chip, overall score with delta vs previous_score, projected ARR, phase.
InvestmentVerdict | PROGRESS / WATCH / PASS badge, one-line headline, bull case, bear case, recommendation, conviction.
FundFitAssessment | 8-criterion checklist from Cubed Fund I mandate (sector, stage, geography, ownership, cheque, conflict, ESG…).
CompanyNews | Filtered research_insights of type ‘news’ / ‘press’. De-duplicated by URL, sorted by date desc.
CompanyForecastCharts | 18-month ARR area chart + 5-year valuation/ARR composed chart. Inputs: projected_arr, overall_score, stage.
ValidationTracker | Maps each verdict claim back to specific cube factors and research insights: the ‘show your working’ trail.
CubeBreakdownBar | Horizontal bars of the 9 cube scores, coloured by band (≥70 green, 40–69 amber, <40 red).
CubeScoreGrid | 9-column grid: each column is a cube; each cell is a factor tile coloured by RAG. Hover reveals reason + action.
ResearchNarratives | Per-cube qualitative summary written by the LLM, grounded in the scraped corpus. Each narrative cites its sources.
DrivingMetrics | The 12 target KPIs with value, target, RAG, and trend (see §11).
ActivityTimeline | Chronological log of analyst actions: created, follow-up calls, uploads, rescore events.
OperationalChessboard | 7×12 RAG grid (see §12).
SourcesProvenance | Every URL, document, and LLM call that contributed to the score. Click-through to evidence. The auditable trail.

§ 11

12 Driving Metrics

A separate target-performance layer used by RevCube. Concrete revenue-engine KPIs, each with an explicit target and a RAG.

# | KPI | Target | Definition
1 | Forecast Accuracy | ≥80% | Forecast vs actual variance over rolling 4 quarters
2 | ARR Growth | ≥50% | YoY ARR change
3 | Pipeline Coverage | ≥4–5× | Open pipeline / quota for the period
4 | Win Rate on Qualified Deals | ≥50% | Won / (won + lost) after qualification
5 | Annual Increase in Avg Deal Size | ≥10% | ASP YoY
6 | New Revenue from Account Growth | ≥30% | Expansion / total new ARR
7 | New Revenue from Partners | ≥10% | Partner-sourced / total new ARR
8 | Net Retention Rate | ≥125% | (start ARR − churn + expansion) / start ARR
9 | Deals Closed Before Last Week of Quarter | ≥90% | Anti-hockey-stick discipline
10 | Deals Closed Before Q4 | ≥70% | Avoid year-end concentration
11 | Deal Alignment to Strategy & GTM | 100% | ICP / ideal-deal scoring
12 | Quota Carriers Making 80% of Plan | ≥80% | Sales-team health
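
As a worked instance of one definition in the table, Net Retention Rate (KPI 8) can be computed as below. The ARR figures are made up for illustration, and the function name is hypothetical.

```python
def net_retention_rate(start_arr: float, churned_arr: float,
                       expansion_arr: float) -> float:
    """(start ARR - churn + expansion) / start ARR, as a fraction."""
    if start_arr <= 0:
        raise ValueError("start_arr must be positive")
    return (start_arr - churned_arr + expansion_arr) / start_arr

# Illustrative cohort: $1.0M starting ARR, $50k churned, $300k expansion.
nrr = net_retention_rate(1_000_000, 50_000, 300_000)
print(f"NRR = {nrr:.0%}, meets the 125% target: {nrr >= 1.25}")
```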

§ 12

Operational Chessboard — the 7×12 grid

84 cells. Rows = operational horizon & discipline. Columns = revenue-engine functions. Each cell carries a short label and a RAG status. The OpCube view: a single screen for where an operator should focus this quarter.

Rows (7): Immediate Focus (3 month) · Mid-Term Focus (Next QTR) · Long-Term Focus (12 months) · People · Growth · Efficiency · Accuracy.

Columns (12): Full Year · Current Quarter · Revenue Protection · Revenue Growth · Strategy · Enabled · Pipeline Gen · Qualification · Deal Growth · Deal Acceleration · Competitor Positioning · Cost.

Scoring: the LLM populates each cell from the same evidence corpus as the cubes. For early-stage companies most cells will be unknown (grey) — correct behaviour, not a flaw. As the company matures and the engine ingests CRM/ERP data, cells flip to RAG.

Reading: ‘Immediate Focus × Pipeline Gen = red’ means pipeline coverage is the single most urgent fix in the next 90 days. ‘Efficiency × Cost = amber’ means cost discipline needs a plan but is not on fire. The chessboard turns the score into action.

Illustrative render. Each cell = one RAG status × label pair in the JSON payload.

§ 13

A-Player sub-model

Inside the People cube, each named executive is scored across 7 traits, each RAG.

Attitude & execution
Strategy & Ops
Self-aware
Punchy
Team builder
Network
Relentless

The cube-level People score remains weighted at 12%, but the A-Player view feeds OpCube’s ‘who to keep / coach / replace’ recommendation and FundCube’s key-person risk assessment.

§ 14

Fund-fit assessment layer

Sits on top of the universal score.

Cubed Fund I has 8 fixed criteria; each returns pass / attention / fail with a one-line detail. A company can score 78 (PROGRESS) but still fail fund-fit (wrong stage, wrong geography, conflict, ESG exclusion). Fund-fit failure overrides verdict → PASS. This is the bridge between the universal score and a specific mandate.

§ 15

Thesis-Alignment Intelligence (TAI)

The most important extensibility surface — plug any investor mandate in as structured criteria, and the engine re-scores every company against their thesis without retraining.

The 9-cube score is thesis-neutral (universal company quality). The TAI layer is thesis-specific. Both run on the same evidence corpus.

FitScore(company, thesis) =
    Σ_k  v(k) · m_k(company)            # weighted match score
  − Σ_h  H_h · [ hard_fail_h ]          # hard-rule penalties

where
  v(k)   = investor weight for criterion k (Σ = 100)
  m_k    = match function ∈ [0, 1] per criterion type
  H_h    = penalty for breaching a hard rule (default = ∞ → PASS)

Criterion types

Type | How m_k is computed
Categorical | Sector, geography, business model. m_k = 1 if in allow-list else 0.
Range | Stage, cheque size, valuation, headcount, ARR. m_k = 1 inside band, linear decay outside, 0 beyond tolerance.
Threshold | Ownership ≥ X%, gross margin ≥ Y%. m_k = clip( (value − threshold) / tolerance, 0, 1 ).
Score-derived | ‘Must score ≥ 70 on Product cube’. m_k pulls directly from cube outputs.
Boolean hard rule | Conflict-of-interest, ESG exclusion, sanctioned jurisdiction. Hard fail → automatic PASS.
Semantic | Free-text thesis. LLM scores embedding similarity between thesis statement and company description; m_k ∈ [0, 1].
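
Three of the match functions and the FitScore sum can be sketched as follows. Two points are assumptions: the range tolerance is read as a fraction of the nearest bound (which must be positive), and a hard fail is represented as negative infinity to mirror the default H = ∞. The threshold function follows the table's formula literally, so a value exactly at the threshold scores 0 and full credit is reached at threshold + tolerance.

```python
import math

def clip(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))

def match_categorical(value: str, allow: list[str]) -> float:
    return 1.0 if value in allow else 0.0

def match_range(value: float, lo: float, hi: float, tol: float) -> float:
    """1 inside [lo, hi]; linear decay to 0 at lo*(1-tol) and hi*(1+tol).

    Assumes lo > 0 and hi > 0 so the decay width is non-zero.
    """
    if lo <= value <= hi:
        return 1.0
    if value < lo:
        return clip(1.0 - (lo - value) / (lo * tol))
    return clip(1.0 - (value - hi) / (hi * tol))

def match_threshold(value: float, threshold: float, tol: float) -> float:
    """clip((value - threshold) / tolerance, 0, 1), as in the table."""
    return clip((value - threshold) / tol)

def fit_score(weighted_matches: list[tuple[float, float]],
              hard_fails: list[bool]) -> float:
    """Sum of v(k) * m_k; any breached hard rule forces -inf (PASS)."""
    if any(hard_fails):
        return -math.inf
    return sum(v * m for v, m in weighted_matches)

# Illustrative company against three criteria with weights 15, 10, 10.
matches = [
    (15, match_categorical("fintech", ["fintech", "saas", "ai"])),
    (10, match_range(5.5e6, 1e6, 5e6, 0.25)),   # cheque slightly above band
    (10, match_threshold(0.125, 0.10, 0.05)),   # ownership of 12.5%
]
print(round(fit_score(matches, hard_fails=[False, False]), 2))
print(fit_score(matches, hard_fails=[True, False]))
```

With the remaining criteria omitted, the partial sum here is 15·1 + 10·0.6 + 10·0.5 = 26 out of a possible 35.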

Thesis schema (JSON contract)

json
{
  "thesis_id": "fund_xyz_2026",
  "investor": "XYZ Capital",
  "criteria": [
    { "k": "sector",          "type": "categorical", "allow": ["fintech","saas","ai"],         "w": 15 },
    { "k": "stage",           "type": "range",       "min": "Seed", "max": "Series A",          "w": 15 },
    { "k": "geography",       "type": "categorical", "allow": ["UK","US","EU"],                 "w": 10 },
    { "k": "cheque",          "type": "range",       "min": 1e6, "max": 5e6, "tol": 0.25,       "w": 10 },
    { "k": "ownership",       "type": "threshold",   "gte": 0.10, "tol": 0.05,                  "w": 10 },
    { "k": "gross_margin",    "type": "threshold",   "gte": 0.70, "tol": 0.10,                  "w": 10 },
    { "k": "product_quality", "type": "score",       "cube": "Product & Tech", "gte": 70,       "w": 10 },
    { "k": "thesis_fit",      "type": "semantic",    "statement": "AI infra for regulated industries", "w": 20 }
  ],
  "hard_rules": [
    { "k": "sanctions",   "type": "boolean",     "fail_if": true },
    { "k": "esg_exclude", "type": "categorical", "deny": ["tobacco","weapons","gambling"] }
  ],
  "verdict_bands": { "strong": 75, "watch": 50 }
}

Multi-thesis intelligence

Because the universal score is computed once, an LP can layer N theses and instantly see how the same pipeline ranks under each lens. This enables co-investment matching: ‘which 3 of our 12 LPs would this company suit?’ becomes a sort, not a meeting.

Calibration

Each investor’s weight vector v can be learned from historical decisions: feed in past invest / pass calls, regress on criteria values, and recover the implicit weights. This converts gut-feel committee behaviour into an auditable thesis vector.

§ 16

Cross-product intelligence

One score, three product surfaces, one flywheel.

Product | Consumes | Surfaces
FundCube | overall_score · cubes[] · investment_verdict · fund_fit_criteria · V_pre · forecast curves | Pipeline ranking · IC packs · portfolio benchmarking · second-cheque trigger (‘rescore at month 18, if score ↑ ≥ 8 pts then double down’)
RevCube | Revenue Model cube · Customers cube · Driving Metrics · Financials cube | Revenue-engine diagnostic · pipeline-coverage simulator · NRR projection · magic-number trend · sales-team capacity model
OpCube | Operational Chessboard · People cube + A-Player sub-model · Product cube delivery metrics | Weekly operating cadence · who-to-coach matrix · dependency map · OKR alignment grid · exec talent heatmap

Cross-product flywheel: each product writes evidence back to the central company record (RevCube updates Forecast Accuracy when actuals load; OpCube updates Employee NPS when a survey closes). The next score recomputation absorbs those updates automatically — the score sharpens as the company is operated through the stack.

§ 17

JSON contracts and data lineage

The data model every product surface consumes.

json
{
  "overall_score": 74,
  "projected_arr": "$2.5M",
  "detected_stage": "Seed",
  "investment_verdict": {
    "label": "PROGRESS",
    "headline": "Strong Seed-stage B2B AI play, clean GTM",
    "bull": "...", "bear": "...", "recommendation": "...",
    "fund_fit_criteria": [
      { "criterion": "Stage", "status": "pass", "detail": "Seed" }
    ]
  },
  "cubes": [
    { "name": "Economy", "score": 82, "factors": [
      { "name": "Strong", "status": "ontrack",
        "reason": "Sector CAGR 22% per S&P 2026 outlook",
        "action": "Maintain macro watch" }
    ]}
  ],
  "research_insights": [
    { "type": "news", "title": "...", "url": "...", "date": "..." }
  ],
  "driving_metrics": [
    { "name": "Forecast Accuracy", "target": "80%+", "status": "unknown", "value": null }
  ],
  "operational_chessboard": [
    { "category": "Immediate Focus (3 month)",
      "cells": [ { "label": "Pipeline gap", "status": "attention" } ] }
  ]
}

Validation rules (enforced server-side)

  • cubes.length == 9 and names match the canonical set.
  • Each factor.status ∈ {ontrack, attention, critical, unknown}.
  • Overall score recomputed from cubes and rejected if delta > 2 from LLM-claimed score.
  • operational_chessboard.length == 7, cells.length == 12 per row.
  • Every factor must have non-empty reason and action.
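
The validation rules above translate into a short checker. This is a sketch only: the canonical cube names are assumed to be those listed in §2, the recomputed score is supplied by the caller, and the function name and error messages are illustrative.

```python
VALID_STATUS = {"ontrack", "attention", "critical", "unknown"}
CANONICAL_CUBES = {
    "Economy", "Personality", "Sentiment", "People", "Product & Tech",
    "Market", "Revenue Model", "Financials", "Customers",
}

def validate_payload(payload: dict, recomputed_score: float) -> list[str]:
    """Return a list of rule violations; an empty list means it passes."""
    errors = []
    cubes = payload.get("cubes", [])
    if len(cubes) != 9 or {c.get("name") for c in cubes} != CANONICAL_CUBES:
        errors.append("cubes must be exactly the 9 canonical cubes")
    for cube in cubes:
        for factor in cube.get("factors", []):
            label = f"{cube.get('name')}/{factor.get('name')}"
            if factor.get("status") not in VALID_STATUS:
                errors.append(f"{label}: invalid status")
            if not factor.get("reason") or not factor.get("action"):
                errors.append(f"{label}: reason and action are required")
    if abs(recomputed_score - payload.get("overall_score", 0)) > 2:
        errors.append("overall_score differs from recomputed score by > 2")
    board = payload.get("operational_chessboard", [])
    if len(board) != 7 or any(len(r.get("cells", [])) != 12 for r in board):
        errors.append("operational_chessboard must be 7 rows of 12 cells")
    return errors

# Minimal synthetic payload, for illustration only.
factor = {"name": "Strong", "status": "ontrack", "reason": "r", "action": "a"}
payload = {
    "overall_score": 74,
    "cubes": [{"name": n, "score": 74, "factors": [dict(factor)]}
              for n in sorted(CANONICAL_CUBES)],
    "operational_chessboard": [
        {"category": f"row {i}",
         "cells": [{"label": "", "status": "unknown"} for _ in range(12)]}
        for i in range(7)
    ],
}
print(validate_payload(payload, recomputed_score=73.2))   # []

payload["cubes"][0]["factors"][0]["reason"] = ""          # break one rule
print(validate_payload(payload, recomputed_score=80.0))
```

Returning all violations at once, rather than raising on the first, makes it easier to log every problem with a rejected payload.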

§ 18

LLM pipeline, JSON salvage & determinism

Model: Gemini 2.5 Flash via Lovable AI Gateway. Temperature 0.2. System prompt enforces the full 101-factor catalogue, the activation matrix and the scoring rules verbatim. The LLM’s only job is classification + evidence extraction — never arithmetic.

Pipeline

  1. Scrape (website, LinkedIn, news) → corpus.
  2. Merge uploaded docs + follow-up answers.
  3. Build prompt with corpus + 101-factor instructions.
  4. LLM call → JSON.
  5. Salvage layer.
  6. Validate.
  7. Server-side recompute of overall_score from cubes (LLM’s claimed score is a check, not source of truth).
  8. Persist.

JSON salvage

LLM output is often truncated or contains stray markdown. The salvage layer strips code fences, balances braces, repairs trailing commas, retries with a stricter system prompt (‘Return ONLY valid minified JSON, no commentary, all strings <200 chars’) if parse fails, and finally falls back to a stub object with exceptions[] populated so the UI can flag ‘partial data’.
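
The local repair steps of the salvage layer can be sketched with the standard library alone. This is a simplified illustration, not the engine's code: the retry with a stricter prompt is omitted because it needs an LLM call, the trailing-comma regex does not protect commas inside string values, and the shape of the fallback stub is an assumption beyond its exceptions[] field.

```python
import json
import re

def strip_code_fences(text: str) -> str:
    """Remove a leading ```lang fence and a trailing ``` fence, if present."""
    text = text.strip()
    text = re.sub(r"^```[a-zA-Z]*\s*", "", text)
    return re.sub(r"\s*```$", "", text)

def balance_brackets(text: str) -> str:
    """Close an unterminated string and append any missing } or ]."""
    stack, in_string, escaped = [], False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    if in_string:
        text += '"'
    return text + "".join(reversed(stack))

def repair_trailing_commas(text: str) -> str:
    """Drop commas that sit directly before a closing } or ]."""
    return re.sub(r",\s*([}\]])", r"\1", text)

def salvage(raw: str) -> dict:
    """Best-effort parse; falls back to a stub with exceptions[] populated."""
    text = repair_trailing_commas(balance_brackets(strip_code_fences(raw)))
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        return {"exceptions": [str(exc)]}

truncated = '```json\n{"overall_score": 74, "cubes": [{"name": "Economy", "score": 82,\n'
print(salvage(truncated))
```

Brackets are balanced before trailing commas are repaired, so that a truncation such as `"score": 82,` first gains its closers and only then loses the dangling comma.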

Determinism guarantees

Same corpus + same prompt + temp 0.2 → score reproducible within ±1 pt. The math layer is fully deterministic — drift comes only from LLM classification noise on borderline factors, which are flagged in the audit trail for human review.

§ 19

Calibration, backtesting & the training dataset

Three tables form the ML-ready warehouse.

Table | Shape | Purpose
companies_training | Wide per company-quarter | 9 cube scores, 101 factor statuses, V_pre, projected ARR, actual ARR (when known), outcome label.
factor_observations | Long-format | (company_id, factor_id, quarter, status, evidence_url). Per-factor calibration: does ‘Magic Number = attention’ actually predict Series A outcomes?
outcomes_ledger | Event log | round_raised, valuation, exit, shutdown with timestamps. Used to compute Brier scores and fit the weight vector.

Backtest metric

Brier(verdict) = mean( ( P(PROGRESS) − 1[outcome = positive] )² )

Cubed Fund I target: Brier ≤ 0.18 on 24-month-forward outcomes.
Each new outcome event updates the running Brier; weight-vector review every 25 new outcomes.
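
The Brier metric itself is a mean squared error between predicted probabilities and binary outcomes. The sketch below uses made-up probabilities; how P(PROGRESS) is derived from a score is not specified here, so it is treated as an input.

```python
def brier_score(probabilities: list[float], outcomes: list[int]) -> float:
    """Mean of (P(PROGRESS) - 1[outcome = positive])^2 over all companies."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    squared = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)

# Illustrative data: four companies, 1 = positive 24-month outcome.
p_progress = [0.9, 0.7, 0.2, 0.4]
positive = [1, 1, 0, 1]

score = brier_score(p_progress, positive)
print(round(score, 3), "within target" if score <= 0.18 else "above target")
```

Lower is better: a perfect forecaster scores 0, and always predicting 0.5 scores 0.25, which is why a target of 0.18 is meaningfully better than chance.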

Weight learning (when enabled)

w* = argmin_w  Σ_i ( Σ_c w(c) · S(c, i) − y_i )²
              s.t.  Σ_c w(c) = 1,  w(c) ≥ 0

Initialised at the curated 8 / 10 / 8 / 12 / 12 / 12 / 13 / 13 / 12 vector.
Drift > 15% on any cube triggers a human-in-the-loop committee review
before deployment — never auto-deployed.

§ 20

Failure modes, biases & guardrails

What can go wrong, and what stops it.

Failure mode | What it is | Mitigation
Idea-stage false-negative | Engine could mark missing data as Critical. | Stage-aware clemency rule forces Attention floor and excludes inactive factors.
Hype inflation | Sentiment-heavy corpora bias the LLM toward ontrack. | Sentiment cube capped at 8% weight; Critical-cap dominates.
Single-source dependency | If scraping fails, score collapses to unknown. | SourcesProvenance flags low-coverage; UI shows ‘partial data’ badge; score withheld below evidence threshold.
Late-stage misuse | Score is calibrated for Seed; applying to Series C distorts weights. | Late-stage override forces verdict = PASS for Cubed Fund I; other TAI profiles can rebalance weights per stage.
LLM hallucinated evidence | Fabricated reason/action strings. | Each evidence claim must carry a source URL traceable in SourcesProvenance; unreferenced reasons are quarantined.
Founder gaming | Once thesis is known, founders shape language to match. | Weights are public, but evidence rules + Critical-cap mean shallow language games stay at Attention; only verifiable outcomes flip to green.

§ A

Appendix · Offline reference implementation

Reproduces the exact score the live engine produces for any JSON payload. Language-agnostic.

python
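from statistics import mean

# Keys follow the cube names in §2.
WEIGHTS = {
    "Economy": .08, "Personality": .10, "Sentiment": .08,
    "People": .12, "Product & Tech": .12, "Market": .12,
    "Revenue Model": .13, "Financials": .13, "Customers": .12,
}
MULT = {"ontrack": 1.0, "attention": 0.6, "critical": 0.2}   # unknown / N/A excluded

def cubed_score(cubes, company, stage_allows, late_stage_override):
    """Return (score, verdict) for a list of cube dicts shaped as in §17.

    stage_allows(tag, stage) and late_stage_override(company) are supplied
    by the caller (§3 and §7); each factor is expected to carry a 'tag'.
    """
    score, n_crit = 0.0, 0
    for cube in cubes:
        active = [f for f in cube["factors"]
                  if stage_allows(f["tag"], company["stage"])
                  and f["status"] in MULT]
        # Assumption: a cube with no rated factors contributes 0 (not specified above).
        raw = 100 * mean(MULT[f["status"]] for f in active) if active else 0.0
        crit = sum(f["status"] == "critical" for f in active)
        if crit:
            raw = min(raw, 65)                              # critical cap
        cube["score"] = raw
        n_crit += crit                                      # only stage-active criticals count
        score += raw * WEIGHTS[cube["name"]]
    if n_crit >= 3:
        score -= 5                                          # systemic-risk penalty
    score = max(0.0, min(100.0, score))                     # clip to 0–100
    verdict = "PROGRESS" if score >= 70 else ("WATCH" if score >= 40 else "PASS")
    if late_stage_override(company):
        verdict = "PASS"
    return score, verdict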

— End of Wiki v1 —