Subjective Outcomes: When Markets Resolve by Committee
Executive Summary
When a prediction-market rule requires a human reviewer to decide an outcome — because the threshold is qualitative, the criteria are contested, or the source-of-truth is plural — you are no longer trading the underlying event. You are trading the committee. Markets with the SUBJECTIVE_JUDGMENT driver have the highest mean loss per dispute in the SettleRisk dataset. This post explains why, and what to do about it.
Core Concept
SUBJECTIVE_JUDGMENT fires when SettleRisk's LLM extractor identifies one of these patterns:
| Pattern | Example |
|---------|---------|
| Qualitative threshold | "significantly", "materially", "substantially" |
| Reviewer authority | "as determined by the platform" |
| Vague entity criteria | "major exchange", "leading source" |
| Non-numeric impact | "the policy was effective" |
| Multi-criteria with weights unspecified | "primarily, but also considering..." |
The driver carries 14 base points and a max_points of 26 (the second-highest cap in the taxonomy, after RETROACTIVE_RULE_CHANGE). The reason is empirical: subjective-judgment disputes have a median resolution time of 23 days and a 31% probability of reversal on appeal, both significantly higher than any other driver.
Worked Example
A real Polymarket market asked "Will the new tariff package significantly impact bilateral trade by Q2 2026?" The word "significantly" alone created a multi-week dispute when trade volume fell 12%. Some traders argued any double-digit drop qualified; others pointed to historical precedent where 20%+ was the threshold.
from settlerisk import SettleRiskClient
client = SettleRiskClient(api_key="sk-...")
score = client.get_risk_score("polymarket", "tariffs-significant-impact-2026")
# Score: 78, tier: CRITICAL
# p_dispute: 0.312
# Top drivers:
# SUBJECTIVE_JUDGMENT 21 conf=0.94
# AMBIGUOUS_WORDING 15 conf=0.88
# METRIC_DEFINITION 9 conf=0.79
The committee that eventually resolved the market set the threshold retroactively at 20%, locking out traders who had taken the YES side believing 10-15% would qualify. $3.8M sat frozen for 23 days.
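To make the threshold sensitivity concrete, here is a minimal sketch of how the same 12% drop resolves under the two readings traders argued for. The function name and the rule "a drop at or above the threshold resolves YES" are illustrative assumptions, not the platform's actual resolution logic.

```python
def resolves_yes(observed_drop_pct: float, threshold_pct: float) -> bool:
    """Hypothetical resolution rule: YES if the drop meets the threshold."""
    return observed_drop_pct >= threshold_pct

OBSERVED_DROP = 12.0  # trade volume fell 12% in the example above

# The two interpretations traders argued for during the dispute.
interpretations = {"any double-digit drop": 10.0, "historical precedent": 20.0}

outcomes = {
    name: "YES" if resolves_yes(OBSERVED_DROP, threshold) else "NO"
    for name, threshold in interpretations.items()
}
print(outcomes)
# {'any double-digit drop': 'YES', 'historical precedent': 'NO'}
```

The underlying data never changed; only the reading of one adverb did, and that alone flips the payout.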
Detection in code:
import re

SUBJECTIVE_LEXICON = {
    "significantly", "substantially", "materially", "meaningfully",
    "effectively", "primarily", "appropriately", "appreciably",
}

def has_subjective_terms(text: str) -> list[str]:
    # Match whole words so "immaterially" does not count as "materially".
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Sort so the result order is stable across runs.
    return sorted(SUBJECTIVE_LEXICON & words)

const SUBJECTIVE_LEXICON = new Set([
  "significantly", "substantially", "materially", "meaningfully",
  "effectively", "primarily", "appropriately", "appreciably",
]);

function hasSubjectiveTerms(text: string): string[] {
  // Match whole words so "immaterially" does not count as "materially".
  const words = new Set(text.toLowerCase().match(/[a-z]+/g) ?? []);
  return [...SUBJECTIVE_LEXICON].filter((w) => words.has(w));
}
This catches the top patterns but misses entity-level subjectivity (e.g. "leading source"). For those, run the full extraction via /v1/evaluate-rules.
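Some phrase-level cases can still be caught cheaply before calling the API. The sketch below is a rough pre-filter built only from the examples in the pattern table; the adjective and noun lists, the family names, and the function name are assumptions for illustration, not SettleRisk's actual rules, and it is no substitute for the full extraction.

```python
import re

# Illustrative phrase patterns drawn from the examples in the pattern table.
# The word lists are assumptions; extend them from your own dispute history.
PHRASE_PATTERNS = {
    "vague_entity": re.compile(
        r"\b(?:major|leading|reputable|prominent)\s+"
        r"(?:exchange|source|outlet|publication)s?\b"
    ),
    "reviewer_authority": re.compile(r"\bas determined by\b"),
}

def find_subjective_phrases(text: str) -> dict[str, list[str]]:
    """Return matched phrases per pattern family, omitting empty families."""
    lowered = text.lower()
    hits = {name: pat.findall(lowered) for name, pat in PHRASE_PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}

rules = "Resolves YES if listed on a major exchange, as determined by the platform."
print(find_subjective_phrases(rules))
# {'vague_entity': ['major exchange'], 'reviewer_authority': ['as determined by']}
```

A hit here is a reason to escalate to the full extraction, not a score in itself.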
Implementation Notes
Trade these markets with extreme size discipline. A SUBJECTIVE_JUDGMENT driver above 18 points should reduce exposure caps to ~25% of base for that market. The realized loss distribution is heavy-tailed enough that even the cumulative drawdown from small positions can wreck a quarter.
Watch for retroactive threshold-setting. When a market enters dispute on subjective grounds, the committee will pick a threshold. Subsequent markets with similar language will be governed by that precedent — but until then, you have no anchor. Persist resolved interpretations and reuse them.
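One way to persist resolved interpretations is a small keyed log. The sketch below is a minimal JSON-backed version; the class name, file layout, and field names are all hypothetical, and a production desk would more likely use a database keyed on platform and rule family.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

class PrecedentLog:
    """Tiny JSON-backed store mapping a subjective phrase to its resolved reading."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Load earlier rulings if the file already exists.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    def record(self, phrase: str, interpretation: str, market_id: str) -> None:
        # A later ruling overwrites an earlier one for the same phrase.
        self.entries[phrase.lower()] = {
            "interpretation": interpretation,
            "market_id": market_id,
        }
        self.path.write_text(json.dumps(self.entries, indent=2))

    def lookup(self, phrase: str) -> Optional[dict]:
        return self.entries.get(phrase.lower())

# Usage: record the tariff ruling, then reload the log as a later session would.
with tempfile.TemporaryDirectory() as tmp:
    log_path = str(Path(tmp) / "precedents.json")
    PrecedentLog(log_path).record(
        "significantly", "20% or greater decline", "tariffs-significant-impact-2026"
    )
    reloaded = PrecedentLog(log_path)
    found = reloaded.lookup("Significantly")

print(found["interpretation"])  # 20% or greater decline
```

Keying on the bare phrase is the simplest choice but also the crudest: the same word can be read differently across platforms or topics, so a real log should carry that context too.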
Skip the worst sub-patterns entirely. "As determined by the platform" with no enumerated criteria is a do-not-quote filter. The math doesn't work; the dispute economics don't work; just don't trade it.
| Sub-pattern | Recommended action |
|-------------|--------------------|
| Qualitative threshold | Size at 50% of base |
| Reviewer authority (enumerated criteria) | Size at 70% of base |
| Reviewer authority (no criteria) | Do not quote |
| Vague entity | Size at 60% of base, require explicit fallback rule |
| Multi-criteria unspecified | Size at 40% of base |
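The table and the 18-point rule can be folded into a single sizing function. This is a sketch under stated assumptions: the sub-pattern keys are hypothetical labels, and taking the tighter of the two limits when both apply is one reasonable reading rather than something the guidance spells out.

```python
# Multipliers from the table above; the keys are hypothetical labels.
SIZE_MULTIPLIER = {
    "qualitative_threshold": 0.50,
    "reviewer_authority_enumerated": 0.70,
    "reviewer_authority_no_criteria": 0.00,  # do not quote
    "vague_entity": 0.60,  # the explicit-fallback-rule check is not modeled here
    "multi_criteria_unspecified": 0.40,
}
HIGH_POINTS_CUTOFF = 18   # SUBJECTIVE_JUDGMENT points above this trigger the cap
HIGH_POINTS_CAP = 0.25    # ~25% of base

def max_position(base_size: float, sub_pattern: str, driver_points: float) -> float:
    """Return the largest position allowed for a market with this sub-pattern."""
    multiplier = SIZE_MULTIPLIER[sub_pattern]
    if driver_points > HIGH_POINTS_CUTOFF:
        # Assumption: when both rules apply, the tighter limit wins.
        multiplier = min(multiplier, HIGH_POINTS_CAP)
    return base_size * multiplier

print(max_position(10_000, "qualitative_threshold", 21))          # 2500.0
print(max_position(10_000, "reviewer_authority_no_criteria", 12))  # 0.0
```

An unknown sub-pattern raises a KeyError on purpose: a market the desk cannot classify should fail loudly rather than default to full size.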
The pricing engine flags these automatically. If score.drivers includes a SUBJECTIVE_JUDGMENT driver with points_contribution > 18, the pricing engine widens fair spread by 50-100 bps relative to what the score alone would suggest.
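How the widening scales inside the 50-100 bps band is not specified here. The sketch below assumes a linear ramp from the 18-point trigger up to the driver's 26-point cap; that interpolation, the constant names, and the function name are illustrative assumptions, not the pricing engine's actual formula.

```python
MIN_WIDEN_BPS = 50.0
MAX_WIDEN_BPS = 100.0
TRIGGER_POINTS = 18.0  # widening applies only above this contribution
MAX_POINTS = 26.0      # the driver's max_points

def extra_spread_bps(points_contribution: float) -> float:
    """Assumed linear ramp: just over 18 points gives ~50 bps, 26 points gives 100 bps."""
    if points_contribution <= TRIGGER_POINTS:
        return 0.0
    capped = min(points_contribution, MAX_POINTS)
    fraction = (capped - TRIGGER_POINTS) / (MAX_POINTS - TRIGGER_POINTS)
    return MIN_WIDEN_BPS + fraction * (MAX_WIDEN_BPS - MIN_WIDEN_BPS)

print(extra_spread_bps(21))  # 68.75, for the tariff market's 21-point driver
```

Under this assumption the tariff market from the worked example would quote roughly 69 bps wider than its headline score implies.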
Failure Modes
1. Trading the underlying instead of the committee. A market with subjective resolution is a meta-market on how reviewers will interpret the rules. Fundamental analysis of the underlying event alone underweights that interpretation risk.
2. Ignoring precedent. Committees follow precedent. If a similar market resolved with a 20% threshold last quarter, this one will too — unless the rules explicitly differ. Keep a precedent log.
3. Sizing to base on the first appearance. A SUBJECTIVE_JUDGMENT > 18 market should be size-capped well below your base allocation until the precedent is set.
4. Confusing subjective with qualitative. Some qualitative terms have well-established quantitative interpretations (e.g. "investment grade" maps to specific rating thresholds). Those are not subjective in the SettleRisk sense.
5. Skipping evidence spans. When a market scores high on SUBJECTIVE_JUDGMENT, the specific phrase in the rules text is the highest-signal info on the page. Always show it to the desk.
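For item 5, a local pre-filter can surface the offending phrase with some surrounding context. The sketch below is a minimal version; the function name and the returned field names are hypothetical and do not describe the response shape of the SettleRisk API.

```python
import re

def evidence_spans(rules_text: str, terms: list[str], context: int = 30) -> list[dict]:
    """Locate each term in the rules text and return it with nearby context."""
    spans = []
    for term in terms:
        pattern = rf"\b{re.escape(term)}\b"
        for m in re.finditer(pattern, rules_text, flags=re.IGNORECASE):
            lo = max(0, m.start() - context)
            hi = min(len(rules_text), m.end() + context)
            spans.append({
                "term": term,
                "start": m.start(),   # character offsets into rules_text
                "end": m.end(),
                "snippet": rules_text[lo:hi],
            })
    # Present spans in reading order.
    return sorted(spans, key=lambda s: s["start"])

rules = "Will the new tariff package significantly impact bilateral trade by Q2 2026?"
for span in evidence_spans(rules, ["significantly"], context=10):
    print(span["start"], span["end"], repr(span["snippet"]))
```

Character offsets let a UI highlight the exact phrase in the rules text instead of asking the desk to reread the whole page.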
Checklist
- [ ] Maintain a lexicon-based pre-filter for subjective terms
- [ ] Use the full extraction for entity-level subjectivity
- [ ] Reduce exposure caps for subjective drivers > 18 pts
- [ ] Quote-skip "as determined by the platform" with no criteria
- [ ] Persist resolved precedents for reuse
- [ ] Subscribe to dispute.resolved to capture new precedents
Sources + Further Reading
- SettleRisk methodology — full SUBJECTIVE_JUDGMENT pattern set
- Ambiguous wording post — related linguistic patterns
- Driver attribution post — how drivers combine
- Vagueness in Legal Drafting (Endicott 2000) — adjacent academic literature
- Polymarket UMA dispute appeals — empirical reversal rates
Free key at /signup — extract drivers on any rules text in 200ms.