Prompt Details
Model
(claude-4-6-sonnet)
Token size
1,719
Example input
[RESEARCH_TOPIC]: Universal Basic Income pilots since 2017
[KEY_QUESTIONS]: What do completed pilots show about employment effects? Where do researchers disagree?
[DEPTH_LEVEL]: deep
[SOURCE_CONSTRAINTS]: academic_only
Example output
Here is the full synthesis, produced under the protocols in your system prompt.
---
**RESEARCH MODE: C — Hybrid** *(live web search active; four queries executed on April 26, 2026; findings labelled by origin)*
**DEPTH: Deep**
**SOURCE CONSTRAINTS: Academic only** *(Tier 3/4 sources used only to locate Tier 1/2 studies; all factual claims are attributed to peer-reviewed papers, government institutional reports, or working papers from major economics research bureaus)*
---
## EXECUTIVE SUMMARY
The post-2017 UBI pilot literature now includes more than 120 programmes across the United States and several national experiments in Europe, but rigorous employment evidence narrows sharply to roughly 30 randomised controlled trials (RCTs) with published outcomes. The weight of the highest-quality evidence, led by the large-scale ORUS/NBER trial, points to small-to-moderate reductions in labour supply among recipients, consistent with standard income-effect theory; the Finnish Kela experiment, by contrast, found an aggregate employment effect close to zero. Smaller pilots, including the Stockton SEED study, report employment gains, and a methodological dispute over whether any current pilot can predict the effects of a permanent, universal programme remains genuine and unresolved. Across all findings, wellbeing improvements are the most consistent result; employment effects are the most contested. This synthesis leans heavily on Tier 1 and Tier 2 sources; Tier 3 material was used only to navigate to primary papers and is not cited as evidence.
---
## 1. INFORMATION LANDSCAPE
**Sub-themes identified:**
1. Scale and quality of the pilot evidence base (2017–2025)
2. Employment effects in large, well-powered RCTs (Finland, ORUS/US)
3. Employment effects in smaller, city-level RCTs (Stockton/SEED, others)
4. Methodological disputes: design, generalisability, COVID confounding
5. Theoretical divides: income-effect models vs. productive-activity hypotheses
**Source tiers accessed:**
- Tier 1: Kela/Finnish Ministry of Social Affairs and Health official evaluation (Kangas et al., 2020); NBER Working Paper 32719 (Vivalt et al., 2024); NBER Working Paper 34040 (Krause et al., 2025); IZA Discussion Paper 18174 (Riddell & Riddell, 2026); PMC peer-reviewed articles (Hiilamo, 2022; Helmi et al., 2023 on gender effects in the Finnish experiment)
- Tier 2: AEI working paper meta-synthesis (Corinth & Mayhew, February 2026); MDPI Sustainability systematic review (Raventós et al., 2020); International Journal of Social Welfare simulation study (Lee, 2025)
- Tier 3/4: Used only to identify and locate Tier 1/2 papers; not cited as factual sources
**Sub-themes with weak source coverage:** Developing-country pilots since 2017 with published employment RCTs; long-run (5+ year) employment trajectories; spillover effects on non-participants.
---
## 2. MULTI-PERSPECTIVE ANALYSIS
**Mainstream consensus (where it exists):**
There is broad convergence among Tier 1 economists that unconditional transfers produce some reduction in labour supply, consistent with income-effect theory. The analytical dispute is over *magnitude*, not *direction*. [Evidence: Vivalt et al., NBER WP 32719, 2024, Tier 1; Corinth & Mayhew, AEI, 2026, Tier 2] This position is also consistent with the legacy 1970s Negative Income Tax experiments and quasi-experimental evidence from the Alaska Permanent Fund. [Evidence: Riddell & Riddell, IZA DP 18174, 2026, Tier 1]
**Dissenting and minority view — productive reallocation hypothesis:**
A significant faction of researchers argues that labour-supply reduction does not equal welfare loss. If transfers allow workers to exit low-quality jobs, extend job search, pursue education, or take on caregiving, the reduction in measured employment may represent a rational reallocation of time, not social harm. The Stockton SEED evaluation (West & Castro Baker, 2021) is the primary empirical anchor for this view among post-2017 pilots. [Evidence: West & Castro Baker, SEED Year 1 Report, 2021, Tier 2 — funded by Robert Wood Johnson Foundation Evidence for Action Program] Academic critics of ORUS (notably Guy Standing, 2024) also invoke this framework, arguing that the mobile-app time-diary methodology systematically under-records socially valued but unremunerated activity. [Evidence: Standing, Basic Income Earth Network critique, August 2024, Tier 2]
**Cross-disciplinary tension:**
Labour economists focus on the extensive margin (employment participation) and intensive margin (hours worked) as the primary outcome variables. Social policy and public health researchers weight wellbeing, mental health, and financial security as co-equal outcomes. This is not a factual disagreement but a values-embedded methodological divide about what a pilot *should* measure. [Inference: based on divergent outcome prioritisation across NBER employment papers vs. Finnish Kela report and SEED Year 1 white paper; Confidence: HIGH]
---
## 3. KEY FINDINGS (with provenance tags)
### Sub-theme 1: Scale and quality of the evidence base
Between 2017 and 2025, approximately 122 guaranteed basic income pilots were conducted across the United States, allocating around $481 million to over 40,000 recipients. Of these, only 52 had published outcomes, only 35 used randomised designs, and only 30 of those reported employment outcomes. [Evidence: Corinth & Mayhew, AEI/Policy Commons working paper, February 2026, Tier 2] [Single-Source on these aggregate counts — not yet independently replicated in peer-reviewed form]
The authors caution that even the highest-quality pilots were conducted during or shortly after the COVID-19 pandemic, raising questions about whether findings generalise to a permanent, universal programme under normal economic conditions. [Evidence: Corinth & Mayhew, AEI, 2026, Tier 2]
### Sub-theme 2: Large-scale RCTs — employment effects
**Finland (2017–2018), Kela:**
Finland's two-year experiment provided 2,000 randomly selected unemployed persons with €560 per month unconditionally. The official evaluation found that basic income recipients were employed for approximately 6 more days on average during the measurement period than the control group, a small positive effect. [Evidence: Finnish Ministry of Social Affairs and Health / Kela press release reporting on Kangas et al. evaluation, May 2020, Tier 1]
A subsequent peer-reviewed PMC analysis found that, across all recipients in aggregate, the Finnish experiment produced neither a clearly positive nor a clearly negative employment effect. [Evidence: Helmi et al., PMC/PubMed peer-reviewed article on gender effects in the Finnish experiment, 2023, Tier 1]
Qualitative data revealed heterogeneous mechanisms: some recipients used the income security to accept lower-paid jobs they would otherwise have declined, while others used the autonomy it provided to decline such positions. [Evidence: Blomberg-Kroll et al., University of Helsinki qualitative component of the Kela evaluation, Tier 1; located via a WEAll summary]
**ORUS (Open Research Unconditional Income Study), Illinois and Texas (2020–2023):**
The largest US-based RCT on unconditional cash transfers randomised 1,000 low-income adults aged 21–40 to receive $1,000 per month for three years against a control group of 2,000 receiving $50 per month. The programme produced a 2.0 percentage-point decrease in labour market participation and a 1.3–1.4 hour per week reduction in work hours; partners of participants reduced their hours by a comparable amount. [Evidence: Vivalt, Rhodes, Bartik, Broockman, Krause & Miller, NBER Working Paper 32719, July 2024, Tier 1]
Total individual income excluding the transfers fell by approximately $1,800 per year relative to the control group. Measures of subjective wellbeing were higher among treated participants in the first year but reverted to control-group levels thereafter. No significant effects on degree attainment were detected, though younger participants showed a tendency toward more formal education. [Evidence: Vivalt et al., NBER WP 32719, 2024, Tier 1; also SSRN abstract, revised November 2025]
The greatest increase in time use among treated participants was in leisure, not in caregiving, education, or job search activities that would represent productive reallocation. [Evidence: Vivalt et al., NBER WP 32719, 2024, Tier 1]
[Inference: The ORUS income elasticity of labour supply — roughly -0.18 as estimated in the AEI meta-analysis — is consistent with the range reported from the 1970s NIT experiments and the academic literature more broadly. This convergence suggests the ORUS finding is not an artefact of study design. Confidence: MEDIUM, because COVID overlap remains a confound]
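To make the elasticity figure above concrete: an arc income elasticity of labour supply divides the percentage change in hours worked by the percentage change in unearned income. The sketch below is a minimal illustration using invented baseline values (a 32-hour week, $30,000 of baseline annual income, a ~$950/month net transfer), not figures reported by ORUS or the AEI meta-analysis, so its output is not the -0.18 estimate:

```python
def income_elasticity(d_hours: float, base_hours: float,
                      d_income: float, base_income: float) -> float:
    """Arc income elasticity of labour supply:
    (% change in hours worked) / (% change in unearned income)."""
    return (d_hours / base_hours) / (d_income / base_income)

# Hypothetical inputs: 1.35 fewer hours/week on a 32-hour baseline,
# against a ~$11,400/year net transfer on $30,000 of baseline income.
e = income_elasticity(d_hours=-1.35, base_hours=32.0,
                      d_income=11_400, base_income=30_000)
print(round(e, 2))  # -0.11 under these assumed baselines
```

Published estimates are derived from the underlying micro-data rather than a single ratio, but the interpretation is the same: a more negative value means a given income gain produces a larger proportional cut in hours.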
### Sub-theme 3: Smaller RCTs — employment effects
**Stockton SEED (2019–2021, California):**
SEED distributed $500 per month for 24 months to 125 randomly selected residents of low-income neighbourhoods. First-year results (pre-pandemic, February 2019–February 2020) showed that recipients obtained full-time employment at more than twice the rate of non-recipients. [Evidence: West & Castro Baker, SEED Year 1 Report, evaluated by University of Tennessee and University of Pennsylvania SP2, funded by Robert Wood Johnson Foundation, 2021, Tier 2]
[Single-Source on full-time employment doubling — this finding has not been independently replicated] [Bias flag: the evaluation team was embedded in the same advocacy ecosystem as the programme funders; see Protocol 5]
Critics at the time of publication noted that because SEED lasted only two years and participants knew the transfers were temporary, people were unlikely to exit the labour force permanently in response. [Evidence: Zwolinski, quoted in Associated Press/NPR coverage of SEED results, 2021, Tier 3 — included only as a methodological caution, not as factual claim]
**Meta-synthesis across 30 US RCTs:**
Across all 30 RCTs with published employment outcomes, the mean effect on employment was a 0.8 percentage-point increase. However, among the four largest pilots — those with treatment groups of at least 500 participants, comprising 55% of all treatment-group participants — the mean effect was a 3.2 percentage-point decrease in employment. [Evidence: Corinth & Mayhew, AEI working paper, February 2026, Tier 2] [Single-Source — not yet peer-reviewed] [Confidence: MEDIUM pending peer review]
[Inference: The divergence between the pooled average (+0.8pp) and the large-pilot average (-3.2pp) is consistent with small-study bias — smaller pilots may be underpowered to detect negative effects and may also attract self-selected, highly motivated participants. This pattern is common in the cash-transfer literature. Confidence: MEDIUM]
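The divergence between a pooled mean and a size-weighted mean can be reproduced with a toy calculation. The effect sizes and sample sizes below are invented for illustration (they are not the Corinth & Mayhew data); they simply show how several small positive pilots can pull an unweighted average positive while one large negative pilot dominates the participant-weighted average:

```python
# Hypothetical pilots: (employment effect in percentage points, treatment-arm n)
pilots = [(+1.5, 100), (+2.0, 80), (+1.0, 120), (-3.2, 1000)]

unweighted = sum(effect for effect, _ in pilots) / len(pilots)
total_n = sum(n for _, n in pilots)
weighted = sum(effect * n for effect, n in pilots) / total_n

print(f"unweighted mean: {unweighted:+.2f}pp")    # +0.33pp
print(f"size-weighted mean: {weighted:+.2f}pp")   # -2.13pp
```

Formal meta-analyses typically weight by inverse variance rather than raw participant counts, but the qualitative point is the same: small studies can dominate an unweighted pooled mean.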
---
## 4. DISAGREEMENT MAP
**Disagreement 1: Direction of employment effects in small pilots vs. large pilots**
- What specifically: Small pilots (SEED, several city programmes) find positive or neutral employment effects; large, better-powered RCTs (ORUS, meta-synthesis of the four largest US pilots) find negative effects of 2–4 percentage points.
- Root diagnosis: *Methodological and statistical* — small samples produce wide confidence intervals and are vulnerable to selection artefacts; the positive SEED result may also reflect a "buffer stock" mechanism where income stability enables job search that smaller samples can detect in short windows but that reverses in longer, larger studies.
- Tier weighing: ORUS (Vivalt et al., 2024, NBER, Tier 1) carries substantially more evidential weight than SEED (West & Castro Baker, 2021, Tier 2 white paper, n=125).
- Status: [Disputed — methodological root, not fully resolved]
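The statistical side of this diagnosis can be illustrated with a back-of-envelope minimum-detectable-effect (MDE) calculation for a two-proportion comparison. Every parameter below is an assumption chosen for illustration (equal arms, 80% power, two-sided alpha of 0.05, a 70% baseline employment rate); none is taken from the cited studies:

```python
import math

def mde_two_proportions(n_per_arm: int, baseline: float = 0.70,
                        z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Approximate minimum detectable effect (in proportion units) for a
    two-arm comparison of employment rates with equal arm sizes."""
    se = math.sqrt(2 * baseline * (1 - baseline) / n_per_arm)
    return (z_alpha + z_beta) * se

print(round(mde_two_proportions(125), 3))   # SEED-sized arm: ~0.162, i.e. ~16pp
print(round(mde_two_proportions(1000), 3))  # ORUS-sized arm: ~0.057, i.e. ~5.7pp
```

Under these assumptions a 125-person arm can only reliably detect employment effects an order of magnitude larger than the 2–4pp effects the large RCTs report. Real studies tighten this crude bound with covariate adjustment and unequal arms; the sketch shows only the order of magnitude.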
**Disagreement 2: Whether labour-supply reduction constitutes a policy failure**
- What specifically: Vivalt et al. (2024) characterise reduced work hours combined with leisure gains as a "moderate labour supply effect that does not appear offset by other productive activities." Standing (2024) and other UBI advocates argue the framework is too narrow and that care work, entrepreneurship, and wellbeing gains are productive but unmeasured.
- Root diagnosis: *Definitional and ideological* — rooted in disagreement about what outcome variables a policy evaluation should weight, not in the facts themselves.
- Tier weighing: This is a values question on which neither side has evidentiary authority; both draw on real empirical findings.
- Status: [Disputed — definitional/ideological root, irresolvable by data alone]
**Disagreement 3: COVID confounding and generalisability**
- What specifically: ORUS ran November 2020–October 2023, overlapping substantially with pandemic disruptions. Corinth and Mayhew (2026) explicitly flag this as a reason for caution in using pilot evidence to inform future policy debates about a permanent, universal programme. The ORUS authors themselves acknowledge this limitation.
- Root diagnosis: *Scope* — no post-2017 large-scale RCT has been conducted entirely in a normal labour market.
- Status: [Gap — unresolvable with existing data; requires a post-pandemic replication]
**Disagreement 4: Methodological validity of time-diary data in ORUS**
- What specifically: Standing (2024) argues that the mobile-app time-diary methodology used in ORUS has a known tendency to record normatively acceptable activities and under-report others, questioning whether the leisure increase is a genuine behavioural shift or a measurement artefact.
- Root diagnosis: *Methodological* — a legitimate critique, but not accompanied by alternative data; Vivalt et al. defend the methodology's gold-standard status in time-use research.
- Tier weighing: Vivalt et al. (NBER, Tier 1) vs. Standing critique (Tier 2 advocacy paper); the burden of proof lies with the challenger to provide alternative measurement.
- Status: [Disputed — methodological root, not resolved]
---
## 5. SOURCE BIAS NOTES
**Funding bias — SEED/Stockton:** The SEED programme budget was funded by the Economic Security Project, a philanthropic organisation explicitly advocating for guaranteed income. The independent evaluation was funded by the Robert Wood Johnson Foundation's Evidence for Action programme. Evaluators (West, Castro Baker) were academically credentialed but operated within an ecosystem where funder and programme sponsor shared a policy preference. This does not invalidate the findings but warrants extra scrutiny of positive results. [Observation, not dismissal]
**Institutional affiliation bias — AEI meta-synthesis:** The Corinth & Mayhew (2026) working paper originates from the American Enterprise Institute, a think tank with a broadly market-oriented editorial position. The paper is methodologically sound and the data underlying the meta-synthesis (Stanford Basic Income Lab dashboard) is independently available, but readers should note the institutional context when interpreting its framing of pilot limitations. [Observation]
**Selection bias — most US pilots:** ORUS required participants to be aged 21–40 and to have household income at or below 300% of the federal poverty level, and most participants were recruited by mail, excluding those experiencing homelessness. Results may not generalise to older adults or the most marginalised populations.
**Temporal/pandemic bias:** All major post-2017 large-scale US RCTs overlapped with COVID-19. Labour market conditions during 2020–2023 were structurally abnormal. This is the most important cross-cutting bias limitation in the current evidence base. [Evidence: Corinth & Mayhew, 2026, Tier 2; Vivalt et al., 2024, Tier 1]
**Geographic and demographic bias:** The richest RCT evidence comes from Finland (a high-trust, comprehensive welfare state) and two US states (Illinois, Texas) with specific demographic profiles. Finland's pilot suffered from several design constraints that limit generalisation, including a small sample, policy changes mid-experiment, and the ability to test only a single version of the programme rather than multiple variants. [Evidence: McKinsey Global Institute analysis of Kela experiment, 2020, Tier 2]
**Publication bias:** Of 122 US pilots, only 52 published outcomes, and it is unknown whether unpublished pilots disproportionately showed null or negative results. [Gap; standard publication bias concern applies]
---
## 6. KNOWLEDGE GAPS
**Gap 1 — Post-pandemic RCT:** No completed large-scale RCT with employment outcomes has run in a structurally normal labour market. This is the single most important missing piece. *Resolution path:* Pilots launched or extended after 2023 with multi-year timelines will begin producing data in 2026–2028.
**Gap 2 — Long-run effects:** The longest completed RCT (ORUS) ran three years. Economic theory predicts that permanent income changes elicit larger behavioural adjustments than temporary ones; no pilot can yet test this. *Resolution path:* Requires 5–10 year follow-up studies or quasi-experimental exploitation of permanent programmes (e.g., Alaska Permanent Fund extensions).
**Gap 3 — Macro/general equilibrium effects:** All pilots are local experiments. A nationwide UBI would alter wages, prices, and employer behaviour in ways that individual pilots cannot capture. [Evidence: peer-reviewed macroeconomic UBI modelling literature (located via ScienceDirect), 2024, Tier 1 — macro models produce welfare gains in some specifications but are simulation-based, not experimental]
**Gap 4 — Heterogeneous effects by care-work status:** The Finnish Kela evaluation found a mild positive employment effect specifically for families with children. The ORUS sample, restricted to individuals aged 21–40, has some capacity to probe this but the household-partner spillover effects identified in Vivalt et al. (2024) suggest care reallocation may be meaningful and is underexplored. *Resolution path:* Requires a large RCT with explicit family-structure stratification.
**Gap 5 — Developing-country pilots since 2017:** The systematic review literature on developing-country cash transfers (Banerjee et al. review of seven programmes, cited in Vivalt et al., 2024) finds no systematic labour-supply effect, but most of that evidence predates 2017 and targets conditional programmes. [Gap]
---
## 7. PRACTICAL IMPLICATIONS
**Implication 1 — Decision-makers should treat the employment evidence as mixed but tilting negative at scale.** The largest and most rigorous RCT to date (ORUS, Vivalt et al., 2024, n=3,000, Tier 1) found a statistically significant reduction in labour market participation of approximately 2–4 percentage points among low-income recipients aged 21–40. This is the most credible single data point available. Decision-makers cannot responsibly cite the pooled +0.8pp average from the Corinth & Mayhew (2026) meta-synthesis as the primary finding without noting that it is driven by small, underpowered studies. [Confidence: HIGH on the ORUS direction; MEDIUM on magnitude, given COVID confound]
**Implication 2 — Wellbeing effects are the most robust and consistent finding across pilots.** If policy objectives include reduced financial stress, improved mental health, and greater economic security — outcomes documented in both Finland and ORUS Year 1 — the evidence base is more supportive. The tradeoff between modest employment reduction and measurable wellbeing gains is a values question that empirical research cannot resolve. [Inference: based on convergent findings across Kela, ORUS Miller et al. 2025, and SEED; Confidence: HIGH on the wellbeing finding itself]
**Implication 3 — Follow-up research priorities.** The three most important empirical investments are: (a) a large RCT in a post-pandemic labour market to disentangle COVID confounding; (b) longer-duration studies (5+ years) to test whether labour-supply effects grow with programme permanence; and (c) stratified analysis by household composition and care-work status, where heterogeneous positive effects may be concentrated.
---
## METHODOLOGY NOTE
**Mode used:** C (Hybrid) — four live web-search queries were executed on April 26, 2026, targeting the AEI/Corinth & Mayhew meta-synthesis, the ORUS/NBER employment paper, the Finnish Kela experiment, and the Stockton SEED study. Training-data knowledge was used only to contextualise and cross-reference; every factual claim is tagged to a live-search-located source.
**Tier coverage:** Tier 1 (NBER working papers, Kela/Finnish Ministry official evaluation, PMC peer-reviewed articles): primary evidentiary basis. Tier 2 (AEI working paper, MDPI systematic review, SEED white paper): used with stated caveats. Tier 3/4: not cited as evidence.
**Synthesis approach:** Divergent findings were not averaged. The size-weighted divergence between small and large pilots is treated as the central analytical finding rather than being obscured by pooled means. Bias observations are surfaced as methodological cautions, not as grounds for dismissal of any source.
Conducts structured research synthesis across multiple source types and applies six engineered protocol layers to produce findings that survive verification. Every claim carries provenance tags. Sources are weighed by tier authority. Conflicts between sources are mapped rather than silently resolved. Known biases are surfaced as observations.
