umwelt

LLMs pad their reasoning with filler words — "basically," "actually," "just," "really," "pretty much." This isn't stylistic. Filler words dilute reasoning chains and measurably reduce accuracy on tasks that require precise thinking.

Umwelt removes them.

npx @rodspeed/umwelt init

One command. +6.7pp accuracy. Validated across 15,600 controlled trials, 6 models, and 7 reasoning task types.

Why this works

When an LLM writes "this is basically equivalent," it has spent tokens on a hedge instead of a commitment. When it writes "this is equivalent," it must stand behind the claim — and it reasons more carefully to get there.

We tested this across Claude Sonnet 4, Claude Haiku 4.5, GPT-4o, GPT-4o Mini, Gemini 2.5 Pro, and Gemini 2.5 Flash Lite on tasks ranging from causal reasoning to ethical dilemmas to syllogisms. Banning 20 filler words improved accuracy on 5 of 6 models and 5 of 7 task types. The effect is strongest when models struggle — on harder tasks and weaker models, accuracy gains exceed +30 percentage points.

The mechanism isn't cognitive restructuring — it's regularization. Vocabulary bans disrupt default generation patterns and force the model to self-monitor, producing more deliberate reasoning. Shallow, semantically empty constraints outperform deep, theory-laden ones. The filler-word ban has zero logical content yet produces the largest effect.

This is not prompt engineering folklore. It is the first vocabulary constraint technique validated with active controls and statistical rigor.

What it does

Umwelt injects a vocabulary constraint into your project's CLAUDE.md. Claude Code reads this file at session start, so the constraint shapes every response — reasoning, code, explanations — without you thinking about it.

Profiles

| Profile | Effect | Description |
|---|---|---|
| neutral-ban | +6.7pp | Ban 20 filler words. Best general-purpose constraint. Default. |
| no-have | +5.4pp | Strip possessive "to have." Forces relational descriptions. Best on ethical reasoning (+18.1pp). |
| scaffold | +4.2pp | Metacognitive scaffolding. Forces structured reasoning. Dominates epistemic calibration (+18.5pp). |
| e-prime | +3.7pp | Strip all "to be" forms. Smallest effect, highest disruption cost. |
| combined | experimental | neutral-ban + no-have stacked. |

Effect sizes are deltas vs. the unconstrained control (83.0% baseline), measured on trials that were fully compliant on the first pass, across 6 models and 7 task types. All four constraints outperform the control, and the ranking inverts theoretical depth: the shallowest constraint produces the largest gain.
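For concreteness, the neutral-ban profile injects a block along these lines into CLAUDE.md. The wording below is illustrative, not the shipped text; the package's actual word list and phrasing may differ:

```markdown
<!-- umwelt:start -->
<!-- Profile: neutral-ban (managed by umwelt; do not edit by hand) -->
Vocabulary constraint: do not use filler words such as "basically",
"actually", "just", "really", or "pretty much" in any response:
reasoning, code comments, or explanations. State claims directly.
<!-- umwelt:end -->
```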

Commands

npx @rodspeed/umwelt init            # Set up with neutral-ban default
npx @rodspeed/umwelt set no-have     # Switch to a different profile
npx @rodspeed/umwelt set scaffold    # Metacognitive scaffolding
npx @rodspeed/umwelt list            # Show all profiles with experiment data
npx @rodspeed/umwelt status          # Show active profile
npx @rodspeed/umwelt off             # Disable without removing
npx @rodspeed/umwelt on              # Re-enable

How it works

umwelt init writes a constraint block into your project's CLAUDE.md between <!-- umwelt:start --> and <!-- umwelt:end --> markers. Claude Code reads CLAUDE.md at session start, so the constraint applies to every response.

Switching profiles swaps the text between the markers. Turning umwelt off replaces the block with a disabled notice. Your existing CLAUDE.md content is preserved.

Profiles are also copied to .umwelt/ in your project, so you can customize them.

Choosing a profile

The default (neutral-ban) is the right choice for most work. If you need to pick by task:

  • General coding and reasoning: neutral-ban — broadest gains, highest compliance
  • Ethical reasoning, classification: no-have — strongest on tasks requiring relational thinking
  • Epistemic calibration, uncertainty reasoning: scaffold — dominates when the task requires weighing evidence
  • Causal reasoning, debugging: e-prime — helps with process-oriented tasks, but has the lowest overall effect

The ranking follows a principle: constraints that ban high-frequency, semantically empty words outperform constraints that ban low-frequency or semantically loaded words. The optimal constraint maximizes self-monitoring occasions per unit of surface reformulation.
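Checking a response against a word ban reduces to scanning for banned tokens. A minimal sketch (the word list below is only the subset of the 20 banned words named earlier in this README, not the full shipped list):

```javascript
// Subset of the banned filler words mentioned above (illustrative,
// not the complete 20-word neutral-ban list).
const BANNED = ["basically", "actually", "just", "really", "pretty much"];

// Return each banned word found in a model response, with its count,
// matching whole words case-insensitively.
function findViolations(text) {
  const violations = {};
  for (const word of BANNED) {
    const matches = text.match(new RegExp(`\\b${word}\\b`, "gi"));
    if (matches) violations[word] = matches.length;
  }
  return violations;
}
```

A check like this is what makes "first-pass compliance" measurable: a response counts as compliant when the scan comes back empty.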

A note on E-Prime

E-Prime (English without "to be") improves accuracy overall (+3.7pp), but it has the smallest effect of any constraint and the highest disruption cost — 48% of responses require retries to achieve compliance. It helps causal reasoning and epistemic calibration but underperforms on analogical reasoning and classification.

Use npx @rodspeed/umwelt set e-prime deliberately, not as a default.

Research

Every default in this tool traces back to data, not intuition. The profiles are grounded in a 15,600-trial controlled experiment with 5 conditions, 6 models, 7 task types, active controls, and pre-registered statistical analysis (FDR-corrected pairwise comparisons, Cohen's h effect sizes).

The neutral word ban was not our hypothesis going in — E-Prime was. The data said otherwise. We shipped what the evidence supported.

Compatibility

Currently targets Claude Code (via CLAUDE.md injection). Support for Cursor (.cursorrules), Windsurf (.windsurfrules), and other AI coding tools is planned.

Requires Node.js 18+.

The name

Umwelt (German: "surrounding world") is a term from theoretical biology for the perceptual world an organism inhabits — not the objective environment, but the slice of it the organism can sense and act on. A tick's umwelt is warmth, butyric acid, and gravity. A bat's umwelt is echolocation returns.

An LLM's umwelt is its vocabulary. Constrain the vocabulary and you reshape the world the model reasons within. That's what this tool does.

License

MIT
