A fast and comprehensive Python library for expanding English contractions.
- ⚡ Fast: ~112K ops/sec for typical text expansion (Aho-Corasick algorithm)
- 📚 Comprehensive: Handles standard contractions, slang, and custom additions
- 🎯 Smart: Preserves case and resolves ambiguous contractions to their most common expansion
- 🔧 Flexible: Easy to add custom contractions on the fly
- 🐍 Modern: Supports Python 3.10+
pip install sane-contractions
uv pip install sane-contractions
import contractions
contractions.expand("you're happy now")
# "you are happy now"
contractions.expand("I'm sure you'll love it!")
# "I am sure you will love it!"
# Shorthand aliases
contractions.e("you're") # "you are"
contractions.p("you're", 5) # preview with context
import contractions
text = "I'm sure you're going to love what we've done"
expanded = contractions.expand(text)
print(expanded)
# "I am sure you are going to love what we have done"
contractions.expand("yall're gonna love this", slang=True)
# "you all are going to love this"
contractions.expand("yall're gonna love this", slang=False)
# "yall are going to love this"
contractions.expand("yall're gonna love this", leftovers=False)
# "yall are gonna love this"
The library intelligently preserves the case pattern of the original contraction:
contractions.expand("you're happy") # "you are happy"
contractions.expand("You're happy") # "You are happy"
contractions.expand("YOU'RE HAPPY") # "YOU ARE HAPPY"
Add a single contraction:
contractions.add('myword', 'my word')
contractions.expand('myword is great')
# "my word is great"
Add multiple contractions at once:
custom_contractions = {
"ain't": "are not",
"gonna": "going to",
"wanna": "want to",
"customterm": "custom expansion"
}
contractions.add_dict(custom_contractions)
contractions.expand("ain't gonna happen")
# "are not going to happen"
Load contractions from a JSON file:
# custom_contractions.json contains: {"myterm": "my expansion", "another": "another word"}
contractions.load_file("custom_contractions.json")
contractions.expand("myterm is great")
# "my expansion is great"
Load all JSON files from a folder:
# Load all *.json files from a directory (ignores non-JSON files)
contractions.load_folder("./my_contractions/")
contractions.expand("myterm is great")
# "my expansion is great"
The preview() function lets you see all contractions in a text before expanding them:
text = "I'd love to see what you're thinking"
preview = contractions.preview(text, context_chars=10)
for item in preview:
print(f"Found '{item['match']}' at position {item['start']}")
print(f"Context: {item['viewing_window']}")
# Output:
# Found 'I'd' at position 0
# Context: I'd love to
# Found 'you're' at position 21
# Context: what you're thinkin
expand(text, leftovers=True, slang=True): Expands contractions in the given text.
Parameters:
- text (str): The text to process
- leftovers (bool): Whether to expand leftover contractions (default: True)
- slang (bool): Whether to expand slang terms (default: True)
Returns: str - Text with contractions expanded
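The case handling that expand() applies to lowercase, capitalized, and all-caps input can be illustrated with a small standalone sketch. The helper name `match_case` is hypothetical and is not part of the library's API; the library's actual rules may differ, for example for mixed-case input.

```python
def match_case(original: str, expansion: str) -> str:
    """Apply the case pattern of `original` to `expansion` (hypothetical helper)."""
    letters = [ch for ch in original if ch.isalpha()]
    # A lone capital letter (e.g. "I") counts as capitalized, not as all caps.
    if len(letters) > 1 and all(ch.isupper() for ch in letters):
        return expansion.upper()
    if letters and letters[0].isupper() and expansion:
        return expansion[0].upper() + expansion[1:]
    return expansion


print(match_case("you're", "you are"))  # you are
print(match_case("You're", "you are"))  # You are
print(match_case("YOU'RE", "you are"))  # YOU ARE
```

Checking for all caps before checking the first letter matters: every all-caps word also starts with a capital, so the order of the two tests decides which pattern wins.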
add(key, value): Adds a single custom contraction.
Parameters:
- key (str): The contraction to match
- value (str): The expansion
add_dict(dictionary): Adds multiple custom contractions at once.
Parameters:
- dictionary (dict): Dictionary mapping contractions to their expansions
load_file(filepath): Loads custom contractions from a JSON file.
Parameters:
- filepath (str): Path to JSON file containing contraction mappings
Raises:
- FileNotFoundError: If the file doesn't exist
- json.JSONDecodeError: If the file contains invalid JSON
load_folder(folderpath): Loads custom contractions from all JSON files in a directory. Non-JSON files are automatically ignored.
Parameters:
- folderpath (str): Path to directory containing JSON files
Raises:
- FileNotFoundError: If the folder doesn't exist
- NotADirectoryError: If the path is a file, not a directory
- ValueError: If no JSON files are found in the folder
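The documented behavior of load_folder() (only *.json files are read, and three error cases are raised) could be approximated with the standard library alone. The function name `read_contraction_folder`, the sorted merge order, and the rule that later files override earlier keys are assumptions made for illustration; this is not the library's implementation.

```python
import json
import tempfile
from pathlib import Path


def read_contraction_folder(folderpath: str) -> dict[str, str]:
    """Collect contraction mappings from every *.json file in a folder.

    Hypothetical helper, not part of the library's API.
    """
    folder = Path(folderpath)
    if not folder.exists():
        raise FileNotFoundError(f"Folder not found: {folderpath}")
    if not folder.is_dir():
        raise NotADirectoryError(f"Not a directory: {folderpath}")

    # Sorting makes the merge order deterministic; non-JSON files never match the glob.
    json_files = sorted(folder.glob("*.json"))
    if not json_files:
        raise ValueError(f"No JSON files found in: {folderpath}")

    merged: dict[str, str] = {}
    for path in json_files:
        with path.open(encoding="utf-8") as handle:
            merged.update(json.load(handle))  # assumption: later files override earlier keys
    return merged


# Demonstration with a throwaway folder holding one JSON file and one text file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "custom.json").write_text('{"myterm": "my expansion"}', encoding="utf-8")
    Path(tmp, "notes.txt").write_text("ignored", encoding="utf-8")
    loaded = read_contraction_folder(tmp)

print(loaded)  # {'myterm': 'my expansion'}
```

A dictionary collected this way could then be passed to add_dict() in one call.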
preview(text, context_chars): Previews contractions in text before expanding.
Parameters:
- text (str): The text to analyze
- context_chars (int): Number of characters to show before/after each match
Returns: list[dict] - List of matches with context information
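As a rough illustration of what a context window means here, the sketch below finds terms with a regular expression and slices `context_chars` characters on each side of every match. It is not how the library detects contractions: `preview_sketch`, its `terms` argument, and the `end` key are hypothetical, while `match`, `start`, and `viewing_window` mirror the keys used in the usage example earlier in this README. The windows it produces may differ from the library's own output, and it ignores word boundaries for simplicity.

```python
import re


def preview_sketch(text: str, terms: list[str], context_chars: int) -> list[dict]:
    """Find each term in `text` and return it with surrounding context.

    Hypothetical helper: `terms` stands in for the library's contraction dictionary.
    """
    if not terms:
        return []
    # Longest terms first so that a longer contraction wins over its own prefix.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered), re.IGNORECASE)

    results = []
    for found in pattern.finditer(text):
        # Clamp the window so it never runs past either end of the text.
        window_start = max(0, found.start() - context_chars)
        window_end = min(len(text), found.end() + context_chars)
        results.append(
            {
                "match": found.group(),
                "start": found.start(),
                "end": found.end(),  # assumption: not a documented key
                "viewing_window": text[window_start:window_end],
            }
        )
    return results


text = "I'd love to see what you're thinking"
for item in preview_sketch(text, ["I'd", "you're"], context_chars=10):
    print(item["start"], repr(item["viewing_window"]))
```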
e(): Shorthand alias for expand().
p(): Shorthand alias for preview().
you're -> you are
I'm -> I am
we'll -> we will
it's -> it is
they've -> they have
gonna -> going to
wanna -> want to
gotta -> got to
yall -> you all
ain't -> are not
jan. -> january
feb. -> february
mar. -> march
For ambiguous contractions, the library uses the most common expansion:
he's -> he is (not "he has")
The library uses the Aho-Corasick algorithm for efficient string matching, achieving:
- ~112K ops/sec for typical text expansion (short texts with contractions)
- ~251K ops/sec for preview operations (contraction detection)
- ~17K ops/sec for medium texts with no contractions
- ~13K ops/sec for slang-heavy texts
- ~278K ops/sec for adding custom contractions
Benchmarked on Apple M3 Max, Python 3.13.
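For readers unfamiliar with Aho-Corasick, the following is a minimal, independent sketch of the algorithm: patterns go into a trie, failure links are computed breadth-first, and the text is then scanned once regardless of how many patterns there are. It is not the code used by this library or by textsearch, and the function names are illustrative.

```python
from collections import deque


def build_automaton(patterns: list[str]):
    """Build the trie, failure links, and output lists for Aho-Corasick."""
    goto: list[dict[str, int]] = [{}]  # goto[state][char] -> next state
    fail: list[int] = [0]              # state to fall back to on a mismatch
    out: list[list[str]] = [[]]        # patterns that end at each state

    # 1. Insert every pattern into the trie.
    for pattern in patterns:
        state = 0
        for char in pattern:
            if char not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][char] = len(goto) - 1
            state = goto[state][char]
        out[state].append(pattern)

    # 2. Breadth-first pass: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for char, child in goto[state].items():
            queue.append(child)
            fallback = fail[state]
            while fallback and char not in goto[fallback]:
                fallback = fail[fallback]
            fail[child] = goto[fallback].get(char, 0)
            # Inherit patterns that end at the fallback state (proper suffixes).
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def find_all(text: str, patterns: list[str]) -> list[tuple[int, str]]:
    """Return (start_index, pattern) for every occurrence in a single pass."""
    goto, fail, out = build_automaton(patterns)
    matches = []
    state = 0
    for index, char in enumerate(text):
        while state and char not in goto[state]:
            state = fail[state]
        state = goto[state].get(char, 0)
        for pattern in out[state]:
            matches.append((index - len(pattern) + 1, pattern))
    return matches


print(find_all("you're sure we'll go", ["you're", "we'll", "'ll"]))
# [(0, "you're"), (12, "we'll"), (14, "'ll")]
```

Note that overlapping patterns are all reported ("we'll" and "'ll" both match); a contraction expander would additionally need a rule, such as preferring the longest match, to pick one.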
Run performance benchmarks yourself:
# Create virtual environment and install
uv venv && source .venv/bin/activate
uv pip install -e .
# Run benchmarks
python tests/test_performance.py
- Python 3.10 or higher
- textsearch >= 0.0.21
Contributions are welcome! Please feel free to submit a Pull Request.
git clone https://github.com/devjerry0/sane-contractions
cd sane-contractions
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pytest tests/ --cov=contractions --cov-report=term-missing
ruff check .
mypy contractions/ tests/
This fork includes several enhancements over the original contractions library:
- add_dict() - Bulk add custom contractions from a dictionary
- load_file() - Load contractions from JSON files
- load_folder() - Load all JSON files from a directory
- Type hints - Full type coverage with mypy validation
- Better structure - Modular code organization with single-responsibility modules
- Facade API - Clean, simple public API with shorthand aliases (e(), p())
- Lazy-loaded TextSearch instances (30x faster imports)
- Optimized dictionary operations and comprehensions
- Eliminated redundant code paths
- Reduced function call overhead
- 100% test coverage enforced via CI/CD
- Comprehensive tests including edge cases
- Input validation and error handling tests
- Performance benchmarking suite
- Python 3.10+ support (modern type hints with list[dict], etc.)
- Ruff for fast linting (replaces black, flake8, isort)
- Mypy for strict type checking
- GitHub Actions CI/CD with concurrency control
- Automated PyPI publishing via Git tags
- uv support for fast dependency management
- Comprehensive README with real benchmark results
- Complete API reference with examples
- Clear contributing guidelines
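The lazy-loading item in the list above can be pictured as follows. This is a sketch under assumptions, not the fork's actual code: `SlowMatcher` is a hypothetical stand-in for an expensive-to-build matcher, and `get_matcher` and `expand_sketch` are illustrative names. The idea is that nothing is constructed at import time; the first call pays the build cost and later calls reuse the cached instance.

```python
from functools import lru_cache


class SlowMatcher:
    """Stand-in for an object that is expensive to build (hypothetical)."""

    instances_built = 0

    def __init__(self) -> None:
        SlowMatcher.instances_built += 1
        self.mapping = {"you're": "you are"}

    def replace(self, text: str) -> str:
        for short, full in self.mapping.items():
            text = text.replace(short, full)
        return text


@lru_cache(maxsize=None)
def get_matcher() -> SlowMatcher:
    """Build the matcher on first use, then return the cached instance."""
    return SlowMatcher()


def expand_sketch(text: str) -> str:
    return get_matcher().replace(text)


built_before = SlowMatcher.instances_built       # nothing built yet
first = expand_sketch("you're here")             # first call builds the matcher
second = expand_sketch("you're there")           # second call reuses it
print(built_before, SlowMatcher.instances_built, first, second)
```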
This is an enhanced fork of the original contractions library by Pascal van Kooten, with improvements in performance, testing, type safety, and maintainability.
The original library is excellent but has been unmaintained since 2021. This fork provides:
- Active maintenance
- Modern Python practices
- Community contributions
- Regular updates
MIT License - see LICENSE file for details.
Original Author: Pascal van Kooten (@kootenpv)
Fork Maintainer: Jeremy Bruns
Original Repository: https://github.com/kootenpv/contractions
This project would not exist without Pascal's excellent foundation. All credit for the core concept and initial implementation goes to the original author.