Smart and fast language detection for Python.
lang-lens/
├── data/ # Contains the WiLI-2018 dataset and the Khan preprocessed subset
├── experiments/ # Jupyter notebooks for experiments
├── langlens/ # The Python package
│ ├── configuration/ # App YAML config and logging config
│ ├── data.py # Load (cleaned) data into splits
│ ├── evaluation.py # Evaluate model (classification report, confusion matrix, PCA plot)
│ ├── vectorizer.py # Vectorize text data
│ ├── main.py # Click command-line interface
├── report/ # Milestone reports
│ ├── figures/ # Figures for the reports
│ ├── sources/ # Papers etc.
├── tests/ # Tests
├── pyproject.toml # Defines dependencies and project configuration
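To make the vectorize-then-classify idea behind the layout above concrete, here is a minimal, standard-library-only sketch of character n-gram language detection. It is an illustration under stated assumptions, not the code in `langlens/vectorizer.py`: the function names (`char_ngrams`, `cosine`, `detect`), the trigram size, and the two toy training sentences are all hypothetical and not taken from the project or from WiLI-2018.

```python
from collections import Counter
import math


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy training samples, made up for illustration (not from the project's data).
SAMPLES = {
    "eng": "the quick brown fox jumps over the lazy dog and the cat",
    "deu": "der schnelle braune fuchs springt über den faulen hund und die katze",
}

# One n-gram profile per language label.
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}


def detect(text: str) -> str:
    """Return the label whose n-gram profile is most similar to the text."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))


if __name__ == "__main__":
    print(detect("the dog and the fox"))
    print(detect("der hund und die katze"))
```

A real pipeline would likely swap the hand-rolled profiles for a trained classifier over a much larger corpus; the sketch only shows why character n-grams carry enough signal to separate languages.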
This project uses Poetry for dependency management.
To install the package, run:
poetry install
To see available commands and usage, run:
poetry run langlens --help
To execute the tests, run:
poetry run pytest tests/