A high-performance CSV file viewer and processor built with Rust, supporting large files up to GB scale.
- **High Performance**: Built with Rust; 15-100x performance improvement
- **Large File Support**: Memory mapping and sparse indexing support GB-scale files
- **Fast Navigation**: O(log n) page jumps with millisecond-level response
- **Memory Efficient**: Memory mapping and zero-copy parsing, 2-4x lower memory usage
- **Smart Caching**: LRU page cache with index persistence
- **Modern GUI**: Tauri + React interface (optional)
- **Cross-Platform**: Native support for Windows/Linux/macOS
Build the CLI tool:

```shell
cargo build --release
.\target\release\csv-tool.exe data.csv
```

Build the GUI app:

```shell
# Set up the environment
.\setup_gui_fixed.bat
# Build the EXE
.\build.bat
# Run the generated EXE (quoted because the name contains a space)
& ".\tauri\target\release\CSV Tool.exe"
```

Install Rust and build from source:

```shell
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Clone and run
git clone https://github.com/ziyefbk/csv_tool.git
cd csv_tool
cargo build --release
./target/release/csv-tool data.csv
```

View data:

```shell
# View first page (default)
csv-tool data.csv
# View a specific page
csv-tool data.csv -p 2
# Custom page size
csv-tool data.csv -p 2 -s 50
# Custom delimiter
csv-tool data.csv -d ';'
```

Show file details:

```shell
csv-tool data.csv info
```

Search:

```shell
# Basic search
csv-tool data.csv search "keyword"
# Case-insensitive search
csv-tool data.csv search "keyword" -i
# Regex search
csv-tool data.csv search "pattern" -r
# Search in a specific column
csv-tool data.csv search "keyword" -c "Column Name"
# Count matches only
csv-tool data.csv search "keyword" --count
# Limit results
csv-tool data.csv search "keyword" -m 100
```

Sort:

```shell
# Sort by column (ascending)
csv-tool data.csv sort -c "Column Name" --order asc
# Sort by column (descending)
csv-tool data.csv sort -c "Column Name" --order desc
# Auto-detect the data type
csv-tool data.csv sort -c "Column Name" --data-type auto
# Case-insensitive sort
csv-tool data.csv sort -c "Column Name" --ignore-case
```

Export:

```shell
# Export to JSON
csv-tool data.csv export output.json --format json
# Export to CSV
csv-tool data.csv export output.csv --format csv
# Export to TSV
csv-tool data.csv export output.tsv --format tsv
# Export specific columns
csv-tool data.csv export output.json --format json -c "Col1,Col2,Col3"
# Export a row range
csv-tool data.csv export output.json --format json --from 10 --to 20
```

Edit:

```shell
# Edit a cell value
csv-tool data.csv edit "set 1 2 NewValue"
# Delete a row
csv-tool data.csv edit "delete-row 5"
# Append a row
csv-tool data.csv edit "append-row value1,value2,value3"
# Delete a column
csv-tool data.csv edit "delete-col ColumnName"
# Rename a column
csv-tool data.csv edit "rename-col OldName NewName"
```

Create new files:

```shell
# Create a CSV file with headers
csv-tool create new.csv --headers "Column1,Column2,Column3"
# Create with initial rows
csv-tool create new.csv --headers "Col1,Col2,Col3" --rows "val1,val2,val3"
```

Using the GUI:

- Build the application (see Quick Start above)
- Run the EXE: double-click `CSV Tool.exe`
- Open a CSV file: click the "Open CSV File" button
- Browse data: use the pagination controls to navigate
- Search: use the search box to filter data in real time
| File Size | Standard Open | Fast Open | Improvement |
|---|---|---|---|
| 10k rows (~1MB) | 3.6 ms | 2.6 ms | 1.4x |
| 100k rows (~10MB) | 23 ms | 19 ms | 1.2x |
| 500k rows (~50MB) | 96 ms | 2.5 ms | 38x |
| Operation | Time |
|---|---|
| Read first page | 37 µs |
| Read middle page | 40 µs |
| Read last page | 63 µs |
| File Size | Before | After | Reduction |
|---|---|---|---|
| 1 GB | 1 GB+ | <50 MB | 20x |
```
csv-tool/
├── src/                  # Rust core library
│   ├── main.rs           # CLI entry point
│   ├── lib.rs            # Library entry
│   ├── error.rs          # Error types
│   └── csv/              # Core modules
│       ├── reader.rs     # High-performance reader (mmap + index)
│       ├── index.rs      # Sparse row index + sampling
│       ├── cache.rs      # LRU page cache
│       ├── search.rs     # Search functionality
│       ├── sort.rs       # Sort functionality
│       ├── export.rs     # Export functionality
│       ├── writer.rs     # Edit/write functionality
│       └── utils.rs      # Utility functions
├── frontend/             # React frontend
│   └── src/
│       ├── App.tsx
│       ├── components/   # UI components
│       ├── api/          # Tauri API calls
│       └── stores/       # State management
├── tauri/                # Tauri backend
│   └── src/main.rs       # GUI API
├── tests/                # Integration tests (40+ tests)
├── benches/              # Performance benchmarks
└── docs/                 # Documentation
```
Key dependencies:

```toml
memmap2 = "0.9"    # Memory mapping (core optimization)
memchr = "2.7"     # SIMD-accelerated string search
rayon = "1.8"      # Parallel processing
csv = "1.3"        # CSV parsing
lru = "0.12"       # LRU cache
bincode = "1.3"    # Index serialization
regex = "1.10"     # Regular expressions
clap = "4.5"       # CLI argument parsing
thiserror = "1.0"  # Error types
```

Core optimizations:

- Memory Mapping (mmap): OS-level file mapping with on-demand loading
- Sparse Indexing: record a byte offset every N rows for O(log n) row lookup
- Zero-Copy Parsing: fields reference mmap data directly, reducing allocations
- Index Persistence: auto-save the index to `.csv.idx`, 20-40x faster on reopen
- Fast Open Mode: row-sampling estimation, progressive indexing, async build support
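The sparse-index idea above can be sketched in a few lines: record a byte offset at every Nth row boundary, then locate any row from the nearest checkpoint with a short forward scan. This is an illustrative, self-contained version that operates on an in-memory byte slice rather than an mmap; `STRIDE` and the function names are assumptions, not the tool's actual API:

```rust
// Sparse indexing sketch: store the byte offset of every STRIDE-th row,
// then find any row by jumping to the preceding checkpoint and scanning
// forward at most STRIDE - 1 newlines. Illustrative only; the real
// reader in index.rs works over an mmap'd file.

const STRIDE: usize = 4; // index every 4th row (the real N is larger)

/// Build the sparse index: offsets[i] = byte offset of row i * STRIDE.
fn build_sparse_index(data: &[u8]) -> Vec<usize> {
    let mut offsets = vec![0usize];
    let mut row = 0usize;
    for (pos, &b) in data.iter().enumerate() {
        if b == b'\n' {
            row += 1;
            if row % STRIDE == 0 && pos + 1 < data.len() {
                offsets.push(pos + 1);
            }
        }
    }
    offsets
}

/// Byte offset of `target` row: with a uniform stride the checkpoint is
/// target / STRIDE; a short scan covers the remaining rows.
fn row_offset(data: &[u8], index: &[usize], target: usize) -> usize {
    let slot = (target / STRIDE).min(index.len() - 1);
    let mut pos = index[slot];
    let mut row = slot * STRIDE;
    while row < target {
        pos += data[pos..].iter().position(|&b| b == b'\n').unwrap() + 1;
        row += 1;
    }
    pos
}

fn main() {
    let data = b"r0\nr1\nr2\nr3\nr4\nr5\nr6\nr7\n";
    let index = build_sparse_index(data);
    // Every row is 3 bytes ("rN\n"), so row 5 starts at byte 15.
    assert_eq!(row_offset(data, &index, 5), 15);
    println!("row 5 starts at byte {}", row_offset(data, &index, 5));
}
```

The work per lookup is bounded by the stride, so page jumps stay cheap no matter how far into the file they land.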
For large files, the tool uses smart sampling and progressive indexing:
- Row Sampling: Sample first 1MB to estimate total rows
- Progressive Index: Only index first 2000 rows initially
- Async Build: Background thread continues building full index
- Result: <100ms response time for files of any size!
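The row-sampling step above can be sketched as follows. The 1 MB sample size comes from the list; the function name and exact formula are assumptions for illustration:

```rust
// Row-count estimation by sampling: count newlines in the first chunk,
// derive the average row length, and extrapolate to the whole file.
// This avoids scanning a multi-GB file just to report a total.

const SAMPLE_BYTES: usize = 1024 * 1024; // sample the first 1 MB

fn estimate_total_rows(data: &[u8]) -> usize {
    let sample = &data[..data.len().min(SAMPLE_BYTES)];
    let rows_in_sample = sample.iter().filter(|&&b| b == b'\n').count();
    if rows_in_sample == 0 {
        // No newline in the sample: empty file or a single huge row.
        return if data.is_empty() { 0 } else { 1 };
    }
    let avg_row_len = sample.len() as f64 / rows_in_sample as f64;
    (data.len() as f64 / avg_row_len).round() as usize
}

fn main() {
    // Ten uniform 4-byte rows: the estimate is exact for uniform data.
    let data = b"abc\n".repeat(10);
    println!("estimated rows: {}", estimate_total_rows(&data));
}
```

The estimate is only as good as the sample's representativeness, which is why the background thread still builds the exact index afterwards.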
Indexes are automatically saved to .csv.idx files:
- Validated against file size and modification time
- Loaded automatically on next open
- 20-40x faster than rebuilding
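The validation check might look roughly like this. The struct and field names are hypothetical; the tool serializes its real index with bincode:

```rust
// Index-file validation sketch: a persisted index is reused only if the
// CSV's current size and modification time match the values recorded
// when the index was built. Struct and field names are illustrative.

use std::fs;
use std::time::SystemTime;

struct PersistedIndex {
    file_len: u64,
    modified: SystemTime,
    // ... row offsets would follow
}

fn index_is_valid(idx: &PersistedIndex, csv_path: &str) -> std::io::Result<bool> {
    let meta = fs::metadata(csv_path)?;
    Ok(meta.len() == idx.file_len && meta.modified()? == idx.modified)
}

fn main() -> std::io::Result<()> {
    let path = "demo.csv";
    fs::write(path, b"a,b\n1,2\n")?;
    let meta = fs::metadata(path)?;
    let idx = PersistedIndex { file_len: meta.len(), modified: meta.modified()? };
    assert!(index_is_valid(&idx, path)?);
    fs::write(path, b"a,b\n1,2\n3,4\n")?; // file changed: index is stale
    assert!(!index_is_valid(&idx, path)?);
    fs::remove_file(path)?;
    println!("index validation works");
    Ok(())
}
```

A stale index is simply discarded and rebuilt, so an out-of-date `.csv.idx` can never serve wrong offsets.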
Run the tests:

```shell
# Run all tests
cargo test
# Run integration tests
cargo test --test integration_test
# Run benchmarks
cargo bench
```

Detailed documentation lives in docs/:

- USAGE.md - Complete usage guide
- PERFORMANCE.md - Performance analysis
- TECHNICAL_ASSESSMENT.md - Technical details
- QUICK_REFERENCE.md - Quick reference
Implemented:

- High-performance CSV reading (mmap + sparse index)
- Fast open mode (sampling + progressive indexing)
- Index persistence (`.csv.idx` files)
- LRU page cache
- Zero-copy parsing
- Modern GUI (Tauri + React)
- Search (text, regex, column filter)
- Sort (multiple data types)
- Export (JSON, CSV, TSV)
- Edit (cells, rows, columns)
- Create new files
- Comprehensive tests (40+ tests)
- Performance benchmarks

Planned:

- Virtual scrolling for very large tables
- Multi-file tab support
- Column statistics
- Data visualization
- Plugin system
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License
Built with ❤️ using Rust