The Bridge Pool Assignment Parser is a Rust library designed to parse and store Tor bridge pool assignment data from text files. It provides functionality to:
- Parse assignment files with flexible timestamp formats
- Store parsed data in SQLite databases
- Export parsed data to CSV
- Handle various assignment data formats
- Validate fingerprints and other fields
- Rust (stable version, recommended: 1.70 or later)
- Cargo package manager
Add to your Cargo.toml:
[dependencies]
bridge-pool-assignment-parser = { git = "link/to/repo" }
Or for local development, clone the repository:
git clone https://link/to/repo
cd TOR-Bridge-Pool-Assignment-Parser
cargo build
The parser supports a specific input format for Tor bridge pool assignments:
@type bridge-pool-assignment 1.1
bridge-pool-assignment 2022-04-09 00:29:37
005fd4d7decbb250055b861579e6fdc79ad17bee email transport=obfs4 ip=192.168.1.1 blocklist=ru distributed=true state=functional bandwidth=1024 ratio=1.902
00782946f4c54ce1d028f21e541ef8440ecaa0ee https transport=vanilla ip=10.0.0.1 distributed=false state=blocked bandwidth=2048 ratio=0.5
- First line: Type declaration (optional)
- Second line: Timestamp in various accepted formats
- Subsequent lines: Assignment entries with:
  - 40-character hexadecimal fingerprint
  - Distribution method (email, https, moat, etc.)
  - Optional key-value pairs:
    - transport
    - ip
    - blocklist
    - distributed
    - state
    - bandwidth
    - ratio
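To make the entry layout concrete, here is a minimal, std-only sketch of how one assignment line could be split into its parts. It is not the crate's implementation: `ParsedLine`, `is_valid_fingerprint`, and `parse_assignment_line` are hypothetical names used only for illustration, and keeping the key-value pairs as strings in a map is an assumption.

```rust
use std::collections::HashMap;

/// One parsed assignment line (hypothetical type, for this sketch only).
#[derive(Debug, PartialEq)]
struct ParsedLine {
    fingerprint: String,
    distribution_method: String,
    params: HashMap<String, String>,
}

/// Returns true if `s` is exactly 40 ASCII hexadecimal characters.
fn is_valid_fingerprint(s: &str) -> bool {
    s.len() == 40 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Splits a line into fingerprint, distribution method, and key=value pairs.
fn parse_assignment_line(line: &str) -> Result<ParsedLine, String> {
    let mut tokens = line.split_whitespace();

    let fingerprint = tokens.next().ok_or("missing fingerprint")?;
    if !is_valid_fingerprint(fingerprint) {
        return Err(format!("invalid fingerprint: {fingerprint}"));
    }

    let distribution_method = tokens.next().ok_or("missing distribution method")?;

    let mut params = HashMap::new();
    for token in tokens {
        // Assumption: tokens without '=' are silently skipped in this sketch.
        if let Some((key, value)) = token.split_once('=') {
            params.insert(key.to_string(), value.to_string());
        }
    }

    Ok(ParsedLine {
        fingerprint: fingerprint.to_string(),
        distribution_method: distribution_method.to_string(),
        params,
    })
}
```

Calling `parse_assignment_line` on the first sample entry above would yield the fingerprint, the method `email`, and a map with seven keys. An empty line or a first token that is not 40 hexadecimal characters returns an `Err`.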
use bridge_pool_assignment_parser::{BridgePoolAssignmentParser, BridgePoolDatabase};
use std::path::Path;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Parse file with no assignment limit
let (assignments_file, assignments) = BridgePoolAssignmentParser::parse_file(
Path::new("path/to/assignments.txt"),
None
)?;
println!("Parsed {} assignments", assignments.len());
// Optional: Store in SQLite database
let db = BridgePoolDatabase::new("bridge_assignments.db")?;
db.insert_assignments_file(&assignments_file)?;
for assignment in &assignments {
db.insert_assignment(assignment)?;
}
Ok(())
}

// Limit to first 10 assignments
let (assignments_file, assignments) = BridgePoolAssignmentParser::parse_file(
Path::new("path/to/assignments.txt"),
Some(10)
)?;

The crate includes a binary for direct file parsing:
cargo run --bin parse_bridge_assignments -- --input assignments.txt --database output.db --csv-output assignments.csv --max-assignments 100

// Access fields from the first assignment
if let Some(first_assignment) = assignments.first() {
println!("Fingerprint: {}", first_assignment.fingerprint);
println!("Distribution method: {}", first_assignment.distribution_method);
if let Some(transport) = &first_assignment.transport {
println!("Transport: {}", transport);
}
if let Some(bandwidth) = first_assignment.bandwidth {
println!("Bandwidth: {}", bandwidth);
}
}

The parser supports multiple timestamp formats:
- 2024-03-05 12:34:56
- 2024-03-05 12:34
- 2024-03-05T12:34:56Z
- 2024-03-05
- bridge-pool-assignment 2022-04-09 00:29:37 (timestamp prefixed by the header keyword)
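One way to accept all of these formats is to normalize the input before reading the fields. The std-only sketch below illustrates that idea; it is not how the crate parses timestamps. `Timestamp`, `parse_fields`, and `parse_timestamp` are hypothetical names, and defaulting missing time fields to zero and ignoring the time zone marker are assumptions.

```rust
/// A calendar timestamp as plain integers (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Timestamp {
    year: u32,
    month: u32,
    day: u32,
    hour: u32,
    minute: u32,
    second: u32,
}

/// Splits `s` on `sep` and parses every piece as an unsigned integer.
/// Returns None if any piece is empty or contains a non-digit.
fn parse_fields(s: &str, sep: char) -> Option<Vec<u32>> {
    s.split(sep)
        .map(|part| {
            if !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit()) {
                part.parse::<u32>().ok()
            } else {
                None
            }
        })
        .collect()
}

/// Accepts the formats listed above and returns None for anything else.
fn parse_timestamp(input: &str) -> Option<Timestamp> {
    // Drop the optional header keyword and a trailing 'Z'.
    let s = input.trim();
    let s = s.strip_prefix("bridge-pool-assignment").unwrap_or(s).trim();
    let s = s.strip_suffix('Z').unwrap_or(s);

    // Separate the date from the optional time part ('T' or space).
    let (date, time) = match s.split_once(|c: char| c == 'T' || c == ' ') {
        Some((d, t)) => (d, Some(t)),
        None => (s, None),
    };

    let (year, month, day) = match parse_fields(date, '-')?.as_slice() {
        &[y, m, d] => (y, m, d),
        _ => return None,
    };

    // Assumption: missing time fields default to zero.
    let (hour, minute, second) = match time {
        None => (0, 0, 0),
        Some(t) => match parse_fields(t, ':')?.as_slice() {
            &[h, m, sec] => (h, m, sec),
            &[h, m] => (h, m, 0),
            _ => return None,
        },
    };

    // Range checks only; days per month are not validated here.
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    if hour > 23 || minute > 59 || second > 59 {
        return None;
    }

    Some(Timestamp { year, month, day, hour, minute, second })
}
```

A production parser would more likely try a list of format strings with a date-time library, which also validates calendar dates. The sketch only shows that the listed formats share one date shape with an optional time part.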
The library uses two main structs:
// Represents the assignment file metadata
pub struct BridgePoolAssignmentsFile {
pub published: NaiveDateTime,
pub header: String,
pub digest: String,
}
// Represents an individual bridge assignment
pub struct BridgePoolAssignment {
pub published: NaiveDateTime,
pub digest: String,
pub fingerprint: String,
pub distribution_method: String,
pub transport: Option,
pub ip: Option,
pub blocklist: Option,
pub bridge_pool_assignments: Option,
pub distributed: Option,
pub state: Option,
pub bandwidth: Option,
pub ratio: Option,
}

The library uses a custom BridgePoolAssignmentError enum for comprehensive error handling:
- DatabaseError
- IOError
- ParsingError
- CSVError
- SerializationError
- TimestampParsingError
- InvalidFingerprint
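To illustrate how an error enum with these variants typically works with the `?` operator, here is a std-only sketch. It is not the crate's definition: `AssignmentError` is a hypothetical stand-in that only reuses the variant names, the payload types are assumptions, and `check_fingerprint` and `read_first_line` are illustrative helpers.

```rust
use std::fmt;
use std::io;

/// Hypothetical stand-in for the crate's error enum; payloads are assumed.
#[derive(Debug)]
enum AssignmentError {
    DatabaseError(String),
    IOError(io::Error),
    ParsingError(String),
    CSVError(String),
    SerializationError(String),
    TimestampParsingError(String),
    InvalidFingerprint(String),
}

impl fmt::Display for AssignmentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::DatabaseError(m) => write!(f, "database error: {m}"),
            Self::IOError(e) => write!(f, "I/O error: {e}"),
            Self::ParsingError(m) => write!(f, "parsing error: {m}"),
            Self::CSVError(m) => write!(f, "CSV error: {m}"),
            Self::SerializationError(m) => write!(f, "serialization error: {m}"),
            Self::TimestampParsingError(m) => write!(f, "timestamp parsing error: {m}"),
            Self::InvalidFingerprint(fp) => write!(f, "invalid fingerprint: {fp}"),
        }
    }
}

impl std::error::Error for AssignmentError {}

/// Lets `?` convert an io::Error into the IOError variant automatically.
impl From<io::Error> for AssignmentError {
    fn from(e: io::Error) -> Self {
        Self::IOError(e)
    }
}

/// Rejects anything that is not 40 hexadecimal characters.
fn check_fingerprint(fp: &str) -> Result<(), AssignmentError> {
    if fp.len() == 40 && fp.bytes().all(|b| b.is_ascii_hexdigit()) {
        Ok(())
    } else {
        Err(AssignmentError::InvalidFingerprint(fp.to_string()))
    }
}

/// Reads the first line of a file; I/O failures surface as IOError.
fn read_first_line(path: &str) -> Result<String, AssignmentError> {
    let text = std::fs::read_to_string(path)?;
    Ok(text.lines().next().unwrap_or("").to_string())
}
```

With this shape, callers can `match` on the variant to tell a missing file apart from a malformed fingerprint, while functions that mix several failure sources can still use `?` throughout.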
Run the test suite with:
cargo test
cargo test --lib
cargo test --test integration_test
The library uses env_logger for logging. Configure log levels as needed:
env_logger::builder()
.filter_level(log::LevelFilter::Info)
.init();

- Fork the repository
- Create a feature branch
- Commit your changes
- Push and create a pull request