A high-performance asynchronous SAS XPT file parser implemented in Rust, supporting both basic and multi-byte encodings.
- 🚀 Async-first implementation using Tokio
- 📦 Supports both standard UTF-8 and GBK encodings (via feature flags)
- 📈 Efficient streaming parser
- 📂 Metadata extraction (library info, column names, labels)
- 📝 Row-by-row data reading
Add to your Cargo.toml:
[dependencies]
your_crate_name = { version = "0.1", features = ["async"] }
Feature flags:
- async: Enables async support (requires Tokio runtime)
- multi_encoding: Adds GBK encoding support
use your_crate_name::Reader;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut file = tokio::fs::File::open("sample.xpt").await?;
    // Create reader based on encoding feature
    #[cfg(not(feature = "multi_encoding"))]
    let mut reader = Reader::new(&mut file, |x| {
        String::from_utf8(x.to_vec()).unwrap().trim().to_string()
    });
    #[cfg(feature = "multi_encoding")]
    let mut reader = Reader::new_gbk(&mut file);
    // Read metadata
    let (mut data_handle, metadata) = reader.start().await?;
    println!("Library: {:?}", metadata.library);
    // Collect names as &str: join() is not available on Vec<&String>
    let names: Vec<&str> = metadata.columns.iter().map(|c| c.name.as_str()).collect();
    println!("Columns: {}", names.join("\t"));
    // Read data rows
    while let Some(row) = data_handle.read_line().await? {
        println!("{:?}", row);
    }
    Ok(())
}
The reader supports different initialization methods based on encoding needs:
// For standard UTF-8 processing
Reader::new(&mut file, |bytes| {
    String::from_utf8(bytes.to_vec()).unwrap().trim().to_string()
});
// For GBK encoding (requires multi_encoding feature)
Reader::new_gbk(&mut file);
Metadata returned by start():
pub struct Metadata {
    pub library: String,
    pub columns: Vec<Column>,
}
pub struct Column {
    pub name: String,
    pub label: String,
    // ... other fields
}
The read_line() method returns:
- Some(Vec<Value>) when data is available
- None when end of file is reached
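The Some/None contract of read_line() can be illustrated without a real file. The sketch below is a synchronous stand-in, not the crate's implementation: MockHandle, collect_rows, and the two-variant Value enum are hypothetical names, and the real Value type and error type may differ. It only shows how a caller drains rows until None while propagating errors with `?`.

```rust
/// Hypothetical stand-in for the crate's cell type; the real `Value` may differ.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Hypothetical in-memory handle that mimics the `read_line()` contract.
struct MockHandle {
    rows: std::vec::IntoIter<Vec<Value>>,
}

impl MockHandle {
    fn new(rows: Vec<Vec<Value>>) -> Self {
        Self { rows: rows.into_iter() }
    }

    /// Returns `Ok(Some(row))` while data remains and `Ok(None)` at end of data.
    fn read_line(&mut self) -> Result<Option<Vec<Value>>, std::io::Error> {
        Ok(self.rows.next())
    }
}

/// Drain a handle and collect every row, propagating any read error.
fn collect_rows(handle: &mut MockHandle) -> Result<Vec<Vec<Value>>, std::io::Error> {
    let mut out = Vec::new();
    // The loop ends on the first `None`; an `Err` exits early through `?`.
    while let Some(row) = handle.read_line()? {
        out.push(row);
    }
    Ok(out)
}
```

With the real async handle the loop has the same shape, with `.await` added before the `?`, as in the usage example.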
- Uses zero-copy parsing where possible
- Current-thread Tokio runtime recommended for simple applications
- Enable the multi_encoding feature only when GBK support is required
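On the copying point: the decoder closure shown in the usage examples copies each field with to_vec() and panics on invalid UTF-8 because of unwrap(). A minimal sketch of an alternative follows, assuming the decoder receives a byte slice and returns a String as in those examples. decode_field is a hypothetical name, not part of the crate.

```rust
/// Decode one field without panicking on invalid UTF-8.
/// Invalid byte sequences are replaced with U+FFFD instead of aborting.
fn decode_field(bytes: &[u8]) -> String {
    // `from_utf8_lossy` borrows the input when it is already valid UTF-8,
    // so the intermediate `to_vec()` copy is avoided; the only allocation
    // on that path is the final `to_string()`.
    String::from_utf8_lossy(bytes).trim().to_string()
}
```

If the constructor accepts any callable from a byte slice to a String, this function could be passed in place of the closure, for example Reader::new(&mut file, decode_field). Whether silent replacement is preferable to a hard failure depends on how much the input files are trusted.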
Run tests with different feature combinations:
cargo test --features "async"
cargo test --features "async multi_encoding"