-
Notifications
You must be signed in to change notification settings - Fork 3
Integrate cbq #156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate cbq #156
Conversation
Summary of ChangesHello @noamteyssier, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and well-executed refactoring to integrate the new cbq format. The introduction of a generic Encoder and the use of BinseqWriterBuilder greatly simplify the codebase and reduce duplication, which is excellent. The CLI is also updated consistently with the new info command and updated options. I have a few suggestions to further improve the code, mainly regarding some leftover debugging settings, commented-out code, and opportunities to reduce duplication.
I am having trouble creating individual review comments. Click here to see my feedback.
Cargo.toml (50-52)
These changes to the release profile appear to be for debugging. Enabling debug symbols (debug = true) in a release build is generally not recommended for production as it increases binary size and can have a minor performance impact. More importantly, commenting out lto = true and codegen-units = 1 will likely cause a significant performance regression in release builds. These settings are crucial for optimizations. Please revert these changes before merging.
lto = true
codegen-units = 1
src/cli/output.rs (321-325)
The enum variants Binseq and VBinseq are being renamed to Bq and Vbq respectively. This is a good change for consistency. However, the old names are still used in BinseqMode::determine. Please update them to use the new variant names.
Binseq,
#[clap(name = "vbq")]
#[default]
Vbinseq,
README.md (11)
This update is correct. However, please also update line 14, which still incorrectly states that there are only two variants. It should be updated to mention three variants now that cbq is supported.
src/cli/info.rs (6-7)
The struct was correctly renamed to InfoCommand, but its documentation comment is outdated and still describes the old CountCommand. Please update the documentation to reflect the new purpose of showing information and statistics about a BINSEQ file.
/// Show information and statistics about a BINSEQ file.
pub struct InfoCommand {
src/commands/cat/mod.rs (126-142)
There's a good amount of code duplication in this module that could be refactored.
- The functions
run_cbq(here) andrun_vbq(lines 108-124) are almost identical. The common logic for processing files could be extracted into a helper function that accepts aBinseqWriter. - Similarly,
record_cbq_header(lines 91-106) andrecord_vbq_header(lines 74-89) are duplicates. This could be solved with a generic function.
Reducing this duplication would make the code more maintainable. For example, run_vbq and run_cbq could be simplified by using a shared helper:
fn run_cat_variable<W: std::io::Write + Send>(
args: &CatCommand,
writer: binseq::BinseqWriter<W>,
) -> Result<()> {
let mut processor = Encoder::new(writer)?;
for path in &args.input.input {
let reader = BinseqReader::new(path)?;
reader.process_parallel(processor.clone(), args.output.threads())?;
}
processor.finish()?;
Ok(())
}src/commands/encode/encode.rs (185-284)
This large block of commented-out code appears to be from the previous implementation. It's great that you've refactored it into a more generic solution. To keep the codebase clean, please remove this commented-out code before merging.
src/types.rs (3-4)
The BoxedWriter type alias is now marked with #[allow(unused)]. If it's no longer used anywhere in the crate due to the refactoring, it would be cleaner to remove it entirely rather than suppressing the warning.
No description provided.