Releases: pdimens/mimick
3.0.1
3.0
New
- completely rewritten in Julia!
- significantly faster under the hood
- added alternative mass-sample-from-VCF mode of operation
Breaking
- removed barcode specs (length/count); it turns out they weren't necessary
2.3
Breaking Changes
- replaces `--lr-type` with `--segments`
  - accepts an integer
  - `--segments 1` uses the entire barcode (10X, tellseq)
  - `--segments 3` is a 3-segment combinatorial (stlfr)
  - `--segments 4` is a 4-segment combinatorial (haplotagging)
- replaces `--length` with `--lengths`
  - now uses the format `--lengths R1,R2`
    - e.g. `--lengths 132,150`
- default output type for haplotagging (`--segments 4`) is `standard:haplotagging`
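As a rough illustration of the new argument formats, the sketch below parses an `R1,R2` string and looks up the barcode style for a `--segments` value. The function names and the rejection of unlisted segment counts are assumptions made for this example; they are not taken from Mimick's source.

```python
# Hypothetical helpers for illustration; not taken from Mimick's source.

# Barcode styles from the 2.3 notes, keyed by the --segments value.
SEGMENT_STYLES = {
    1: "entire barcode (10X, tellseq)",
    3: "3-segment combinatorial (stlfr)",
    4: "4-segment combinatorial (haplotagging)",
}


def parse_lengths(value: str) -> tuple[int, int]:
    """Parse an 'R1,R2' string such as '132,150' into two positive integers."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected the form R1,R2 but got {value!r}")
    r1, r2 = int(parts[0]), int(parts[1])
    if r1 <= 0 or r2 <= 0:
        raise ValueError("read lengths must be positive")
    return r1, r2


def describe_segments(n: int) -> str:
    """Return the barcode style for a segment count (assumed: only 1, 3, 4 are valid)."""
    if n not in SEGMENT_STYLES:
        raise ValueError(f"unsupported --segments value: {n}")
    return SEGMENT_STYLES[n]
```

For example, `parse_lengths("132,150")` gives `(132, 150)`, matching the `--lengths 132,150` form above.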
New
- adds `standard` suffixes
  - `standard:haplotagging` for `@SEQID BX:Z:ACBD VX:i:N`
  - `standard:stlfr` for `@SEQID BX:Z:1_2_3 VX:i:N`
  - just `standard` is the nucleotide barcode `@SEQID BX:Z:ATCG VX:i:N`
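The header layouts above can be sketched as a small formatter. This is only an illustration: the function names are hypothetical, the space separator is copied from the examples as shown here, and `vx` stands for whatever integer the `VX:i:N` tag carries.

```python
# Hypothetical formatter for illustration; not taken from Mimick's source.

def standard_header(seq_id: str, barcode: str, vx: int) -> str:
    """Build a FASTQ header line of the form '@SEQID BX:Z:<barcode> VX:i:<N>'."""
    return f"@{seq_id} BX:Z:{barcode} VX:i:{vx}"


def stlfr_barcode(segments: tuple[int, int, int]) -> str:
    """Join three integer segments with underscores, as in the '1_2_3' example."""
    return "_".join(str(s) for s in segments)


# Nucleotide barcode (plain `standard`)
nucleotide = standard_header("SEQID", "ATCG", 1)

# stLFR-style barcode (`standard:stlfr`)
stlfr = standard_header("SEQID", stlfr_barcode((1, 2, 3)), 1)
```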
2.2.2
The singleton logic was inverted and for that I am truly sorry 🙏 -- FIXED
2.2.1
Fixed a little oopsie in the final output files.
2.2
New
- `--circular` option to simulate molecules from linear FASTA sequences as though they are circular
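One way to picture "as though they are circular" is a slice that wraps past the end of the sequence back to its start. The sketch below shows that idea only; the function name and the rule that a molecule cannot be longer than the sequence are assumptions, and Mimick's actual logic may differ.

```python
# Hypothetical sketch of wrap-around sampling; not Mimick's implementation.

def circular_slice(seq: str, start: int, length: int) -> str:
    """Return `length` bases beginning at `start`, wrapping past the end of `seq`."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence is empty")
    if not 0 <= length <= n:
        # Assumption for this sketch: a molecule is at most one full lap.
        raise ValueError("length must be between 0 and len(seq)")
    start %= n
    end = start + length
    if end <= n:
        # The molecule fits without crossing the end of the sequence.
        return seq[start:end]
    # The molecule crosses the end: take the tail, then continue from the start.
    return seq[start:] + seq[: end - n]
```

With the sequence `ACGTTG`, a 4-base molecule starting at index 4 takes `TG` from the end and `AC` from the beginning.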
Breaking changes
- none
Non-breaking changes
- [internal] the barcode generation system is now class-based (cleaner code)
Fixes
- the writing process opens the output files once in write mode, instead of many times in append mode
  - should speed up the file writing
- `main` is cleaned up a bunch due to new class system
- more errors caught, errors are printed nicer
2.1
Breaking Changes
- None, except the output .gff file is now gzipped
Non-breaking Changes
- writer task is now a Process (used to be a Thread)
  - the wgsim simulation is still a thread-based process
- reverted to calling `wgsim.core()` instead of running wgsim as a subprocess (another possible speedup)
- `longmolecule` recipe logic now writes a temp FASTA file (speeds things along)
- the number of reads per molecule is now drawn from an exponential distribution, instead of from a lognormal distribution and then transformed back out of log space
  - the log-unlog transformations were slowing things down significantly
- set a minimum of 2 reads per molecule prior to a possible downgrade to singleton
  - more reliable singleton ratios
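The read-count sampling described above might look roughly like the following. This is a sketch under assumptions: `mean_reads` and `singleton_fraction` are hypothetical parameters, and the notes do not say how Mimick parameterizes the distribution or chooses which molecules become singletons.

```python
# Hypothetical sketch of the sampling idea; not Mimick's implementation.
import random


def reads_per_molecule(mean_reads: float, singleton_fraction: float,
                       rng: random.Random) -> int:
    """Draw a read count for one molecule.

    The count comes from an exponential distribution with the given mean,
    is floored at 2, and is then downgraded to 1 (a singleton) with
    probability `singleton_fraction`.
    """
    # expovariate takes the rate (1 / mean), not the mean itself.
    n = max(2, round(rng.expovariate(1.0 / mean_reads)))
    if rng.random() < singleton_fraction:
        return 1
    return n


rng = random.Random(42)
counts = [reads_per_molecule(10.0, 0.25, rng) for _ in range(1000)]
```

Flooring at 2 before the downgrade means singletons arise only from the explicit downgrade step, which is one reading of why the singleton ratios became more reliable.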
Fixes
- simplified thread-watching and backpressure
  - should have hopefully addressed some of the noticeable slowdown between v1 and v2
2.0.1
force a commit
2.0
The Ship of Theseus edition
This is an almost complete (>98%) rewrite of the simulator such that it can be considered something entirely different from the source material and inspiration, XENIA.
Breaking Changes
- somehow, none
  - the file outputs are a little different, so that would be considered breaking if Mimick was part of a pipeline with specific output expectations
Non-breaking changes
- output is now one pair of FASTQ files, rather than a pair for each haplotype
New
- molecules sharing a barcode can now span contigs and haplotypes
- process described here
- a single GFF output file for mutations
- outputs a molecule manifest file that lists all molecules that were simulated and some important details about them
Internal
- completely rewritten simulator
- significantly improved multithreading
  - one thread always reserved for writing final output files
Full Changelog: 1.3...2.0
1.3
New
- `--seed` to optionally set a random seed
- `--singletons` to specify a proportion of singletons (i.e. barcodes with only one read pair)
Fixes
- Barcodes now properly write to file
Breaking changes
- none