Conversation
|
@Vedin, |
| public void Dispose() | ||
| { | ||
| // BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState); | ||
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); |
There was a problem hiding this comment.
@KrzysztofCwalina This 2 methods, which I mentioned above
There was a problem hiding this comment.
I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.
|
|
||
| public void SetQuality(uint quality) | ||
| { | ||
| if (quality < MinQuality || quality > MaxQuality) |
There was a problem hiding this comment.
MinQuality is 0 and quality is an unsigned int. Quality will never be < MinQuality and hence this check is unnecessary.
|
|
||
| public void SetWindow(uint window) | ||
| { | ||
| if (window < MinWindowBits || window > MaxWindowBits) |
There was a problem hiding this comment.
if (window - MinWindowBits > MaxWindowBits - MinWindowBits) saves an extra conditional branch.
See: #1616 (comment)
There was a problem hiding this comment.
make sense, nice optimization
| } | ||
|
|
||
| internal static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, int quality = DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic) | ||
| public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool is_finished = true) |
There was a problem hiding this comment.
nit: argument name doesn't follow coding style
| bytesConsumed = bytesWritten = 0; | ||
| BrotliEncoderOperation operation = is_finished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; |
There was a problem hiding this comment.
These should not be set to source.Length and destination.Length in the beginning.
There was a problem hiding this comment.
really it should be because we send this variables to the native library (where they means array size)
(see written and consumed)
| bytesWritten = (int)written; | ||
| return TransformationStatus.Done; | ||
| } | ||
| bytesWritten = destination.Length - bytesWritten; |
There was a problem hiding this comment.
I am curious, why do we do this?
There was a problem hiding this comment.
because bytesWritten after calling a native func is: "how many bytes have already free", and we convert it to how many bytes have already written
| return TransformationStatus.InvalidData; | ||
| }; | ||
| bytesConsumed = (int)consumed; | ||
| bytesWritten = (int)((nuint)destination.Length - written); |
There was a problem hiding this comment.
Won't this set bytesWritten to 0 in the success/done case, if written == destination.Length?
There was a problem hiding this comment.
Yes, it's true and it should works like this
| bool errorDetected = false; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; | ||
| BrotliDecoderResult LastDecoderResult = BrotliDecoderResult.NeedsMoreInput; |
There was a problem hiding this comment.
Why is NeedsMoreInput the starting state of LastDecoderResult?
There was a problem hiding this comment.
Because at start we haven't send anything for decompress, so we need more input
| } | ||
| else | ||
| { | ||
| endOfStream = true; |
There was a problem hiding this comment.
Is this bool necessary? The other two conditions return, so reaching here automatically implies endOfStream is true. No?
There was a problem hiding this comment.
endOfStream was deleted
|
|
||
| if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState)) | ||
| { | ||
| errorDetected = true; |
There was a problem hiding this comment.
Similarly, I don't think we need this. You can just use !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState)) at line 209/215. Let me know if I am mistaken.
There was a problem hiding this comment.
I rewrote the logic of detecting errors, but you are right
| var text = BrotliNative.BrotliDecoderErrorString(error); | ||
| throw new System.IO.IOException(text + BrotliEx.unableDecode); | ||
| } | ||
| if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState) && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput) |
There was a problem hiding this comment.
I don't know if this will ever be true.
if (x && y && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput) is false if LastDecoderResult == BrotliDecoderResult.NeedsMoreInput is false (regardless of what x and y are), and we only get here if that is the case.
| public void Dispose() | ||
| { | ||
| // BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState); | ||
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); |
There was a problem hiding this comment.
I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); | ||
| } | ||
|
|
||
| public void InitializeDecoder() |
There was a problem hiding this comment.
I suggest a single initialize function with a flag that says whether they want compress or decompress. Also, I'd make the initialization protect against a second call. If you don't you will leak native memory.
There was a problem hiding this comment.
You might also consider making these internal. State is a struct, so have the default object be "uninitialized" then have the state initialize itself on first use. That makes the developer have to so less work and won't get it wrong.
There was a problem hiding this comment.
The problem is that in encoder mode developer also may want to set quality and window size, in this case He needs to have a state before first call. Possible solution is add another Compress method with quality and window size, but it doesn't look obvious for user to call one method at first iteration and other on next iterations, does it?
| _encoder = new Encoder(); | ||
| _encoder.SetQuality(); | ||
| _encoder.SetWindow(); | ||
| _state.InitializeEncoder(); |
There was a problem hiding this comment.
In the stream case, I would say that an exception bubbling up is probably a good thing, unless you plan on wrapping and re-throwing.
| if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return; | ||
| _nextInput = new byte[0]; | ||
| _availableInput = 0; | ||
| TransformationStatus ts = TransformationStatus.DestinationTooSmall; |
There was a problem hiding this comment.
Rename ts to something more meaningful. Even "status" would be better.
There was a problem hiding this comment.
renamed to flushStatus
| } | ||
| else | ||
| { | ||
| throw new UnauthorizedAccessException(); |
There was a problem hiding this comment.
That exception is used for security checks. It is not appropriate here.
| } | ||
| copyLen = bytesRemain > _bufferSize ? _bufferSize : bytesRemain; | ||
| Marshal.Copy(buffer, currentOffset, _bufferInput, copyLen); | ||
| byte[] bufferInput = new byte[copyLen]; |
There was a problem hiding this comment.
Why is this extra buffer allocation and copy necessary? If you are going to need an extra buffer on each write call, you might want to rent a buffer from a pool, have your own buffer that you re-use, or use a stack buffer (if it isn't too large).
There was a problem hiding this comment.
Not necessary It is temporary. What the problem is that signature of Compress takes out parameter (consumed) and input is exactly as source size. I can change consumed to ref parameter, but it is inconsistent with other Encode/Decode methods, which using Span.
May be I can fix it using Span.Slice and just cut the buffer instead of copy it to _bufferInput?
There was a problem hiding this comment.
@shiftylogic Is something like this ok?
Span<byte> bufferInput = new Span<byte>(buffer);
bufferInput.Slice(0, copyLen);
transformationResult = Brotli.Compress(bufferInput, _bufferOutput, out _availableInput, out _availableOutput, ref _state);
| BrotliEncoderOperation operation = isFinished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; | ||
| bytesConsumed = 0; |
There was a problem hiding this comment.
Why set bytesConsumed to source.Length first and then to 0?
| TransformationStatus transformationResult = TransformationStatus.DestinationTooSmall; | ||
| while (transformationResult == TransformationStatus.DestinationTooSmall) | ||
| /*Span<byte> bufferInput = new Span<byte>(buffer); | ||
| bufferInput.Slice(0, copyLen);*/ |
There was a problem hiding this comment.
nit: remove commented out code?
There was a problem hiding this comment.
It was a question to @shiftylogic Can I use smth like this instead of allocate and copy array?
| } | ||
|
|
||
| private void ValidateCompressedData(Span<byte> data, byte[] expected) | ||
| private void ValidateCompressedData(byte[] data, byte[] expected) |
There was a problem hiding this comment.
Why was this change required?
There was a problem hiding this comment.
Just to understand that I send byte array not span, in fact it wasn't change anything
| byte[] data = File.ReadAllBytes(testFilePath); | ||
| var bytes = new byte[bufferSize]; | ||
| foreach (var iteration in Benchmark.Iterations) | ||
| { |
There was a problem hiding this comment.
FYI, this filename has a typo:
BrotliPerfomanceTests.cs => BrotliPerformanceTests.cs
| { | ||
| internal IntPtr BrotliNativeState { get; private set; } | ||
| internal BrotliDecoderResult LastDecoderResult; | ||
| public bool CompressMode { get; set; } |
There was a problem hiding this comment.
This should probably be readonly and set internally when InitializeDecoder or InitializeEncoder is called.
| _availableInput = (nuint)copyLen; | ||
| _nextInput = _bufferInput; | ||
| while ((int)_availableInput > 0) | ||
| byte[] bufferInput = new byte[copyLen]; |
There was a problem hiding this comment.
Why is this extra array and copy needed? Why can't you use the input directly for the compress call?
| while (true) | ||
| if (_state.BrotliNativeState == IntPtr.Zero) return; | ||
| if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return; | ||
| _nextInput = new byte[0]; |
There was a problem hiding this comment.
Why allocate an empty array and throw away the one you allocated in the constructor?
| public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state) | ||
| { | ||
| EnsureInitialized(ref state, true); | ||
| try |
There was a problem hiding this comment.
This try catch and throw is not necessary if all you are doing it re-throwing the same exception. Just let it bubble up to the caller.
| public static TransformationStatus Decompress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state) | ||
| { | ||
| EnsureInitialized(ref state, false); | ||
| try |
There was a problem hiding this comment.
Same here. Just let this bubble up to the caller.
| } | ||
| } | ||
|
|
||
| public static void EnsureInitialized(ref State state, bool compress) |
There was a problem hiding this comment.
Probably want this internal, unless you think someone needs this for external use.
| } | ||
|
|
||
| public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, CompressionLevel quality = (CompressionLevel)DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic) | ||
| public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool isFinished = true) |
There was a problem hiding this comment.
Why change the input source to Span instead of ReadOnlySpan?
There was a problem hiding this comment.
It is different methods, but yes their should be ReadOnlySpan also
| nuint availableOutput = (nuint)bytesWritten; | ||
| nuint consumed = (nuint)bytesConsumed; | ||
| state.LastDecoderResult = BrotliNative.BrotliDecoderDecompressStream(state.BrotliNativeState, ref consumed, ref bufIn, ref availableOutput, ref bufOut, out nuint totalOut); | ||
| bytesWritten = (int)((nuint)destination.Length - availableOutput); |
There was a problem hiding this comment.
This is simpler. Would it work?
bytesWritten = destination.Length - (int)availableOutput;
There was a problem hiding this comment.
Yes, I should change it, I just think that it is better to do arithmetic operations in the narrowed type, than expand it to int and than do minus
| public partial class BrotliStream : Stream | ||
| { | ||
| private const int DefaultBufferSize = (1 << 16) - 1; | ||
| private const int DefaultBufferSize = (1 << 16) - 16; |
There was a problem hiding this comment.
Can you add the decimal value of DefaultBufferSize as a comment?
| throw new System.IO.IOException(BrotliEx.unableEncode); | ||
| var extraData = (nuint)_availableOutput != (nuint)_bufferSize; | ||
| if (extraData) | ||
| flushStatus = Brotli.FlushEncoder(Array.Empty<byte>(), _buffer, out _availableInput, out _availableOutput, ref _state, finished); |
There was a problem hiding this comment.
Out of curiosity, why do we pass an empty array here?
There was a problem hiding this comment.
Because, when we start flush stream, we shouldn't send any more date to the original native lib
There was a problem hiding this comment.
Hmm...Does Span.Empty work faster? I just thought it is the same
There was a problem hiding this comment.
Hmm...Does Span.Empty work faster? I just thought it is the same
I am not certain (will have to look at assembly). However, FlushEncoder takes a span, and if Span.Empty works, I would use that. If there is a specific reason to keep Array.Empty instead and use the implicit cast to Span, then use array. I believe you avoid the implicit cast: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L376
There was a problem hiding this comment.
Span.Empty should be faster because it avoids the implicit cast and thus the extra ctor call.
* Optimizing some int32 parsers and clean up * wip - adding tests and fixing impl bugs * WIP - new implementation and tests to compare with previous * WIP - fixing non-invariant parser bugs and adding tests * Cleaning up functional and performance tests and removing old code. * Removing unnecessary using directive * Addressing PR comments and adding more tests. * Addressing PR comments and adding loop unrolling. * Fixing issues missed after merge * Fixing non-invariant int32 parser * Addressing PR comment, removing unused DangerousGetPinnableReference * Cleanup and updating test helpers * Patch to dotnet install scripts for downloading via new blob URL. (#1632) * Removing text.Length cache and changing to unsigned comparison * Adding comment and removing unnecessary math operations using 0 (D0) * Api brotli changes (#1621) * CaterburyPerf * flush/compress * huge changes * space * small * flush/compress * huge changes * space * small * unsaved chagnes * test fix depends on APIchanges * resolve issues * Less alocation, change State * issue * remove unnecessary * resolve * clean up changes, change state * resolve * guidelines * fix bug * resolve issues * change access * small issues
Changes to the Brotli API. What's new:
Brotli (Primitives) was rewrote and now return all possible TransformationStatus.
BroltiStream now base on Brotli class. Also it uses twice less memory as before and doesn't use marshaling.
@KrzysztofCwalina Finally, I remove everything from State and rewrite all in way where we need only BrotliNativeState and LastDecoderResult, So how to Dispose it know (there are 2 commented methods for compression/decompression mode? May be add some flag to detect in what mode state now?
BrotliPrimitivesTests needs an update.