
add support for CSR NPZ decompression for EigerImageSeries #899

Merged
kpwelsh merged 3 commits into master from
csr-npz-loading-from-master
Feb 27, 2026

Collaborator

@kpwelsh kpwelsh commented Feb 26, 2026

Overview

To reduce the bandwidth needed while collecting Eiger images, we optionally store them as NPZ serializations of SciPy CSR matrices (save_npz). The result can be substantially smaller than the uncompressed image data. However, the HEXRD Eiger image series adapters do not support this compression format. They currently support only "lz4" and "bslz4", which come straight off the Eiger and are decompressed with the corresponding algorithms.

This PR adds support for loading CSR NPZ compressed images when using the Eiger V1 or V2 image series adapters.
The core of this change is the following:

from io import BytesIO

import numpy as np
from scipy.sparse import csr_array, load_npz


def _decompress_csrnpz(d: dict) -> np.ndarray:
    data = d['data']
    # These are there, but they are included in the NPZ as well.
    # So let's just take them from the NPZ for simplicity.
    # dtype = d['dtype']
    # shape = d['shape']
    # elem_size = d['elem_size']

    # The first 8 bytes are the length of the compressed data, which we can ignore for loading.
    # This is placed there so the Eiger stream2 parsers can handle the data.
    npz_data = data.tobytes()[8:]
    array: csr_array = load_npz(BytesIO(npz_data))
    return array.toarray()

This will be used to decompress the image if the compression type is "csrnpz".
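
As a rough illustration of the byte layout described above, the sketch below builds a payload in that shape and decodes it again. The `encode_csrnpz` helper is hypothetical and not part of this PR, and the big-endian byte order of the 8-byte length prefix is an assumption (the decoder skips the prefix, so the round trip does not depend on it). `csr_matrix` is used here only so the sketch runs on older SciPy versions.

```python
from io import BytesIO

import numpy as np
from scipy.sparse import csr_matrix, load_npz, save_npz


def encode_csrnpz(frame: np.ndarray) -> dict:
    """Hypothetical encoder producing a dict shaped like the decoder's input."""
    buf = BytesIO()
    save_npz(buf, csr_matrix(frame))  # NPZ serialization of the CSR form
    payload = buf.getvalue()
    # Assumption: the length prefix is 8 bytes, big-endian.
    prefixed = len(payload).to_bytes(8, "big") + payload
    return {"data": np.frombuffer(prefixed, dtype=np.uint8)}


def decode_csrnpz(d: dict) -> np.ndarray:
    """Same steps as the PR's decoder: drop the prefix, load the NPZ, densify."""
    npz_data = d["data"].tobytes()[8:]
    return load_npz(BytesIO(npz_data)).toarray()


# Round-trip a mostly-empty synthetic frame.
frame = np.zeros((4, 5), dtype=np.uint16)
frame[1, 2] = 7
frame[3, 0] = 123
restored = decode_csrnpz(encode_csrnpz(frame))
```

Because the dtype and shape are stored inside the NPZ itself, the decoder does not need the separate `dtype` and `shape` entries of the dict.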

Test Cases

Tests were added to address the following cases:

  1. Decompressing from lz4 dict
  2. Decompressing from csrnpz dict
  3. Decompressing from uncompressed dict (compression_type == None)
  4. Creating an adapter for an EigerV1 HDF5 with csrnpz compression
  5. Creating an adapter for an EigerV1 HDF5 with lz4 compression
  6. Creating an adapter for an EigerV2 HDF5 with csrnpz compression
  7. Creating an adapter for an EigerV2 HDF5 with lz4 compression

Note: these examples are all synthetically generated, but have been validated against real data sets in manual testing. For the synthetic generation, we don't try to emulate BSLZ4 compression because doing so is unnecessarily complex and offers a poor return on investment.
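
To give a feel for what the dict-level cases might look like, here is a minimal sketch of synthetic tests for the uncompressed and csrnpz paths. It is not the PR's test code: the `decompress` dispatcher, the `compression_type` key layout, and the use of `dtype` and `shape` entries for uncompressed data are all assumptions made for illustration. The lz4 case is left out so the sketch has no optional dependency.

```python
from io import BytesIO

import numpy as np
from scipy.sparse import csr_matrix, load_npz, save_npz


def decompress(d: dict) -> np.ndarray:
    """Hypothetical dispatcher keyed on d['compression_type']."""
    kind = d["compression_type"]
    if kind is None:
        # Assumed layout: raw bytes plus 'dtype' and 'shape' entries.
        raw = np.frombuffer(d["data"].tobytes(), dtype=d["dtype"])
        return raw.reshape(d["shape"])
    if kind == "csrnpz":
        # Skip the 8-byte length prefix, then load the NPZ.
        return load_npz(BytesIO(d["data"].tobytes()[8:])).toarray()
    raise ValueError(f"unsupported compression type: {kind!r}")


def make_frame() -> np.ndarray:
    """Small synthetic frame with a couple of nonzero pixels."""
    frame = np.zeros((8, 8), dtype=np.uint32)
    frame[2, 3] = 41
    frame[5, 1] = 7
    return frame


def test_uncompressed_dict():
    frame = make_frame()
    d = {
        "compression_type": None,
        "data": np.frombuffer(frame.tobytes(), dtype=np.uint8),
        "dtype": frame.dtype,
        "shape": frame.shape,
    }
    assert np.array_equal(decompress(d), frame)


def test_csrnpz_dict():
    frame = make_frame()
    buf = BytesIO()
    save_npz(buf, csr_matrix(frame))
    payload = buf.getvalue()
    # Assumption: 8-byte big-endian length prefix in front of the NPZ bytes.
    data = np.frombuffer(len(payload).to_bytes(8, "big") + payload, dtype=np.uint8)
    assert np.array_equal(decompress({"compression_type": "csrnpz", "data": data}), frame)
```

Both functions follow pytest's `test_` naming convention, so they would be collected automatically if placed in a test module.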

Affected Workflows

Anyone loading Eiger images could be impacted. However, no existing functionality has been changed.

Documentation Changes

As there are no existing documentation artifacts for the Eiger Adapters, this PR will not modify documentation.


codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.55%. Comparing base (f40f031) to head (73f1ff6).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #899      +/-   ##
==========================================
+ Coverage   69.41%   69.55%   +0.14%     
==========================================
  Files         148      149       +1     
  Lines       22590    22586       -4     
==========================================
+ Hits        15680    15709      +29     
+ Misses       6910     6877      -33     


block_size = orig_size
block_size_bytes = block_size.to_bytes(4, "big")
# Compress the data
compressed = lz4.block.compress(arr_bytes, store_size=False)
Collaborator


I guess lz4 is being imported automatically here via the pytest.importorskip on line 58. I wonder if we should import it explicitly in this function?

Collaborator Author


Good catch. I must have messed that up when I added the skips. I moved the pytest.importorskip call into that function so the import is more sensible.
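
For readers unfamiliar with the pattern, a minimal sketch of a function-local `pytest.importorskip` follows. The helper name `compress_block` is hypothetical and the actual test helper in this PR may be structured differently. `pytest.importorskip` returns the imported module, or skips the calling test when the module is unavailable.

```python
import pytest


def compress_block(arr_bytes: bytes) -> bytes:
    """Hypothetical helper: compress bytes with lz4, skipping if lz4 is missing."""
    # Import inside the function so the dependency is explicit at the point
    # of use rather than relying on a module-level importorskip elsewhere.
    lz4_block = pytest.importorskip("lz4.block")
    return lz4_block.compress(arr_bytes, store_size=False)
```

With this shape, only tests that actually call the helper are skipped when lz4 is not installed.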

Collaborator

@psavery psavery left a comment


LGTM

@kpwelsh kpwelsh merged commit b3bf8d5 into master Feb 27, 2026
14 of 16 checks passed
@kpwelsh kpwelsh deleted the csr-npz-loading-from-master branch February 27, 2026 22:17