Skip to content

Question about delta-based encoding missing in implementation #6

@Zedong-Liu

Description

@Zedong-Liu

I’m reading both the SIGCOMM’24 paper and the codebase. The paper (§5.2 and Fig. 6) describes a change-based / delta encoding strategy: grouping tokens and storing the KV of the first token in each group as an anchor, then encoding only the deltas relative to that anchor.

However, in the current implementation, I could not locate such delta computation or storage. It seems the KV tensors are directly quantized and compressed without this differential step.

Is delta-based encoding still part of the latest method?
Was it removed/changed for practical reasons (accuracy / complexity / GPU optimization)?
If removed — does the reported performance in the paper include delta encoding or reflect the current code?

If I missed where it’s implemented, could you please point me to the right place?

Thanks again for the excellent work and for open-sourcing CacheGen.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions