[spirv] Incorrect int4 vicuna model output #14739

@yzhang93

What happened?

The Vicuna int4 model produces different outputs from the fake_int8 model (int4 weights stored in an int8 container). The results start to diverge at dispatch 5.

The build is based on afc8705.

Steps to reproduce your issue

  1. Download the model second_vicuna_int4.mlir.
  2. Commands to reproduce:
    ./iree-compile --iree-input-type=none --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=vulkan --mlir-print-debuginfo --mlir-print-op-on-diagnostic=false --iree-llvmcpu-target-cpu-features=host --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-vm-bytecode-module-strip-source-map=false --iree-util-zero-fill-elided-attrs --iree-vm-target-truncate-unsupported-floats --iree-codegen-check-ir-before-llvm-conversion=false --iree-vulkan-target-triple=rdna3-unknown-linux --iree-opt-const-expr-hoisting=false --iree-consteval-jit-debug=true ~/Downloads/second_vicuna_int4.mlir -o vicuna_vulkan_i4.vmfb --iree-stream-resource-max-allocation-size=3221225472 --iree-flow-break-dispatch=@forward:5

./iree-run-module --device_allocator=caching --vulkan_vma_allocator=false --module=vicuna_vulkan_i4.vmfb --device=vulkan --function=forward --input=1x1xi64 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32  --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --input=1x32x1x128xf32 --output=@vulkan_dispatch5.npy
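The long run of identical `--input=1x32x1x128xf32` flags in the command above is easy to mistype. A minimal sketch of generating them in a loop follows. The count of 64 is an assumption and should be checked against the command above, and the variable names are illustrative only. The script prints the assembled command for inspection rather than running it.

```shell
#!/bin/sh
# Sketch: build the repeated --input flags instead of typing them out.
# num_kv_inputs=64 is an ASSUMPTION; adjust it to match the command above.
num_kv_inputs=64

# First input is the single i64 tensor, followed by the repeated f32 tensors.
inputs="--input=1x1xi64"
i=0
while [ "$i" -lt "$num_kv_inputs" ]; do
  inputs="$inputs --input=1x32x1x128xf32"
  i=$((i + 1))
done

# Print the command (not executed here). $inputs is left unquoted on
# purpose so that the shell splits it into separate flags.
echo ./iree-run-module \
  --device_allocator=caching --vulkan_vma_allocator=false \
  --module=vicuna_vulkan_i4.vmfb --device=vulkan --function=forward \
  $inputs --output=@vulkan_dispatch5.npy
```

Dropping the leading `echo` would run the command directly, assuming `iree-run-module` and the compiled `vicuna_vulkan_i4.vmfb` are in the current directory.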

  3. The results I got for dispatch 5 from the above commands: https://storage.googleapis.com/shark-public/vivian/vicuna_i4/vulkan_dispatch5.npy
  4. Golden results for comparison: https://storage.googleapis.com/shark-public/vivian/vicuna_i4/vulkan_dispatch5_golden.npy
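As a coarse first check, the two result files can be compared byte for byte once both are downloaded. The sketch below assumes the local file names `vulkan_dispatch5.npy` and `vulkan_dispatch5_golden.npy`, and `compare_outputs` is a hypothetical helper name. A "different" result only means the files are not bit-identical. Floating-point outputs can differ in the low bits without being wrong, so an element-wise comparison with a tolerance is still needed to judge the size of the divergence.

```shell
#!/bin/sh
# Sketch: byte-for-byte comparison of two output files.
# compare_outputs is a HYPOTHETICAL helper, not part of any IREE tooling.
compare_outputs() {
  # cmp -s is silent and exits 0 only when the files are identical
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}

# ASSUMED local file names after downloading both .npy files.
if [ -f vulkan_dispatch5.npy ] && [ -f vulkan_dispatch5_golden.npy ]; then
  compare_outputs vulkan_dispatch5.npy vulkan_dispatch5_golden.npy
fi
```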

Labels

bug 🐞 (Something isn't working), codegen/spirv (SPIR-V code generation compiler backend)