@mzweilin commented Jan 20, 2026

This PR fixes a bug in two problems, level3/33_VanillaRNN.py and level3/35_LSTM.py, where Model uses hard-coded FP32 tensors.

It is related to #79 and #80. You will hit the bug if you run the agent on these two problems.
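
For context, the failure mode is sketched below. This is a minimal reconstruction, not the exact file contents: a random tensor created in Model.__init__ as a plain attribute is neither a parameter nor a registered buffer, so Module.to(dtype=...) never converts it.

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size, hidden_size, batch_size=8):
        super().__init__()
        # Plain attribute: Module.to(dtype=...) does NOT convert this tensor,
        # so it stays FP32 even after m.to(dtype=torch.bfloat16).
        self.h0 = torch.randn(batch_size, hidden_size)
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x):
        # torch.cat promotes the BF16 x to FP32 to match self.h0; the FP32
        # result then hits the BF16 Linear weights, raising
        # "mat1 and mat2 must have the same dtype, but got Float and BFloat16".
        combined = torch.cat((x, self.h0), dim=1)
        return torch.tanh(self.i2h(combined))
```

One in-place fix is to register the tensor with self.register_buffer("h0", torch.randn(batch_size, hidden_size)), so that .to() converts it together with the parameters.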

Please consider updating the dataset on Hugging Face too: https://huggingface.co/datasets/ScalingIntelligence/KernelBench

An alternative fix would be to move the random tensors from Model to get_inputs().
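
A rough sketch of that alternative (assuming the usual KernelBench convention where get_inputs() returns the arguments for forward(); batch_size, input_size, and hidden_size stand in for the problem's constants):

```python
def get_inputs():
    # Create the hidden state here rather than in Model.__init__, so the
    # harness can convert it alongside x, e.g. inp.to(dtype=torch.bfloat16).
    x = torch.randn(batch_size, input_size)
    h0 = torch.randn(batch_size, hidden_size)
    return [x, h0]

class Model(nn.Module):
    def forward(self, x, h0):
        # The hidden state now arrives with the same dtype as x.
        combined = torch.cat((x, h0), dim=1)
        return torch.tanh(self.i2h(combined))
```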

How to reproduce the bug

```
$ python -i KernelBench/level3/33_VanillaRNN.py
>>> init_inputs = get_init_inputs()
>>> inputs = get_inputs()
>>> m = Model(*init_inputs)
>>> output = m(*inputs)
>>> m_bf16 = m.to(dtype=torch.bfloat16)
>>> inputs_bf16 = [inp.to(dtype=torch.bfloat16) for inp in inputs]
>>> output_bf16 = m_bf16(*inputs_bf16)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
```

```
$ python -i KernelBench/level3/35_LSTM.py
>>> init_inputs = get_init_inputs()
>>> inputs = get_inputs()
>>> m = Model(*init_inputs)
>>> output = m(*inputs)
>>> m_bf16 = m.to(dtype=torch.bfloat16)
>>> inputs_bf16 = [inp.to(dtype=torch.bfloat16) for inp in inputs]
>>> output_bf16 = m_bf16(*inputs_bf16)
RuntimeError: could not create a primitive descriptor for the LSTM forward propagation primitive. Run workload with environment variable ONEDNN_VERBOSE=all to get additional diagnostic information.
```

