Thanks for the great work. I'm trying to reproduce the results and have run into the following issues:
- Can I use the lm-evaluation-harness script instead of yours to evaluate the results? When I ran the lm-harness ammlu dataset, I got an accuracy of 34.1, compared to your 37. What could explain the difference?
- How can I use this script to evaluate another model?
  i. When I changed the model to jais-13b, it gave 0% accuracy on ammlu (all the responses are empty strings).
  ii. With other models, such as Phi-2 and MobiLlama-1B, I get the following error:
  Below are the changes I made to config.yaml:

  In ArabicMMLU_few_shots.sh, I changed the model id to Phi-2B-base. Could you please tell me how to fix this?
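One way to tell whether the 0% score in (i) comes from empty generations rather than wrong answers is to count the two separately. The following is only a minimal sketch, not the repository's scoring code: `score_responses` is a hypothetical helper, the data is made up, and it assumes each gold answer is a single option letter.

```python
from typing import Dict, Sequence


def score_responses(responses: Sequence[str], gold: Sequence[str]) -> Dict[str, float]:
    """Hypothetical helper: accuracy plus the share of empty responses."""
    if len(responses) != len(gold):
        raise ValueError("responses and gold must have the same length")
    total = len(gold)
    if total == 0:
        return {"accuracy": 0.0, "empty_rate": 0.0}

    # A response counts as empty when it contains only whitespace.
    empty = sum(1 for r in responses if not r.strip())

    # Assumption: the answer is the first non-space character of the response,
    # compared case-insensitively against the gold option letter.
    correct = sum(
        1
        for r, g in zip(responses, gold)
        if r.strip() and r.strip()[0].upper() == g.strip().upper()
    )
    return {"accuracy": correct / total, "empty_rate": empty / total}


if __name__ == "__main__":
    # Made-up data for illustration only.
    print(score_responses(["", "", ""], ["A", "B", "C"]))
```

An `empty_rate` close to 1.0 would suggest a generation or prompt-format problem rather than a model that is simply answering incorrectly.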