Skip to content

LM generation with strange behavior #93

@hyuenmin-choi

Description

@hyuenmin-choi

System Info

I could successfully run evalution scripts.
I downloaded the model weights using huggingface_hub python packages using following codes.

from huggingface_hub import snapshot_download

` # Define model paths
models = {
"math-shepherd-mistral-7b-prm": "/workspace/openr/model/math-shepherd-mistral-7b-prm",
"mistral-7b-sft": "/workspace/openr/model/mistral-7b-sft",
}

' # Download each model
for model_name, local_dir in models.items():
print(f"Downloading {model_name} to {local_dir}")
snapshot_download(repo_id=f"peiyi9979/{model_name}", local_dir=local_dir, local_files_only=False)`

Who can help?

@mengfn @YanSong97 @iamlilAJ

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the codebase (such as scrips/, ...)
  • My own task or dataset (give details below)

Reproduction

  1. I just downloaded the models using huggingface.

  2. Then after just changing model path and name, I ran sh reason/llm_service/create_service_math_shepherd.sh to launch FastChat serving system.

  3. I modified /reason/evaluate.py to run just 1 request not the whole datasets using following codes.

` def evaluate_single_request(
method_name: str, solver_fn: Callable, save_dir: Optional[Path] = None) -> Dict[str, Any]:
test_ds = task.test_ds

    # Select the first request
    problem_inst = test_ds[10]
    print(f"Evaluating single request: {problem_inst['question']}")

    # Solve the problem instance
    output = solver_fn(problem_inst, llm_gen_fn, rm_call)

    print(f"Result: {output.solutions}")
    return output`
  1. Then I ran given script to check its functionality. sh scripts/eval/vanila_mcts.sh & sh scripts/eval/beam_search.sh

Expected behavior

Generation by VLLMRemoteCaller is strange.
It produces whole output, not the step by step output.
So for me, it produces bunch of warnings which tells me that PRM value length does not matches with action history.
Also, It dose not show any expected reasoning behavior.
In my opinion, LM model doesn't recognize the lm_step_tag well.

This is log of mine.

`
root@f7320dc3b737:/workspace/openr.code# sh scripts/eval/beam_search.sh
/usr/local/lib/python3.10/dist-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
lm_step_tag ки

Evaluating single request: What is the least positive integer multiple of 30 that can be written with only the digits 0 and 2?
2025-01-03 04:11:42.457 | WARNING | reason.guided_search.tree:_expand_leaf_node:692 - PRM value length not match with action history. len(prm)=4, len(act_hist)=0 s:
<|im_start|>system
Please reason step by step, and put your final answer within \boxed{{}}.<|im_end|>
<|im_start|>user
What is the least positive integer multiple of 30 that can be written with only the digits 0 and 2?<|im_end|>
<|im_start|>assistant

a:
{'action': 'We want to find the least positive integer multiple of 30 that can be written with only the digits 0 and 2. ки\nStep 1: The smallest multiple of 30 is 30 itself, but it contains the digit 0. ки\nStep 2: The next multiple of 30 is 60, which contains the digit 0 again. ки\nStep 3: The next multiple of 30 is 90, which contains the digit 0 again. ки\nStep 4: The next multiple of 30 is 120, which contains the digit 0 again. ки\nStep 5: Finally, the next multiple of 30 is 150, which only contains the digits 0 and 2. ки\nStep 6: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{150}$. The answer is: 150 ки\n\n', 'prob': 0.26125988425317836, 'num_token': 217, 'finish_reason': 'stop'}
rs:[0.8916015625, 0.60400390625, 0.607421875, 0.54296875, 0.4765625, 0.611328125, 0.57763671875]
2025-01-03 04:11:42.457 | WARNING | reason.guided_search.tree:_expand_leaf_node:692 - PRM value length not match with action history. len(prm)=4, len(act_hist)=0 s:
<|im_start|>system
Please reason step by step, and put your final answer within \boxed{{}}.<|im_end|>
<|im_start|>user
What is the least positive integer multiple of 30 that can be written with only the digits 0 and 2?<|im_end|>
<|im_start|>assistant

a:
{'action': 'We know that a number is divisible by 30 if it is divisible by both 2 and 3. ки\nStep 1: Since we can only use the digits 0 and 2, we need to find the smallest multiple of 30 that only contains these digits. ки\nStep 2: To be divisible by 2, the number must end in 0 or 2. ки\nStep 3: To be divisible by 3, the sum of its digits must be divisible by 3. ки\nStep 4: The smallest multiple of 30 that satisfies these conditions is 20. ки\nStep 5: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{20}$. The answer is: 20 ки\n\n', 'prob': 0.2554836445415253, 'num_token': 186, 'finish_reason': 'stop'}
rs:[0.5849609375, 0.5390625, 0.640625, 0.64794921875, 0.23095703125, 0.06658935546875]
2025-01-03 04:11:42.457 | WARNING | reason.guided_search.tree:_expand_leaf_node:692 - PRM value length not match with action history. len(prm)=4, len(act_hist)=0 s:
<|im_start|>system
Please reason step by step, and put your final answer within \boxed{{}}.<|im_end|>
<|im_start|>user
What is the least positive integer multiple of 30 that can be written with only the digits 0 and 2?<|im_end|>
<|im_start|>assistant

a:
{'action': 'We want to find the smallest multiple of 30 that can be written with only the digits 0 and 2. ки\nStep 1: The smallest multiple of 30 that we can think of is 30 itself, which is not a valid answer since it contains the digit 1. ки\nStep 2: The next multiple of 30 is 60, which only contains the digits 0 and 2. ки\nStep 3: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{60}$. The answer is: 60 ки\n\n', 'prob': 0.24323187560054257, 'num_token': 142, 'finish_reason': 'stop'}
rs:[0.8759765625, 0.57373046875, 0.46484375, 0.259765625]
2025-01-03 04:11:42.457 | WARNING | reason.guided_search.tree:_expand_leaf_node:692 - PRM value length not match with action history. len(prm)=4, len(act_hist)=0 s:
<|im_start|>system
Please reason step by step, and put your final answer within \boxed{{}}.<|im_end|>
<|im_start|>user
What is the least positive integer multiple of 30 that can be written with only the digits 0 and 2?<|im_end|>
<|im_start|>assistant

a:
{'action': 'We want to find the smallest multiple of 30 that can be written using only the digits 0 and 2. ки\nStep 1: Since 30 is divisible by 2 and 3, our number must also be divisible by 2 and 3. ки\nStep 2: To be divisible by 2, the units digit must be even, so it must be 0 or 2. ки\nStep 3: To be divisible by 3, the sum of its digits must be divisible by 3. ки\nStep 4: The only possible sum of digits using only 0 and 2 is 0, 2, or 4. ки\nStep 5: However, we want the smallest multiple, so we choose the sum of digits to be 0. ки\nStep 6: The units digit must be 0, and the tens digit must be 2. ки\nStep 7: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is \boxed{20}. The answer is: 20 ки\n\n', 'prob': 0.2400245956047538, 'num_token': 248, 'finish_reason': 'stop'}
rs:[0.8759765625, 0.74609375, 0.71875, 0.7373046875, 0.60009765625, 0.60400390625, 0.6513671875, 0.1143798828125]
Result: ['We want to find the least positive integer multiple of 30 that can be written with only the digits 0 and 2. ки\nStep 1: The smallest multiple of 30 is 30 itself, but it contains the digit 0. ки\nStep 2: The next multiple of 30 is 60, which contains the digit 0 again. ки\nStep 3: The next multiple of 30 is 90, which contains the digit 0 again. ки\nStep 4: The next multiple of 30 is 120, which contains the digit 0 again. ки\nStep 5: Finally, the next multiple of 30 is 150, which only contains the digits 0 and 2. ки\nStep 6: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{150}$. The answer is: 150 ки\n\n', 'We know that a number is divisible by 30 if it is divisible by both 2 and 3. ки\nStep 1: Since we can only use the digits 0 and 2, we need to find the smallest multiple of 30 that only contains these digits. ки\nStep 2: To be divisible by 2, the number must end in 0 or 2. ки\nStep 3: To be divisible by 3, the sum of its digits must be divisible by 3. ки\nStep 4: The smallest multiple of 30 that satisfies these conditions is 20. ки\nStep 5: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{20}$. The answer is: 20 ки\n\n', 'We want to find the smallest multiple of 30 that can be written with only the digits 0 and 2. ки\nStep 1: The smallest multiple of 30 that we can think of is 30 itself, which is not a valid answer since it contains the digit 1. ки\nStep 2: The next multiple of 30 is 60, which only contains the digits 0 and 2. ки\nStep 3: Therefore, the least positive integer multiple of 30 that can be written with only the digits 0 and 2 is $\boxed{60}$. The answer is: 60 ки\n\n']
`

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions