
feat: add support reranking using LLM #147

Merged
junjiem merged 4 commits into main from feat/reranking_using_llm
Nov 20, 2025
Conversation

@junjiem
Collaborator

@junjiem junjiem commented Nov 20, 2025

Summary by CodeRabbit

  • New Features

    • LLM-based reranking enabled with configurable model selection for result scoring.
    • Results can be reranked by the chosen LLM when reranking mode is active.
  • Bug Fixes / Behavior

    • Validation updated so reranking can use an LLM without requiring a pre-existing scoring model; otherwise a scoring model is required.
  • Documentation

    • Added default system and user prompts for LLM relevance scoring.


@coderabbitai
Contributor

coderabbitai bot commented Nov 20, 2025

Walkthrough

Adds end-to-end LLM-based reranking: two new config options (USE_LLM_RERANKING, RERANKING_LLM), a new LlmScoringModel that uses a ChatModel for scoring, prompt templates for system/user scoring prompts, and factory logic to instantiate and validate LLM-based scoring when reranking is enabled.

Changes

Cohort / File(s) Summary
Factory Configuration
dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java
Added USE_LLM_RERANKING and RERANKING_LLM public ConfigOptions; imported LlmScoringModel and ChatModel; included new options in optional options and fingerprinting; updated create(...) to resolve ChatModel and instantiate LlmScoringModel when reranking is enabled; adjusted validateConfigOptions(...) to allow reranking with either an existing scoringModel or LLM-based reranking and to enforce constraints when LLM reranking is disabled.
Scoring Model
dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java
New public class implementing ScoringModel; a builder-based constructor accepts a ChatModel; uses AiServices to create an Assistant configured with the system/user prompt templates; scoreAll(...) queries the assistant for each segment and returns normalized Double scores wrapped in a Response.
Prompt Resources
dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt, dat-core/src/main/resources/prompts/default/scoring_user_prompt_template.txt
Added system prompt defining a strict 0–10 integer scoring scheme and relevance ranges; added user prompt template with Query and Document placeholders for per-segment evaluation.
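The scoring flow described above — ask the assistant for a 0–10 integer score per segment, then normalize to a Double — can be sketched in plain Java without the langchain4j dependency. The BiFunction below is only a stand-in for the AiServices-generated assistant, and all names are illustrative, not the PR's actual API:

```java
import java.util.List;
import java.util.function.BiFunction;

public class ScoreNormalizationSketch {

    // Mirrors scoreAll's per-segment logic: ask the "assistant" for a 0-10
    // integer score, fall back to 0 when it returns null, then widen to Double.
    static List<Double> scoreAll(List<String> segments, String query,
                                 BiFunction<String, String, Integer> assistant) {
        return segments.stream()
                .map(segment -> {
                    Integer score = assistant.apply(query, segment);
                    return score != null ? score.doubleValue() : 0.0;
                })
                .toList();
    }

    public static void main(String[] args) {
        // Stand-in "LLM": empty documents get no score, everything else a 7.
        BiFunction<String, String, Integer> assistant =
                (query, doc) -> doc.isEmpty() ? null : 7;
        System.out.println(scoreAll(List.of("doc a", ""), "some query", assistant));
        // Prints: [7.0, 0.0]
    }
}
```

The null fallback here is the same defensive step the review below asks for, since an LLM is not guaranteed to return a parseable integer.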

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Factory as DefaultContentStoreFactory
    participant LlmScorer as LlmScoringModel
    participant LLM as ChatModel/LLM

    Client->>Factory: create(config)
    Factory->>Factory: validateConfigOptions()
    alt RERANK_MODE && USE_LLM_RERANKING
        Factory->>Factory: resolve ChatModel (RERANKING_LLM or default)
        Factory->>LlmScorer: new LlmScoringModel(chatModel)
        Factory->>Factory: set scoringModel = LlmScorer
    else RERANK_MODE && !USE_LLM_RERANKING
        Factory->>Factory: require existing scoringModel
    end
    Factory->>Client: return ContentStore

    Note over Client,LlmScorer: Runtime scoring flow
    Client->>LlmScorer: scoreAll(segments, query)
    LlmScorer->>LLM: Assistant.score(query, document)
    LLM->>LlmScorer: integer score (0-10)
    LlmScorer->>Client: Response<List<Double>>

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to: factory conditional logic for resolving/instantiating ChatModel and LlmScoringModel; validation branches enforcing presence or absence of a pre-set scoringModel; prompt loading paths and Assistant annotations in LlmScoringModel.java.

Poem

🐇 Hop lightly through prompts and code,

I nudge each doc down a scoring road.
Ten to zero, the numbers sing,
A rabbit’s nudge to rank each thing.
Hooray — the rerank bells ring!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
  • Title check — ✅ Passed: the title 'feat: add support reranking using LLM' directly matches the main change, introducing LLM-based reranking support with new config options (USE_LLM_RERANKING, RERANKING_LLM) and a new LlmScoringModel class.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df651ad and 5bd057f.

📒 Files selected for processing (4)
  • dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java (5 hunks)
  • dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1 hunks)
  • dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt (1 hunks)
  • dat-core/src/main/resources/prompts/default/scoring_user_prompt_template.txt (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java (2)
dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)
  • LlmScoringModel (19-44)
dat-core/src/main/java/ai/dat/core/utils/FactoryUtil.java (1)
  • FactoryUtil (29-376)
🪛 LanguageTool
dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt

[grammar] ~4-~4: Ensure spelling is correct
Context: ...the given query Guidelines: Score 0-2: Documentis completely irrelevant Score 3-5: Docume...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~5-~5: Ensure spelling is correct
Context: ...mentis completely irrelevant Score 3-5: Documenthas some relevant information but doesn't d...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~7-~7: This phrase is redundant. Consider writing “relevant”.
Context: ...swers the query Score 9-10: Document is highly relevant and directly answers the query You MUST...

(HIGHLY_RELEVANT)

🔇 Additional comments (6)
dat-core/src/main/resources/prompts/default/scoring_user_prompt_template.txt (1)

1-5: LGTM!

The template structure is clear and appropriate for the scoring use case.

dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)

23-28: LGTM!

The builder-based constructor and AiServices integration is well-structured.

dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java (4)

12-12: LGTM!

The new imports are appropriate for the LLM reranking integration.

Also applies to: 16-16


73-84: LGTM!

The new configuration options are well-documented and appropriate for controlling LLM-based reranking behavior.


296-296: LGTM!

The new options are correctly added to the optional options set.


497-500: LGTM!

The validation logic correctly ensures that when reranking is enabled, either a scoring model is provided or LLM reranking is enabled.

Comment on lines +31 to +37
public Response<List<Double>> scoreAll(List<TextSegment> segments, String query) {
    List<Double> scores = segments.stream()
            .map(segment -> assistant.scoring(query, segment.text()))
            .map(Double::valueOf)
            .toList();
    return Response.from(scores);
}


⚠️ Potential issue | 🟠 Major

Add error handling for LLM response parsing.

The current implementation assumes the LLM will always return a valid integer, but LLMs can fail to follow instructions or return unexpected output. If assistant.scoring returns null or the LLM returns non-numeric text, this will throw NullPointerException or cause parsing failures at runtime.

Consider adding defensive error handling:

 @Override
 public Response<List<Double>> scoreAll(List<TextSegment> segments, String query) {
     List<Double> scores = segments.stream()
             .map(segment -> assistant.scoring(query, segment.text()))
+            .map(score -> score != null ? score : 0)
             .map(Double::valueOf)
             .toList();
     return Response.from(scores);
 }

Alternatively, add try-catch and logging to handle parse failures gracefully.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

public Response<List<Double>> scoreAll(List<TextSegment> segments, String query) {
    List<Double> scores = segments.stream()
            .map(segment -> assistant.scoring(query, segment.text()))
            .map(Double::valueOf)
            .toList();
    return Response.from(scores);
}

public Response<List<Double>> scoreAll(List<TextSegment> segments, String query) {
    List<Double> scores = segments.stream()
            .map(segment -> {
                try {
                    Integer score = assistant.scoring(query, segment.text());
                    // Fall back to 0 when the LLM returns no usable score
                    return score != null ? score.doubleValue() : 0.0;
                } catch (RuntimeException e) {
                    // Log error and return default value
                    return 0.0;
                }
            })
            .toList();
    return Response.from(scores);
}
🤖 Prompt for AI Agents
In dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java around lines
31 to 37, the code assumes assistant.scoring always returns a valid numeric
value; add defensive parsing and error handling so null or non-numeric responses
don't crash. For each segment call assistant.scoring, check for null, and parse
the returned string inside a try-catch (catch NumberFormatException and any
runtime exceptions); on failure log a warning including segment identifier or
snippet and the raw LLM response, then use a safe fallback score (e.g., 0.0) or
skip the segment per project convention; collect the safe Double values and
return Response.from(scores) so the method never throws due to LLM parse errors.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt (1)

1-8: LGTM! Prompt is clear and ready for use.

The typos from the previous review have been fixed. The static analysis hint about "highly relevant" being redundant can be safely ignored—the phrase effectively distinguishes the 9-10 score range from the 6-8 "relevant" range.

dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)

31-38: Add comprehensive error handling and logging for LLM scoring failures.

While the null check on Line 34 addresses null responses, the implementation lacks protection against other failure modes:

  1. Parse exceptions: If the LLM returns non-numeric text despite instructions, LangChain4j's Integer parsing may throw exceptions rather than returning null.
  2. Runtime exceptions: Network issues, timeouts, or model errors could throw uncaught exceptions.
  3. Silent failures: The fallback to 0 on Line 34 masks failures without logging, making debugging difficult.

Wrap the scoring call in a try-catch block with logging:

 @Override
 public Response<List<Double>> scoreAll(List<TextSegment> segments, String query) {
     List<Double> scores = segments.stream()
-            .map(segment -> assistant.scoring(query, segment.text()))
-            .map(score -> score != null ? score : 0)
+            .map(segment -> {
+                try {
+                    Integer score = assistant.scoring(query, segment.text());
+                    return score != null ? score : 0;
+                } catch (Exception e) {
+                    log.warn("LLM scoring failed for segment: {}. Using fallback score 0. Error: {}", 
+                             segment.text().substring(0, Math.min(50, segment.text().length())), 
+                             e.getMessage());
+                    return 0;
+                }
+            })
             .map(Double::valueOf)
             .toList();
     return Response.from(scores);
 }

Don't forget to add the @Slf4j annotation to the class (Lombok is already imported).

🧹 Nitpick comments (1)
dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)

35-35: Consider using a clearer method for Integer-to-Double conversion.

Double::valueOf works here through unboxing and widening (Integer → int → double), but Number::doubleValue or Integer::doubleValue would be more explicit and easier to understand.

-            .map(Double::valueOf)
+            .map(Number::doubleValue)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5bd057f and 45b89a3.

📒 Files selected for processing (3)
  • dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java (6 hunks)
  • dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1 hunks)
  • dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java (2)
dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)
  • LlmScoringModel (19-45)
dat-core/src/main/java/ai/dat/core/utils/FactoryUtil.java (1)
  • FactoryUtil (29-376)
🪛 LanguageTool
dat-core/src/main/resources/prompts/default/scoring_system_prompt.txt

[style] ~7-~7: This phrase is redundant. Consider writing “relevant”.
Context: ...swers the query Score 9-10: Document is highly relevant and directly answers the query You MUST...

(HIGHLY_RELEVANT)

🔇 Additional comments (1)
dat-core/src/main/java/ai/dat/core/scoring/LlmScoringModel.java (1)

41-42: Both prompt resource files verified to exist in the classpath.

Verification confirms that both prompt files required by the LlmScoringModel interface are present and accessible:

  • prompts/default/scoring_system_prompt.txt
  • prompts/default/scoring_user_prompt_template.txt

The resource files are located in dat-core/src/main/resources/ and will be found correctly at runtime.

Comment on lines +396 to +402
if (config.get(RERANK_MODE) && config.get(USE_LLM_RERANKING)) {
    ChatModel rerankingChatModel = config.getOptional(RERANKING_LLM)
            .map(instances::get).orElse(defaultInstance).getChatModel();
    builder.scoringModel(LlmScoringModel.builder()
            .chatModel(rerankingChatModel)
            .build());
}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

LLM reranking instantiation looks correct.

The logic properly creates an LlmScoringModel only when both RERANK_MODE and USE_LLM_RERANKING are enabled, with a safe fallback to the default LLM instance if RERANKING_LLM is not specified.

Verify that the integration works correctly by checking how the scoringModel is used in DefaultContentStore:


🏁 Script executed:

#!/bin/bash
# Description: Verify scoringModel usage in DefaultContentStore

echo "Finding DefaultContentStore usage of scoringModel..."
ast-grep --pattern 'class DefaultContentStore {
  $$$
  scoringModel
  $$$
}'

echo -e "\n\nSearching for rerank-related methods..."
rg -n -A 5 -B 2 "\.scoringModel\(" dat-core/src/main/java/ai/dat/core/contentstore/DefaultContentStore.java

Length of output: 1639



---

🏁 Script executed:

#!/bin/bash
# Description: Verify validation logic for USE_LLM_RERANKING and RERANK_MODE configuration

echo "Checking validation code around lines 497-500..."
rg -n -A 10 -B 5 "rerankMode\|RERANK_MODE" dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java | head -100

echo -e "\n\nSearching for USE_LLM_RERANKING validation..."
rg -n -A 5 -B 5 "USE_LLM_RERANKING" dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java

Length of output: 330



---

🏁 Script executed:

#!/bin/bash
# Description: Find correct path and verify validation logic

echo "Finding DefaultContentStoreFactory.java..."
fd "DefaultContentStoreFactory.java"

echo -e "\n\nSearching for validation logic..."
rg -n -A 10 -B 5 "rerankMode.*useLlmReranking\|USE_LLM_RERANKING" --type java

Length of output: 155



---

🏁 Script executed:

#!/bin/bash
# Description: View actual validation logic around lines 497-513

echo "Viewing validation code..."
sed -n '490,520p' dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java

echo -e "\n\nViewing RERANK_MODE and USE_LLM_RERANKING config declarations..."
sed -n '70,85p' dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java

Length of output: 3009


LLM reranking instantiation looks correct, but missing validation for nonsensical configuration.

The logic properly creates an LlmScoringModel only when both RERANK_MODE and USE_LLM_RERANKING are enabled, with a safe fallback to the default LLM instance if RERANKING_LLM is not specified. Integration verified—scoringModel is correctly passed to ReRankingContentAggregator in all reranking flows.

However, add validation to prevent USE_LLM_RERANKING=true without RERANK_MODE=true. Currently, if a user enables USE_LLM_RERANKING but leaves RERANK_MODE false, the LLM scoring model is silently not created (line 396-402 requires both conditions), and the configuration is ignored. Add a check at lines 497-500 (after retrieving both flags) to enforce: if useLlmReranking is true, then rerankMode must also be true.

Comment on lines +497 to +500
Boolean useLlmReranking = config.get(USE_LLM_RERANKING);
Preconditions.checkArgument(!rerankMode || useLlmReranking || scoringModel != null,
        "'" + RERANK_MODE.key() + "' is true and '" + USE_LLM_RERANKING.key() + "' is false, " +
                "reranking has not been set yet");


⚠️ Potential issue | 🟠 Major

Add validation that USE_LLM_RERANKING requires RERANK_MODE to be enabled.

The current validation ensures that when RERANK_MODE is true, either USE_LLM_RERANKING is true or a scoringModel exists. However, the reverse constraint is missing: USE_LLM_RERANKING should only be meaningful when RERANK_MODE is also true. Allowing USE_LLM_RERANKING=true with RERANK_MODE=false is a misconfiguration that silently does nothing (Lines 396-402 won't execute).

Add a validation check after Line 500:

         Preconditions.checkArgument(!rerankMode || useLlmReranking || scoringModel != null,
                 "'" + RERANK_MODE.key() + "' is true and '" + USE_LLM_RERANKING.key() + "' is false, " +
                         "reranking has not been set yet");
+        Preconditions.checkArgument(!useLlmReranking || rerankMode,
+                "'" + USE_LLM_RERANKING.key() + "' is true but '" + RERANK_MODE.key() + "' is false. " +
+                        "LLM reranking requires reranking to be enabled.");
🤖 Prompt for AI Agents
In dat-core/src/main/java/ai/dat/core/factories/DefaultContentStoreFactory.java
around lines 497 to 500, add a validation that if USE_LLM_RERANKING is true then
RERANK_MODE must also be enabled: check the boolean from config
(useLlmReranking) and assert that if useLlmReranking is true then rerankMode is
true, using Preconditions.checkArgument with a clear error message that
"'USE_LLM_RERANKING' requires 'RERANK_MODE' to be true" to fail fast on this
misconfiguration.
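Taken together, the existing precondition and the suggested one reduce to plain boolean logic, which can be sanity-checked in isolation. The class and method names below are illustrative, not part of the PR:

```java
public class RerankConfigCheck {

    // Valid iff: when reranking is enabled, some scorer is available
    // (LLM-based or a preset scoring model), and USE_LLM_RERANKING is
    // only set when RERANK_MODE is also on.
    static boolean isValid(boolean rerankMode, boolean useLlmReranking,
                           boolean hasScoringModel) {
        boolean scorerAvailable = !rerankMode || useLlmReranking || hasScoringModel;
        boolean llmFlagMeaningful = !useLlmReranking || rerankMode;
        return scorerAvailable && llmFlagMeaningful;
    }

    public static void main(String[] args) {
        System.out.println(isValid(true, true, false));   // LLM reranking: true
        System.out.println(isValid(true, false, true));   // preset scorer: true
        System.out.println(isValid(true, false, false));  // no scorer: false
        System.out.println(isValid(false, true, false));  // stray LLM flag: false
    }
}
```

The last case is exactly the silent misconfiguration the review flags: the flag combination passes the original check but configures nothing.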

@junjiem junjiem merged commit b332e22 into main Nov 20, 2025
1 check passed
@junjiem junjiem deleted the feat/reranking_using_llm branch November 20, 2025 06:10