ENG-1553 #6951

JadeCara · 2025-11-12T00:24:39Z

Description Of Changes

🎯 e2e tests were taking longer than usual to finish: the ‘Dataset reference validation’ step alone was taking 7-10 minutes.

Datadog logs pointed to several bottleneck areas:

DatasetConfig N+1 queries (lines 497-510)
ConnectionConfig N+1 queries (lines 532-540)
Manual Task N+1 queries (create_manual_task_artificial_graphs)
AccessManualWebhook N+1 queries (get_manual_webhook_access_inputs)

Performance impact
Before (from logs):
18 seconds from start to dataset parsing
Multiple rounds of dataset parsing (suggesting repeated validation)
159 datasets × 2 relationships = potentially 318+ queries just for dataset loading
After (expected):
~3 queries for dataset loading (1 for datasets, 1 for connection_configs, 1 for ctl_datasets)
~2 queries for connection configs (1 for configs, 1 for datasets)
~3 queries for manual tasks (1 for tasks, 1 for configs/fields, 1 for dependencies)
~3 queries for manual webhooks (1 for webhooks, 1 for connection_configs, 1 for systems)
Total: ~11 queries instead of potentially 500+
These changes should reduce the "Dataset reference validation" step time. The optimizations address the N+1 query issues identified in the logs.
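
To make the query arithmetic above concrete, here is a minimal sketch that uses only the standard-library `sqlite3` module. It is not the Fides code: the table layout, the `conn_key` column, and every helper name are assumptions chosen for illustration. It counts SELECTs for a per-row lookup pattern versus a batched `WHERE id IN (...)` pattern, which is roughly the shape of SQL that `selectinload` emits.

```python
import sqlite3


class CountingDB:
    """In-memory SQLite database that counts the SELECTs it runs (illustrative only)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.selects = 0

    def select(self, sql, params=()):
        self.selects += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(n_datasets):
    """Create n dataset rows, each pointing at its own connection and ctl dataset."""
    db = CountingDB()
    db.conn.executescript(
        """
        CREATE TABLE connection_config (id INTEGER PRIMARY KEY, conn_key TEXT);
        CREATE TABLE ctl_dataset (id INTEGER PRIMARY KEY, fides_key TEXT);
        CREATE TABLE dataset_config (
            id INTEGER PRIMARY KEY,
            connection_config_id INTEGER,
            ctl_dataset_id INTEGER
        );
        """
    )
    for i in range(1, n_datasets + 1):
        db.conn.execute("INSERT INTO connection_config VALUES (?, ?)", (i, f"conn_{i}"))
        db.conn.execute("INSERT INTO ctl_dataset VALUES (?, ?)", (i, f"dataset_{i}"))
        db.conn.execute("INSERT INTO dataset_config VALUES (?, ?, ?)", (i, i, i))
    return db


DATASETS_SQL = (
    "SELECT id, connection_config_id, ctl_dataset_id FROM dataset_config ORDER BY id"
)


def load_n_plus_one(db):
    """One query for the datasets, then two more queries per dataset."""
    out = []
    for ds_id, conn_id, ctl_id in db.select(DATASETS_SQL):
        conn_key = db.select(
            "SELECT conn_key FROM connection_config WHERE id = ?", (conn_id,)
        )[0][0]
        fides_key = db.select(
            "SELECT fides_key FROM ctl_dataset WHERE id = ?", (ctl_id,)
        )[0][0]
        out.append((ds_id, conn_key, fides_key))
    return out


def fetch_by_ids(db, table, column, ids):
    """One SELECT ... WHERE id IN (...) for a whole batch of ids."""
    placeholders = ",".join("?" for _ in ids)
    rows = db.select(
        f"SELECT id, {column} FROM {table} WHERE id IN ({placeholders})", list(ids)
    )
    return dict(rows)


def load_batched(db):
    """One query for the datasets, then one query per related table."""
    rows = db.select(DATASETS_SQL)
    conn_keys = fetch_by_ids(db, "connection_config", "conn_key", {r[1] for r in rows})
    fides_keys = fetch_by_ids(db, "ctl_dataset", "fides_key", {r[2] for r in rows})
    return [(ds_id, conn_keys[c], fides_keys[d]) for ds_id, c, d in rows]


slow_db, fast_db = build_db(159), build_db(159)
slow_result = load_n_plus_one(slow_db)
fast_result = load_batched(fast_db)
print("per-row SELECTs:", slow_db.selects)
print("batched SELECTs:", fast_db.selects)
```

With 159 datasets and two related tables, the per-row version issues 1 + 159 × 2 SELECTs, while the batched version issues 3 regardless of the number of datasets.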

Code Changes

src/fides/api/models/manual_webhook.py - use selectinload
src/fides/api/service/privacy_request/request_runner_service.py - use selectinload
src/fides/api/task/manual/manual_task_utils.py - use selectinload and batch load
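
For readers unfamiliar with `selectinload`, the following is a small self-contained sketch of the pattern, assuming SQLAlchemy 1.4 or later and an in-memory SQLite engine. The two model classes are simplified stand-ins that only borrow the names `DatasetConfig` and `ConnectionConfig`; they are not the Fides models, and the query-counting listener is added purely for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload

Base = declarative_base()


class ConnectionConfig(Base):
    """Simplified stand-in model, not the real Fides ConnectionConfig."""

    __tablename__ = "connection_config"
    id = Column(Integer, primary_key=True)
    key = Column(String)


class DatasetConfig(Base):
    """Simplified stand-in model, not the real Fides DatasetConfig."""

    __tablename__ = "dataset_config"
    id = Column(Integer, primary_key=True)
    connection_config_id = Column(Integer, ForeignKey("connection_config.id"))
    connection_config = relationship(ConnectionConfig)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

selects = []


@event.listens_for(engine, "before_cursor_execute")
def record_select(conn, cursor, statement, parameters, context, executemany):
    # Record only SELECT statements so inserts and DDL are ignored.
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)


# Seed five datasets, each with its own connection config.
with Session(engine) as setup:
    for i in range(5):
        setup.add(DatasetConfig(connection_config=ConnectionConfig(key=f"conn_{i}")))
    setup.commit()


def count_selects(eager):
    """Load every dataset, touch its connection_config, and count the SELECTs."""
    selects.clear()
    with Session(engine) as session:
        query = session.query(DatasetConfig)
        if eager:
            query = query.options(selectinload(DatasetConfig.connection_config))
        keys = [ds.connection_config.key for ds in query.all()]
    assert len(keys) == 5
    return len(selects)


lazy_count = count_selects(eager=False)
eager_count = count_selects(eager=True)
print("lazy SELECTs:", lazy_count)
print("eager SELECTs:", eager_count)
```

Without the option, each access to `connection_config` triggers its own lazy load; with `selectinload`, the related rows arrive in one additional query keyed on the parent rows.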

Steps to Confirm

Run with fidesplus pointed at this branch.
Create several DSRs - There should be no change in functionality.
I am going to reach out to @nrxsmith to see whether there is any appreciable difference in nightly testing time once this is merged.

Pre-Merge Checklist

vercel · 2025-11-12T00:24:44Z

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments

Project	Deployment	Preview	Comments	Updated (UTC)
fides-plus-nightly	Ignored	Preview		Nov 19, 2025 7:08pm
fides-privacy-center	Ignored			Nov 19, 2025 7:08pm

codecov · 2025-11-12T00:52:27Z

Codecov Report

❌ Patch coverage is 84.61538% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.31%. Comparing base (054f7de) to head (4c3411f).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
src/fides/api/task/manual/manual_task_utils.py	80.95%	2 Missing and 2 partials ⚠️

❌ Your patch status has failed because the patch coverage (84.61%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6951      +/-   ##
==========================================
- Coverage   87.32%   87.31%   -0.01%     
==========================================
  Files         525      525              
  Lines       34515    34526      +11     
  Branches     3984     3986       +2     
==========================================
+ Hits        30140    30148       +8     
- Misses       3509     3511       +2     
- Partials      866      867       +1


…rocess

greptile-apps · 2025-11-12T16:16:43Z

Greptile Overview

Greptile Summary

This PR addresses performance bottlenecks in the dataset reference validation step by eliminating N+1 query issues identified through Datadog logs. The changes use SQLAlchemy's selectinload to eagerly load relationships and batch loading to reduce query count from potentially 500+ to approximately 11 queries.

Key optimizations:

DatasetConfig loading: Eager loads connection_config and ctl_dataset relationships to avoid repeated queries when building dataset graphs
ConnectionConfig loading: Eager loads datasets relationship to prevent N+1 queries in filter_fides_connector_datasets
AccessManualWebhook loading: Eager loads connection_config.system relationship to avoid lazy loading during webhook processing
ManualTask loading: Refactored to batch load all manual tasks with their configs.field_definitions and conditional_dependencies relationships in a single query, replacing the previous approach of querying each connection key individually

The refactoring in manual_task_utils.py extracts common collection creation logic into a helper function (_create_collection_from_manual_task) to avoid code duplication while maintaining the same functionality. All changes are focused on query optimization without altering business logic.
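
The batching idea described above, one query for all manual tasks followed by in-memory grouping through a shared helper, can be sketched in plain Python. Everything here is hypothetical: `ManualTaskRow`, `FakeStore`, and `build_collection` are illustrative stand-ins and do not reproduce `_create_collection_from_manual_task` or the actual models.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ManualTaskRow:
    """Hypothetical stand-in for a manual task whose fields are already loaded."""

    connection_key: str
    field_names: list = field(default_factory=list)


class FakeStore:
    """Hypothetical data source that counts how many times it is queried."""

    def __init__(self, tasks):
        self._tasks = tasks
        self.queries = 0

    def tasks_for(self, connection_key):
        self.queries += 1
        return [t for t in self._tasks if t.connection_key == connection_key]

    def all_tasks(self):
        self.queries += 1
        return list(self._tasks)


def build_collection(task):
    """Shared helper: turn one task into a collection-like dict."""
    return {"name": "manual_data", "fields": sorted(task.field_names)}


def graphs_per_key(store, connection_keys):
    """Per-key shape: one query for every connection key."""
    return {
        key: [build_collection(t) for t in store.tasks_for(key)]
        for key in connection_keys
    }


def graphs_batched(store, connection_keys):
    """Batched shape: one query, then group the rows in memory."""
    by_key = defaultdict(list)
    for task in store.all_tasks():
        by_key[task.connection_key].append(task)
    return {
        key: [build_collection(t) for t in by_key.get(key, [])]
        for key in connection_keys
    }


tasks = [
    ManualTaskRow("conn_a", ["email", "address"]),
    ManualTaskRow("conn_b", ["phone"]),
    ManualTaskRow("conn_a", ["name"]),
]
keys = ["conn_a", "conn_b", "conn_c"]

per_key_store, batched_store = FakeStore(tasks), FakeStore(tasks)
per_key_graphs = graphs_per_key(per_key_store, keys)
batched_graphs = graphs_batched(batched_store, keys)
print(per_key_store.queries, batched_store.queries)
```

Both functions go through the same helper, so the output is identical; only the number of round trips to the store differs.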

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The changes are focused performance optimizations using well-established SQLAlchemy patterns (eager loading with selectinload). The business logic remains unchanged, only the query strategy is optimized. The refactoring in manual_task_utils.py properly maintains the same functionality while eliminating N+1 queries through batch loading. All relationship paths being eagerly loaded are valid and exist in the models.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
src/fides/api/models/manual_webhook.py	5/5	Optimized `get_enabled` method by eagerly loading `connection_config.system` relationship using `selectinload` to prevent N+1 queries when accessing webhook system information
src/fides/api/service/privacy_request/request_runner_service.py	5/5	Replaced `DatasetConfig.all()` and `ConnectionConfig.all()` with eager loading using `selectinload` to pre-fetch related `connection_config`, `ctl_dataset`, and `datasets` relationships, eliminating N+1 queries during dataset validation
src/fides/api/task/manual/manual_task_utils.py	5/5	Refactored `create_manual_task_artificial_graphs` to batch load all manual tasks with eager loading of `configs.field_definitions` and `conditional_dependencies` relationships, replacing per-connection queries with a single batched query

greptile-apps

3 files reviewed, no comments

thabofletcher · 2025-11-14T18:51:02Z

src/fides/api/task/manual/manual_task_utils.py

+            name=connection_key,
+            collections=[collection],
+            connection_key=connection_key,
+        )


I just want to say that I love the style of doing validation early and return/continue for failure so that the core logic doesn't end up nested seven layers deep 👍
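
The style praised here can be illustrated with a short sketch: validate at the top of the loop body and `continue` on failure, so the core logic stays at a single indentation level. The dict-based tasks and the `build_graphs` function are hypothetical and only mirror the keyword names visible in the diff excerpt above.

```python
def build_graphs(tasks):
    """Build one graph-like dict per valid task, skipping invalid ones early."""
    graphs = []
    for task in tasks:
        connection_key = task.get("connection_key")
        if not connection_key:
            continue  # nothing to attach a graph to

        collection = task.get("collection")
        if not collection:
            continue  # no fields were defined for this task

        # Core logic runs un-nested once every check has passed.
        graphs.append(
            {
                "name": connection_key,
                "collections": [collection],
                "connection_key": connection_key,
            }
        )
    return graphs


example_tasks = [
    {"connection_key": "conn_a", "collection": {"name": "manual_data"}},
    {"connection_key": None, "collection": {"name": "orphan"}},
    {"connection_key": "conn_b", "collection": None},
]
example_graphs = build_graphs(example_tasks)
print(example_graphs)
```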

thabofletcher

🚀 🚀 🚀

…rocess

Co-authored-by: Jade Wibbels <jade@ethyca.com>

bottle necks

bcec9f1

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

494a544

…rocess

JadeCara marked this pull request as ready for review November 12, 2025 16:13

JadeCara requested a review from a team as a code owner November 12, 2025 16:13

JadeCara requested review from thabofletcher and removed request for a team November 12, 2025 16:13

vercel bot deployed to Preview – fides-plus-nightly November 12, 2025 16:14 View deployment

greptile-apps bot reviewed Nov 12, 2025

View reviewed changes

thabofletcher reviewed Nov 14, 2025

View reviewed changes

thabofletcher approved these changes Nov 14, 2025

View reviewed changes

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

db4578e

…rocess

vercel bot deployed to Preview – fides-plus-nightly November 17, 2025 18:35 View deployment

JadeCara enabled auto-merge November 17, 2025 18:35

JadeCara added this pull request to the merge queue Nov 17, 2025

JadeCara removed this pull request from the merge queue due to a manual request Nov 17, 2025

updated changelog

0abe100

JadeCara enabled auto-merge November 17, 2025 19:12

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

d3fe9e6

…rocess

JadeCara disabled auto-merge November 17, 2025 19:57

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

d06d3ef

…rocess

vercel bot deployed to Preview – fides-plus-nightly November 17, 2025 20:22 View deployment

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

67ff5fe

…rocess

JadeCara enabled auto-merge November 17, 2025 23:16

vercel bot deployed to Preview – fides-plus-nightly November 17, 2025 23:19 View deployment

JadeCara added this pull request to the merge queue Nov 17, 2025

JadeCara removed this pull request from the merge queue due to a manual request Nov 17, 2025

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

4bceb7e

…rocess

JadeCara enabled auto-merge November 18, 2025 00:29

mark xfail

350179d

JadeCara added this pull request to the merge queue Nov 18, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 18, 2025

JadeCara added this pull request to the merge queue Nov 18, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 18, 2025

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

cf94b9d

…rocess

JadeCara enabled auto-merge November 19, 2025 18:31

vercel bot deployed to Preview – fides-plus-nightly November 19, 2025 18:36 View deployment

Merge branch 'main' into ENG-1553-long-dataset-reference-validation-p…

5e019e4

…rocess

vercel bot deployed to Preview – fides-plus-nightly November 19, 2025 19:04 View deployment

changelog

4c3411f

JadeCara added this pull request to the merge queue Nov 19, 2025

Merged via the queue into main with commit 9042667 Nov 19, 2025
68 of 69 checks passed

JadeCara deleted the ENG-1553-long-dataset-reference-validation-process branch November 19, 2025 20:19

jjdaurora pushed a commit that referenced this pull request Dec 5, 2025

ENG-1553 (#6951)

9551a5e

Co-authored-by: Jade Wibbels <jade@ethyca.com>

3 participants
