-
Notifications
You must be signed in to change notification settings - Fork 84
[ENG-1290] Privacy Request Fields #6984
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (81.81%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #6984 +/- ##
==========================================
+ Coverage 87.00% 87.24% +0.23%
==========================================
Files 528 529 +1
Lines 34682 34759 +77
Branches 4010 4028 +18
==========================================
+ Hits 30176 30324 +148
+ Misses 3630 3553 -77
- Partials 876 882 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…il.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Jade Wibbels <jade@ethyca.com>
johnewart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some naive questions - I don't see anything blocking with the PR, though it would be nice to maybe explain the logic a little bit for our viewers at home if you have a moment. If the questions spark anything meaningful in terms of changes then I'm happy to re-review - I think my only "might be nice to change if possible" is around the type-safety / passing around of arbitrary data (Any / dict[str, Any]) but I understand that sometimes you just can't avoid it 😄
| return value.isoformat() | ||
| if isinstance(value, UUID): | ||
| return str(value) | ||
| return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this always be evaluation ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a totally reasonable question!
TLDR: this is set up to transition values into those expected/accepted by the evaluator. I have added a couple more transformations for edge cases and because of the updated generality moved it to task.conditional_dependencies.util.py so we can re-use it more easily.
Long answer for paper trail, might be useful to future us:
Manual tasks have the ability to create conditionally based on conditions that are set by the user. The original form of this was using data from previous steps in the DSR
Example: Only make a manual task if we know that the users data will be there. This is indicated by a value in a field in one of their tables like Bigquery etc.
- When you create a manual task with this condition it will create a reference to that table so the evaluation will only take place after that table has been visited and the data has been added to the growing DSR access dict.
- This data is then passed to the evaluator which knows that is is translating from a
field_pathto a nested dict and that there will be a value there to compare against theopandvaluein the condition. - The operators are all defined and work with specific data types defined on here
The new privacy request fields are not part of this access dict, so we need to put the privacy request (and related models data) into the same form and make sure the data is also consistent for evaluation. This function in particular is making sure the values will work with the evaluator.
|
|
||
| def to_evaluation_data(self, field_addresses: set[str]) -> dict[str, Any]: | ||
| """ | ||
| Transforms the PrivacyRequest into an evaluation-ready dictionary structure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we return Any, how does the evaluator know what to do with the values? Will it be expected to type check everything or do we need to transform it into something a bit more consistent that has a predictable interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The evaluator is expecting a nested dict. The operators are all lambdas which do a data type check as part of their operation.
| current = extract_nested_field_value(current, parts) | ||
|
|
||
| # If we started with an Identity object and didn't extract any field from it | ||
| # (current is still the identity object), return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to say why we are returning None - it's not clear
| elif parts[0] == "custom_privacy_request_fields": | ||
| current = self.custom_privacy_request_fields_data | ||
|
|
||
| if current != self.privacy_request: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because the default is "anything in the privacy request doesn't have an explicit start path"? What if I pass in privacy_request.xxx.yyy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the fields start with privacy_request. We remove this from the path on line 72
# first part will always be privacy_request
parts: list[str] = field_address.split(".")[1:]The current is set to privacy request at that point - if the next part is one of the handled relationships (policy, identity, or custom_privacy_request_fields), the current is updated. I am going to update to pop the next part for each to make it clearer why.
This only saves 1 line and causes confusion :)
Co-authored-by: Jade Wibbels <jade@ethyca.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Ticket ENG-1290
Description Of Changes
🎯 As a user, I want to specify conditions that must be true to create manual tasks, which check the request data, so that I can create tasks conditionally based on data in the request.
This PR is focused on getting the data from the privacy request into dictionary form to be passed to the evaluator. It uses convenience fields introduced in #6983 and produces the minimum dictionary based on the field paths provided.
Code Changes
src/fides/api/task/conditional_dependencies/privacy_request/privacy_request_data.pytests/api/task/conditional_dependencies/privacy_request/test_privacy_request_data.pySteps to Confirm
Pre-Merge Checklist
CHANGELOG.mdupdatedmaindowngrade()migration is correct and works