Migrating JavaScript Dataflow Queries
The JavaScript analysis used to have its own data flow library, which differed from the shared data flow library used by other languages. This library has now been deprecated in favor of the shared library.
This article explains how to migrate JavaScript data flow queries to use the shared data flow library, and some important differences to be aware of. Note that the article on analyzing data flow in JavaScript and TypeScript provides a general guide to the new data flow library, whereas this article aims to help with migrating existing queries from the old data flow library.
Note that the DataFlow::Configuration class is still backed by the original data flow library, but has been marked as deprecated.
This means data flow queries using this class will continue to work, albeit with deprecation warnings, until the 1-year deprecation period expires in early 2026.
It is recommended that all custom queries are migrated before this time, to ensure they continue to work in the future.
Data flow queries should be migrated to use DataFlow::ConfigSig-style modules instead of the DataFlow::Configuration class.
This is identical to the interface found in other languages.
When making this switch, the query will become backed by the shared data flow library instead. In other words, a data flow query only uses
the shared data flow library once it has been migrated to the ConfigSig style, as shown in the following table:
| API | Implementation |
|---|---|
| DataFlow::Configuration | Old library (deprecated, to be removed in early 2026) |
| DataFlow::ConfigSig | Shared library |
A straightforward translation to DataFlow::ConfigSig-style is usually possible, although there are some complications
that may cause the query to behave differently.
We’ll first cover some straightforward migration examples, and then go over some of the complications that may arise.
Simple migration example
A simple example of a query using the old data flow library is shown below:
/** @kind path-problem */
import javascript
import DataFlow::PathGraph
class MyConfig extends DataFlow::Configuration {
MyConfig() { this = "MyConfig" }
override predicate isSource(DataFlow::Node node) { ... }
override predicate isSink(DataFlow::Node node) { ... }
}
from MyConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "Flow found"
With the new style this would look like this:
/** @kind path-problem */
import javascript
module MyConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node node) { ... }
predicate isSink(DataFlow::Node node) { ... }
}
module MyFlow = DataFlow::Global<MyConfig>;
import MyFlow::PathGraph
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink, source, sink, "Flow found"
The changes can be summarized as:
- The DataFlow::Configuration class was replaced with a module implementing DataFlow::ConfigSig.
- The characteristic predicate was removed (modules have no characteristic predicates).
- Predicates such as isSource no longer have the override keyword (as they are defined in a module now).
- The configuration module is being passed to DataFlow::Global, resulting in a new module, called MyFlow in this example.
- The query imports MyFlow::PathGraph instead of DataFlow::PathGraph.
- The MyConfig cfg variable was removed from the from clause.
- The hasFlowPath call was replaced with MyFlow::flowPath.
- The type DataFlow::PathNode was replaced with MyFlow::PathNode.
With these changes, we have produced an equivalent query that is backed by the new data flow library.
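The same renaming applies to queries that do not report paths. As a minimal sketch, a query that previously called cfg.hasFlow(source, sink) would call the flow predicate of the instantiated module instead. The module names and the choice of source and sink below are assumptions made for illustration only:

```ql
/** @kind problem */
import javascript

// Illustrative configuration: flow from command-line arguments to the
// file name passed to fs.readFile. Source and sink are example choices.
module CommandLineFileNameConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) {
    node = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead()
  }

  predicate isSink(DataFlow::Node node) {
    node = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0)
  }
}

module CommandLineFileNameFlow = DataFlow::Global<CommandLineFileNameConfig>;

// Old style: from MyConfig cfg, ... where cfg.hasFlow(source, sink)
from DataFlow::Node source, DataFlow::Node sink
where CommandLineFileNameFlow::flow(source, sink)
select sink, "File name depends on $@.", source, "a command-line argument"
```

Because no path graph is needed here, the query works with plain DataFlow::Node values and does not import PathGraph.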
Taint tracking
For configuration classes extending TaintTracking::Configuration, the migration is similar but with a few differences:
- The TaintTracking::Global module should be used instead of DataFlow::Global.
- Some predicates originating from TaintTracking::Configuration should be renamed to match the DataFlow::ConfigSig interface:
  - isSanitizer should be renamed to isBarrier.
  - isAdditionalTaintStep should be renamed to isAdditionalFlowStep.
Note that there is no such thing as TaintTracking::ConfigSig. The DataFlow::ConfigSig interface is used for both data flow and taint tracking.
For example:
class MyConfig extends TaintTracking::Configuration {
MyConfig() { this = "MyConfig" }
override predicate isSanitizer(DataFlow::Node node) { ... }
override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
...
}
The above configuration can be migrated to the shared data flow library as follows:
module MyConfig implements DataFlow::ConfigSig {
predicate isBarrier(DataFlow::Node node) { ... }
predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
...
}
module MyFlow = TaintTracking::Global<MyConfig>;
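To make the renaming concrete, the following sketch shows a complete taint-tracking query in the new style. The source, the sink, and the sanitizer function name sanitize are assumptions chosen for this example, not part of any library model:

```ql
/** @kind problem */
import javascript

module SanitizedFileNameConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) {
    node = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead()
  }

  predicate isSink(DataFlow::Node node) {
    node = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0)
  }

  // Formerly isSanitizer: the result of any call to a function named
  // "sanitize" (a hypothetical name) is treated as clean.
  predicate isBarrier(DataFlow::Node node) {
    node.(DataFlow::CallNode).getCalleeName() = "sanitize"
  }
}

// TaintTracking::Global adds the default taint steps on top of value flow.
module SanitizedFileNameFlow = TaintTracking::Global<SanitizedFileNameConfig>;

from DataFlow::Node source, DataFlow::Node sink
where SanitizedFileNameFlow::flow(source, sink)
select sink, "File name is derived from $@.", source, "a command-line argument"
```

Note that the configuration module itself is unchanged between data flow and taint tracking; only the module it is passed to differs.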
Flow labels and flow states
The DataFlow::FlowLabel class has been deprecated. Queries that relied on flow labels should use the new flow state concept instead.
This is done by implementing DataFlow::StateConfigSig instead of DataFlow::ConfigSig, and passing the module to DataFlow::GlobalWithState
or TaintTracking::GlobalWithState. See using flow state for more details about flow state.
Some changes to be aware of:
- The 4-argument version of isAdditionalFlowStep now takes parameters in a different order. It now takes node1, state1, node2, state2 instead of node1, node2, state1, state2.
- Taint steps apply to all flow states, not just the taint flow label. See more details further down in this article.
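As a rough sketch of a state-based configuration, the example below tracks a value that starts in a "raw" state and moves to a "parsed" state when it passes through JSON.parse. The state names, the source, the sink, and the module names are all assumptions made for illustration:

```ql
import javascript

module ParsedInputConfig implements DataFlow::StateConfigSig {
  // Illustrative flow states replacing what would have been flow labels.
  class FlowState extends string {
    FlowState() { this = ["raw", "parsed"] }
  }

  predicate isSource(DataFlow::Node node, FlowState state) {
    node = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead() and
    state = "raw"
  }

  predicate isSink(DataFlow::Node node, FlowState state) {
    node = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0) and
    state = "parsed"
  }

  // New parameter order: node1, state1, node2, state2.
  // The old library used: node1, node2, label1, label2.
  predicate isAdditionalFlowStep(
    DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
  ) {
    exists(DataFlow::CallNode call |
      call = DataFlow::globalVarRef("JSON").getAMemberCall("parse") and
      node1 = call.getArgument(0) and
      node2 = call and
      state1 = "raw" and
      state2 = "parsed"
    )
  }
}

module ParsedInputFlow = DataFlow::GlobalWithState<ParsedInputConfig>;

from DataFlow::Node source, DataFlow::Node sink
where ParsedInputFlow::flow(source, sink)
select source, sink
```

When porting an existing 4-argument isAdditionalFlowStep, the body can usually stay as it is; only the parameter list needs reordering.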
Barrier guards
The predicates isBarrierGuard and isSanitizerGuard have been removed.
Instead, the isBarrier predicate must be used to define all barriers. To do this, barrier guards can be reduced to a set of barrier nodes using the DataFlow::MakeBarrierGuard module.
For example, consider this data flow configuration using a barrier guard:
class MyConfig extends DataFlow::Configuration {
override predicate isBarrierGuard(DataFlow::BarrierGuardNode node) {
node instanceof MyBarrierGuard
}
...
}
class MyBarrierGuard extends DataFlow::BarrierGuardNode {
MyBarrierGuard() { ... }
override predicate blocks(Expr e, boolean outcome) { ... }
}
This can be migrated to the shared data flow library as follows:
module MyConfig implements DataFlow::ConfigSig {
predicate isBarrier(DataFlow::Node node) {
node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode()
}
...
}
class MyBarrierGuard extends DataFlow::Node {
MyBarrierGuard() { ... }
predicate blocksExpr(Expr e, boolean outcome) { ... }
}
The changes can be summarized as:
- The contents of isBarrierGuard have been moved to isBarrier.
- The node instanceof MyBarrierGuard check was replaced with node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode().
- The MyBarrierGuard class no longer has DataFlow::BarrierGuardNode as a base class. We simply use DataFlow::Node instead.
- The blocks predicate has been renamed to blocksExpr and no longer has the override keyword.
See using flow state for examples of how to use barrier guards with flow state.
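For a concrete picture of what a migrated guard might look like, here is a minimal sketch in which a call of the form isValid(x) is treated as a guard for x. The function name isValid, the class and module names, and the source and sink are hypothetical and chosen only for this example:

```ql
import javascript

// Hypothetical guard: when isValid(x) evaluates to true, x is considered
// safe in the code controlled by that outcome, for example:
//   if (isValid(name)) { fs.readFile(name, cb); }
class IsValidCall extends DataFlow::CallNode {
  IsValidCall() { this.getCalleeName() = "isValid" }

  // Formerly an override of blocks; no override keyword is needed now.
  predicate blocksExpr(Expr e, boolean outcome) {
    e = this.getArgument(0).asExpr() and
    outcome = true
  }
}

module GuardedFileNameConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) {
    node = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead()
  }

  predicate isSink(DataFlow::Node node) {
    node = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0)
  }

  // The guard is reduced to the set of nodes it protects.
  predicate isBarrier(DataFlow::Node node) {
    node = DataFlow::MakeBarrierGuard<IsValidCall>::getABarrierNode()
  }
}

module GuardedFileNameFlow = DataFlow::Global<GuardedFileNameConfig>;

from DataFlow::Node source, DataFlow::Node sink
where GuardedFileNameFlow::flow(source, sink)
select source, sink
```

The guard class extends DataFlow::CallNode here rather than DataFlow::Node directly; this is assumed to be acceptable because any subclass of DataFlow::Node that defines blocksExpr should satisfy the signature expected by MakeBarrierGuard.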
Query-specific load and store steps
The predicates isAdditionalLoadStep, isAdditionalStoreStep, and isAdditionalLoadStoreStep have been removed. There is no way to emulate the original behavior.
Library models can still contribute such steps, but they will be applicable to all queries. Also see the section on jump steps further down.
Changes in behavior
When the query has been migrated to the new interface, it may seem to behave differently due to some technical differences in the internals of the two data flow libraries. The most significant changes are described below.
Taint steps now propagate all flow states
There’s an important change from the old data flow library when using flow state and taint-tracking together.
When using TaintTracking::GlobalWithState, all flow states can propagate along taint steps.
In the old data flow library, only the taint flow label could propagate along taint steps.
A straightforward translation of such a query may therefore result in new flow paths being found, which might be unexpected.
To emulate the old behavior, use DataFlow::GlobalWithState instead of TaintTracking::GlobalWithState,
and manually add taint steps using isAdditionalFlowStep. The predicate TaintTracking::defaultTaintStep can be used to access the set of taint steps.
For example:
module MyConfig implements DataFlow::StateConfigSig {
class FlowState extends string {
FlowState() { this = ["taint", "foo"] }
}
predicate isAdditionalFlowStep(DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2) {
// Allow taint steps to propagate the "taint" flow state
TaintTracking::defaultTaintStep(node1, node2) and
state1 = "taint" and
state2 = "taint"
}
...
}
module MyFlow = DataFlow::GlobalWithState<MyConfig>;
Jump steps across function boundaries
When a flow step crosses a function boundary, that is, it starts and ends in two different functions, it will now be classified as a “jump” step.
Jump steps can be problematic in some cases. Roughly speaking, the data flow library will “forget” which call site it came from when following a jump step. This can lead to spurious flow paths that go into a function through one call site, and back out of a different call site.
If the step was generated by a library model, that is, the step is applicable to all queries, this is best mitigated by converting the step to a flow summary.
For example, the following library model adds a taint step from x to y in foo.bar(x, y => {}):
class MyStep extends TaintTracking::SharedTaintStep {
override predicate step(DataFlow::Node node1, DataFlow::Node node2) {
exists(DataFlow::CallNode call |
call = DataFlow::moduleMember("foo", "bar").getACall() and
node1 = call.getArgument(0) and
node2 = call.getCallback(1).getParameter(0)
)
}
}
Because this step crosses a function boundary, it becomes a jump step. This can be avoided by converting it to a flow summary as follows:
class MySummary extends DataFlow::SummarizedCallable {
MySummary() { this = "MySummary" }
override DataFlow::CallNode getACall() { result = DataFlow::moduleMember("foo", "bar").getACall() }
override predicate propagatesFlow(string input, string output, boolean preservesValue) {
input = "Argument[0]" and
output = "Argument[1].Parameter[0]" and
preservesValue = false // taint step
}
}
See customizing library models for JavaScript for details about the format of the input and output strings.
The aforementioned article also provides guidance on how to store the flow summary in a data extension.
For query-specific steps that cross function boundaries, that is, steps added with isAdditionalFlowStep, there is currently no way to emulate the original behavior.
A possible workaround is to convert the query-specific step to a flow summary. In this case it should be stored in a data extension to avoid performance issues, although this also means
that all other queries will be able to use the flow summary.
Barriers block all flows
In the shared data flow library, a barrier blocks all flows, even if the tracked value is inside a content.
In the old data flow library, only barriers specific to the data flow label blocked flows when the tracked value was inside a content.
This rarely has a significant impact, but some users may observe changes in their results because of it.
There is currently no way to emulate the original behavior.
Further reading
Analyzing data flow in JavaScript and TypeScript provides a general guide to the new data flow library.
Using flow state for precise data flow analysis provides a general guide on using flow state.