TL;DR
Platform configuration is managed through a number of configuration files, each containing a JSON document that configures a specific processor type (CRSP, rules and typologies) and a specific processor instance, identified by a processor identifier (id@version) and a configuration version.
Changes to a rule processor’s behavior can be made by changing the parameters or rearranging the result bands (or cases).
Changes to the typology processor’s behavior for a specific typology can be made by changing the thresholds for interdiction or alerting or by changing the weighting of the rule results from the rule processors that contribute to the typology score.
Changes to the network map that is used in the CRSP for routing can add or remove rules from typologies, add or remove typologies from a channel and transaction type, and change the version of a configuration file that is used to calculate a rule or typology result.
In a production environment, configurations should never be over-written and new versions of configurations should be issued to supersede older versions. A new network map must then be issued to implement the updated configuration.
Info |
---|
In a test or PoC environment, it may sometimes be simpler to just overwrite existing configuration files so that you don’t have to constantly update the network map every time you experiment with a change, but don’t do this in your production environment. |
Configuration documents can be uploaded to the platform using the ArangoDB API deployed with the platform.
1. Overview of the detection methodology
The core detection capability within the platform is distributed across 3 distinct steps in the end-to-end evaluation flow.
...
Once data is ingested into the transaction history by the TMS API, the Channel Router and Setup Processor (CRSP) performs an initial “triage” step to determine if the transaction should be inspected by the platform, and in what way. At the moment this is a very simple decision based on the transaction type only (i.e. pain.001, pain.013, pacs.008 and pacs.002), though we envisage that the decision-making here could become more complex in the future by inspecting attributes contained in the message. For now, the CRSP uses the transaction type [1] to select the typologies that are to be evaluated and triggers the rules required by the typologies. The CRSP routing is configured via a network map that defines the hierarchy of typologies and rules. While the network map is not directly influenced by a calibration process at present, the behavior of existing rules and typologies may result in changes to the scope of the evaluation it defines. Some rules or typologies may be deemed to be ineffective in the current configuration and removed or recomposed, and new rules or typologies may be added as new behaviors emerge.
...
Each rule processor that receives the trigger payload from the CRSP evaluates the transaction and the historical behavior of its participants according to its specification and configuration. Rule processors are driven by a combination of parameters and result specifications to determine only one of a number of related outcomes. The rule outcome is then submitted to the typology processor for scoring.
The typology processor assigns a weighting to each rule outcome as it is received, based on the rule’s parent typologies’ configurations. Once all the rule results for a specific typology have been received, the typology processor adds all the weighted scores together into the typology score. The typology score can be evaluated against an “interdiction” threshold to determine if the client system should be instructed to block a transaction “in flight”, and also against an investigation threshold to trigger a review process at the end of the transaction evaluation. [2]
...
Once these three steps are complete, the evaluation of the transaction is wrapped up in the Channel and Transaction Aggregation and Decisioning Processor, where the results from typologies are aggregated and reviewed to determine if an investigation alert should be sent to the Case Management System. If any typology has breached either its investigation or interdiction threshold, the transaction will trigger an alert.
The evaluation process accommodates a number of different calibration levers that can be manipulated to alter the evaluation outcome.
...
In the CRSP:
Changes to the rule and typology scope of the evaluation
In the rule processors:
Changes to the parameters that influence the rule processor's behavior
Changes to the result bands that classify the rule processor's output
In the typology processor:
Changes to the rule result weightings
Changes to the typology thresholds
In this document, we will discuss how the various configuration documents are expected to be updated to influence evaluation behavior.
2. Configuration Management
Configuration documents are essentially files that contain a processor-specific configuration object in JSON format. The recommended way to upload a configuration file to the appropriate configuration database and collection is via ArangoDB's HTTP API, which is deployed as standard during platform deployment.
2.1. Rule Processor Configuration
2.1.2. Introduction
A rule processor is a custom-built module that evaluates an incoming message according to its code. When a new rule processor is developed, the rule designer will specify both the input parameters for the rule, as well as the output results. Changes to these attributes can alter a rule processor’s behavior and it is expected that these attributes are hosted in the rule configuration so that the rule processor behavior can be altered by updating the configuration instead of changing the rule processor code.
A rule processor configuration document typically contains the following information:
Rule configuration metadata
A config object that
may contain a number of parameters
may contain a number of exit conditions
will contain either result bands
or, alternatively, result cases
Rule configuration metadata
The rule configuration “header” contains metadata that describes the rule. The metadata includes the following attributes:
id identifies the specific processor and version related to the configuration
cfg is the unique version of the configuration. Multiple different versions of a config can co-exist simultaneously in the platform.
desc offers a readable description of the rule
The combination of the id and cfg strings forms a unique identifier for every rule configuration and is sometimes compiled into a database key, though this is not essential: the database enforces the uniqueness of any configuration to make sure that a specific version of a configuration can never be over-written.
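For illustration only (the identifier, version and description values below are invented), the metadata header of a rule configuration might look as follows, with the config object omitted:
Code Block |
---|
{
  "id": "rule-001@1.0.0",
  "cfg": "1.0.0",
  "desc": "Derived account age of the debtor account"
} |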
The configuration object - parameters
A rule processor’s parameters are used to define how a rule processor will operate to evaluate the incoming message. The requirements for the parameters are coded into the rule processor, and the parameters must be provided in the configuration for the rule processor to deliver a successful outcome. If any of the required parameters are missing, the rule processor will still deliver a result, but it will be a default error outcome. Parameters are given descriptive names to assist the operator in specifying them correctly. Parameters often differ from one rule to the next, but typically define thresholds and time-frames for the historical queries that are executed inside a rule processor. Some notable examples:
Parameter | Description |
---|---|
evaluationIntervalTime | The time-frame that defines the intervals into which a histogram is partitioned. Some rules perform a statistical analysis of behavior over time and partition the historical data into a histogram. This parameter defines, in milliseconds, the time-frame of each interval. |
maxQueryLimit | The maximum number of records to return in the query. This parameter limits the number of results that can be returned from the database. |
| A time (in milliseconds) that limits the maximum extent of a historical query. A query with a value of 86400000 would only look up messages received within the last 24 hours. |
| The least number of transactions required for the rule processor to produce a result. Some statistical algorithms require at least a certain number of data-points to be able to render a useful result. If the minimum number of transactions cannot be retrieved, the rule processor will raise a non-deterministic exit condition. |
tolerance | A margin of error for an evaluation against a threshold. With a tolerance of 0 (zero) the match against a target value would have to be exact, but with a tolerance value of 0.1, the match could be in a range either 10% below or above the threshold value. |
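As an illustrative sketch only, a hypothetical rule's parameters could be captured as named values in the config object as follows. The parameter names are taken from the table above, but the grouping under a parameters attribute and the example values (a one-hour histogram interval, a 100-record query limit and a 10% tolerance) are assumptions for the sketch; the exact parameters and their values are specific to each rule processor.
Code Block |
---|
"config": {
  "parameters": {
    "evaluationIntervalTime": 3600000,
    "maxQueryLimit": 100,
    "tolerance": 0.1
  }
} |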
...
Code | Description | Example(s) |
---|---|---|
.x00 | This condition applies to rule processors that rely on the current transaction being successful in order for the rule to produce a meaningful result. Unsuccessful transactions are often not processed to spare system resources or because the unsuccessful transaction means that the rule processor is unable to function as designed. | |
.x01 | For certain rules, a specific minimum number of historical transactions is required for the rule processor to produce an effective result. This exit condition will be reported if the minimum number of historical records cannot be retrieved in the rule processor. | |
| Currently unused. | |
| The statistical analyses employed in some rule processors evaluate trends in behavior over a number of transactions over a period of time. While the trend itself can be categorized and reported by the regular rule results, some results are not part of an automatable scaled result. This exception provides an outcome when the historical period does not show a clear trend, but the most recent period shows an upturn. | |
| Similar to | |
Example:
Code Block |
---|
"config": {
"exitConditions": [
{
"subRuleRef": ".x00",
"outcome": false,
"reason": "Unsuccessful transaction"
},
{
"subRuleRef": ".x01",
"outcome": false,
"reason": "Insufficient transaction history"
}
] |
Each exit condition contains the same attributes:
Attribute | Description |
---|---|
subRuleRef | Every rule processor is capable of reporting a number of different outcomes, but only a single outcome from the complete set is ultimately delivered to the typology processor. Each outcome is defined by a unique sub-rule reference identifier to differentiate the delivered outcome from the others and also to allow the typology processor to apply a unique weighting to that specific outcome. By convention, the exit condition sub-rule references are prefaced with an 'x'. |
outcome | The configuration file defines whether the result delivered by the rule processor is flagged as either true or false. Exit conditions are usually non-deterministic. |
reason [9] | The reason provides a human-readable description of the result that accompanies the rule result through to the eventual overall evaluation result. |
Info |
---|
The |
The configuration object - rule results
While the parameters and exit conditions may be optional for a specific rule processor, the core function and output of a rule processor are contained in the results object. Rule processors offer two different kinds of rule results:
Banded results, where the result from the rule processor is categorized into one out of a number of discrete bands that partition a contiguous range of possible results.
...
Cased results, where the result from the rule processor is an explicit value from a list of discrete and explicit values.
...
The rule processor’s core purpose is to produce a definitive deterministic result based on its programmed behavioral analysis of historical data. The rule configuration defines the bands or values for which rule results can be provided.
...
It is extremely important that the configuration of a rule processor does not leave any gaps in the results, whether banded or cased. Every possible outcome of a rule result must be accounted for, otherwise the rule processor may deliver a result that the typology processor cannot interpret. In the event that a rule processor produces a value that does not match any of the configured results, the rule processor will issue an error (.err) result with the reason description “Value provided undefined, so cannot determine rule outcome”.
Rule results - banded results
Banded results are partitions in a contiguous range of results, effectively from -∞ to +∞. When a target value is evaluated against a result band, the lower limit of the band is always evaluated with the >= operator and the upper limit is always evaluated with the < operator. This way, we can configure the upper limit of one band and the lower limit of the next band with the exact same value to make sure there is no overlap between bands and also no gap.
...
Where a lower limit is not provided, the rule processor will assume the intended target lower limit is -∞.
Where an upper limit is not provided, the rule processor will assume the intended target upper limit is +∞.
A rule processor with a banded results configuration can have an unlimited number of specified bands.
The rule result bands are specified in the config object in the rule configuration as an array of elements under a bands object:
Code Block |
---|
"config": {
"bands": [
{
"subRuleRef": ".01",
"upperLimit": 86400000,
"outcome": true,
"reason": "Account is less than 1 day old"
},
{
"subRuleRef": ".02",
"lowerLimit": 86400000,
"upperLimit": 2592000000,
"outcome": true,
"reason": "Account is between 1 and 30 days old"
},
{
"subRuleRef": ".03",
"lowerLimit": 2592000000,
"outcome": true,
"reason": "Account is more than 30 days old"
}
]
} |
In the example above, an account age of exactly 86400000 milliseconds (one day) falls into band .02, since lower limits are inclusive and upper limits exclusive. Each rule result band contains the same information:
Attribute | Description |
---|---|
subRuleRef | Every rule processor is capable of reporting a number of different outcomes, but only a single outcome from the complete set is ultimately delivered to the typology processor. Each outcome is defined by a unique sub-rule reference identifier to differentiate the delivered outcome from the others and also to allow the typology processor to apply a unique weighting to that specific outcome. We have elected to assign a numeric sequence to the sub-rule references for result bands, prefaced with a dot (“.”) separator, but this format is not mandatory for the sub-rule reference string. Any descriptive and unique string would be an acceptable sub-rule reference. |
lowerLimit | This attribute defines the lower limit of the band range and is evaluated inclusively (>=). |
upperLimit | This attribute defines the upper limit of the band range and is evaluated exclusively (<). |
outcome | The configuration file defines whether the result delivered by the rule processor is flagged as either true or false. |
reason [9] | The reason provides a human-readable description of the result that accompanies the rule result through to the eventual overall evaluation result. |
The limit values most frequently used in the platform are based on time-frames. In the platform, all time-frames and associated limits are represented in milliseconds. The following table reflects the conventional millisecond values for different time terms in our configurations:
Term | Milliseconds |
---|---|
second | 1,000 |
minute | 60,000 |
hour | 3,600,000 |
day | 86,400,000 |
week | 604,800,000 |
month (30.44 days) | 2,629,743,000 |
year (365.24 days) | 31,556,926,000 |
Rule results - cased results
...
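By analogy with the banded results described above, a cased results configuration lists each discrete value explicitly. The sketch below is illustrative only: the cases and value attribute names, the identifier-type values and the reasons are assumptions, and the actual attribute names and values are defined by the specific rule processor.
Code Block |
---|
"config": {
  "cases": [
    {
      "subRuleRef": ".00",
      "value": "MSISDN",
      "outcome": true,
      "reason": "The debtor account is identified by a mobile number"
    },
    {
      "subRuleRef": ".01",
      "value": "IBAN",
      "outcome": true,
      "reason": "The debtor account is identified by an IBAN"
    }
  ]
} |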
[1] In its current configuration, the platform only evaluates the pacs.002 as the trigger payload for the rule processors, and typologies have only been defined with the final status of a payment transaction in mind.
[2] The typology processor is not currently configured to interdict the transaction when the threshold is breached; only investigations are commissioned once the evaluation of all the typologies is complete.
An explicit version reference has been planned for development to make it easier for an operator to link an evaluation result to the specific originating network map.
[9] We have found during our performance testing that the text-based descriptions in our processor results undermine the performance gains we achieved with our Protobuf implementation. We will be removing the unabridged reason and processor descriptions from the configuration documents in favor of shorter look-up codes, which will then also be used to introduce regionalized/language-specific descriptions.