
ATTENTION:

This page has been migrated to the Tazama GitHub repository and is now located at:

https://github.com/frmscoe/docs/blob/main/Product/configuration-management.md

This page will no longer be maintained in Confluence.


TL;DR

Platform configuration is managed through a number of configuration files, each containing a JSON document that configures a specific processor type (CRSP, rules and typologies) and a specific processor instance identified by a processor identifier (id@version) and a configuration version.

...

The core detection capability within the platform is distributed across three distinct steps in the end-to-end evaluation flow.

...

  • Changes to the rule and typology scope of the evaluation (a - network map)

In the rule processors:

  • Changes to the parameters that influence the rule processors' behavior (b - rule config)

  • Changes to the result bands that classify the rule processors' outcomes (b - rule config)

In the typology processor:

  • Changes to the rule result weightings (c - typology config)

  • Changes to the typology threshold (c - typology config)

In this document, we will discuss how the various configuration documents are expected to be updated to influence evaluation behavior.

...

Configuration documents are essentially files that contain a processor-specific configuration object in JSON format. The recommended way to upload the configuration file to the appropriate configuration database (networkMap or configuration) and collection is via Arango DB’s HTTP API that is deployed as standard during platform deployment.
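
As a minimal, hedged sketch of the upload itself: a configuration document can be created by sending its JSON content as the body of a POST request to ArangoDB's document endpoint (/_db/<database>/_api/document/<collection>), assuming here a database and a collection both named configuration for a rule configuration, as described in the collection table later in this document. The request body is simply the configuration document, for example:

Code Block
{
  "id": "rule-001@1.0.0",
  "cfg": "1.0.0",
  "desc": "Derived account age - creditor",
  "config": {
    "parameters": {
      "maxQueryRange": 86400000
    }
  }
}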

The platform processes configurations in a specific order to evaluate an incoming transaction: it starts with the Channel Router & Setup Processor (CRSP), which interprets the network map for routing; it then moves to the rule processors, which interpret their individual rule configurations to determine how to evaluate the transaction; and it concludes with the typology processor, which uses a variety of typology configurations to summarize rule results into typologies (fraud or money laundering scenarios).
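
To make this routing order concrete, the abbreviated sketch below shows how a network map nests the evaluation route from message down to rule. The identifiers are illustrative, and the channel attributes and the typologies attribute name are assumptions inferred from the network map fragments shown later in this document:

Code Block
  "messages": [
    {
      "id": "004@1.0.0",
      "cfg": "1.0.0",
      "txTp": "pacs.002.001.12",
      "channels": [
        {
          "id": "001@1.0.0",
          "cfg": "1.0.0",
          "typologies": [
            {
              "id": "typology-processor@1.0.0",
              "cfg": "typology-001@1.0.0",
              "rules": [
                { "id": "002@1.0.0", "cfg": "1.0.0" }
              ]
            }
          ]
        }
      ]
    }
  ]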

...

The development cycle for the platform processors and their associated configurations follows a slightly different flow. The development and configuration process follows loosely cascading dependencies among the configuration documents: typologies rely on rules, and the network map that defines routing relies on the typology-and-rule structures.

...

Rule results roll up into typologies through a typology configuration. One would typically only start composing a typology once the target rules have been developed and deployed, though sometimes rules may be added to, or removed from, an existing typology through its configuration. Without a view of the target rules and their configurations, composing a typology blindly would be very difficult, so this step usually follows the completion of the rules.

...

  • Rule configuration metadata

  • A config object that:

    • may contain a number of parameters

    • may contain a number of exit conditions

    • will contain either result bands

    • or, alternatively, result cases

...

  • id identifies the specific rule processor, and its version, that will use the configuration. It is recommended that the rule processor “name” is drawn from the source-code repository where the rule processor code resides, and that the version matches the semantic version of the source code as defined in the source-code repository.

  • cfg is the unique version of the rule configuration. Multiple different versions of a rule configuration can co-exist simultaneously in the platform.

  • desc offers a readable description of the rule.

The combination of the id and cfg strings forms a unique identifier for every rule configuration and is sometimes compiled into a database key, though this is not essential: the database enforces the uniqueness of any configuration to make sure that a specific version of a configuration can never be over-written.

Example of the rule configuration metadata:

Code Block
{
  "id": "rule-001@1.0.0",
  "cfg": "1.0.0",
  "desc": "Derived account age - creditor",
  ...
}

The configuration object - parameters

A rule processor’s parameters are used to define how a rule processor will operate to evaluate the incoming message. The requirements for the parameters are coded into the rule processor and the parameters must be provided in the configuration for the rule processor to deliver a successful outcome. If any of the required parameters are missing, the rule processor will still deliver a result, but it will be a default error outcome. Parameters are given descriptive names to assist the operator in specifying them correctly. Parameters often differ from one rule to the next, but typically define thresholds and time-frames for the historical queries that are executed inside a rule processor. Some notable examples:

Parameter

Description

evaluationIntervalTime

The time-frame that defines the intervals into which a histogram is partitioned. Some rules perform a statistical analysis of behavior over time and partition the historical data into a histogram. This parameter defines, in milliseconds, the time-frame of each interval.

maxQueryLimit

The maximum number of records to return in the query. This parameter limits the number of results that can be returned from the database.

maxQueryRange

A time (in milliseconds) that limits the maximum extent of a historical query. A query with a value of 86400000 would only look up messages received within the last 24 hours.

minimumNumberOfTransactions

The minimum number of transactions required for the rule processor to produce a result. Some statistical algorithms require at least a certain number of data-points to be able to render a useful result. If the minimum number of transactions cannot be retrieved, the rule processor will raise a non-deterministic exit condition.

tolerance

A margin of error for an evaluation against a threshold. With a tolerance of 0 (zero) the match against a target value would have to be exact, but with a tolerance value of 0.1, the match could be in a range either 10% below or above the threshold value.

Example of the parameters object:

Code Block
  "config": {
    "parameters": {
      "maxQueryRange": 86400000,
      "commission": 0.1,
      "tolerance": 0.1
    }
  }

...

Code

Description

Example(s)

.x00

This condition applies to rule processors that rely on the current transaction being successful in order for the rule to produce a meaningful result. Unsuccessful transactions are often not processed to spare system resources or because the unsuccessful transaction means that the rule processor is unable to function as designed.

Unsuccessful transaction

.x01

For certain rules, a specific minimum number of historical transactions are required for the rule processor to produce an effective result. This exit condition will be reported if the minimum number of historical records cannot be retrieved in the rule processor.

Insufficient transaction history

At least 50 historical transactions are required

.x02

Currently unused.

.x03

The statistical analyses employed in some rule processors evaluate trends in behavior over a number of transactions over a period of time. While the trend itself can be categorized and reported by the regular rule results, some results are not part of an automatable scaled result. This exception provides an outcome when the historical period does not show a clear trend, but the most recent period shows an upturn.

No variance in transaction history and the volume of recent incoming transactions shows an increase

.x04

Similar to .x03, but this exception provides an outcome when the historical period does not show a clear trend, but the most recent period shows a downturn.

No variance in transaction history and the volume of recent incoming transactions is less than or equal to the historical average

Example of the exitConditions object:

Code Block
  "config": {
    "exitConditions": [
      {
        "subRuleRef": ".x00",
        "outcome": false,
        "reason": "Unsuccessful transaction"
      },
      {
        "subRuleRef": ".x01",
        "outcome": false,
        "reason": "Insufficient transaction history"
      }
    ]
  }

...

Attribute

Description

value

This attribute defines the specific value that will be matched in the rule processor (=).

Every case contains a value, with the exception of the default “else” case.

Values can be either strings, encapsulated in quotes, or numbers, without quotes.

subRuleRef

Every rule processor is capable of reporting a number of different outcomes, but only a single outcome from the complete set is ultimately delivered to the typology processor. Each outcome is defined by a unique sub-rule reference identifier to differentiate the delivered outcome from the others and also to allow the typology processor to apply a unique weighting to that specific outcome.

We have elected to assign a numeric sequence to the sub-rule references for result bands and cases, prefaced with a dot (“.”) separator, but this format is not mandatory for the sub-rule reference string. Any descriptive and unique string would be an acceptable sub-rule reference.

By convention, the default “else” outcome has a sub-rule reference of .00.

outcome

The configuration file defines whether the result delivered by the rule processor is flagged as either true or false. The flag is somewhat arbitrary, but by convention we choose to assign a true flag to deterministic results that will have a weighting impact on the typology score and we assign a false flag to non-deterministic results that will not have a weighting impact on the typology score.

reason9

The reason provides a human-readable description of the result that accompanies the rule result to the eventual over-all evaluation result.
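
A hedged sketch of a cases object built from the attributes above, assuming the array is named cases (parallel to the result bands); the values, sub-rule references, outcomes and reasons are illustrative only:

Code Block
  "config": {
    "cases": [
      {
        "value": "MOBILE",
        "subRuleRef": ".01",
        "outcome": true,
        "reason": "The transaction was initiated from a mobile channel"
      },
      {
        "value": "BRANCH",
        "subRuleRef": ".02",
        "outcome": true,
        "reason": "The transaction was initiated from a branch"
      },
      {
        "subRuleRef": ".00",
        "outcome": true,
        "reason": "None of the target values matched"
      }
    ]
  }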

...

  • id identifies the specific typology processor, and its version, that will use the configuration. There will typically only be a single typology processor active in the platform at a time, but it is possible and conceivable that multiple typology processors and/or versions can co-exist simultaneously. It is recommended that the typology processor “name” is drawn from the source-code repository where the typology processor code resides, and that the version matches the semantic version of the source code as defined in the source-code repository.

  • cfg is the unique version of the typology configuration. Though unlikely, multiple different versions of a typology configuration can co-exist simultaneously in the platform. The cfg value consists of two parts: an arbitrary identifier for the typology to differentiate one typology from another, and then, separated by an @, a semantic version that defines the specific version of the configuration for that typology, for example typology-001@1.0.0.

  • desc offers a readable description of the typology.

The combination of the id and cfg strings forms a unique identifier for every typology configuration and is sometimes compiled into a database key, though this is not essential: the database enforces the uniqueness of any configuration to make sure that a specific version of a configuration can never be over-written.


Info

Why does the typology configuration cfg look different from the rule configuration cfg?

A rule processor (defined by its id) is closely paired with its configuration (defined by the cfg): the configuration works for that rule processor and no other, and the rule processor won't work with another rule processor's configuration.

A typology processor is a generic “engine” processor. It is not paired with a specific typology the way a rule processor is - it is intended to work for multiple, if not all, typologies. The typology configuration therefore needs another way to reference the specific typology that will be scored by the typology processor. For that reason, the cfg attribute is subdivided in the same way as the id, into a name part and a version part. And remember that we can have multiple parallel typology processors if we need them, so the id describes the specific typology processor and its version (for routing purposes), and the cfg describes the specific typology and the version of its configuration.

Example of the typology configuration metadata:

Code Block
{
  "id": "typology-processor@1.0.0",
  "cfg": "typology-001@1.0.0",
  "desc": "Use of several currencies, structured transactions, etc",
  ...
}

The Rules object

The rules object is an array that contains an element for every possible outcome for each of the rule results that can be received from the rule processors in scope for the typology.

...

Every. Possible. Outcome.

All the possible outcomes from the rule processors are encapsulated in each rule’s configuration, with the exception of the .err outcome that is not listed in the rule configuration because the conditions and descriptions are built into the rule processor itself. When composing the typology configuration, the user must remember to include the .err outcome, but the rest of the rule results (exit conditions and banded/cased results) can be directly reconciled with the elements in the rules object.

Each rule result element in the rules array contains the same attributes:

Attribute

Description

id

The rule processor that was used to determine the rule result is uniquely identified by this identifier attribute.

cfg

The configuration version attribute specifies the unique version of the rule configuration that was used by the processor to determine this result.

ref

Every rule processor is capable of reporting a number of different outcomes, but only a single outcome from the complete set is ultimately delivered to the typology processor. Each unique outcome is defined by a unique sub-rule reference identifier to differentiate the delivered outcome from the others

The unique combination of id, cfg and ref attributes references a unique outcome from each rule processor and allows the typology processor to apply a unique weighting to that specific outcome.

true

The outcome of the rule result will be either true or false, depending on whether the configurer expected the result to be deterministic or not. If the outcome is true, the rule result will be assigned the weighting associated with the true attribute in the configuration. By convention, deterministic (true) outcomes are assigned a positive number as a weighting.

false

The outcome of the rule result will be either true or false, depending on whether the configurer expected the result to be deterministic or not. If the outcome is false, the rule result will be assigned the weighting associated with the false attribute in the configuration. By convention, non-deterministic (false) outcomes are usually assigned a weighting of 0 (zero).

Info

What does “every possible outcome” mean?

A rule processor must always produce a result, and only ever a single result from a number of possible results. The rule result will always fall into one of the following categories: error, exit or band/case. Results across all the categories are mutually exclusive and there can be only one result regardless of the category. Results are uniquely identified via the subRuleRef attribute:

  • ".err" is reserved for the error condition, of which there will only ever be one;

  • exit conditions are prefaced with an ".x" and there may be many;

  • bands/cases are typically sequentially numbered (with ".00" reserved for the default “else” case in cased results) and there will always be at least two.

The rule processor must produce one of these results (identified by the result’s subRuleRef) and, when it does, the typology processor must be configured via a typology configuration to “catch” that specific subRuleRef. If the rule processor produces a result that the typology processor can't process, the typology processor won't be able to complete the evaluation of that specific typology, or of the channel that contains the typology, or of the transaction that contains the channel: the evaluation will "hang". For this reason alone the exit conditions must be represented in the typology configuration and interpreted in the typology processor, even if the interpretation is non-deterministic (false, with a zero weighting); a few exit conditions do, however, also have deterministic results that carry a weighting.

Because the rules object contains every possible rule result outcome from each of the rule processors allocated to the typology, the typology configuration can become quite verbose, but here’s a short example of a rules object for a typology that contains two rules:
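
The block below is an illustrative sketch rather than a real configuration: the rule identifiers, sub-rule references and weightings are assumptions, but the attribute shape follows the table above. Only a handful of outcomes are shown per rule; a real rules object would list every outcome, including .err and every exit condition:

Code Block
  "rules": [
    { "id": "rule-001@1.0.0", "cfg": "1.0.0", "ref": ".err", "true": 0, "false": 0 },
    { "id": "rule-001@1.0.0", "cfg": "1.0.0", "ref": ".x00", "true": 0, "false": 0 },
    { "id": "rule-001@1.0.0", "cfg": "1.0.0", "ref": ".01", "true": 100, "false": 0 },
    { "id": "rule-001@1.0.0", "cfg": "1.0.0", "ref": ".02", "true": 200, "false": 0 },
    { "id": "rule-002@1.0.0", "cfg": "1.0.0", "ref": ".err", "true": 0, "false": 0 },
    { "id": "rule-002@1.0.0", "cfg": "1.0.0", "ref": ".x01", "true": 0, "false": 0 },
    { "id": "rule-002@1.0.0", "cfg": "1.0.0", "ref": ".01", "true": 100, "false": 0 }
  ],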

...

The messages object is an array that contains information about the transactions that the platform is expected to evaluate. Each element in the messages object contains the following attributes11:

  • id is the unique identifier for the Transaction Aggregation and Decisioning Processor (TADProc) that will be used to ultimately conclude the evaluation of a specific transaction. It is possible for a transaction to be routed to a unique TADProc that contains specialized functionality related to summarizing the transaction’s results10.

  • cfg is the unique version of the deployed TADProc that will be used to conclude the evaluation of the transaction.

  • host defines the NATS subscription queue for the TADProc where the results of the previous processor in the evaluation flow, the typology processor, will be published11.

  • txTp defines the transaction type for which the message element is intended. The txTp value here must match a corresponding TxTp attribute in the root of the incoming message. If no matching txTp attribute is found in the network map, the transaction will not be routed for evaluation and will simply be ignored by the CRSP.

  • channels defines the next layer of evaluation destinations along the route laid out by the network map for the evaluation.

Code Block
  "messages": [
    {
      "id": "004@1.0.0",
      "cfg": "1.0.0",
      "host": "TADP",
      "txTp": "pacs.002.001.12",
      "channels": [

...

  • id is the unique identifier for the rule processor and version that will be invoked to evaluate the transaction.

  • cfg defines the unique rule configuration version that will guide the execution of the rule processor.

  • host defines the NATS queue that the rule processor will subscribe to so that it can receive the transaction from the CRSP for evaluation11. The NATS publishing destination for the rule processor is presently defined as an environment variable in the processor.

Code Block
              "rules": [
                {
                  "id": "002@1.0.0",
                  "host": "RuleRequest002",
                  "cfg": "1.0.0"
                },

Complete network map example

...

Configuration documents in Tazama are strictly structured JSON documents. Each document contains an identifier related to the specific processor and version of that processor to which the configuration is to be applied. For example, the configuration for a rule processor would have the following attribute and value in the typology configuration:
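
For instance, reusing the rule processor identifier from the earlier example (a sketch, not a prescribed value):

Code Block
  "id": "rule-001@1.0.0",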

...

The configuration version attribute defines the specific version of the configuration file when it is used by a processor.

Tazama employs semantic versioning3 for both processor source control and configuration documents:

...

Collection name

Processor Type

configuration

Rule4

typologyExpression

Typologies5

transactionConfiguration

Transaction Aggregation and Decisioning6

...

Beyond this constraint imposed by the database, configuration versions are expected to be managed outside the platform. Tazama does not currently offer a native user interface for configuration management, though Sybrin, one of the FRMS Centre of Excellence’s System Integrator partners, has created a user interface that allows for the creation of configuration documents as well as the automated management of configuration versions between iterations of a configuration document.

...

The network map defines the routing of an incoming transaction to all rules and typologies that are required to evaluate the transaction7. By default, the platform is configured to evaluate a pacs.002 transaction that concludes a transaction initiated from a pain.001 or pacs.008 message with a status response.

...

  1. In its current configuration, the platform only evaluates the pacs.002 as the trigger payload; the rule processors and typologies have only been defined with the final status of a payment transaction in mind.

  2. The typology processor is not currently configured to interdict the transaction when the threshold is breached; only investigations are commissioned once the evaluation of all the typologies is complete.

  3. https://semver.org/

  4. https://frmscoe.atlassian.net/wiki/spaces/FRMS/pages/6586489/Rule+Processor+Overview#4.1.-Read-rule-config

  5. https://frmscoe.atlassian.net/wiki/spaces/FRMS/pages/1740494/Typology+Processing#5.5.-Read-typology-configuration

  6. https://frmscoe.atlassian.net/wiki/spaces/FRMS/pages/6259944/Transaction+Aggregation+and+Decisioning+Processor+TADProc#7.4.2.-Read-transaction-configuration

  7. https://frmscoe.atlassian.net/wiki/spaces/FRMS/pages/6520927/Channel+Router+and+Setup+Processor+CRSP#3.1.-Read-Network-Map

  8. An explicit version reference has been planned for development to make it easier for an operator to link an evaluation result to the specific originating network map.

  9. We have found during our performance testing that the text-based descriptions in our processor results undermine the performance gains we achieved with our ProtoBuff implementation. We will be removing the unabridged reason and processor descriptions from the configuration documents in favor of shorter look-up codes that will then also be used to introduce regionalized/language-specific descriptions.

  10. In its default deployment, the platform contains a single version of the “core” platform processors (the typology processor and TADProc) at a time. Though it is possible to deploy and maintain multiple parallel versions of these processors and manage routing to these processors through the network map, this guide will only focus on singular core processors for now.

  11. Before our implementation of NATS, Tazama processors were implemented as RESTful micro-services. The host attributes in the network map contained the URL where the processors could be addressed. With our initial implementation of NATS, the routing information was moved into environment variables that were read into the processors when they were deployed, or restarted in the event of a processor failure. We have now removed the need to specify the host property for a processor - the routing is automatically determined from the network map at processor startup - see https://github.com/frmscoe/General-Issues/issues/310 for details.