One of the leading banks in the UAE uses SAS, a post facto platform, to handle Anti-Money Laundering (AML). Alerts are currently rule-based: transactions made by a customer can be flagged as suspicious based on various rules, scenarios and parameters.
The flagged alerts are manually checked through a maker-checker analysis for validity before the ticket is closed with appropriate comments.
The compliance team of the Bank decides the Rule-scenarios. The parameters of the scenarios are periodically reviewed and updated by the compliance team.
For example, SAS maintains a watch list of countries; if there is any transaction from a country on this list, an alert is generated. If the compliance team later wants to add a country to the watch list or remove one from it, it can do so by altering the scenario parameters.
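The watch-list behaviour described above can be sketched in a few lines. This is only an illustration, not SAS configuration: the class `WatchListScenario`, the field name `origin_country` and the two-letter codes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WatchListScenario:
    """Hypothetical scenario whose only parameter is a set of country codes."""

    watch_list: set = field(default_factory=set)

    def add_country(self, code: str) -> None:
        self.watch_list.add(code.upper())

    def remove_country(self, code: str) -> None:
        # discard() does nothing if the country is not on the list
        self.watch_list.discard(code.upper())

    def should_alert(self, transaction: dict) -> bool:
        # "origin_country" is an assumed field name, not a SAS column
        return transaction["origin_country"].upper() in self.watch_list


# Placeholder country codes, used only for illustration
scenario = WatchListScenario({"AA", "BB"})
scenario.add_country("cc")     # compliance team adds a country
scenario.remove_country("AA")  # and removes another one
```

Changing the parameter set changes which transactions raise an alert, without touching the rule logic itself.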
Currently SAS generates numerous alerts that flag transactions as suspicious (potentially illegal); however, after review, more than 90% of the alerts are found to be normal (legal), that is, false positives. As a result, the organization incurs high operational cost.
Our proof of concept uses Artificial Intelligence to automate the work of the operations staff (makers) who manually check the alerts.
Comments and Observations
ALERT_ID – primary key
Important columns were TRANSACTION_ID, ALERT_ID, AMOUNT
Important columns were ALERT_ID, EVENT_DESCRIPTION
The event description tells whether the alert is a FALSE_POSITIVE or not. To find the TRUE POSITIVE labels, we need to go manually through the comments in FSK_COMMENT, which holds the ALERT_IDs and the corresponding descriptions for the alerts.
Total of 1024 scenarios
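The labeling step noted above (reading the event description first, then falling back to FSK_COMMENT) could look roughly like the sketch below. The exact wording stored in EVENT_DESCRIPTION and the comment column name `COMMENT_TEXT` are assumptions; only ALERT_ID, EVENT_DESCRIPTION and FSK_COMMENT come from the observations.

```python
from collections import defaultdict


def split_alerts_for_labeling(events, comments):
    """Return (false-positive alert ids, comments queued for manual review)."""
    false_positive_ids = set()
    for event in events:
        # Assumed format: the description mentions "false positive" in some casing
        description = event["EVENT_DESCRIPTION"].upper().replace(" ", "_")
        if "FALSE_POSITIVE" in description:
            false_positive_ids.add(event["ALERT_ID"])

    # Everything not marked as a false positive still needs a human to read
    # its comments before it can be labeled a true positive
    review_queue = defaultdict(list)
    for comment in comments:
        if comment["ALERT_ID"] not in false_positive_ids:
            review_queue[comment["ALERT_ID"]].append(comment["COMMENT_TEXT"])
    return false_positive_ids, dict(review_queue)


events = [
    {"ALERT_ID": 1, "EVENT_DESCRIPTION": "Closed as false positive"},
    {"ALERT_ID": 2, "EVENT_DESCRIPTION": "Escalated"},
]
comments = [
    {"ALERT_ID": 1, "COMMENT_TEXT": "salary credit"},
    {"ALERT_ID": 2, "COMMENT_TEXT": "reported to compliance"},
]
fp_ids, queue = split_alerts_for_labeling(events, comments)
```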
Using this approach, scenarios would be classified into two groups:
High risk scenario group: contains the scenarios that contribute to False Positives.
Low risk scenario group: contains the scenarios that contribute to True Positives.
A risk scoring function is employed to give a risk score to each of the rules. We will restructure the rules or scenarios in a hierarchical order by assigning each a priority score obtained from this risk scoring function. When an alert is generated, it will be checked for its risk and can be auto-closed according to its risk group.
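The risk scoring function is not defined in this write-up. As a minimal sketch, assume the score of a scenario is the fraction of its historical alerts that turned out to be True Positives; the function names and the 0.1 cut-off below are hypothetical.

```python
from collections import Counter


def scenario_scores(history):
    """history: iterable of (scenario_id, label) pairs, label is 'TP' or 'FP'."""
    totals, true_positives = Counter(), Counter()
    for scenario, label in history:
        totals[scenario] += 1
        if label == "TP":
            true_positives[scenario] += 1
    # Assumed score: share of past alerts of this scenario that were TP
    return {s: true_positives[s] / totals[s] for s in totals}


def prioritize(scores):
    """Order scenarios from the highest to the lowest score."""
    return sorted(scores, key=lambda s: scores[s], reverse=True)


def mostly_false_positive(scores, cutoff=0.1):
    """Scenarios whose alerts are candidates for auto-closing (assumed cutoff)."""
    return {s for s, value in scores.items() if value <= cutoff}


history = [("S1", "TP"), ("S1", "FP"), ("S2", "FP"), ("S2", "FP")]
scores = scenario_scores(history)
```

With such a score, the hierarchical ordering is a sort, and the grouping is a comparison against a cut-off that the compliance team would have to choose.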
Using this approach, all the scenarios would be clubbed for a given transaction. In addition, transaction information would be added, and the model would then predict whether a scenario is a True Positive (TP) or a False Positive (FP). To achieve this, we will collect as many transactions as possible and club the scenarios alerted for those transactions. We will also consider the scenario information (TP or FP) for the respective transactions from previous records. By grouping all of this together, we will use Machine Learning to find the patterns in the data that make a scenario FP or TP and store them in a model. Once the model is ready, for any new transaction that is alerted with different scenarios, the model would be able to predict the nature of the scenarios.
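Clubbing the scenarios per transaction can be sketched as a group-by followed by a multi-hot encoding. The column name `SCENARIO_ID` and the helper names are assumptions for illustration; the numeric transaction fields stand in for whatever transaction information is available.

```python
from collections import defaultdict


def club_scenarios(alert_rows):
    """Group the alerted scenario ids by transaction."""
    clubbed = defaultdict(set)
    for row in alert_rows:
        clubbed[row["TRANSACTION_ID"]].add(row["SCENARIO_ID"])
    return dict(clubbed)


def to_feature_row(scenarios, vocabulary, transaction_info):
    """One 0/1 flag per known scenario, followed by numeric transaction fields."""
    multi_hot = [1 if s in scenarios else 0 for s in vocabulary]
    return multi_hot + list(transaction_info)


rows = [
    {"TRANSACTION_ID": "T1", "SCENARIO_ID": "S1"},
    {"TRANSACTION_ID": "T1", "SCENARIO_ID": "S3"},
    {"TRANSACTION_ID": "T2", "SCENARIO_ID": "S2"},
]
clubbed = club_scenarios(rows)
features = to_feature_row(clubbed["T1"], ["S1", "S2", "S3"], [2500.0])
```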
This is an advanced approach. Here we will take the most frequently repeating scenarios among the TPs as well as among the FPs.
Assuming a top 50 in each case out of 1024, the training data should contain the transaction data of customers plus customer information, and the labels indicate whether a scenario is TP or FP. The intention is to be able to perform multi-label classification.
Each data point should have the features as follows:
Customer information such as age, gender, geography, high-risk or low-risk customer, amount transferred, account type, transfer mode, beneficiary account details, etc.
Scenarios for alerts as labels, each marked TP or FP.
Likewise, we will have all transactions made by customers and the scenario details for each transaction. We will predict how likely each scenario is to be TP or FP based on customer information and transaction details.
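The selection of the most frequent scenarios and the per-transaction label vector described above might be built as follows. This is a sketch under assumptions: the helper names are hypothetical, and marking a scenario that did not fire with `None` is one possible encoding, not something the approach prescribes.

```python
from collections import Counter


def top_scenarios(history, label, k):
    """The k scenarios that occur most often with the given label."""
    counts = Counter(s for s, outcome in history if outcome == label)
    return [s for s, _ in counts.most_common(k)]


def label_vector(alert_outcomes, tracked):
    """1 = TP, 0 = FP, None = scenario not alerted for this transaction."""
    return [
        None if s not in alert_outcomes else int(alert_outcomes[s] == "TP")
        for s in tracked
    ]


history = [
    ("S1", "TP"), ("S1", "TP"), ("S2", "TP"),
    ("S3", "FP"), ("S3", "FP"), ("S2", "FP"),
]
# k=1 here for brevity; the text assumes k=50 for each label
tracked = list(dict.fromkeys(
    top_scenarios(history, "TP", 1) + top_scenarios(history, "FP", 1)
))
labels = label_vector({"S1": "TP", "S3": "FP"}, tracked)
```

Each training row would then pair the customer and transaction features with such a label vector, which is the shape a multi-label classifier expects.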
After analysis of the collected data, each alert was passed through an NLP model built to analyze text. In addition, the True Positive labels were identified with negligible manual effort.
The data had 52,768 alerts for False Positives.
The approach was to map each alert with the transaction, customer id, event and scenarios.
Later, the SAS people informed MQI that they maintain another table called FSK_TRANSACTION_ALERT with Transaction details.
Since our approach mainly revolves around customer information and scenarios, data pertaining to customer information would be required to proceed further.
We recommend mapping transactions with TRANSACTION_ID and other customer information along with CUSTOMER_ID for the results to be more effective.
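The recommended mapping amounts to two key lookups: alert to transaction via TRANSACTION_ID, then transaction to customer via CUSTOMER_ID. The sketch below uses plain dictionaries; the assumption that each transaction row carries a CUSTOMER_ID, and the sample fields `RISK_BAND` and `AMOUNT` values, are illustrative only.

```python
def map_alerts(alert_rows, transactions, customers):
    """Attach transaction and customer fields to each alert row."""
    txn_by_id = {t["TRANSACTION_ID"]: t for t in transactions}
    customer_by_id = {c["CUSTOMER_ID"]: c for c in customers}

    enriched = []
    for row in alert_rows:
        txn = txn_by_id.get(row["TRANSACTION_ID"])
        if txn is None:
            continue  # alert without a matching transaction is skipped
        # Assumption: the transaction row carries the CUSTOMER_ID
        customer = customer_by_id.get(txn.get("CUSTOMER_ID"), {})
        enriched.append({**row, **txn, **customer})
    return enriched


alerts = [{"ALERT_ID": 1, "TRANSACTION_ID": "T1"},
          {"ALERT_ID": 2, "TRANSACTION_ID": "T9"}]
transactions = [{"TRANSACTION_ID": "T1", "CUSTOMER_ID": "C1", "AMOUNT": 900.0}]
customers = [{"CUSTOMER_ID": "C1", "RISK_BAND": "low"}]
mapped = map_alerts(alerts, transactions, customers)
```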
We have done predictive modeling of whether alerts are TP or FP, using the risk scores mentioned below (generated by SAS):
This was done at the alert level, modeled as a binary classification problem with TP as one class and FP as the other.
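The model used for this step is not named here. As a stand-in, the simplest possible binary classifier on a single risk score is a learned cut-off (a decision stump); the sketch below is only that, with made-up scores, and does not represent the model that was actually trained.

```python
def fit_threshold(scores, labels):
    """Pick the cut-off with the best training accuracy; score >= cut-off means TP."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(scores)):
        correct = sum(
            (score >= candidate) == (label == "TP")
            for score, label in zip(scores, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


def predict(score, threshold):
    return "TP" if score >= threshold else "FP"


# Made-up risk scores for four historical alerts
train_scores = [10, 20, 80, 90]
train_labels = ["FP", "FP", "TP", "TP"]
threshold = fit_threshold(train_scores, train_labels)
```

A real model would use several risk scores and be evaluated on held-out alerts rather than on training accuracy.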
The results obtained from alert-level modelling were promising and indicated that the approach was sound.
As the next step, we applied these models to actual, live customer information and data.
Further analysis was done on the transaction details and customer-specific data, and an enhanced outcome for the suspicious transactions was derived.
Auditing of rules happens once a year. It takes more than two months of effort by the entire team to audit and analyze the rules. Based on the audit, new rules are added, or existing rules are changed or removed.
The recommendation module would analyze the performance of the AML engine and generate reports.
It would have two components:
1. Auto audit module
2. Rules validation and recommendation module