Reduction of False Positives in SAS AML Data

Our proof of concept uses Artificial Intelligence to automate the work of the operations staff (makers) who manually check the alerts.


One of the leading banks in the UAE uses SAS, a post facto platform, to handle Anti-Money Laundering (AML). Alerts are currently rule-based: transactions made by a customer can be flagged as suspicious based on various rules, scenarios and parameters.

Additional Challenge

The flagged alerts are manually checked for validity through maker-checker analysis before the ticket is closed with appropriate comments.

The compliance team of the Bank decides the Rule-scenarios. The parameters of the scenarios are periodically reviewed and updated by the compliance team.

For example, SAS maintains a watch list of countries; if there is any transaction from a country on this list, an alert is generated. If the compliance team later wants to add or delete a country in the watch list, it can do so by altering the scenario parameters.
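A minimal sketch of how such a parameterized watch-list scenario might behave. This is an illustration only: the class `WatchListScenario`, the field `origin_country` and the country codes are hypothetical placeholders, not SAS's actual configuration.

```python
from dataclasses import dataclass, field


@dataclass
class WatchListScenario:
    """Hypothetical rule: alert when a transaction comes from a watched country."""
    name: str
    watch_list: set = field(default_factory=set)  # the scenario parameter

    def add_country(self, code: str) -> None:
        self.watch_list.add(code.upper())

    def remove_country(self, code: str) -> None:
        self.watch_list.discard(code.upper())

    def triggers(self, transaction: dict) -> bool:
        # An alert is raised if the origin country is on the watch list.
        return transaction["origin_country"].upper() in self.watch_list


scenario = WatchListScenario("country_watch_list", {"AA", "BB"})  # placeholder codes
txn = {"transaction_id": "T1", "origin_country": "cc", "amount": 5000.0}

before = scenario.triggers(txn)  # "CC" is not yet on the list
scenario.add_country("cc")       # compliance team updates the parameter
after = scenario.triggers(txn)   # the same transaction now raises an alert
```

The rule logic stays fixed; only the parameter changes, which is why a review by the compliance team does not require a new scenario.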

Currently SAS generates numerous alerts that flag transactions as suspicious (potentially illegal); however, on review more than 90% of the alerts are found to be normal (legal), i.e. false positives. As a result, the organization incurs high operational costs.

Table Name

Comments and Observation


ALERT_ID’s – primary key
Other columns were DATE_KEY, TIME_KEY etc.


Important columns were TRANSACTION_ID, ALERT_ID, AMOUNT


Important columns were ALERT_ID, EVENT_DESCRIPTION

The event description tells whether the alert is FALSE_POSITIVE or not. To find the true positive labels, we need to manually go through the comments in FSK_COMMENT, which holds the ALERT_IDs and the corresponding descriptions of the alerts.
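The manual pass over the comments could be pre-sorted with a simple keyword matcher before a maker looks at the remainder. The sketch below is an assumption, not the method used in this work: the phrase lists are hypothetical and would have to be derived from the real comment text.

```python
import re

# Hypothetical phrases; real maker comments would need to be reviewed first.
FP_PATTERNS = [r"\bfalse positive\b", r"\bno suspicious\b", r"\bgenuine\b"]
TP_PATTERNS = [r"\bsuspicious activity confirmed\b", r"\bescalated\b", r"\bstr filed\b"]


def label_comment(text: str) -> str:
    """Return 'TP', 'FP' or 'UNKNOWN' for one analyst comment."""
    lowered = text.lower()
    if any(re.search(p, lowered) for p in TP_PATTERNS):
        return "TP"
    if any(re.search(p, lowered) for p in FP_PATTERNS):
        return "FP"
    return "UNKNOWN"  # left for manual review


def label_alerts(comments):
    """comments: iterable of (alert_id, text). Returns {alert_id: label}.

    An alert with several comments is TP if any of its comments says TP."""
    labels = {}
    for alert_id, text in comments:
        new = label_comment(text)
        old = labels.get(alert_id, "UNKNOWN")
        if old == "TP" or new == "UNKNOWN":
            labels[alert_id] = old
        else:
            labels[alert_id] = new
    return labels
```

Alerts that end up as `UNKNOWN` still need manual review, so this only reduces the reading effort rather than replacing it.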


Total of 1024 scenarios
SAS generates alerts based on these scenarios
These scenarios are subject to change over time


Classification of Scenarios and Hierarchical segmentation

Using this approach, scenarios are classified into two groups:
High-risk scenario group: contains the scenarios that contribute to False Positives.
Low-risk scenario group: contains the scenarios that contribute to True Positives.
A risk scoring function assigns a risk score to each rule. The rules or scenarios are then restructured in hierarchical order by the priority score obtained from this risk scoring function. When an alert is generated, it is checked for its risk and can be auto-closed according to its risk group.
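The risk scoring function itself is not specified here. As a stand-in, the sketch below assumes the score is simply the historical share of false positives per scenario, and that a scenario's alerts may be auto-closed only when that share is very high. The function names and the 0.95 threshold are hypothetical.

```python
from collections import defaultdict


def scenario_fp_share(history):
    """history: iterable of (scenario_id, label) with label 'TP' or 'FP'.

    Returns {scenario_id: share of past alerts that were false positives}."""
    counts = defaultdict(lambda: {"TP": 0, "FP": 0})
    for scenario_id, label in history:
        counts[scenario_id][label] += 1
    return {s: c["FP"] / (c["TP"] + c["FP"]) for s, c in counts.items()}


def rank_scenarios(scores):
    """Order scenarios so that those most often truly suspicious come first."""
    return sorted(scores, key=lambda s: scores[s])


def can_auto_close(scenario_id, scores, threshold=0.95):
    """Hypothetical policy: auto-close only if the scenario was almost always FP.

    Scenarios without history are never auto-closed."""
    return scores.get(scenario_id, 0.0) >= threshold
```

Treating unseen scenarios as not auto-closable is a deliberately conservative choice, since scenarios change over time.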

Transaction level scenario predictive modeling

Using this approach, all the scenarios are clubbed together for a given transaction. Transaction information is then added, and the model predicts whether a scenario is True Positive (TP) or False Positive (FP). To achieve this, we will collect as many transactions as possible and club the scenarios alerted for those transactions. We will also consider the scenario information (TP or FP) for the respective transactions from previous records. By grouping all of this together, we will use Machine Learning to find the patterns in the data that make a scenario FP or TP and store them in a model. Once the model is ready, it will be able to predict the nature of the scenarios for any new transaction that is alerted with different scenarios.
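The "clubbing" step can be sketched as building one feature row per transaction, with one 0/1 column per scenario plus the transaction attributes. This is an assumed encoding, not necessarily the one used in this work; the function name and the use of `amount` as the only transaction attribute are illustrative.

```python
from collections import defaultdict


def build_transaction_rows(alerts, transactions, scenario_ids):
    """Club all scenarios alerted for a transaction into one feature row.

    alerts: iterable of (transaction_id, scenario_id)
    transactions: {transaction_id: {"amount": float, ...}}
    scenario_ids: ordered list of known scenarios (fixes the column order)
    """
    fired = defaultdict(set)
    for txn_id, scenario_id in alerts:
        fired[txn_id].add(scenario_id)

    rows = {}
    for txn_id, scenarios in fired.items():
        # 1 if the scenario alerted on this transaction, else 0.
        multi_hot = [1 if s in scenarios else 0 for s in scenario_ids]
        amount = transactions[txn_id]["amount"]
        rows[txn_id] = multi_hot + [amount]
    return rows
```

Rows built this way, paired with the historical TP or FP outcome, can be passed to any standard classifier.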

Customer level scenario predictive modeling

This is an advanced approach in which we take the most frequently repeating scenarios for both TP and FP. Assuming the top 50 in each case out of 1024, the training data should contain customers' transaction data plus customer information, with labels indicating whether each scenario is TP or FP. The intention is to perform multi-label classification. Each data point should have the following features:
Transaction ID
Customer information such as age, gender, geography, high-risk or low-risk customer, amount transferred, account type, transfer mode, beneficiary account details, etc.
Scenarios for alerts as labels, whether TP or FP.
In this way we will have all transactions made by customers and the scenario details for each transaction. We will predict how likely each scenario is to be TP or FP based on customer information and transaction details.
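A data point of this shape, and the label matrix a multi-label classifier would be trained on, could be represented as follows. The class `DataPoint`, its field names and the 1/0/None encoding are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataPoint:
    transaction_id: str
    customer: Dict[str, object]  # e.g. age, geography, risk category
    labels: Dict[str, str]       # scenario_id -> "TP" or "FP"


def label_matrix(points: List[DataPoint],
                 scenario_ids: List[str]) -> List[List[Optional[int]]]:
    """One row per transaction, one column per scenario.

    1 = TP, 0 = FP, None = the scenario did not alert on that transaction."""
    encode = {"TP": 1, "FP": 0}
    return [[encode.get(p.labels.get(s)) for s in scenario_ids] for p in points]
```

Keeping "did not alert" distinct from FP matters here, because an absent scenario carries no maker decision and should not be treated as a negative label.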

Constraints with the current data

After analysis of the collected data, each alert was passed through an NLP model built to analyze text. In addition, the True Positive labels were identified with negligible manual effort.
The data had 52,768 alerts for False Positives.
The approach was to map each alert with the transaction, customer id, event and

Data Problem

Later, the SAS team informed MQI that they maintain another table, FSK_TRANSACTION_ALERT, with transaction details.
Since our approach mainly revolves around customer information and scenarios, data pertaining to customer information would be required to proceed further.
We recommend mapping transactions with TRANSACTION_ID and other customer information along with CUSTOMER_ID for the results to be more effective.
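The recommended mapping amounts to a two-step join: alert to transaction on TRANSACTION_ID, then transaction to customer on CUSTOMER_ID. A minimal sketch, assuming each table has been loaded into plain dictionaries; the customer attribute names used in any real run are not known from this write-up.

```python
def map_alerts(alerts, transactions, customers):
    """Join alerts to transactions (TRANSACTION_ID) and customers (CUSTOMER_ID).

    alerts: list of dicts with ALERT_ID and TRANSACTION_ID
    transactions: {TRANSACTION_ID: dict containing CUSTOMER_ID and AMOUNT}
    customers: {CUSTOMER_ID: dict of customer attributes}
    Returns (joined_rows, unmatched_alert_ids).
    """
    joined, unmatched = [], []
    for alert in alerts:
        txn = transactions.get(alert["TRANSACTION_ID"])
        cust = customers.get(txn["CUSTOMER_ID"]) if txn else None
        if txn is None or cust is None:
            # Keep track of alerts that cannot be mapped instead of dropping them.
            unmatched.append(alert["ALERT_ID"])
            continue
        joined.append({**alert, **txn, **cust})
    return joined, unmatched
```

Returning the unmatched alert IDs makes gaps between files visible early, before any model is trained on the joined rows.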

*There are no common alerts between the above two files

Work Done with Available Data

We built a predictive model that classifies alerts as TP or FP using the risk scores listed below (generated by SAS):

This is done at the alert level, modeled as a binary classification problem with TP as one class and FP as the other.
With the current data and the developed model we are able to close 97,432 alerts within 15 minutes.
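The model type used for this step is not stated. As an illustration of alert-level binary classification on risk scores, the sketch below swaps in a plain logistic regression trained by gradient descent. The two-score feature layout and all values are synthetic assumptions, not SAS output.

```python
import math


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent.

    X: list of feature lists (risk scores scaled to 0..1), y: 1 = TP, 0 = FP."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_tp_probability(w, b, x):
    """Probability that an alert with risk scores x is a true positive."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic training alerts: two scaled risk scores per alert.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.85, 0.95]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
```

An alert would then be a candidate for automatic closure only when its predicted TP probability falls below a cut-off chosen with the compliance team.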

Current Status

The results obtained from alert-level modelling were promising and indicate that the approach is correct.
As the next step we moved these models onto actual, live customer information and data.
Further analysis was done on the transaction details and customer-specific data, and an enhanced outcome for the suspicious transactions was derived.

Future Work

Rules are audited once a year. An audit takes more than two months of effort from the entire team to analyze the rules. Based on the audit, new rules are added and existing rules are changed or removed.

The recommendation module would analyze the performance of the AML engine and generate reports.
It would have two components:
1. Auto audit module
2. Rules validation and recommendation module

Auto Audit Module:

The proposed module will audit the AML engine based on its learned weights. It will function as a feedback system and generate performance reports on the AML solution with the help of the NLP module.

Rules validation and recommendation Module

The proposed module will validate each rule and its parameters based on the reports generated by the audit module. It will assign priority scores to the rules and recommend which rules should be added, changed or removed.
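One possible shape for the recommendation step, assuming the audit report provides the number of alerts and confirmed true positives per rule. The thresholds and action wording are hypothetical and would need to be set by the compliance team.

```python
def recommend(report, min_alerts=50, low_precision=0.02, high_precision=0.5):
    """report: {rule_id: {"alerts": int, "true_positives": int}}.

    Returns {rule_id: (precision, action)} using hypothetical thresholds."""
    out = {}
    for rule_id, stats in report.items():
        alerts, tps = stats["alerts"], stats["true_positives"]
        if alerts < min_alerts:
            # Too few alerts to judge the rule either way.
            out[rule_id] = (None, "insufficient data")
            continue
        precision = tps / alerts
        if precision < low_precision:
            action = "review parameters or remove"
        elif precision >= high_precision:
            action = "keep, high priority"
        else:
            action = "keep"
        out[rule_id] = (precision, action)
    return out
```

The output is a recommendation only; the decision to change or remove a rule stays with the compliance team.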


Terra Edge Soft Private Limited 101,Hallmark,Plot 149, Sector 28,Vashi ,
Navi Mumbai,India- 400703

Phone: (+91) 9892700307

© 2022 Created with TERRA EDGESOFT. All rights reserved.