Fraud Prevention
Identify Social Scams in P2P Payments using Real-Time ML
Any ecosystem that supports the transfer of value between users, from marketplaces to online games, will inevitably attract scammers looking to defraud unsuspecting customers. A machine learning model, driven by real-time risk signals, can track the flow of money across touchpoints, and identify scams before the money is out the door.

Problem
Bad guys move money to hide
Fraudsters and social scammers hide their activity by quickly moving money through a network of mule accounts.
Solution
Add real-time link analysis
Compute real-time risk signals across deposits, transfers, and withdrawals to track the suspicious activity.
Result
Stop the fraud before the money is gone
An accurate risk model at cashout time instantly blocks bad withdrawals while giving a low-friction experience to good customers.
A custom fraud solution your data team can manage
Sumatra gives data scientists the infrastructure and development tools needed to build, deploy, and operate machine learning solutions that stop fraud and abuse in real time. Here are the steps for shipping a risk model for P2P social scams with Sumatra.
1. Define your risk signals as code
Sumatra provides a concise language, Scowl, for defining powerful, stateful features that are computed in real time. Here are a few examples of signals you can build to identify risky transactions:
Flag suspiciously-linked deposits
event cash_in

card_hashes_per_bin :=
  CountUnique(card_hash by card_bin, acct_id)

dollars_in_same_device_diff_account :=
  Sum(amount by device_id)
  - Sum(amount by device_id, acct_id)
On each cash_in transaction, we can compute risk factors such as the number of unique credit cards the account has attempted to use from the same card BIN. Similarly, we can see how much money the same device is depositing to other accounts.
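To make the semantics of these two signals concrete, here is a minimal in-memory sketch in Python. It is not how Sumatra evaluates Scowl. The class name `CashInSignals` and its `update` method are hypothetical, and it assumes the current event is included in each aggregate.

```python
from collections import defaultdict


class CashInSignals:
    """Toy in-memory model of the two cash_in signals (hypothetical helper)."""

    def __init__(self):
        # Unique card hashes seen per (card_bin, acct_id) pair.
        self._cards = defaultdict(set)
        # Running deposit totals per device and per (device, account) pair.
        self._by_device = defaultdict(float)
        self._by_device_acct = defaultdict(float)

    def update(self, acct_id, device_id, card_bin, card_hash, amount):
        """Fold one cash_in event into state and return its signals.

        Assumption: the current event counts toward its own aggregates.
        """
        self._cards[(card_bin, acct_id)].add(card_hash)
        self._by_device[device_id] += amount
        self._by_device_acct[(device_id, acct_id)] += amount
        return {
            "card_hashes_per_bin": len(self._cards[(card_bin, acct_id)]),
            # Money from this device that went to *other* accounts.
            "dollars_in_same_device_diff_account": (
                self._by_device[device_id]
                - self._by_device_acct[(device_id, acct_id)]
            ),
        }


# Made-up events: one device funds two accounts with three different cards.
signals = CashInSignals()
signals.update("acct_1", "dev_A", "411111", "hash_1", 50.0)
signals.update("acct_2", "dev_A", "411111", "hash_2", 20.0)
latest = signals.update("acct_1", "dev_A", "411111", "hash_3", 30.0)
```

After the third event, `latest` reflects the two cards acct_1 has tried on this BIN and the deposit that the same device made to acct_2.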
Identify transfers between strangers
event transfer

pair_receives :=
  Count(by sender, receiver as receiver, sender)

pair_sends :=
  Count(by sender, receiver exclusive) -- not including this one

is_friend := pair_sends + pair_receives > 0
On each P2P transfer, we can determine whether this sender and receiver have ever sent money to each other before. A transfer between accounts with no shared history is an important risk factor for social scams.
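The same check can be sketched in plain Python to show what the pair counts mean. This is only an illustration under stated assumptions: `PairHistory` and its methods are hypothetical names, and state lives in a single in-process counter rather than in Sumatra.

```python
from collections import Counter


class PairHistory:
    """Toy tracker of prior transfers between account pairs (hypothetical)."""

    def __init__(self):
        # (sender, receiver) -> number of transfers already recorded.
        self._sends = Counter()

    def is_friend(self, sender, receiver):
        """True if either party has sent money to the other before."""
        pair_sends = self._sends[(sender, receiver)]
        pair_receives = self._sends[(receiver, sender)]
        return pair_sends + pair_receives > 0

    def observe(self, sender, receiver):
        """Evaluate the transfer against prior history, then record it.

        Checking before recording mirrors the `exclusive` count: the
        current transfer never makes a pair look like friends.
        """
        friend = self.is_friend(sender, receiver)
        self._sends[(sender, receiver)] += 1
        return friend


history = PairHistory()
first_contact = history.observe("alice", "bob")   # no prior history
reply = history.observe("bob", "alice")           # bob received from alice earlier
new_stranger = history.observe("alice", "carol")  # a different pair
```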
Propagate risk forward to cash_out time
event cash_out

transfers_from_stranger :=
  Count<transfer>(by acct_id as receiver
                  where not is_friend last 30 days)

risky_deposits :=
  Sum<cash_in>(amount by acct_id
               where card_hashes_per_bin > 2)
Sumatra's native support for cross-event aggregates means that risk signals computed across the customer journey are made available at every decision point, allowing for risk assessments that are comprehensive and based on the freshest context.
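As a rough picture of what a cross-event aggregate does, the sketch below records state on transfer events and reads it back on cash_out events. It is a toy model, not Sumatra's engine: `StrangerTransferWindow` is a hypothetical name, events are assumed to arrive in timestamp order, and the exact handling of the 30-day window boundary is an assumption.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)


class StrangerTransferWindow:
    """Toy version of transfers_from_stranger (hypothetical helper)."""

    def __init__(self):
        # receiver account -> timestamps of transfers received from strangers.
        self._events = defaultdict(deque)

    def on_transfer(self, receiver, ts, is_friend):
        """Record a transfer event; only stranger transfers are counted."""
        if not is_friend:
            self._events[receiver].append(ts)

    def on_cash_out(self, acct_id, ts):
        """Return how many stranger transfers the account received in the window."""
        events = self._events[acct_id]
        # Evict timestamps that are 30 days old or older (assumed boundary).
        while events and events[0] <= ts - WINDOW:
            events.popleft()
        return len(events)


t0 = datetime(2023, 1, 1)
window = StrangerTransferWindow()
window.on_transfer("acct_1", t0, is_friend=False)
window.on_transfer("acct_1", t0 + timedelta(days=10), is_friend=True)
window.on_transfer("acct_1", t0 + timedelta(days=20), is_friend=False)
count_day_25 = window.on_cash_out("acct_1", t0 + timedelta(days=25))
count_day_45 = window.on_cash_out("acct_1", t0 + timedelta(days=45))
```

The second cash_out sees a smaller count because the first stranger transfer has aged out of the window.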
2. Train your ML model on backfilled features—no reimplementation required
Sumatra includes a distributed offline compute engine (similar to Spark), which can quickly backfill Scowl features over historical data to generate a model training set as a dataframe in Python.
from sumatra import Client

# Client is imported by name, so it is instantiated directly.
client = Client()
train = client.replay(
    features=["cash_out.*"],
    start_ts="2022-12-01",
    end_ts="2023-03-01",
)
df = train.get_features("cash_out")
Note that the Scowl feature definitions are used directly for replay, with no reimplementation required. Backfilled features are point-in-time consistent with what would have been computed online.
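Point-in-time consistency means each historical row carries the feature value as it stood when that event occurred, never a value that includes later events. The sketch below illustrates the idea with a running total. It is not Sumatra's replay engine, and `backfill_running_total` and the feature name `dollars_in_by_account` are hypothetical.

```python
from collections import defaultdict


def backfill_running_total(events):
    """Replay events in timestamp order and attach a point-in-time feature.

    Each output row sees only events up to and including its own timestamp,
    which is what an online system would have seen at that moment.
    """
    totals = defaultdict(float)
    rows = []
    for event in sorted(events, key=lambda e: e["ts"]):
        totals[event["acct_id"]] += event["amount"]
        rows.append({**event, "dollars_in_by_account": totals[event["acct_id"]]})
    return rows


# Made-up history, deliberately out of order (ISO dates sort as strings).
history_events = [
    {"ts": "2023-01-03", "acct_id": "a", "amount": 30.0},
    {"ts": "2023-01-01", "acct_id": "a", "amount": 10.0},
    {"ts": "2023-01-02", "acct_id": "b", "amount": 5.0},
]
rows = backfill_running_total(history_events)
```

The row for the first deposit on account "a" keeps its own total even though a later deposit exists in the history, so a model trained on these rows never sees information from the future.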
Now let's train a gradient boosted tree model on our dataframe with the popular scikit-learn package and save it as a PMML model artifact using sklearn2pmml.
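import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# `labels` must hold the known outcome (scam or not) for each row of df;
# this example does not show how it is obtained.
pipeline = PMMLPipeline(
    [
        # Fill missing feature values with the column mean before fitting.
        ("imputer", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("classifier", GradientBoostingClassifier(n_estimators=30)),
    ]
)
pipeline.fit(df, labels)
sklearn2pmml(pipeline, "cash_out_model.xml", with_repr=True)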
3. Deploy model directly in Sumatra
Sumatra not only computes model inputs; it also executes machine learning models in the same environment, with no need to stand up separate model services.
First, we upload our PMML, which can be done from the same Python client used to build our training set:
client.create_model_from_pmml("cash_out_model", "cash_out_model.xml")
The platform automatically tracks model versions and their input/output schemas.
Finally, we update our Scowl code to perform model inference at cash_out time:
-- cash_out.scowl
score := ModelPredict<cash_out_model>({
  transfers_from_stranger,
  risky_deposits,
  dollars_out_by_account,
  ...
}).probability_scam
And we're live!
When we click "Apply" in Sumatra, we deploy a serverless service that performs our online feature computation and model serving with near-instant freshness, a median latency of around 50 ms, and effortless auto-scaling up to 10,000 TPS.
More fraud recipes
To check out another recipe for reducing fraud and abuse with Sumatra, see: Supercharge Your Stripe Radar with ATO Risk Signals.
Get started
Get AI without the gimmicks