Prevent Fake Account Creation with Smart Rate Limiting

Fake accounts (aka synthetic accounts or bot accounts) are used to wreak havoc on social media platforms, commit credit card fraud, launder money, abuse affiliate programs, and even disguise account takeover attacks.

Add "smart rate limiting" based on behavioral, device, domain, and geolocation data to stop the bots without impacting good users.

Stop the bots with real-time risk signals.

IP rate limits are not enough

Any remotely sophisticated abuser has multiple IP addresses at their disposal for running their scripts at scale.
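To see why a fixed per-IP limit falls short, consider a minimal Python sketch. The limit of 5 and the IP addresses (taken from documentation-reserved ranges) are illustrative assumptions, not values from any real system.

```python
from collections import Counter

FIXED_LIMIT = 5  # hypothetical: signups allowed per IP within one window


def allowed_signups(attempt_ips, limit=FIXED_LIMIT):
    """Count how many attempts pass a fixed per-IP limit in one window."""
    seen = Counter()
    allowed = 0
    for ip in attempt_ips:
        seen[ip] += 1
        if seen[ip] <= limit:
            allowed += 1
    return allowed


# 100 attempts from a single IP address
single_ip = ["203.0.113.7"] * 100
# 100 attempts rotated across 50 IP addresses (2 attempts each)
rotating = [f"198.51.100.{i % 50}" for i in range(100)]
```

Under these assumptions, `allowed_signups(single_ip)` lets 5 attempts through, while `allowed_signups(rotating)` lets all 100 through, because no single address ever exceeds the limit.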

Combine a suite of risk signals

Combine behavioral, device, domain, and geolocation signals to detect suspicious signups across IPs.

Bad for the bots, good for customers

With smart rate limiting, the improved detection accuracy means less abuse and less friction on good signups.

Less Abuse. Less Friction.

Smart rate limiting adjusts how much traffic is allowed through based on an assessment of each IP's risk level. Because the limit is driven by real-time risk signals, it targets abuse more precisely than a fixed rate limit.

As a result, smart rate limiting stops more bots and stops them sooner, while allowing more good signups through.

How to Build a Smart Rate Limiter Without the Lift

Using Sumatra, a smart rate limiter can be deployed by a single engineer in an afternoon.

1. Call Sumatra at signup time

Using the SDK for your backend language (Node or Python), pass signup data to Sumatra and get back a thumbs up/down decision on whether to allow it.

import sumatra

# Authenticate with your Sumatra API key
sumatra.api_key = "7c096c2f-4023-4696-814e-a4c371517e8e"

# Send the signup event; the response includes a verdict
features = sumatra.enrich("signup", {
    "email": "",
    "ip": "",
    "name": "Darrel Smith",
    "browser": {"language": "en-US"},
})

if features["verdict"] == "Block":
    ...  # block signup
else:
    ...  # allow signup

2. Create risk signals in low code

Sumatra provides a concise language, Scowl, for defining powerful, stateful features that are computed in real time. Here are a few examples of how you might identify risky signups:

Did they supply a gibberish name?

gibberish_score := GibberishNameScore(name)

For example, "bsdfkjbasdf asdkjfhskjdhf" would get a high gibberish score.
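How `GibberishNameScore` works internally is not described here. As a rough intuition only, the following Python sketch uses a hypothetical heuristic: it scores a name by its longest run of consecutive consonants. The function name, the vowel set, and the thresholds are all assumptions made for illustration.

```python
import re

# Hypothetical heuristic, not Sumatra's implementation:
# long runs of consonants are rare in real names.
CONSONANT_RUN = re.compile(r"[^aeiouy\W\d_]+")


def gibberish_score(name: str) -> float:
    """Return a score in [0, 1]; higher means more gibberish-like."""
    runs = CONSONANT_RUN.findall(name.lower())
    longest = max((len(run) for run in runs), default=0)
    # Runs of up to 3 consonants score 0; runs of 7 or more score 1.
    return min(1.0, max(0.0, (longest - 3) / 4))
```

With this heuristic, `gibberish_score("bsdfkjbasdf asdkjfhskjdhf")` is 1.0 and `gibberish_score("Darrel Smith")` is 0.0. A production-quality score would need to handle many languages and naming conventions, which this sketch does not attempt.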

Is the email handle similar to the name?

name_handle_similarity := 
    StringSimilarity(Lower(name), Lower(RemoveNonAlpha(EmailHandle(email))))

For example, "Bob Smith" and "" would have high similarity.
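The same idea can be approximated in plain Python with the standard library's `difflib.SequenceMatcher`. This is a sketch under assumptions: the similarity metric behind `StringSimilarity` is not specified in this article, and the email addresses in the usage note are hypothetical.

```python
import re
from difflib import SequenceMatcher


def name_handle_similarity(name: str, email: str) -> float:
    """Compare a lowercased name with the alphabetic part of the email handle."""
    handle = email.split("@", 1)[0]                      # text before the "@"
    clean_handle = re.sub(r"[^A-Za-z]", "", handle).lower()
    # ratio() is 1.0 for identical strings and 0.0 when nothing matches
    return SequenceMatcher(None, name.lower(), clean_handle).ratio()
```

For instance, `name_handle_similarity("Bob Smith", "bob.smith@example.com")` is above 0.9, while `name_handle_similarity("Bob Smith", "xq7z19k@example.com")` is 0.0.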

Is the email domain in the disposable domain table?

domain := EmailDomain(email)
disposable_domain := Lookup<disposable_domains>(by domain)

One of Sumatra's built-in Tables is a list of more than 100,000 known disposable email domains.
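Conceptually, the lookup is a set membership test on the normalized domain. The Python sketch below assumes a tiny hand-written sample set in place of the full table; the helper names are hypothetical.

```python
# Tiny illustrative sample; a real table would hold many thousands of entries.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}


def email_domain(email: str) -> str:
    """Return the lowercased part of the address after the last "@"."""
    return email.rsplit("@", 1)[-1].strip().lower()


def is_disposable(email: str) -> bool:
    """True when the email's domain appears in the sample set."""
    return email_domain(email) in DISPOSABLE_DOMAINS
```

Normalizing case before the lookup matters, since `someone@Mailinator.com` and `someone@mailinator.com` refer to the same domain.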

Does the IP country match the browser locale?

ip_country := IPCountry(ip)
locale_country := language.Split("-")[-1]
ip_locale_country_match := ip_country = locale_country

Out of the box, Sumatra can enrich events with the IP country. A US-based IP would match the "en-US" locale.
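One detail worth handling in any implementation is a language tag with no region, such as "en". The Python sketch below is one possible approach: it treats the IP country as an input already resolved elsewhere, and returns `None` when the locale carries no country to compare. The function names are hypothetical.

```python
from typing import Optional


def locale_country(language_tag: str) -> Optional[str]:
    """Extract a two-letter region from a tag like "en-US", if present."""
    parts = language_tag.replace("_", "-").split("-")
    # Skip the first subtag (the language) and look for a two-letter region.
    for part in parts[1:]:
        if len(part) == 2 and part.isalpha():
            return part.upper()
    return None


def ip_locale_country_match(ip_country: str, language_tag: str) -> Optional[bool]:
    """Compare the IP country with the locale's region; None if unknown."""
    country = locale_country(language_tag)
    if country is None:
        return None
    return ip_country.upper() == country
```

Returning `None` rather than `False` for a region-less tag keeps "no information" distinct from "mismatch", so a good user whose browser sends only "en" is not penalized.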

Has traffic spiked for this email domain?

count_by_domain_1w := Count(by domain last week)
count_by_domain_1h := Count(by domain last hour)
count_by_domain_ratio := count_by_domain_1h / Maximum(count_by_domain_1w, 1)

Time-windowed aggregates, like these counts, are Sumatra's bread and butter. Scowl supports a large collection of aggregate functions.
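To make the windowed counts concrete, here is a minimal in-memory Python sketch of the same ratio. It assumes events arrive in timestamp order and that timestamps are in seconds; the class and function names are hypothetical, and a real system would need persistent, distributed state.

```python
from collections import defaultdict, deque

HOUR = 3600
WEEK = 7 * 24 * HOUR


class WindowedCounter:
    """Keeps per-key event timestamps (seconds) for up to one week."""

    def __init__(self, max_window=WEEK):
        self.max_window = max_window
        self.events = defaultdict(deque)

    def add(self, key, ts):
        # Assumes timestamps are added in non-decreasing order per key.
        self.events[key].append(ts)

    def count(self, key, now, window):
        q = self.events[key]
        # Drop events that have aged out of the largest window.
        while q and q[0] <= now - self.max_window:
            q.popleft()
        return sum(1 for ts in q if ts > now - window)


def spike_ratio(counter, key, now):
    """Share of the last week's events that arrived in the last hour."""
    last_hour = counter.count(key, now, HOUR)
    last_week = counter.count(key, now, WEEK)
    return last_hour / max(last_week, 1)
```

Because the last hour is part of the last week, this ratio falls between 0 and 1. Steady traffic yields a value near 1/168, so a value close to 1 means nearly all of the week's signups for that domain arrived in the past hour.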

Has traffic spiked in this geographical region?

ip_geo := IPLocate(ip)
geo_area := GeoHashEncode(, ip_geo.lng).Substr(0, 5)
count_by_geo_1w := Count(by geo_area last week)
count_by_geo_1h := Count(by geo_area last hour)
count_by_geo_ratio := count_by_geo_1h / Maximum(count_by_geo_1w, 1)

Finally, this risk signal showcases several powerful features at once: IP geolocation, geo-hash encoding, and time-windowed aggregates.
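For readers unfamiliar with geohashes, the following Python sketch implements the standard geohash encoding. It is not Sumatra's `GeoHashEncode`; the function name and the argument order (latitude, then longitude) are assumptions. Truncating to 5 characters groups nearby coordinates into the same cell, a few kilometers across.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lng: float, precision: int = 5) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

For example, `geohash_encode(57.64911, 10.40744)` returns `"u4pru"`, and the nearby point `(57.65, 10.41)` falls in the same cell, which is what lets a count keyed on the 5-character prefix capture a regional spike.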

3. Set rate limit based on risk level

Instead of treating all IPs the same, use Scowl to combine the risk from the computed signals and set a dynamic rate limit on each signup attempt. If the IP exceeds the rate limit, then block.

-- rules.scowl
rate_limit := Case(
    when score > 500 then 0
    when score > 300 then 1
    when score > 100 then 2
    default 5
)

count_by_ip_12h := Count(by ip last 12 hours)
rule_smart_ip_rate_limit := Block when count_by_ip_12h > rate_limit
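The rule above can be mirrored in Python to show the decision logic end to end. The thresholds come from the rule itself; how `score` is computed is not shown in this article, so the weighted sum and its weights below are purely hypothetical, as are the function names. The sketch also assumes the 12-hour count includes the current attempt.

```python
# Hypothetical weights: the article does not say how `score` is derived.
WEIGHTS = {
    "gibberish_name": 200,
    "disposable_domain": 250,
    "ip_locale_mismatch": 100,
    "domain_spike": 150,
    "geo_spike": 150,
}


def risk_score(signals: dict) -> int:
    """Sum the weights of every signal that fired."""
    return sum(weight for name, weight in WEIGHTS.items() if signals.get(name))


def rate_limit_for(score: float) -> int:
    """Allowed signups per IP in 12 hours, tightened as risk rises."""
    if score > 500:
        return 0
    if score > 300:
        return 1
    if score > 100:
        return 2
    return 5


def should_block(score: float, count_by_ip_12h: int) -> bool:
    """Block when the IP's 12-hour count exceeds its risk-based limit."""
    return count_by_ip_12h > rate_limit_for(score)
```

With these illustrative weights, a signup with a gibberish name, a disposable domain, and an IP and locale mismatch scores 550, so its limit is 0 and even a first attempt is blocked. A signup with no signals firing keeps the default limit of 5.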

Over time, this simple rule-based logic can become more sophisticated and incorporate machine learning models, trained on data sets replayed by Sumatra.

Once the initial integration is complete, the logic can be extended and enhanced by any data-savvy team member, with no additional engineering work required.

More fraud recipes

To check out another recipe for reducing fraud and abuse with Sumatra, see: Supercharge Your Stripe Radar with ATO Risk Signals.

Ready to start building these and many more fraud signals?