Discussion about this post

Drake Morrison

Man, this is why you are consistently in the same tier as Eliezer and Scott in my mind. Clearly naming an important phenomenon while being compassionate and clear-headed about multiple perspectives on it, in a way that makes me immediately want to tell all my friends.

Sarah Nibs

At work, where we predict which ecommerce transactions are fraudulent and have financial incentive to be correct, we spent years being bitten by large fraud attacks where each individual transaction looked a bit suspicious but was not bad enough to block, and no assets tied all the transactions together in any obvious way. Eventually (finally) we began assessing the system as a whole and asking whether there was an elevated volume of just-under-the-threshold transactions in some segment of traffic. When there was, we lowered the threshold. For everyone. For a time.

And the reason we let fraudsters get away with it for so long is in large part that all of our systems were set up to assess One Single Transaction, plus other transactions concretely tied to it (though explicitly not in any way that could set up a dangerous feedback loop). None of them were set up to recognize the obvious-in-retrospect threshold attacks.

It's far simpler to narrow the scope of the problem to "assess this instance", and then your data model doesn't have a natural place to include global information and it doesn't fit through any of your nice interfaces. And you miss gigantic attacks on the system through myopia.

All that is to say (1) this for sure happens in ... "non-social"? contexts too, and (2) it can happen even if the thresholder isn't actually trying to be a thresholder. Though obviously a thresholder with *knowledge* of the rules can be a lot more efficient about it.
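
The system-level check described in this comment could be sketched roughly as follows. This is only an illustration, not the commenter's actual system: the `Transaction` fields, the function names, and every numeric default (threshold, band width, baseline share, ratio) are assumptions made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transaction:
    segment: str   # hypothetical traffic segment label
    score: float   # hypothetical fraud score in [0, 1]; higher is more suspicious


def near_threshold_share(txns, threshold, band):
    """Fraction of transactions scoring in [threshold - band, threshold)."""
    if not txns:
        return 0.0
    near = sum(1 for t in txns if threshold - band <= t.score < threshold)
    return near / len(txns)


def elevated_segments(txns, threshold, band, baseline_share, max_ratio):
    """Segments whose just-under-threshold share exceeds max_ratio times the baseline."""
    by_segment = defaultdict(list)
    for t in txns:
        by_segment[t.segment].append(t)
    return sorted(
        seg for seg, group in by_segment.items()
        if near_threshold_share(group, threshold, band) > baseline_share * max_ratio
    )


def adjusted_threshold(txns, threshold=0.8, band=0.1, baseline_share=0.05, max_ratio=3.0):
    """Lower the blocking threshold for everyone if any segment looks elevated."""
    if elevated_segments(txns, threshold, band, baseline_share, max_ratio):
        return threshold - band
    return threshold
```

The point of the sketch is that the decision takes the whole batch of traffic as input rather than one transaction at a time. The "for a time" part (restoring the original threshold once the elevated volume subsides) is left out here and would need some notion of a time window.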
