Thursday, September 13, 2012

How to deter without being a dick

Effectively deterring humans is a bit different from effectively deterring an idealized game-theoretic agent. If you're a skilled elephant jockey, you can remove a good amount of this discrepancy, but you need to know how it works so that you can correct for it.

If we look at what actually motivates our actions, we find that rewards and punishments (including those from simulations) are the driving factor. Basically, if it's not near mode, it's not motivating. And since people use near mode more for near-term things, this gives us hyperbolic discounting - people act as if they value immediate things hyperbolically more than future things. This really blows up for instant feedback, so you can get effective deterrence with minimal utility loss if you do it right away.

Aversion is strongly associated with fear and dread. Think of something that is hard to think about, or that surge of terror from the approaching lion - or the dread of waking up at 5 am to go to work - where your brain is just screaming "No! No! No no no!" - that is the feeling of aversion.
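To put rough numbers on the hyperbolic discounting point, here is a minimal sketch using the common textbook form, felt value = amount / (1 + k * delay). The function name, the discount constant `k`, and the time units are all assumptions made for illustration, not measurements of anything.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Felt size of a consequence of size `amount` arriving after `delay`.

    Uses the standard hyperbolic form amount / (1 + k * delay).
    `k` is a made-up discount constant; larger k means steeper discounting.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return amount / (1.0 + k * delay)


# A small punishment delivered instantly...
instant_small = hyperbolic_value(amount=1.0, delay=0.0)

# ...versus a punishment five times larger that arrives 10 time units later.
delayed_large = hyperbolic_value(amount=5.0, delay=10.0)

# With these assumed numbers the instant one feels bigger,
# even though it costs only a fifth as much.
print(instant_small > delayed_large)
```

With k = 1, the delayed punishment is felt at 5/11 of a unit while the instant one is felt at a full unit, which is the sense in which instant feedback buys deterrence cheaply.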

Not all 'negative' emotions deter.  Interestingly enough, sadness doesn't seem to do it. Dreading sadness might, but sadness itself doesn't seem to.  Empirically, people that are bumming out for extended periods of time don't act extraordinarily motivated to stop it. However, people that are absorbed in fear will do whatever they can to make sure that they stop feeling it -  even if they have to develop a phobia to do it.

We want to maximize the feeling of aversion while minimizing actual loss of utility. This means you want things that demand near mode thought. Encourage vivid imagination by giving vivid descriptions. Make it instant. Make it scary. If you want it to stick in the absence of punishment, make it intermittent. But we don't want it to actually destroy value. So make it brief. Generate scary without harm. Even make a game about it - deterrence works even when you're having fun. One of my preferred methods of deterrence (from both sides) is to playfully say "No!" as if you're talking to a misbehaving puppy. It's mild and playful enough to not burn any utility or derail the conversation, but still effective enough to stop bad conversational habits!

Electric shocks are perfect on all fronts. Instant, scary, and harmless. Makes me want to put on a shock collar and hand the remote to someone who is good at spotting self-deception...

The main failure of attempts to deter behavior is that the wrong thing gets conditioned against. If you're thinking "Oh god, why am I shocking myself?!" then you're conditioning against shocking yourself, and your incentive scheme itself won’t stick. Remember, classical conditioning is simple. Fire together, wire together. It's the salient cues that will be associated with the punishment; if the focus is on the punisher, then the conditioned aversion will be tied to the punisher, not the bad deed - even if it's an intrapersonal issue.

If you're punishing people (again, including yourself) with disapproval, you run the risk of them finding it unfair and associating you with the unpleasantness. It's safer and more effective to ask them in a neutral tone if this is something they should be doing - and let them feel that counterfactual dread - and deter themselves. If they won’t do this even after the costs are explained, they're declaring war anyway.

A safer option is to deter people by removing positive attention, but it is more limited. This is a form of "negative punishment", and is safer because it is less likely to be framed as a manipulative attack, and more likely to be framed as simply not giving rewards that weren't earned.

Deterrence is tricky, but not necessarily destructive. Just make sure you're aware of what you're doing.


  1. >If we look at what actually motivates our actions, we find that rewards and punishments (including those from simulations) are the driving factor.

    "If people are good only because they fear punishment, and hope for reward, then we are a sorry lot indeed."
    -Albert Einstein

    "Reward and punishment" is definitely an interesting double bind, and it gets results (of a certain kind), but as for being the best or only option I can't say that I agree.

    1. If you really know your stuff, I'd love to get detailed feedback from you. The thing is, Sturgeon's Law applies to blog commenters too, and since I don't know you, I'm not sure why I or my readers should care if you personally disagree. Either way, it'd be helpful if you expanded on why you disagree.

      In any case, with respect to the quote, it appears that you missed the point of the second half of the post. That quote is talking about external pressures, and my point is to internalize them. If you don't do bad things because knowingly doing bad things makes you feel bad - and you reflectively endorse that bad feeling, then it's just that you're a good person and you don't want to be bad. The quote doesn't apply to you - yet you can still model the process with operant conditioning.

    2. For the record, Aeonios did turn out to be an interesting fellow that I'm interested in talking to :)