Last modified: 2013-05-03 19:34:16 UTC
Just like we have $wgCaptchaBadLoginAttempts, there should be a config array of maximum attempts for the other $wgCaptchaTriggers. Without this, guessing and building canned answer libraries becomes easier (though $wgCaptchaDeleteOnSolve helps).
I agree that this is important, because all the captchas appear to have been cracked. Asirra used to work pretty well, but who knows, maybe the spammers are using brute force to get through all 4,096 (2^12; there are twelve photos with two possibilities, cat or dog, apiece) possibilities. Changing $wgAsirraCellsPerRow doesn't help; if, say, you change it to 8, then it will simply put 8 on the top row and 4 on the bottom, so you still only have 12. Is there a way to impose a throttle without possibly causing collateral damage to other users of that IP address? Maybe the only solution is to disallow edits by anons, and impose per-account throttles on CAPTCHA attempts. So, e.g., if FooUser256 has x number of bad attempts within a y-second period, he gets throttled. The problem is, he can use brute force to get through the CAPTCHA to create many accounts, and then sit around making attempts in parallel to get through the CAPTCHA on each account. So, even if there is a throttle, with enough accounts, he can eventually get through.
Maybe a better solution would be to present the user with two CAPTCHAs? It's twice as much work for users, but makes the CAPTCHA exponentially more difficult to crack through guessing. I'm thinking that would be particularly useful for QuestyCaptcha.
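To put rough numbers on the guessing argument above, here is a small sketch of the arithmetic. It assumes the attacker guesses uniformly at random and that every attempt is independent; the function names are made up for illustration and are not part of ConfirmEdit.

```python
# Sketch only: size of the answer space for blind guessing.
# Assumes uniform, independent random guesses; names are hypothetical.

ASIRRA_PHOTOS = 12  # twelve photos, as described above
CHOICES = 2         # cat or dog


def answer_space(photos: int, choices: int = CHOICES, captchas: int = 1) -> int:
    """Number of possible answers when `captchas` challenges must all be solved."""
    return (choices ** photos) ** captchas


def success_probability(attempts: int, space: int) -> float:
    """Chance that at least one of `attempts` random guesses is correct."""
    return 1.0 - (1.0 - 1.0 / space) ** attempts


one_captcha = answer_space(ASIRRA_PHOTOS)               # 2^12
two_captchas = answer_space(ASIRRA_PHOTOS, captchas=2)  # 2^24

print(one_captcha)   # 4096
print(two_captchas)  # 16777216
print(success_probability(4096, one_captcha))   # chance after 4096 tries, one CAPTCHA
print(success_probability(4096, two_captchas))  # same effort against two CAPTCHAs
```

Under these assumptions a second CAPTCHA squares the number of possible answers, but without a cap on attempts either space can still be exhausted eventually.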
I agree, this would be a very useful feature, although implementing it needs to be handled carefully. For passwords, you typically either present a captcha, or you introduce an exponentially-increasing delay. To get a handle on the effect, it would be nice to start logging captcha presentations, in addition to the pass/fail logging that we do. That will let us calculate the pass or fail rate of a single IP or User. I suspect that if we throttle based on the pass rate, instead of a static number of requests, that would more accurately block someone brute forcing, while not disrupting edits for users who happen to be behind a proxy with a large number of legitimate users.
Clarifying this a little, the feature request is that we limit the number of captchas that a single user can harvest in a given amount of time. Maybe keep two keys in memcache for each user, one counting captcha passes and one counting fails. Then if fails > X and fails / (fails + passes) > Y%, we refuse to serve up any more captchas?
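A minimal sketch of that check, to make the proposal concrete. The in-memory dict stands in for memcache, and the key format, function names, and the values chosen for X and Y are all assumptions for illustration, not anything ConfirmEdit actually provides. Expiry of the counters (the "given amount of time") is left out.

```python
# Sketch only: pass-rate throttle with two counters per user or IP.
# A plain dict stands in for memcache; thresholds are hypothetical.

MAX_FAILS = 20        # "X" in the comment above (assumed value)
MAX_FAIL_RATE = 0.90  # "Y%" in the comment above (assumed value)

counters = {}  # e.g. {"captcha:pass:1.2.3.4": 3, "captcha:fail:1.2.3.4": 7}


def record_attempt(who: str, solved: bool) -> None:
    """Increment the pass or fail counter for this user or IP."""
    key = f"captcha:{'pass' if solved else 'fail'}:{who}"
    counters[key] = counters.get(key, 0) + 1


def should_refuse_captcha(who: str) -> bool:
    """True when fails > X and fails / (fails + passes) > Y."""
    passes = counters.get(f"captcha:pass:{who}", 0)
    fails = counters.get(f"captcha:fail:{who}", 0)
    total = passes + fails
    if total == 0:
        return False
    return fails > MAX_FAILS and fails / total > MAX_FAIL_RATE


# A busy shared proxy: many failures, but plenty of passes too.
for _ in range(100):
    record_attempt("proxy", solved=False)
for _ in range(50):
    record_attempt("proxy", solved=True)

# A brute forcer: failures only.
for _ in range(500):
    record_attempt("bot", solved=False)

print(should_refuse_captcha("proxy"))  # False
print(should_refuse_captcha("bot"))    # True
```

Requiring both conditions is what keeps the shared proxy usable: its raw failure count is high, but its failure rate is not.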
It's not just a question of limiting the number of captchas that a single user can harvest... there needs to be a limit on the number of failed attempts. Handing them one CAPTCHA and letting them make 4096 guesses at it won't help. The $wgCaptchaBadLoginAttempts variable actually isn't intended to limit the number of failed CAPTCHA attempts. It's intended to limit the number of bad password attempts on an existing account before subsequent login attempts require that both a CAPTCHA and a password be supplied. As such, a bot attempting to create 4096 shiny new accounts against Asirra by submitting a random 12-bit answer "dog, cat, dog, dog, dog, cat, dog, dog, dog, cat, dog, dog" will get one new account and 4095 errors. If after 'n' consecutive bad CAPTCHA attempts the offending bot were simply banned from the Internet (http://en.uncyclopedia.co/wiki/Banned_from_the_Internet), there would be no 4096th attempt to get past Asirra.
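As a sketch of the "n consecutive bad attempts" idea, something like the following would cut the bot off long before the 4096th guess. The class name, the limit of 5, and the choice to key on an arbitrary client string are assumptions for illustration; how to identify the client (IP, account, session) is exactly the open question in this bug.

```python
# Sketch only: lock a client out after n consecutive failed CAPTCHA attempts.
# Class name and limit are hypothetical; the dict stands in for a shared cache.

MAX_CONSECUTIVE_FAILS = 5  # the 'n' above (assumed value)


class FailureLockout:
    def __init__(self, limit: int = MAX_CONSECUTIVE_FAILS):
        self.limit = limit
        self.streaks = {}  # client -> current run of consecutive failures

    def is_locked(self, client: str) -> bool:
        return self.streaks.get(client, 0) >= self.limit

    def record(self, client: str, solved: bool) -> bool:
        """Return True if the attempt is accepted, False if rejected."""
        if self.is_locked(client):
            return False  # refused outright, even if the answer was right
        if solved:
            self.streaks.pop(client, None)  # a success resets the streak
            return True
        self.streaks[client] = self.streaks.get(client, 0) + 1
        return False


lockout = FailureLockout()
for _ in range(5):
    lockout.record("bot", solved=False)

print(lockout.is_locked("bot"))            # True
print(lockout.record("bot", solved=True))  # False: the lucky guess arrives too late
```

A real version would also need the lockout to expire, otherwise a legitimate user who fumbles a few CAPTCHAs stays blocked forever.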
See also https://gerrit.wikimedia.org/r/#/c/44376 , which, as Chris noted there, is relevant to this. It's an API for reloading CAPTCHAs, which allows an AJAX reload button.
*** Bug 47740 has been marked as a duplicate of this bug. ***