And then ask the question: “Would this kitty fit into a shoebox? Why, or why not?” Then sort the answers manually. (Bonus: it’s cuter than a captcha.)
This would not scale well, and you’d need a secondary method for blind users, but I don’t think that bots would be able to solve it correctly.
It’s fine if the photo is either shopped or a false-perspective illusion. It could even be a drawing. The idea is that this sort of picture imposes a lot of barriers for the bot in question:
must be able to parse language
must be able to recognise objects in a picture, even out-of-proportion ones
must be able to guesstimate the size of those objects, based on nearby ones
must handle real-world knowledge, such as “X only fits in Y if X is smaller than Y”
must handle hypothetical, unrealistic scenarios, such as “what if there was a kitty this big?”
Each of those barriers decreases the likelihood of a bot being able to solve the question.
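To make the "each barrier lowers the odds" point concrete, here is a rough sketch. The success rates are made-up guesses for illustration, not measurements, and the sketch assumes the barriers are independent of each other, which real bots may well violate.

```python
from math import prod

# Hypothetical per-barrier success rates for a bot (illustrative guesses, not data).
barrier_rates = {
    "parse the question": 0.9,
    "recognise out-of-proportion objects": 0.7,
    "estimate size from nearby objects": 0.6,
    "apply real-world knowledge": 0.8,
    "handle an unrealistic scenario": 0.5,
}

def chance_of_clearing_all(rates):
    """Probability of clearing every barrier, assuming they are independent."""
    return prod(rates)

overall = chance_of_clearing_all(barrier_rates.values())
print(f"{overall:.3f}")  # prints 0.151
```

Even with fairly generous guesses for each step, the combined chance ends up at about 15%, because the rates multiply.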
Here’s where things get interesting: humans could theoretically come up with multiple answers for this. Some will have implicit assumptions (such as the size of the shoebox), some won’t be actual answers (like “what’s the point of this question?”), but they should show a type of context awareness that [most? all?] bots don’t.
A bot would answer this mechanically. At best it would be something like “yes, because your average kitten is smaller than your average shoebox”. The answer would be technically correct but disregard context completely.
Reminds me of how bots tend to be really bad at figuring out whether the word “it” applies to the subject or the object in a sentence like: “The bed does not fit in the tent because it is too big.”
Yup - they struggle really hard with syntactical ambiguity that relies on world knowledge for disambiguation. We know that “it” = “the bed” in this sentence because “it is too big” needs to be logically connected as the reason for “the bed does not fit in the tent”, and the only way for this to happen that doesn’t conflict with our world knowledge is if the bed is big, but the tent is small. And we can even change the “it” to refer to the object by simply changing the adjective:
The bed does not fit in the tent because it is too small.
Without any sort of grammatical change.
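This kind of minimal pair is usually called a Winograd schema. Below is a small sketch of how such a pair could be stored and checked. The `SchemaPair` class and its field names are hypothetical, made up for illustration, and the exact-match check is deliberately naive: a real setup would still need a human sorting free-form answers.

```python
from dataclasses import dataclass

@dataclass
class SchemaPair:
    """A sentence template where swapping one adjective flips what "it" refers to."""
    template: str   # must contain an {adj} placeholder
    referents: dict  # adjective -> what "it" refers to

    def sentence(self, adj):
        return self.template.format(adj=adj)

    def is_correct(self, adj, reply):
        # Naive check: exact match after trimming whitespace and lowercasing.
        return reply.strip().lower() == self.referents[adj]

pair = SchemaPair(
    template="The bed does not fit in the tent because it is too {adj}.",
    referents={"big": "the bed", "small": "the tent"},
)

for adj in pair.referents:
    print(pair.sentence(adj), "->", pair.referents[adj])
```

Asking both variants of the same pair makes it harder to pass by always guessing the subject (or always the object).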
Donkey sentences are also hard for them, like:
Everyone who owns a donkey beats it.
If you’re human, this sentence implies that 1) there are multiple donkeys, owned by different people; and 2) each of those people beats their own donkey. But machines have a really hard time getting those two things right.
And you can exploit a lot of those quirks of real-life language to make the bots go nuts. A few bots might slip through, but each question is low-cost for the humans, so you can pile the questions up.
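As a back-of-the-envelope sketch of the piling-up idea: if a bot slips through any single question with probability `p`, and the questions are independent (a simplifying assumption, and `p` here is a made-up number), then the chance of slipping through `k` of them in a row is `p**k`. The helper name below is hypothetical.

```python
def questions_needed(p, target):
    """Smallest k such that p**k <= target, for 0 < p < 1 and 0 < target < 1."""
    if not (0 < p < 1 and 0 < target < 1):
        raise ValueError("p and target must both be strictly between 0 and 1")
    k = 1
    while p ** k > target:
        k += 1
    return k

# Hypothetical: a bot gets any one question right half of the time,
# and we want fewer than 1 in 100 bots to get through.
print(questions_needed(0.5, 0.01))  # prints 7
```

So even a bot that gets half of the questions right would need only a handful of stacked questions to be filtered out most of the time, which fits with the questions being cheap for humans.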
Show a picture like this:
This particular photo is shopped, but I think false-perspective illusions might actually be a good path…
Is the kitty big, or is the man small? And how big are the shoes? This is a difficult question.