Paul Rosenzweig, writing on Lawfare, remarks on the ongoing effort to tell people apart from bots within the broader problem of fake news:
A recent project by “Truthy” at Indiana University gives a good window into how the technology might work (and how difficult it may be to implement at scale). The project, known as BotOrNot, is an academic effort partially funded by NSF and DoD. The objective is a limited one: to assess whether the traffic from a particular Twitter account can be analyzed to determine whether the account is connected to a real human being or is controlled by a bot network. Because the assessment is probabilistic rather than definitive, the “score” assigned to an account is a numerical percentage rather than an absolute “bot or not” determination.
And thus my contingent humanity. The analytics of my own Twitter account (@RosenzweigP) assign me a score of 13% as a bot (or, reciprocally, 87% as a real human being). The analytics rely on things like the timing of my tweets, their language structure, and my use of hashtags as a way of gauging my genuine nature. Interestingly, the area in which I most resemble a bot seems to be sentiment analysis: apparently, my tweets have been angry of late (you can imagine why), and that is indicative, at some level, of artificiality.
That last bit, about how the software actually operates, appealed to my contrarian nature. What if I were a bot writer, dedicated to retweeting fake news, and this score came up? What would I do?
Actually, I’d write my bot to be trainable, and then I’d train it on known human Twitter accounts. Let it learn how humans write, when they write, and the patterns that emerge, and then have it use those same patterns when it goes about its dark business. I’m not saying it would be easy (you’d need some training and experience in the Big Data arena, I’m sure), but it’s probably doable.
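To make that concrete, here is a minimal, purely illustrative sketch of the idea, assuming we already have a corpus of (timestamp, text) pairs collected from known human accounts. The function names (learn_patterns, sample_next_post) and the toy corpus are my own inventions for illustration, not any real bot's code or BotOrNot's method. The sketch learns when the humans post (hours of day, gaps between tweets) and a crude word-level Markov chain of how they phrase things, then samples a posting delay and a snippet of text that echoes those patterns.

```python
# Illustrative sketch only: learn simple posting patterns from a small corpus of
# human tweets and sample "human-looking" behavior from them.
import random
from collections import defaultdict
from datetime import datetime, timedelta


def learn_patterns(human_tweets):
    """human_tweets: list of (datetime, text) pairs from known human accounts."""
    hours = [ts.hour for ts, _ in human_tweets]                 # when humans tend to post
    gaps = [(b[0] - a[0]).total_seconds()
            for a, b in zip(human_tweets, human_tweets[1:])]    # seconds between posts
    # First-order Markov chain over words, to echo human phrasing patterns.
    chain = defaultdict(list)
    for _, text in human_tweets:
        words = text.split()
        for w, nxt in zip(words, words[1:]):
            chain[w].append(nxt)
    return {"hours": hours, "gaps": gaps, "chain": chain}


def sample_next_post(patterns, seed_word, length=12):
    """Sample a posting delay and a short text that mimics the training set."""
    delay = timedelta(seconds=random.choice(patterns["gaps"]))
    words, w = [seed_word], seed_word
    for _ in range(length - 1):
        followers = patterns["chain"].get(w)
        if not followers:
            break
        w = random.choice(followers)
        words.append(w)
    return delay, " ".join(words)


if __name__ == "__main__":
    # Toy training data standing in for tweets scraped from real human accounts.
    base = datetime(2017, 1, 1, 9, 0)
    corpus = [
        (base, "reading the news this morning and it is grim"),
        (base + timedelta(hours=3), "the news today is somehow worse"),
        (base + timedelta(hours=9), "going for a walk to clear my head"),
    ]
    patterns = learn_patterns(corpus)
    delay, text = sample_next_post(patterns, "the")
    print(f"post in {delay}: {text}")
```

A real effort would obviously need far richer features (sentiment, hashtag habits, reply behavior) and far more data, but the point stands: anything a detector measures about human accounts can, in principle, be measured and imitated by the bot it is trying to catch.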