Discuss this story

AUROC of 0.983 and average precision of 0.913: that sounds pretty good to me! Looking forward to seeing the results of this once it's applied in a more practical setting. Sdkb (talk) 11:57, 1 June 2020 (UTC)

Looking at it, I suspect it might help to filter out company Paidcoi SPAs, which is certainly worthwhile, but the "professionals" can tweak their behaviour fairly easily to heavily minimise their appearance without that much more work (e.g. in edits used to gain AC). Nosebagbear (talk) 12:45, 1 June 2020 (UTC)
Goodhart's law applies. Nemo 14:16, 1 June 2020 (UTC)
I've been working in this arena for a while, and in fact have a credit in the paper for contributing labeled data that was used to train the model. We aren't sure how sophisticated some of these operations are, but my feeling is there's a distinct break between the activities of the outfits catering to well-funded Global North entities (in particular corporations and their executives, entertainers/entertainment companies, and politicians and political groups) – probably what you mean by the "professionals" – and the rest. I wouldn't be surprised if the former are highly aware of the investigative techniques used on-Wiki, and adapt to whatever metrics and techniques we apply, but the latter are unable to, at least quickly. But the greatest volume of stuff that has to be dealt with is due to the less sophisticated group, and it would still be useful to have tools that winnow that away so human effort can be focused on the remainder. ☆ Bri (talk) 16:30, 1 June 2020 (UTC)
Yes, or in other words it's easy to focus on the least consequential cases, while large-scale manipulations by well-funded enemies of the neutral point of view will be left untouched. Nemo 18:00, 1 June 2020 (UTC)
That's kind of the opposite of what I said. Enhanced tools can help identify the least consequential cases; and dogged and talented experts can detect large-scale manipulations by well-funded enemies of the neutral point of view. You should drop in at WP:COIN and see how it works. ☆ Bri (talk) 02:47, 2 June 2020 (UTC)
@Sdkb: I put together the dataset that the authors used and generated a lot of the features. When I last checked on unseen articles, it was classifying 50% as UPE... so clearly not much help in practice. Admittedly I wasn't aware they'd published this and they might have improved, but the metrics they were getting back then were pretty similar. I still think this is possible, but it requires a lot more work to generate the training data. SmartSE (talk) 17:16, 3 June 2020 (UTC)
  • Guy, some of your edits might score high on that axis, but I don't think you would be selected by the algorithm, due to the additional features outlined in 4.2 User-based features. Both of these would score low: the average time between two consecutive edits made by the same user in the same article, and the percentage of edits made by a user that are less than 10 bytes. It's unclear whether the latter is scored across the suspicious article or across the account's lifetime, but either way. ☆ Bri (talk) 19:03, 1 June 2020 (UTC)
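
For anyone wanting to see what the two user-based features mentioned above might look like in practice, here is a minimal sketch. It is not the paper's implementation: the record layout, the `EDITS` sample data and both function names are hypothetical, and it assumes the small-edit percentage is taken over all of the account's edits, which is exactly the point left unclear above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical edit records: (user, article, ISO timestamp, bytes changed).
EDITS = [
    ("ExampleUser", "Acme Corp", "2020-05-01T10:00:00", 4),
    ("ExampleUser", "Acme Corp", "2020-05-01T10:02:00", 6),
    ("ExampleUser", "Acme Corp", "2020-05-01T10:06:00", 2500),
    ("ExampleUser", "Other Page", "2020-05-02T09:00:00", 3),
]


def mean_gap_seconds(edits, user):
    """Average time between consecutive edits by `user` within the same article."""
    by_article = defaultdict(list)
    for edit_user, article, timestamp, _size in edits:
        if edit_user == user:
            by_article[article].append(datetime.fromisoformat(timestamp))

    gaps = []
    for stamps in by_article.values():
        stamps.sort()
        # Pair each edit with the next one in the same article.
        gaps.extend((b - a).total_seconds() for a, b in zip(stamps, stamps[1:]))

    # None when the user never edited the same article twice.
    return sum(gaps) / len(gaps) if gaps else None


def small_edit_percentage(edits, user, threshold=10):
    """Percentage of the user's edits that changed fewer than `threshold` bytes.

    Assumption: computed over every edit by the account, not per article.
    """
    sizes = [abs(size) for edit_user, _article, _ts, size in edits if edit_user == user]
    if not sizes:
        return None
    return 100.0 * sum(size < threshold for size in sizes) / len(sizes)


if __name__ == "__main__":
    print(mean_gap_seconds(EDITS, "ExampleUser"))
    print(small_edit_percentage(EDITS, "ExampleUser"))
```

Restricting the second function to a single article would only need an extra filter on the article name, which is one way to compare the two readings of the feature.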