The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was  Approved.

Operator: Intelligentsium (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 12:20, Friday, June 17, 2016 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: Available

Function overview: To aid in new WP:DYK nominations by checking for basic criteria such as sufficient length, newness, and citations.

Links to relevant discussions (where appropriate): Wikipedia talk:Did you know#RFC: A bot to review objective criteria

Edit period(s): Fixed intervals (~once per hour)

Estimated number of pages affected: Subpages of Template:Did you know nominations and author talk pages

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: The DYK nominations page is perennially backlogged. Nominations typically take several days to a week to be reviewed. This bot will ease the backlog by checking basic objective criteria immediately after nomination, so the author is made aware of any issues right away.

Specific criteria which will be checked include:

  • Sufficient length (prose character count)
  • Newness (recently created or expanded)
  • Presence of citations
  • Outstanding maintenance templates
  • Likelihood of copyright violation (via Earwig's tool)

If there are issues, the bot will leave a note on the nomination page and on the nominator's talk page.

This bot is intended to supplement, not substitute for human review.
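
For illustration only, here is a minimal, self-contained sketch of the kind of objective checks described above (the data structure and function name are invented for this example; the 1500-character and 7-day figures are the usual DYK thresholds):

<syntaxhighlight lang="python">
def review(nomination):
    """Apply basic objective DYK checks to one nomination (illustrative only)."""
    issues = []
    if nomination["prose_chars"] < 1500:
        issues.append("Article may be too short (under 1500 prose characters).")
    if nomination["age_days"] > 7:
        issues.append("Article may not be new or recently expanded (over 7 days).")
    for para in nomination["uncited_paragraphs"]:
        issues.append("Paragraph '%s' appears to lack a citation." % para)
    return issues

# Hypothetical usage with made-up data:
print(review({"prose_chars": 1200, "age_days": 3, "uncited_paragraphs": ["History"]}))
</syntaxhighlight>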

Discussion

FYI I'm going to run a short test in my userspace to ensure the code to save pages is working correctly. Intelligentsium 17:45, 17 June 2016 (UTC)[reply]

Here is an example run, which you can see below. Any feedback is welcome.
The source code is also posted here. I'm not a professional programmer and much of this was written yesterday so please excuse any sloppiness. Intelligentsium 19:43, 17 June 2016 (UTC)[reply]
For full disclosure, there are a few known issues:
  • Unable to handle multi-article nominations. I'm not sure how best to implement that, as sometimes single articles have commas, sometimes multinoms are made under only one article, and sometimes the link is a redirect.
  • Maintenance template grepping is a hack because I was lazy - it looks for dated templates, as content templates usually are not dated (this does introduce false positives, for example {{use mdy dates}}; see the sketch below).
  • The char count is not exactly the same as Shubinator's tool, because his tool parses the HTML while mine uses wikitext. Let me know if there is a significant (>5%) discrepancy.
  • Sometimes the paragraph division is off, possibly because a single return in the editor doesn't break the paragraph in display.
  • I mostly ignore exceptions, since there are many, many ways a nomination can be malformed.
Intelligentsium 19:57, 17 June 2016 (UTC)[reply]
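
For illustration, the dated-template grep described in the second bullet might look roughly like this (the regex and whitelist here are assumptions for the sketch, not the bot's actual code):

<syntaxhighlight lang="python">
import re

# Maintenance templates usually carry a |date=Month Year parameter, so a
# crude approximation is to grep for any template with a dated parameter.
DATED_TEMPLATE = re.compile(
    r"\{\{\s*([^{}|]+?)\s*\|[^{}]*\bdate\s*=\s*\w+\s+\d{4}", re.IGNORECASE
)
# Non-maintenance templates that also take dates cause false positives and
# must be whitelisted (assumed list).
FALSE_POSITIVES = {"use mdy dates", "use dmy dates"}

def maintenance_templates(wikitext):
    """Return names of templates that look like dated maintenance tags."""
    names = (m.group(1).strip().lower() for m in DATED_TEMPLATE.finditer(wikitext))
    return [n for n in names if n not in FALSE_POSITIVES]

print(maintenance_templates("{{Unreferenced|date=June 2016}} {{Use mdy dates|date=June 2016}}"))
# ['unreferenced']
</syntaxhighlight>
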
You wrote in the discussion that reviewers need to manually use Shubinator's tool and Earwig's tool to perform these standard checks. These issues could be pointed out easily by a bot for nominators to work on, rather than having to wait several days/weeks until a human reviewer gets around to raising them. What if pasting the output of Shubinator's tool and Earwig's tool was made standard in DYK submissions? Not to say that I have any issues—I fully support this bot—I'm just a bit surprised that you actually went to the trouble of this BRFA before what I saw as the most obvious solution.
I also recommend mwparserfromhell to parse wikitext instead of those nasty regular expressions. You may also find that using ceterach on Python 3 makes handling unicode much smoother. Σσς(Sigma) 03:39, 18 June 2016 (UTC)[reply]
Seconded that mwparserfromhell is a wonderful library to use. I, too, once used regex to parse wikitext, but one of the many problems with doing so is that the expressions constantly have to be updated as editors find new and exciting ways to write malformed wikitext. Regex-based wikitext parsing is really technical debt, and once you switch over, it'll be so much easier. Enterprisey (talk!) (formerly APerson) 03:44, 18 June 2016 (UTC)[reply]
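
For comparison, a minimal sketch of what the suggested mwparserfromhell approach might look like (the function names are assumptions; only the library calls are real):

<syntaxhighlight lang="python">
import mwparserfromhell

def dated_templates(wikitext):
    """The same dated-template check, letting the parser do the work."""
    code = mwparserfromhell.parse(wikitext)
    return [str(t.name).strip() for t in code.filter_templates() if t.has("date")]

def prose_chars(wikitext):
    """Rough prose character count: strip templates, links, and markup first."""
    return len(mwparserfromhell.parse(wikitext).strip_code())
</syntaxhighlight>
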
Thanks, I'll look into mwparserfromhell. @Sigma: I'm not sure I understand your comment. Using Shubinator's and Earwig's tools is standard review practice, but because there are hundreds of submissions and many of the users who participate at DYK are new users, the reviewer ends up having to perform the checks. Intelligentsium 04:04, 18 June 2016 (UTC)[reply]
What I meant was, what if using Shubinator's and Earwig's tools, or gathering equivalent data through some other means, was required to submit a DYK?
"many of the users who participate at DYK are new users" - I was not aware of this. Thank you for your response. Σσς(Sigma) 04:20, 18 June 2016 (UTC)[reply]
Here are some updated results

Intelligentsium 00:59, 19 June 2016 (UTC)[reply]

Thanks, I'm not that familiar with the DYK mechanics - that doesn't seem like the most appropriate use of the Template namespace - but that is not being introduced by your bot, so it is outside the scope of this review. — xaosflux Talk 20:46, 23 June 2016 (UTC)[reply]
Approved for trial (50 edits or 15 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — xaosflux Talk 20:46, 23 June 2016 (UTC)[reply]
Intelligentsium: Don't know if this issue was covered at DYK or not, but here goes. Is the bot configured to determine whether each and every hook on a nomination is sourced? It would help a lot. — Maile (talk) 21:21, 23 June 2016 (UTC)[reply]
That's almost certainly not possible unless the hook was taken word-for-word from the article. ~ RobTalk 21:27, 23 June 2016 (UTC)[reply]
Thanks for the quick answer. Nevertheless, this bot is going to be a good addition to DYK. — Maile (talk) 21:48, 23 June 2016 (UTC)[reply]
Thanks Xaosflux. I've wondered myself why we use Template: pages for nominations rather than Wikipedia: subpages. It's probably an artefact of never moving away from the talk page of Template:Did you know for nominations, unlike ITN or TFA which have their own Wikipedia: pages.
@Maile66: Unfortunately, Rob is correct; that would be exponentially more difficult than anything the bot currently does. I don't know if you follow xkcd but this xkcd comes to mind... Intelligentsium 22:41, 23 June 2016 (UTC)[reply]
Funny stuff! — Maile (talk) 22:48, 23 June 2016 (UTC)[reply]
Approved for extended trial (250 edits or 30 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. This trial is approaching the original edit limit - there has been a tremendous amount of community discussion below and concerns are being continually addressed - extending the trial period to allow this to continue. — xaosflux Talk 15:09, 30 June 2016 (UTC)[reply]

Trial comments

Agree that we should eschew overall icons. (The tiny check and X are okay.) The character count difference in Ellen F. Golden was 13: 4003 for the bot, 3990 for DYKcheck. I was wondering, though, why the bot says "No issues found" when there was a potential issue (see the small red X) with copyvio. I also think it might make sense to add an extra line to the bottom of the review, after what's there, that starts with the "review again" icon and would perhaps say something like "Full human review needed", perhaps also in bold. Otherwise, I think people may believe that a regular review has begun, and go on to another nomination. BlueMoonset (talk) 00:38, 24 June 2016 (UTC)[reply]
Excellent work on the bot so far, and agreeing with comments above. It did occur to me when I read through the ones checked by the bot, that visually speaking, potential reviewers might think, "Oh...that one's already been done...I don't need to bother with it." So, yeah, maybe something eye-catching to let the reviewer know a human is still needed. — Maile (talk) 00:55, 24 June 2016 (UTC)[reply]
  • Thanks for the suggestion. @Yoninah: I wanted an icon which would complement the one used for reviews with issues. I deliberately chose an icon that is not one of those we usually use to indicate that the nomination is not yet approved, but I see how this could be confusing to new reviewers. @BlueMoonset: I'm still debating the best way to handle possible copyvio. The bot just sees the percentage and compares it to a threshold value of 20% (which I can change if people would like me to do so). The articles I've seen with close paraphrasing are usually at least 15-25%, which is why the threshold is set where it is. However, this also catches some articles which use titles and quotes extensively, and because the metric has low confidence, I don't know if this should be flagged as an "issue" per se, or rather treated as something a human should look further into (which they should always do anyway, as the note says). If this is flagged as an issue then the nominator will automatically be informed, for what could just be a waste of time. @Maile66: (also relevant to Yoninah and BlueMoonset's comments) This may also be a matter of people getting used to the bot; some other areas of the wiki have bot-endorsed/bot-issues-found icons which are distinct from the regular icons, and once people are aware of the bot and understand what the bot icons mean, there should be less confusion. Intelligentsium 03:22, 25 June 2016 (UTC)[reply]
  • That symbol is used for Good Articles, and its use here is confusing. I'd suggest using the red "re-review" roundel, as the intention is for a human to confirm the bot-generated review. If we really want a new symbol, perhaps the blue plus roundel could be used. Antony–22 (talk · contribs) 03:48, 25 June 2016 (UTC)[reply]
  • The red re-review icon is a great idea. Yoninah (talk) 19:03, 25 June 2016 (UTC)[reply]
Hi, sorry I was unexpectedly called out for the whole day today so I haven't had a chance to respond to these comments til now. I will respond to each comment individually above. Intelligentsium 02:24, 25 June 2016 (UTC)[reply]

Bot tags formal names

Bot just reviewed my DYK nomination…said risk was ~25%. However, virtually everything it tagged as close paraphrasing was a formal name combined with simple grammar words (e.g. “in the National Register of Historic Places”, “with the Brooks-Scanlon Lumber Company”, “for the Pilot Butte Development Company”, “and the Central Oregon Irrigation Company”, “mayor of the City of Bend, Oregon”, etc). “National Register of Historic Places” and other formal names can’t be avoided, yet the bot tagged them multiple times, causing a high risk score. Is there any way you can modify the bot to avoid tagging formal names in the review process?--Orygun (talk) 21:09, 10 July 2016 (UTC)[reply]

Do you mean for the copyvios tool? 25% is quite low for that tool, and even a 100% rating doesn't mean there's a copyright violation because it could have just caught a properly attributed quote. Editor review is required to determine if a copyright violation or close paraphrasing has taken place. The copyvios tool is only a shortcut for checking for that. ~ Rob13Talk 21:18, 10 July 2016 (UTC)[reply]
  • Yes, comment is related to copyright tool. At 25% the tool marks the copyright section with a red X (vice a green checkmark), so the human reviewer is given the impression that there is a copyright problem. In the case of my article, I think a human reviewer would quickly see that there wasn't a copyright violation, but if formal names hadn't been tagged, the risk percent would have been in the low single digits and could have been marked with a green check.--Orygun (talk) 21:38, 10 July 2016 (UTC)[reply]
  • Orygun, the bot is not doing the check, it is just reporting the output from Earwig's tool, which every reviewer should check. It is not uncommon for the tool to flag a possible copyvio issue which a reviewer can see is not an issue (e.g. I recently saw a 98% case where the text had been copied from WP without proper attribution). Close paraphrasing and copyvios get missed at DYK too often, so the bot reminding everyone to check is a good thing. I doubt anyone who knows what they are doing will see a high percentage as a mark against you without investigating, because the tool finds similarities which might be problematic and flags them for attention; it doesn't conclude whether or not a problem actually exists. EdChem (talk) 02:11, 11 July 2016 (UTC)[reply]
  • Hi, Orygun, EdChem and BU Rob13 are correct: the copyright checker is Earwig's tool, and there is a note that titles and cited quotes may trigger a false positive. However, only a human check can verify whether a copyright violation exists; the bot merely alerts the human reviewer to pay more attention when Earwig's tool reports a violation greater than 20%. It would be possible to raise this threshold if there is consensus to do so, but in my (manual) reviews I have found that >20% is almost always a reason for taking a closer look at the very least, and violations can exist even below that. Intelligentsium 23:32, 13 July 2016 (UTC)[reply]
  • What is the language used for >20%? There might be a case for using softer language for 20–50% (possible close paraphrasing, for instance) and stronger language for >50% (possible copyright violation, for instance). ~ Rob13Talk 23:38, 13 July 2016 (UTC)[reply]
  • I don't think there's necessarily a greater possibility of copyvio for >50% than >20% (usually >50% just means there's a mirror somewhere); close paraphrasing generally falls on the lower end, in the form of snippets and phrases rather than entire sentences or paragraphs. I have changed it so the notice will now be a purple question mark (?) and the bot will not automatically notify the nominator, to avoid spamming; the human reviewer will have to review the comparison and confirm whether a violation indeed exists (see the sketch below). Intelligentsium 14:20, 14 July 2016 (UTC)[reply]
  • I've found copying when the Earwig-reported number was less than 10%; copyvio/plagiarism/close paraphrasing is something that should be always checked by a human reviewer. I would disagree with any request to set the number higher than 20%, and think the idea of a purple question mark is a good one. Given the number of false positives generated by Earwig, it's probably a good idea not to notify the user if Earwig numbers are the only issues found. BlueMoonset (talk) 15:14, 14 July 2016 (UTC)[reply]
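
A minimal sketch of the threshold handling described in this thread (the names and return structure are invented for illustration; the 20% figure and the no-notification behaviour are taken from the comments above):

<syntaxhighlight lang="python">
def copyvio_notice(confidence_pct):
    """Map an Earwig-style percentage to an icon and notification behaviour."""
    if confidence_pct > 20:
        return {
            "icon": "purple question mark",
            "notify_nominator": False,  # avoid spamming over false positives
            "note": "Possible copyvio or close paraphrasing; please check manually.",
        }
    return {
        "icon": "green check",
        "notify_nominator": False,
        "note": "Below threshold, but a manual check is still required.",
    }
</syntaxhighlight>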

DYK bot

At Template:Did you know nominations/Samiun dan Dasima, the new review bot tagged the article as lacking a citation for the plot section. As the film is still extant, and the plot is implicitly cited to the film (and no citation is required, per WP:DYKSG #D2), can we please add an exception to the bot's code so that sections titled Plot or Summary aren't checked? If we have a swath of film articles nominated, not having an exception coded might lead to more work for reviewers (or mislead new reviewers into thinking plot summaries need a citation). — Chris Woodrich (talk) 02:08, 10 July 2016 (UTC)[reply]

@Intelligentsium: copying this over from WT:DYK. Hope you see this here, Chris Woodrich — Maile (talk) 21:22, 10 July 2016 (UTC)[reply]
I saw this, thanks (though oddly I didn't get this ping...). I've had to do a bit of unexpected travelling over the past few days, nothing too major but I might not be able to respond in-depth until this weekend. However I will look into this issue. Intelligentsium 23:26, 13 July 2016 (UTC)[reply]
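
For illustration, an exemption for Plot/Summary sections might be implemented along these lines (a sketch using mwparserfromhell; the exempt-title list and function name are assumptions):

<syntaxhighlight lang="python">
import mwparserfromhell

# Per WP:DYKSG #D2, plot summaries are implicitly cited to the work itself,
# so they are exempt from the per-section citation check (assumed list).
EXEMPT_SECTIONS = {"plot", "summary"}

def sections_to_check(wikitext):
    """Yield only the sections that the citation check should examine."""
    code = mwparserfromhell.parse(wikitext)
    for section in code.get_sections(levels=[2], include_headings=True):
        headings = section.filter_headings()
        title = headings[0].title.strip_code().strip().lower() if headings else ""
        if title not in EXEMPT_SECTIONS:
            yield section
</syntaxhighlight>
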
Hi again. I haven't read through this whole page, but I'm wondering if someone mentioned the length of the text that the DYK review bot is adding to each nomination. It takes me much longer now to scroll down T:TDYK to find suitable hooks to promote to prep. I'm wondering if the bot's review could be placed in a collapsed box so prep promoters can easily scroll through and select suitable hooks? Thanks, Yoninah (talk) 15:29, 15 July 2016 (UTC)[reply]
Intelligentsium, any feedback on this? Another option may be to wrap the review in <noinclude> tags. — xaosflux Talk 04:52, 25 July 2016 (UTC)[reply]
I agree that this would be helpful in terms of length and appearance—the nomination templates get very long, and some reviewers shy away from the ones that look busy. I'm not sure whether it would be better to have a line indicating that the automated review has been done but is collapsed, or just noinclude the whole section. BlueMoonset (talk) 05:28, 25 July 2016 (UTC)[reply]
Sorry, I missed this point (also note this is related to the discussion on WT:DYK). I'll add <noinclude> tags to prevent bot reviews from cluttering the nominations page or impacting load time for slow connections. However, I'm a bit wary of collapsing the comments, especially if there are issues that need to be addressed as I'd like those to be immediately visible. Intelligentsium 23:03, 25 July 2016 (UTC)[reply]
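
A sketch of the noinclude wrapping (function name assumed): the review stays visible on the nomination subpage but is not transcluded onto T:TDYK with the rest of the page.

<syntaxhighlight lang="python">
def wrap_for_posting(review_text):
    """Wrap the bot's review so it does not transclude onto the main page."""
    return "<noinclude>" + review_text + "</noinclude>"
</syntaxhighlight>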

Copyvio language misleading

Hi folks, EEng just brought this to my attention on my talk page. It's misleading to call what the copyvio tool returns a probability of violation. It's really a measurement of how much text in the article is in common with the suspected source, but fuzzed a bit. There's a big difference between saying "the probability of a violation is 15%" and "~15% of the article was found elsewhere on the internet". Now, here's my suggestion. Don't try to interpret the significance of the percentage yourself; the tool tells you how to interpret it. If the tool's API indicates that no violation is present (where resp["best"]["violation"] == "none" in the returned JSON), then the bot should say "No copyright violation suspected. (review)", with a green checkmark, and you can eschew the note that follows. Otherwise resp["best"]["violation"] contains a descriptive string, either "possible" or "suspected". If the former, say "A copyright violation may be possible, according to an automated tool with X% confidence. (confirm)"; otherwise, "A copyright violation is suspected by an automated tool, with X% confidence. (confirm)" with the ? and the clarifying note that reads "Please manually verify that there is no copyright infringement...". This should reduce confusion unless a real match is found. What do you think? — Earwig talk 00:23, 18 July 2016 (UTC)[reply]
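
For illustration, the branching Earwig describes might look like this (the resp["best"] fields are from the comment above; the endpoint and query parameters are assumptions):

<syntaxhighlight lang="python">
import requests

API = "https://tools.wmflabs.org/copyvios/api.json"  # assumed endpoint

def copyvio_line(title):
    """Report the tool's own verdict rather than interpreting the percentage."""
    resp = requests.get(API, params={
        "version": 1, "action": "search",
        "project": "wikipedia", "lang": "en", "title": title,
    }, timeout=60).json()
    best = resp["best"]
    pct = round(best["confidence"] * 100, 1)
    if best["violation"] == "none":
        return "No copyright violation suspected. (review)"
    if best["violation"] == "possible":
        return ("A copyright violation may be possible, according to an "
                "automated tool with %s%% confidence. (confirm)" % pct)
    return ("A copyright violation is suspected by an automated tool, "
            "with %s%% confidence. (confirm)" % pct)
</syntaxhighlight>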

I like this idea. Even though I'm familiar with the tool and what the percentages mean, it's clear from this thread that the existing language is confusing and this could help. ~ Rob13Talk 00:25, 18 July 2016 (UTC)[reply]
The Earwig, at what level will the tool's API tell you that the chance of a copyvio is "none"? Do you have a set percentage within the tool? Also, some sources, such as a book at Google books, may appear to have been checked, but what's actually checked is the metadata page, not the actual contents (or specific page) of the book that has been cited. BlueMoonset (talk) 03:38, 19 July 2016 (UTC)[reply]
It's 40% at the moment. Also, I can only check what's in the HTML or PDF at the URL returned by the search engine. Google Books is not friendly to scrapers. — Earwig talk 03:40, 19 July 2016 (UTC)[reply]
Done. Thanks for the suggestion! Intelligentsium 20:14, 21 July 2016 (UTC)[reply]
Given that I've found copyvio down as low as 9.8%, saying that there's "no copyright violation suspected" at 40% or lower seems very misleading to me, and indeed could give the human reviewer a false sense that there's no need to check further. We've had reviewers in the past citing the Earwig percentage as sufficient evidence of a lack of copyvio/close paraphrasing/plagiarism. It's not, of course, but we have to be very careful in what is said here. Further, not all sources are (or can be) checked; sometimes a slow response from a website will leave its pages unchecked by the bot, when a human would check them and possibly find duplicated material. BlueMoonset (talk) 05:21, 25 July 2016 (UTC)[reply]
I agree; I've found copyvio/close paraphrasing at 10-15% as well. I've modified the language to state "A copyright violation is unlikely according to automated metrics" together with the reminder to check manually. The percentage will still be reported regardless. Intelligentsium 00:20, 29 July 2016 (UTC)[reply]

Moving towards stable operations

@Intelligentsium:, just checking in, the discussion and responses above have been great! Once live, where would you want editor feedback to go (e.g. your talk, the bot's talk, some other page)? Are there any outstanding technical or operational issues (not including enhancement requests)? — xaosflux Talk 00:46, 17 July 2016 (UTC)[reply]

{{OperatorAssistanceNeeded}} — xaosflux Talk 18:34, 21 July 2016 (UTC)[reply]
Oops, sorry for the belated response! They can go on my talk page; I'll add a note to my userpage to this effect. Intelligentsium 20:11, 21 July 2016 (UTC)[reply]

Trial complete. I'm at 249 now. There are no glaring issues remaining. In the most recent run I noted an anomaly relating to an unusually large nomination. The bot currently does not handle the case where a reviewer reviews a multi-article hook with N articles, then proceeds to claim QPQ credits for N of their own articles (which the reviewer is entitled to do). I will look into implementing a check for this as an additional feature, but this should not be a common occurrence. Intelligentsium 23:58, 24 July 2016 (UTC)[reply]

Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. I think we need to be moving this to approved - there are a couple of questions higher up, but I don't want you to have to shut down, so here are another 100 edits while this is in closing. — xaosflux Talk 13:52, 25 July 2016 (UTC)[reply]
 Approved. This bot is basically live already, no need to keep this going here anymore. Intelligentsium has been one of the most responsive bot operators we have had in a long time during a trial, and I am confident they will continue to be responsive to any suggestions brought up for this bot in the future. — xaosflux Talk 03:09, 30 July 2016 (UTC)[reply]
The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.