The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Operator: MusikAnimal

Time filed: 02:49, Wednesday, December 9, 2015 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Ruby

Source code available: GitHub

Function overview: Monitors Category:Wikipedia pages with incorrect protection templates and repairs protection templates on those pages, or removes the templates, as necessary

Links to relevant discussions (where appropriate): [1] (permalink), [2]

Edit period(s): Continuous

Estimated number of pages affected: Probably between 15 and 30 a day

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: Sorry, I realize this makes my third open BRFA. There aren't any immediate ones coming after this, promise :)

A general overview of how to fix protection templates can be found under "Remedies" at Category:Wikipedia pages with incorrect protection templates.

Old approach. See #Reworking below

The bot is capable of automatically generating all the correct protection templates from scratch. That is, it checks the current protection info and generates the generic templates based on that. This is pretty cool as it will definitively remove the page from the category, but it may not always be what we want. For instance, in someone's userspace they may want ((pp-protected)) but not ((pp-move)), even if the page is move-protected. So the logic is as follows:

  • Removes any protection templates representing a protection that isn't present. So if it's semi'd but not move-protected, ((pp-move)) will be removed but the semi left as-is, or repaired if necessary
  • Removes any protection templates on template pages that have ((documentation)) (or any of its redirects) or ((collapsible option)) (or any of its redirects). Those templates automatically add the padlock icon. If they are not present, the bot will wrap the appropriate protection templates with <noinclude>...</noinclude>, or place them inside any existing noinclude.
  • If it is protected, it will try to repair only what's already on the page. So if ((pp-blp)) is there but incorrectly used, but the page is also move-protected, the bot will only repair ((pp-blp)) and not add ((pp-move)) if it didn't already exist
  • If there is a protection template such as ((pp)) but its usage is completely wrong or lacking any indication of what it's supposed to be, the bot will assume the user just didn't know what they were doing and will auto-generate all templates relevant to the current protection info
  • Protection templates always get moved to the top of the page, even if they were originally at the bottom. I assume this is fine.
  • The bot is aware of all protection templates, even the old ones that redirect to ((pp)), etc, and if a repair is done the bot will use the target template instead of the redirect. It does this by fetching all the redirects to the templates listed at ((protection templates)) and generating a map of all the templates and their target templates (e.g. ((sprotected)) is actually ((pp))), along with the type of protection they represent (e.g. ((pp-blp)) is for edit, ((pp-move)) is for move). This information is cached for a week, as the redirects are unlikely to change that often.
  • You'll notice at Category:Wikipedia pages with incorrect protection templates one of the last-ditch efforts to fix pages is to make a null edit. The bot does not attempt this. I'll have to revisit it as I don't have a good solution, and also don't have a reliable way of reproducing this scenario.
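The redirect lookup and week-long cache described in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the method names, the shape of the `targets` hash, and the in-memory `cache` hash are hypothetical and are not taken from MusikBot's source. The caller supplies the freshly fetched data through a block, so no API call is shown.

```ruby
# Sketch only: names and data shapes are hypothetical, not MusikBot's code.

# Flattens { target => { type:, redirects: [...] } } into a lookup keyed by
# every known template name (target or redirect), lowercased.
def build_template_map(targets)
  map = {}
  targets.each do |target, info|
    ([target] + info[:redirects]).each do |name|
      map[name.downcase] = { target: target, type: info[:type] }
    end
  end
  map
end

# Returns the cached map, rebuilding it from the block's result when the
# cache is empty or older than ttl seconds (one week by default).
def cached_template_map(cache, now: Time.now, ttl: 7 * 24 * 60 * 60)
  if cache[:map].nil? || now - cache[:built_at] > ttl
    cache[:map] = build_template_map(yield)
    cache[:built_at] = now
  end
  cache[:map]
end
```

With a map like this, a redirect such as `sprotected` resolves to its target `pp`, and `pp-move` resolves to itself with the protection type `:move`.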


Some examples of what the bot would do (assume the repaired expiries and protection levels are correct):

  • ((pp|semi)) → ((pp|semi|action=edit|expiry=00:00 1 January 1970))
  • ((pp-blp|expiry=5 January 1969)) → ((pp-blp|expiry=00:00 1 January 1970))
  • ((sprotected)) → ((pp|semi|action=edit|expiry=00:00 1 January 1970))
  • ((Pp-semi-BLP)) → ((pp-blp|expiry=00:00 1 January 1970))
  • ((pp-template|expiry=00:00 January 1970)) → <noinclude>((pp-template|expiry=00:00 1 January 1970))</noinclude>
  • ((pp|action=autoreview)) → ((pp-pc1|expiry=00:00 1 January 1970))
  • on template page: ((pp-semi)) → <noinclude>((pp-template|expiry=00:00 1 January 1970))((other stuff that's already in an existing noinclude))</noinclude>
  • when edit/move protected: ((pp|1=invalid_reason)) → ((pp|semi|action=edit|expiry=00:00 1 January 1970))((pp-move|expiry=00:00 January 1970))

So on and so forth. Given this task covers all ground for correcting protection templates, it incidentally also partially does the job of Cyberbot II's task #1 (removing pp-pc1 if it's no longer protected). Pinging Cyberpower678 for input. I don't think this is a problem; MusikBot and Cyberbot both handle conflicts, so it should be fine. I see that Kingpin13 made this same argument at Cyberbot's BRFA.

The last thing I wanted to bring up was correcting protection templates on fully protected pages. That requires the admin bit, as you know. My last adminbot experience was not so great, but I feel like here the situation is different, as we're only making minor edits. I do not have a strong desire to cover the fully protected pages; just know that I'm confident it can do it reliably. I'm also open to creating a new bot account dedicated to this task, if that means anything. MusikAnimal talk 02:49, 9 December 2015 (UTC)

Discussion

Definitely going to suggest a unique bot account this time if we decide to go down the adminbot route, since it's continuous. I am unclear about the exact role the bot would fill; there's a redundancy aspect (with DumbBOT) for sure, but how much work is not already covered? More specific comments incoming. — Earwig talk 03:38, 9 December 2015 (UTC)[reply]

Pinging MarnetteD and Redrose64. They seem to stay busy correcting protection templates: Wrong expiry, date format, wrapping templates in noinclude on template pages, etc, but most commonly just removing them from unprotected pages, it seems. The latter is all DumbBOT does as I understand. I do not know why it misses so many pages. MusikAnimal talk 03:54, 9 December 2015 (UTC)

Reworking

I've reworked the bot quite a bit, trimming it down more to the basics, with some additional configurable features and more informative edit summaries. It will only remove templates when a protection has expired, wrap them in <noinclude>...</noinclude> within the template space, or auto-generate protection tags if there is invalid use of ((pp)). I also would like to retire the idea of an admin bot for the time being. I'd rather see it perform well on semi'd/unprotected templates for a good while before allowing it to edit highly visible pages. Finally, the bot is highly configurable. See User:MusikBot/FixPP for the full documentation.

So, a rundown of the logic again; relevant config options are in parentheses:

  1. Loops through all the pages in Category:Wikipedia pages with incorrect protection templates.
  2. The bot does not attempt to parse any page that uses ((PROTECTIONLEVEL)), suggesting the usage of protection templates is automated via parser functions.
  3. First, check if there's no protection at all on the page. If so, remove all protection templates. Plain and simple. (remove_all_if_expired)
  4. Remove any protection templates representing a protection that is not currently on the page, leaving other protection templates as-is. (remove_individual_if_expired)
  5. Normalize any instances of ((pp)); for instance convert ((pp|expiry=Jan 1st, 1970|1=blp)) to the more appropriate ((pp-blp|expiry=00:00, 1 January 1970)). This parsing of ((pp)) and its params is necessary in order for the bot to determine whether or not it should be removed. If it turns out it matches the current protection state and should not be removed, we might as well provide an option to save the normalized template, as it may be responsible for why the page is in the maintenance category. (normalize_pp_template)
  6. If while parsing ((pp)) we aren't able to determine what it represents, e.g. ((pp|semi)), the bot can assume the user didn't know how to add the template and auto-generate all the appropriate templates given the current protection state. (auto_generate)
  7. Wrap protection templates with <noinclude>...</noinclude> on pages in the template namespace. (noinclude_in_template_space)
  8. Remove protection templates on pages in the template namespace if the page contains ((collapsible option)) or ((documentation)), or any of their redirects. (remove_from_template_space_if_doc_present)
  9. If the bot wasn't able to do any of the above, it gives up and will let the hard-working humans take over by caching the "touched" timestamp of the page, and not processing the page again unless it has been changed.
  10. The core protection templates are defined in the config. The bot fetches all redirects to those templates in order to know how to map them to the target template and know what they represent. This information is cached for a week.
  11. There is also configuration for the valid values of |small= and the reasons specified at Template:Pp.
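Steps 3 and 4 of the list above (remove everything when the page is unprotected, otherwise remove only templates whose protection type is no longer active) can be sketched as follows. This is a minimal illustration, not the bot's implementation: `templates_to_remove` and its arguments are hypothetical names, `template_types` stands in for the redirect-aware map from step 10, and templates of unknown type are assumed to be left for other steps to handle.

```ruby
# Sketch only: hypothetical helper, not MusikBot's code.
#
# found          - template names present on the page, e.g. ['pp-blp', 'pp-move']
# template_types - map of template name => protection type (:edit, :move, ...)
# active         - protection types currently applied to the page
def templates_to_remove(found, template_types, active)
  # Step 3: no protection at all, so every protection template goes.
  return found.dup if active.empty?

  # Step 4: drop only templates whose known type is not currently active.
  # Templates with an unknown type are left alone here.
  found.select do |name|
    type = template_types[name]
    type && !active.include?(type)
  end
end
```

For a page that is semi-protected but no longer move-protected, calling this with `['pp-blp', 'pp-move']` and `active` set to `[:edit]` would select only `pp-move` for removal.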

Hopefully this new approach addresses some of the aforementioned concerns. I have been testing rigorously in production, manually making the bot-suggested edits using my alternate account, and am fairly confident the bot is stable. During the initial trial any runs will be manually invoked and fully monitored. MusikAnimal talk 08:23, 12 December 2015 (UTC)

Something we've not mentioned - redirects. If a redir is protected, and has any form of prot template, remove it; if there is no ((redr)), add one. --Redrose64 (talk) 00:12, 13 December 2015 (UTC)
I tried this on testwiki and a redirect with ((pp)) did not put it in the maintenance category. Either way I think handling this (adding ((redr))) might fall outside the scope of this bot task. Let's revisit this idea at a later time.
I also am inclined to keep the normalize_pp_template (#5 above) disabled. The issue is we can still end up with a modification to a page that does not necessarily remove it from the maintenance category, when that was what we set out to do. The other subtasks are definitively constructive, so maybe I should stick to those for the time being. Thoughts? MusikAnimal talk 04:57, 14 December 2015 (UTC)
Incidentally, are we sure there's not something wrong with this series of modules/templates when it comes to when they decide to add that category? Several of the members of those categories seem fine and are inexplicably in the category despite all template parameters appearing normal (particularly when the expiry is near but hasn't actually passed). --slakrtalk / 05:05, 19 December 2015 (UTC)
In these cases, is the datestamp complete? That is, does it contain a valid, correct time as well as the date? If it does, is that time expressed in UTC (although not necessarily stating "(UTC)")? This last case might occur if the person setting up the pp template didn't realise that it needs a UTC time, see Wikipedia:Village pump (technical)/Archive 142#Misleading expiry time for protection. --Redrose64 (talk) 08:19, 19 December 2015 (UTC)
All the tests I have done add the date in the format hh:mm, day month year and go by what the API gives us, which is in UTC. I've found pages show up in the maintenance category a few hours or so before they actually expire, supporting the theory of some timezone difference. However, I'm led to believe the templates themselves are possibly not coded correctly, as again putting in a valid UTC time doesn't do the trick. For this reason I'm with Slakr in that maybe adding the time shouldn't be bot-automated, as it is indeed contingent on how the template is coded, which may be subject to change. Moreover, I just don't like the potential of redundant bot edits. E.g. I'm finding similar issues with template pages, see testwiki:Template:Test. The protection templates are wrapped in <noinclude>...</noinclude>, and are valid, yet the page is still in the maintenance category?
By contrast, removing templates for which a corresponding protection doesn't exist is fool-proof, and I'd like to move forward with just that for now. So essentially, this task is just a more comprehensive DumbBOT, as that bot seems to miss many pages. How does this sound, at least for a start? The code is there for the other tasks, so we can consider enabling them later once we see that those remedies definitively work. MusikAnimal talk 18:19, 20 December 2015 (UTC)
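For reference, the "hh:mm, day month year" expiry format mentioned in the comment above could be produced from an API timestamp along these lines. This is a hedged sketch: `format_expiry` is a hypothetical helper, and it assumes the protection expiry arrives as an ISO 8601 string. Indefinite protections (which have no parseable timestamp) are not handled here.

```ruby
require 'time'

# Sketch only: hypothetical helper, not MusikBot's code.
# Converts an ISO 8601 timestamp into "hh:mm, day month year" in UTC.
def format_expiry(api_timestamp)
  Time.iso8601(api_timestamp).utc.strftime('%H:%M, %-d %B %Y')
end
```

The explicit `.utc` call matters when the input carries a non-zero offset, since the template expects a UTC time.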


((BAGAssistanceNeeded)) Where do we stand on this? As stated above, I'd like to move forward with the simplest of the tasks, which is removing protection templates for which a corresponding protection type does not exist. So if you go by the big numbered list above, the only actionables are #3 and #4. There is still a continual flow of pages meeting these criteria that current bots seem to overlook, so I think this by itself is still worthwhile. I'd like to revisit the other features at a later time, once we iron out exactly how they should work. A subsequent BRFA can be filed for those, if we feel that is necessary. MusikAnimal talk 04:09, 30 December 2015 (UTC)

Alternatively I could just restart WP:Bots/Requests for approval/Lowercase sigmabot and we could end this BRFA right now. Σσς(Sigma) 00:15, 3 January 2016 (UTC)
Approved for trial (25 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. @MusikAnimal: that seems reasonable. Might as well see how this goes. That said, @Σ and MusikAnimal:, I'm slightly concerned about any clashes between the bots. If the old bot runs, that'd likely be fine (as long as nothing's substantially changed when it comes to what it used to be expected to do versus what it'd be expected to do now), but then we also need to make sure that the newer bot doesn't create a situation where one does something to a template, the other does something else, then the other one "fixes" that bot's fix, etc... nor vice versa. --slakrtalk / 03:01, 3 January 2016 (UTC)
Also to be clear (though I thought it fairly evident from the adjustment to the scope given in the "reworking" section above), I don't personally feel this task warrants an adminbot. After all, a slightly wrong template on a full-protected page doesn't have to be perfect; MediaWiki makes it very clear should a user try to edit the page that the page is protected, why, and when it expires. :P --slakrtalk / 03:09, 3 January 2016 (UTC)
I'm not actually familiar with Σ's protection template bot. Does it modify the templates, or just remove them when the protection has expired? This task came about because, for whatever reason, many pages go unnoticed by existing bots. At any rate, MusikBot will not reprocess the same page within a 3 hour window. It also won't try anything if it was the last editor to the page. Finally, if it removes (or in the future possibly repairs) a template, it should as intended remove the page from the maintenance category, meaning it wouldn't process it again anyway as that's what the bot goes off of. Edit conflicts are also handled accordingly, so we should be ok with any potential bot wars.
I agree about the admin bot or even template-editor bot idea. Consider it withdrawn. I would like to eventually enable some features like repairing of expiries, etc, once we see it actually removes the page from the maintenance category, as sometimes it does not. There are so many factors involved, so I'm going to continue to work on that and once ready perhaps we can do another trial through this same BRFA, or a new one, whatever is advisable. MusikAnimal talk 05:56, 3 January 2016 (UTC)
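The two reprocessing safeguards described in the comment above (a 3 hour window per page, and never editing a page the bot itself last edited) amount to a simple guard. The sketch below is illustrative only: `skip_page?` and its parameters are hypothetical, and how the bot actually stores its last-processed timestamps is not stated in this discussion.

```ruby
# Sketch only: hypothetical helper, not MusikBot's code.
#
# last_processed_at - Time the bot last handled this page, or nil if never
# last_editor       - username of the most recent editor of the page
def skip_page?(last_processed_at, last_editor, now: Time.now, bot_name: 'MusikBot', window: 3 * 60 * 60)
  # Never follow up on our own edit; this avoids the bot re-fixing itself.
  return true if last_editor == bot_name

  # Otherwise skip only if we handled the page within the window (seconds).
  !last_processed_at.nil? && (now - last_processed_at) < window
end
```

A guard of this kind limits back-and-forth with another bot to at most one edit per page per window.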
Sigma's bot worked quite well, but it always seemed to miss something - some non-protected pages would be left with inapplicable protection templates, and it would be left to somebody passing by (often myself) to clean them up. Hence threads like User talk:Σ/Archive/2014/August#Backlog in removal of prot templates; User talk:Σ/Archive/2013/September#Backlog. It would also occasionally add unnecessary prot templates to pages that already had them. --Redrose64 (talk) 15:49, 3 January 2016 (UTC)


Trial complete. Alright, the base remove-template-if-no-protection trial is complete. The bot stayed pretty busy, and the trial would have been completed much earlier if I hadn't disabled it a few times, as I didn't want to leave it running overnight.

A few diffs: [3][4][5][6]

The diffs you care about:

I guess the solution is remove one newline if another newline exists before/after the protection template, or if it's at the top of the page.

Examples:

I'll need an extended trial to prove I can make this happen, but I assume you can believe it's easily fixable.
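The newline rule proposed above (drop one newline along with the template if the template sits on its own line or at the top of the page) might look roughly like this. It is a sketch under assumptions: `remove_template` is a hypothetical helper that removes the first literal occurrence of the template text, written here in raw wikitext double-brace form, and it does not parse wikitext.

```ruby
# Sketch only: hypothetical helper, not MusikBot's code.
# Removes the first occurrence of template from text. If the template is
# followed by a newline and is either at the top of the page or preceded
# by a newline, that trailing newline is removed too, so no blank line
# is left behind.
def remove_template(text, template)
  idx = text.index(template)
  return text unless idx

  before = text[0...idx]
  after = text[(idx + template.length)..-1]
  if after.start_with?("\n") && (before.empty? || before.end_with?("\n"))
    after = after[1..-1]
  end
  before + after
end
```

A template used inline, with text on both sides, is removed without touching the surrounding whitespace.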

As for Σ's bot, correct me if I'm wrong, but it adds protection templates to protected pages, and removes them when they are no longer protected. MusikBot however only repairs existing protection templates, which may or may not include removing them as they are no longer applicable. That being said the bots should coexist peacefully.

If we are OK with what we have here with this trial, I'd like to now put focus on the task of correcting expiries. A little background:

MusikBot will only update expiries where the protection has been extended, but the expiry in the template has not been updated. This is definitely constructive and won't yield any redundant edits. This is what I'd like to pursue next.
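The condition described above reduces to a single comparison: the template's expiry is rewritten only when the real protection now ends later than the template says. A minimal sketch, with the hypothetical name `expiry_needs_update?` and the assumption that both values are already parsed into `Time` objects (or `nil` when unavailable):

```ruby
# Sketch only: hypothetical helper, not MusikBot's code.
# True only when the protection has been extended past the expiry that the
# template currently states. Missing values are treated as "do nothing".
def expiry_needs_update?(template_expiry, actual_expiry)
  return false if template_expiry.nil? || actual_expiry.nil?

  actual_expiry > template_expiry
end
```

Because an equal or earlier actual expiry returns false, a second run over the same page would not produce a redundant edit.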

Meanwhile if you will allow the bot to continue removing templates for which a protection doesn't exist, that'd be great. Please advise, and many thanks! MusikAnimal talk 20:46, 4 January 2016 (UTC)

"it adds protection templates to protected pages, and removes them when they are no longer protected. MusikBot however only repairs existing protection templates, which may or may not include removing them as they are no longer applicable" That is right. So if lcsb added a protection template to a protected page, MusikBot would not edit it.
"That being said the bots should coexist peacefully." That seems to follow.
Σσς(Sigma) 22:26, 9 January 2016 (UTC)


A user has requested the attention of a member of the Bot Approvals Group. Hoping to move forward with this, at least the basic task of removing unneeded protection templates. I know the other stuff is a little complicated and a lot to read into... I can talk on IRC if that helps! Thanks MusikAnimal talk 00:27, 11 January 2016 (UTC)

Would you be satisfied if this BRFA closed with the verdict that your bot be free to remove, but not add or tweak, protection templates? Σσς(Sigma) 23:11, 12 January 2016 (UTC)
That'd make me feel better about putting all this work into it, yes :) But I have code ready to go to repair expiries, something that is currently tediously being done manually. Many of the other features are more prone to error, I think, and could be explored later in a different BRFA. MusikAnimal talk 23:38, 12 January 2016 (UTC)
As one of those doing the manual work I would like to make a suggestion if the repairing of expiries is not approved. Since I am not an admin I do not know exactly what the steps are in adding protection templates. What I do know is that - if a page that is currently protected has that protection extended - the template rarely gets updated by the admin performing the extension. This is especially true in the case of PC protections. Now, I can't say that it never happens because if an admin does edit the template with the new expiry time I would not see that :-) So my suggestion is that a step be added to the tools used for applying protections that reminds the admin to update any protection templates that are on the article where the protection is being extended. If this is not possible or practical I am happy to continue working with those pages. We wikignomes can turn even the most tedious of tasks into a heigh ho situation. Thanks for your time. MarnetteD|Talk 23:54, 12 January 2016 (UTC)
Thanks MarnetteD :) This is a bot-friendly task, in this case, as MusikBot is already going through the same category. Since it knows all the redirects, it can parse any given protection template and spit it back out with the correct expiry. That being said, I can also work on an update to Twinkle to correct existing protection templates. I will also make sure it adds in the time, which it currently does not. This is what causes so many pages to show up in the maintenance category just before the protection is set to expire (as the absence of a time is treated as midnight). MusikAnimal talk 02:07, 13 January 2016 (UTC)
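The midnight effect mentioned above can be illustrated with a small sketch. It assumes, as the comment states, that a date-only expiry is read as 00:00 UTC on that day; the helper names are hypothetical, and the comparison is only a model of why a page might be flagged early, not a description of how the templates are actually coded.

```ruby
require 'time'

# Sketch only: hypothetical helpers, not MusikBot's or the template's code.

# Parses "hh:mm, day month year" or a bare "day month year" as UTC.
# A bare date carries no time, so it resolves to midnight.
def parse_template_expiry(text)
  format = text.include?(':') ? '%H:%M, %d %B %Y %z' : '%d %B %Y %z'
  Time.strptime("#{text} +0000", format).utc
end

# True when the template's stated expiry falls before the real expiry,
# which is the situation where a page could be flagged hours early.
def flagged_early?(template_text, actual_expiry)
  parse_template_expiry(template_text) < actual_expiry
end
```

With a protection that really ends at 14:30 on 5 January 2016, a template saying only `5 January 2016` would be treated as lapsed from midnight, whereas `14:30, 5 January 2016` would not.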

I am still unsure about a few things...

— Earwig talk 10:14, 16 January 2016 (UTC)[reply]

 Approved. (for removing protection templates only) – Time to put this request out of its misery, really. The underlying task of removing protection templates from pages where they don't apply is straightforward, and I'm satisfied enough with the newline solution that we can go ahead as long as the bot's initial edits are supervised. I do think the usage and syntax of protection templates needs a broader look (and I still don't understand why we can't deprecate |expiry= in favor of ((PROTECTIONEXPIRY))), but that's out of scope here, and we should do that with a clean discussion or proposal. — Earwig talk 05:03, 18 January 2016 (UTC)[reply]

The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.