The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

Operator: Lemmey

Automatic or Manually Assisted: Automatic

Programming Language(s): Python

Function Summary: Restores missing reference names

Edit period(s) (e.g. Continuous, daily, one time run): As needed.

Already has a bot flag (Y/N):

Function Details: The bot processes articles in Category:Pages with incorrect ref formatting. It looks for missing reference names and then looks at the article history to restore those names.
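As a rough illustration of the approach described in the function details, the following is a minimal sketch and not the bot's actual code. It assumes the current wikitext and a newest-first list of older revision texts are already in hand. The names `REF_TAG`, `named_refs`, `find_definition`, and `restore_missing` are hypothetical.

```python
import re

# Assumption: refs look like <ref name="x">...</ref> or <ref name="x"/>.
# Other attributes (e.g. group=), names containing spaces without quotes,
# and upper-case closing tags are not handled in this sketch.
REF_TAG = re.compile(
    r'<ref\s+name\s*=\s*(?:"([^"]+)"|\'([^\']+)\'|([^\s"\'/>]+))\s*(/?)>',
    re.IGNORECASE,
)


def ref_name(match):
    """Return the name captured by whichever quoting style matched."""
    return match.group(1) or match.group(2) or match.group(3)


def named_refs(wikitext):
    """Return (defined, used): names with a body vs. names only referenced."""
    defined, used = set(), set()
    for m in REF_TAG.finditer(wikitext):
        # A trailing "/" means a bare reuse such as <ref name="x"/>.
        (used if m.group(4) else defined).add(ref_name(m))
    return defined, used


def find_definition(name, old_revisions):
    """Search newest-first revisions; return (depth, full ref text) or None."""
    for depth, text in enumerate(old_revisions, start=1):
        for m in REF_TAG.finditer(text):
            if m.group(4) or ref_name(m) != name:
                continue
            end = text.find("</ref>", m.end())
            if end != -1:
                return depth, text[m.start():end + len("</ref>")]
    return None


def restore_missing(current, old_revisions):
    """Replace the first bare use of each undefined name with its old body."""
    defined, used = named_refs(current)
    restored = {}
    for name in sorted(used - defined):
        found = find_definition(name, old_revisions)
        if found is None:
            continue  # nothing in the searched history; leave the page alone
        depth, full_ref = found
        for m in REF_TAG.finditer(current):
            if m.group(4) and ref_name(m) == name:
                current = current[:m.start()] + full_ref + current[m.end():]
                break
        restored[name] = depth  # how many revisions back the body was found
    return current, restored
```

Calling `restore_missing(current_text, older_texts)` returns the repaired text plus a mapping from each restored name to how many revisions back its definition was found.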

Discussion

Looking at the contribs, this bot has already fixed 4 articles. The best example is Overpopulation (diff), where the bot fixed 2 broken references dating back several edits. The category currently contains 1075 articles, excluding Wikipedia, Talk, and User pages. --Lemmey talk 05:30, 3 May 2008 (UTC)

That's pretty nifty. What do others think? -- Werdna talk 05:18, 3 May 2008 (UTC)

Sounds good to me. paranomia (formerly tim.bounceback)a door? 20:28, 3 May 2008 (UTC)

BlackListed Links

This seems useful. What happens if a ref used a now-blacklisted link? Gimmetrow 07:35, 3 May 2008 (UTC)

Why not check if each link is blacklisted? Also, the talk link in your signature is annoying. -- Werdna talk 04:12, 4 May 2008 (UTC)

According to MaxSem, and confirmed by testing, it is not possible to save a blacklisted link when making an edit, so this appears to be a non-issue. --Lemmey talk 08:11, 4 May 2008 (UTC)
Yes, if the link is in the spam blacklist, the bot won't save. What happens? Will the bot crash, or keep trying to make the same edit? But I'm also asking about links simply removed because they are not reliable sources: a soft blacklist, if you will. Gimmetrow 20:53, 4 May 2008 (UTC)
The function throws an exception and then goes on to the next article. The bot is designed to attempt each article in "Category:Pages with incorrect ref formatting" once. I can create a list for use in future runs that will skip any articles attempted in the previous pass. This will prevent a rollback war between the bot and any editors / other bots.
As for a particular named source being deemed unreliable, my view is that this likely occurred through a conversation on the talk page. As such, the article would likely have enough eyes on it that all instances of the named reference have already been removed. (For example, if source "BLOGGER" is deemed unreliable, it is unlikely that a giant red broken-ref warning with the name "BLOGGER" will fail to attract attention.) Since I'm only looking at ~1100 articles in the category, I expect this particular scenario to be rare. --Lemmey talk 05:20, 5 May 2008 (UTC)
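The one-attempt-per-article behaviour and the proposed skip list could look roughly like the sketch below. This is an illustration under assumptions, not the bot's code: `run_pass`, `fix_article`, and the JSON skip file are hypothetical, and the skip list here accumulates across passes rather than covering only the previous pass.

```python
import json
from pathlib import Path


def run_pass(titles, fix_article, skip_file):
    """Attempt each title once; remember attempted titles for later passes.

    fix_article is a caller-supplied function (hypothetical) that raises
    on any failure, e.g. when the save is rejected by the spam blacklist.
    """
    skip_path = Path(skip_file)
    # Titles attempted in earlier passes are never retried automatically.
    skipped = set(json.loads(skip_path.read_text())) if skip_path.exists() else set()

    attempted, failed = [], []
    for title in titles:
        if title in skipped:
            continue
        attempted.append(title)
        try:
            fix_article(title)
        except Exception as exc:
            # Record the failure and move on to the next article.
            failed.append((title, str(exc)))

    # Persist every attempted title, successful or not, so a later pass
    # cannot start a revert war with editors or other bots.
    skip_path.write_text(json.dumps(sorted(skipped | set(attempted))))
    return attempted, failed
```

Recording successful attempts as well as failures is a deliberate choice in this sketch: if an editor reverts the bot's fix, the next pass leaves the article alone.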

Once the trial is done, how often do you think you'll scan through the category? It will have a lot of articles at first, but eventually it will get down to just the handful that appear after the last scan by the bot. So once a day? Once an hour? Related to that, I think it might be helpful to identify in the edit summary how long the named reference was missing, either by date or version number. If you're doing this like I would expect, that shouldn't be too hard. Restoring really old refs would be a flag to check the ref, I would think. Finally, if you try to edit an article and can't, it may be a blacklisted link, or it may be protected, or it may simply be an edit conflict. You would want to retry edit conflicts after some delay. Gimmetrow 06:30, 5 May 2008 (UTC)
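The suggestion to retry only edit conflicts, after a delay, might be sketched as follows. Everything here is an assumption for illustration: `EditConflict` is a hypothetical exception standing in for whatever the bot framework raises, and `save` is a caller-supplied function that re-fetches the page and attempts the edit.

```python
import time


class EditConflict(Exception):
    """Hypothetical marker for a save that lost a race with another edit."""


def save_with_retry(save, retries=3, delay=30.0, sleep=time.sleep):
    """Call save(); retry only on EditConflict, waiting longer each time.

    Any other exception (blacklisted link, protected page, ...) propagates
    immediately, because retrying would not help. Returns the number of
    attempts that were needed.
    """
    for attempt in range(1, retries + 1):
        try:
            save()
            return attempt
        except EditConflict:
            if attempt == retries:
                raise  # give up after the last allowed attempt
            sleep(delay * attempt)  # linear back-off: delay, 2*delay, ...
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.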

I'd say no more than once a week. It really depends on how many are left and how fast the category turnover is (how fast it grows or shrinks). The bot isn't perfect. Right now it skips <ref name= "That have spaces in their names with or without the quotes"> as in United States housing market correction. I'll need to add that and be able to search really deep (500+ versions), something I currently have capped for processing-time reasons. I'll look into the version number idea. --Lemmey talk 13:03, 5 May 2008 (UTC)

Notes
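One way to recognise ref names that contain spaces, with or without quotes, is sketched below. It is only an illustration: the pattern `REF_NAME` and the helper `parse_ref_names` are hypothetical, the tag is assumed to carry no attribute other than `name`, and unquoted names containing `/`, `>`, or quote characters are not supported.

```python
import re

REF_NAME = re.compile(
    r'''<ref\s+name\s*=\s*      # tag start, optional spaces around the equals sign
        (?:"([^"]+)"            # double-quoted name, spaces allowed
          |'([^']+)'            # single-quoted name, spaces allowed
          |([^"'/>]+?)          # unquoted name, possibly with spaces
        )\s*(/?)>               # optional self-closing slash
    ''',
    re.IGNORECASE | re.VERBOSE,
)


def parse_ref_names(wikitext):
    """Return a list of (name, is_self_closing) for each named ref tag."""
    results = []
    for m in REF_NAME.finditer(wikitext):
        name = (m.group(1) or m.group(2) or m.group(3)).strip()
        results.append((name, bool(m.group(4))))
    return results
```

The unquoted alternative is matched lazily so that it stops at the first `>` or `/>` instead of swallowing the rest of the tag.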

Note: Discussion about this bot is occurring at WP:AN. SQLQuery me! 08:41, 4 May 2008 (UTC)

No one seems to be discussing the bot there, only bot policy and the approval process, as I noted on the BAG talk page. --Lemmey talk 09:06, 4 May 2008 (UTC)
Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Strange problem solved. Gimmetrow 20:53, 4 May 2008 (UTC)
((BAGAssistanceNeeded)) Bot has completed trial.

((BAGAssistanceNeeded)) Bot has completed trial, BOT now has the ability to post the version # of the article the reference was restored from, see http://en.wikipedia.org/w/index.php?title=User:Lemmey&diff=prev&oldid=213721652.

Approved. Gimmetrow 07:37, 21 May 2008 (UTC)

Issues

Case

There is a problem: your bot treats ref names as case-insensitive, which they are not. MaxSem(Han shot first!) 17:54, 10 May 2008 (UTC)
The issue is that the editor considered ref names to be case-insensitive, substituting Columbia when he should have used columbia; it was a non-rendered ref and was fixed by the bot. Had it been looking for Columbia, the bot would have bottomed out and not fixed the ref. I'll grant that having a named ref stated in full more than once is unsightly, unnecessary, and inefficient, but I'll argue that it is not a more serious problem than a visible fault. I ran the bot on the article twice to fix all occurrences. --Lemmey talk 18:35, 10 May 2008 (UTC)
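A possible middle ground between the two positions in this thread is to prefer an exact, case-sensitive match and fall back to a case-insensitive one only when it is unambiguous. The sketch below is an illustration, not the bot's behaviour; `resolve_ref_name` is a hypothetical helper.

```python
def resolve_ref_name(wanted, defined_names):
    """Pick the defined ref name that a (possibly mis-cased) use refers to.

    Returns (name, exact): exact is True for a case-sensitive match and
    False for a case-insensitive fallback. Returns (None, False) when
    nothing matches or when the fallback would be ambiguous.
    """
    if wanted in defined_names:
        return wanted, True

    folded = wanted.casefold()
    candidates = sorted(n for n in defined_names if n.casefold() == folded)
    if len(candidates) == 1:
        # Only one name differs by case alone, e.g. Columbia vs columbia.
        return candidates[0], False
    return None, False
```

With the example from this thread, `resolve_ref_name("Columbia", {"columbia"})` falls back to `columbia` and flags the match as inexact, so the caller can decide whether to rename the use or copy the reference body.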
The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.