The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Withdrawn by operator.

Operator: The Helpful One

Automatic or Manually Assisted: Automatic

Programming Language(s): Python

Function Overview: Deletes broken redirects

Edit period(s): Periodically

Already has a bot flag (Y/N): Y

Function Details: This will use pywikipedia's redirect.py to delete broken redirects - if approved it will require the admin flag.

Discussion

Since the code seems well-tested and this is a task previously undertaken by others who have since left the project, I can't see why this technically shouldn't be trialled. Awaiting further community input, though, following THO's post to AN. Fritzpoll (talk) 14:20, 19 February 2009 (UTC)

Per Xeno's question: the bot uses the regex ('<li><a href=".+?" title="(.*?)"') to check the HTML of the special page and find redirects which point to a non-existent page. ·Add§hore· Talk To Me! 14:27, 19 February 2009 (UTC)
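For illustration, a minimal Python sketch of the screen-scraping approach described above. The sample HTML below is hypothetical; the real markup of Special:BrokenRedirects may differ between MediaWiki versions.

```python
import re

# The regex quoted above: grab the title attribute of each list entry
# on the special page.
BROKEN_REDIRECT_RE = re.compile(r'<li><a href=".+?" title="(.*?)"')

# Hypothetical sample of the special page's HTML.
sample_html = (
    '<li><a href="/wiki/Old_name" title="Old name">Old name</a> '
    '&rarr; <a href="/wiki/Gone" title="Gone">Gone</a></li>'
)

def broken_redirect_titles(html):
    """Return the redirect titles scraped from the special page HTML."""
    return BROKEN_REDIRECT_RE.findall(html)

print(broken_redirect_titles(sample_html))  # ['Old name']
```

Note that the non-greedy `.*?` stops at the first closing quote, so only the redirect's own title is captured, not the target link that follows it.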
Does it check that this has always been the case, though? Could I come in, redirect a page to a non-existent page, and then have it deleted by the bot rather than restored? WilyD 14:31, 19 February 2009 (UTC)
It does not check to see if this has always been the case. It simply gets the list, finds pages which redirect to nowhere, and deletes them. ·Add§hore· Talk To Me! 14:34, 19 February 2009 (UTC)
That would be a problem. This morning I found several redirected-to-self pages in C:CSD that had been articles someone later redirected to themselves. Redirects to nowhere are likely to suffer the same false positives. WilyD 14:39, 19 February 2009 (UTC)
There is the code that was used for the previous bot, rdbot.pl - but I am not sure if that code will do what you want - I will ask its creator later on today. The Helpful One 14:43, 19 February 2009 (UTC)
Yes: if ($revCount == 1), edit; if ($revCount != 1), don't edit. ·Add§hore· Talk To Me! 14:44, 19 February 2009 (UTC)
Thanks Addshore, but in human language please ;p –xeno (talk) 14:46, 19 February 2009 (UTC)
Sorry :P If the page has only one revision, it will delete; if it has more than one, it will not. ·Add§hore· Talk To Me! 14:48, 19 February 2009 (UTC)
As long as you're checking that, it'd make sense to add the multiple-revision ones to C:CSD, so they can be inspected by eye (as they need either reversion or deletion). Cheers, WilyD 14:57, 19 February 2009 (UTC)
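The rule sketched in Perl above, restated as a minimal Python sketch. `delete_page` and `tag_for_speedy` are hypothetical callbacks standing in for the bot's real deletion/tagging calls, not pywikipedia APIs.

```python
# Delete only single-revision broken redirects; anything with more
# history gets tagged for a human admin to review (per the C:CSD
# suggestion above).
def handle_broken_redirect(title, rev_count, delete_page, tag_for_speedy):
    """Delete a broken redirect with exactly one revision; otherwise tag it."""
    if rev_count == 1:
        delete_page(title)
        return "deleted"
    tag_for_speedy(title)
    return "tagged"

log = []
print(handle_broken_redirect("Foo", 1, log.append, log.append))  # deleted
print(handle_broken_redirect("Bar", 3, log.append, log.append))  # tagged
```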
Hi there, I asked a user to see if they could do this - and they've changed the script accordingly: User:Richard0612/redirect.py. The script should now do what you want if I use this new version. Thanks, The Helpful One 16:07, 19 February 2009 (UTC)
I have no further concerns. Seems like a worthwhile idea. WilyD 16:24, 19 February 2009 (UTC)

Approved for trial (20 deletions). Please provide a link to the relevant contributions and/or diffs when the trial is complete. As usual for admin bots, the trial deletions are to be run on your main account. Please include ((Sam1649)) in your edit summaries to make the edits easy to find through the following database query (replace 'Thehelpfulbot' with 'Thehelpfulone' for the trial edits): Richard0612 21:00, 19 February 2009 (UTC)

SELECT CONCAT('* ',log_action,' on [[',log_title,']] at ',log_timestamp) FROM logging WHERE log_comment LIKE '((Sam1649))%' AND log_user_text = 'Thehelpfulbot';
In response to the redirects: it would leave those alone, as they are not broken redirects. When I used to look at the broken redirects list it was, in my opinion, quite large; while at the moment it doesn't seem to be, when it is large it is tedious for admins to sort through. However, the bot will only delete a page if it has exactly one revision; if it has more than one revision, the bot will skip the page and tag it for speedy deletion for a human admin to look through. The Helpful One 12:13, 20 February 2009 (UTC)
Trial complete. Awaiting database query. The Helpful One 14:36, 22 February 2009 (UTC)
Fixed query and results here. Q T C 14:42, 22 February 2009 (UTC)
This article had 2 edits, from 2 different users, but the bot still deleted it - any idea what's wrong with the code?! The Helpful One 18:18, 22 February 2009 (UTC)
Code should be fixed now (what was wrong was me looking for the complex answer and not remembering this ...), but it's best to make sure: Approved for trial (20 deletions). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Richard0612 23:37, 23 February 2009 (UTC)

This task isn't needed currently. --MZMcBride (talk) 00:09, 3 March 2009 (UTC)

Correct me if I'm mistaken, but it appears that the code author is approving his own trials? That is, Richard0612 is writing the underlying source code, approving the code for trial, and then having Thehelpfulone run the code under his account. Am I missing something? --MZMcBride (talk) 03:12, 3 March 2009 (UTC)

MZMcBride is perfectly correct, although in actual fact I only added one line to the standard 'redirect.py' script, so that only pages with one revision would be deleted. However, this doesn't remove the fact that I should have waited for another BAG member to approve a trial, to avoid any problems. I would be grateful if someone experienced with adminbots could check the code and the deletions before anything else happens with this task, and I apologise for my lapse in judgement. Richard0612 17:27, 3 March 2009 (UTC)

This request seems to be a sprawling mess. A few questions:

  1. Is it using redirect.py or rdbot.pl? They're entirely different sets of code.
  2. How is the broken redirects list generated?
  3. How often will the bot be run? "Periodically" means nothing.
  4. Does the code check for date of last edit?
  5. Why on Earth would it tag the pages for deletion? This part makes no sense whatsoever to me.
  6. Whose idea was it to insert nonsense into the deletion logs and why? (For years people are now going to see ((Sam1649)) and wonder why the hell it's there. This seems pretty poorly thought-out.)
  7. Why is this being proposed as part of an already-existing bot? Isn't there some sort of requirement to separate functionality into a separate account?
  8. How does the code handle redirects to other projects?
  9. How does the code handle file redirects or category redirects?
  10. Why on Earth does it not include the target page in the deletion summary? (That's the most important part!)

--MZMcBride (talk) 18:17, 3 March 2009 (UTC)

In response to your questions:
  1. It is using Redirect2.py (a modified redirect.py) - coded by Richard0612
  2. The bot makes use of the special page Special:BrokenRedirects
  3. "Periodically" means whenever needed - most likely weekly (as the special page is updated around every 3 days)
  4. Currently this task will be on hold, as there's a bug in pywikipedia: it's not picking up the article history, which means this bug needs to be fixed. The code checks the page history; if the page has more than one revision, it will tag the page for speedy deletion.
  5. It would tag pages for deletion because, if they have more than one edit in the history, a vandal may have double-redirected the page or it may need to be reverted - which is why it would need a second look from a human.
  6. The ((Sam1649)) template is there so that a database query can be run, letting people see what deletions were made on my admin account by my bot. This allows me to differentiate my edits from the bot's. Note: this is only for the trial, not for the real bot edits.
  7. I'm not sure about this - but I would add it to a different account if the bot code is fixed.
  8. I'm not sure for the deletions, but I think it leaves other projects alone - in the bot's eyes, soft redirects are not real redirects, so it would ignore these pages. To clarify: if a page redirects to another project (a "soft redirect"), the bot will not make any changes to the page.
  9. I would need to test it for file redirects / category redirects, because I've not encountered these yet.
  10. If this is needed, then I think I should be able to (or get someone to) change the edit summary code to show the deletion target. I have updated the edit summary accordingly.
However, you think that this task is not needed - if others also agree with this, then I will withdraw the task unless and until it is needed. Was the task not useful when User:RedirectCleanupBot performed it? The Helpful One 21:51, 5 March 2009 (UTC)
Note to other BAG members: It would appear that THO is waiting on a MediaWiki bug fix (or something like that), so please don't expire this request for the time being. Think of it as "on hold", if you will. - Jarry1250 (t, c) 20:28, 8 March 2009 (UTC)
Yes, I'm waiting for a pywikipedia bug fix so that the bot will work correctly. The Helpful One 21:18, 8 March 2009 (UTC)
The bug has now been fixed. The Helpful One 17:53, 9 March 2009 (UTC)
Can I please have a trial to re-test the bot to see if it works? (I'll see if I can incorporate the target page in the deletion summary). The Helpful One 17:55, 9 March 2009 (UTC)
Approved for trial (20 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Have fun! Reedy 19:30, 9 March 2009 (UTC)
Thanks for the trial, made comments on my answers to the questions. The Helpful One 20:50, 9 March 2009 (UTC)
Added another update to my answers. The Helpful One 20:53, 9 March 2009 (UTC)
Trial complete. 17 deletions and 5 taggings. See here for now for the deletion log and check my deleted contributions for the taggings. The Helpful One 21:40, 10 March 2009 (UTC)

A couple of things:

  1. Stop adding gibberish to the deletion logs. If you want to find the deletions, use "Robot:" or look for the deletions that are grouped together and all follow a specific pattern. It's not that hard. Adding gibberish like ((Patsy6969)) is entirely unnecessary.
  2. Above, you misunderstood one of my questions. How does the script handle redirects like #REDIRECT [[mw:User:Foo]]?
  3. Why is the bot tagging pages? Isn't it much simpler to have it output a list somewhere and have an admin review the list at their leisure rather than creating a backlog? (And it saves from needless edits.)
  4. How does the bot handle file and category redirects? (This isn't a theoretical question; both of these things exist in MediaWiki core and will shortly be enabled.)

--MZMcBride (talk) 04:05, 11 March 2009 (UTC)
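As a minimal sketch of what handling question 2 might involve, here is one way to detect cross-project targets such as [[mw:User:Foo]], which the bot should leave alone. The prefix set below is a tiny illustrative subset chosen for this example, not the real interwiki map.

```python
import re

# Illustrative subset of interwiki prefixes; the live wiki has many more.
INTERWIKI_PREFIXES = {"mw", "meta", "wikt", "commons"}

# Match the target inside "#REDIRECT [[...]]" (case-insensitive).
REDIRECT_RE = re.compile(r"#REDIRECT\s*\[\[([^\]]+)\]\]", re.IGNORECASE)

def is_interwiki_redirect(wikitext):
    """Return True if the redirect target carries a known interwiki prefix."""
    m = REDIRECT_RE.search(wikitext)
    if not m:
        return False
    target = m.group(1)
    prefix, sep, _rest = target.partition(":")
    return bool(sep) and prefix.strip().lower() in INTERWIKI_PREFIXES

print(is_interwiki_redirect("#REDIRECT [[mw:User:Foo]]"))  # True
print(is_interwiki_redirect("#REDIRECT [[Main Page]]"))    # False
```

A real implementation would also have to distinguish interwiki prefixes from local namespaces like "User:" or "Category:", which this sketch glosses over.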

The template is for generating User:OverlordQ/ThbQuery. It will be removed when/if approved. Xclamation point 04:08, 11 March 2009 (UTC)
In response to question 4, a member of the pywikipedia team said that once file and category redirects are implemented, they will add that to the code - after all, it is hard to test something that has not been officially released yet! The Helpful One 19:40, 11 March 2009 (UTC)
Who said they will add the code? If the person doesn't have time or doesn't want to, what's plan B? Do you realize pywikipedia breaks quite often because it uses screen-scraping rather than the API? And where's the updated code for this bot? User:Richard0612/redirect.py hasn't been updated in several weeks. --MZMcBride (talk) 20:21, 11 March 2009 (UTC)
I have updated the code for the script - however, it's really part of the updated code of pywikipedia that made the change for the page history. No one actually said they'll update the code - they just said they'll work on it. Perhaps this should be put on hold until file and category redirects are implemented, so that can be tested as well? In the meantime, however, the broken redirect list is currently quite large, at nearly 500 broken redirects. The Helpful One 20:45, 11 March 2009 (UTC)
TBH, I really don't like the idea of an adminbot being run by someone who isn't able to fix it when it breaks. It's not a critical task, so it's not that important that it be fixed immediately, but that means there's no rush to get a bot made to do this. Also, 500 broken redirects isn't really a whole lot; that's about 0.01% of all redirects. Mr.Z-man 04:13, 13 March 2009 (UTC)
I must say that I would prefer the 'only delete if 1 revision' modification to be incorporated into pywikipedia's redirect.py directly (as an option/argument, perhaps?); that way the code wouldn't have to be reworked whenever pywikipedia is updated, and there wouldn't be two branches of (essentially) the same code. Richard0612 12:18, 15 March 2009 (UTC)
Moving this from THO's talk page, with minor changes for clarity: There still seem to be issues with the script. I currently have a redirect loop in my user space, as an example for a WP:VPR discussion. It starts at User:Amalthea/test1, and redirects test1 → test2 → test3 → test2
The script made the following three changes, in this order:
I'm unsure what it was trying to do, and haven't looked at the code. The first tag was, at that moment, incorrect, since all targets do and did exist. It doesn't look like a concurrency issue, or like the bot was thinking ahead to tag test2 too.
The target swapping of test1 was most probably caused by the loop, which the bot should either not touch at all, or handle differently.
Evidently, it wanted to fix the double redirect there, which it doesn't have to in the first place since double redirects currently work (and will continue to work if bugzilla:17888 is implemented, for which there is consensus).
In any case, it should have used WP:CSD#G8 these days instead of the deprecated WP:CSD#R1 for tagging a redirect to a redlink.
Lastly, and most importantly, all of the pages have a non-trivial page history, so they should never be speedy deleted, or tagged as such.
Cheers, Amalthea 14:13, 15 March 2009 (UTC)
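For illustration, a minimal sketch of the loop detection the bot would need before touching a chain like test1 → test2 → test3 → test2. The dict here is a toy stand-in for resolving a redirect's target on the wiki, and `find_loop` is a hypothetical helper, not part of the actual script.

```python
# Toy redirect table reproducing the loop described above.
REDIRECTS = {
    "User:Amalthea/test1": "User:Amalthea/test2",
    "User:Amalthea/test2": "User:Amalthea/test3",
    "User:Amalthea/test3": "User:Amalthea/test2",
}

def find_loop(start, redirects):
    """Return the page where the redirect chain first revisits itself, or None."""
    seen = set()
    page = start
    while page in redirects:
        if page in seen:
            return page
        seen.add(page)
        page = redirects[page]
    return None  # chain terminates at a non-redirect (possibly a redlink)

print(find_loop("User:Amalthea/test1", REDIRECTS))  # User:Amalthea/test2
```

A chain ending in a loop should be skipped (or reported) rather than "fixed", since no retargeting can make it point anywhere.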
What's the hangup on this? Fixes in pywikipedia? The bot itself? I'd be more apt to agree with MZMcBride's suggestions above. Also, why is it trying to fix double redirects? Unintended side-effect? Q T C 22:42, 25 March 2009 (UTC)
The above was from an approved task of THB, which is to fix double redirects - a different task and a different operation from the same code. I've received comments now, from above and off-wiki, that running an admin bot probably wouldn't be a good idea if I didn't know how to fix it if it went wrong. However, the broken redirect category can be cleaned up with this script, so I am unsure what to do! The Helpful One 15:16, 28 March 2009 (UTC)
Well, FWIW, I've been running this, which goes every two hours and just does reporting. I really think the vast majority of them need human intervention, and blindly deleting all of them with a single revision is a bit heavy-handed, as some of the redirects can be saved and/or might require fixing templates, etc. Q T C 03:29, 30 March 2009 (UTC)

Question: Are there any differences between this and Wikipedia:Bots/Requests for approval/FritzpollBot 2? – Quadell (talk) 19:49, 17 April 2009 (UTC)

A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) – Quadell (talk) 12:56, 28 April 2009 (UTC)

Hi there. Essentially, it will be doing the same job, with different code (perhaps?). Anyway, if Fritzpoll knows how to fix things that go wrong with his bot, then I am happy to let his bot do this task - as mentioned above, I won't be able to fix things if they go wrong, which, for this bot, seems true. MZMcBride is asking the same questions that he asked on this page; it will be interesting to see the answers. The Helpful One 18:04, 29 April 2009 (UTC)

A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Now that User:Yet Another Redirect Cleanup Bot is going at these, do you still want to do this as well, or should we withdraw it? – Quadell (talk) 00:04, 17 May 2009 (UTC)

Hi, this request can now be withdrawn, as the Redirect Cleanup Bot is working fine for this task. Thanks, The Helpful One 17:35, 18 May 2009 (UTC)


The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.