Galobot 2

Operator: Galobtter

Time filed: 10:06, Tuesday, October 16, 2018 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python/Pywikibot

Source code available: here

Function overview: Message users who add broken file links to articles

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#CAT:MISSFILE_bot; Wikipedia:Bots/Requests_for_approval/RonBot_12

Edit period(s): Daily

Estimated number of pages affected: ~10-20 a day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Sends a talk page message to autoconfirmed users who add broken (red-linked) file links to mainspace pages, found by scanning CAT:MISSFILE. The mechanism is similar to Wikipedia:Bots/Requests for approval/DPL bot 2. The bot runs daily, checks which new red-linked files have been added, and messages the user who added each one if they are autoconfirmed; it doesn't message non-autoconfirmed users, as they are likely vandals or wouldn't know how to fix the link. Most people who break file links are IPs or non-autoconfirmed, so of the 70 or so broken links added each day I estimate only ~10 people will be messaged per day.

Figures out which image is broken and who added it, using mw:API:Images and mw:API:Parse to get the file links, and finds the revision in which the broken file link was added.
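A minimal sketch of the revision search described above, assuming the file links of each revision have already been fetched (for example via mw:API:Parse) and are supplied oldest first. The `Revision` record and the `find_adding_revision` function are hypothetical names used for illustration only; this is not the bot's actual source code.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Revision:
    """Hypothetical record: one page revision and the file links it contains."""
    revid: int
    user: str
    files: FrozenSet[str]


def find_adding_revision(revisions: List[Revision], filename: str) -> Optional[Revision]:
    """Return the single revision that introduced `filename`, or None.

    `revisions` must be ordered oldest first; the first entry is treated as
    the baseline (the page state at the previous run). None is returned when
    the file was not added in this window, or was added more than once.
    """
    additions = [
        current
        for previous, current in zip(revisions, revisions[1:])
        if filename not in previous.files and filename in current.files
    ]
    # Ambiguous histories (add, remove, add again) are skipped entirely.
    return additions[0] if len(additions) == 1 else None


# Example: the second revision introduces the broken link.
history = [
    Revision(1, "Alice", frozenset({"File:A.jpg"})),
    Revision(2, "Bob", frozenset({"File:A.jpg", "File:Missing.jpg"})),
    Revision(3, "Carol", frozenset({"File:A.jpg", "File:Missing.jpg"})),
]
culprit = find_adding_revision(history, "File:Missing.jpg")
```

Comparing each revision with its predecessor keeps the check independent of how the file lists were obtained, so the same function would work with any source of per-revision link data.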

Message sent will be something like:

Hello. Thank you for your recent edits. An automated process has found that you have added a link to a non-existent file File:Hydril Compact BOP Patent.jpg to the page Blowout preventer in this diff. If you can, please remove or fix the file link.

You may remove this message. To stop receiving these messages, see the opt-out instructions. Galobtter (pingó mió) 10:06, 16 October 2018 (UTC)

Discussion

Note that it may take forever before pages with recently deleted files show up in Category:Articles with missing files, so consider obtaining a list of articles from a database report and purging those so that the category is updated before you start notifying users. --Stefan2 (talk) 10:38, 16 October 2018 (UTC)
Thanks for the comment! In this case, nobody, because it skips cases where the file has been added, then removed, and then added again, i.e. where the file has been added more than once. However, if User A adds a file and later User B deletes the file, it'll notify User A, but only if that revision occurred within 24 hours before being listed in CAT:MISSFILE, as it only checks the revisions that have occurred since the last run 24 hours ago. I was wondering earlier whether it should skip cases where the file has been deleted after a user adds it (can check deletion logs). Galobtter (pingó mió) 11:27, 16 October 2018 (UTC)
Actually, checking the deletion logs seems pretty necessary, since the bot probably shouldn't spam people if FileDelinkerBot/CommonsDelinkerBot goes down. Will add. Galobtter (pingó mió) 11:51, 16 October 2018 (UTC)
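The skip rules discussed above could be sketched roughly as follows. This assumes the deletion-log timestamps for the file and the user's autoconfirmed status have already been looked up; `should_notify` and its parameters are hypothetical names, not part of the bot's published code.

```python
from datetime import datetime
from typing import Iterable


def should_notify(added_at: datetime,
                  deletion_times: Iterable[datetime],
                  is_autoconfirmed: bool) -> bool:
    """Decide whether the user who added a file link should be messaged.

    added_at         -- timestamp of the revision that added the link
    deletion_times   -- timestamps from the file's deletion log (assumed input)
    is_autoconfirmed -- whether the adding user is autoconfirmed
    """
    if not is_autoconfirmed:
        return False
    # A deletion at or after the edit means the link was not broken when it
    # was added, so the red link is not the editor's mistake.
    return not any(deleted >= added_at for deleted in deletion_times)
```

With this rule, an outage of the delinker bots would not trigger messages, because every affected file would have a deletion-log entry later than the edit that added the link.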

Not a good task for a bot. This is effectively equivalent to messaging someone every single time they make a typo and will likely be perceived as spam and/or be irritating to established editors. At 10-20 edits/day, this is pretty low impact, and comes off as a solution in search of a problem. -FASTILY 07:24, 22 October 2018 (UTC)

As it runs daily, it'll only message if people leave the broken file link for at least a few hours. I wouldn't want to be messaged every time I made a typo, but I certainly would if I broke a link to a file and so caused an easily fixed problem in an article. And there is a definite problem it is trying to help solve: CAT:MISSFILE steadily rising and people spending quite a bit of time every day getting it down (because someone has to eventually fix the file link). That it'd only message 10-20 people a day shows that the number of people who break file links is quite low and so people are unlikely to be messaged so repeatedly that it becomes an irritant. Galobtter (pingó mió) 07:45, 22 October 2018 (UTC)
I'll split my response for clarity:
"CAT:MISSFILE steadily rising and people spending quite a bit of time every day getting it down (because someone has to eventually fix the file link)."
Unless CAT:MISSFILE is primarily populated by editors making typos, this is not a legitimate reason to run this task.
"it'd only message 10-20 people a day shows that the number of people who break file links is quite low"
Sounds like we don't need this task then.
"and so people are unlikely to be messaged so repeatedly that it becomes an irritant"
It's irritating to people that do get messaged, especially if you're bothering them over minor things. In fact, this is one of the reasons I am opposed to this task. -FASTILY 03:55, 23 October 2018 (UTC)
I think the number here is somewhat underestimated - Wikipedia:Bot_requests#CAT:MISSFILE_bot says a 10 day trial generated 681 pages with broken file links. It's hardly a minor thing if someone has broken a file link in an article; I think they would want to know. Some of these errors are definitely known to be due to a poor search and replace with AWB. If the editor is not aware, then there is a strong possibility that the editor will use the same setup and create even more broken links. Ronhjones  (Talk) 19:50, 25 October 2018 (UTC)
The reason for that number is that it is mostly IPs or non-autoconfirmed users breaking links and many errors are from failures of the delinker bots upon deletion of files. Galobtter (pingó mió) 20:00, 25 October 2018 (UTC)
As a regular patroller of CAT:MISSFILE, I can say definitively that many red-linked files are due to a poor search and replace with AWB or other script-assisted editors. See these two edit histories (1 and 2) for recent examples of red-linked images caused by script-assisted editing. I'm a less active patroller now than I used to be but I'm sure KatnissEverdeen and Sam Sailor can provide other examples. - tucoxn\talk 07:09, 27 October 2018 (UTC)

I definitely would agree with Ronhjones and Tucoxn. However, while Tucoxn is definitely right that a lot of red-linked files are because of 'find and replace' AWB/script edits, I would also add that people (especially new editors) often don't realize that editing a filename breaks the image. I would argue that a message would be helpful, as I have received many confused messages on my talk page legitimately asking why I reverted them and what they did wrong. Here are a few other examples to illustrate this point (all of these people messaged on my talk page later saying they didn't know they had done something wrong). 1 2 3. Happy to provide other examples if you like. Cheers, Katniss May the odds be ever in your favor 16:17, 27 October 2018 (UTC)

Ronhjones and Tucoxn are both right here. Seasoned editors running AWB/scripts and overlooking changes to filenames is a common mistake. I am no saint myself: my first interaction with KatnissEverdeen was when she made me aware that I had overlooked a script-assisted change of a dash to endash in a filename. The more "permanent" solution to these scenarios is to create redirects on Commons. I wish we had a little script for doing that, and if any of you have a good idea where to propose it, I would appreciate your feedback. Galobtter, thanks for coding the bot, I for one would like to know when I screwed something up. Sam Sailor 21:55, 27 October 2018 (UTC) (Amended. Sam Sailor 20:37, 29 October 2018 (UTC))
@Sam Sailor, KatnissEverdeen, and Galobtter: As a Commons admin, I know where that would be: c:Commons:Bots/Work_requests to request someone to invent/run a bot, and c:COM:BRFA for bot approvals. Ronhjones  (Talk) 00:40, 28 October 2018 (UTC)
@Ronhjones and Galobtter: I would agree with the script idea, not sure of the technical lingo I would need to use to request it though. I'm sure you all would be much better at wording the request than I would. Sam Sailor: "I am no saint myself: my first interaction with KatnissEverdeen was when she made me aware that I had overlooked a script-assisted change of a dash to endash in a filename." - Haha, I totally forgot about that... very easy thing to screw up and nobody's perfect. Katniss May the odds be ever in your favor 15:39, 29 October 2018 (UTC) (Amended "emdash" to "endash" in quote per WP:TPO for clarity. Sam Sailor 20:37, 29 October 2018 (UTC))
@Sam Sailor, KatnissEverdeen, and Galobtter: I'm not sure that Commons would like such a bot. With 50 million images on site, it might be quite a few redirects! I'll post a question over there and see what they say. Ronhjones  (Talk) 15:45, 29 October 2018 (UTC)

@Ronhjones: I have no clue if that is a job for a bot; I was thinking about a script that would make it a bit easier to create redirects on Commons.
Suppose you patrol CAT:MISSFILE, and you "correct" a spelling correction only to be undone, which again causes a redlinked file. Here a script would save some seconds if it could load up https://commons.wikimedia.org/wiki/File:Nutrient_absorbtion_to_blood_and_lymph.png and pop up a box containing the string File:Nutrient absorbtion to blood and lymph.png where you could change it to File:Nutrient absorption to blood and lymph.png, press Create redirect, and a redirect would be created from the latter to the former. Sam Sailor 20:37, 29 October 2018 (UTC)
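For illustration, the page text such a script would need to save could be generated as below. This is only a sketch in Python of the standard MediaWiki redirect syntax (`#REDIRECT [[Target]]`); the helper name `file_redirect_wikitext` is hypothetical, and whether a redirect of this kind takes effect in file space is not settled in this discussion.

```python
def file_redirect_wikitext(target: str) -> str:
    """Build redirect wikitext pointing at an existing file page.

    `target` is the title of the file that actually exists; the "File:"
    prefix is added if the caller left it out.
    """
    if not target.startswith("File:"):
        target = "File:" + target
    return f"#REDIRECT [[{target}]]"


# Text to save at "File:Nutrient absorption to blood and lymph.png"
# so that it points at the existing (misspelled) file name.
text = file_redirect_wikitext("Nutrient absorbtion to blood and lymph.png")
```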

Love this idea! I think this would be a super easy solution to quite a few of our issues here. Katniss May the odds be ever in your favor 20:40, 29 October 2018 (UTC)
@Sam Sailor, KatnissEverdeen, and Galobtter: Interesting. I don't write scripts very well at all, and I've no idea how well a script on en-wiki would work with Commons - there are still some old users who have different usernames on Commons - might cause issues! However, you don't need a Commons redirect - it could be a local redirect on en-wiki (it does not matter if it redirects to a Commons image), which would keep it much simpler. Maybe you should ask at Wikipedia:User scripts/Requests. Ronhjones  (Talk) 21:28, 29 October 2018 (UTC)
@Ronhjones: thank you, I created a local redirect at File:Nutrient absorption to blood and lymph.png to File:Nutrient absorbtion to blood and lymph.png, but it did not work. Are there special requirements for the syntax of redirects in file space? Sam Sailor 12:06, 8 November 2018 (UTC)