Bots noticeboard

Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments are welcome. Just make sure you are aware of our bot policy and know where to post your issue.



New Pywikibot release 3.0.20200326

(Pywikibot) A new Pywikibot release 3.0.20200326 was deployed as a Gerrit tag and on PyPI. It was also marked as the "stable" release and given the "python2" tag. The PAWS web shell depends on this "stable" tag. The "python2" tag indicates a Python 2 compatible stable version and should be used by Python 2 users.

Among other things, the changes include:


The following code cleanup changes are announced for one of the next releases:

All changes are visible in the history file, e.g. here

Best  @xqt 12:04, 27 March 2020 (UTC)[reply]

Re-examination of ListeriaBot

I think that this Bot's BRFA was defective and that this deficiency has been shown repeatedly. I first encountered the bot when it was used by an editor to create a bunch of list articles in mainspace (see entries which start List of museums). This bot did not seem to be approved to operate in mainspace, which seems to have since been confirmed in this BOTN discussion earlier this year. Today there is evidence of repeated problems with the bot and non-free images. From today: Bot operator's talk page and AN. Earlier discussions: User talk:ListeriaBot/Archive 1#Adding non-free images, Wikipedia:Village pump (technical)/Archive 154#ListeriaBot adding non-free images to Wikipedia namespace page, Wikipedia:Village pump (technical)/Archive 158#Listeria bot and non-free images, User talk:Magnus Manske/Archive 6#Non-free images being added by Lysteria bot and Wikipedia:Village pump (technical)/Archive 159#Lysteria bot and shadowing. This bot, even during the approval process, was operated outside of policies, and the operator does not seem to be responsive when concerns are raised. I ask for two actions:

  1. The bot be partially blocked from article space so that it may not edit there
  2. Changes be made to the bot, either by the current operator or by a new operator, to ensure that future non-free image problems do not occur and, to the extent that they do, that changes are made in a responsive manner

I recognize that this bot is important to the running of many behind the scenes tasks and do not want to disrupt those; it's only because of that that I am not asking for the bot's approval to be revoked until action 2 can be completed. However, I believe that this bot has, since before its approval, been operating outside of the bot policy and that this continues to the present time and in disruptive ways. Best, Barkeep49 (talk) 14:34, 11 April 2020 (UTC)[reply]

Partial block was floated last time; this time I'll implement it. Article-space only, but at the moment that seems to be the main concern. Should stop all the bickering at ((Wikidata list)) as well. Primefac (talk) 15:02, 11 April 2020 (UTC)[reply]
Thanks Primefac, but I do wish to note that today's issue around non-free images was caused in userspace. Hence action request 2. Best, Barkeep49 (talk) 15:06, 11 April 2020 (UTC)[reply]
Fair enough, though I'd like to hear other input before blocking wholesale again (unless you think only allowing in WP-space until this is sorted isn't too unreasonable).
Actually, that could still be an issue in WP-space. Do we know if things are properly set up on places like WiR? Primefac (talk) 15:12, 11 April 2020 (UTC)[reply]

Let us consider what Listeria lists stand for

The following discussion has been closed. Please do not modify it.

I am quite happy to have the bot flag of ListeriaBot considered, but let's do it properly. When you really, really, really want to go this way, let us consider what Listeria lists stand for, and their Wikipedia alternatives: disambiguation, red links, blue links and black links. Most importantly, how we share in the sum of all knowledge and how English Wikipedia can play a vital role in it. Let's include images linked to people, and the role Commons can play in this. How English Wikipedia can keep its non-free images and inform on the images that it keeps in this way. Let this conversation not be about an edge case.

By all means discuss a bot flag for ListeriaBot, but do present a serious alternative. Serious not in intentions, but serious in that it will serve us in a way that is imho missing in what Wikipedia stands for, in its dismissal of collaboration on multi-project and multi-language levels. Without a reasonable outcome, branding us all as Wikipedia is mostly painful because of what we could stand for together. Thanks, GerardM (talk) 18:03, 11 April 2020 (UTC)[reply]

GerardM, I don't say this often, but you just wrote a lot of text without saying anything. What are you trying to say? Primefac (talk) 18:12, 11 April 2020 (UTC)[reply]
Primefac, maybe you understand my blog post better. For me this episode is another reason why I do not want to be associated with Wikipedia. What is it with you people? Thanks, GerardM (talk) 08:44, 12 April 2020 (UTC)[reply]
@GerardM: Your blog post also doesn't say anything that is useful to this situation. Everybody agrees that the core job this bot does is useful, so simply repeating that it is useful and explaining why it is useful adds nothing of value. The issue is that the bot will not be unblocked unless and until it is reprogrammed so that it doesn't edit outside of what it has authorisation to do. This is exactly how every other bot that has bugs is treated - if the operator does not stop it then it is blocked. Listeria bot is not special; it is being held to the same standards required of every bot that operates on the English Wikipedia. Thryduulf (talk) 12:15, 12 April 2020 (UTC)[reply]
@Thryduulf: at issue is that English Wikipedia has pictures that are not free. These pictures only show on English Wikipedia. What we can do is include a link to the Wikidata item on Commons and only show pictures marked that way. It has additional benefits because those pictures will be easier to find, including in "other" languages like Russian, Kannada, Comanche. It says so in the blog post.
Also, do you not think that this is where English Wikipedia untouchables take a position where the penalty to our community is excessive? What are you guys thinking??
Also, when are we going to discuss and act on issues with quality on English Wikipedia? At least 4% of list entries in English Wikipedia are erroneous, and the quality of maintenance by hand is substantially lower than that of Listeria-maintained lists. Together we will do a better job. Thanks, GerardM (talk) 12:31, 12 April 2020 (UTC)[reply]
How is any of that relevant to this discussion? It doesn't matter what else Listeria bot does, can or could do, it will not be unblocked unless and until it is reprogrammed so that it doesn't do anything it does not have community consensus to do. Thryduulf (talk) 12:37, 12 April 2020 (UTC)[reply]
How is it that you are only willing to consider a perceived wrong and not willing to consider the meat of the matter, quality? When you insist on branding ListeriaBot as ill behaved because of a corner case, a four percent improvement fixing the false friends in Wikipedia lists is quite substantial and should have your attention. Thanks, GerardM (talk) 13:10, 12 April 2020 (UTC)[reply]
Simply put, the harm done by one avoidable copyright violation outweighs all the other good things the bot does. Human editors that behave in the way this bot does (knowingly or recklessly introducing copyright problems, editing otherwise than in accordance with consensus, ignoring editing restrictions) are regularly blocked. The good they do elsewhere is not regarded as justifying the harm they cause. Listeriabot is not special and there is no reason to treat it differently than any other bot or editor would be treated. Thryduulf (talk) 14:23, 12 April 2020 (UTC)[reply]
Simply put, substantiate your claims. Your argument is about a corner case where a procedure exists to overcome issues. We are talking about issues at Commons, a project that is more stern in its maintenance of copyright than English Wikipedia is. On the other hand, an error rate of 4% of all English Wikipedia lists is substantial; there is no mitigation, and the case is well argued. In addition Magnus has demonstrated that Listeria lists are better maintained than the average manually maintained list. Now consensus is something to hide behind when arguments fail you. Such behaviour is not special and harms our cause. Is quality of English Wikipedia a consideration at all? Thanks, GerardM (talk) 14:47, 12 April 2020 (UTC)[reply]

Accidental usage of non-free images

The issue raised here seems to be centered on the accidental use of non-free images via the edits of this bot. I've posted a proposal at Wikipedia_talk:Non-free_content#Requiring_non-free_content_to_indicate_that_in_their_filenames that would resolve this without requiring changes to this bot. Please comment there! Thanks. Mike Peel (talk) 18:45, 11 April 2020 (UTC)[reply]

You do realize that until that RFC concludes you're essentially saying you're okay with situations like this, where two bots mindlessly edit war for no reason other than the fact that one bot is not "behaving" correctly? Primefac (talk) 20:33, 11 April 2020 (UTC)[reply]
@Primefac: I'm suggesting a broader solution that would fix this issue while simultaneously avoiding any similar situation arising in the future. I could implement it tomorrow if there's consensus to do so. Thanks. Mike Peel (talk) 20:43, 11 April 2020 (UTC)[reply]
Then by all means, please fix the bot so that it doesn't add non-free files to non-articles. Primefac (talk) 21:13, 11 April 2020 (UTC)[reply]
@Primefac: I'm suggesting fixing enwp so the bot wouldn't cause problems. Mike Peel (talk) 21:15, 11 April 2020 (UTC)[reply]
While I realize it's only been a few hours, there is currently no consensus to implement your plan. I am genuinely curious, why is there such reluctance to make this change? Primefac (talk) 21:18, 11 April 2020 (UTC)[reply]
It is not entirely obvious to me that this change is an easy one to make. How many additional server queries would be required to detect commons-images-shadowed-by-non-free-local-images, compared to the queries the bot already makes to do its work, how much extra load would be caused by the queries, and what information from the server is available for the bot to determine whether a local image that shadows a commons image is unfree? Have you actually done this analysis? Or do you just assume that because you can do it as a human by clicking on and using your human ability at natural languages to read a few pages that it will be equally easy for a bot? I don't actually know that it's difficult, but I don't know that it's easy, and I don't see convincing evidence that you do either. —David Eppstein (talk) 23:47, 11 April 2020 (UTC)[reply]
The bot would just need to check whether the image is in Category:All non-free media. This can easily be done using the categorymembers API, or if the bot runs on toolforge, using the database mirrors. ST47 (talk) 23:53, 11 April 2020 (UTC)[reply]
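For illustration, a minimal sketch of the per-file lookup ST47 describes, shown here with the Action API's prop=categories module filtered by clcategories rather than paging through categorymembers. The endpoint and parameters are standard MediaWiki API, but the function name and user agent are illustrative only; this is not ListeriaBot's actual code.

import requests

API = 'https://en.wikipedia.org/w/api.php'

def is_nonfree(filename):
    """Return True if File:<filename> exists locally and is in Category:All non-free media."""
    resp = requests.get(API, params={
        'action': 'query',
        'format': 'json',
        'formatversion': 2,
        'titles': 'File:' + filename,
        'prop': 'categories',
        'clcategories': 'Category:All non-free media',  # only this category is reported back
    }, headers={'User-Agent': 'shadow-check-sketch/0.1 (illustrative)'})
    page = resp.json()['query']['pages'][0]
    # A file hosted only on Commons, or a free local file, returns no 'categories' key here.
    return bool(page.get('categories'))

The same check could be done in bulk against the Toolforge database mirrors instead, as ST47 notes.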
@David Eppstein and ST47: ST47 beat me to it. Glancing at the source, the bot makes ample use of SQL queries. A patch for this issue would be no more than 10 lines of code (perhaps I'll create a pull request...). --Mdaniels5757 (talk) 00:06, 12 April 2020 (UTC)[reply]
@Mdaniels5757: I suspect you may find it will take a bit more than that, and be rather more server-intensive than you seem to think, to deal with a problem of shadowed file-names that shouldn't exist anyway. Jheald (talk) 12:41, 12 April 2020 (UTC)[reply]
I don't think that the bot has to detect shadowed files; it needs just to detect whether the enwiki filepage is non-free, and that can be done by checking for the All non-free media category or the ((Non-free media)) template. It's not really reasonable to expect a bot (or even a human) to detect incorrectly licenced files on either project; I'd file these under GIGO and let editors take care of them as they come across them. Jo-Jo Eumerus (talk) 08:52, 13 April 2020 (UTC)[reply]
@Jo-Jo Eumerus: Sure. I don't disagree.
But (as I suggest in more detail in the section below) what will add to the complexity of the bot, and the load on the servers, is having to detect when it is adding files at all, and then having to run a SQL request to check each one of them -- and having to do so in a way that is specific to en-wiki distinct from any of the other 70 wikis the code is serving, fracturing what otherwise is relatively simple single unified code.
Moreover, again as argued below, the most relevant point I think is it may actually be beneficial that the bot is surfacing files with this shadowing issue, that (once the edit is made) can then be rather easily picked up by an SQL intersection of images in the non-free category and images on Listeria pages, so that the underlying filename problem can then be identified and fixed, rather than it continuing to fester under the surface. Jheald (talk) 09:18, 13 April 2020 (UTC)[reply]
But again, why does it need to display the image for that purpose? As repeated ad nauseam through all three discussions, the fix isn't some kind of lengthy recoding; it's literally the addition of a single colon so the relevant section of the reports generates [[:File:Filename.jpg]] instead of the current [[File:Filename.jpg]]. ‑ Iridescent 09:58, 13 April 2020 (UTC)[reply]
(ec) @Iridescent: As I understand it, that is not the fix that Jo-Jo was suggesting. He was suggesting making that change only for the files that are local non-free ones. Which is a much larger coding job, with the issues noted above.
What you suggest would remove display of all the images in all the Listeria lists - all 2500 of them on enwiki - to deal with a transient incidental issue that affects at most only a handful of images at a time out of all of those lists, and in talk-space not main space. It means for example, in a Listeria blue-link/red-link list of paintings by an artist, Listeria would no longer show what the paintings were, which can be hugely useful for the identification of paintings that may go by various different names, or for the identification of duplicates; it makes it hard to identify paintings that may have substandard images, or eg to prioritise article creation for paintings that have really good images. So I do think that the blanket turning off of all images on Listeria pages is a step to be avoided if we possibly can.
Also, it means that the mechanism described above, of being able to identify shadowed images by a simple SQL query through their being used on a Listeria page would fail, because they would no longer be being used on a Listeria page. Jheald (talk) 10:21, 13 April 2020 (UTC)[reply]
Mind you, we already have a bot that flags shadowed files (GreenC bot), so I wouldn't consider another bot doing the same as a large advantage. Jo-Jo Eumerus (talk) 10:17, 13 April 2020 (UTC)[reply]
If it's doing such a good job, then why do we have this problem? Jheald (talk) 11:36, 13 April 2020 (UTC)[reply]
Probably because ShadowsCommons situations were not really time sensitive matters until ListeriaBot began getting confused by them. I am also not sure how the latter helps "surfacing" the shadowed files. Also, I think that I and Iridescent are approaching different ends of Listeriabot in an attempt to resolve this issue (I am looking at the input, Iridescent is proposing a change to the output) Jo-Jo Eumerus (talk) 12:36, 13 April 2020 (UTC)[reply]
@Jo-Jo Eumerus: Thinking about this a bit more, from a strictly coding point of view, the cleanest approach (if there really is a problem here that needs to be dealt with) might be to let the existing code make its edit, wait a few seconds for the SQL tables to update, then run an extra script to make an SQL query to see whether any of the images now on the page were also in the non-free images category, and if so then edit the page to insert a colon to turn the displayed file into a link, and also make sure that the file was categorised in Category:Wikipedia files that shadow a file on Wikimedia Commons.
This would have the advantage of requiring only the most minimal changes to the existing main Listeria script, that runs across 71 wikis; and also making sure that the identified files were in the category for fixing. But it would come at the cost of an extra edit, and of the files still appearing where they shouldn't for a few seconds. Do you think such an approach would be (a) workable, and might be (b) acceptable? Jheald (talk) 13:59, 13 April 2020 (UTC)[reply]
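For what it's worth, a rough sketch of that post-edit check against the Toolforge replicas, assuming the standard enwiki replica schema and the toolforge Python package; the function, its parameters and the namespace handling are illustrative, not part of the existing Listeria code. It lists the images used on a given page that also sit in Category:All non-free media, which a follow-up script could then colon-link and categorise as shadowing files.

import toolforge  # assumes the script runs on Toolforge with replica access

def nonfree_images_on_page(page_title, namespace=2):
    """Images used on the given page (namespace 2 = User) whose local file page is non-free."""
    conn = toolforge.connect('enwiki')
    query = """
        SELECT il.il_to
        FROM page listpage
        JOIN imagelinks il    ON il.il_from = listpage.page_id
        JOIN page filepage    ON filepage.page_namespace = 6
                              AND filepage.page_title = il.il_to
        JOIN categorylinks cl ON cl.cl_from = filepage.page_id
        WHERE listpage.page_namespace = %s
          AND listpage.page_title = %s
          AND cl.cl_to = 'All_non-free_media'
    """
    with conn.cursor() as cur:
        cur.execute(query, (namespace, page_title.replace(' ', '_')))
        # Replica text columns come back as bytes.
        return [row[0].decode('utf-8') for row in cur.fetchall()]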

Status of bot

What happens to Wikidata updates?

Apologies if I've missed this, but what's going to happen with the Wikidata list updates from now on (apart from them not happening)? Thanks. Lugnuts Fire Walk with Me 07:27, 12 April 2020 (UTC)[reply]

As it is, they happen. Just not on English Wikipedia. Thanks, GerardM (talk) 08:39, 12 April 2020 (UTC)[reply]
Lugnuts, which Wikidata lists specifically, can you give examples ? There are many possible interpretations of your phrasing, and more exact answers require more exact questions. —TheDJ (talkcontribs) 08:46, 12 April 2020 (UTC)[reply]
@TheDJ:, ones such as WP:WIROLY. This would be updated every day or so, removing links that now have wiki articles. Lugnuts Fire Walk with Me 08:56, 12 April 2020 (UTC)[reply]
Lugnuts, the only effect is that updates won’t happen. And you can’t make new lists either. —TheDJ (talkcontribs) 09:03, 12 April 2020 (UTC)[reply]
What happens is that the hundreds, if not thousands, of editors whose work is assisted by Listeriabot, and by whose consensus it has operated for years, get badly inconvenienced. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:00, 12 April 2020 (UTC)[reply]
Inconvenience in non-essential tasks is a small price to pay when the alternative is mass copyright violation and a bot operator that cannot and/or will not follow basic bot policy. Thryduulf (talk) 09:58, 12 April 2020 (UTC)[reply]
Citation needed. Mass copyright violation, REALLY who are you fooling! Thanks, GerardM (talk) 11:27, 12 April 2020 (UTC)[reply]
@Thryduulf: There simply isn't mass copyright violation. That's bullshit, and really you are better than this. There are a handful of edge cases, that would be well-handled by renaming the images so they don't clash with the Commons names. (Something we ought to be doing and ought to have been doing anyway). Jheald (talk) 12:02, 12 April 2020 (UTC)[reply]
There is nothing stopping the bot as currently programmed from committing mass copyright violations, and given the bot operator does not see this as a problem with the bot, then I'm sorry but the encyclopaedia is better off without the bot. Thryduulf (talk) 12:07, 12 April 2020 (UTC)[reply]
@Thryduulf: The bot isn't committing "mass" copyright violations. I don't know if you've looked at the bot's contribution history and scrolled back through the last 7 days to get an idea of the number of projects and contributors using this bot to organise and present their workflows, but it's quite a number. So no, the encyclopedia is not "better off without this bot". The present block is a wildly disproportionate over-the-top response to deal with a tiny handful of edge cases that shouldn't exist anyway if en-wiki had been doing its job. Jheald (talk) 12:37, 12 April 2020 (UTC)[reply]
@Jheald: the correct number of copyright violations is zero. Any bot making greater than that many copyright violations is better off blocked regardless of what else it does - if it is important then the bot will be fixed or someone else will code a replacement bot that doesn't violate core policies. It is always the responsibility of a bot operator to ensure that it operates in accordance with policy and consensus, it is never the job of the English Wikipedia to change policies or practices to make allowances for a badly coded bot. Thryduulf (talk) 12:49, 12 April 2020 (UTC)[reply]
@Jheald: fixing the ping. Thryduulf (talk) 12:49, 12 April 2020 (UTC)[reply]
Others have dealt with your fatuous "mass copyright violation" claim. Who gets to decide which tasks are "non-essential"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:29, 12 April 2020 (UTC)[reply]
A task is essential if (a) the encyclopaedia will cease to function if it ceases, or (b) the encyclopaedia or its editors will suffer real-world harm if it ceases. So removing copyright violations is essential, adding them is not. That applies whether you regard repeatedly introducing multiple copyright violations to multiple pages for multiple years as "mass" or not. Thryduulf (talk) 15:50, 12 April 2020 (UTC)[reply]
I note that you didn't answer my question, but instead indicated that virtually nothing is essential, in your own reckoning, and by logical extension that it is OK to inconvenience - to greatly inconvenience, in this case - anyone and everyone not working on that very limited set of tasks that you deem essential. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:58, 12 April 2020 (UTC)[reply]
What else could reasonably be deemed essential? Inconveniencing people, greatly or otherwise, is unfortunate but as Headbomb points out this does not make complying with policy optional. Thryduulf (talk) 18:40, 12 April 2020 (UTC)[reply]
I need to use Listeria to replace my manually edited page List of Catholic churches in Salvador, Bahia. I can't do this manually anymore, it's too difficult. How do I get started with Listeria? Thanks! Prburley (talk)
That would not have been permitted even before ListeriaBot was blocked. * Pppery * it has begun... 14:43, 14 April 2020 (UTC)[reply]
@Pppery: Why not? It's just too overwhelming to edit these lists manually, and information on historic heritage sites is a vital part of the WP mission--not to mention that they're being damaged or destroyed frequently. Prburley (talk)
@Prburley: Because ListeriaBot is not approved for that. Headbomb {t · c · p · b} 17:09, 14 April 2020 (UTC)[reply]
@Headbomb: What's the process for having it approved? The solution below will work, but I think Listeria is what I'm looking for. Prburley (talk)
@Prburley: See WP:BOTAPPROVAL. Headbomb {t · c · p · b} 17:40, 14 April 2020 (UTC)[reply]
@Prburley: It could use a bit of tidying-up, but as a basic Listeria list, if you put something like this on a page in your own user-space, Listeria would then update it to give you something like this. Having hand-checked it, you could then copy & paste it to a live page in article space (and similarly with updates down the line). As Pppery says, Listeria currently isn't allowed to edit or update article space on en-wiki directly (unlike pt-wiki, where this week this Listeria list made "featured list" status), but as I understand it, there is no objection to using Listeria to create a list in your own user-space and then copying that to main-space, so long as you have personally hand-checked it, and it is appropriately referenced.
There are various tweaks that could be added to the pretty quick and rough example above -- for example the coordinates could be presented more attractively, a notes column could be added and populated by adding described by source (P1343) statements to the Wikidata entry for each church, some English labels or English sitelinks may be missing, we could probably do better for "location", etc; and there may be entries missing from the list because they don't have items, or aren't currently identified as churches, or don't currently have the right diocese (P708) information. But I hope it gives some idea at least of what is possible. And of course, having curated it for English Wikipedia, the data is then also immediately available for anyone wanting to get Listeria to make a version of the page in Portuguese or any other language. Jheald (talk) 15:24, 14 April 2020 (UTC)[reply]
Also probably worth noting that the page above was generated on Wikidata. If it was generated on en-wiki, then blue-links would be to en-wiki articles, with a choice of red-links or plain text for items not matched to en-wiki. Jheald (talk) 15:36, 14 April 2020 (UTC)[reply]
Version with what Wikidata has at the moment for location (P276) now added [1]. Currently a bit sparse, but could easily be improved. Jheald (talk) 15:33, 14 April 2020 (UTC)[reply]
Thank you for your amazing suggestions! I really appreciate it. Prburley (talk)

Forking the bot

Everyone seems to agree that the bot should be updated to fix the issue, but that isn't possible since the operator is inactive and hasn't fixed the issue the other times it's been brought up. The solution then seems to be forking the bot and implementing a patch. Since the source is public on Bitbucket it shouldn't be too hard. Would anyone volunteer to take on the task? ‑‑Trialpears (talk) 10:01, 12 April 2020 (UTC)[reply]

@Trialpears: I'm not quite sure who the "everyone" is that you are referring to. I see a number of editors above saying that the more appropriate solution would be to deal with the handful of files with names that shadow Commons, that it would be worth fixing anyway. Jheald (talk) 12:48, 12 April 2020 (UTC)[reply]
@Jheald: The point repeatedly made and ignored is that we already do deal with those images, and that no other bot has any issues with the status quo. Thryduulf (talk) 12:51, 12 April 2020 (UTC)[reply]
@Thryduulf: If en-wiki already is dealing with these images with shadow names, then the block here is even more a gratuitous unnecessary overkill than was first apparent. If even the handful of problems that have caused this fuss are already getting resolved within a few days, then why block the bot? If the issue is already in hand, and gets resolved routinely, then why all this fuss? Jheald (talk) 13:59, 12 April 2020 (UTC)[reply]
Corollary - user is found adding copyrighted content to a page. They are reverted and warned. They do it again, and they are reverted. Repeat ad nauseam. Let's say every time they add the copyrighted content they are reverted within a few days.
After how many reverts and warnings would we block the user? 1? 3? 10? My personal experience says 3, and from what I've seen the bot has previously done this on a single page more than a dozen times.
Yes, the images the bot is trying to place are being removed. HOWEVER, the bot shouldn't be placing them in the first place. It shouldn't be editing in article space (which is a secondary/minor issue being brought up again in this thread). "Living" editors get blocked all the time for this sort of behaviour, regardless of how otherwise useful their edits may be. Thus, it only makes sense to block a bot that is performing in the same manner. Primefac (talk) 14:04, 12 April 2020 (UTC)[reply]
@Primefac: Oh I'm sorry, I thought when User:Thryduulf said "we already do deal with those images" he meant that something was actively being done to prevent the problem recurring, by renaming the badly named local images on en-wiki. That fixes the problem. The purpose of this bot is to show what the SPARQL query returns, producing a facility that hundreds of users (or thousands, if you include all wikis) are using. If you're just removing the images from the page, then you're not solving the problem. On the other hand, if you solve the actual problem, by renaming the image (which ought to be renamed anyway), rather than covering up the conflict, then the issue goes away for good, and no change is needed to the bot. Jheald (talk) 14:20, 12 April 2020 (UTC)[reply]
At this point I think we're talking past each other. Yes, the images can/should/are being renamed when they are found. I'm not saying that shouldn't happen, because it already happens. But the bot should not be adding them to pages anyway. Why does it have to be one or the other? Why can't it be both? Primefac (talk) 14:36, 12 April 2020 (UTC)[reply]
(edit conflict) @Jheald: The reason for the block has been explained several times. The copyright issue has been ongoing and ignored since at least 2017 - that's far, far, far longer than anyone has a right to expect to ignore a problem without sanction, and that ignores the other issues of the bot operator not responding to multiple other complaints about editing beyond its authorisation. There are two ways that shadow images can happen: the first is for a new file to be uploaded to enwp, but this does not happen as only sysops can do that and they get a warning about it. The other way is for a new file on Commons to be uploaded with the same name as a file here - there is no way that en.wp can be anything other than reactive to that situation (technologically or otherwise), so we have implemented a mitigation strategy that seemingly works for every single other bot on the project and, as multiple people more knowledgeable than me about coding have said, would be trivial to implement. This is an issue that would not have arisen had the operator of listeriabot operated in accordance with the rules every other bot operator has to follow. If you choose to base your workflow on a bot that operates outside of policy then that's a risk you choose to take. Thryduulf (talk) 14:16, 12 April 2020 (UTC)[reply]
@Thryduulf: As you say, there are multiple people more knowledgeable than you about coding. I suspect that such a change would not be as trivial to implement as some people may have airily pronounced above (without any diff to back up their assertions), because the lists on the Listeria pages are not being generated from SQL queries that could be extended, but directly from WDQS via a SPARQL query -- and moreover, not a SPARQL query specified by the bot creator, but whatever SPARQL query the user creating the list chooses to submit. Converting the results of that query (plus a couple of helpful macros) straight to wikitext is pretty straightforward, and makes for a nice clean straightforwardly-coded bot. Having to fish around in those results to see whether any of the columns returned are for a file, then to run an SQL query to check each file for a list that may be several hundred entries long, is a significant coding overhead, adding unnecessary messiness and complication and non-negligible additional load on the servers, making it more likely that a particular update may as a result fail for a given list, and making the cause of such failures harder to diagnose.
I submit that that is not worth the candle for an issue which is unintended and transient, and whose underlying cause is already being taken care of by other bots. In fact, it sounds as if, by surfacing these filename collisions, which are then fixed, the bot in its present form may actually be doing some useful service.
In recent years Magnus has been extraordinarily productive and creative, constantly producing and refining a non-stop stream of tools that are now underpinning a vast quantity of projects and work across Wikipedias, Commons, and Wikidata, as well as personally maintaining Mix'n'match, which has now reached 3,500 different catalogues of identifiers, all being actively matched and cross-referenced. Listeria works. It does what it is meant to, displaying the results of a SPARQL query on a wiki page. If in the process it exposes some bad en-wiki filenames so that they can then be fixed, then so much the better. Given how much Magnus is achieving with his time at the moment, and how many new sorts of work he is making possible, and how useful Listeria is as it currently is, I would not seek to waste one moment of his precious limited time on an issue that is transient, is exposing fixes to filenames that need to be made anyway, and which according to you already gets dealt with, when there is so much more he could be achieving doing other things. Jheald (talk) 15:14, 12 April 2020 (UTC)[reply]
Your entire argument is predicated on the problems the bot is causing being trivial. They are not. Copyvios, even transiently, are a big effing deal. A bot editing pages it is not authorised to edit is a big effing deal. If Magnus is not able to properly maintain the bot then he shouldn't be operating it, exactly the same as any other bot operator. Why they are not able to properly maintain it is irrelevant. Nobody's time is too valuable to edit in accordance with consensus, and if they think it isn't then they should not be editing at all. Good contributions elsewhere do not, and cannot, justify disruption. Claiming that these errors should be allowed to stand because they cause work for other editors fixing problems that may (or may not) need to be fixed by others is disrupting Wikipedia to prove a point. Thryduulf (talk) 15:43, 12 April 2020 (UTC)[reply]
Off-topic discussion. Relevant discussion moved below this close
Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:08, 12 April 2020 (UTC)[reply]
Are you seriously trying to argue that a bot making copyright infringing edits to the encyclopaedia is not a problem with the bot!? I'm sorry but that's the most ridiculous argument I've seen so far! 18:43, 12 April 2020 (UTC) - — Preceding unsigned comment added by Thryduulf (talkcontribs) 19:43, 12 April 2020 (UTC)[reply]
I'm seriously suggesting that you don't know what you're talking about. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:49, 12 April 2020 (UTC)[reply]
This is a blatant personal attack which must lead to a block, but unfortunately Pigsonthewing is unblockable on the English Wikipedia.--Ymblanter (talk) 19:22, 12 April 2020 (UTC)[reply]
That's not the first time you've falsely accused me of making a personal attack; I suggest you desist. To continue to do so would betray a lack of understanding of what constitutes a personal attack. In other words, it would demonstrate that you don't know what you're talking about. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)[reply]
I think it is pretty clear that you continue your uncivil behavior only because I said several days ago that I am involved with you and will not block you on the English Wikipedia, and you clearly expect that your wikifriends will cover this behavior, as it happened multiple times in the past. It does not make it more civil though.--Ymblanter (talk) 12:11, 14 April 2020 (UTC)[reply]
I think it is clear who is being uncivil; and making personal attacks; and bringing a grudge from elsewhere; and is not here to contribute. And it is not me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:29, 14 April 2020 (UTC)[reply]
Let's all calm down a bit.
Andy: could you please elaborate on how the issues are not "being caused by the bot"? As I see it, there is a very clear link between the bot making edits and non-free content appearing in a manner prohibited by policy. Mdaniels5757 (talk) 19:50, 12 April 2020 (UTC)[reply]
As I said; this has been explained previously. The bot makes a good-faith edit, at the request of a random editor, to link to a free image that exists on Commons. This fails, because this project stupidly allows a different, non-free, image to exist, using the same file name. (Notwithstanding that in one case a non-free image had apparently been placed on Commons by someone else; you can no more blame the bot operator or requesting editor for that, than you would a human who inadvertently included it on a page.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:04, 14 April 2020 (UTC)[reply]
@Thryduulf: Wikipedia has a reputation for being sparing with non-free content, which is hard-won and has been achieved with some pain, but is very very valuable and useful. But let's dial the rhetoric back to reality, because over-dramatisation really isn't helpful. In terms of legal risk construed narrowly, such as might "cause the encyclopaedia or its editors [to] suffer real-world harm", that's quite the extreme bit of knicker-wringing above, and we would do better to keep the discussion here rather more grounded in reality. In narrow legal terms, most of what the bot might inadvertently add due to the filename confusion would probably be protected as fair use or fair dealing anyway, even though it would fall outwith the narrower limits of policy. Moreover, because of the notice-and-takedown protections granted to content hosts, legally the clock only starts running once a notice has been received, if the site is slow on acting on it. That is unlikely, given (i) that, according to you, procedures are in place to fix the name conflicts as soon as they become visible; and (ii) even if they had slipped through, an insta-fix (renaming the file) would be available as soon as any notice was received to bring it to awareness. So legal consequences are far-fetched. That leaves the potential for reputational consequences. I in no way dismiss the significance of such consequences just for being reputational -- I think most of us would agree that Wikipedia's overall reputation is orders of magnitude more important than whether we happen to be upheld or not in a single legal case. But I think there is room to differ on whether allowing Listeria to surface a handful of files transiently for a few days until an organised procedure fixes the problem by renaming them (a renaming which I think we agree is desirable anyway in its own right) actually has any reputational significance. I frankly don't see it. Others might differ, but if we do have an organised procedure which is identifying and fixing these filename collisions that have the potential to confuse users, even at the cost of those files as a result sometimes being made visible for a few days where they shouldn't, then to me I think that's actually quite positive to Wikipedia's reputation: we have identified a problem of potential filename confusion, and have an active and effective system in place that helps identify cases and deal with them. To me, that actually seems reputation-positive, rather than reputation-negative. Looked at cold, I don't think there is either a legal or a reputational risk here, so long as the filename issues being exposed are indeed getting rapidly dealt with.
Further on that point, it's increasingly clear that when Listeria surfaces one of these filename collisions, that is actually helpful -- because in general these collisions are rather hard to find, whereas the intersection of files that are non-free and files that are on Listeria pages is rather easy to identify, with a cause that is very clear, making this rather a useful way for them to be surfaced, so that they can then be fixed (which is something we want to do). So it would be helpful if bots that auto-remove non-free content would ignore Listeria pages, so that files with this very specific issue can be left in place, so that they can then be identified and picked up by the established procedure that is specifically appropriate for them. Shooting the messenger, or killing the canary in the coalmine, is not actually helpful if what we are wanting to do is to identify and fix these collisions.
Finally, it is worth noting that Listeria currently operates across 71 different wikis (keeping in all over 66,000 different lists actively updated), all from the same code. From a maintenance point of view it is not good design to make the code more complicated than it needs to be. It is not good design to make the bot mask filename problems rather than expose them, so they can be more readily identified and fixed. And, particularly when thinking about when a deep maintenance change may be required (such as recently when the wbterms table was retired), it is absolutely not good design to fragment that single piece of code into a multitude of different scripts, each specialised to a different wiki, that then all have to be updated separately. Such a change is not something to enter into lightly.
Luckily in this case no such change is actually necessary, because (as discussed above) in the present case the way the bot is surfacing filename issues is not just tolerable, it is actually useful, and should be retained. Jheald (talk) 23:41, 12 April 2020 (UTC)[reply]
That's a hell of a lot of words to say "Please can we knowingly violate the non-free content policy because the bot is useful and fixing it would be a lot of work?". The answer to that can only be "No. Complying with policy is not optional.". Your comment about the set of non-free files on Listeria lists being visible to bots is also interesting, because that requires bots to easily be able to distinguish free and non-free files - a task that those arguing for this policy exception claim is very complicated and/or impossible for Listeriabot. Which is it? You make grand noises about safe harbour, protection from legal harm, etc. but that's not the point at all - those only apply because we take reasonable steps to minimise the likelihood of copyright violations happening in the first place (that's the point of the NFCC), we can't abandon that and still claim protection. Finally, you say that the images appearing in the lists are probably fair use anyway - no, they aren't. A non-free image in a list cannot be being used for critical commentary or parody of the image. Thryduulf (talk) 00:39, 13 April 2020 (UTC)[reply]
@Thryduulf: WP:IAR: Understand the purpose of rules, and do what is best for the encyclopedia, rather than apply them blindly.
In this case, where the root problem is the name collision, the community has taken the view that it needs human intervention to hand-choose appropriate new names, so a bot can't fix the underlying problem.
Identifying images that are on Listeria pages and in the non-free category after Listeria has operated (and referring them for human intervention) is comparatively easy -- and, I am putting to you, the preferable option in any case, because then we fix the actual underlying problem. But it relies on Listeria having made the edit, so that the images are on the page, and can therefore be found by the SQL service as being on a Listeria page.
Changing Listeria so it doesn't add the image at all can't use this simple approach, would fragment the Listeria code with the consequences discussed above, and -- the key point -- is undesirable in its own terms because it doesn't end up with the underlying problem of the bad filenames getting fixed.
As to the narrow legal point, what you assert above is simply not the law. Unless you are facilitating piracy on the scale of something like The Pirate Bay, where assisting piracy is the very purpose of the site, the obligation laid on platforms by the law is to deal promptly with asserted copyright infringements as soon as the site is made aware of them by a DMCA notice or its equivalent. It is hugely to Wikipedia's credit, and to the huge benefit of our reputation, that we go way way beyond that, and as a result get very very few infringement notices. But we should look at this with clear eyes. Allowing a file (or at most a very small handful of files) to briefly surface in the wrong place, which we then rapidly fix by renaming the file, thereby definitively removing the possibility of any further confusion between the two files down the line, is a responsible course of action which is not going to damage WP's reputation, or in any way weaken the standing of the NFC policy. So long as the name collision is picked up quickly and then rapidly fixed, there is no more significance here than our practice, say, of leaving files briefly in place and in context while their appropriateness is reviewed at WP:FFD. So long as there is an efficient mechanism in place that is dealing with the name collisions once Listeria surfaces them, as you assure me there is, then Listeria's action is actually helping a useful process, and the prospect of legal or reputational harm by that process is non-existent. Asserting otherwise does not reflect reality. Jheald (talk) 08:19, 13 April 2020 (UTC)[reply]
Hat removed. @Headbomb: This is not a red-herring discussion. It directly pertains to what is the right way forward here: what are the actual costs and benefits of the different ways forward that might be pursued; and, indeed, whether the bot surfacing these shadow filename issues is actually a problem at all, or whether it may actually be helpful, as a beneficial element of fixing files with this issue. Those are very germane issues, worth wider discussion. Jheald (talk) 10:09, 13 April 2020 (UTC) [reply]
Sorry, IAR is only for uncommon situations where the encyclopaedia would definitely be improved. Non-free images in a list is not an improvement to the encyclopaedia, ignoring a rule on an ongoing basis is never acceptable (if the reason for wanting to do so is a good one then you will have no trouble getting consensus to change the rule in such a way that you don't have to ignore it), and it being easier to ignore the rule than comply with it are all reasons to say "no, you may not ignore this rule". Thryduulf (talk) 11:15, 13 April 2020 (UTC)[reply]
The fact that this non-issue was ignored for so long is telling us that it is not an issue. The underlying issue needs to be solved better, but it is already sufficiently taken care of so as not to be a real problem. I suspect the whole storm is not whipped up because of the non-free images issues, but is using that as an excuse to shut the bot down because some people do not agree with the source of its data. If we want to have a functioning wikipedia in 10 years time we need to be forward-looking. Looking forward means also not to close your eyes to the solution. Agathoclea (talk) 12:01, 13 April 2020 (UTC)[reply]
@Agathoclea: I can't speak for everyone of course, but I'm one of the most vocal in support of this bot's block; I'm also a strong supporter of Wikidata and of the benefits it provides. The reason I'm making a fuss now is that this is the first time I've been aware that the non-free files problem has existed. Unlike the bot controller, who has been aware since at least 2017. Thryduulf (talk) 12:15, 13 April 2020 (UTC)[reply]
"to say 'Please can we knowingly violate the non-free content policy..?'". That is not what is being said, and it is wrong and misleading of you to suggest that it is. What is being said is more like "If a bot very occasionally and inadvertently causes a thumbnail of a non-free image to be shown on a non-mainspace page, because this project stupidly allows file names that duplicate those of different images on Commons, please can we deal with that in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:16, 14 April 2020 (UTC)[reply]
You want the bot to be allowed to display a non-free image on a non-mainspace page. Displaying non-free images on non-mainspace pages is explicitly against the non-free content policy. You therefore want the bot to be allowed to violate the non-free content policy. You can try to weasel out of it by blaming others for not making fundamental changes to the core software, writing additional bots and/or taking other actions that mean this one bot wouldn't need to be fixed, but that does not alter the fundamental nature of your request. Thryduulf (talk) 14:22, 14 April 2020 (UTC)[reply]
I've just told you that your claim was wrong and misleading, and your response is to make another claim that is false and misleading? I want nothing of the kind. I want - as I just said - us to resolve such rare issues in a sensible manner, by renaming the errant image, rather than damaging the project by hysterically over-reacting and blocking the bot. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:35, 14 April 2020 (UTC)[reply]
No, you asserted that my claim was false and misleading. I simply demonstrated that it was accurate. It's true you only want to be allowed to do it temporarily, but that doesn't change that you want to do it at all. Yes you want the "errant" image renamed, but that request is being discussed at WT:NFC and is unrelated to what is being requested here. What is being requested here is permission to display the non-free image until it is renamed to something that does not shadow a free image from Commons. Doing that would require an exemption from the policy prohibiting the display of non-free images outside mainspace. Preventing images on en.wp and Commons having the same file name (the only way that shadowing can be prevented) would require a change to MediaWiki software - something that is completely outside the power of en.wp to implement and so irrelevant for the purposes of this discussion. Thryduulf (talk) 15:42, 14 April 2020 (UTC)[reply]
Note The above discussion has nothing to do with possible forks of ListeriaBot, do not re-open. Move it to WP:VP if you have to. Headbomb {t · c · p · b} 15:33, 13 April 2020 (UTC)[reply]
@Headbomb: It's not for you to close a discussion you are party to.
And yes, it is absolutely appropriate to look at the negative consequences that could arise from forking ListeriaBot, as well as the practicality of the suggestion. That's why we have discussions here, so people can raise and work through exactly such points.
So, reverted. Jheald (talk) 15:36, 13 April 2020 (UTC)[reply]
I am not a "party to this discussion", I'm a BAG member and as a BAG member, I can tell you that nothing above is pertinent to a forking discussion, nor do they tackle the "costs and benefits" of forking. There is basically three paths forward a) fixing ListeriaBot, which requires Magnus to communicate and update their code b) forking it so someone else can update the code and run it instead of Magnus c) fixing all filename collisions and preventing them from happening in the future. A) would be quick if Magnus gets around to fixing their code and let us know they've done so. B) Can be quick, someone just needs to fork the publicly-available code and make a WP:BFRA C) is the least-likely to occur, given it involves identifying and fixing all collisions, and preventing them from happening in the future. c) is not impossible to do, but it's a much bigger and slower effort than either a) or b) solutions. Headbomb {t · c · p · b} 15:34, 14 April 2020 (UTC)[reply]
You being a member of BAG gives you no authority to manage the discussion here as you have attempted to do, including closing discussions immediately after you have participated in them. You have just closed another section - [added] to which you were very much a party - where someone has falsely accused me of trolling, after I pointed out they were making provably incorrect claims, leaving me no avenue to refute that accusation. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:46, 14 April 2020 (UTC)[reply]
Refute it elsewhere. The conversation was going nowhere and is cluttering up the more-valid discussions about the bot and its activities. Primefac (talk) 16:53, 14 April 2020 (UTC)[reply]
@Headbomb: Other paths might include:
d) recognising that these NFC glitches are few and far between, get flagged for clearance pretty quickly already by Green C bot, and are essentially harmless.
e) creating a script specifically to identify and mop up these occasional glitches, perhaps along the lines suggested here, to run either periodically, or soon after each Listeria edit
f) some other approach that some experienced bot author may suggest here.
It is valuable to flag and recognise that complicating or forking Listeria are not zero-cost options. Listeria has infrastructure-level significance across 70 wikis, allowing any arbitrary SPARQL query to immediately be presented as a fully-formatted list on the wiki, that will then be kept updated. This currently supports 65,000 live pages across those wikis, including wide use for project management, wide use for individual curation projects by individual users, and wide use to present the results of collaborative curations with external partner institutions. So this is code that it is important to keep as clean and maintainable and unified as possible. It is not code to mess around with lightly, not code to add complexity to unnecessarily; and it is code to avoid fragmenting as far as we possibly can, so that any fixes or updates immediately apply everywhere, and so that it continues to reliably work in the same way with the same syntax wherever it is used, so that the pages calling Listeria can continue to be portable between one wiki and another immediately without change.
Do these considerations trump all others? Not necessarily. But they are not small things either. Jheald (talk) 16:57, 14 April 2020 (UTC)[reply]
Those are all variations of c). Headbomb {t · c · p · b} 17:07, 14 April 2020 (UTC)[reply]
Not really. (d) suggests that this whole issue is a de minimis trifle, and not worth further worrying about; (e) suggests leaving Listeria unchanged, but identifying the issues case-by-case as they come up on the fly, which is rather different to up-front mass-fixing all filename collisions, and could be rather smaller and easier than (a) or (b); (f) might be something entirely different again. Jheald (talk) 17:22, 14 April 2020 (UTC)[reply]
That was going to be my next suggestion - to extract out the code that does the wiki-list updates and run that via another bot. I have no experience in this area, but I'm sure it can be done. Advanced thanks to anyone and everyone who can help with this. Lugnuts Fire Walk with Me 11:56, 12 April 2020 (UTC)[reply]
This is just a suggestion. I think the tool can be placed in the Maintainer needed section on Phab. Adithyak1997 (talk) 12:09, 12 April 2020 (UTC)[reply]
If someone wishes to fork out the bot and put through a new BRFA, please do. Having an active bot operator and a clearly-defined bot task is much preferred over the current situation. Primefac (talk) 13:28, 12 April 2020 (UTC)[reply]

Break

@Anomie, Jarry1250, TheSandDoctor, and Dreamy Jazz: sorry to bother you guys, I've pinged you since you four are the only active bot operators with experience in PHP that I know of and I was wondering if any of you would like to help out forking the now blocked Listeriabot. The problem is that it can accidentally display non-free images where fair use doesn't apply, when a non-free file on English Wikipedia shadows a file on Commons that the bot is trying to include in a list. Source code is available at BitBucket. My suggested implementation would be getting the intersection of Category:Wikipedia files that shadow a file on Wikimedia Commons and Category:All non-free media at the beginning of the run, and then checking that each added image is not on that list, simply omitting it if it is. Thanks for considering it! ‑‑Trialpears (talk) 22:47, 14 April 2020 (UTC)[reply]
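A minimal pywikibot sketch of the category-intersection check suggested above; the function names, the blocklist variable, and the 120px rendering are illustrative assumptions and not part of Listeria's actual code.
<syntaxhighlight lang="python">
# Illustrative only: build the blocklist once per run, then consult it
# before emitting an image cell. All names here are invented for the example.
import pywikibot

site = pywikibot.Site('en', 'wikipedia')

def shadowed_nonfree_titles():
    """File titles that both shadow a Commons file and are tagged non-free."""
    shadow_cat = pywikibot.Category(
        site, 'Category:Wikipedia files that shadow a file on Wikimedia Commons')
    nonfree_cat = pywikibot.Category(site, 'Category:All non-free media')
    shadowed = {p.title() for p in shadow_cat.articles(namespaces=6)}
    nonfree = {p.title() for p in nonfree_cat.articles(namespaces=6)}
    return shadowed & nonfree

BLOCKLIST = shadowed_nonfree_titles()  # computed once at the start of the run

def image_cell(filename):
    """Wikitext for a list cell; omit the image if it is on the blocklist."""
    title = filename if filename.startswith('File:') else 'File:' + filename
    if title in BLOCKLIST:
        return ''  # skip rather than embed a shadowed non-free file
    return '[[' + title + '|120px]]'
</syntaxhighlight>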
Trialpears, thanks for your efforts to find a new operator. Best, Barkeep49 (talk) 01:57, 15 April 2020 (UTC)[reply]
I have only programmed bots using python (pywikibot), but I'll give the source code a look over. Dreamy Jazz 🎷 talk to me | my contributions 08:36, 15 April 2020 (UTC)[reply]
I think using Category:Wikipedia files that shadow a file on Wikimedia Commons may be unreliable. It is populated by a bot that runs weekly, so a newly shadowed image could go undetected for up to a week if this category is relied on. Therefore, I think it would be best to have the bot check for the non-free category on the local version (if a file with the same name exists on the wiki where the image will be posted, check if it is in Category:All non-free media (or similar for other wikis)). Dreamy Jazz 🎷 talk to me | my contributions 09:03, 15 April 2020 (UTC)[reply]
Perhaps this issue can be mitigated by making the GreenC bot task a daily one ... GreenC, do you think that can be done? Jo-Jo Eumerus (talk) 09:08, 15 April 2020 (UTC)[reply]
Yes and multiple times a day is also available. -- GreenC 13:01, 15 April 2020 (UTC)[reply]
@GreenC, Jo-Jo Eumerus, and Dreamy Jazz: If we catch shadowed files more quickly by Green C bot, and perhaps auto-move them (even as a temporary measure while adding another tracking category for later human checking), would that solve the specific issue with this bot and mean that it could start running again as-is? It doesn't solve the wider problem I was trying to address with the RfC, but for this specific issue? Thanks. Mike Peel (talk) 17:33, 15 April 2020 (UTC)[reply]
Well, not as it stands, because AFAIK GreenCbot only tags the files, it wouldn't stop Listeria adding them to the lists. Unless, as you say, GreenCbot could be rewritten to move them as it finds them - though that would need another BRFA (which would probably be very quick). Black Kite (talk) 17:53, 15 April 2020 (UTC)[reply]
@Black Kite: I can trivially write a bot that would do the moves (as I already offered at the RfC), and would be happy to do so. Thanks. Mike Peel (talk) 17:59, 15 April 2020 (UTC)[reply]
So you'd have to have a suite that runs GreenC bot -> move bot -> Listeria. You couldn't leave time between them (and even if they ran consecutively then you still might get the odd edge case). You know, this is all great, but it does make me think that the only way of properly fixing the issue is to fix Listeria, especially as the fix is quite basic. Black Kite (talk) 18:08, 15 April 2020 (UTC)[reply]
@Mike Peel: No; moving the file is not always the correct solution, and how would the bot know what new name to use? Increasing the rate at which GreenC bot tags the files (as discussed by GreenC above) makes it easier to resolve the problems before Listeriabot trips up on them, though. Jo-Jo Eumerus (talk) 18:27, 15 April 2020 (UTC)[reply]
I think a variation of this could be a possible solution. First of all having a separate bot to temporarily move them is not a good idea. GreenC is active and if this is to be implemented GreenC bot should do it. That avoids most of the timing issue. Secondly if it is to move files it should just append something like (local) and put a template on the description page to indicate that someone should review the move. If that is done I think this is a fine solution since finding someone to fork the bot turns out to be quite difficult. ‑‑Trialpears (talk) 18:58, 15 April 2020 (UTC)[reply]
@Black Kite and Jo-Jo Eumerus: So we move the file and add it to a category for human checking to check that it was the correct solution (and to either approve it if it was correct, or fix it if not)? Doing it immediately via GreenC bot would be optimal, but I could run a move script via pi bot as often as you like, with the caveat that the more often it runs the more server resources it uses. I code pi bot using python/pywikibot, so I'm not sure that's compatible with GreenC bot's code directly. The only solution to 'fix Listeria' that I've seen is for it to run an additional check for *every* image it includes *every* time it runs, which involves a hell of a lot more server resources than fixing the edge cases. Thanks. Mike Peel (talk) 19:03, 15 April 2020 (UTC)[reply]
GreenC bot can move pages trivially, if required. There is another related bot task, Wikipedia:Bot_requests#Follow_up_task_for_files_tagged_Shadows_Commons_by_GreenC_bot_job_10, by User:Philroc, but I don't think they took it to BRFA yet. It is complicated because what happens is someone uploads a file to Commons, it triggers a shadow match by GreenC bot, then Commons deletes it as a copyvio and the shadow no longer exists. So Philroc's bot removes the shadow tag added by GreenC bot if there is no longer a shadow happening, i.e. the file no longer exists on Commons. If I recall, one reason GreenC bot is running less than daily is to give time for the shadowing to sort itself out, for Commons and Enwiki to resolve copyvios and deletions before declaring a shadow exists. -- GreenC 19:36, 15 April 2020 (UTC)[reply]
It's still very much a WP:CONTEXTBOT situation, there is no simple rename pattern and the ones I use are something I make up on the fly. An automatic file rename is not really the way to go here. Jo-Jo Eumerus (talk) 19:43, 15 April 2020 (UTC)[reply]
@Jo-Jo Eumerus: That's new to me, and seems crazy. However, the examples don't seem to apply to this situation; I'm just proposing a bot that moves files from one maintenance category to another here, and they would still need human intervention (the RfC is wider, but that's not what we're talking about here). Thanks. Mike Peel (talk) 20:09, 15 April 2020 (UTC)[reply]
As a practical demonstration, [2] was by bot [3]. Thanks. Mike Peel (talk) 21:11, 15 April 2020 (UTC)[reply]

If we are to pursue the route of GreenC bot automatically moving shadowed files to a temporary file name I would suggest the following workflow: GreenC bot starts with appending (temporary) to the file name to temporarily solve the shadow problem. GreenC bot then adds the following template based on ((Shadows commons)) and associated tracking category to the description to explain the issue.

Editors continue to review these files as normal but without a name conflict in the meantime. Do you think this would be a workable system, Jo-Jo Eumerus? If so, I think we should ask other people who work with shadowed files whether they see any issues and then implement it. ‑‑Trialpears (talk) 22:05, 16 April 2020 (UTC)[reply]
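If that route were taken, a rough pywikibot sketch of the move-and-tag step might look like the following; the "(temporary)" suffix, the review template name, the edit summaries, and the decision to suppress the redirect are assumptions for illustration, not an agreed design.
<syntaxhighlight lang="python">
# Illustrative sketch only; the template name and summaries are invented.
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
REVIEW_TAG = '{{Shadows Commons/temporary}}\n'  # hypothetical review template

def move_and_tag(old_title):
    """Move a shadowing local file to a temporary name and tag it for review."""
    page = pywikibot.FilePage(site, old_title)
    name = page.title(with_ns=False)
    base, _, ext = name.rpartition('.')
    new_title = 'File:%s (temporary).%s' % (base, ext)
    # Suppress the redirect so the old title no longer shadows the Commons file.
    page.move(new_title,
              reason='Temporary rename of local file shadowing a Commons file',
              noredirect=True)
    moved = pywikibot.FilePage(site, new_title)
    moved.text = REVIEW_TAG + moved.text
    moved.save(summary='Tag temporarily renamed shadowing file for human review')
</syntaxhighlight>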

@Trialpears: Honestly, I think this would be a waste of effort. Quite aside from the fact that there is no universal pattern for new file names - meaning that one would have to re-rename the file in many if not most instances - file moving is often not the correct solution at all. Jo-Jo Eumerus (talk) 08:50, 17 April 2020 (UTC)[reply]

I think I'll leave this to a bot operator who is running PHP bots already. If no one wants to take up the task, I can, but it might be a while before I have working code. I'll probably do a rewrite in Python using pywikibot (my experience with PHP is limited to websites). I do have an idea for fixing this in the source. In "shared.inc", around line 958 in the method "renderCell", add a call to a function that checks whether the file stored in the property is shadowed. This new function loads the properties for the filename on the wiki the bot is currently processing and checks whether the page ID for that file is zero, i.e. whether the file exists locally. If the page ID is zero, the image is not shadowed. If the page ID is not zero (the local file exists), the file is shadowed; then check whether the shadowed image is in Category:All non-free media, and if it is, don't add the image; otherwise, still add it. Anyone wanting to run a fork of this bot will need to fork / download both the listeria and magnustools repos from BitBucket, as listeria uses magnustools. Dreamy Jazz 🎷 talk to me | my contributions 11:03, 15 April 2020 (UTC)[reply]
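A hedged pywikibot sketch of that per-image check; the function name and the choice to drop the image silently are assumptions, and the actual fix described above would live in the PHP renderCell method rather than in Python.
<syntaxhighlight lang="python">
# Illustrative only: mirrors the logic described above in Python/pywikibot.
import pywikibot

def should_skip_image(site, filename):
    """True if a local copy of the file exists (shadowing) and is non-free."""
    page = pywikibot.FilePage(site, filename)
    if not page.exists():               # page ID zero: no local page, not shadowed
        return False
    local_cats = {cat.title() for cat in page.categories()}
    return 'Category:All non-free media' in local_cats

site = pywikibot.Site('en', 'wikipedia')
if should_skip_image(site, 'File:Example.jpg'):
    cell = ''                           # drop the image from the generated list
else:
    cell = '[[File:Example.jpg|120px]]'
</syntaxhighlight>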
It's probably appropriate to add a ':c:' to replace the file with a link to the file on Commons whenever a Commons file is shadowed, without the check of Category:All non-free media, which is likely to be more expensive, and more likely to change from wiki to wiki. Jheald (talk) 11:24, 15 April 2020 (UTC)[reply]
It's a shame that it's not possible to use something like [[c:File:blah.jpg]] to always use the Commons file (it seems that just results in a link to the file). Thanks. Mike Peel (talk) 17:35, 15 April 2020 (UTC)[reply]
That would be useful but it's not something that can happen without developer input, so it needs to be requested at Phabricator (if it hasn't been already, if it has a link from here would be useful). It's also not something that is likely to happen with any great rapidity so it's almost certainly going to be quicker and easier to just fix Listeriabot. Thryduulf (talk) 19:53, 15 April 2020 (UTC)[reply]
@Thryduulf: I agree that this solution would need mediawiki developer input. I've proposed my preferred solution above - I still don't see any quick or easy solution that lets us fix Listeriabot without simultaneously solving the bigger issue of non-free files shadowing free files. Mike Peel (talk) 20:15, 15 April 2020 (UTC)[reply]
Even if you think that Listeria checking for non-free isn't a good idea (I'd disagree, but whatever), a trivial way of fixing Listeria so it could start working again would be for it to insert a link to the image in the list, rather than the image itself. Black Kite (talk) 21:09, 15 April 2020 (UTC)[reply]
I am now working on a rewrite in python. Dreamy Jazz 🎷 talk to me | my contributions 22:22, 15 April 2020 (UTC)[reply]
However, this rewrite will take some time (probably long enough that the problems here are fixed). Dreamy Jazz 🎷 talk to me | my contributions 22:54, 15 April 2020 (UTC)[reply]
The rewrite is going faster than expected. I may be done in the next few days. I'll file a BRFA when I'm done if the bot has not been unblocked yet / the issues are fixed. Dreamy Jazz 🎷 talk to me | my contributions 23:06, 16 April 2020 (UTC)[reply]
There can be a local description page without a local file (for example File:Australia satellite plane.jpg, page ID 1258205), so checking the page ID alone is not enough - https://en.wikipedia.org/w/api.php?action=query&titles=File:Australia_satellite_plane.jpg&prop=imageinfo returns "imagerepository": "shared", which would be "local" for a file that actually exists on Wikipedia. Peter James (talk) 14:24, 16 April 2020 (UTC)[reply]
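A small sketch of that API check, using plain requests (the same query can be made through pywikibot); the function name and the formatversion choice are just for the example.
<syntaxhighlight lang="python">
# Illustrative only: distinguishes local files from Commons files via the
# imageinfo "imagerepository" field rather than the page ID.
import requests

API = 'https://en.wikipedia.org/w/api.php'

def image_repository(file_title):
    """Return 'local', 'shared', or '' (missing) for a file title."""
    params = {
        'action': 'query',
        'titles': file_title,
        'prop': 'imageinfo',
        'format': 'json',
        'formatversion': 2,
    }
    page = requests.get(API, params=params).json()['query']['pages'][0]
    return page.get('imagerepository', '')

# Per the example above, this should print 'shared': the description page
# exists locally but the file itself lives on Commons.
print(image_repository('File:Australia satellite plane.jpg'))
</syntaxhighlight>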

"Inactivity" of the operator

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Split from #Forking the bot above.

"the operator is inactive" Maybe you could get a clue about what your fellow volunteers contribute, before you disparage them with such ignorance? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:04, 12 April 2020 (UTC)[reply]
@Pigsonthewing: He could be sweating like a galley slave elsewhere, but as far as en.wp is concerned, he hasn't been active since (early) February. Which means: as far as leaving a talk-page message in the traditional fashion goes, the likelihood of receiving a reply is receding rather than improving. HTH. ——SN54129 18:40, 12 April 2020 (UTC)[reply]
But that wasn't the claim made. Regarding your latter point, have you looked at his talk page? It appears that not one of the people loudly complaining about the bot has taken heed of the guidance there. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:47, 12 April 2020 (UTC)[reply]
Unfortunately, not many editors here know (or probably care!) about bitbucket; why should they? After all, en.wp has plenty of ways and places itself for on-wiki communication. Specifically, saying Beats spreading [messages] over half a dozen talk pages is slightly disingenuous: there's only one talk page he needs to worry about, and it's that one. ——SN54129 19:04, 12 April 2020 (UTC)[reply]
@Pigsonthewing:, it has been a longstanding principle of bot operation that enwiki users don't need to register elsewhere to take their concerns to a bot operator. Magnus hasn't been active on enwiki in 2 months. While the threshold of what exactly is "inactivity" will differ from person to person, saying that Magnus has been inactive in the past two months is hardly "disparaging them" or being "ignorant". So instead of complaining here that enwiki editors prefer to keep enwiki issues on enwiki, you could contact Magnus on BitBucket yourself if you think this will lead to a speedier resolution. Headbomb {t · c · p · b} 00:02, 13 April 2020 (UTC)[reply]
((citation needed)) And don't attempt to close discussions immediately after posting to them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:13, 13 April 2020 (UTC)[reply]
Citation: WP:BOTCOMM: "Bot operators should take care in the design of communications, and ensure that they will be able to meet any inquiries resulting from the bot's operation cordially, promptly, and appropriately. This is a condition of operation of bots in general. At a minimum, the operator should ensure that other users will be willing and able to address any messages left in this way if they cannot be sure to do so themselves." WP:BOTACC: "All policies apply to a bot account in the same way as to any other user account." Thryduulf (talk) 12:26, 13 April 2020 (UTC)[reply]
Those are indeed citations. Just not for the claim that was made. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:00, 13 April 2020 (UTC)[reply]
Those citations support the claims immediately preceding your request for citations. If you were requesting citations for something else then you need to actually specify what that something else is, we cannot read your mind. Thryduulf (talk) 20:05, 13 April 2020 (UTC)[reply]
"Those citations support the claims immediately preceding your request for citations" They do not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)[reply]
Were he the operator of just one bot, on just one project, your point might have a shred of validity. As he is not, it does not. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 13 April 2020 (UTC)[reply]
Incorrect, the same rules apply to everybody: if you want to operate a bot on the English Wikipedia you must be available to respond to issues on the English Wikipedia. What the operator does or does not do on other projects is irrelevant. If an operator is unable or unwilling to deal with issues related to their bot on the English Wikipedia then they will have their operator privileges for the English Wikipedia withdrawn, regardless of why they are unable or unwilling to follow basic policy. Thryduulf (talk) 12:20, 13 April 2020 (UTC)[reply]
Poppycock; try reading the post I was replying to. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:24, 13 April 2020 (UTC)[reply]
I read, and have re-read, the post you were replying to. There is only one talk page he has to worry about regarding the English Wikipedia. If he has to pay attention to other talk pages for other business that's his choice, but it doesn't make either my or Serial Number 54129's posts incorrect. If there is too much for him to keep track of then he needs to either stop something or hand it over to someone else who can resolve the issues. Thryduulf (talk) 20:05, 13 April 2020 (UTC)[reply]
I note that the claim you now make is not the claim to which I replied; both 54129's and your earlier post are incorrect. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:52, 14 April 2020 (UTC)[reply]
Andy, I am making the same claim in both posts just using different language because you apparently misinterpreted it the first time. Both make the same claim that I understand SN54129 was making. Additionally your comments in these discussions are getting increasingly towards a style of "I'm disagree with something you said, but I'm not going to tell you what it was or why I disagree with it, because I'm right and you are wrong." This is not how to resolve a dispute. Thryduulf (talk) 14:16, 14 April 2020 (UTC)[reply]
You are indeed making the same claim more than once. However it is different to the claim made by 54129, which you wrongly said I was incorrect to describe as not valid. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)[reply]
On the assumption you've indented correctly, yes, you were replying to me. Incorrectly. As, I repeat, it does not matter how many bots he runs or where he does so: what he does on the English Wikipedia will be discussed on the English Wikipedia; there are literally no two ways about it. You are either accidentally or deliberately misunderstanding what (multiple) editors are telling you; since you are clearly competent, it can only be assumed that the latter applies. ——SN54129 14:28, 14 April 2020 (UTC)[reply]
I see that you, too, are now making a different claim to the one I originally described as invalid. The confusion, deliberate or otherwise, is not mine. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:56, 14 April 2020 (UTC)[reply]
I can't imagine why you feel the need to troll the discussion, but, here we are. ——SN54129 15:05, 14 April 2020 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Recent coronavirus-related publicity for the bot

See this Hacker News thread on a Listeriabot-created list, of Wikidata-notable people who have died from the Coronavirus. It would be helpful if we could get this issue resolved so that the bot can continue keeping the list up-to-date; it is of current interest and rapidly changing. —David Eppstein (talk) 19:20, 13 April 2020 (UTC)[reply]

David Eppstein, That list is on Wikidata and thus is not affected by the bot being blocked on English Wikipedia. The underlying issue doesn't really affect other sites since non-free content isn't used to the same extent (or at all) on other wikis. ‑‑Trialpears (talk) 20:01, 13 April 2020 (UTC)[reply]
Yes, I was just coming back to post a clarification about this. I'm skeptical that other Wikipedias don't use non-free content, but maybe one fix would be to migrate the various Listeriabot redlink lists to Wikidata? Or would that be unacceptable as the redlinks are targeted at a specific Wikipedia (the one for which they are redlinks)? —David Eppstein (talk) 20:04, 13 April 2020 (UTC)[reply]
I guess that would work as a temporary measure if something is important to have updated several times a week. I am quite confident that this will be resolved within a week and everything will be back to normal. ‑‑Trialpears (talk) 21:04, 13 April 2020 (UTC)[reply]

What a crock. Listeria does not link to any specific red links OTHER than to the red links on that very Wikipedia. Remember the same query will result in the same content. However, red links are different. Check out the COVID-19 deaths Listeria list on the Dutch Wikipedia. English Wikipedia is one of the exceedingly few Wikipedias that supports non-free imagery. This disaster is one of your own making. Thanks, GerardM (talk) 16:00, 14 April 2020 (UTC)[reply]

Gerard, this is factually incorrect. Among bigger projects, only Dutch, Spanish, and Swedish Wikipedias disallow fair use, and German is very restrictive. It is by far not the majority.--Ymblanter (talk) 16:20, 14 April 2020 (UTC)[reply]
And by self-selecting you change the argument. We are talking about Wikipedias, not the biggest Wikipedias. We are talking about the qualities of English Wikipedia; it argues how important copyright is without explaining WHY this modus operandi is actually effective and addresses a threat. It is easy enough to change the routine and NOT have local files take precedence. We could discuss a four percent error rate that is dismissed because it is not opportune; it would change the outcome of this argument while improving quality for our readers. It is argued that this will blow over; ask yourself what it is that Wikipedia stands for. It has a dogmatic establishment unable to reflect on its strengths and weaknesses when challenged. — Preceding unsigned comment added by GerardM (talkcontribs) 03:54, 15 April 2020 (UTC)[reply]
This is the English Wikipedia, and bots that operate on the English Wikipedia must follow English Wikpedia policies. What's done on the Dutch Wikipedia is irrelevant here, and this attitude that enwiki only has itself to blame, or whatever the above is supposed to be, is unproductive at best. Headbomb {t · c · p · b} 04:46, 15 April 2020 (UTC)[reply]

That list is not on Wikidata - find it on nl.wp

The Listeria list may be found here; it is maintained by ListeriaBot. Thanks, GerardM (talk) 10:59, 16 April 2020 (UTC)[reply]

How is that relevant to this discussion? Thryduulf (talk) 13:42, 16 April 2020 (UTC)[reply]

Hatting of comments

Off-topic (non-admin closure) ——SN54129 18:53, 14 April 2020 (UTC)[reply]
The following discussion has been closed. Please do not modify it.

My reply to Thryduulf that "Your entire argument is predicated on the problems being caused by the bot; they are not, as has been explained to you here and elsewhere, ad nauseam." is neither off topic nor irrelevant, but has been included in a section collapsed as such; including by an editor with whom I am in disagreement on that point, and by an involved admin who blocked the bot in question. Another section has been closed in a most partisan manner, by an editor whose earlier closure of that section was also reverted after he tried to use his closure to have the last word. It really is unacceptable for people on one side of a discussion to try to manage it in this manner. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:20, 14 April 2020 (UTC)[reply]

Oh wait, he didn't make the latest close to the latter; he just inserted his comment after it was closed. Can we all do that, or is that too only for people on one side of the discussion? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:41, 14 April 2020 (UTC)[reply]
The above is not off-topic, and its hatting by one of the people whose actions are described amply illustrates the point. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:18, 14 April 2020 (UTC)[reply]

It's been a week

It's been a week since ListeriaBot was (re-)blocked. Are we any closer to unblocking it or improving on it? As far as I can see, there are a few options:

  1. Unblock the bot so it can continue to operate as normal. Pro: lists here will continue to be updated as normal. Con: The file shadowing issue may reoccur.
  2. Move all local non-free files to a new filename. Pro: would solve this issue, as well as avoiding all other cases where a non-free file shadows a free file. Con: seems to be controversial.
  3. Move non-free files that are blocking free files to a new filename. Pro: This would solve the issue. Con: Still controversial, wouldn't solve the wider problem.
  4. Wait for a new bot operator to rewrite it so that it avoids non-free files. Pro: This would solve the issue. Con: No-one has demonstrated that this is technically possible, and it still wouldn't solve the wider shadowing problem.
  5. Continue blocking it. Pro: This would solve the issue. Con: This stops us using lists from Wikidata to improve our content.

How should we move forward here? Thanks. Mike Peel (talk) 22:28, 17 April 2020 (UTC)[reply]

You forgot #6: Have the bot operator fix the issue, but since that seems like a non-starter given their complete lack of participation in this discussion, at the moment we're somewhere around 4 and/or 5. I have zero inclination to unblock a bot that we know can and will break our policies if given the correct circumstances. Primefac (talk) 22:47, 17 April 2020 (UTC)[reply]
@Primefac: That fell under "No-one has demonstrated that this is technically possible" (or perhaps "technically sane" - unless an "additional check for *every* image it includes *every* time it runs" is seen as sane). Mike Peel (talk) 22:51, 17 April 2020 (UTC)[reply]
So far, we're at #5 by virtue of #1 being a no-go. Unblocking a malfunctioning bot should not happen. As for which of #2/3/4 happens next, that's mostly up to the community. But #4 is clearly possible, and it's a fairly trivial thing to implement for anyone with coding skills. Headbomb {t · c · p · b} 23:46, 17 April 2020 (UTC)[reply]
Not to mention the even more trivial change that could be implemented - even as a stop-gap - which would be for the bot to insert a link to the image, rather than the image itself, in the lists. Black Kite (talk) 00:13, 18 April 2020 (UTC)[reply]
Options 4, 5, and 6 are the only options that would ever achieve consensus. Unblocking the bot as-is is not an option. TonyBallioni (talk) 23:56, 17 April 2020 (UTC)[reply]
I'm with Primefac here - the operator hasn't participated in the discussion at all, despite not being globally inactive (their last global edit was this week ((caution-slow link))) - and they aren't asking for their bot to be unblocked - so there is no pressure here. No one should ever depend on someone else making a future edit, including edits they may make via their bot. — xaosflux Talk 02:25, 18 April 2020 (UTC)[reply]
For the reasons explained in the discussion at WT:NFC, not only are 2 and 3 controversial, it's debatable at best whether they would actually solve the issues, meaning that options 4-6 are the only viable ones. I fully agree also with Xaosflux that there is neither a rush nor a deadline. Thryduulf (talk) 10:29, 18 April 2020 (UTC)[reply]
I am aiming to do No. 4. I am rewriting the bot in python (as I have written bots in python before and I find it easier to deal with in a wiki context). I have managed to rewrite over half of it. I was going to submit a BRFA, but I thought it would be best to finish the code before I did. Dreamy Jazz 🎷 talk to me | my contributions 12:47, 18 April 2020 (UTC)[reply]


OK, I was just made aware of this discussion (I am running ListeriaBot).

@Magnus Manske: a lot of the above reflects frustration caused by a block that threw the baby out with the bathwater, and by people who want to ignore policy because the bot does a lot of good. That said, I see nothing above that claims Wikidata is evil, so just ignore the general orneriness and focus on the actual issues. Once the bot's logic gets updated, ask for the unblock, and things should get back to normal. I will point out that the bot's talk page is the proper place to raise issues about malfunctioning bots on Wikipedia, so if you don't monitor that page regularly, I suggest enabling email notifications for when someone leaves you a message on User talk:ListeriaBot. Also feel free to review the recently updated WP:BOTCOMM, which makes explicit what was implicit before. Headbomb {t · c · p · b} 17:39, 18 April 2020 (UTC)[reply]

Update: I added a safeguard that should prevent ListeriaBot from using images that are "local" (not from Commons). Of course, I can't test it here, because the bot is blocked, and it's hard to find an actual example (the ones listed as links above turn out to be not applicable). --Magnus Manske (talk) 22:45, 18 April 2020 (UTC)[reply]

Hyperactive bots

Moved from talk page so it gets archived properly.

I've said my piece at User talk:Mifter#This is ridiculous. Narky Blert (talk) 23:23, 31 March 2018 (UTC)[reply]

Question regarding wikiproject open tasks bots

Moved from talk page so it gets archived properly.

On the tea house I asked if there was a bot that could organize a wikiproject's backlog into an open tasks page. They directed me here. Can you help me? — Preceding unsigned comment added by Kaiser Kitkat (talkcontribs) 05:15, 17 December 2019 (UTC)[reply]

I don't know of any such bot, but perhaps they can tell you at WP:BOTREQ? Jo-Jo Eumerus (talk) 09:42, 17 December 2019 (UTC)[reply]
@Kaiser Kitkat: Try this external tool: find your WikiProject and then link to that page. --Izno (talk) 18:01, 17 December 2019 (UTC)[reply]

Help with finding a Wikipedia bot

Moved from talk page so it gets archived properly.

I've just got a new computer. There was a bot with which you could manually update all the WikiProjects. The URL was something like: wp10.toolserver.org etc etc. Where can I find it? Adamdaley (talk) 21:20, 30 March 2020 (UTC)[reply]

I don't know what you're looking for, but I'd recommend starting at WP:1.0. --Izno (talk) 23:01, 30 March 2020 (UTC)[reply]