The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Request Expired.

Operator: Hazard-SJ (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:33, Tuesday May 28, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: GitHub

Function overview: Fixing citation style

Links to relevant discussions (where appropriate): bot request

Edit period(s): Periodic

Estimated number of pages affected: thousands

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Currently, it only fixes Help:CS1_errors#Wikilink_embedded_in_URL_title, but I might implement more fixes as well (would separate approval be required in that case, since it's still fixing CS1 errors?).  Hazard-SJ  ✈  01:33, 28 May 2013 (UTC)[reply]

Discussion

  1. Does it fix parameters in ((citation)), or only those that start with "cite" or "web"? (noob question based on line 52)
  2. Does it fix parameters that contain templates (e.g. this edit), or should that be done in a separate bot task?
Thanks! GoingBatty (talk) 01:58, 28 May 2013 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. ·Add§hore· Talk To Me! 08:33, 28 May 2013 (UTC)[reply]
Trial complete, with 52 edits. Also, would I need separate approvals for the additional fixes I mentioned above?  Hazard-SJ  ✈  00:29, 29 May 2013 (UTC)[reply]
I don't know if it requires separate approvals or not, but if it is all the same type of errors, couldn't you just trial an assortment of errors, BAG folk? -68.107.136.227 (talk) 03:13, 29 May 2013 (UTC)[reply]
Seems useful, catches an error, although the explanation template is too much to read. -68.107.136.227 (talk) 22:43, 28 May 2013 (UTC)[reply]

These are surely not a good idea: [1] [2] [3]

This is also not ideal [4], a better outcome would be to shift the subscription flag to the citation's via parameter, e.g.: "SAUDI ARABIA,UNITED STATES : Saudi's Al Jouf University Chooses Cisco WebEx, Offered in the Kingdom in Partnership With STC, for E-Learning". Mena Report. 12 May 2011. Retrieved 23 August 2012 – via HighBeam Research.

Dragons flight (talk) 03:51, 29 May 2013 (UTC)[reply]

This commit fixes the subscription issue (and potentially others?). I'm still considering the ((lang)) issues. Should I just remove the template but leave the value of |2=, which is the text itself, or put the entire citation template in the ((lang)) template (I think this is unwise)? Otherwise I'd probably have to either just skip those errors, or set |2= to a null value before the citation template, leaving the actual value in the citation template. What do you suggest?  Hazard-SJ  ✈  07:02, 1 June 2013 (UTC)[reply]
I suggest converting ((lang)) to |language= and leaving the value of |2= in the |title= parameter (e.g. this edit), and then deleting any duplicated |language= parameter. GoingBatty (talk) 15:16, 1 June 2013 (UTC)[reply]
I have an uncommitted fix for it, but I'm trying to work out this issue (see bug 2700).  Hazard-SJ  ✈  01:24, 2 June 2013 (UTC)[reply]
Fixed in this commit.  Hazard-SJ  ✈  02:52, 2 June 2013 (UTC)[reply]
Is this ready for another trial? please ((ping)) me with your response :) ·addshore· talk to me! 09:12, 2 June 2013 (UTC)[reply]
@Addshore: No, I'm not yet ready, I'd like to improve the code and add a few more features first. I'll keep you updated.  Hazard-SJ  ✈  22:59, 3 June 2013 (UTC)[reply]
@Addshore: (diff) I added some more features and did some code clean-up, so I think I'm ready again. As a side note, these edits were accidentally made, though with an outdated version of the code.  Hazard-SJ  ✈  02:36, 7 June 2013 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. ·addshore· talk to me! 08:17, 7 June 2013 (UTC)[reply]
Trial complete. (edits)  Hazard-SJ  ✈  04:20, 11 June 2013 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. ·addshore· talk to me! 18:46, 16 June 2013 (UTC)[reply]
Hazard, I'm not a pywikipedia expert and don't pretend to be, but doesn't page.namespace() just return the namespace number? I bet I'm just missing something, but an explanation'd be great. Thanks! Theopolisme (talk) 04:35, 17 June 2013 (UTC)[reply]
You're correct about it returning the namespace number, but remember that in Python 0 evaluates to False and any other number evaluates to True, so, in other words, if the namespace number is not zero, the bot continues to the next page.  Hazard-SJ  ✈  05:54, 18 June 2013 (UTC)[reply]
*headdesk*, duh :p Theopolisme (talk) 14:18, 18 June 2013 (UTC)[reply]
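The truthiness check discussed above can be sketched as follows. This is a minimal illustration, not the bot's actual code: FakePage is a hypothetical stand-in for a pywikipedia page object, and article_pages is an invented helper name.

```python
class FakePage:
    """Hypothetical stand-in for a pywikipedia page object."""

    def __init__(self, title, ns):
        self.title = title
        self._ns = ns

    def namespace(self):
        # Returns the namespace number, as described in the thread.
        return self._ns


def article_pages(pages):
    """Return only the pages in namespace 0 (the article namespace)."""
    kept = []
    for page in pages:
        if page.namespace():  # any non-zero number is truthy, so skip it
            continue
        kept.append(page)
    return kept


pages = [FakePage("Georgia O'Keeffe", 0), FakePage("Talk:Georgia O'Keeffe", 1)]
print([p.title for p in article_pages(pages)])
```

Writing the test as `page.namespace() != 0` would behave the same and would arguably be easier to read at a glance.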
Trial complete. (edits); I haven't checked them all as yet (it somehow only did 49, though), but an obvious problem so far is the comments being copied from the archiveurl to the url where present.  Hazard-SJ  ✈  17:14, 22 June 2013 (UTC)[reply]
I only checked half the edits. Some of these edits may be garbage in, garbage out, but they look strange:
I have raised similar concerns on your talk page and disabled the task for good measure. Graham87 14:13, 23 June 2013 (UTC)[reply]

The ongoing errors are a bit of a worry for me, especially as this bot is running in the article space. I'm leaning towards denying this task. --Chris 13:20, 24 June 2013 (UTC)[reply]

I will leave out the part of the code that moves templates out of citation templates (maybe the language replacements are okay, since that's specifically hard-coded?). Also, I can code the bot to not make replacements in ref tags (as I did on a recently approved task). Also, I will have it check for |deadurl= as well.  Hazard-SJ  ✈  02:41, 25 June 2013 (UTC)[reply]
((BAGAssistanceNeeded)) In response to what I've said, may I have another trial please?  Hazard-SJ  ✈  00:35, 3 July 2013 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. ·addshore· talk to me! 12:01, 21 July 2013 (UTC)[reply]
Started, only did these so far, I'll try to finish when I get back online.  Hazard-SJ  ✈  04:02, 24 August 2013 (UTC)[reply]
In this edit, the bot added |archivedate=02 March 2012 - the leading zero is not needed. It would also be nice (but too much to ask for?) if the bot could have detected that the reference already had an archivedate and was just missing the pipe. GoingBatty (talk) 14:07, 24 August 2013 (UTC)[reply]
This change should strip the leading "0" if present. As for the pipe issue, there are hopefully few such cases, and I'm not sure whether such mistakes would all be in the same format, but if it's a frequent issue I could write a pattern to attempt it (while steering clear of false positives).  Hazard-SJ  ✈  02:45, 25 August 2013 (UTC)[reply]
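A leading-zero fix of the kind described above might look like the sketch below. It is an assumption about the approach, not the bot's actual change: the function name strip_leading_zero is invented, and only a day number at the very start of the value is touched, so that ISO-style dates are left alone.

```python
import re


def strip_leading_zero(date_text):
    """Drop a leading zero from a day number at the start of a date value."""
    # "^0(\d\b)" matches a zero followed by exactly one more digit at the
    # start of the string, e.g. "02 March 2012", but not "2012-03-02".
    return re.sub(r"^0(\d\b)", r"\1", date_text.strip())


print(strip_leading_zero("02 March 2012"))
```

Dates written month-first, such as "March 02, 2012", are deliberately not handled here; they would need a separate, equally narrow pattern.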
I thought this task was about removing links and templates from citation templates, but these edits are fixing archive link errors...:Jay8g [VTE] 18:07, 25 August 2013 (UTC)[reply]
This task focuses on errors related to citation templates, which include the archive links. As for the templates, it's better to make that part more specific (hard-coded for certain templates only, because, as seen above, handling arbitrary templates can cause many problems). IIRC, there isn't a problem with links. However, thanks for the mention; it caused me to look back at the code and notice that I had disabled the entire link/template section rather than just the non-specific part that pertains to templates (which, as I said, can be very troublesome). That has now been fixed in the code.  Hazard-SJ  ✈  05:15, 28 August 2013 (UTC)[reply]
I'll also be adding this, and as I mentioned before, possibly others in the future.  Hazard-SJ  ✈  06:07, 28 August 2013 (UTC)[reply]
OK, here it is.  Hazard-SJ  ✈  06:25, 28 August 2013 (UTC)[reply]
I just resumed the trial, and from the above code, got these. I however, stopped the trial to disable that part for now, so I can get some of the other parts involved.  Hazard-SJ  ✈  06:30, 28 August 2013 (UTC)[reply]
Trial complete. OK, continuing from above, I resumed with a trial that actually made 49 more edits. It would have actually been 50, had the attempt to edit Georgia O'Keeffe actually been successful. The attempt was:
- On January 10, 1977, [[President of the United States|President]] [[Gerald R. Ford]] presented O'Keeffe with the [[Presidential Medal of Freedom]], the highest honor awarded to American citizens.<ref>((cite web |archiveurl=http://web.archive.org/web/20071024122700/http://www.medaloffreedom.com/GeorgiaOKeefe.htm |title=Georgia O'Keeffe|archivedate=October 24, 2008 |accessdate=June 1, 2010))</ref> In 1985, she was awarded the [[National Medal of Arts]].
+ On January 10, 1977, [[President of the United States|President]] [[Gerald R. Ford]] presented O'Keeffe with the [[Presidential Medal of Freedom]], the highest honor awarded to American citizens.<ref>((cite web |archiveurl=http://web.archive.org/web/20071024122700/http://www.medaloffreedom.com/GeorgiaOKeefe.htm |title=Georgia O'Keeffe|archivedate=October 24, 2008 |accessdate=June 1, 2010|url=http://www.medaloffreedom.com/GeorgiaOKeefe.htm))</ref> In 1985, she was awarded the [[National Medal of Arts]].
which, according to my checks, failed because of a spam filter for medaloffreedom.com on MediaWiki:Spam-blacklist.
Error-wise, I picked up things like [9], [10], [11], and [12], all of which are a result of the bot not having a record of those language codes (well, at least one of them was invalid), and this, which the bot wouldn't have been able to correctly fix. As for the first issue, I'll fix it by, firstly, updating the list of languages the bot is aware of, and secondly, to avoid a repetition of this, the bot will no longer add ((#language:xx|en)), but will instead either leave the ((lang)) template or remove it if there's already a language parameter set. I hope this request is more promising now. Thanks,  Hazard-SJ  ✈  07:39, 28 August 2013 (UTC)[reply]
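As a rough illustration of the |url= fix attempted on Georgia O'Keeffe above, the original address can be read back out of a Wayback Machine |archiveurl=. This is a minimal sketch rather than the bot's actual code; the name original_url and the assumption of a plain 14-digit timestamp are illustrative only.

```python
import re

# Assumed shape: http(s)://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/\d{14}/(.+)$")


def original_url(archive_url):
    """Return the archived page's original URL, or None if the shape differs."""
    match = WAYBACK.match(archive_url.strip())
    return match.group(1) if match else None


print(original_url(
    "http://web.archive.org/web/20071024122700/"
    "http://www.medaloffreedom.com/GeorgiaOKeefe.htm"
))
```

Returning None for anything unexpected leaves those citations for manual review. As the failed edit shows, a recovered URL can still be rejected on save by the spam blacklist.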

Language

@GoingBatty: ((lang)) and |language= do different things. Please read up on the former, which should not be removed without thought. @Hazard-SJ: this page is difficult to read because of your garish sig. Please tone it down. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:58, 28 August 2013 (UTC)[reply]

@Pigsonthewing: - Template:Lang states "This template also includes a categorisation link when used by main namespace pages, therefore it should not be included inside a wikilink." Since the |title= parameter of ((cite web)) contains a wikilink, using the ((lang)) template within the |title= causes a visible categorisation error in the reference: see Compagnie des Transports Strasbourgeois for an example. Your thoughts on the best way to fix these errors would be appreciated. Thanks! GoingBatty (talk) 23:28, 28 August 2013 (UTC)[reply]
Correct.  Hazard SJ  08:17, 29 August 2013 (UTC)[reply]
@GoingBatty: Good: ((lang|fr|[[Zut alors!]])). Bad: [[((lang|fr|Zut alors!))]]. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:16, 29 August 2013 (UTC)[reply]
@Pigsonthewing: - Also Bad: <ref>((cite web|url=http://www.zutalors.fr|title=((lang|fr|Zut alors!))))</ref> — Preceding unsigned comment added by GoingBatty (talk • contribs) 17:20, 29 August 2013
@GoingBatty: How else do you suggest that we mark up the titles of non-English works, such that the emitted HTML complies with HTML and WCAG standards? (And where would be a better place to discuss this issue; which is probably drifting from relevance here?) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:30, 29 August 2013 (UTC)[reply]
@Pigsonthewing: What exactly is your issue with |language=?  Hazard SJ  02:18, 30 August 2013 (UTC)
@Hazard-SJ: None whatsoever. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:29, 30 August 2013 (UTC)[reply]
@Pigsonthewing: - I was suggesting that this bot be coded so it would change my example above to <ref>((cite web|url=http://www.zutalors.fr|title=Zut alors!|language=French))</ref>. If Hazard-SJ doesn't want to include that in the scope of this bot, then I agree we should stop discussing it here. GoingBatty (talk) 16:48, 31 August 2013 (UTC)[reply]
@GoingBatty: Yes; and my point is that that contains nothing that indicates that the phrase "Zut alors!" is not English. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:10, 31 August 2013 (UTC)[reply]
@Pigsonthewing: - The |language= parameter indicates that the reference text (and therefore probably the title too) are not English. I'm open to alternate suggestions that do not produce errors. GoingBatty (talk) 17:40, 31 August 2013 (UTC)[reply]
@GoingBatty: The use of |language= may suggest a probability that the title is in another language, but it does not guarantee it; and it does not indicate it in the emitted HTML, as does ((lang)), through the use of the appropriate HTML attribute, as described in the latter's documentation, to which I referred you earlier. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:58, 31 August 2013 (UTC)[reply]
@Pigsonthewing: - Could you please provide an alternate suggestion for Hazard-SJ for removing the errors generated by using ((lang)) in the |title= parameter? GoingBatty (talk) 18:17, 31 August 2013 (UTC)[reply]
@GoingBatty: What errors? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:21, 1 September 2013 (UTC)[reply]
@Pigsonthewing: - Sorry I haven't been able to explain this properly, so let me try again. Up above on June 1, I provided this edit, which fixed the Wikilink embedded in URL title error on reference 140 on the Death Magnetic article. Similarly, this edit by another editor fixed similar errors on references 21 and 42 on the Compagnie des Transports Strasbourgeois article. Both of these examples demonstrate that using ((lang)) in the |title= parameter of a citation template produces a visible error and categorizes the article in Category:Pages with citations having wikilinks embedded in URL titles, and that my suggestion to Hazard-SJ for fixing them is to remove ((lang)) and use |language= instead. GoingBatty (talk) 00:30, 2 September 2013 (UTC)[reply]

@GoingBatty: Thank you; I wasn't aware of you having attempted an explanation previously. ((lang)) does not emit or cause to be emitted a link in the text which it contains; the issue appears to be the emitted category. The solution would therefore seem to be one of: ask for that template to not emit a category when used in a reference; change the way in which it emits a category; have a sister template for use in references, which does not emit a category; or have the functionality embedded in the citation template core itself. The last is probably the most elegant solution. As I said above, |language= does not have the same functionality as ((lang)) and is not a workable alternative to it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:07, 2 September 2013 (UTC)[reply]

P.S. In the interim, the template could be commented out, allowing its later reinstatement. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:08, 2 September 2013 (UTC)[reply]

I seem to have missed out a lot here, but to put it simply, my bot has already had that feature for some time now. The current version works for both ((lang)) and ((lang-xx)) style templates, as should be seen from the trials. In that case, if there's anything I missed, please let me know.  Hazard SJ  01:29, 3 September 2013 (UTC)[reply]
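For readers following the disagreement, GoingBatty's proposed rewrite (unwrap ((lang)) from |title= and record the language in |language=) can be sketched as below. This is only an illustration of that one suggestion, not the bot's code and not a resolution of Andy Mabbett's objection that the HTML language attribute is lost. It assumes raw wikitext with double-curly-brace template delimiters; LANGUAGE_NAMES is a tiny illustrative table and lang_to_language_param is an invented name.

```python
import re

# Illustrative subset only; a real run would need a vetted, complete table.
LANGUAGE_NAMES = {"fr": "French", "de": "German", "ru": "Russian"}

# Matches "|title={{lang|xx|text}}" where the text holds no pipes or braces.
LANG_IN_TITLE = re.compile(
    r"(\|\s*title\s*=\s*)\{\{lang\|([a-z-]+)\|([^{}|]*)\}\}"
)


def lang_to_language_param(cite):
    """Unwrap {{lang}} in |title= and add |language= if it is missing."""
    match = LANG_IN_TITLE.search(cite)
    if match is None or match.group(2) not in LANGUAGE_NAMES:
        return cite  # unknown code or unexpected shape: leave for a human
    unwrapped = (
        cite[:match.start()] + match.group(1) + match.group(3)
        + cite[match.end():]
    )
    if not unwrapped.endswith("}}"):
        return cite
    if re.search(r"\|\s*language\s*=", unwrapped):
        return unwrapped  # keep the existing |language= untouched
    return unwrapped[:-2] + "|language=" + LANGUAGE_NAMES[match.group(2)] + "}}"


print(lang_to_language_param(
    "{{cite web|url=http://www.zutalors.fr|title={{lang|fr|Zut alors!}}}}"
))
```

Unknown language codes are skipped rather than guessed, which reflects the lesson from the trial that unrecognised codes produced bad edits.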

Experienced wikilink error remover offering help

I have fixed about 5,000 of the 8,000+ articles that were in Category:Pages_with_citations_having_wikilinks_embedded_in_URL_titles. I have often thought about having a bot do the tasks, but they are so varied that it seems like less than 50% of them should be fixed by a bot. I'll be happy to have a discussion with you here or elsewhere about the fixes I have been making and the kinds of edge cases I have encountered. You can also look at my contribution history to see how I have handled any number of cases. – Jonesey95 (talk) 14:33, 30 August 2013 (UTC)[reply]

@Jonesey95: - You've done great work cleaning up this category! I hope the bot could fix the most common cases, but we'll still need passionate editors such as yourself to manually fix the edge cases. Thanks! GoingBatty (talk) 18:21, 31 August 2013 (UTC)[reply]
@Jonesey95 and GoingBatty: My bot already supports removing wikilinks from titles, but the category also lists those with templates in the titles (I believe, though I'm not sure), which is where an issue has been raised. I therefore have to hard-code this to handle specific instances (it currently supports ((lang)) (see section above) and ((subscription required))), so the rest would need manual review, unless they are generally simple cases, which I could also hard-code. As we've established from the earlier trials, I can't just remove random templates, because there are far too many false positives.  Hazard SJ  01:37, 3 September 2013 (UTC)[reply]
The templates that add articles to this category are those that generate wikilinks, either to articles, e.g. ((sic)), or to categories, e.g. ((fr icon)) and ((lang)) and ((nihongo)). If you're interested in bot-fixable editing of these citations, here's my advice:
  • Unilaterally change ((sic)) to an appropriate form that conforms to WP:MOS, if there is one. I haven't looked. Maybe [sic] or [sic].
  • Move ((subscription required)) outside the cite template but before the closing </ref> tag. Make sure to put a space between the closing braces for the cite template and the opening braces for the subscription template.
  • ((lang)) can be dealt with by stripping the lang template from the title parameter (or commenting it out), but you need to make sure that there is an appropriate language=XXXXX (using the full name of the language) parameter in the citation. As described in the discussion above, there does not appear to be an error-free way to indicate "This title (as opposed to the publisher's name or the work's name) is in language X" without generating a Lua error. Maybe someone will modify the cite template to make that option available.
  • ((nihongo)) and ((nihongo2)) should follow the same rule, except that the language is always Japanese. ((nihongo)) and its ilk take multiple parameters separated by pipes, so they may not be bot-fixable. This template often requires the addition of the trans_title parameter to make it look right, and I don't think there is a general way for a bot to decide how to fix it in each case.
  • When I find ((XX icon)), where XX is the two- or three-letter code for a language, I have been moving the template outside the cite template but before the closing </ref> tag. Make sure to put a space between the closing braces for the cite template and the opening braces for the XX icon template.
  • Note that many ((XX icon)) templates have redirects of the form ((XX)), such as ((fr)) for ((fr icon)). Not all two-letter versions of this template are redirects to the ((XX icon)) template, however.
  • Another note: for some reason, most articles that use ((ru icon)) have the incorrect parameter value language=ru instead of language=Russian. If your bot wants to fix those, that would be great.
  • There are other templates that cause problems, but I don't think they are bot-fixable. – Jonesey95 (talk) 19:19, 3 September 2013 (UTC)[reply]
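The ((XX icon)) advice in the list above (move the template out of the cite template, with a space before it) might be sketched like this. It is a hedged illustration, not anyone's production code: it assumes raw wikitext with double-curly-brace delimiters, handles only the simple lowercase form, and move_icon_outside is an invented name.

```python
import re

# A two- or three-letter code followed by " icon", plus any preceding spaces.
ICON = re.compile(r"\s*\{\{([a-z]{2,3}) icon\}\}")


def move_icon_outside(cite):
    """Move a {{xx icon}} template from inside a cite template to after it."""
    match = ICON.search(cite)
    if match is None:
        return cite
    rest = cite[:match.start()] + cite[match.end():]
    if not rest.endswith("}}"):
        return cite  # not the simple shape assumed here; leave untouched
    # A space separates the cite template's closing braces from the icon.
    return rest + " {{" + match.group(1) + " icon}}"


print(move_icon_outside(
    "{{cite web|url=http://example.fr|title=Bonjour {{fr icon}}}}"
))
```

The two-letter redirect forms mentioned in the list are not handled, since not every such template is a redirect to an icon template.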
If you want your bot to handle some or all of these situations, here are some options:
  • For ((sic)), you could add |nolink=y to suppress the wikilink.
  • You could change (subscription required) to |subscription=y.
  • You could also change ((XX icon)) to the appropriate |language= parameter.
Thanks! GoingBatty (talk) 16:47, 4 September 2013 (UTC)[reply]
Thanks. For ((subscription required)) I've been switching them to |via=.  Hazard SJ  04:43, 5 September 2013 (UTC)[reply]
For the Citation Style 1 templates that have been converted to Lua (((citation)), ((cite AV media)), ((cite book)), ((cite conference)), ((cite encyclopedia)), ((cite journal)), ((cite news)), ((cite press release)), ((cite sign)), and ((cite web))) a better solution to ((subscription required)) issues is to set the CS1 template parameter |subscription=yes. When ((subscription required)) contains |via=, then also set the CS1 template's |via= parameter. — Preceding unsigned comment added by Trappist the monk (talk • contribs) 13:14, 5 September 2013
The Lua based CS1 templates now support ISO639-1 codes in |language=. This provides the same categorization as the ((xx icon)) templates.
—Trappist the monk (talk) 10:00, 25 September 2013 (UTC)[reply]
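Trappist the monk's suggestion for ((subscription required)) can be sketched as below: drop the embedded template and set |subscription=yes, carrying over |via= when the embedded template has one. This is an assumption-laden sketch, not the bot's code. It expects raw wikitext with double-curly-brace delimiters and a single, simple embedded template; use_subscription_param is an invented name.

```python
import re

SUBSCRIPTION = re.compile(
    r"\s*\{\{subscription required(?:\|via=([^{}|]*))?\}\}", re.IGNORECASE
)


def use_subscription_param(cite):
    """Replace an embedded {{subscription required}} with CS1 parameters."""
    match = SUBSCRIPTION.search(cite)
    if match is None:
        return cite
    rest = cite[:match.start()] + cite[match.end():]
    if not rest.endswith("}}"):
        return cite  # not the simple shape assumed here; leave untouched
    extra = ""
    if not re.search(r"\|\s*subscription\s*=", rest):
        extra += "|subscription=yes"
    via = (match.group(1) or "").strip()
    if via and not re.search(r"\|\s*via\s*=", rest):
        extra += "|via=" + via
    return rest[:-2] + extra + "}}"


print(use_subscription_param(
    "{{cite news|url=http://example.com/a|title=Report "
    "{{subscription required|via=HighBeam Research}}}}"
))
```

Existing |subscription= and |via= values are left alone rather than overwritten, so the function adds parameters but never changes what an editor already set.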

((OperatorAssistanceNeeded)) There have been several improvements and extended trials. Are there any outstanding issues left? Hasteur (talk) 01:12, 21 September 2013 (UTC)[reply]

Well, with what is in mind from the last trial, with the fixes I said, that's basically it. There are probably one or two cases of the wikilink removal instances above I could hard-code as well, but, again, as far as I'm aware, that's about it.  Hazard SJ  03:16, 25 September 2013 (UTC)[reply]
((OperatorAssistanceNeeded))
I read that response as: since the last trial there have been code changes. Is that right? Josh Parris 10:33, 5 November 2013 (UTC)[reply]

Operator has edited on four days in the last two months and has become unresponsive. I'm expiring this without prejudice; the operator is welcome to re-open. Request Expired. Josh Parris 07:40, 17 November 2013 (UTC)[reply]

The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.