Operator: Hasteur (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 00:29, Thursday, June 11, 2015 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: Pywikibot with a special driver on top of it. The driver file is [1]
Function overview: A source referenced on many pages has relocated to a new server and changed its page-location format. The new format is somewhat predictable, but requires probing to figure out which day an old page points at.
Links to relevant discussions (where appropriate): Wikipedia:Bot_requests#Replace_links_from_a_referenced_site_for_WP:ANIME
Edit period(s): One-time run, but may need to be run again if a large collection of new links pops up.
Estimated number of pages affected: According to the requesting user, 136 pages.
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: The bot asks for all weblinks matching *.okazu.blogspot.com, then evaluates whether each page should be adjusted and which exemptions apply. Exempt pages include BotReq, user pages, and any page that is an "Archive". Once the exemptions are dealt with, we fetch the text of the page and run a regex to find any string where the site is mentioned, extracting the year, month, and nominative title. We create a compound key from those three pieces of information and look it up to see whether we've already searched for that reference on the new site; if so, we don't ask the site again for the same information. If we haven't yet found the new location of the reference, we brute-force ask the site: "For this year, month, day, and title, do you have a page?" The site returns a 404 if we haven't guessed right, and a 200 when we have. We store the successful URL in our cache of already-searched replacements and return the new URL so that the string can be replaced in the text. The last step is to save the page with an appropriate notice (something to the effect of "HasteurBot 10: Replacing okazu.blogspot.com refs with yuricon.com equivalents"). Once the bot task is run, there should be no need to run it again, as the maintenance levels will be much more manageable. This task is not exclusion eligible because we're fixing links to make them point at something that works correctly.
CC @Nihonjoe: as the editor primarily championing this cause. Hasteur (talk) 00:35, 11 June 2015 (UTC)[reply]
The number of edits is small. Let's give it a try. -- Magioladitis (talk) 22:20, 11 June 2015 (UTC)[reply]
Approved for trial (30 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. -- Magioladitis (talk) 22:20, 11 June 2015 (UTC)[reply]
Hasteur do you think it has to be automated and then have every single edit reviewed? -- Magioladitis (talk) 06:39, 12 June 2015 (UTC)[reply]
Approved for extended trial (30 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Hasteur Let's complete the bot trial. If everything works fine, I can approve a fully automated run. I'll need you around because there are more links that need fixing. -- Magioladitis (talk) 11:57, 12 June 2015 (UTC)[reply]
Approved. Hasteur I trust you to check the edits while the bot is running or after the bot is done. It's clear that there are some edge cases we did not cover with the bot trial. -- Magioladitis (talk) 20:56, 13 June 2015 (UTC)[reply]