Operator: ~ AmeIiorate U T C @
Automatic or Manually Assisted: Automatic
Programming Language(s): AWB (possibly assisted by a C# app)
Function Summary: Escaping categories in user space pages
Edit period(s) (e.g. Continuous, daily, one time run): Sporadically, most likely once or twice a week.
Already has a bot flag (Y/N): N
Function Details: AilurophobiaBot escapes article categories in userspace pages. It does this by replacing [[Category: with [[:Category: and then filtering through AWB, although I am toying with the idea of using an app to automate this (unless someone can think of a better way to generate the list?)
It will not escape any category starting with "Wikipedia" (which includes "Wikipedian/s") or "User".
AilurophobiaBot escapes article categories that do not belong in userspace pages. It does this by replacing
[[Category:
with [[:Category:
The regex includes filtering to prevent it from removing legitimate userspace categories: it ignores any category containing Wiki, Bot, User, task force (inc. taskforce), Possible, Candidates, proofreaders, translators, workgroup, admin or proxies (matching is case-insensitive). It also ignores specific categories: Category:Vandalism Control Network members, Category:Non-talk pages that are automatically signed and Category:The IC Star Recipients.
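The replacement and the ignore list described above can be sketched in Python as follows. This is only an illustration, not the bot's actual AWB regex: the names `IGNORED_SUBSTRINGS`, `IGNORED_CATEGORIES`, `is_ignored` and `escape_categories` are hypothetical, and the handling of sort keys and stray whitespace is an assumption.

```python
import re

# Substrings that mark a category as legitimate in userspace (taken from the
# list above; compared case-insensitively).
IGNORED_SUBSTRINGS = [
    "wiki", "bot", "user", "task force", "taskforce", "possible",
    "candidates", "proofreaders", "translators", "workgroup", "admin",
    "proxies",
]

# Specific categories that are always left alone (lower-cased for comparison).
IGNORED_CATEGORIES = {
    "vandalism control network members",
    "non-talk pages that are automatically signed",
    "the ic star recipients",
}

# Matches the opening of an unescaped category link and captures its name.
# A link that already starts with "[[:" does not match.
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def is_ignored(name: str) -> bool:
    """Return True if the category should be left untouched."""
    lowered = name.strip().lower().replace("_", " ")
    if lowered in IGNORED_CATEGORIES:
        return True
    return any(sub in lowered for sub in IGNORED_SUBSTRINGS)


def escape_categories(wikitext: str) -> str:
    """Turn [[Category:X]] into [[:Category:X]] unless X is on the ignore list."""
    def repl(match: re.Match) -> str:
        if is_ignored(match.group(1)):
            return match.group(0)  # leave legitimate userspace categories as-is
        return "[[:Category:" + match.group(1)
    return CATEGORY_LINK.sub(repl, wikitext)
```

Under these assumptions, `escape_categories("[[Category:Living people]]")` adds the leading colon, while a link to a category whose name contains "Wikipedian" is returned unchanged.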
The lists of pages to edit are generated using
<categorytree namespaces=User mode=pages> (an example of this is User:AmeIiorate/Badcats/Living People, a list of all userspace pages in the Living People category) and api.php (example). The categories I will focus on are those that appear with a lot of pages on User:Ilmari Karonen's Badcats table. Basically, I will create a handful of categorytree pages (some can include multiple listings, such as all the "XXXX births" on one page) and check the pages periodically; when a category returns a sizeable backlog, I will have the bot clear it up. I wrote a C# app to alert me about new pages in Category:Articles with invalid date parameter in template, so I will change it to also alert when userspace pages are added to article categories (such as Living people).
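For the api.php route, one possible query is the MediaWiki `list=categorymembers` module restricted to the User namespace (namespace number 2). The request does not say which query is actually used, so treat this as an assumption. The helper name `category_members_url` and the default limit are hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"
USER_NAMESPACE = 2  # MediaWiki namespace number for "User:" pages


def category_members_url(category: str, limit: int = 500) -> str:
    """Build an api.php URL listing the userspace pages in one category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmnamespace": USER_NAMESPACE,  # only pages in the User namespace
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(category_members_url("Living people"))
```

Fetching the URL and following the API's continuation values for long categories are left out to keep the sketch short.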
Discussion
- I don't think there's consensus to remove all non-Wikipedia categories from userspace. Most, yes, but all? Category:Exclusion compliant bots comes to mind. Are we sure that Category:Pages where template include size is exceeded and Category:Possible copyright violations are never appropriate in userspace? I don't think we can have a bot blindly remove these. Besides, categories like Category:Jewish Wikipedian seminary students, Category:Third opinion Wikipedians, and Category:Anonymous Wikipedians don't start with "Wikipedia". Even if you allowed all /.*Wikipedia.*/ categories, you would occasionally be removing legitimate categories from userspace, and trust me: people get pissed when bots alter their userspace. – Quadell (talk) 12:35, 18 August 2008 (UTC)
- You are right about that, so a change of scope: how about if it included categories rather than excluding them? So instead of escaping everything except X, it ignores everything except categories that are definitely article-only. It will just take a bit of time to set up the right categories. ~ AmeIiorate U T C @ 19:43, 18 August 2008 (UTC)
- Sounds fine to me. In theory all categories appropriate for userspace should be tagged with ((wikipedia category)) and friends, but in practice we've still got some way to go there. Just remember to be careful not to make any mistakes, use friendly and informative edit summaries (something simple like "adding colons to category links" would be a good start), make it easy for users to find your talk page and try to stay calm when, inevitably, someone doesn't understand what your bot did to their user page and comes angrily complaining to you. In my experience (see e.g. here and here) most people won't mind helpful edits to their user page at all, but if you edit enough of them, eventually someone will complain. —Ilmari Karonen (talk) 21:58, 18 August 2008 (UTC)
- The edit summary I had in mind was: escaping category link to prevent this page appearing in main article categories more info ... - question? (if this is approved the bot's userpage will be redone to explain in-depth what it does). I realise people are often quite possessive of their userspace but I am confident I can deal with any hostilities that arise. ~ AmeIiorate U T C @ 22:52, 18 August 2008 (UTC)
- Scratch that, using defined categories is not feasible. Therefore, I propose to use Special:Recentchangeslinked to produce lists. For example, Changes related to "Category:Living people" (filtered to userspace). I will then readopt my original plan to replace [[Category: with [[:Category:, along with some filtering to ignore certain categories. At the moment the list of pages to edit will be generated manually through Special:Export. This is 'safer' because if a userspace page has been caught by the RecentChangesLinked result of an article-only category, then it should be safe to bulk remove the categories (ignoring the incontestably userspace-only cats). ~ AmeIiorate U T C @ 11:47, 20 August 2008 (UTC)
Hmmm... is there a reason why a workflow like this wouldn't work?
I could help you with step 1: I have both a toolserver account and some scripts for grepping database dumps. You can probably find people to help you with step 3 if there are too many categories for you to do it alone. —Ilmari Karonen (talk) 10:43, 21 August 2008 (UTC)
/[Ww]iki|[Bb]ot|[Uu]ser|[Tt]ask.?[Ff]orce|[Pp]ossible|[Cc]andidates|IP_address|[Ll]icen[cs]e|[Ss]ockpuppet/
or transclude any of ((User category)), ((Sockpuppet category)), ((Wikipedia category)), ((Educat)), ((UsersSpeak)), ((User language subcategory)), ((Userloc-2)), ((Userbox)) or any template beginning with "Usercat". The numbers tell how many user pages are in each category. —Ilmari Karonen (talk) 13:01, 21 August 2008 (UTC)
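As a rough illustration of the name test, the pattern quoted above can be applied to category titles like this. The sketch assumes the pattern is meant to set aside categories whose names look like userspace categories. The function names are hypothetical, and the template-transclusion check is not covered because it needs the category page text.

```python
import re

# The pattern quoted above, written as a Python regex. Titles are converted to
# database form (underscores instead of spaces), as "IP_address" suggests.
USERSPACE_NAME_PATTERN = re.compile(
    r"[Ww]iki|[Bb]ot|[Uu]ser|[Tt]ask.?[Ff]orce|[Pp]ossible|[Cc]andidates"
    r"|IP_address|[Ll]icen[cs]e|[Ss]ockpuppet"
)


def looks_like_userspace_category(name: str) -> bool:
    """Return True if the category name matches the userspace pattern."""
    return USERSPACE_NAME_PATTERN.search(name.replace(" ", "_")) is not None


def article_only_candidates(names):
    """Keep only the category names that the pattern does not match."""
    return [n for n in names if not looks_like_userspace_category(n)]
```

Because the pattern is unanchored, a name such as "Botanists" also matches `[Bb]ot` and would be set aside; under this sketch the filter errs towards leaving categories alone.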
As a content editor, I'm aware that lots of people use userspace for sandboxes for articles before moving them into the mainspace when ready. It'd be irritating for them if your bot kept fiddling with Cats prior to page moving. You could address this by setting the bot to ignore User:Foo/Sandbox pages and subpages thereof. For those who use sandboxes but don't call them "sandbox" (!) you could assist by setting a time-related filter, to ignore recently created or perhaps those recently worked on? Just some thoughts. --Dweller (talk) 11:01, 21 August 2008 (UTC)
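The two filters suggested above could look something like the following sketch. The function name `should_skip`, the 30-day default and the idea of passing in the last-edit timestamp are all assumptions for illustration, since nothing in this request fixes those details.

```python
from datetime import datetime, timedelta, timezone


def should_skip(title: str, last_edited: datetime, now: datetime,
                quiet_days: int = 30) -> bool:
    """Return True if the bot should leave this userspace page alone."""
    # Subpage components after "User:Name", e.g. ["Sandbox", "Draft"].
    parts = title.split("/")[1:]
    # Skip User:Foo/Sandbox and any subpage beneath it.
    if any(part.strip().lower() == "sandbox" for part in parts):
        return True
    # Skip pages edited within the last `quiet_days` days (hypothetical window).
    return now - last_edited < timedelta(days=quiet_days)


if __name__ == "__main__":
    now = datetime(2008, 8, 21, tzinfo=timezone.utc)
    old = now - timedelta(days=90)
    print(should_skip("User:Foo/Sandbox", old, now))
    print(should_skip("User:Foo/Draft", old, now))
```

Using the last-edit time rather than the creation time is a design choice here, since it also covers older drafts that are still being worked on.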
I have rewritten the full function details to clarify/outline what has been changed. ~ AmeIiorate U T C @ 10:47, 22 August 2008 (UTC)
Approved for trial (25 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. It could also link some templates that place the page in a category. BJTalk 19:19, 26 August 2008 (UTC)
((BAGAssistanceNeeded)) ~ AmeIiorate U T C @ 23:39, 31 August 2008 (UTC)