The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Withdrawn by operator.


Operator: eskimospy

Automatic or Manually Assisted: Supervised Automatic

Programming Language(s): Python

Function Summary: Will request to delete pages obviously intended to be vandalism

Edit period(s) (e.g. Continuous, daily, one time run): Continuous

Edit rate requested: 1 edit per 10 minutes

Already has a bot flag (Y/N): N

Function Details: Very short pages that contain profanity and lack punctuation will be tagged for deletion (CSD) (AfD) if they have only 1 edit.

Discussion

I assume you meant CSD instead of AfD? Also, how will the bot detect the right CSD criteria? I think you would be better off with something like NPWatcher, which makes it very easy to tag new pages for speedy deletion. Jayden54 08:55, 15 April 2007 (UTC)

I think that anything going up for CSD needs a pair of human eyes first. Part of the speedy deletion process is that we should help the user who created the page to learn our policies and make a better new page in the future, hence we like humans to do new page patrol, so that you can explain to the creator why his/her page was deleted. Martinp23 12:46, 15 April 2007 (UTC)
CSD always gets a pair of human eyes: the admin who decides whether to delete or not. I'd be in favour of this if it can be demonstrated that false positives are very low; if they're not, it would be wasting admin time (at a point in time when we have backlogs). Thus, the results of any trial need to be carefully scrutinised. Since many of the tagged pages will get deleted, that means the bot will need to keep a log. --kingboyk 16:21, 15 April 2007 (UTC)
This is how the bot will detect CSD criteria: it contains a few lines of the following.

"python replace.py -cat:uncategorized_pages "*user:lupin's profanities list*" "((db-vandalism))"

To ensure it is not a quote, example, or something not intended to be destructive, it includes:

-except:XYZ "quote"

-except:XYZ "said"

-except:XYZ "derived"

-except:XYZ "came from"

-except:XYZ "first"

-except:XYZ ";"

Remember, it only works on uncategorized pages. I don't see why we need people to make these changes; there is no way this bot can go wrong. --eskimospy(talk) 14:27, 15 April 2007 (UTC)
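For readers following the thread, the exclusion idea described above can be sketched in plain Python. This is only an illustration of the intent of the -except switches, not the operator's bot and not pywikipedia code: the function name should_tag is hypothetical, and the two-word PROFANITY list is a placeholder for the list on user:lupin's page, which is not reproduced here.

```python
import re

# Placeholder words (assumption): the real list is maintained on-wiki.
PROFANITY = ["badword", "worseword"]

# The exception strings quoted in the comment above.
EXCEPT_TERMS = ["quote", "said", "derived", "came from", "first", ";"]


def should_tag(text, profanity=PROFANITY, except_terms=EXCEPT_TERMS):
    """Return True if the text contains a listed word and no exception term."""
    lowered = text.lower()
    # Any exception term means the page might be quoting or explaining
    # the word, so it is skipped entirely.
    if any(term in lowered for term in except_terms):
        return False
    # Whole-word match, so a listed word inside a longer word is ignored.
    return any(
        re.search(r"\b" + re.escape(word) + r"\b", lowered)
        for word in profanity
    )
```

Note that plain substring exceptions such as "first" will also skip a lot of ordinary vandalism, which is part of the false-positive and false-negative trade-off raised in this discussion.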

Note: This user has recently used an unapproved bot (to make mass changes from American spelling to British spelling, see also Wikipedia:Manual of style#National varieties of English and Wikipedia:Manual of style#Disputes over style issues). See also [1]. —Centrxtalk • 18:21, 15 April 2007 (UTC)

Thanks for the heads up. I'm also not seeing much in the way of a useful contribs history. Eskimospy, what is your reply to this? --kingboyk 18:32, 15 April 2007 (UTC)
I just wanted to do a test run of a pywikipedia bot just to see if I had set up everything correctly. For the test, all I had it do was change "honor" to "honour" (something I thought was harmless but apparently was not) and as a result, I did cause many incorrect changes to occur. However, that test was NOT Eskimospy Bot!!! And no, I won't make any foolish mistakes as I did earlier.--eskimospy(talk) 01:03, 16 April 2007 (UTC)
I still have my reservations on the validity of the proposal as a whole, and would like to see some community discussion at Wikipedia:VPP before I'd be happy to approve. When the bot marks a page for speedy deletion, it will need to warn the user who created the page, preferably using a template of its own which indicates that the message is from a bot, and which takes the article name as a parameter. Martinp23 11:25, 17 April 2007 (UTC)
I was going to propose some wider discussion. Personally I lean to finding the proposal a useful one, but we have little to go on in terms of whether the operator can deliver a solution like this and is qualified to run it. It might be a case where we need to ask the wider community. --kingboyk 11:34, 17 April 2007 (UTC)
"There is no way this bot can go wrong" -- Isn't that the premise behind Skynet and the WOPR? I'm worried about user biting on this one. If User:User Sample A creates a page this bot doesn't like, then User:User Sample B comes along and cleans something up on it, but not much; who would this bot send talk page messages to if enabled? — xaosflux Talk 12:17, 17 April 2007 (UTC)
Of course, Martinp23, I will be working on a talk-page message template for the bot. Also, Xaosflux, in the message I will include "If you did not create the disruptive edits to this page, leave a message on the User Talk:Eskimospy Bot page." Things like this wouldn't happen very often anyway, because I will be monitoring the bot and deciding whether or not a page's intent is to vandalize (usually, but not always, a committed Wikipedian does not improve articles with the title "canadians suck").--eskimospy(talk) 15:17, 17 April 2007 (UTC)

Will it tag pages immediately after creation, or a bit later? MaxSem 19:04, 19 April 2007 (UTC)

The time the bot tags the page depends on when the bot detects the page, not when the page is created. Therefore the bot would usually tag the page a little bit after the page's creation.--eskimospy(talk) 23:18, 19 April 2007 (UTC)
In other words, it will tag them immediately after discovering? This raises Wikipedia:BITE concerns. MaxSem 05:19, 20 April 2007 (UTC)

Seems a bit iffy. Is it really possible to make this operate safely, but in such a way as to catch a worthwhile amount of vandalism? Some specific points:

I'm also a bit confused about the interaction with (un)categorisation. Everything in CAT:NOCAT will have (almost always) been edited at least twice, by at least two different users (one creating the article, one adding it to the uncat-cat). Is lack of categorisation useful as a predictor of vandalism? And it's a large category: are you going to be traversing the same articles repeatedly? Alai 03:04, 20 April 2007 (UTC)

Thank you, Alai, for your questions. Here are the answers:
  • The bot will be run in "newpage patrol" mode.
  • To me, "very short" applies to articles less than 2-3 lines long (in this case, three).
  • I am having second thoughts about the punctuation part; however, if I were to include that, the bot would skip pages with a semicolon (;).
  • The bot will be monitored at first; if I see it is infallible it will be run automatically, but if it is not, it will stay a monitored bot.
  • I do realize there are almost no articles on Wikipedia with one edit, so I have decided to cross out that part.
  • The bot will still only edit uncategorized pages.
--eskimospy(talk) 04:33, 20 April 2007 (UTC)
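The answers above can be read as a small set of rules, sketched below in Python under stated assumptions. The function name classify, the max_lines=3 reading of "very short", the check for a literal "[[Category:" link as the test for categorization, and the profanity argument are all hypothetical choices made for illustration; none of this is the operator's actual code.

```python
import re


def classify(wikitext, profanity, max_lines=3):
    """Return (would_tag, reason) for one page's wikitext.

    The reason string records which rule decided the outcome, so a
    reviewer can see why a page was skipped or flagged.
    """
    lowered = wikitext.lower()
    nonblank = [line for line in wikitext.splitlines() if line.strip()]

    # Rule: only uncategorized pages are considered.
    if "[[category:" in lowered:
        return False, "categorized"
    # Rule: only very short pages are considered (assumed: at most 3 lines).
    if len(nonblank) > max_lines:
        return False, "too long"
    # Rule: a semicolon is treated as a sign of deliberate punctuation.
    if ";" in wikitext:
        return False, "contains semicolon"
    # Rule: at least one listed word must appear as a whole word.
    words = set(re.findall(r"[a-z']+", lowered))
    if not words & {w.lower() for w in profanity}:
        return False, "no profanity"
    return True, "short uncategorized page with profanity"
```

Returning the reason alongside the decision is a design choice that makes false positives easier to audit, which is the main concern raised by reviewers here.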
Being short, or having no punctuation, are not valid grounds for the deletion of an article (preview Wikipedia:CSD). There are one or two criteria relating to article size, but these would need a human to review the content in the article relative to the subject matter. The main deletion reason here is CSD A7 - non-notable biographies. A bot would be unable to detect such articles, so I am left wondering what sort of pages the bot would pick up, bearing in mind the criteria. Martinp23 23:52, 20 April 2007 (UTC)
Yes, being short or having no punctuation are not valid grounds for speedy deletion; however, they are attributes most vandalism-intended pages have. This is just a way of narrowing down the articles so that the bot doesn't mistakenly mark an article for CSD (there are many profane terms in articles such as the South Park article, however when the bot sees that the article is long (most vandals are too lazy to write a lengthy article) and well-written, the bot will know that the article is not a "vandalism" page.) I apologize if it was unnecessary to include what criteria the bot will use to tag pages.--eskimospy(talk) 02:18, 21 April 2007 (UTC)
I'm not a BAG member, but a possible suggestion to move us away from theory would be to run the bot, and have it log what it *would* do. Have it report pages that it loaded and did nothing (and perhaps what rule triggered to make it do nothing). Same thing for when it would do something. You could run the bot for 24 hours or so, having it do no edits, but merely logging to a .txt file on your computer. Once you have that, you could then copy paste that .txt file to a bot subpage, say User:Eskimospy Bot/PreTrial1 or something. Then we can actually see how this bot works. From there we can see if the bot is ready for a "live" test on Wikipedia. —— Eagle101 Need help? 04:43, 21 April 2007 (UTC)
Okay, that sounds like a good idea. I am going to start once I figure out how to make a log file.--eskimospy(talk) 13:29, 21 April 2007 (UTC)
That will definitely be helpful. Thanks Martinp23 13:31, 21 April 2007 (UTC)

<-- Eskimospy, all you have to do is open a file, and append the latest action to the end of said file and save. There are multiple ways to do this though. Depending on what language you are using it should be a fairly simple thing to do. —— Eagle101 Need help? 21:02, 21 April 2007 (UTC)
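In Python, the "open a file, append, save" approach described above might look like the following minimal sketch. The function name log_action, the tab-separated layout, and the action labels in the usage note are assumptions for illustration; no edit is made to any wiki page, so this is purely a dry-run record.

```python
from datetime import datetime, timezone


def log_action(path, title, action, reason):
    """Append one tab-separated dry-run entry to the log file at path."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    line = f"{stamp}\t{title}\t{action}\t{reason}\n"
    # Mode "a" creates the file if needed and always writes at the end,
    # so earlier entries are never overwritten.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

A call such as log_action("pretrial.txt", "Some page", "WOULD TAG", "profanity") would add one line per page examined, and the resulting file could then be pasted to a subpage for review as suggested earlier in the thread.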

I figured it out; with pywikipedia I just had to add a -log switch, but how do I tell the bot not to make any changes?--eskimospy(talk) 23:19, 21 April 2007 (UTC)
Still no luck, can't find anything on how to stop the bot from making edits. I could do a test edit on a given article or given array of multiple articles (maybe only on talk pages?)--eskimospy (talkcount) 02:20, 22 April 2007 (UTC)
See if you can comment out the parts of the bot that you don't want (in this case the parts that make it edit). —— Eagle101 Need help? 04:33, 22 April 2007 (UTC)
Just comment out the lines in (I think) wikipedia.py which save the page. I thought that there was a "dry run" function, but that may have just been in a particular bot I was running. Martinp23 09:18, 22 April 2007 (UTC)

Just a note that even if the bot were running and really flagging pages for deletion, I'd prefer to see it keep a log (on wiki). Because, by definition, many of the pages it tags would get deleted, User Contribs would not be enough to keep an eye on it. --kingboyk 15:57, 22 April 2007 (UTC)

Okay, I think I have everything figured out...almost. I couldn't find where I was supposed to comment out the lines that save the page (I don't know a thing about Python), but I made an external program that runs the touch.py file when vandalism is identified. So, I will probably run the bot sometime today and tomorrow (I am going to clean up the bot a little bit), that is Seattle time. I apologize if I took a long time.--eskimospy (talkcontribscount) 04:15, 23 April 2007 (UTC)
I wouldn't worry about the timescale: you've responded very promptly, and I'm sure everyone at BRFA would rather see a sensible outcome than a hurried one if more time is required. But I'm a bit puzzled about what precisely you're doing if you're not writing any new Python code (though I sympathise, it's not exactly "my" language either). So far as I'm aware what you're talking about isn't standard pywikipedia functionality: are you handling everything via driver scripts? I'd have thought that fairly close co-ordination between the two would be required... Alai 06:13, 23 April 2007 (UTC)

Eskimospy: I just answered a question about Perl from you on shadow1's talk page (he's on wikibreak, by the way, corrupted his drive from what I understand) - are you writing this in Perl or Python? ST47Talk 03:57, 24 April 2007 (UTC)

Well, I have decided to withdraw the request for this bot...for now. I don't think I am familiar enough with Python; perlwikipediabot doesn't seem to let me download it. Also, I don't want to deal with people's complaints if the bot messes up right now. However, in the very near future I will probably re-request a more complete version of the bot. Goodbye.--eskimospy (talkcontribscount) 00:56, 28 April 2007 (UTC)

Withdrawn by operator. 12:27, 28 April 2007 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.