Hello Wikipedians!

I read about Turnitin in a news article about a year ago and contacted them independently to see if they would be interested in donating some services to us. They liked the idea of helping Wikipedia, and it seemed, despite our different backgrounds, that we shared an interest in supporting "original authorship"--in their case catching plagiarism and in our case copyright violations. They were so enthusiastic that they offered to assess millions of articles on Wikipedia, a possibility that I was really excited to bring back to the community. Just for the record, I'm not paid by Turnitin or otherwise affiliated with them in any way. I think this project might benefit us, but I have tried to consider all objections seriously, and whatever the community decides is best is ultimately what will happen. --Ocaasi t | c

Introduction

With nearly 100 employees, eight global offices, and a headquarters in Oakland, California, Turnitin is a leading provider of plagiarism detection services.

Attribution/Advertising

Wikipedia is non-corporate and non-commercial, and any threats to this status warrant serious consideration. Should the community see any corporatizing or commercializing impact arising from a collaboration with Turnitin, that impact would have to be weighed against potential losses from turning down such a collaboration. I don't see a particular problem in this area, but I want to address it up front.

False positives

Even if the attribution/advertising issue is resolved, Turnitin still has a major obstacle to overcome in actually designing a system that works. That, however, is something they are willing to test, develop, and execute completely on their end, with their employees and technical staff, their troubleshooting efforts, and their money. They're willing to invest in making this project work.

The question is whether having Turnitin's reports gives us another beneficial tool and improves upon our current copyright checking regime. We can test to see if it does, and if it does, then I think there's good reason to use it.

Comparing Turnitin to alternatives

Turnitin is not the only way to approach plagiarism detection.

To determine which path forward is best, Turnitin needs to explain and demonstrate how they would approach analyzing Wikipedia content. Coren, Madman, or others in the community would also have to propose comparable on-Wiki methods. I think it's highly unlikely that on-Wiki developers could build a system specifically for Wikipedia with the resources and server capacity to check all of Wikipedia on a regular basis, but it's not impossible, and I wouldn't put it past this community to develop such a system if it chose to.

Backlog

It may be the case that Turnitin reveals more copyright issues than we currently have capacity to fix. That may be a problem, but I think it is ultimately better than not knowing about those issues at all.

Proprietary software

As a community which shares many of the open source movement's goals, it may be ideal for Wikipedia to use only open source products. However, it may simply be pragmatic and beneficial for us, at least in the short-to-medium term, to collaborate with those who have the extensive time, capital, resources, and motivation that are frequently (but not exclusively) found in successful private companies.

Foundation resources

There would be some upfront and ongoing investment from the WMF.

In the end, we have to decide if that outlay of resources is worth having access to intelligent text-matching reports from a company that specializes in producing them. I personally think the benefits would be worth the costs, but I'll leave that determination to the technical folks.

Legal and media concerns

There are some legal issues I'll largely skip over here, since the Foundation has already reviewed this proposal and crafted terms that they are comfortable agreeing to.

Conclusion

This is the best case I can make for why a collaboration is worth pursuing further. I'm sympathetic to the concerns here, and if we need to tweak our approach then I am willing to try to facilitate that as well. I think sometimes we can achieve more with the help of others--even a private company--than by acting alone, and this may be one of those cases. Ocaasi t | c 16:48, 22 July 2012 (UTC)