The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was  Approved.

Operator: T.seppelt (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:11, Wednesday, November 4, 2015 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Java, own framework

Source code available: not yet

Function overview: Remove {{Persondata}} from all articles and copy the information to a database that will be accessible on Tool Labs. See toollabs:kasparbot/persondata.

Links to relevant discussions (where appropriate): first RfC, second RfC, Bot request

Edit period(s): one time run

Estimated number of pages affected: 1.2 million

Exclusion compliant (Yes/No): no

Already has a bot flag (Yes/No): Yes

Function details:

-- T.seppelt (talk) 18:11, 4 November 2015 (UTC)[reply]

Discussion

@Pigsonthewing, Magioladitis, Izno, GoingBatty, Hawkeye7, and Dirtlawyer1: -- T.seppelt (talk) 18:11, 4 November 2015 (UTC)[reply]

  1. Copy this data into a database elsewhere (totally uncontroversial - let the bot do this)
  2. Delete the data from here before the other database is established (why even do this?)
This proposal is premised on a database created in the future replacing the need for the current system. Why delete the current system before there is broad community approval of this other database, which does not even exist yet? Feel free to make the other database. After that database is appreciated, then make a second proposal to use it to replace the current system.
I fail to recognize any reason why these steps ought to be combined into a single proposal. What am I missing? Blue Rasberry (talk) 18:02, 15 November 2015 (UTC)[reply]
Oppose and agree with Bluerasberry's reasoning. Why would we make someone go to two places to look for that information? — Sctechlaw (talk) 01:00, 17 November 2015 (UTC)[reply]
@Sctechlaw: What do you mean by "two places"? — Earwig talk 01:30, 17 November 2015 (UTC)[reply]
The Earwig The two places being discussed are English Wikipedia and a project on Tool Labs. Blue Rasberry (talk) 14:38, 17 November 2015 (UTC)[reply]
The final home of these data is Wikidata, regardless. Intermediary but pursuant to the two RFCs (one to remove Persondata and the second to remove it by bot) would be a "holding pen" of sorts (hosted on Labs) for people to more easily assign the data to Wikidata items (through a person-centric UI). --Izno (talk) 14:43, 17 November 2015 (UTC)[reply]
Right, that was my understanding. I asked because I'm not clear how we are making people go to two places to look for information. As far as this task is concerned, if we keep {{persondata}} around for too long after the Labs database has been created, we're just encouraging them to go out of sync as people deal with both. I think it makes sense to start running the bot as soon as we are satisfied with the way the tool is structured. -- Earwig talk 01:21, 18 November 2015 (UTC)
I would like to emphasize that the only alternative to T.seppelt's creation of a Persondata database and interface for review and transfer to Wikidata is the outright deletion of all existing Persondata with no further transfer of usable Persondata information to Wikidata. And I would also like to take note that T.seppelt's analysis below demonstrates that the statements made during the two previous RfCs -- that there remained no usable Persondata information that could be practically transferred to Wikidata -- were uninformed at best and outright misrepresentations at worst. Again, I commend T.seppelt for undertaking this project, and I urge editors who are opposed to this proposal to familiarize themselves with the two previous RfCs related to the removal of Persondata (May 2015 and September-October 2015) and prior bot request (June 2015). This is now the only game in town to preserve and transfer usable Persondata. Dirtlawyer1 (talk) 01:40, 18 November 2015 (UTC)[reply]

Update I am currently considering different ways of making the data accessible for user-assisted import after the removal. To assess the options, I analysed the Persondata currently accessible on enwiki. These are the results:

Persondata field    Wikidata                  New ready   New unparsable   Conflict   Conflict unparsable
DATE OF BIRTH       P569                      51269       4695             88790      5093
PLACE OF BIRTH      P19                       310575      32086            44907      27230
DATE OF DEATH       P570                      26379       2335             67835      2724
PLACE OF DEATH      P20                       90996       10654            14996      10737
ALTERNATIVE NAMES   alias                     101976                       n/a
SHORT DESCRIPTION   description               21417                        135961
NAME                label (in future alias)   54                           244569

As you can see, we have 479,219 statements so far which could be directly imported. In my view, the best option for this data is to give it to the Primary Sources Tool. For the conflicting statements, the unparsable data, the aliases, the descriptions and the labels I will provide a software solution. But please consider that we should start this removal process as soon as possible because of the long time it will take. There will be 1,295,269 user actions necessary to complete the import. The 479,219 statements can be made accessible in the next few days. Even checking these statements will take weeks; in the meantime the next parts of the dataset will become available through the proposed tool. Warm regards, -- T.seppelt (talk) 20:19, 15 November 2015 (UTC)

Great news You can use the Primary sources tool now to add place of birth and place of death statements to Wikidata. Tpt just uploaded the dataset ([2]). I am working on the tool for descriptions now. It will be available in the next hours or tomorrow. Warm regards, -- T.seppelt (talk) 19:41, 20 November 2015 (UTC)[reply]

Tool is launched in beta I worked on the proposed tool and it is ready for public testing now. You can find it here. Please check your contributions to Wikidata in order to find bugs. Let me know if something goes wrong. Please come up with ideas for improvement. Warm regards, -- T.seppelt (talk) 15:25, 21 November 2015 (UTC)

Convenience break no. 1

@T.seppelt: Hi, TS. I got a chance to test-drive your new tool today (December 2, 2015) for the first time, following our Thanksgiving holiday week here in the States. It's impressive, and appears to be exactly what you proposed above. I do have a few follow-up questions . . . .

  1. When I tried it out today, there were only a few hundred conflicting datapoints to be chosen -- are these the only conflicting datapoints remaining, or does this just represent the first batch of conflicting datapoints to be reviewed and selected?
    The tool contains at the moment only a fraction of the datapoints for testing. I am going to upload the rest of it soon. -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]
  2. How many non-conflicting Persondata datapoints have been imported directly into Wikidata so far? If that process still has not begun, what is the timeline for it?
    Nothing is going to be directly imported due to the decision of the Wikidata community to not accept automatically imported Persondata data. The tool also contains datapoints which aren't conflicting. Those can be imported manually. -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]
  3. How many conflicting Persondata/Wikidata datapoints have been manually reviewed thus far using your new tool?
    478 datapoints have been reviewed so far. -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]
  4. In using your tool to review 100 or so conflicting brief descriptions, I noted that close to half were not so much conflicting as complementary -- e.g., American politician, Founding Father, signer of the Declaration of Independence. Is there any way we may add a Persondata brief description without replacing the existing Wikidata description?
    I have a function in mind which allows the user to edit the suggested description and add it on the spot. I am going to implement it as soon as I find time. -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]
  5. Is there any way we can choose to review the conflicting datapoints for particular Wikipedia categories (e.g., Olympic swimmers of Germany)?
    I am storing only the Wikidata entry numbers and the Wikipedia article names. Fetching the categories would be quite advanced, but I am thinking about it -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]
  6. What's your plan going forward from here?

Once again, thank you for devoting your time and skills to this endeavor. Dirtlawyer1 (talk) 05:24, 3 December 2015 (UTC)[reply]

The plan is to import all remaining datapoints to the tool and start with the removal. Everything is ready for it. -- T.seppelt (talk) 22:04, 7 December 2015 (UTC)[reply]

@Dirtlawyer1:  Done I imported all remaining datapoints. They are now available in the tool. As you can see there is a long way to go. Anyways, we can start with the removal now. -- T.seppelt (talk) 10:06, 8 December 2015 (UTC)[reply]

I have been playing around with this for a bit, and here are some thoughts:
  • Agree with Dirtlawyer that being able to edit the descriptions is essential.
     Working I will implement this soon. -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
  • There's no way to link to specific challenges, so I am using a screenshot instead. It's identifying a conflict where none exists. The persondata in Jo Marie Payton is "Albany, Georgia, U.S.", so I'm not sure why it is misidentifying that as Q137573.
    While parsing, "," was interpreted as a separator if the value didn't contain [[ or ]]. I think this is the best option for most cases. In general, articles about places don't contain ",", so "," should be treated as a separator between subdivisions (New York City, New York, USA etc.). -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)
  • A direct reference to the guidelines for descriptions and aliases when I am using the tool would be helpful, to clarify the meaning of "You are kindly ask to decide which one is better." Better how? Is "Fooian sausage-maker" better than "sausage-maker from Foo"?
     Working I will implement this soon. -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
  • When dealing with descriptions, "No, take this!" would be clearer as "Keep this" or something clarifying that Wikidata won't be edited and this is essentially the "default" action. (Would remove the exclamation marks too, but that's just my opinion.)
     Done I changed it. -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
  • When one article has multiple conflicts, it would be good to deal with all of those at once. The checking necessary for two properties can be very similar (e.g. date and place of birth), so this would save time.
     Working I think about it. This could also cover permanent links to certain conflicts. -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
  • There's no way we're gonna make a noticeable dent in this any time soon.
  • When you do run the bot for removal, can it handle changes in persondata since this initial run?
    I don't plan to do it. I don't think that huge amounts of information were added to articles as templates after the last run. -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
— Earwig talk 23:17, 8 December 2015 (UTC)[reply]
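The comma rule described in the reply above can be sketched roughly as follows. This is only an illustration of the splitting step as stated in the discussion, not the bot's actual parser; the class and method names are hypothetical, and how each piece is afterwards matched to a Wikidata item is not shown here because the discussion does not describe it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the stated rule: a comma separates
// subdivisions unless the raw value contains wikilink brackets.
class PlaceValueSplitter {

    static List<String> split(String raw) {
        List<String> parts = new ArrayList<>();
        String value = raw.trim();

        // Values containing [[ or ]] are kept whole (assumption: the link
        // target already identifies the place).
        if (value.contains("[[") || value.contains("]]")) {
            parts.add(value);
            return parts;
        }

        // Otherwise every comma is treated as a separator between subdivisions.
        for (String piece : value.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("Albany, Georgia, U.S."));
        System.out.println(split("[[Albany, Georgia]]"));
    }
}
```

Under this rule "Albany, Georgia, U.S." is broken into three pieces, so a lookup that starts from the first piece alone sees only "Albany", which may be one way a value like this ends up matched to an unexpected item.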
I am going to apply the suggested improvements. Please keep on testing the tool. Warm regards, -- T.seppelt (talk) 21:30, 10 December 2015 (UTC)[reply]
I have also made suggestions at your WD talk page, TS. Let me know what you think either here or there. For others here, quoted:

Some comments:

  • Make the skip button bigger, red, and place it slightly more prominently. (Apparently you had this planned but I'll echo it; "I should make it maybe bigger" from November 24.)
  • Make unavailable the varied data which indicate a date earlier than 1920, per the brief discussion at Wikidata talk:Primary sources tool#Migration of enwiki Persondata. Or move it into a different workflow, or something. Earlier dates are a mess right now and I don't think this tool should exacerbate that fact.
  • You can probably improve the link to the article by using the Wikidata item's link rather than using Special:Search+Article+Name.
  • It might be nice to have a link available to the version of the article at the date of import. This way I can take a look to see if anything in the article is particularly disagreeable to the persondata as well as how different (whether for example a date is a refinement of the date elsewhere in the article, see e.g. [2] or more likely to have been vandalized). On this point, a link to the history of the article would also be appreciated.

[...]

Also, there are a number of challenges where the page on the Persondata side is a disambiguation item and the other is an actual item where the titles on Wikidata are the exact same. Maybe these can be prefiltered? --Izno (talk) 16:24, 11 December 2015 (UTC)

--Izno (talk) 15:32, 13 December 2015 (UTC)[reply]

Update

@Izno and The Earwig: Thank you for your feedback so far. I managed to implement some of the things. Others are still under development. What's done so far:

I am currently working on the following things:

Since I didn't store the version ids of the articles while filling the database, I would prefer not to establish these links to a certain version. It would be necessary to reassemble the whole database. I would also like to keep the links using ?title=... because I have some issues with urlencoding and special characters. Regards, --T.seppelt (talk) 11:56, 14 December 2015 (UTC)

?title is fine; I didn't realize you were using that construction (for some reason I went to Special:Search from one of those pages...?).

Would it be possible to get links to the unique ID of an article as of December 14, 2015 (or date of interest i.e. whenever you can get to it)? This would be Good Enough since no bots have started removing the Persondata (though there are varied editors--including myself--removing them by hand where we bump into them).

Thanks for the work! --Izno (talk) 13:45, 14 December 2015 (UTC)[reply]

I just want to let you know that I see a way to implement all the open improvements. Due to the upcoming holidays I won't be able to make any of the changes accessible until 2016. Warm regards and Merry Christmas, --T.seppelt (talk) 20:50, 18 December 2015 (UTC)

@Izno and The Earwig: I am almost done with the proposed changes. Descriptions can now be edited manually, more and more ?oldid=...-links are available and claims with constraint issues are excluded. I would like to do some test edits for the removal of the template. Are you okay with this? -- T.seppelt (talk) 08:30, 4 January 2016 (UTC)[reply]

We might as well. Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. We can do a larger trial afterwards when we confirm that all is well. — Earwig talk 08:31, 4 January 2016 (UTC)[reply]
Trial complete. I didn't notice problems with the replacement pattern. What do you think about the edit summaries? -- T.seppelt (talk) 10:27, 4 January 2016 (UTC)[reply]
I haven't looked through all of the edits yet, but the summary is on the right track; maybe change "related challenges" to "challenges for this article"? — Earwig talk 09:00, 7 January 2016 (UTC)[reply]
I changed the summary as you proposed. This is probably easier to understand. After this is approved we also have to update Wikipedia:Persondata... Regards, —T.seppelt (talk) 11:20, 8 January 2016 (UTC)[reply]

Observed problem

@T.seppelt: Having used your new tool to transfer over 1,000 items of Persondata to Wikidata, I have observed a recurring problem in the tool and/or database's recognition/reconciliation of place names. The tool sometimes selects/suggests a more generalized location than that actually provided in the Persondata template; for example, suggesting the State of New York or the United States, when the birth place or death place actually provided in the Persondata specifically states "New York City". I have also observed that the tool will also sometimes suggest Wikidata disambiguation pages when the Persondata accurately provided the specific item. For examples, please see the Persondata and challenges for Jim Price (baseball manager) and Chase Lyman.

By the way, among those Persondata items imported into your database for further review, I have found absolutely no difference in the reliability of Wikidata vs. Persondata, and I have replaced as many Wikidata items as I have rejected items of Persondata. There remains a great deal of perfectly accurate Persondata to be transferred. What we desperately need now are more editors to review the available items of Persondata, and transfer them as appropriate, using your tool. Dirtlawyer1 (talk) 23:10, 6 January 2016 (UTC)[reply]

I am still investigating the problem with the too general descriptions. I have a script for excluding disambiguation claims, but since I can't keep stable connections to the database servers on toollabs at the moment it is going very slowly. I'm going to rewrite the script in PHP and hope that it is more reliable.
I agree with you. We need more editors to work on this. I hope that dropping a hint in the edit summaries of about one million articles will increase the number of interested users. As stated above we should also update Wikipedia:Persondata. I was also thinking about using banners. Dewiki did this to inform the community about the web link checking activity of GiftBot. --T.seppelt (talk) 11:35, 8 January 2016 (UTC)

Updates

I just want to let you know what's new on the tool:

  1. The deciding pages for descriptions, aliases and places now show matches for the proposed value in the Wikipedia article. They are highlighted in yellow. Key words (birth, born, died, death, place, also etc.) are highlighted in blue. In most cases you don't have to check the whole article manually anymore.
  2. Recent decisions can be accessed now. The page follows the style of Special:RecentChanges and allows you to inspect other editors' decisions.
  3. The exclusion of disambiguation pages as value for places is in progress. The script is stable. The decisions are marked as excluded by KasparBot. Have a look at them in the recent decisions. After all of them have been excluded, they are going to be available for user-assisted reparsing in order to make use of those approximately 30,000 claims. I am working on a page for this.

Thank you for testing the tool. Warm regards, — T.seppelt (talk) 10:25, 11 January 2016 (UTC)[reply]

(Numbered for ease of reference). #1's change pushed the buttons down off my screen (operating at 1920x1200, which is probably one of the standard res's now). It might be desirable to move the selection buttons above that content, either by a) docking them via CSS to the bottom of the viewport or b) (preferentially) just having them above that content in the HTML. This might take the form of proposed description -> buttons -> wiki text. Good change otherwise! --Izno (talk) 12:46, 11 January 2016 (UTC)[reply]
@Izno: I solved this problem by showing only a single match when the results are getting too long. More details are shown when clicking on Show more. -- T.seppelt (talk) 17:25, 12 January 2016 (UTC)
@T.seppelt: I'm not sure it worked. [4] shows more than 1 (what is the metric of too long?), and [5] should have resulted in Lithuanian politician being found, maybe? --Izno (talk) 18:12, 12 January 2016 (UTC)[reply]
A couple other comments:
  1. "You don't like this challenge at all? No problem. Skip it!" now seems extraneous to the Skip button and can be removed.
  2. "Your decision to accepted" -> should say "accept". I think there is a similar problem with "rejected"/"reject" also. "overwritten"/"overwrite" also.
  3. No notification is posted when the challenge is skipped. Should there be one? Probably.
  4. Perhaps, add titles to the buttons to explain the intent of the button. Less concerned about this--a help page would be an acceptable substitute about the meanings, or maybe at the bottom an unbulleted list.
Otherwise, I see little issue in deleting the template now. All of these are certainly quibbles. The Earwig, any other concerns from the run, or have you not had a chance to have a look yet? --Izno (talk) 18:18, 12 January 2016 (UTC)[reply]
@Izno: The problem with the buttons outside of the viewport is now ultimately solved. The buttons stick to the viewport if the document is higher than the window. I am not sure if it's working with all browsers. For Firefox 43 it's fine. Concerning your comments:
  1. This bar has been removed.
  2.  Done
  3.  Done
  4. Instructions for using the tool could be placed at Wikipedia:Persondata. I can add a link to this page and helpful titles.
-- T.seppelt (talk) 21:11, 12 January 2016 (UTC)[reply]
My inclination would be to just do it. I admit I'm still a bit uncertain about Blue Rasberry and Sctechlaw's comments from above; they are the only ones objecting and they seem to have gone MIA since their initial remarks. We're at the point where all of the data has been copied over to the database and it is sufficiently developed to be usable (i.e., the issues above are certainly not deal-breakers), but should we draw more attention to it from the community at large in order to get the "broad community approval" that Blue Rasberry mentions? Alternatively, waiting longer leads to increased likelihood of databases being out of sync and manual effort in the meantime going to waste. — Earwig talk 21:29, 12 January 2016 (UTC)[reply]
The Earwig Hello. My original objection was about deleting information when I did not understand why it needed to be deleted. If this bot is not deleting information then I do not object to anything. If it is deleting something then I want more information, either on-wiki or by a phone or video chat if that makes things easier. Blue Rasberry (talk) 21:32, 12 January 2016 (UTC)[reply]
I understand that persondata will be deleted. I am just not sure why this, as a preservation project, is proposed as the bot to execute that deletion. I support the preservation effort but fail to recognize the rationale for this bot to preserve then delete persondata after archiving it. Blue Rasberry (talk) 21:34, 12 January 2016 (UTC)[reply]
@Bluerasberry: The ultimate goal is—of course—to get all persondata off of Wikipedia and as much of it onto Wikidata as possible. As it stands, the storage format of persondata makes this very difficult, while the bot's database lets the migrators work more quickly. I think that much is clear. The reason we want to remove persondata rather than just leave it around is because energy is expended dealing with a template that is currently useless; it takes up space in articles, people try to update it, remove it manually, etc, all of which requires effort. A mass-removal clearly marks persondata as historical and freezes its information in one state that can be worked through without concerns of this desynchronization. As long as the bot's database exists and is being reviewed, no information is lost by removing it.
Maybe someone else can provide a more coherent argument, though. — Earwig talk 22:09, 12 January 2016 (UTC)[reply]
The Earwig I know that persondata needs to be removed. I just want you to explain why you think this project should remove the persondata before there is confirmation that it is backed up in this database. Why not just delay the deletion until everyone agrees, "Yes, this bot did a backup." I still am not understanding the urgency to do the deletion.
Can you not just collect the data as of a certain date, like 12 January 2015?
I am imagining a limbo in which the persondata is deleted and the database is not created. The work flow as described is capture data, delete data locally, then establish the other database. What if someone has a major problem with persondata in your database? Why not get agreement that your database works before deleting the data locally? What information am I lacking that you have that makes you sure that at the time of local deletion, everyone will be happy that the persondata is gone? Blue Rasberry (talk) 22:53, 12 January 2016 (UTC)[reply]
@Bluerasberry: You seem to be crucially mistaken; the database exists and has been functional for over a month now. (Note that I said in my above comment We're at the point where all of the data has been copied over to the database and it is sufficiently developed to be usable.) Feel free to try it out. It's also linked from the edit summaries of each removal from the trial. — Earwig talk 05:41, 13 January 2016 (UTC)[reply]
@T.seppelt: Could we get a look at the source code for the removal component? I want to see how it's doing that. — Earwig talk 07:43, 13 January 2016 (UTC)[reply]
In Géza, Grand Prince of the Hungarians, why does the "date of death" challenge seem to appear twice? — Earwig talk 08:05, 13 January 2016 (UTC)[reply]
@The Earwig: the source code is here. This is the calling part:
public static void main(String[] args) throws Exception {
    GlobalMediaWikiConnection global = new GlobalMediaWikiConnection();
    global.setBot(true);

    MediaWikiConnection wikipedia = global.openConnection("en", "wikipedia.org");
    wikipedia.login(Config.LOGIN);
    wikipedia.setEditInterval(10000L);

    // Edit summary used for every removal; %article appears in the link to the challenges page.
    String summary = "migrating [[Wikipedia:Persondata|Persondata]] to Wikidata, "
            + "[[toollabs:kasparbot/persondata/|please help]], "
            + "see [[toollabs:kasparbot/persondata/challenge.php/article/%article|challenges for this article]]";
    new TemplateRemovalTask(wikipedia, "Persondata", summary, 0).run();
}
I am aware that there are some duplicates with small IDs (< ~ 1000). I am working on identifying them. There are about 100 of them because I didn't truncate the whole database after the testing period and imported some challenges twice. -- T.seppelt (talk) 14:07, 13 January 2016 (UTC)
@T.seppelt: My concern here is that the regex replacement will not work correctly if some persondata item contains an embedded template. I am not sure in practice how common this is (it is a mistake and surely very rare, but I don't know if we have some procedure in place that actively removes them)—either way, I don't think we can rely on it never happening. — Earwig talk 22:34, 13 January 2016 (UTC)[reply]
I know. I was not able to come up with a better pattern. I will check some guides on recursive patterns in Java. Do you have a solution in mind? --T.seppelt (talk) 05:13, 14 January 2016 (UTC)[reply]
@The Earwig: I checked the guides about regular expressions in Java on the internet. It seems to be impossible to define subrules (like this group should only contain templates) in Java. I would suggest running the programme as it is currently on GitHub and seeing how many errors occur. Depending on the amount the script can be adjusted. --T.seppelt (talk) 14:49, 18 January 2016 (UTC)
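One way around the regex limitation discussed here is to skip regular expressions for the removal and count brace depth instead. The following is a minimal sketch of that technique, not the code of the bot's TemplateRemovalTask; the class and method names are hypothetical. It assumes the template name is matched case-insensitively (a simplification) and it does not treat nowiki sections or HTML comments specially.

```java
// Hypothetical sketch: remove a named template, including any templates
// nested inside it, by tracking the depth of {{ and }} pairs.
class TemplateStripper {

    static String removeTemplate(String text, String name) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int end = text.startsWith("{{", i) ? matchTemplate(text, i, name) : -1;
            if (end < 0) {
                out.append(text.charAt(i)); // not the target template: copy through
                i++;
            } else {
                i = end;                    // skip the whole template
            }
        }
        return out.toString();
    }

    // Returns the index just past the closing }} if the named template starts
    // at 'start', or -1 if it does not or if the braces are unbalanced.
    static int matchTemplate(String text, int start, String name) {
        int p = start + 2;
        while (p < text.length() && Character.isWhitespace(text.charAt(p))) p++;
        if (!text.regionMatches(true, p, name, 0, name.length())) return -1;
        p += name.length();
        while (p < text.length() && Character.isWhitespace(text.charAt(p))) p++;
        // The name must end here, so that e.g. "Persondata2" is not matched.
        if (p >= text.length() || !(text.charAt(p) == '|' || text.startsWith("}}", p))) return -1;

        int depth = 1; // the opening {{ of the target template
        while (p < text.length()) {
            if (text.startsWith("{{", p)) {
                depth++;
                p += 2;
            } else if (text.startsWith("}}", p)) {
                depth--;
                p += 2;
                if (depth == 0) return p;
            } else {
                p++;
            }
        }
        return -1; // unbalanced: leave the page untouched
    }

    public static void main(String[] args) {
        String page = "Intro\n{{Persondata\n| NAME = Doe, John\n"
                + "| DATE OF BIRTH = {{birth date|1950|1|2}}\n}}\n[[Category:People]]";
        System.out.println(removeTemplate(page, "Persondata"));
    }
}
```

Returning -1 on unbalanced braces means a malformed page is left unchanged rather than truncated, which seems the safer default for a run of this size.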

More quibbles

One more quibble TS: You seem to have done away with the verbs in item 2 above completely. I got the message "Your decision to the information was successfully processed". :) --Izno (talk) 12:16, 13 January 2016 (UTC)[reply]

In those same descriptions, I might suggest only bolding the verb of interest, rather than the entire sentence in which the verb appears. --Izno (talk) 12:23, 13 January 2016 (UTC)[reply]
On challenges with conflicts, we get the text "You are kindly ask to decide" -> "You are kindly asked to decide". --Izno (talk) 12:23, 13 January 2016 (UTC)[reply]
Last quibble: Dates pre-1920 are still appearing in challenges (whether as additions or as conflicts). Have you finished excluding them yet? --Izno (talk) 12:23, 13 January 2016 (UTC)[reply]
The verbs should be back and the only words which are bold. The text is now grammatically correct. I am working on the exclusion. It will be done by tomorrow. -- T.seppelt (talk) 14:24, 13 January 2016 (UTC)
everything  Done. They are not excluded but by using ORDER BY challenges with dates before 1920 will only appear when all other challenges are done. This is the fastest solution at the moment. -- T.seppelt (talk) 14:33, 13 January 2016 (UTC)[reply]
That's an acceptable interim solution. --Izno (talk) 16:23, 13 January 2016 (UTC)[reply]
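The deferral described above was done with ORDER BY in the tool's database. The same idea can be sketched in memory with a comparator: challenges are never dropped, only sorted so that dates before 1920 come last. The Challenge class, its fields and the cutoff constant are assumptions made for this illustration and are not taken from the tool.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: serve pre-1920 date challenges only after all others.
class ChallengeOrdering {

    static final int CUTOFF_YEAR = 1920;

    static final class Challenge {
        final int id;
        final Integer year; // null when the challenge is not about a date

        Challenge(int id, Integer year) {
            this.id = id;
            this.year = year;
        }
    }

    static boolean isDeferred(Challenge c) {
        return c.year != null && c.year < CUTOFF_YEAR;
    }

    static List<Challenge> inServingOrder(List<Challenge> all) {
        List<Challenge> sorted = new ArrayList<>(all);
        // Boolean.FALSE sorts before Boolean.TRUE, so deferred challenges go last;
        // within each group the original id order is kept.
        sorted.sort(Comparator.comparing(ChallengeOrdering::isDeferred)
                .thenComparingInt(c -> c.id));
        return sorted;
    }

    public static void main(String[] args) {
        List<Challenge> all = List.of(
                new Challenge(1, 1850), new Challenge(2, 1975),
                new Challenge(3, null), new Challenge(4, 1919));
        for (Challenge c : inServingOrder(all)) {
            System.out.println(c.id + " " + c.year);
        }
    }
}
```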

Conclusion

{{BAGAssistanceNeeded}} I think everything is ready so far. I would like to start with the removal. Are there any further objections? --T.seppelt (talk) 20:19, 23 January 2016 (UTC)

For the potential nested template issue: the easiest thing for us would probably be to ignore pages containing \{\{\s*persondata[^\{\}]*\{\{ and leave them for manual removal, right? There should be few enough to not cause problems. Can someone explain more about the issue with pre-1920 dates? I'm not aware of any other concerns, though I'd still like another BAG member to check over this due to the sheer number of edits. -- Earwig talk 21:25, 23 January 2016 (UTC)
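A check along the lines suggested above could look like the sketch below. It reads the character class in the proposed pattern as a negated one, [^\{\}], which is an assumption about the intended expression, and it matches the template name case-insensitively. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: flag pages where another template opens inside
// Persondata before any brace closes, so they can be left for manual removal.
class NestedTemplateCheck {

    static final Pattern NESTED = Pattern.compile(
            "\\{\\{\\s*persondata[^\\{\\}]*\\{\\{", Pattern.CASE_INSENSITIVE);

    static boolean needsManualRemoval(String wikitext) {
        return NESTED.matcher(wikitext).find();
    }

    public static void main(String[] args) {
        System.out.println(needsManualRemoval(
                "{{Persondata|NAME=Doe|DATE OF BIRTH={{birth date|1950|1|2}}}}"));
        System.out.println(needsManualRemoval(
                "{{Persondata|NAME=Doe}}\n{{Authority control}}"));
    }
}
```

A page is only flagged when the next brace after the opening of Persondata is another opening pair, so a following, separate template does not trigger the check.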
@The Earwig: Re-pre-1920 dates, the Gregorian calendar did not become "a thing" for all countries until that year, so exact dates prior to that year would be imported with incorrect values, or could be, without doing some research. As it is, I don't think the tool TS has built currently handles being able to import non-Gregorian dates (whether by using the Wikidata API which does provide for it, I believe, or by providing for the translation within the tool; and a UI has to be built regardless). --Izno (talk) 01:56, 24 January 2016 (UTC)[reply]
@The Earwig: Yes, I'd like to leave them for manual removal since none of us really knows how many there are. @Izno: As you know those challenges are currently deferred. There is no need to hurry; let's think about a good solution. I'd suggest the following: The Pre-1920 dates are available on a separate site which provides more guidelines (and is only accessible when you did a certain amount of regular imports???). If the person has any relations to a country this country is displayed. Besides deciding about import or rejection the user can choose the calendar (Gregorian / Julian). What do you think about this? -- T.seppelt (talk) 08:50, 24 January 2016 (UTC)
In response:
  1. Yes, a separate site.
  2. Also not sure about certain amount of regular imports. Probably not: There are going to be experts who just want to take care of 'their' page.
  3. Yes, definitely display the country if it is known. Or perhaps, the country of the location where the person was born and where the person died, if we can, since a person can move from one place to another in between.
  4. Two ways to do it:
    1. Instead of offering to "import", have both calendar types displayed.
    2. When the user selects any choice where the value is imported, ask them the calendar before moving to the next challenge.
I can go either way on 4. 4.1 seems like it will clutter the UI (multiple import options -> multiple import options x 2) but 4.2 seems like it will add an extra step which could be avoided. --Izno (talk) 16:19, 24 January 2016 (UTC)[reply]
@Izno: Thank you for the input. I don't really get what you mean by #2. Apart from this I would go for #4.1. I will work on a page. -- T.seppelt (talk) 16:36, 24 January 2016 (UTC)[reply]
Requiring some number of imports for the pre-1920s dates seems like a bad idea, because there will be some article experts who will know the correct date and can just "handle" it. --Izno (talk) 18:20, 24 January 2016 (UTC)[reply]
Okay, fine. Now I get it. Thanks, -- T.seppelt (talk) 18:24, 24 January 2016 (UTC)[reply]
Hmm, can the bare minimum be some warning when trying to import pre-1920 dates to "be careful"? As long as we have some system in place (which can be developed later) and never lose or hide the data, I think we're fine. I'm going to approve this on January 30, barring any objections, if no one else gets to it first. — Earwig talk 06:59, 27 January 2016 (UTC)[reply]
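To illustrate why the calendar choice raised in this thread matters, the sketch below converts a date written in the Julian calendar to the Gregorian one using only the Java standard library. It is not how the tool or the Wikidata API handles calendars, which this discussion does not specify; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical sketch: the same written date denotes different days
// depending on whether it is read as Julian or Gregorian.
class CalendarCheck {

    // month is 1-based (1 = January).
    static LocalDate julianToGregorian(int year, int month, int day) {
        GregorianCalendar julian = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        // A change date of Long.MAX_VALUE makes the calendar purely Julian.
        julian.setGregorianChange(new Date(Long.MAX_VALUE));
        julian.clear();
        julian.set(year, month - 1, day);
        // LocalDate uses the proleptic Gregorian (ISO) calendar.
        return julian.toInstant().atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        System.out.println(julianToGregorian(1917, 10, 25));
    }
}
```

For a date in the early twentieth century the two calendars are 13 days apart, so importing a Julian date unchanged as Gregorian would store a different day.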

 Approved. — Earwig talk 04:23, 31 January 2016 (UTC)[reply]

The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.