‘On this day in history’ is a popular thing on social media but the dates are often technically wrong because of the discrepancy between the Julian and Gregorian calendars. This isn’t a serious problem if you’re just tweeting factoids, but it is a serious problem if you want to create reliable historical data that people can use in their research. This post explains the problem and how Semantic MediaWiki can solve it.
The problem is that the Julian calendar has a leap year every four years, which is slightly too many. This wasn’t so bad when Julius Caesar introduced the new calendar (which was better than what went before) but because the Julian calendar was used in Europe for such a long time, it got increasingly out of sync with the solar year. Contrary to stereotypes connected with the myth of Galileo, it was the Roman Catholic Church that decided to fix the problem in the 16th century. In the new Gregorian calendar, years divisible by 100 would only be leap years if they were also divisible by 400, so 1600 was a leap year in both calendars, but 1700 was a leap year only in the Julian calendar. Catholic countries (and some Protestant parts of the Netherlands) changed in 1582 by skipping 10 days, although the days of the week carried on in sequence and only the day of the month jumped. Protestant countries were mostly more reluctant to adopt a ‘popish’ calendar even though it was very inconvenient for different parts of Europe to use different calendars. Britain and Ireland switched in 1752, by which time the discrepancy had increased to 11 days (although the ‘give us back our 11 days’ protests were probably made up by William Hogarth). Sweden’s failed attempt to make a gradual transition in the early 18th century was incredibly complicated and confusing, so I’m glad that’s outside the scope of my project! Eastern Orthodox countries were even slower to change. Russia didn’t adopt the Gregorian calendar until 1918, which is why the ‘October Revolution’ is now celebrated in November. (If you want more details, Wikipedia has a list of transition dates.)
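To make the arithmetic concrete, here is a minimal Python sketch of the two leap-year rules and of how the gap between the calendars grows. The function names are made up for illustration, and the gap calculation is a year-level approximation that assumes the usual convention that the calendars coincided during the 3rd century.

```python
def is_julian_leap(year: int) -> bool:
    """Julian rule: every fourth year is a leap year."""
    return year % 4 == 0


def is_gregorian_leap(year: int) -> bool:
    """Gregorian rule: century years are leap years only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def calendar_gap(year: int) -> int:
    """Days by which the Julian calendar trails the Gregorian one.

    Year-level approximation: the gap really widens at the end of
    February in each century year that the Gregorian calendar skips,
    so January and February of 1700, 1800 and 1900 still have the
    previous century's gap.
    """
    century = year // 100
    return century - century // 4 - 2


for year in (1600, 1700):
    print(year, is_julian_leap(year), is_gregorian_leap(year))
# 1600 True True
# 1700 True False

for year in (1582, 1642, 1752, 1918):
    print(year, calendar_gap(year))
# 1582 10
# 1642 10
# 1752 11
# 1918 13
```

The 13-day gap in 1918 is what moves the ‘October Revolution’ into November.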
This all means that documents created in Britain and Ireland during the 17th century usually have Julian dates, which differ from the Gregorian dates that we use now and that much of Europe was already using at the time. Most spreadsheet and database software only uses the Gregorian calendar. For my PhD research I didn’t need to worry about this too much because I was only dealing with England in the 1640s, when the leap years were the same in both calendars, so I could pretend that the dates were Julian even though Microsoft Office thought they were Gregorian.
By The Sword Linked potentially has a wider scope and needs to be much more rigorous so that the data can be reused. Luckily, Semantic MediaWiki now has built-in handling for Julian dates. All dates are stored internally as Gregorian because that’s easier for the underlying database to deal with, but the software also stores extra data to say which calendar was specified when the date was entered. It can convert between the calendars, so dates can be entered in either calendar, and query results can display either calendar or both regardless of which one was used at entry.
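The general idea of storing one internal value plus a note of the entry calendar can be sketched in a few lines of Python. This is not Semantic MediaWiki's actual code: the StoredDate class, its field names and the "JL"/"GR" flags are assumptions made for illustration, and the conversions use the standard integer formulas for Julian Day Numbers.

```python
from dataclasses import dataclass, field
from datetime import date

# Python's date ordinal 1 (Gregorian 0001-01-01) is Julian Day Number 1721426.
JDN_OFFSET = 1721425


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian calendar date -> Julian Day Number."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_julian(jdn: int) -> tuple[int, int, int]:
    """Julian Day Number -> (year, month, day) in the Julian calendar."""
    c = jdn + 32082
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    return d - 4800 + m // 10, m + 3 - 12 * (m // 10), e - (153 * m + 2) // 5 + 1


@dataclass(frozen=True, order=True)
class StoredDate:
    """Hypothetical record: Gregorian internally, plus the entry calendar."""
    gregorian: date                                    # used for sorting and comparing
    entered_as: str = field(default="GR", compare=False)  # display hint only

    @classmethod
    def from_input(cls, year: int, month: int, day: int, calendar: str = "GR"):
        if calendar == "JL":
            ordinal = julian_to_jdn(year, month, day) - JDN_OFFSET
            return cls(date.fromordinal(ordinal), "JL")
        return cls(date(year, month, day), "GR")

    def as_julian(self) -> tuple[int, int, int]:
        return jdn_to_julian(self.gregorian.toordinal() + JDN_OFFSET)


edgehill = StoredDate.from_input(1642, 10, 23, "JL")
same_day = StoredDate.from_input(1642, 11, 2, "GR")
earlier = StoredDate.from_input(1642, 10, 30, "GR")

print(edgehill.gregorian.isoformat())  # 1642-11-02
print(edgehill.as_julian())            # (1642, 10, 23)
print(edgehill == same_day)            # True
print(earlier < edgehill)              # True
```

Because every record is compared on the same internal value, ‘before’ and ‘after’ queries work even when the dates were entered in different calendars.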
The built-in date handling made my life much easier but it could still be better. Displaying a double date on the page where it was entered seemed to require the page querying itself twice, which is not recommended. And there’s a bug that displays the wrong day of the week for Julian dates. To improve things, I wrote my own tag extension. Writing MediaWiki extensions can be very hard, but a tag extension is just about the easiest kind. You just need some PHP code to define XML tags and do something with them. PHP usually comes with a library that can convert and format Julian and Gregorian dates so everything was very straightforward. I used this to define a tag that’s very similar to the TEI tag <date>, which expands the entered date into a double date including the day of the week. For example, the tag <date when="1642-10-23 JL" /> would be displayed as: Sunday 23 October 1642JL / 2 November 1642GR, which you might recognise as the date of the battle of Edgehill. (You can see this working at the battle of Edgehill page on the By The Sword Linked wiki.) In practice, dates are often entered into form fields and passed to the tag behind the scenes, so there isn’t even much need to know about the XML code, although it can be used in free text as well.
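The expansion step itself is small. The sketch below is in Python rather than PHP and is not the extension's real code: the function names and the regular expression are assumptions, and it only handles tags marked JL, leaving anything else untouched.

```python
import re
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday")

# Matches tags such as <date when="1642-10-23 JL" /> (Julian input only).
TAG = re.compile(r'<date\s+when="(\d{4})-(\d{2})-(\d{2})\s+JL"\s*/>')


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to a Gregorian date.

    Goes via the Julian Day Number; Python's date ordinal 1 is JDN 1721426.
    """
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return date.fromordinal(jdn - 1721425)


def expand_date_tags(text: str) -> str:
    """Replace each Julian <date> tag with a weekday and a double date."""
    def render(match: re.Match) -> str:
        year, month, day = (int(part) for part in match.group(1, 2, 3))
        greg = julian_to_gregorian(year, month, day)
        # The weekday is the same day in both calendars, so take it
        # from the Gregorian date that Python understands.
        weekday = WEEKDAYS[greg.weekday()]
        return (f"{weekday} {day} {MONTHS[month - 1]} {year}JL / "
                f"{greg.day} {MONTHS[greg.month - 1]} {greg.year}GR")
    return TAG.sub(render, text)


print(expand_date_tags('<date when="1642-10-23 JL" />'))
# Sunday 23 October 1642JL / 2 November 1642GR
```

Taking the weekday from the converted Gregorian date avoids the kind of wrong-weekday bug mentioned above, since the days of the week were never interrupted by the change of calendar.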
Semantic MediaWiki can cope well enough with 17th-century Europe but it still has limitations. Other calendars besides the Julian and Gregorian are still in use today, and even more were used in the past. Even with Julian and Gregorian, it can only deal with countries that used one or the other, or that switched in one go. Early 18th-century Sweden is still very difficult to deal with because it started, and then abandoned, a gradual transition. Cultures of Knowledge’s EM Dates project promises to take a more sophisticated approach that will be able to deal with more calendars, although I don’t think they’ve yet announced many details of how they’re doing it. I had an idea that I abandoned as soon as I found out that Semantic MediaWiki could handle Julian dates: a linked data approach where each day is an entity, and its web page contains dates for that day in multiple calendars. This would have been difficult to implement in Semantic MediaWiki (even before I found out that it was unnecessary) as it would mean a huge number of extra pages for dates, and it would make it harder to use comparison operators in queries (for example, after this date and before that date).
The Old Style and New Style new year is a separate problem that I haven’t dealt with as rigorously. By The Sword Linked just follows the practice of most secondary sources by treating the start of the year as 1 January, and converting from the year stated in the original document if necessary.
Anyway, you can find out more about how dates work in By The Sword Linked on the help page about dates.