Progress, hiatus and future plans

One of the advantages of doing this project on my own with no funding or institutional support is that there are no deadlines. A disadvantage is that working on the project has to come third behind paid work and peer-reviewed publications. That’s why progress has been so inconsistent in the past year – it’s nothing to do with coronavirus, because this is a project that I can do from home at the moment. In the first half of 2020 I made huge progress and got the size of the wiki to more than 20,000 pages (mostly settlements in England and Scotland). Then it stopped as other things got in the way. For the last seven months there have only been occasional manual edits, and long periods with apparently nothing happening. But I have been doing a lot of offline work that isn’t visible on the wiki.

The biggest task that I’m working on is preparing to import the Propositions lists from SP 28/131. These are three account books (one doesn’t even have a page yet) that list people’s names, addresses, and occupations, and details of the horses and arms that they loaned to Parliament in 1642 and 1643. This is an amazing source that has been a big part of my research on horse supply, and I want other people to be able to use it more easily. Among other things, it allowed me to identify Davy the horseman from Nehemiah Wharton’s letters, and even find the colour of his horse! Although I’ve made a lot of progress with record linkage, it still needs a lot more work before I can import it. I don’t know when I’ll find the time, because I’m lucky enough to have a good amount of paid work lined up, as well as a potential journal article that I can finish writing without any archive trips. Because of this, I might use any spare time I get to work on smaller, easier tasks that I can finish quickly, so at least something will be happening on the wiki, even if it’s not the most exciting thing. I’m deliberately being vague about what these tasks might be because it’s hard to predict what will be easiest to do in the time I’ve got.

Meanwhile, working on record linkage for the Propositions lists has led to a little bit of visible progress. Today I imported about 40 English settlements that were previously missing. The latest dumps of wikitext pages and RDF have been uploaded to GitHub.

Final overhaul (until the next one)

Every time I think I’ve finalised the data structures of the semantic wiki, I realise that something could be done better or that querying for a certain thing is too difficult. This week I’ve given the wiki a big overhaul, including:

  • upgraded software to Semantic MediaWiki 3.1.6, Maps 7.18.0 and Page Forms 4.9. As far as I know, this hasn’t broken anything yet.
  • imported about 750 regicides and MPs who were never peers. See Category:Agents for the latest list of historical people. Regicides are also listed in Category:Regicides of Charles I and MPs have personnel relationships with the House of Commons in the Short and/or Long Parliament. These imports are based on Wikidata items that I found via Wikipedia categories. There may be errors that I haven’t found yet.
  • pages for historical people, places, organizations, and events now show up to 10 linked sources or a message that there are no linked sources, so you don’t have to click a link to find out.
  • changed the properties that subobjects use to link agents to organizations, and participants to events, so that they’re not the same ‘has parent’ and ‘has subordinate’ properties used for command structure relationships. I think this should make queries simpler and more efficient, and avoid possible confusion, because there’s less need to check what the subobject is an instance of.
  • added a new property, ‘has allegiance’, to the subobject for event participants to show which side they were on.
  • agents and units now use different types of subobject (but with all the same properties apart from ‘is instance of’) to link to events. This makes it easier to query for participants and simplifies the roles that need to be assigned to participants.
  • roles in events have been redefined. See Category:Event participant roles for the latest list.
  • royalist allegiance factions have been merged into one: royalist forces. This is simpler and allows for cases where we know that someone served as a royalist soldier but not when or which king they served.
  • where an agent is a member of an organization that can be assigned allegiance, the allegiance faction should also be entered as a second value in the personnel relationship. This makes queries for soldiers by which side they were on simpler and more precise. It also allows for cases where we know that a soldier was on a certain side but not which unit they were in.
  • updated Help:Data structures to reflect changes to properties and templates.
  • added GeoNames IDs to some more Scottish settlements, so coverage is now about one third, but linking GeoNames and Ordnance Survey data has turned out to be quite difficult.
  • fixed some broken redirects and missing categories.
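
To illustrate why entering the allegiance faction as a second value in a personnel relationship makes it simpler to query for soldiers by side, here is a minimal Python sketch. It is not how the wiki stores data (Semantic MediaWiki uses subobjects and properties); the class `PersonnelRelationship`, the function `agents_on_side`, the sample agents and units, and the second faction name are all hypothetical. Only ‘royalist forces’ comes from the list above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PersonnelRelationship:
    """Hypothetical stand-in for one personnel relationship subobject."""
    agent: str
    organization: Optional[str]  # the unit, or None when the unit is unknown
    allegiance: Optional[str]    # the allegiance faction, or None when unknown


def agents_on_side(relationships: Iterable[PersonnelRelationship],
                   side: str) -> List[str]:
    """Return the agents on a given side, without looking up any unit."""
    return sorted({r.agent for r in relationships if r.allegiance == side})


# Placeholder data, not taken from the wiki.
SAMPLE = [
    PersonnelRelationship("Agent A", "Regiment X", "royalist forces"),
    # Side known, unit unknown: still found by the query.
    PersonnelRelationship("Agent B", None, "royalist forces"),
    PersonnelRelationship("Agent C", "Regiment Y", "another faction"),
]

print(agents_on_side(SAMPLE, "royalist forces"))  # ['Agent A', 'Agent B']
```

Because the side is recorded on the relationship itself, the query never has to work out which side a unit was on, and an agent whose unit is unknown is not lost.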

Importing Scottish places

This month I’ve started working on importing large amounts of data for places in Great Britain. I’m writing this blog post as I go, so that I can easily remember what I did. Imports of authors seem like old news now, so I might not write any more detailed posts about how I did it. See below for more details of how I did the geodata. This is a very long post by today’s standards, so the ‘Too Long, Didn’t Read’ version is that I’ve imported pages for:

Continue reading

Quick update

I’ve finished working on The Power of Petitioning, and the shutdown is giving me plenty of time to work on By The Sword Linked, so here’s a quick summary of where I am and where I’m going.

The biggest news is that there are now wiki pages for 1,238 authors who weren’t alive during the civil wars (see Authors category for a full list). Of these, 1,049 are linked to Wikidata IDs. About 25% of authors imported so far are women. That’s not ideal, but it may be a fairly accurate reflection of who has publications and theses relevant to the British Civil Wars.

With enough authors in place, I’ve been able to import more publications and theses. For example:

  • Midland History is a journal with links to about 60 articles. This is all of the articles that I think are relevant to the civil wars. Wikidata seems to have complete coverage of this journal up to 2017, which made the imports easier.
  • Helion Century of the Soldier is a series of monographs and edited collections with links to each volume that covers the civil wars. The volume pages have links to the publisher’s website.
  • Theses category lists a small selection of theses, mostly recent and mostly by women, which is encouraging for the future.
  • Open access category is a quick way to find sources that are free to view online, with subcategories for books, articles and theses.

Now that I’ve tested every type of entity at a big enough scale, I think I’ve finally finalised the data structures, although I can’t rule out minor changes if I come across something that needs fixing.

The current situation hasn’t derailed this project but it has changed my priorities for the future. The news that The National Archives of the UK are closed until further notice makes it especially important to share transcripts of Public Records that I already have copies of. Before I do that, I need to import more people and places to make it easier to link sources to subjects. I expect to be doing that for most of April. Then from May onwards I’ll try to share as much material as I can from SP 28. This will also demonstrate the value of the Open Government Licence. In between doing all that, I might write some more detailed blog posts about how I imported data for authors and publications.

Army Committee warrants added

I’m still quite busy with paid work (you can see some of the petitions I’ve been transcribing at British History Online) but I’ve just found time to update the wiki this week. To test the data structures for manuscript texts, I’ve imported a few hundred pay warrants and receipts created by the Army Committee in 1645 and 1646. The Army Committee was a committee of MPs chaired by Robert Scawen, which handled administration and supply for the New Model Army. The warrants and receipts that I’ve imported today are all for buying horses, saddles, and harness. The data originally came from my PhD research but I’ve checked everything against the original documents at Kew and corrected some errors (although I was relieved to find that most of my notes were accurate).

Some examples:

I think I’m now satisfied with the data structures for manuscript texts. I still need to test manuscripts that are divided into sections. After that I want to finish importing authors so I can test books, articles, journals and theses at a bigger scale.

Brief update

This is just a quick roundup of changes to the wiki since the last post in August.

  • there are now pages for every regiment of the New Model Army in the First Civil War.
  • there are pages for every meeting of the Short Parliament, linked to the location where it was held, and proceedings at British History Online:
  • you can now search events by date again. It should work properly now, and there’s an option to limit it to specific types of event.
  • search suggestions in the main search box (at the top of every page) now have accent folding as well as case folding, so if you type a character without an accent, such as e, it will also match accented versions of that character, such as é. This makes Gaelic and Welsh names easier to find. For example, if you type ‘sir fon’ it will match ‘Sir Fôn’. To do this I had to hack the TitleKey extension myself, but it was easier than installing Elasticsearch.
  • some more properties have been removed to simplify the data structures:
    • ‘Addressed from’ because there are many documents it doesn’t apply to, the way I tried to use it was too inconsistent, and ‘Mentions’ is good enough for record linkage.
    • ‘Received on date’ as it’s only known in a minority of cases.
    • ‘Has ARCHON ID’ because Wikidata ID and a link to an archive’s own website do everything that is needed.
  • there are pages for a couple of particularly useful books. If you drill down from work level there are links to scans at the Internet Archive:
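
The accent and case folding described in the search suggestions item above can be sketched in a few lines of Python. This is only an illustration using the standard library’s `unicodedata` module, not the actual change made to the TitleKey extension; the function names `fold` and `matches_prefix` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip accents and ignore case, so 'Sir Fôn' becomes 'sir fon'."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more thorough version of lower() for matching.
    return stripped.casefold()


def matches_prefix(typed: str, title: str) -> bool:
    """True if the folded page title starts with the folded search text."""
    return fold(title).startswith(fold(typed))


print(fold("Sir Fôn"))                       # sir fon
print(matches_prefix("sir fon", "Sir Fôn"))  # True
```

Letters that have no decomposed form, such as ø, pass through this sketch unchanged, so a real implementation might need an extra mapping table for them.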

The next thing I want to do is test manuscript texts on a bigger scale. I have some data from my PhD research for warrants paying for horses and saddles for the New Model Army, but I found some anomalies in the data that will need checking against the originals next time I’m at Kew (probably next week). Once I’ve done that, I should be satisfied enough with the data structures that I can start really big imports. I’m already working on data for about 1,000 authors and 1,600 peers and MPs. The method that I used for meetings of the Short Parliament will scale up to the Long Parliament and Protectorate Parliaments quite easily, so I may as well get that done as soon as I can. In practice I might not be able to start these big imports until the end of the year because I’m likely to be busy with paid work, but the wiki should move up to another level and become much more useful next year.

More changes

This week I’ve finished a big overhaul of the wiki. Changes include:

  • there’s now an external identifier for The Scotland, Scandinavia and Northern European Biographical Database (SSNE). This free-to-view database, created by Steve Murdoch and Alexia Grosjean, is a very important source for the earlier careers of many civil war officers.
  • there’s a simple entity to represent blank pages in a manuscript. This is easier to use and less cumbersome than using the full data structures designed for manuscript texts or sections.
  • behind the scenes, some of the properties have been simplified. This won’t make any noticeable difference at the front end but it does affect the RDF output and writing custom queries.
  • documentation on property pages should all be up to date now.
  • battles and sieges have been overhauled yet again, and I think I’m finally satisfied. More details of this below.

Continue reading

Cataloguing SP 28

Lots of people who have researched the British Civil Wars will know of SP 28, also known as the Commonwealth Exchequer Papers, in The UK National Archives. It’s a very important, and mostly quite poorly catalogued, collection of financial records of the parliamentarian and Protectorate war effort. One of the main aims of this project is to gradually catalogue and index SP 28. I’ve now started importing catalogue data into the wiki.

Continue reading