The wiki is now up again. The address is the same but it’s hosted on a different server with newer versions of MediaWiki and its extensions. This has fixed some minor problems, although there could be some new problems that I haven’t found yet. Before and during the upgrade, I made some changes to the data structures:
- Has URL is now used for units. This is their official website if they have one now (such as London livery companies). For regiments, it’s a link to the BCW Project Regimental Wiki, although that site is temporarily down.
- Units have new properties: Has clothing and Has symbol. These are mainly intended to represent coat colours and flags of military units, but they can also be used for coats of arms, livery etc.
- Removed external identifier properties for people, except Wikidata ID, which is now the spine for all other person identifiers.
- Book editions have been split into separate forms and templates:
  - Early edition. Published before 1800. Corresponds exactly to an ESTC number and has no lower levels.
  - Modern edition. Published after 1800. Formats are now represented by repeatable subtemplates within the edition page, and never by separate pages.
- Work level for books and articles is now only represented when needed: when a work has more than one version, or a Wikidata ID.
- Printed copies and printed text sections are no longer represented. They could be brought back in future, but for now the main focus of the project will be more on manuscripts than print.
- Property for author death year removed. This can usually be discovered via Wikidata.
- Removed property for gender status. Personnel roles are now classed by gender instead, which is a more efficient and flexible way of classifying people.
- New property Is role in, used to link a personnel role definition to the types of organization that it can be used with.
- Manuscript texts no longer have properties for date signed or the date an original will was made. These are now covered by Has earliest known date.
- Ancestor command relationship now has more specific subclasses, which make it easier to query for things like all the settlements in a county.
The By the Sword Linked wiki will be temporarily unavailable because of a server move and software upgrade. I can’t predict exactly when it will go down and come back up but it should all be done within 2 weeks. I’ll post again to say when it’s back and what has changed. If anything goes so badly wrong that I can’t bring it back, the entire contents will still be available on GitHub under CC-BY-SA.
After long delays because I’ve been too busy, I’ve finally found a week to get back to this project. This hasn’t led to much new content, but I’ve made some changes behind the scenes which will make things easier in future.
- the biggest change is that the licence is now Creative Commons Attribution ShareAlike (CC-BY-SA). This will stop big companies from reusing my data in paywalled resources (not that they had been, and I doubt that they would think of it, but I want to be sure) and will increase the range of sources that I can import data from. The main disadvantage is that it will stop some other projects from reusing my data because their licences aren’t compatible.
- new property for historical people: Spelt own name. This records how they signed their names. Now page names have less need to reflect the original spelling, which is more flexible. Could also be useful to people studying literacy.
- new properties for serial publications: Contents last updated and New issues expected in. These will make it easier to keep track of whether journal contents are up to date.
- Collections can now have different values for Instance of. This makes it easier to include or exclude series of pay warrants when querying for sources, and will provide a way of indexing indemnity cases in SP 24 in future.
- forms and templates have been overhauled. Some of this will make the pages better structured, but a lot of it is behind the scenes and won’t make an obvious difference to reading pages or querying the data.
- removed spurious precision from WGS84 coordinates. They should now only have 5 decimal places, which is precise to about 1 metre on the ground.
- maps have been moved out of the main namespace. Pages that used to show a map now usually show a link that you can follow to view the map. This will make pages less cluttered and should save some resources on both the server and client sides. It’s also a possible workaround for a bug that stopped maps from displaying properly if the Litespeed cache was enabled, although the server load is so low at the moment that I don’t need a cache.
- the hierarchy of subject headings has been rearranged.
- units that never move can have a permanent location set instead of a repeatable template.
- Churches and cathedrals are now types in their own right. I’m still not planning to add any more specific types for buildings because defining a house or a castle or a fortification is so difficult.
- ID properties for Early Modern Letters Online and Six Degrees of Francis Bacon have been removed because Wikidata is already a spine for these and there’s no need to duplicate them.
- fixed some broken links.
- the GitHub repository has moved. This is partly because of the new licence, and partly to make it easier to share other data that isn’t wiki dumps but is still related to the project.
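The point above about coordinate precision can be checked with a little arithmetic. The following is only a rough sketch, not the code used on the wiki: it assumes a spherical Earth with a mean radius of 6,371 km, and the function names and sample coordinates are made up for illustration.

```python
import math

# Mean Earth radius in metres. This is a common spherical approximation
# and an assumption of this sketch, not a value taken from the wiki.
EARTH_RADIUS_M = 6_371_000


def round_coord(lat, lon, places=5):
    """Round a WGS84 latitude/longitude pair to a fixed number of decimal places."""
    return round(lat, places), round(lon, places)


def step_size_m(lat, places=5):
    """Ground distance in metres covered by one unit in the last decimal place.

    Returns (north_south, east_west). The east-west figure shrinks with
    the cosine of the latitude because meridians converge towards the poles.
    """
    step_rad = math.radians(10 ** -places)
    north_south = EARTH_RADIUS_M * step_rad
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west


# Illustrative coordinates only, not taken from a wiki page.
print(round_coord(51.5073509, -0.1277583))
print(step_size_m(52.0))
```

Under these assumptions, one step in the fifth decimal place is roughly 1.1 metres north to south and about 0.7 metres east to west at 52°N, so anything beyond five places would claim more precision than the source data can support.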
Things have been quiet here for longer than I expected because I’m busy earning money, which is the best thing to do at a time like this. Meanwhile, someone else’s project needs help, and it will also help By The Sword Linked in the long term.
Index Villaris is a project to create freely reusable geodata from a directory of places in England and Wales printed in 1680. The printed book lists 24,000 settlements along with their latitude and longitude, and the county, hundred and rural deanery they were in. This data will obviously be very valuable for By The Sword Linked and for many other things. It will allow me to:
- add every English settlement that existed. Currently I have 12,000 wiki pages for settlements that had administrative units named after them, and an offline list of another 1,000 names of units that I haven’t yet been able to match to settlements.
- add Welsh settlements more easily. So far I haven’t tried to tackle Wales.
- get more accurate coordinates for the locations of settlements in the 17th century. My existing coordinates are taken from Ordnance Survey data showing where the settlement is now. Some will have moved because of landscaped parks, coastal erosion or other reasons.
- link settlements to hundreds and deaneries. This will make my current practice of using settlements as proxies for parishes and townships more effective, and would make it easier to add parishes in future.
An alpha version of the data has already been released. The project team have been able to identify and locate 95% of the settlements in the list, but they need help with the other 5%. The instructions page gives details of how to help. You can do the task in a web browser on a computer. I’ve found it easy to use and have been able to offer a few suggestions. The unidentified settlements are often misplaced on the map, which is why the correct identification wasn’t always obvious. Sometimes this is just an error in the coordinates in the original printed edition. For example, Shenley Brook End and Warrington (both in Buckinghamshire) were listed under the correct county and hundred but had the wrong coordinates printed. They were both easy to correct because they are fairly well-known places. Printed coordinates of some places are so wrong that they appear in the wrong county or even in the sea! In other cases, the printed counties or hundreds may be wrong, and the place names may use archaic phonetic spellings. Some settlements may be very small and obscure. That’s why more people with detailed knowledge of local history are needed to help.
The wiki now has pages for ecclesiastical units in England and Wales. You can drill down from the Church of England to find provinces, dioceses, archdeaconries and rural deaneries. The relationships in this hierarchy are all referenced to Ecton’s Liber Valorum with links to page images at the Internet Archive. Parishes in London and Southwark will be added within the next few weeks. Parishes in other towns and cities that were divided into more than one parish will follow eventually, but I don’t know how long it will take. I have no plans to import rural parishes because I think settlements will be adequate for record linkage.
From now on I’m going to keep importing as much data as I can but I will also stop promoting the project for a long time because it’s still not ready to get much attention, and trying to keep people interested is too much of a distraction for me. I would rather do things in the order and at a pace that’s most convenient for me. This means that I won’t be posting much on this blog unless I want to share something especially important, and I’ve deactivated the project’s Twitter account. I hope that I’ll be ready for a big relaunch in the autumn of 2022, but it might take even longer if unexpected things happen. Meanwhile, the wiki and GitHub repository will still be available and will still be updated every so often.
Covid interferes with everything eventually but it has only interfered with this project in a very indirect way: I needed new glasses and I waited to get vaccinated before booking an eye test. Then when I got new glasses, I had to get used to varifocals. That’s all out of the way, and I’m making progress with the wiki again.
First of all, I’ve imported a few hundred more pages. These are mostly streets and buildings in London. We also have a page for every cathedral in England and Wales.
Some other changes that have been made in the last few months:
One of the advantages of doing this project on my own with no funding or institutional support is that there are no deadlines. A disadvantage is that working on the project has to come third behind paid work and peer-reviewed publications. That’s why progress has been so inconsistent in the past year – it’s nothing to do with coronavirus, because this is a project that I can do from home at the moment. In the first half of 2020 I made huge progress and got the size of the wiki to more than 20,000 pages (mostly settlements in England and Scotland). Then it stopped as other things got in the way. For the last seven months there have only been occasional manual edits, and long periods with apparently nothing happening. But I have been doing a lot of offline work that isn’t visible on the wiki.
The biggest task that I’m working on is preparing to import the Propositions lists from SP 28/131. These are three account books (one doesn’t even have a page yet) that list people’s names, addresses, and occupations, and details of the horses and arms that they loaned to Parliament in 1642 and 1643. This is an amazing source that has been a big part of my research on horse supply, and I want other people to be able to use it more easily. Among other things, it allowed me to identify Davy the horseman from Nehemiah Wharton’s letters, and even find the colour of his horse! Although I’ve made a lot of progress with record linkage, it still needs a lot more work before I can import it, and I don’t know when I’ll find the time because I’m lucky enough to have a good amount of paid work lined up, and a potential journal article that I can finish writing without any archive trips. Because of this, I might use any spare time I get to work on smaller, easier tasks that I can finish quickly, so at least something will be happening on the wiki, even if it’s not the most exciting thing. I’m deliberately being vague about what these tasks might be because it’s hard to predict what will be easiest to do in the time I’ve got.
Meanwhile, working on record linkage for the Propositions lists has led to a little bit of visible progress. Today I imported about 40 English settlements that were previously missing. The latest dumps of wikitext pages and RDF have been uploaded to GitHub.
I’ve been working on importing wiki pages for settlements in England. This post follows on from the one about importing Scottish places and will refer back to that instead of repeating all the details, but for England some things will be different.
Every time I think I’ve finalised the data structures of the semantic wiki, I realise that something could be done better or that querying for a certain thing is too difficult. This week I’ve given the wiki a big overhaul, including:
- upgraded software to Semantic MediaWiki 3.1.6, Maps 7.18.0 and Page Forms 4.9. As far as I know, this hasn’t broken anything yet.
- imported about 750 regicides and MPs who were never peers. See Category:Agents for the latest list of historical people. Regicides are also listed in Category:Regicides of Charles I and MPs have personnel relationships with the House of Commons in the Short and/or Long Parliament. These imports are based on Wikidata items that I found via Wikipedia categories. There may be errors that I haven’t found yet.
- pages for historical people, places, organizations, and events now show up to 10 linked sources or a message that there are no linked sources, so you don’t have to click a link to find out.
- changed the properties that subobjects use to link agents to organizations, and participants to events, so that they’re not the same ‘has parent’ and ‘has subordinate’ properties used for command structure relationships. I think this should make queries simpler and more efficient, and avoid possible confusion, because there’s less need to check what the subobject is an instance of.
- added a new property, ‘has allegiance’, to the subobject for event participants to show which side they were on.
- agents and units now use different types of subobject (but with all the same properties apart from ‘is instance of’) to link to events. This makes it easier to query for participants and simplifies the roles that need to be assigned to participants.
- roles in events have been redefined. See Category:Event participant roles for the latest list.
- royalist allegiance factions have been merged into one: royalist forces. This is simpler and allows for cases where we know that someone served as a royalist soldier but not when or which king they served.
- where an agent is a member of an organization that can be assigned allegiance, the allegiance faction should also be entered as a second value in the personnel relationship. This makes queries for soldiers by which side they were on simpler and more precise. It also allows for cases where we know that a soldier was on a certain side but not which unit they were in.
- updated Help:Data structures to reflect changes to properties and templates.
- added Geonames IDs to some more Scottish settlements, so coverage is now about 1/3, but linking Geonames and Ordnance Survey data has turned out to be quite difficult.
- fixed some broken redirects and missing categories.
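Several of the changes above are meant to make querying simpler. As a rough illustration of what a query from outside the wiki might look like, here is a minimal sketch that builds a URL for Semantic MediaWiki’s `ask` API module. The API address is a hypothetical placeholder, and the printout property name ‘Wikidata ID’ is an assumption that should be checked against Help:Data structures. Only the category name is taken from the list above.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real address of the wiki's api.php is not given here.
API_URL = "https://wiki.example.org/api.php"


def build_ask_url(conditions, printouts=(), limit=10, api_url=API_URL):
    """Build a URL for Semantic MediaWiki's `ask` API module.

    conditions: strings such as "Category:Regicides of Charles I",
                each wrapped in [[...]] to form the query conditions.
    printouts:  property names to return for each result (assumed names).
    limit:      maximum number of results to request.
    """
    parts = [f"[[{condition}]]" for condition in conditions]
    parts += [f"?{name}" for name in printouts]
    parts.append(f"limit={limit}")
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{api_url}?{urlencode(params)}"


# The category comes from the list above; the printout name is an assumption.
url = build_ask_url(["Category:Regicides of Charles I"], printouts=["Wikidata ID"])
print(url)
```

This only builds the URL and does not send a request, so it can be tried without touching the server. The same function would work for other conditions once the real property names are known.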
This month I’ve started working on importing large amounts of data for places in Great Britain. I’m writing this blog post as I go, so that I can easily remember what I did. Imports of authors seem like old news now, so I might not write any more detailed posts about how I did it. See below for more details of how I did the geodata. This is a very long post by today’s standards, so the ‘Too Long, Didn’t Read’ version is that I’ve imported pages for: