By The Sword Linked will end this year. There are more detailed explanations below the cut, but the main reason is that I don’t have time to give the project the work it needs. All contents of the wiki will be permanently archived at GitHub and reusable under a CC-BY-SA licence. The live wiki will be deleted, but I’ll give plenty of advance warning once I’ve decided the date. Until then there will still be some additions to the wiki. I will also make some other datasets available separately at GitHub because it won’t be practical to import them to the wiki.
More British Library manuscripts imported
Since the last post, I’ve added catalogue data for around 300 more volumes and multi-volume collections of British Library manuscripts. As before, many of these link to the relevant pages in the printed catalogues at the Internet Archive, and some are linked to subjects and creators, although some of these are still red links. The new imports include a lot of the Additional Manuscripts, which were mostly missing from the previous batch, and I’ve also expanded some of the other named collections. For a few manuscripts I’ve indicated, based on my own notes, whether they are select manuscripts or can be photographed, but this is nowhere near complete. In future I hope to add item-level descriptions for texts in some volumes, but I don’t know when I’ll have time to do it.
The British Library situation
If you have anything to do with historical research, you probably know that most of the British Library’s online services have been out of action since October 2023 because of a ransomware attack. The knock-on effect for By The Sword Linked is that outgoing links to UINs in the Main Catalogue, Archives and Manuscripts Catalogue record permalinks, ESTC numbers, and EThOS thesis IDs are all broken. This incident shows a problem with linked data: if one website goes down it also affects other sites. But it also shows that making Open Access publications and reusable datasets available in more than one place can limit the damage.
The good news is that the Shared Research Repository doesn’t seem to have been affected. The repository makes available Open Access books, reports, and datasets. This includes all the thesis metadata from EThOS, last updated in November 2023. In 2021, I used an earlier version of this dataset to add wiki pages for 600 theses relevant to the British Civil Wars, linked to pages for authors and subjects (and many of the authors are linked to Wikidata IDs). In many cases, the EThOS data includes a link to the full text of a thesis at an institutional repository.
This week I’ve imported wiki pages for some British Library manuscript collections and volumes. I compiled the data from my own research notes and the printed catalogues at the Internet Archive. Where I’ve found printed catalogue entries, the wiki page for the collection links to the Internet Archive (although you might find the links unreliable because the IA servers are overloaded). Some collections are linked to subjects and creators, but many aren’t because they’re too miscellaneous. The easiest way to see what’s there is to drill down from the page for Western Manuscripts. I’m working on more data for British Library manuscripts, but I don’t know when I’ll have time to finish it because I’m starting a new job soon. Meanwhile, I’ve put up a general list of interim catalogues and finding aids.
New content imported
Before and after the server move, I imported some new content to the wiki (at last!):
- Most peers and peerages of England, Scotland and Ireland, 1630-1669. There are probably errors and omissions because I mostly imported the data from Wikipedia. Many subsidiary and courtesy titles are certainly missing. As well as drilling down from the links, you can find individual holders of titles by typing in the search box. The page names start with the person’s surname, and there are redirects that start with the distinctive part of the title. For example, Essex, Earl of, 3rd (Robert Devereux) redirects to Devereux, Robert (3rd Earl of Essex). Arranging this data led to creating a new property: Ordinal. This is the number of a title holder such as a peer or baronet.
- MPs in the Long Parliament now have constituency links, referenced to Brunton and Pennington. There may be some mistaken identities where I had to rely on Wikipedia for disambiguation, and a few MPs are still missing because I couldn’t positively identify them. Things will improve once Andrew Gray has imported History of Parliament data to Wikidata. I’ve temporarily removed links to the Short Parliament because the data I had was so inadequate and it will be easier to redo it properly with nothing there.
- Every surviving Buckinghamshire loss account that I know of now has a page and is linked to the place it’s about, so they should show up in the query for linked sources on pages for the relevant settlement.
- All (probably) relevant articles from Journal of the Society for Army Historical Research. Most of these link to archived copies at JSTOR, except for the last two years, which are still behind the moving wall.
- The Making of the English Landscape series now has all the county volumes that were published (the series was never finished, so some counties are missing). These link to National Character Areas covered as well as counties.
- Contents of Midland History and Helion’s Century of the Soldier series have been updated to the end of 2023.
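The page-name and redirect pattern in the peerage example above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not the templates the wiki actually uses: the function and parameter names are hypothetical, and it assumes a title of the form "rank of place", so titles without "of" would need extra handling.

```python
def ordinal_suffix(n):
    """Return an English ordinal such as '3rd' (the value of the Ordinal property)."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def peer_page_names(forename, surname, rank, place, ordinal):
    """Build the (page name, redirect name) pair for one title holder.

    Hypothetical helper: the page name starts with the surname and the
    redirect starts with the distinctive part of the title.
    """
    number = ordinal_suffix(ordinal)
    page = f"{surname}, {forename} ({number} {rank} of {place})"
    redirect = f"{place}, {rank} of, {number} ({forename} {surname})"
    return page, redirect


page, redirect = peer_page_names("Robert", "Devereux", "Earl", "Essex", 3)
print(f"{redirect} -> {page}")
```

Keeping the ordinal as a plain number and formatting it only when the name is built is one way to make title holders sortable in sequence.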
Maintenance finished
The wiki is now up again. The address is the same but it’s hosted on a different server with newer versions of MediaWiki and extensions. This has fixed some minor problems, although there could be some new problems that I haven’t found yet. Before and during the upgrade, I made some changes to the data structures:
- Has URL is now used for units. This is their official website if they have one now (such as London livery companies). For regiments, it’s a link to the BCW Project Regimental Wiki, although that site is temporarily down.
- Units have new properties: Has clothing and Has symbol. These are mainly intended to represent coat colours and flags of military units, but they can also be used for coats of arms, livery etc.
- Removed external identifier properties for people, except Wikidata ID, which is now the spine for all other person identifiers.
- Book editions have been split into separate forms and templates:
  - Early edition. Published before 1800. Corresponds exactly to an ESTC number and has no lower levels.
  - Modern edition. Published after 1800. Formats are now represented by repeatable subtemplates within the edition page, and never by separate pages.
- Work level for books and articles is now only represented when needed: when a work has more than one version, or a Wikidata ID.
- Printed copies and printed text sections are no longer represented. They could be brought back in future, but for now the main focus of the project will be more on manuscripts than print.
- Property for author death year removed. This can usually be discovered via Wikidata.
- Removed property for gender status. Personnel roles are now classed by gender instead, which is a more efficient and flexible way of recording it.
- New property Is role in, used to link a personnel role definition to the types of organization that it can be used with.
- Manuscript texts no longer have properties for date signed or the date an original will was made. These are now covered by Has earliest known date.
- Ancestor command relationship now has more specific subclasses, which make it easier to query for things like all the settlements in a county.
Downtime for Maintenance
The By the Sword Linked wiki will be temporarily unavailable because of a server move and software upgrade. I can’t predict exactly when it will go down and come back up, but it should all be done within 2 weeks. I’ll post again to say when it’s back and what has changed. If anything goes so badly wrong that I can’t bring it back, the entire contents will still be available at GitHub under CC-BY-SA.
Update July 2023
After long delays because I’ve been too busy, I’ve finally found a week to get back to this project. This hasn’t led to much new content, but I’ve made some changes behind the scenes which will make things easier in future.
- the biggest change is that the licence is now Creative Commons Attribution ShareAlike (CC-BY-SA). This will stop big companies from reusing my data in paywalled resources (not that they had been, and I doubt that they would think of it, but I want to be sure) and will increase the range of sources that I can import data from. The main disadvantage is that it will stop some other projects from reusing my data because their licences aren’t compatible.
- new property for historical people: Spelt own name. This records how they signed their names. Page names now have less need to reflect the original spelling, which makes naming more flexible. It could also be useful to people studying literacy.
- new properties for serial publications: Contents last updated and New issues expected in. These will make it easier to keep track of whether journal contents are up to date.
- Collections can now have different values for Instance of. This makes it easier to include or exclude series of pay warrants when querying for sources, and will provide a way of indexing indemnity cases in SP 24 in future.
- forms and templates have been overhauled. Some of this will make the pages better structured, but a lot of it is behind the scenes and won’t make an obvious difference to reading pages or querying the data.
- removed spurious precision from WGS84 coordinates. They should now only have 5 decimal places, which gives a precision of about 1 metre.
- maps have been moved out of the main namespace. Pages that used to show a map now usually show a link that you can follow to view the map. This will make pages less cluttered and should save some resources on both the server and client sides. It’s also a possible workaround for a bug that stopped maps from displaying properly if the LiteSpeed cache was enabled, although the server load is so low at the moment that I don’t need a cache.
- the hierarchy of subject headings has been rearranged.
- units that never move can have a permanent location set instead of a repeatable template.
- Churches and cathedrals are now types in their own right. I’m still not planning to add any more specific types for buildings because defining a house or a castle or a fortification is so difficult.
- ID properties for Early Modern Letters Online and Six Degrees of Francis Bacon have been removed because Wikidata is already a spine for these and there’s no need to duplicate them.
- fixed some broken links.
- the GitHub repository has moved. This is partly because of the new licence, and partly to make it easier to share other data that isn’t wiki dumps but is still related to the project.
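The point above about 5 decimal places can be checked with some quick arithmetic. The sketch below is not the script used for the wiki: it assumes a spherical Earth with a mean radius of 6,371 km, the function names are hypothetical, and the sample coordinates are arbitrary values chosen only to show the rounding.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean radius; a spherical approximation (assumption)


def round_coordinates(lat, lon, places=5):
    """Trim a WGS84 latitude/longitude pair to a fixed number of decimal places."""
    return round(lat, places), round(lon, places)


def resolution_m(lat, places=5):
    """Ground distance in metres covered by one unit in the last decimal place.

    Returns (north_south, east_west). The east-west figure shrinks with the
    cosine of the latitude because meridians converge towards the poles.
    """
    step_deg = 10 ** -places
    metres_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = step_deg * metres_per_degree
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west


# Arbitrary illustrative values, not taken from the wiki
print(round_coordinates(51.815612345, -0.812398765))
print(resolution_m(52.0))
```

Under these assumptions one step in the fifth decimal place is roughly 1.1 metres north to south, and about 0.7 metres east to west at 52° N, so any further digits describe differences smaller than a metre.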
Index Villaris needs help to identify places
Things have been quiet here for longer than I expected because I’m busy earning money, which is the best thing to do at a time like this. Meanwhile, someone else’s project needs help, and it will also help By The Sword Linked in the long term.
Index Villaris is a project to create freely reusable geodata from a directory of places in England and Wales printed in 1680. The printed book lists 24,000 settlements along with their latitude and longitude, and the county, hundred and rural deanery they were in. This data will obviously be very valuable for By The Sword Linked and for many other things. It will allow me to:
- add every English settlement that existed. Currently I have 12,000 wiki pages for settlements that had administrative units named after them, and an offline list of another 1,000 names of units that I haven’t yet been able to match to settlements.
- add Welsh settlements more easily. So far I haven’t tried to tackle Wales.
- get more accurate coordinates for the locations of settlements in the 17th century. My existing coordinates are taken from Ordnance Survey data showing where the settlement is now. Some will have moved because of landscaped parks, coastal erosion or other reasons.
- link settlements to hundreds and deaneries. This will make my current practice of using settlements as proxies for parishes and townships more effective, and would make it easier to add parishes in future.
An alpha version of the data has already been released. The project team have been able to identify and locate 95% of the settlements in the list, but they need help with the other 5%. The instructions page gives details of how to help. You can do the task in a web browser on a computer. I’ve found it easy to use and have been able to offer a few suggestions. The unidentified settlements are often misplaced on the map, which is why the correct identification wasn’t always obvious. Sometimes this is just an error in the coordinates in the original printed edition. For example, Shenley Brook End and Warrington (both in Buckinghamshire) were listed under the correct county and hundred but had the wrong coordinates printed. They were both easy to correct because they are fairly well-known places. Printed coordinates of some places are so wrong that they appear in the wrong county or even in the sea! In other cases, the printed counties or hundreds may be wrong, and the place names may use archaic phonetic spellings. Some settlements may be very small and obscure. That’s why more people with detailed knowledge of local history are needed to help.
Ecclesiastical units
The wiki now has pages for ecclesiastical units in England and Wales. You can drill down from the Church of England to find provinces, dioceses, archdeaconries and rural deaneries. The relationships in this hierarchy are all referenced to Ecton’s Liber Valorum with links to page images at the Internet Archive. Parishes in London and Southwark will be added within the next few weeks. Parishes in other towns and cities that were divided into more than one parish will follow eventually, but I don’t know how long it will take. I have no plans to import rural parishes because I think settlements will be adequate for record linkage.
From now on I’m going to keep importing as much data as I can but I will also stop promoting the project for a long time because it’s still not ready to get much attention, and trying to keep people interested is too much of a distraction for me. I would rather do things in the order and at a pace that’s most convenient for me. This means that I won’t be posting much on this blog unless I want to share something especially important, and I’ve deactivated the project’s Twitter account. I hope that I’ll be ready for a big relaunch in the autumn of 2022, but it might take even longer if unexpected things happen. Meanwhile, the wiki and GitHub repository will still be available and will still be updated every so often.
Spring/summer cleaning
Covid interferes with everything eventually but it has only interfered with this project in a very indirect way: I needed new glasses and I waited to get vaccinated before booking an eye test. Then when I got new glasses, I had to get used to varifocals. That’s all out of the way, and I’m making progress with the wiki again.
First of all, I’ve imported a few hundred more pages. These are mostly streets and buildings in London. We also have a page for every cathedral in England and Wales.
Some other changes that have been made in the last few months: Continue reading