I’ve been busy for longer than I expected, but this week I’ve had a chance to get back to work on the wiki. This is a quick summary of what has changed so far and what will change next.
My main concern is to simplify things by reducing the number of forms and semantic properties. I’ve now got rid of these properties:
- ‘Written in language’ was only used for manuscripts, not printed texts, anyway. I decided it’s not valuable enough to be worth the extra work of finding out what languages a manuscript uses and entering the data. Getting rid of this property also saves having to import pages for every possible language.
- ‘Has access condition’ and ‘Has photography permission’ would be useful things to know about manuscripts and copies of early printed books, but I realised that keeping the data complete and up to date would be practically impossible. You can still find out how to access material and whether you can photograph it by going to the website of the archive that holds the material. Some archives are vague about photography permissions, but that’s another reason why adding it to the wiki would be difficult.
- ‘Has EMLO work ID’ could only be used to link manuscript texts to catalogue entries at Early Modern Letters Online if they were letters. Although I expect to add a lot of letters to the wiki, they will probably still be a minority. Also, reconciling against EMLO IDs would be a lot of extra work and probably not worth it.
I’ve also rearranged the form and template for articles so that they can represent work level in the same way as books (see my last post for more details of how that works).
Next week I want to rearrange the forms and templates for battles and sieges. This will involve:
- merging battles and sieges into one form/template, probably called ‘Combat event’.
- giving every event WGS84 coordinates as well as a location relative to addresses.
- removing the semantic properties for sides, but keeping them as template parameters and displaying them in a table.
- possibly removing the template parameters for field signs and words. These can still be added as free text. It remains to be seen whether they are known in enough cases to make it worth storing them as structured data.
- adding an option to enter and display either a single date (for a battle that took place on one day) or start and end dates (for battles, sieges, and raids that spread over more than one day). In either case, the semantic properties behind the scenes will always store a start and end date, even if they’re the same as each other. This should make it easier to query for events by date. The existing way of doing it makes battles and sieges fundamentally incompatible with each other.
- using the property ‘Has parent’ to represent potentially infinite hierarchies of events. This will be more flexible than the old way, which only allowed two levels: siege as parent and battle as child. As well as the existing ability to link assaults and sallies to the siege they were part of, this will allow linking captures of individual forts to the battles of Lostwithiel, and making the battle of Chalgrove a part of the raid on Chinnor but still distinct from it.
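To make the date and hierarchy ideas in the list above a bit more concrete, here is a minimal Python sketch. It is not how the wiki works (that is done with forms, templates and semantic properties); it only models the same logic. The names `CombatEvent`, `from_form`, `overlaps` and `ancestors` are made up for illustration, and the dates are illustrative values rather than data taken from the wiki.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CombatEvent:
    """Hypothetical stand-in for a merged battle/siege page."""

    name: str
    start: date  # always stored, even for a one-day event
    end: date    # equals start for a one-day event
    parent: Optional[CombatEvent] = None  # plays the role of 'Has parent'

    @classmethod
    def from_form(cls, name, single=None, start=None, end=None, parent=None):
        """Accept either a single date or a start/end pair, as the form would."""
        if single is not None:
            start = end = single
        if start is None or end is None:
            raise ValueError("give either a single date or both start and end")
        if end < start:
            raise ValueError("end date is before start date")
        return cls(name, start, end, parent)

    def overlaps(self, first: date, last: date) -> bool:
        """One date query works for every event because start/end always exist."""
        return self.start <= last and self.end >= first

    def ancestors(self) -> list[str]:
        """Walk up the parent chain; it can be any number of levels deep."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


# Illustrative dates only (assumption, not wiki data).
raid = CombatEvent.from_form(
    "Raid on Chinnor", start=date(1643, 6, 17), end=date(1643, 6, 18)
)
chalgrove = CombatEvent.from_form(
    "Battle of Chalgrove", single=date(1643, 6, 18), parent=raid
)

print(chalgrove.start == chalgrove.end)  # True
print(chalgrove.ancestors())             # ['Raid on Chinnor']
```

Because a one-day battle and a multi-day raid both carry a start and an end, the same `overlaps` check finds either of them, which is the point of always storing both dates behind the scenes.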
Once the new data structures are in place and working, I’ll need to re-import the pages for battles and sieges, which will also be a chance to add extra information. They will still have the same page names and URLs.
Then I need to test every entity type by importing around a hundred pages of that type. I’ve already done enough of these:
All the other types could do with more examples before I’m satisfied that the data structures and page layouts are alright. When I am satisfied, I can get started on importing bigger batches.