RN WW2 Observations index

Conversion to IMMA

For scientific use and inclusion in ICOADS, the observations must be converted into International Maritime Meteorological Archive (IMMA) format. I used a Perl module for IMMA data to do the processing.

  1. I've sorted the observations (in the format as received from NCDC) by ship. (With this script.) The sorted observations are here.
  2. I've converted the observations for each ship into IMMA format. (With this script.)
  3. I've done positional quality-control checks on the observations for each ship by editing the IMMA files using Google Maps.
  4. Most of the ship records have no latitude and longitude given. I've estimated positions for many of these records by interpolating between records with positions. (With this script.)
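The interpolation in step 4 can be illustrated with a minimal Python sketch. This is not the Perl script linked above: the `Obs` record, the `interpolate_positions` function and the `max_gap_hours` limit are all assumptions made for illustration, and the interpolation is taken to be linear in time between the two nearest records with positions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Obs:
    """Hypothetical stand-in for one ship record (not the IMMA layout)."""
    time: datetime
    lat: Optional[float] = None   # degrees north
    lon: Optional[float] = None   # degrees east, -180..180
    interpolated: bool = False


def interpolate_positions(obs: List[Obs], max_gap_hours: float = 48.0) -> None:
    """Fill missing positions in place, linearly in time, between known fixes.

    Assumes `obs` is sorted by time. `max_gap_hours` is an illustrative
    limit: gaps longer than this are left without positions.
    """
    known = [i for i, o in enumerate(obs)
             if o.lat is not None and o.lon is not None]
    for a, b in zip(known, known[1:]):
        span = (obs[b].time - obs[a].time).total_seconds()
        if b - a < 2 or span <= 0 or span > max_gap_hours * 3600:
            continue
        dlat = obs[b].lat - obs[a].lat
        # Take the shorter way round in longitude, so that a track
        # crossing 180 degrees is not sent the long way round the globe.
        dlon = (obs[b].lon - obs[a].lon + 180.0) % 360.0 - 180.0
        for o in obs[a + 1:b]:
            f = (o.time - obs[a].time).total_seconds() / span
            o.lat = obs[a].lat + f * dlat
            o.lon = (obs[a].lon + f * dlon + 180.0) % 360.0 - 180.0
            o.interpolated = True
```

Records before the first position or after the last one are left untouched in this sketch, since there is nothing to interpolate between.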

Issues with the conversion

  1. The letters marking whether the longitude is East or West (E or W in column 86) and the latitude is North or South (N or S in column 80) are often missing. I've had good success guessing these by assuming each is the same as the last value given.
  2. The longitudes (less often the latitudes) are sometimes out by a factor of 10. In many cases the longitude errors can be detected and corrected using the observation time zones (details). Most of the remaining errors have been corrected during positional QC.
  3. Many of the observations were made with the ship in port. For these observations, the port name was given, but no latitude or longitude. The digitised obs contained 1773 different port names, but the vast majority of names are used only a few times. (Names and number of occurrences.) I used the fuzzy gazetteer, Google and Wikipedia to estimate positions for 123 of the most common port locations, and Dave Croxall continued the work to find positions for most of the rest. I put these positions into the IMMA records where the ships were in those ports, but the mapping was not always possible: many port names could refer to two or more places (Devonport in Plymouth or Devonport in Auckland?) and others were not traceable to anywhere in particular. Several of the original port locations therefore had to be changed or deleted during positional QC. As a result, the positions in the IMMA records fall into 5 categories:
    • Digitised
    • Obtained from meta-data (port name and location)
    • Interpolated (from two digitised positions)
    • Missing while at sea (bad digitised value - deleted during QC, or no digitised position given and no nearby positions from which to interpolate).
    • Missing while in port (couldn't map port name to a position).
  4. The IMMA format uses a 9-character ship identifier. Many of the ship names are longer than this. There is no other obvious identifier, however, so I've truncated the names to 9 characters and used the truncated name as the identifier. A check shows that no two ship names truncate to the same identifier.
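Two of the simpler steps in the list above, carrying the last hemisphere letter forward (issue 1) and checking that truncated ship names stay unique (issue 4), can be sketched in Python as follows. The function names `fill_hemispheres` and `truncated_ids` are hypothetical and this is not the conversion script itself; it only shows the logic as described, assuming the hemisphere letters have already been read from their columns.

```python
from typing import Dict, Iterable, List, Optional


def fill_hemispheres(flags: Iterable[Optional[str]]) -> List[Optional[str]]:
    """Replace each missing hemisphere letter with the last one seen.

    Entries before the first known letter stay as None.
    """
    filled: List[Optional[str]] = []
    last: Optional[str] = None
    for flag in flags:
        flag = (flag or "").strip().upper()
        if flag:
            last = flag
        filled.append(last)
    return filled


def truncated_ids(names: Iterable[str], width: int = 9) -> Dict[str, str]:
    """Map each ship name to its first `width` characters.

    Raises ValueError if two different names truncate to the same
    identifier, which is the uniqueness check described in issue 4.
    """
    ids: Dict[str, str] = {}
    owner: Dict[str, str] = {}   # truncated identifier -> full name
    for name in names:
        short = name[:width]
        if short in owner and owner[short] != name:
            raise ValueError(
                f"{name!r} and {owner[short]!r} both truncate to {short!r}")
        owner[short] = name
        ids[name] = short
    return ids
```

Run once per ship over the latitude letters and once over the longitude letters, `fill_hemispheres` reproduces the "same as the last time there was a value" guess; any wrong guesses would still need to be caught during positional QC.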

Checking the conversions

For each ship, I've made a file of IMMA observations, and 3 diagnostics:
  1. A plot of the ship's track,
  2. A position plot of each observation, showing which are digitised, which interpolated, and which from port metadata,
  3. A summary of the obs (made with RDIMMA).
These diagnostics show that the vast majority of the obs can be converted and used. Getting positions from interpolation and port metadata works well.