After processing two 2011 cruises (CalCOFIs 1104 & 1108) in parallel, using both the IEH-based data formats & the newer sta.csvs, SIO-CalCOFI transitioned from IEH-format data processing (our format for 27 years) to a sta.csvs format that integrates bottle and CTD data. Prior to this, bottle sample processing was separate from CTD data processing. CTD temperatures have been imported into the IEH-generating files since 1993, but other CTD data were processed separately. Replacing a bottle salt or oxygen value with a CTD value was done manually by editing the data files.
IEH History
IEH-format data processing was implemented in 1984 when the expanded 128-column IEH data format was adopted, replacing the 80-column SD2 format. Fortran source code routines were run on a VMS minicomputer until data processing migrated to IBM PCs running DOS in 1990. DOS batch files were used to loop the executables in similar fashion as (slum) looping was performed on VMS. Each cruise produced individual datatype files (salts, oxygen, chlorophyll, nutrients, prodo) for 66 to 109 stations, and the executables processed one file at a time, so batch files (.bat) were used to loop through all the files for each datatype. Once each data file was processed, it was inspected for data quality and missing values. Then all the data files for each datatype were merged into '00', '20', or '22' files (H00100, H00120, H00122; H00200, H00220, H00222; H00300, H00320, H00322...), which were then used to create individual report files (R_001, R_002, R_003...) & IEH files labeled w3### (w3001, w3002...; one per cast or sta). W3s were IEH-formatted individual cast files that could be plotted for point-checking. When bad data were found, the specific datatype file (the H00102 salt file, for example) was corrected and a new w3 file created. Once all w3 files were populated with final bottle data, a single merged IEH file was created, including all the stations in 'report' order (vs station order). This final IEH ('bigIEH' or 'YYMMarch.ieh') was 'terpled': standard level (depth) data were calculated by quadratic or linear interpolation & added to the file.
Footnotes and data quality codes were added to the terpled IEH ('bigieh.t' or 'YYMMt.ieh') before generating data reports. Some additional executables were run on the IEH to add 'integrated 14C', 'integrated chl-a', 'sigma-theta', and 'oxygen saturation'. Finally, two data reports, hydrographic and primary productivity, were generated from the final terpled IEHs (hydrographic IEH & prodo-cast only IEH). Further annotations and footnoting were performed on the data & prodo reports before assembling them into a cruise data report. Note: before our 1993 CTD-rosette adoption, a hydrographic cast of typically 20 bottles was done on each station. A separate 6-bottle (6 depth) cast was also performed daily for a primary productivity 14C uptake incubation experiment on most cruises. These casts were processed & QC'd into a separate, non-terpled primary productivity IEH file & published as their own data report. Our complete final cruise Data Report included a title page, table of contents, personnel, introduction, cruise track & station map, horizontal & vertical contour figures, tabulated hydrographic bottle data by depth, tabulated primary productivity data by depth, & a macrozooplankton displacement volume table.
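As a rough illustration of what 'terpling' to standard levels involves, here is a minimal Python sketch of the linear case only. It is not the Terple.exe algorithm (which also used quadratic interpolation and whose exact rules are not described here); the function name, the choice to skip levels outside the sampled range, and the sample values are all assumptions made for illustration.

```python
from bisect import bisect_left


def interpolate_to_standard_levels(depths, values, standard_levels):
    """Linearly interpolate bottle values onto standard depths.

    Assumes `depths` is strictly increasing and the same length as `values`.
    Standard levels outside the sampled range are skipped, not extrapolated
    (an assumption; the original program's behavior may differ).
    """
    result = {}
    for level in standard_levels:
        if level < depths[0] or level > depths[-1]:
            continue  # no bottle above or below this level to bracket it
        i = bisect_left(depths, level)  # first bottle at or below the level
        if depths[i] == level:
            result[level] = values[i]  # a bottle tripped exactly on the level
            continue
        d0, d1 = depths[i - 1], depths[i]
        v0, v1 = values[i - 1], values[i]
        result[level] = v0 + (v1 - v0) * (level - d0) / (d1 - d0)
    return result


# Made-up bottle depths (m) and temperatures (deg C), for illustration only.
bottle_depths = [0.0, 20.0, 200.0]
bottle_temps = [10.0, 20.0, 2.0]
standard = interpolate_to_standard_levels(bottle_depths, bottle_temps, [10, 100, 500])
print(standard)
```

The 500 m level is absent from the result because no bottle in the made-up cast is deeper than 200 m.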

CODES, the CalCOFI Oceanographic Data Entry System, was developed in 1990 to assist in the key-entry of data from sample forms. Each CODES module assisted the key-entry of one data type and saved the data into the properly formatted 'H' file (e.g. the H00103 O2 file) required by that data type's processing executable. Later, CODES, and eventually DECODR, were programmed to automate the batch processing of each datatype. All sample analyses eventually became computerized, so filling out forms and key-entering data were no longer necessary and CODES was retired. These notes describe some of the earliest PC data processing routines & formats, which are still at the core of many data routines.
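The per-station 'H' file names and the batch looping over them can be sketched in a few lines of Python. This is only an illustration inferred from the file names quoted in these notes (H00101 temperature, H00102 salt, H00103 O2, H00104 nutrients); the function names are hypothetical, and reading the three-digit number as a station sequence number on the cruise is an assumption.

```python
# Two-digit datatype suffixes, inferred from the file names in these notes.
DATATYPE_CODES = {
    "temperature": "01",
    "salinity": "02",
    "oxygen": "03",
    "nutrients": "04",
}


def h_filename(sequence_number, datatype):
    """Build an 'H' file name: 'H' + 3-digit number + 2-digit datatype code."""
    return f"H{sequence_number:03d}{DATATYPE_CODES[datatype]}"


def files_for_datatype(datatype, sequence_numbers):
    """List the files a batch loop would feed, one at a time, to an executable."""
    return [h_filename(n, datatype) for n in sequence_numbers]


# The first three salinity files of a cruise.
print(files_for_datatype("salinity", range(1, 4)))
```

A DOS batch file played the role of `files_for_datatype` here, calling the datatype's executable once per file.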

IEH Data Process using Fortran-DOS executables, pre-CTD adoption

  1. Seawater collection - ~20 reversing thermometer-equipped Niskin, Nansen or 'Wally' bottles are hung on the wire for a hydrographic cast. A second, daily, noon-time cast of 6 'prodo-clean, no-metal' bottles was hung on the wire for a primary productivity 14C incubation experiment. Seawater samples are collected for salinity, dissolved oxygen, chlorophyll, & nutrient analysis from all bottles on both casts.
  2. Data acquisition - seawater samples are analysed onboard during the cruise; reversing thermometer temperatures are tabulated on paper then key-entered.
  3. Sample data processing -
    1. Temperature files (H00101, H00201, H00301...) are created by key-entry of reversing thermometer tabulations. HydTP.exe generated corrected temperature files (H00101.out, H00201.out, H00301.out...). Additional support files: TAS (temperature arrangement sheet), CSTINF (cast info with wire angle & winch), StaCst, Hydfmt.dat, thermbin. Unprotected thermometer temperatures, the terminal depth, and wire angle were used to calculate a corrected bottle depth that accounted for winch wire readout & wire angle. (Importing temperatures from the CTD sensors significantly simplified things in 1993.)
    2. Salinity files (H00102, H00202, H00302...) are created by the DOS salt analysis program or key-entered from salt sample tabulations. Individual station files are run by Hydsa.exe, generating .out files (H00102.out, H00202.out, H00302.out...).
    3. Oxygen files (H00103, H00203, H00303...) are created by key-entry of O2 titration data (manual titrations); HydOx.exe generates .out files (H00103.out, H00203.out, H00303.out...)
    4. Nutrient files (H00104, H00204, H00304...) are created from the nutrient auto-analyzer files (YYMM001, YYMM002, YYMM003...). Beer's law executables were also run on beer##.txt files, creating beer##.out files; Beer's law was used for each nutrient type to curve-fit the output data. HydNU.exe generated .out files for each run, which may combine stations (H00104.out, H00204.out, H00304.out...).
    5. Chlorophyll files (Chl001, Chl002, Chl003...) were created by key-entering fluorometric Rb & Ra readings tabulated from Turner Designs fluorometers. Chlor.exe was run and generated .out files (Chl001.out, Chl002.out, Chl003.out...).
    6. Prodo (primary productivity) files (PRD001, PRD007, PRD010...) were tabulations of prodo metadata (light levels & incubation times) and scintillation counts. These files were generated ashore using scintillation count files. Prod.exe generated .out files (PRD001.out, PRD007.out, PRD010.out...). Note: since prodo stations were one cast per day, the file numbering is not consecutive. Typically 15 prodo incubations were performed each cruise.
    7. Weather data were key-entered from the weather form generating a single weather data file per cruise. This file was not processed individually but used by other programs to add header information to the IEH files.
    8. Station Cast Description (or StaCst) data were key-entered from the Station Cast Description form, generating a single StaCst file per cruise. This file was processed by Stacst.exe, generating a .out file (Stacst.out). This file was primarily used for IEH header information like latitude, longitude, date & time, CalCOFI Line & Station, etc.
  4. Sample file merging -
    1. 00/20/22 files - once all the individual sample files were cleaned up and the data considered ready to point-check, each datatype's executable was run again with the "update" toggle on, which output the data to the "00" files (H00100, H00200, H00300...), "20" files (H00120, H00220, H00320...) or "22" files (H00122, H00222, H00322...). "00" files include pressure, temperature, salinity, oxygen, NO2, NO3, PO4, SiO3 data, one row per bottle. "20" files contain cast, sample number, and temperature decimal accuracy (not much). "22" files contain cast, sample number, chlorophyll-a, phaeopigment, & primary productivity data.
    2. MSTRW3.EXE - generated individual IEH files (w3001, w3002, w3003...) using the 00, 20, 22, weather, and stacst files. At this point in the process, these IEHs are ready for point-checking and data quality checks using the following programs.
    3. HYDRP.EXE - generated 'Hydro Report' files (R_001, R_002, R_003...), intermediate files used to check the data quality of the w3 and look for missing data.
    4. ANDYPLOT.EXE - "Andyplots" were a point-checking tool that plotted each station's data versus depth. Mistrips, bad, and questionable data could be visually identified. We still do visual plotting and point-checking, but now using MATLAB. HPPlot.exe was also used for plotting data, as was a variety of MATLAB scripts which evolved over the years.
    5. TERPLE.EXE - once the IEH data were final (by either editing the specific data file or more typically, by correcting the w3), Terple.exe was run. This program calculated interpolated standard levels (aka depths) for horizontal plots (10m, 100m, 500m).
    6. GAUNTLET.EXE - checked for formatting issues in the IEH files. It could be run on individual w3 IEH files or merged IEH files.
    7. MERGEW3.EXE - merges all the individual w3 files into a single IEH; the MRGW3S format file lists the station order. The working IEH is usually in the order the stations were occupied; the final IEH is in 'report' order, which orders stations by lines N to S, E to W. 'Report' order is so named because the data in the report are organized in this order. (Line 77 nearshore to offshore, Line 80 nearshore to offshore... Line 93 sta 120, furthest south & offshore, being last.)
  5. Generating Final Hydrographic & Productivity IEHs And Reports
    1. ADDCIEH.EXE - calculates and adds an integrated chlorophyll value for each station into header line 2
    2. ADDPIEH.EXE - calculates and adds an integrated primary productivity value for prodo stations into header line 2
    3. ADDSTHOX.EXE - calculates and adds sigma-theta (density) and oxygen saturation to the extra columns ("wild" columns: cols 104-110, & 112-118)
    4. DAREH.EXE - generates a Hydrographic Data Report from the terpled IEH, considered our final product after footnotes and data annotations are added.
    5. DAREP.EXE - generates a Primary Productivity Data Report from the prodo IEH, incubation and light level footnotes are added manually before publication.
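The 'report' order used by MERGEW3.EXE (lines N to S, each line nearshore to offshore) amounts to sorting by line number, then station number, since both increase in those directions. A minimal Python sketch follows; the `Station` type, the function name, and the occupied-order list are hypothetical, and the rounded line numbers simply follow the examples given in the list above.

```python
from typing import NamedTuple


class Station(NamedTuple):
    line: float     # CalCOFI line number; larger is farther south
    station: float  # CalCOFI station number; larger is farther offshore


def report_order(stations):
    """Sort stations by line (N to S), then by station (nearshore to offshore)."""
    return sorted(stations, key=lambda s: (s.line, s.station))


# Made-up order of occupation, for illustration only.
occupied = [
    Station(93, 120),
    Station(80, 60),
    Station(77, 55),
    Station(80, 51),
    Station(77, 49),
]

for sta in report_order(occupied):
    print(f"Line {sta.line} Sta {sta.station}")
```

In this sketch Line 93 sta 120 sorts last, matching the description of it as the furthest south & offshore station.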

The process has evolved over time, but many of the algorithms & methods used to calculate our data are still in use. Very little key-entry is required. All sample analyses are computerized, generating the appropriate files for individual data types. These are processed in very similar fashion, although DECODR, our data processing suite, tags all the files for processing and loops through them automatically. The 00/20/22 and stacst files have been consolidated into individual station csvs and a single casts csv, which contains all the cast info. Weather was one of the last data types to be computerized but is now online (Jan 2019). An event log, which records all cruise activities, particularly those involving data, records date, time (PST & UTC), latitude & longitude for our stations.
Sta.csvs and casts.csv are the working files now; IEH files are still generated as a legacy data product along with the Hydrographic & Primary Productivity Data Reports. Our main focus, however, is on our hydrographic bottle and CTD relational databases.