IEH: "Is Everybody Happy"

CalCOFI's IEH format was developed in 1984 to accommodate additional data measurements added to the CalCOFI time-series. The previous format, sd2, was adequate for 35 years of hydrographic measurements, but in 1984 chlorophyll, primary productivity, and additional nutrients were added. IEH was developed to contain the additional data within the confines of 128 columns, the data width of a computer punch card. Data were stripped of decimal points to maximize the precision that fits in the available space, resulting in column-specific formats.

The CalCOFI hydrographic data processing was performed on a minicomputer running VMS until the early 1990s. In 1993, the VMS Fortran code was ported by Dave Newton to run on an IBM PC running DOS. In the late 1990s, James Wilkinson ported the DOS Fortran code to Visual Basic for Windows, and most of the Fortran code was converted to run natively under Windows 9x through Windows 7. This change coincided with the transition from hanging Niskin bottles with reversing thermometers on the hydro wire to a CTD-rosette for collecting seawater samples and vertical-profile electronic sensor data. Careful, meticulous comparisons were done at each transition (VMS to PC, Fortran to Basic) to ensure there were no changes in measurement calculations. Cruise data were processed using both the old and new software, and the results were scrutinized to ensure the values were identical.

Many users of CalCOFI data still use and expect IEH-formatted data, relying on tools developed to import its unique format. But with the additional measurements and resolution provided by the Seabird CTD, other formats are now available. Bottle-corrected, 1 m bin-averaged CTD data improve the resolution of many physical parameters measured during the quarterly cruises. A relational database, which contains all 60+ years of hydrographic data, allows queries that extract subsets from the time-series.
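An import tool for a decimal-stripped, fixed-width format has to put the implied decimal point back, column by column. The sketch below shows that idea in Python. The field names, column positions, and decimal counts are hypothetical and chosen only for illustration; they are not the actual IEH column map, which should be taken from the IEH format specification.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int     # 0-based start column
    width: int     # number of columns the field occupies
    decimals: int  # implied decimal places (decimal point not stored)


# HYPOTHETICAL layout for illustration only; not the real IEH column map.
LAYOUT = [
    Field("depth_m", 0, 5, 0),
    Field("temp_c", 5, 5, 2),
    Field("salinity", 10, 5, 3),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line and restore the implied decimals."""
    record = {}
    for f in layout:
        raw = line[f.start:f.start + f.width].strip()
        # A blank field means "no measurement"; keep it as None.
        record[f.name] = int(raw) / 10 ** f.decimals if raw else None
    return record


# "   10" -> depth 10 m, " 1523" -> 15.23 C, "33456" -> 33.456
print(parse_record("   10" + " 1523" + "33456"))
```

Storing 15.23 as `1523` saves one column per value, which is why each column needs its own scale factor when the file is read back.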

Data Processing

Data processing pre-2012 - Processing bottle samples into IEH files and Hydrographic Data Reports required the analysis of individual data types: salinity, oxygen, chlorophyll, nutrients, and primary productivity. Each analysis generated data measurements housed in flat ASCII files. Each data type was processed individually, then merged into combined station files (00/20/22 files). Once merged, Andyplot vertical profiles (bottle data vs. depth) were generated and the data point-checked. Odd or erroneous data, such as bottle mistrips, were flagged or removed from the data file. Once final, an IEH file was generated for each station. To generate standard-depth values, the IEH data were interpolated using the observed values. This final, interpolated IEH, which contains both observed and interpolated data, was used to generate the CalCOFI Hydrographic Data Report. The IEH data were also converted to CSV and then imported into the CalCOFI Hydrographic Database. CalCOFI hydrographic data are archived in two IEH formats: 'arch.ieh', observed data only, and 't.ieh', observed and interpolated data. Prior to 2000, the IEH and CalCOFI Hydrographic Data Reports were the core data products published by the CalCOFI Technical Group at SIO.
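The standard-depth step can be pictured with a small Python sketch. It assumes simple linear interpolation between the two bracketing observed depths and no extrapolation beyond the observed range; the text does not state which interpolation scheme the CalCOFI software uses, so this is an illustration of the idea only. The function name, the depths, and the values are all made up for the example.

```python
from bisect import bisect_left


def interpolate_to_standard_depths(obs_depths, obs_values, standard_depths):
    """Return (depth, value, source) tuples for each standard depth.

    Assumes obs_depths is non-empty and strictly increasing, and uses
    linear interpolation (an assumption, not the documented method).
    """
    result = []
    for z in standard_depths:
        if z < obs_depths[0] or z > obs_depths[-1]:
            # Do not extrapolate beyond the shallowest/deepest bottle.
            result.append((z, None, "out_of_range"))
            continue
        i = bisect_left(obs_depths, z)
        if obs_depths[i] == z:
            # A bottle tripped exactly at the standard depth.
            result.append((z, obs_values[i], "observed"))
            continue
        z0, z1 = obs_depths[i - 1], obs_depths[i]
        v0, v1 = obs_values[i - 1], obs_values[i]
        value = v0 + (v1 - v0) * (z - z0) / (z1 - z0)
        result.append((z, value, "interpolated"))
    return result


# Illustrative bottle depths (m) and values; not real cruise data.
depths = [2, 12, 31, 52]
values = [0.1, 0.3, 2.2, 8.5]
for row in interpolate_to_standard_depths(depths, values, [0, 10, 20, 30, 50]):
    print(row)
```

Tagging each value with its source mirrors the distinction between the observed-only and the observed-plus-interpolated archive files.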

Data processing 2012 onward - Processing bottle samples still requires the analysis of individual data types: salinity, oxygen, chlorophyll, nutrients, and primary productivity. Each analysis continues to generate flat ASCII data measurement files. Each data type is processed individually, with output files scrutinized for gross errors. After preliminary corrections are applied, all data are merged into station files (sta.csvs for depth data; casts.csv for station data). The main change from old to new at this stage is the reduction in the number of data files. The sta.csvs include all bottle data plus matching-depth CTD data at bottle depths, combining the "00", "20", and "22" ASCII files. For 75 stations, that reduces 375 files to 76: 75 sta.csvs and 1 casts.csv. The casts.csv combines the stacst (Station Cast Description), weather, hdr, and metadata files into one. Casts.csv contains all cast and station information, weather, and the coefficients used to calculate and correct data.
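The merge of per-analysis files into one station file can be sketched as follows. This is a minimal Python illustration, assuming each analysis yields values keyed by bottle number; the function names, the column names, and the keying by bottle number are hypothetical and do not describe the actual sta.csv layout.

```python
import csv
import io


def merge_station(per_type):
    """Combine {data_type: {bottle_number: value}} into one row per bottle."""
    bottles = sorted(set().union(*[d.keys() for d in per_type.values()]))
    rows = []
    for bottle in bottles:
        row = {"bottle": bottle}
        for name, values in per_type.items():
            # None marks a bottle with no value from this analysis.
            row[name] = values.get(bottle)
        rows.append(row)
    return rows


def to_csv_text(rows):
    """Serialize merged rows as CSV text (missing values become empty cells)."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative values only; not real cruise data.
station = merge_station({
    "salinity": {1: 33.4, 2: 33.5},
    "oxygen": {2: 5.1},
})
print(to_csv_text(station))
```

Keeping one row per bottle with one column per data type is what lets a single file stand in for several per-type files.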

The standard-depth bottle data are interpolated from the observed values, primarily for nutrient data unavailable from the CTD. CTD data are processed using Seasoft, and bottle-corrected CTD sensor data are imported into the sta.csvs at bottle and standard depths. These sta.csvs are converted to mat files using GPlot (Matlab) to plot and point-check bottle and CTD data. Odd or erroneous data, such as mistrips, are flagged as bad or questionable in the data file by the point-checker using GTool (Matlab). The point-checker also annotates the sta.csvs data codes to indicate the best data for each data type; all bottle measurements are preserved. Using the data code index, the best data, from either bottle or CTD, are used to generate all subsequent data products: IEHs for legacy users, CalCOFI Hydrographic Data Reports, and the CalCOFI Hydrographic Database. The IEH shifts from being the primary data product to an ancillary, legacy one. Sta.csvs and casts.csv will be archived, but the primary archive will be the CalCOFI Hydrographic Database.
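The role of the data codes in choosing between bottle and CTD values can be sketched in Python. The code letters ("B", "C", "X"), the column naming pattern, and the function name are hypothetical, invented for this illustration; the real data code index used in the sta.csvs is not described here.

```python
# HYPOTHETICAL code values for illustration; not the actual CalCOFI data codes.
USE_BOTTLE = "B"   # point-checker marked the bottle value as best
USE_CTD = "C"      # point-checker marked the CTD value as best
USE_NEITHER = "X"  # both values flagged bad


def best_value(row, data_type):
    """Pick the value a downstream product should use for one data type.

    The row keeps both measurements, so nothing is discarded;
    the code only records which one to report.
    """
    code = row.get(f"{data_type}_code")
    if code == USE_BOTTLE:
        return row.get(f"{data_type}_bottle"), "bottle"
    if code == USE_CTD:
        return row.get(f"{data_type}_ctd"), "ctd"
    return None, "none"


# Illustrative row; a suspected mistrip makes the bottle salinity unusable.
row = {
    "depth_m": 50,
    "salinity_bottle": 34.912,
    "salinity_ctd": 33.487,
    "salinity_code": USE_CTD,
}
print(best_value(row, "salinity"))
```

Because the selection lives in the code column rather than in the data, every product (IEH, data report, database) can be regenerated from the same station file with the same choices.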