Organization: Natural Resource Ecology Laboratory
Project: IRC
Category: Whitepaper
Title: Description of the Input-Output Subsystem

Draft

Introduction

The IRC Grid-Cohort system requires the ability to read input and produce output for up to several tens of thousands of simulations. Additionally, the simulation system may be running on a single processor, on a single machine with multiple processors, or on a distributed architecture. These requirements can strain the limits of many hardware platforms and operating systems, and can become a major bottleneck for the processing speed of the simulations.

Output can originate at any level of the grid-cohort framework structure. The simulation controller produces an activity and message log file. Grid cells produce flux summaries at specified timesteps. Models (cohorts) can optionally generate output specific to each model instance. Additionally, extended implementations of the grid-cohort framework may generate output at the grid level and from a cohort. At any level, particularly the model level, each object (e.g., a model instance) may require multiple input/output channels (files or pointers to output objects).

Issues and Considerations

Several issues concerning the I/O subsystem design arise mostly from limitations in hardware and operating systems relative to the needs of the project.

Design of the Input-Output Subsystem

The design provides a common interface to the various elements of the grid-cohort system. There are four main components, whose core design and relationships are specified in base classes. The relationships between these classes and the simulation grid-cohort classes are shown in the accompanying UML class diagram (PDF file).

Input/Output Sink or Source

Any object that accepts input from, or generates output to, a conduit is a kind of sink or source object. Every I/O sink or source is associated with only one I/O conduit. An implementation may also be both a sink and a source. Its responsibilities are to send or receive data, and to know and report its availability. Source classes must implement the Write function, while sink classes must implement the Read function.
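The sink/source contract described above can be illustrated with a minimal C++ sketch. Every name in it (Role, SourceSinkSketch, CounterSource, LastRecordSink, TransferOne) and the use of strings as records are assumptions made for illustration; this document does not show the project's actual interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: names and signatures are illustrative assumptions,
// not the project's actual interface.
enum class Role { Source, Sink, Both };

class SourceSinkSketch {
public:
    explicit SourceSinkSketch(Role role) : role_(role) {}
    virtual ~SourceSinkSketch() = default;

    Role GetRole() const { return role_; }
    bool IsAvailable() const { return available_; }
    void SetAvailable(bool available) { available_ = available; }

    // A source overrides Write to place its next record in 'out'.
    virtual bool Write(std::string& /*out*/) { return false; }
    // A sink overrides Read to accept a record delivered to it.
    virtual bool Read(const std::string& /*in*/) { return false; }

private:
    Role role_;
    bool available_ = true;
};

// A source that emits a numbered record on each call to Write.
class CounterSource : public SourceSinkSketch {
public:
    CounterSource() : SourceSinkSketch(Role::Source) {}
    bool Write(std::string& out) override {
        if (!IsAvailable()) return false;
        out = "record " + std::to_string(++count_);
        return true;
    }
private:
    int count_ = 0;
};

// A sink that remembers the last record it accepted.
class LastRecordSink : public SourceSinkSketch {
public:
    LastRecordSink() : SourceSinkSketch(Role::Sink) {}
    bool Read(const std::string& in) override {
        if (!IsAvailable()) return false;
        last_ = in;
        return true;
    }
    const std::string& Last() const { return last_; }
private:
    std::string last_;
};

// Moves one record from a source to a sink; fails if either side refuses.
inline bool TransferOne(SourceSinkSketch& source, SourceSinkSketch& sink) {
    assert(source.GetRole() != Role::Sink && sink.GetRole() != Role::Source);
    std::string record;
    return source.Write(record) && sink.Read(record);
}
```

In this sketch an unavailable object simply refuses the transfer, which is one possible way for an object to report its availability.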

Input/Output Conduit

A conduit transfers data from its input source to its output sink. Its responsibilities are to know its input and output objects, know the status of these two objects, and perform any necessary translations in data format before sending the data to the output object. Each implementation of a conduit must define Read and Write functions. These functions are used to move data from the input to the output objects of the conduit.
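As an illustration of these responsibilities, the following C++ sketch shows a conduit that holds pointers to one input and one output, checks their status, and translates each record before delivering it. EndpointSketch, ConduitSketch, TabToCommaConduit, and the tab-to-comma translation are hypothetical choices for this example, not classes from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch; none of these names come from the project's code.
struct EndpointSketch {
    bool available = true;
    std::vector<std::string> records;  // queued (input) or received (output)
};

class ConduitSketch {
public:
    ConduitSketch(EndpointSketch* input, EndpointSketch* output)
        : input_(input), output_(output) {
        assert(input_ != nullptr && output_ != nullptr);
    }
    virtual ~ConduitSketch() = default;

    // The conduit knows the status of both of its objects.
    bool Ready() const { return input_->available && output_->available; }

    // Read: pull the next record from the input into the conduit.
    bool Read() {
        if (holding_ || !input_->available || input_->records.empty()) return false;
        buffer_ = input_->records.front();
        input_->records.erase(input_->records.begin());
        holding_ = true;
        return true;
    }

    // Write: translate the held record and deliver it to the output.
    bool Write() {
        if (!holding_ || !output_->available) return false;
        output_->records.push_back(Translate(buffer_));
        holding_ = false;
        return true;
    }

    // Moves records until the input is empty or either side is unavailable.
    std::size_t TransferAll() {
        std::size_t moved = 0;
        while (Ready() && (holding_ || Read()) && Write()) ++moved;
        return moved;
    }

protected:
    // Default translation passes the record through unchanged.
    virtual std::string Translate(const std::string& record) const { return record; }

private:
    EndpointSketch* input_;
    EndpointSketch* output_;
    std::string buffer_;
    bool holding_ = false;
};

// A derived conduit that converts tab-delimited records to comma-delimited.
class TabToCommaConduit : public ConduitSketch {
public:
    using ConduitSketch::ConduitSketch;
protected:
    std::string Translate(const std::string& record) const override {
        std::string translated = record;
        std::replace(translated.begin(), translated.end(), '\t', ',');
        return translated;
    }
};
```

Keeping the translation in a virtual function lets a derived conduit change the data format without touching the Read and Write logic.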

Input/Output Manager

The I/O manager is a kind of I/O sink or source object, and handles the transfer of data from its owner to the I/O conduit. Its responsibilities are to own the output sink object; to create and own the conduit object; to rearrange output data, if needed, before it is sent to the conduit; and, for input data, to channel the input into the correct destinations. Each implementation of an I/O manager must define the Read or Write function (or both, if it has a two-way conduit).
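A minimal C++ sketch of an output-only manager follows. It assumes, purely for illustration, that a record is a (timestep, value) pair and that "rearranging" means sorting pending records by timestep; OutputManagerSketch and CollectingConduit are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record type: (timestep, flux value).
using Record = std::pair<int, double>;

// Hypothetical stand-in conduit that simply collects what it is given.
class CollectingConduit {
public:
    bool Write(const std::vector<Record>& batch) {
        received.insert(received.end(), batch.begin(), batch.end());
        return true;
    }
    std::vector<Record> received;
};

// Hypothetical output manager: creates and owns its conduit, and
// rearranges pending output before handing it to the conduit.
template <class ConduitT>
class OutputManagerSketch {
public:
    // Called by the owner (e.g., a grid cell) to queue output.
    void Add(int timestep, double value) {
        assert(timestep >= 0);
        pending_.emplace_back(timestep, value);
    }

    // Write: sort pending records by timestep, then send them on.
    bool Write() {
        std::stable_sort(pending_.begin(), pending_.end(),
                         [](const Record& a, const Record& b) {
                             return a.first < b.first;
                         });
        if (!conduit_.Write(pending_)) return false;
        pending_.clear();
        return true;
    }

    std::size_t Pending() const { return pending_.size(); }
    const ConduitT& Conduit() const { return conduit_; }

private:
    ConduitT conduit_;             // created and owned by the manager
    std::vector<Record> pending_;  // output waiting to be sent
};
```

Holding the conduit by value is only one way to express "create and own"; an implementation that selects the conduit type at run time would more likely hold it through an owning pointer.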

Output-only managers are provided for the simulation base, grid, grid cell, cohort and model objects.

Input/Output Device

The I/O device is a kind of I/O sink or source object that accepts data from or provides data to the conduit. The actual device can be a disk file, a DBMS, a memory buffer, a network source or destination, or any other data source or sink. Its responsibilities are to know the status of its data device, and to send data to or receive data from its device.
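The device responsibilities can be sketched in C++ using a memory buffer as the underlying device. DeviceState, DeviceSketch, MemoryBufferDevice, and the Send/Receive function names are assumptions for this example only.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical device states; the project's actual states are not shown here.
enum class DeviceState { Closed, Open };

// Hypothetical device interface.
class DeviceSketch {
public:
    virtual ~DeviceSketch() = default;
    virtual DeviceState State() const = 0;
    virtual bool Send(const std::string& record) = 0;  // data arriving from the conduit
    virtual bool Receive(std::string& record) = 0;     // data provided to the conduit
};

// A device backed by an in-memory queue of single-line records.
class MemoryBufferDevice : public DeviceSketch {
public:
    void Open() { state_ = DeviceState::Open; }
    void Close() { state_ = DeviceState::Closed; }

    DeviceState State() const override { return state_; }

    bool Send(const std::string& record) override {
        // Precondition of this sketch: one record per line.
        assert(record.find('\n') == std::string::npos);
        if (state_ != DeviceState::Open) return false;
        buffer_.push_back(record);
        return true;
    }

    bool Receive(std::string& record) override {
        if (state_ != DeviceState::Open || buffer_.empty()) return false;
        record = buffer_.front();
        buffer_.pop_front();
        return true;
    }

private:
    DeviceState state_ = DeviceState::Closed;
    std::deque<std::string> buffer_;
};
```

A file or DBMS device would follow the same shape, with State reflecting whether the file or connection is usable.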

C++ Implementation Details

There are four base classes which implement the four core functional components, plus a base template class for grid-element output managers. These are listed in the following table.

Class Name: TIOSourceSinkBase
Description: Base class for any other class that is a source or sink for I/O data.
Responsibilities: Know the type of I/O (input, output, or both). Know its availability state.

Class Name: TIOMgrBase
Description: Base class for managing input/output with an I/O conduit. Derived from TIOSourceSinkBase.
Responsibilities: Knowing the input/output objects and knowing their state. Owning the conduit and knowing its state.

Class Name: TGEOutputMgrBase
Description: Base template class for managing simulation output from grid elements including the TSimulationBase, TGrid, TGridCell, TCohort, TModelBase classes. Derived from TIOMgrBase.
Responsibilities: Packaging output.

Class Name: TIOConduitBase
Description: Base class for the input/output conduits. A conduit controls the transfer of data from the input source to the output destination.
Responsibilities: Has a pointer to each of one input source and one output sink. Knows the status of the conduits. Derived classes perform the transfer of data from input to output.

Class Name: TIODeviceBase
Description: Base class for any kind of input/output device, including files, databases, buffers. Derived from TIOSourceSinkBase.
Responsibilities: Knowing device state, type, allowed access, and location.
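The inheritance relationships listed in the table can be summarized in a C++ skeleton. Only the class names and the "derived from" relationships are taken from the table; every member function, constructor signature, and data member below is a hypothetical placeholder, and the TGridCell struct is an empty stand-in for the real class.

```cpp
#include <cassert>
#include <type_traits>

// Class names follow the table; all members are illustrative assumptions.
class TIOSourceSinkBase {
public:
    enum class IOType { Input, Output, Both };
    explicit TIOSourceSinkBase(IOType type) : type_(type) {}
    virtual ~TIOSourceSinkBase() = default;
    IOType Type() const { return type_; }            // knows its type of I/O
    bool IsAvailable() const { return available_; }  // knows its availability
protected:
    void SetAvailable(bool available) { available_ = available; }
private:
    IOType type_;
    bool available_ = false;
};

// Has a pointer to each of one input source and one output sink.
class TIOConduitBase {
public:
    TIOConduitBase(TIOSourceSinkBase* input, TIOSourceSinkBase* output)
        : input_(input), output_(output) {
        assert(input_ != nullptr && output_ != nullptr);
    }
    virtual ~TIOConduitBase() = default;
    bool IsReady() const { return input_->IsAvailable() && output_->IsAvailable(); }
protected:
    TIOSourceSinkBase* input_;
    TIOSourceSinkBase* output_;
};

// Derived from TIOSourceSinkBase, per the table.
class TIOMgrBase : public TIOSourceSinkBase {
public:
    using TIOSourceSinkBase::TIOSourceSinkBase;
};

// Derived from TIOSourceSinkBase, per the table.
class TIODeviceBase : public TIOSourceSinkBase {
public:
    using TIOSourceSinkBase::TIOSourceSinkBase;
};

// Template over the grid element type; derived from TIOMgrBase.
template <class TGridElement>
class TGEOutputMgrBase : public TIOMgrBase {
public:
    explicit TGEOutputMgrBase(const TGridElement& owner)
        : TIOMgrBase(IOType::Output), owner_(&owner) {}
    const TGridElement& Owner() const { return *owner_; }
private:
    const TGridElement* owner_;
};

// Empty stand-in; the real TGridCell is not shown in this document.
struct TGridCell {};
```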

Implementations of essential specific classes are also listed in the table above. TSimpleConduit provides for direct transfer of data from the input to the output of the conduit. Potential conduits include TMPIConduit for MPI communication [2]. Device implementations built and planned include TNetCDFBase for netCDF files [4]; TCSVFileBase for comma-delimited text files; TDBMSBase for interfacing to DBMS systems; and THDF5Base for HDF5 files. Each grid-level element has an associated output manager class: TSimOutputMgr for the simulation controller; TGridOutputMgr for the grid; TCellOutputMgr for each cell; TCohortOutputMgr for each cohort; and TModelOutputMgr for the model. The simulation class has TSimOutputMgr for its output.

Using the Classes

Implementation Requirements

Summary

Additional Information

Bibliography

[1] E. Gamma, et al. Design Patterns (Addison-Wesley Longman, 1995).
[2] MPI Home Page: www-unix.mcs.anl.gov/mpi/
[3] PANDA Home Page: cdr.cs.uiuc.edu/panda/
[4] NetCDF Home Page: www.unidata.ucar.edu/packages/netcdf/

About this Document

Revision History

Sept 2002 - Version draft
Author: Thomas E. Hilinski
Aug 2004
Editor: Thomas E. Hilinski
Updated the document format, minor editing of content.

Document Source

This document:
www.nrel.colostate.edu/projects/irc/white_papers/Input-Output-Subsystem.htm

Project web page:
www.nrel.colostate.edu/projects/irc/

E-mail contact:
irc@nrel.colostate.edu

Copyright and Distribution Information

Copyright © 2004, Colorado State University. All rights reserved.
Developed with funding from the National Science Foundation grant NSF DEB-9977066.