Harvard Forest REU Symposium Abstract 2011

  • Title: A Study of Scientific Workflow Systems and Their Interoperability
  • Author: Garrett M Rosenblatt (University of Rochester)
  • Abstract:

    A key principle of the scientific method is that scientific data analysis be reproducible. For an analysis to be reproducible, scientists must record every detail of the analysis applied to their data. However, in many studies, the complexity of the data analysis makes this extremely difficult and time-consuming. In real-time sensor networks, data is produced too rapidly for the processing to be done by hand, so the processing and its documentation must be automated. In response to this difficulty, a number of tools for diagramming and executing scientific data analyses are being developed to aid scientists. These executable diagrams are called workflows, and the tools used to create and execute them are called scientific workflow systems (SWS). SWS monitor the execution of a workflow and record the provenance of the data they produce. Data provenance – information about the origin and derivation of data – can be used to reproduce the analysis and verify its results. Several existing systems, including Kepler, Little-JIL, and Taverna, were investigated to assess their capabilities, including their ability to collect data provenance. Each of these systems was found to have its own strengths and weaknesses. It would be desirable to combine them into a single SWS so as to take advantage of all their strengths. To support such interoperability, these SWS must be able to exchange information with each other, and it must be possible to combine their data provenance records. A prototype workflow that integrated both Little-JIL and Kepler was created and tested.

  • Research Category: Ecological Informatics and Modelling