Novel workflow language tackles climate change computing challenge
A computing challenge encountered by the BBC Climate Change Experiment has led to an award-winning solution. Daniel Goodman from Oxford University won a best paper award at the UK e-Science All Hands Meeting (AHM) in Nottingham last month for devising a workflow language, Martlet, that enables the analysis of large datasets whose distribution is continually changing across a number of widely dispersed servers. Martlet uses a different programming model from the one commonly used in workflow languages.
"This new approach builds on research in the computer science community over the past 40 years. For much of that time, many claimed this line of work was of academic interest, but of no practical relevance. Daniel's paper has shown how it has real application in tackling some of the key challenges facing the world today, such as climate change," says Professor Paul Watson of Newcastle University, who chaired the AHM programme committee.
The BBC Climate Change Experiment is working with climateprediction.net, a major UK e-Science project funded by the Natural Environment Research Council. More than 200,000 people worldwide are participating in the experiment by donating spare capacity on their computers to run models of the Earth's climate.
As the dataset containing all model runs is too big to return to one location for analysis, it is stored on a number of servers in different locations worldwide. The challenge arises because the number of pieces this dataset is split into varies for a range of reasons, including the addition or removal of servers from the experiment, and the sub-setting of runs required for a given query. Climateprediction.net needed a way of analysing the data in situ that could also cope automatically with changes to the location or sub-division of data.
"Existing workflow languages are not up to the task because they implement a style of programming where the number of data inputs and the paths of data flow through the workflow are set when the workflow is submitted. This makes them unable to cope with subsequent changes to the dataset," says Daniel. He turned to constructs inspired by functional programming to solve the problem. These allow the workflow to adjust to the requirements of the data at run time, so that changes to the way in which a dataset is split can be accommodated dynamically, removing the need for users to keep adjusting their workflows.
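The idea can be illustrated with a minimal sketch in Python. This is not Martlet syntax, and the function names (summarise, combine, distributed_mean) are hypothetical, chosen only for illustration. It assumes each server can reduce its own piece of the data to a small partial result, and that partial results can be merged in any grouping. Because the workflow is written as a map followed by a fold, the number of pieces is never fixed in the workflow itself.

```python
from functools import reduce


def summarise(partition):
    """Reduce one piece of the dataset to a partial result (sum, count).

    In a real deployment this step would run on the server holding the
    piece; here it is an ordinary local function (an assumption).
    """
    return (sum(partition), len(partition))


def combine(a, b):
    """Merge two partial results. The operation is associative, so the
    partial results can be grouped in any way before merging."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(partitions):
    """Compute the mean over however many pieces the data is split into."""
    partials = map(summarise, partitions)                # one result per piece
    total, count = reduce(combine, partials, (0.0, 0))   # fold them together
    if count == 0:
        raise ValueError("no data to analyse")
    return total / count


# The same workflow gives the same answer however the data is split.
two_pieces = [[1, 2, 3], [4, 5]]
four_pieces = [[1], [2], [3], [4, 5]]
print(distributed_mean(two_pieces))
print(distributed_mean(four_pieces))
```

Both calls run the same code on the same values split two ways, so adding or removing a server changes only the length of the input list, not the workflow.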
Martlet has potential for use in many e-Science applications that distribute data between servers in a similar way to climateprediction.net. Its development also suggests that there could be other powerful new algorithms awaiting discovery once people start to think in terms of this alternative programming model. "Daniel's work has shown how work on core computer science can be used to meet the exciting challenges generated by e-Science applications. He has demonstrated how taking a different approach to organising the way in which tasks are executed can produce scientific results much more quickly," says Paul Watson.
Daniel Goodman, Oxford University Computing Laboratory, e-mail: Daniel.Goodman@comlab.ox.ac.uk, tel: 01865 273870
Professor Paul Watson, e-mail: email@example.com
Notes for editors
- e-Science refers to the science that can be done when researchers have access to resources held on widely-dispersed computers as though they were on their own desktops. The resources can include very large digital data collections, very large scale computing resources, scientific instruments and high performance visualisation.
- A grid allows these different resources to work together seamlessly across networks, enabling people to share resources, often across traditional boundaries, and form virtual organisations. The vision is to facilitate collaborative working in multi-disciplinary teams by making computing power as easy to access over the grid as electricity is over the power grid. e-Science has the potential to smooth out inequalities in research investment by making resources available to those who could not afford their own.
- The UK e-Science Programme is a coordinated initiative involving all the Research Councils and the Department of Trade and Industry. The Engineering and Physical Sciences Research Council manages the e-Science Core Programme, which is developing generic technologies, on behalf of all the Research Councils and the research communities they support.
- The UK e-Science Programme as a whole is fostering the development of IT and grid technologies to enable new ways of doing faster, better or different research, with the aim of establishing a sustainable, national e-infrastructure for research and innovation which meets the aims of the government's Investment Framework for Science and Innovation 2004-2014. e-Science and the e-infrastructure are thus contributing to the economic success of the UK.
- Further information at www.rcuk.ac.uk/escience, the National e-Science Centre (NeSC) www.nesc.ac.uk and the individual research councils:
- Arts and Humanities Research Council (AHRC) www.ahrc.ac.uk
- Biotechnology and Biological Sciences Research Council (BBSRC) www.bbsrc.ac.uk
- Council for the Central Laboratory of the Research Councils (CCLRC) www.cclrc.ac.uk
- Economic and Social Research Council (ESRC) www.esrc.ac.uk
- Engineering and Physical Sciences Research Council (EPSRC) www.epsrc.ac.uk
- Medical Research Council (MRC) www.mrc.ac.uk
- Natural Environment Research Council (NERC) www.nerc.ac.uk
- Particle Physics and Astronomy Research Council (PPARC) www.pparc.ac.uk
Last reviewed: By John M. Grohol, Psy.D. on 21 Feb 2009
Published on PsychCentral.com. All rights reserved.