18th Workshop on Workflows in Support of Large-Scale Science
November 12-13, 2023
Denver, CO, USA (and virtual)
In conjunction with SC23
Proceedings published in the SC Workshops Proceedings volume
WORKS 2023 focuses on the many facets of scientific workflow management systems, ranging from actual execution to service management and the coordination and optimization of data, service, and job dependencies. The workshop covers a broad range of issues in the scientific workflow lifecycle, including: scientific workflow representation and enactment; workflow scheduling techniques to optimize execution on heterogeneous infrastructures; provisioning workflows on different kinds of infrastructures; workflow enactment engines that deal with failures in the application and infrastructure; and computer science problems related to scientific workflows such as semantic technologies, compiler methods, fault tolerance, etc.
Time | Event |
---|---|
2:00pm-2:10pm | Welcome - Part I (Silvina Caino-Lores, Anirban Mandal) |
2:10pm-2:42pm | Invited Talk: Workflow Building Blocks: The Success Story of Environmental Modeling, HPC, and AI for Predicting Farmed Seafood Bacteria Contamination (Raffaele Montella) |
2:42pm-3:00pm | Paper: End-to-end Workflows for Climate Science: Integrating HPC Simulations, Big Data Processing and Machine Learning (Elia, Scardigno, Ejarque, D’Anca, Accarino, Scoccimarro, Donno, Peano, Immorlano, Aloisio) |
3:00pm-3:30pm | Break |
3:30pm-3:48pm | Paper: Scale Composite BaaS Services With AFCL Workflows (Larcher, Ristov) |
3:48pm-4:06pm | Paper: A Systematic Mapping Study of Italian Research on Workflows (Aldinucci, Baralis, Cardellini, Colonnelli, Danelutto, Decherchi, Di Modica, Ferrucci, Gribaudo, Iannone, Lapegna, Medic, Muscianisi, Righetti, Sciacca, Tonellotto, Tortonesi, Trunfio, Vardanega) |
4:06pm-4:16pm | Lightning Talk: Transcriptomics Atlas Pipeline: Cloud vs HPC (Kica, Lichołai, Malawski) |
4:16pm-4:26pm | Lightning Talk: Patterns and Anti-Patterns in Migrating from Legacy Workflows to Workflow Management Systems (Cassol, Froula, Kirton, Sul, Melara, Kothadia, Player, Sarrafan, Chan, Fagnan) |
4:26pm-4:44pm | Paper: Accelerating Data-Intensive Seismic Research Through Parallel Workflow Optimization and Federated Cyberinfrastructure (Adair, Rodero, Parashar, Melgar) |
4:44pm-5:02pm | Paper: Laminar: A New Serverless Stream-based Framework with Semantic Code Search and Code Completion (Zahra, Li, Filgueira) |
5:02pm-5:20pm | Paper: Optimization towards Efficiency and Stateful of dispel4py (Liang, Zhang, Yang, Heinis, Filgueira) |
5:20pm-5:30pm | Wrap Up - Part I (Silvina Caino-Lores, Anirban Mandal) |
Time | Event |
---|---|
9:00am-9:05am | Welcome - Part II (Silvina Caino-Lores, Anirban Mandal) |
9:05am-9:37am | Invited Talk: FAIRIST of Them All: Meeting Researchers Where They Are With Just-in-Time, FAIR Implementation Advice (Christine Kirkpatrick) |
9:37am-9:55am | Paper: A data science pipeline synchronisation method for edge-fog-cloud continuum (Sanchez-Gallegos, Gonzalez-Compean, Carretero, Marin-Castro) |
9:55am-10:25am | Break |
10:25am-10:43am | Paper: TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows (Sly-Delgado, Phung, Thomas, Simonetti, Hennessee, Tovar, Thain) |
10:43am-10:53am | Lightning Talk: Leveraging Large Language Models to Build and Execute Computational Workflows (Duque, Syed, Day, Berry, Katz, Kindratenko) |
10:53am-11:11am | Paper: Delivering Rules-Based Workflows for Science (Marchant, Blomqvist, Jensen, Lilholm, Nørgaard) |
11:11am-11:29am | Paper: Julia as a Unifying End-to-End Workflow Language on the Frontier Exascale System (Godoy, Valero-Lara, Anderson, Lee, Gainaru, Ferreira da Silva, Vetter) |
11:29am-11:39am | Lightning Talk: Scaling on Frontier: Uncertainty Quantification Workflow Applications using ExaWorks to Enable Full System Utilization (Titov, Carson, Rolchigo, Coleman, Belak, Bement, Laney, Turilli, Jha) |
11:39am-11:57am | Paper: Distributed Data Locality-Aware Job Allocation (Markovic, Kolovos, Soares Indrusiak) |
11:57am-12:15pm | Paper: Fluxion: A Scalable Graph-Based Resource Model for HPC Scheduling Challenges (Patki, Ahn, Milroy, Yeom, Garlick, Grondona, Herbein, Scogland) |
12:15pm-12:25pm | Lightning Talk: The Common Workflow Scheduler Interface: Status Quo and Future Plans (Lehmann, Bader, Thamsen, Leser) |
12:25pm-12:30pm | Wrap Up - Part II (Silvina Caino-Lores, Anirban Mandal) |
University of Naples “Parthenope”, Italy
Scientific workflows processing enormous amounts of data using distributed HPC systems or on-demand computational resources are a solid and reliable paradigm in data science. The orchestration of environmental models to produce simulations or forecasts is an increasingly widespread routine production workflow application. This presentation concerns our vision of workflows as building blocks for environmental applications, combining numerical and artificial intelligence models to produce augmented environmental forecasts and predictions. DagOnStar is the workflow engine developed at the HPSC SmartLab of the University of Naples "Parthenope" for orchestrating environmental models used by the Center for Monitoring and Modeling Marine and Atmosphere (CMMMA) to orchestrate weather and marine forecast production. The Center runs a routine workflow application to predict contamination by Escherichia coli (E. coli) in farmed mussels, augmenting the forecasted pollutant transport and diffusion (WaComM++ model) with an artificial intelligence model (AIQUAM model) trained with microbiological measurements. The first assessment and evaluation of the system demonstrate that the workflow application can predict E. coli presence with an accuracy of more than 90%.
Raffaele Montella is an Associate Professor with tenure in Computer Science at the Department of Science and Technologies (DiST), University of Naples “Parthenope” (UNP), Italy. He received his degree in (Marine) Environmental Science at the University of Naples “Parthenope” in 1998. He defended his Ph.D. thesis on "Environmental modeling and Grid Computing techniques", earning a Ph.D. in Marine Science and Engineering at the University of Naples "Federico II". He leads the High-Performance Scientific Computing (HPSC) Laboratory and the IT infrastructure of the UNP Center for Marine and Atmosphere Monitoring and Modeling (CMMMA). His main research topics and scientific production are focused on tools for high-performance computing, cloud computing, and GPUs with applications in the field of computational environmental science (multi-dimensional geo-referenced big data, distributed computing for modeling, and scientific workflows and science gateways), leveraging his previous (and still ongoing) experience in embedded, mobile, wearable, pervasive computing, and the Internet of Things.
San Diego Supercomputer Center, USA
Intellectual freedom, curiosity, and creativity are qualities of the academic landscape that appeal to many researchers. But a blank page in the wrong context, such as when creating a data management and sharing plan, can halt creativity. A goal for research support staff, as well as for researchers themselves, is to reduce the time spent on the mechanics of research to make more time for open-ended discovery. For data-driven science, this includes collecting and processing data so that one can find and combine research objects later. It also means preparing research objects, such as data, software, and workflows, for later reuse by others. The FAIR principles provide a conceptual framework for comprehensively ensuring research assets are accessible for reuse. Currently, researchers apply FAIR practices as best they can, based on community practices, lessons learned on the job, and other mentorship they may have received. This talk will explore how FAIR implementation practices (or any other practice that aids data management and sharing) can be provided to researchers and customized to their specific research tasks. Research workflows can be improved through new ways of sharing hard-won knowledge, and through processes that allow for peer assessment. These data sources can be repurposed in existing tools or through new interfaces, such as the FAIR+ Implementation Survey Tool (FAIRIST).
Christine Kirkpatrick leads the San Diego Supercomputer Center’s (SDSC) Research Data Services division, which manages large-scale infrastructure, networking, and services for research projects of regional and national scope. Her duties also include a leadership role on the Schmidt Futures Foundation and NSF-funded Open Storage Network, and as leader of the Data Core for the NIH-funded Metabolomics Workbench, a national data repository for metabolomics studies. Her research in computer science has centered on improving machine learning processing through research data management techniques. In addition to being PI of the EarthCube Office (ECO), Kirkpatrick founded the US GO FAIR Office, is PI of the West Big Data Innovation Hub, and Co-PI on an NSF AccelNet project: Designing a Water, Data, and Systems Science Network of Networks to Catalyze Transboundary Groundwater Resiliency Research. She serves as the Secretary General of the International Science Council's Committee on Data (CODATA), co-chairs the FAIR Digital Object Forum, and is on the external Advisory Board for the European Open Science Cloud (EOSC) Nordic and the National Academies of Sciences’ U.S. National Committee for the Committee on Data.
All deadlines are Anywhere on Earth (AoE).
Scientific workflows have been used almost universally across scientific domains and have underpinned some of the most significant discoveries of the past several decades. Workflow management systems (WMSs) provide abstraction and automation, which enable a broad range of researchers to easily define sophisticated computational processes and to then execute them efficiently on parallel and distributed computing systems. As workflows have been adopted by a number of scientific communities, they are becoming more complex and require more sophisticated workflow management capabilities. A workflow now can analyze terabyte-scale data sets, be composed of a million individual tasks, require coordination between heterogeneous tasks, manage tasks that execute for milliseconds to hours, and process data streams, files, and data placed in object stores. The computations can be single-core workloads, loosely coupled computations, or tightly coupled computations, all within a single workflow, and can run on dispersed computing platforms, from edge to core resources.
This workshop focuses on the many facets of scientific workflow management systems, ranging from actual execution to service management and the coordination and optimization of data, service, and job dependencies. The workshop covers a broad range of issues in the scientific workflow lifecycle, including: scientific workflow representation and enactment; workflow scheduling techniques to optimize the execution of the workflow on heterogeneous infrastructures; workflow enactment engines that need to deal with failures in the application and execution environment; and a number of computer science problems related to scientific workflows such as semantic technologies, compiler methods, scheduling, and fault detection and tolerance.
WORKS23 will be held in conjunction with SC23 (Supercomputing 2023) at the Colorado Convention Center in Denver, Colorado, USA.
WORKS23 welcomes original submissions in a range of areas, including but not limited to:
There will be two forms of presentations: talks and lightning talks. Submission of a full paper may result in a talk; submission of an abstract may result in a lightning talk. Each submission will receive at least three reviews from the workshop program committee.
Accepted papers from the workshop will be published in the SC Workshops Proceedings volume and made available online.
The format of the paper should follow the ACM manuscript guidelines; templates are available from this link. For LaTeX users, version 1.90 (last updated April 4, 2023) is the latest template; please use the “sigconf” option.
French Institute for Research in Computer Science and Automation (INRIA), France
Renaissance Computing Institute (RENCI), UNC Chapel Hill, USA
University of Queensland, Australia
University of Edinburgh, UK
University of Southern California, USA
University of Tennessee, USA
San Diego Supercomputer Center
Technical University of Vienna
University Carlos III of Madrid
University Carlos III of Madrid
University of Chicago
University of Southern California
University of Southern California
Technical University of Vienna
Nvidia
University of St. Andrews
Universidad Politécnica de Madrid
University of Illinois Chicago
Concordia University
Oak Ridge National Laboratory
Rutgers University
University of Illinois at Urbana-Champaign
University of Westminster
University of Tennessee
Oak Ridge National Laboratory
Newcastle University
University of Naples Parthenope
Argonne National Laboratory
University of Tennessee
Argonne National Laboratory
Lawrence Livermore National Laboratory
University of Klagenfurt
Cardiff University
University of Utah
French Institute for Research in Digital Science and Technology (Inria)
Barcelona Supercomputing Center
Oak Ridge National Laboratory
Oak Ridge National Laboratory
Oak Ridge National Laboratory
University of Calabria
French Institute for Research in Digital Science and Technology (Inria)
University of Notre Dame
Universidad de Zaragoza
Oak Ridge National Laboratory
Argonne National Laboratory
Argonne National Laboratory
For information please direct your inquiries to sc-ws-works@info.supercomputing.org, or contact the workshop chairs: