In situ

ParaView Catalyst (Catalyst) is an in situ library with an adaptable application programming interface (API) that orchestrates the coupling between simulation and analysis and/or visualization tasks. It brings the renowned scaling capabilities of VTK and ParaView to bear on the in situ use case. The analysis and visualization tasks can be implemented in C++ or Python, and the Python scripts can be written from scratch or generated interactively with the ParaView GUI.

Figure 1: Image produced using Catalyst.

Utilizing the in situ use case enabled through Catalyst, computational scientists and engineers can:

  • compress,
  • elicit features,
  • create extracts,
  • monitor and steer simulations, and/or
  • analyze data conveniently from main memory.

For high-performance computing (HPC) workflows, the post hoc (at a later time) use case for simulation-produced data is to write the data to persistent storage during the simulation and then, at a more convenient time, read the data back into memory to perform analysis and visualization tasks. An alternative approach, becoming ever more necessary as the balance of computational architectures changes on the road to exascale [1], is the in situ use case, where analysis and/or visualization processing happens while the computed data products still reside in memory.

Figure 2: Path to analysis and visualization results via in situ and post hoc use cases.
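The difference between the two paths can be illustrated with a toy sketch. It does not use the Catalyst API; the solver, the analysis task, and every name in it (`simulate`, `analyze`, `run_post_hoc`, `run_in_situ`) are hypothetical stand-ins chosen for illustration. It only shows that the post hoc path writes and re-reads every full field, while the in situ path reduces each field to a small extract while it is still in memory.

```python
import json
import os
import tempfile


def simulate(num_steps, num_cells):
    """Toy solver (hypothetical): yields one field of floats per time step."""
    for step in range(num_steps):
        yield step, [float((step + 1) * (cell % 7)) for cell in range(num_cells)]


def analyze(field):
    """Stand-in analysis task: reduce a full field to a small extract."""
    return {"min": min(field), "max": max(field), "mean": sum(field) / len(field)}


def run_post_hoc(num_steps, num_cells, workdir):
    """Write every full field to storage, then read each one back to analyze it."""
    paths = []
    for step, field in simulate(num_steps, num_cells):
        path = os.path.join(workdir, f"step_{step:04d}.json")
        with open(path, "w") as handle:
            json.dump(field, handle)
        paths.append(path)
    bytes_written = sum(os.path.getsize(path) for path in paths)

    results = []
    for path in paths:
        with open(path) as handle:
            results.append(analyze(json.load(handle)))
    return results, bytes_written


def run_in_situ(num_steps, num_cells):
    """Analyze each field while it is still in memory; keep only the extracts."""
    return [analyze(field) for _, field in simulate(num_steps, num_cells)]


with tempfile.TemporaryDirectory() as workdir:
    post_hoc_results, bytes_written = run_post_hoc(5, 1000, workdir)

in_situ_results = run_in_situ(5, 1000)
extract_bytes = len(json.dumps(in_situ_results))

print("same analysis results:", post_hoc_results == in_situ_results)
print("post hoc bytes written:", bytes_written)
print("in situ extract bytes:", extract_bytes)
```

Both paths produce the same analysis results in this sketch; the difference is the volume of data that passes through storage, which is the cost the in situ use case is meant to avoid.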

The in situ use case for analysis and visualization tasks isn’t new. However, computational scientists and engineers have been slow to embrace it because they believe it:

  • significantly increases HPC time utilization per simulation,
  • fundamentally escalates implementation and maintenance effort, and
  • relies on a simulation domain decomposition that is likely unsuitable for scaling analysis and visualization tasks.

But in situ analysis and visualization tasks are clearly a viable solution for mitigating petascale and future exascale I/O issues, because the simulated field and geometry data are readily available in memory.

Catalyst is a lightweight, distilled version of the scalable ParaView framework, designed to be easily embedded into simulation codes. It goes a long way toward addressing the concerns that computational scientists and engineers have about the in situ use case.

Performance

Catalyst and SENSEI [2], coupled to the Parallel Hierarchic Adaptive Stabilized Transient Analysis (PHASTA) simulation code, produced the largest-known in situ simulation run to date. The run employed a 6.3 billion-cell unstructured grid and used 32,768 of the 49,152 nodes on Mira, an IBM Blue Gene/Q supercomputer at the Argonne Leadership Computing Facility at Argonne National Laboratory. With 1,048,576 Message Passing Interface (MPI) processes on over 500,000 cores, the run quadrupled the size of the previous largest-known in situ simulation run. The previous run coupled PHASTA with Catalyst, but it did not utilize SENSEI.
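As a quick sanity check, the figures quoted for this run are mutually consistent. The short calculation below uses only the numbers stated above plus one assumption that is not in the text: 16 compute cores per Blue Gene/Q node. The cells-per-process value is a plain average and says nothing about how PHASTA actually partitioned the grid.

```python
# Figures quoted in the text.
NODES_USED = 32_768
NODES_TOTAL = 49_152
MPI_PROCESSES = 1_048_576
GRID_CELLS = 6.3e9

# Assumption (not stated in the text): 16 compute cores per Blue Gene/Q node.
CORES_PER_NODE = 16

cores_used = NODES_USED * CORES_PER_NODE       # consistent with "over 500,000 cores"
ranks_per_node = MPI_PROCESSES // NODES_USED   # MPI processes per node
ranks_per_core = MPI_PROCESSES / cores_used    # MPI processes per core
machine_fraction = NODES_USED / NODES_TOTAL    # share of Mira used by the run
cells_per_rank = GRID_CELLS / MPI_PROCESSES    # average grid cells per MPI process

print(f"cores used:        {cores_used:,}")
print(f"ranks per node:    {ranks_per_node}")
print(f"ranks per core:    {ranks_per_core:.1f}")
print(f"fraction of Mira:  {machine_fraction:.1%}")
print(f"cells per process: {cells_per_rank:,.0f}")
```

Under that assumption, the run used two thirds of the machine, with two MPI processes per core and roughly six thousand cells per process on average.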

Figure 3: Catalyst visualizes flow around a jet for the largest-known in situ simulation run.

In general, Catalyst avoids the need to save intermediate results for post hoc analysis and visualization; the same tasks can instead be performed in situ. This saves considerable time through I/O avoidance, as illustrated below in Figure 4.

Figure 4: The post hoc vs in situ use case comparison for CTH: (a) post hoc (b) Catalyst in situ. Results courtesy of Sandia National Laboratories.

Another performance consideration is whether these analysis and visualization tasks scale appropriately. If they do not, a large-scale simulation may bog down during co-processing, increasing the total analysis cycle time. Computational scientists and engineers are therefore concerned about increases in HPC time usage caused by poor scaling of these tasks. Both of Catalyst’s underlying systems, VTK and ParaView, have been developed with parallel scaling in mind and generally perform extremely well. Figure 5 shows scaling plots for two popular visualization algorithms: slicing through a dataset and decimating large meshes (e.g., reducing the size of an output iso-contour).

Figure 5: Scaling performance of VTK filters: (a) slice and (b) decimate.

Such stellar performance is typical of ParaView/VTK algorithms.

Extracts

Catalyst provides access to a wide range of analysis methods and/or visualization techniques. These methods and techniques may produce images, statistical quantities, plots, derived information, and reduced, compressed or full data results.