VtkArray Changes for In-Situ Array Mapping

There are plans to replace the vtkDataArrays with the vtkArray hierarchy. As we make this change, we should also consider changing how the class lays out its memory. This matters most when VTK is used in situ with a simulation, because the solver may well store data in a layout different from the one VTK implicitly assumes. We want the vtkArray hierarchy to encapsulate the layout of the internal memory, so that it is straightforward to provide a new subclass of vtkArray that filters use correctly without deep memory copies.

The Main Problem

The main problem with the vtkArray hierarchy, with respect to switching memory layouts, is the existence of the method vtkDenseArray::GetStorage(). This method returns the raw array used in vtkDenseArray, and callers assume that array has a particular layout. A subclass of vtkDenseArray that supports a different layout would therefore have to override this method to allocate a new array and copy the data into it.

There are legitimate reasons to get the raw array from a vtkDenseArray. However, the interface should discourage this use unless it is actually necessary, and it should make clear that an allocation and deep copy might occur. For the filter programmer, it should be easier (and still efficient) to operate directly against the interface of the array itself.
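To make the cost concrete, here is a minimal sketch of the two access styles. It does not use VTK: SolverField, CopyToInterleaved, GetComponent, and SumOfSquaredMagnitudes are hypothetical names invented for illustration, and the interleaved target layout is only an assumed example of "the layout the caller expects".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for solver-owned data: one array per vector component.
struct SolverField {
  std::vector<double> x, y, z;
};

// What a GetStorage()-style call is forced to do when the caller expects
// interleaved tuples (x0 y0 z0 x1 y1 z1 ...): allocate and deep copy.
std::vector<double> CopyToInterleaved(const SolverField& f) {
  assert(f.x.size() == f.y.size() && f.y.size() == f.z.size());
  std::vector<double> out;
  out.reserve(3 * f.x.size());
  for (std::size_t i = 0; i < f.x.size(); ++i) {
    out.push_back(f.x[i]);
    out.push_back(f.y[i]);
    out.push_back(f.z[i]);
  }
  return out;
}

// Accessor-style alternative: reads the solver memory in place, no copy.
double GetComponent(const SolverField& f, std::size_t tuple, int component) {
  assert(component >= 0 && component < 3);
  const std::vector<double>& c =
      (component == 0) ? f.x : (component == 1) ? f.y : f.z;
  return c.at(tuple);
}

// A filter-like computation written only against the accessor.
double SumOfSquaredMagnitudes(const SolverField& f) {
  double sum = 0.0;
  for (std::size_t i = 0; i < f.x.size(); ++i) {
    for (int c = 0; c < 3; ++c) {
      const double v = GetComponent(f, i, c);
      sum += v * v;
    }
  }
  return sum;
}
```

The filter-like function never needs the copy; only code that truly requires a contiguous buffer in a specific layout has to pay for CopyToInterleaved.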

To use vtkArray in place of vtkDataArray, the implementation (at least initially) will reimplement the vtkDataArray hierarchy to internally use vtkArray. The vtkDataArrays should also deprecate the GetPointer and GetVoidPointer methods for the same reason. The correct way to get these pointers should be to get the underlying vtkArray and request the raw array from it.

Review of vtkArray Hierarchy

The hierarchy of vtkArray types is (conceptually) as follows.

[Figure: conceptual class hierarchy of the vtkArray types.]

For the purposes of this discussion, we will be talking exclusively about the vtkDenseArray branch. This is the type of array most likely to be used in data set processing, because data sets typically store a field value per point or cell, and the dense array is the most efficient structure for that. It may make sense to do a similar thing for sparse arrays. Then again, code capable of iterating over a vtkSparseArray is almost certainly able to handle a vtkDenseArray as well. In that case, it makes more sense to downcast only to vtkTypedArray, and it is then straightforward to just create a new subclass of vtkTypedArray.

Approach 1: Specify Memory Layout in Subclass

In this approach, most of the functionality is moved out of vtkDenseArray. It becomes an abstract class for which subclasses specify the memory layout and provide the accessor methods. The simple class hierarchy is as follows.

[Figure: class hierarchy diagram for Approach 1.]

The idea here is that VTK provides the class vtkDefaultDenseArray that uses the default VTK layout. When you call vtkDenseArray::New(), a vtkDefaultDenseArray is created. If a system needs to support some other layout to share an array with another code base, another subclass of vtkDenseArray could easily be made to share the array.

To get access to the raw pointer, the calling code would have to do a safe downcast to a vtkDefaultDenseArray. If that fails, it would have to create a new vtkDefaultDenseArray and do a deep copy. (That reminds me: the current implementation of vtkDenseArray::DeepCopy is nonsensical and should be changed.) Both of these operations could be wrapped in a method in vtkDenseArray, but the semantics would be odd because the array would sometimes be shallow copied and sometimes deep copied.
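The downcast-or-copy logic can be sketched as follows. This is an illustration under stated assumptions, not the VTK API: DenseArray, DefaultDenseArray, StridedDenseArray, and AsDefaultLayout are hypothetical stand-ins, arrays are one-dimensional arrays of double for brevity, and std::shared_ptr replaces VTK reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical abstract base: accessors only, no layout.
class DenseArray {
public:
  virtual ~DenseArray() = default;
  virtual std::size_t Size() const = 0;
  virtual double GetValue(std::size_t i) const = 0;
};

// Hypothetical default subclass: owns contiguous storage and exposes it.
class DefaultDenseArray : public DenseArray {
public:
  explicit DefaultDenseArray(std::vector<double> v) : Values(std::move(v)) {}
  std::size_t Size() const override { return Values.size(); }
  double GetValue(std::size_t i) const override { return Values.at(i); }
  const double* GetStorage() const { return Values.data(); }
private:
  std::vector<double> Values;
};

// Hypothetical in-situ subclass: views every Stride-th value of solver
// memory without copying. The solver keeps ownership of the buffer.
class StridedDenseArray : public DenseArray {
public:
  StridedDenseArray(const double* base, std::size_t count, std::size_t stride)
    : Base(base), Count(count), Stride(stride) {}
  std::size_t Size() const override { return Count; }
  double GetValue(std::size_t i) const override {
    assert(i < Count);
    return Base[i * Stride];
  }
private:
  const double* Base;
  std::size_t Count;
  std::size_t Stride;
};

// Downcast when possible (shallow), otherwise allocate and deep copy.
// The 'copied' flag exposes the mixed shallow/deep semantics to the caller.
std::shared_ptr<const DefaultDenseArray> AsDefaultLayout(
    const std::shared_ptr<const DenseArray>& array, bool* copied) {
  if (auto same = std::dynamic_pointer_cast<const DefaultDenseArray>(array)) {
    if (copied) { *copied = false; }
    return same;
  }
  std::vector<double> values(array->Size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    values[i] = array->GetValue(i);
  }
  if (copied) { *copied = true; }
  return std::make_shared<DefaultDenseArray>(std::move(values));
}
```

The explicit flag is one way to surface the "sometimes shallow, sometimes deep" behavior instead of hiding it; whether a wrapper method should report it at all is a design question this sketch does not settle.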

Pros:

  • Simple class hierarchy.
  • If virtual function calls are an issue, a clever type casting template could downcast to vtkDefaultDenseArray if possible or vtkDenseArray if not. This complication would be hidden from all but the casting template.
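One possible shape for such a casting template is sketched below. All names (DenseArray, DefaultDenseArray, ConstantDenseArray, DispatchOnLayout, SumWorker, TookFastPath) are hypothetical and the arrays are simplified to one dimension of double. Marking the default class final may let a compiler devirtualize the calls on the fast path, but that is an optimization the compiler is free to skip.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class DenseArray {
public:
  virtual ~DenseArray() = default;
  virtual std::size_t Size() const = 0;
  virtual double GetValue(std::size_t i) const = 0;
};

// Default layout. 'final' tells the compiler no further overrides exist.
class DefaultDenseArray final : public DenseArray {
public:
  explicit DefaultDenseArray(std::vector<double> v) : Values(std::move(v)) {}
  std::size_t Size() const override { return Values.size(); }
  double GetValue(std::size_t i) const override {
    assert(i < Values.size());
    return Values[i];
  }
private:
  std::vector<double> Values;
};

// Hypothetical alternative layout: every entry has the same value.
class ConstantDenseArray : public DenseArray {
public:
  ConstantDenseArray(std::size_t count, double value)
    : Count(count), Value(value) {}
  std::size_t Size() const override { return Count; }
  double GetValue(std::size_t) const override { return Value; }
private:
  std::size_t Count;
  double Value;
};

// The casting template: call the worker with the concrete default type
// when the downcast succeeds, otherwise with the abstract base type.
template <typename Worker>
auto DispatchOnLayout(const DenseArray& array, Worker&& worker)
    -> decltype(worker(array)) {
  if (const DefaultDenseArray* fast =
          dynamic_cast<const DefaultDenseArray*>(&array)) {
    return worker(*fast);
  }
  return worker(array);
}

// A worker written once as a template; it is instantiated for both types.
struct SumWorker {
  template <typename ArrayT>
  double operator()(const ArrayT& a) const {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.Size(); ++i) { sum += a.GetValue(i); }
    return sum;
  }
};

// Reports which overload the dispatcher selected.
struct TookFastPath {
  bool operator()(const DefaultDenseArray&) const { return true; }
  bool operator()(const DenseArray&) const { return false; }
};
```

Filter code would call DispatchOnLayout(array, SumWorker()) and never name vtkDefaultDenseArray's stand-in directly, which keeps the downcast confined to the template.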

Cons:

  • The hierarchy might lead users to downcast to vtkDefaultDenseArray instead of vtkDenseArray, either because it is not clear which class to use or in a misguided attempt to get around virtual function calls.

Approach 2: Strategy Pattern

In this approach, vtkDenseArray remains at the bottom of the class hierarchy, but it uses a helper class to manage the actual data layout. The memory layout can be changed by simply changing this helper class. The class hierarchy looks as follows.

[Figure: class hierarchy diagram for Approach 2.]

The use and ramifications are obvious to anyone familiar with the strategy pattern. The interface remains simple. Changing the memory layout requires changing the smaller vtkMemoryBlock class, which handles nothing more than data management.
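For readers who want the pattern spelled out, a minimal sketch follows. Only the name vtkMemoryBlock comes from the text above; the classes here (MemoryBlock, OwnedBlock, ExternalBlock, DenseArray) and their methods are hypothetical stand-ins, simplified to one-dimensional arrays of double with std::shared_ptr in place of VTK reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Strategy interface: knows only how to find element i in memory.
class MemoryBlock {
public:
  virtual ~MemoryBlock() = default;
  virtual std::size_t Size() const = 0;
  virtual double& At(std::size_t i) = 0;
};

// Default strategy: owns contiguous storage.
class OwnedBlock : public MemoryBlock {
public:
  explicit OwnedBlock(std::size_t n) : Values(n, 0.0) {}
  std::size_t Size() const override { return Values.size(); }
  double& At(std::size_t i) override { return Values.at(i); }
private:
  std::vector<double> Values;
};

// In-situ strategy: wraps solver-owned memory with a stride, no copy.
class ExternalBlock : public MemoryBlock {
public:
  ExternalBlock(double* base, std::size_t count, std::size_t stride = 1)
    : Base(base), Count(count), Stride(stride) {}
  std::size_t Size() const override { return Count; }
  double& At(std::size_t i) override {
    assert(i < Count);
    return Base[i * Stride];
  }
private:
  double* Base;
  std::size_t Count;
  std::size_t Stride;
};

// The array class keeps one interface and delegates layout to the block.
class DenseArray {
public:
  explicit DenseArray(std::shared_ptr<MemoryBlock> block)
    : Block(std::move(block)) {}
  std::size_t Size() const { return Block->Size(); }
  double GetValue(std::size_t i) const { return Block->At(i); }
  void SetValue(std::size_t i, double v) { Block->At(i) = v; }
  // A shallow copy is just a second array sharing the same block.
  DenseArray ShallowCopy() const { return DenseArray(Block); }
private:
  std::shared_ptr<MemoryBlock> Block;
};
```

Every GetValue and SetValue goes through one virtual call on the block, which is the cost listed under the cons for this approach.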

Pros:

  • Simple external class hierarchy.
  • Shallow array copies become easy to implement (for example bug #9564).

Cons:

  • No getting around a virtual method call for each access.
  • Pollutes the namespace a bit with yet another class hierarchy for managing arrays.

Approach 3: Flexible Layout Implementation

In this approach, the class hierarchy remains exactly as it exists now. However, instead of supporting a fixed layout, vtkDenseArray provides a means of specifying how the data is laid out. For example, it could specify the strides for each dimension.
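As a sketch of what stride-based addressing means, the element at index (i0, i1, ...) lives at offset i0*stride0 + i1*stride1 + ... from a base pointer. The StridedView class below is a hypothetical illustration, not part of VTK; it shows the same buffer addressed with two different stride sets and no virtual calls.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical non-owning view: base pointer, extents, and one stride
// (in elements) per dimension.
class StridedView {
public:
  StridedView(double* base, std::vector<std::size_t> extents,
              std::vector<std::size_t> strides)
    : Base(base), Extents(std::move(extents)), Strides(std::move(strides)) {
    assert(Extents.size() == Strides.size());
  }

  // offset = sum over dimensions of index[d] * Strides[d]
  double& At(const std::vector<std::size_t>& index) const {
    assert(index.size() == Extents.size());
    std::size_t offset = 0;
    for (std::size_t d = 0; d < index.size(); ++d) {
      assert(index[d] < Extents[d]);
      offset += index[d] * Strides[d];
    }
    return Base[offset];
  }

private:
  double* Base;
  std::vector<std::size_t> Extents;
  std::vector<std::size_t> Strides;
};
```

For a 2 x 3 array over a six-element buffer, strides {3, 1} give a row-major reading and strides {1, 2} give a column-major reading of the same memory. Layouts that live in several separate allocations cannot be described by a single base pointer plus strides.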

Pros:

  • No change to the class hierarchy.
  • No need for virtual function calls.

Cons:

  • The flexible index math could negatively affect the time for each data access due to increased pointer arithmetic. (Both this and the virtual function call might be too small to care about.)
  • Memory layouts are limited to what is directly supported by the implementation of vtkDenseArray. Would it be able to handle the case of each component of a vector stored in a different array? Would it be able to handle the case where each slab in a 3D block was stored in a different memory location? (We've run into that.)
  • Legitimate uses of getting the raw data are complicated by having to deal with various memory layouts.

Acknowledgment

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

SAND 2010-8321 P