Auto Variable Loading

From ParaQ Wiki

Many readers, such as the Exodus reader, have a widget in the Object Inspector in which the user must select which variables to load; only then do they become available to use elsewhere. Although this makes sense implementation-wise, it is awkward from a user's perspective. To use a variable, you have to select it twice: once in the reader and once where you want to use it. If the user wants to look at another variable, she must go back to the reader, select the new variable, apply, and then do the action she wanted in the first place.

It would be better if the user could just select the variable she wanted to use in a filter or display and it would automatically load that variable. In the variable selection combo boxes, there would probably be some indication about whether a variable is currently loaded (e.g. different font weights). This is closer to how many other visualization systems allow users to select data.

This document proposes a possible design to simplify variable selection in this way.


Reporting Variables

The first key element of this feature is the ability for a reader (or source or filter) to report the field variables it could load or create but has not yet. Under normal operation, a reader will not actually load any variables, but it will report all of the variables that exist so that the ParaView GUI and other downstream components can list and request them.

To implement this, vtkFieldData will support the idea of a "ghost" array. A ghost array is a placeholder: it says that the data exists and can be provided on request, but that the data is not available right now.

I am not wedded to the name ghost. I think it effectively describes the function, but might be confused for the other uses of the word ghost in VTK (such as ghost cells for parallel processing). We could use the word unloaded, but it is a mouthful. --Ken 12:45, 20 February 2009 (EST)

Ghost arrays can be specified and queried with the following methods added to vtkFieldData:

 virtual int AddGhostArray(const char *name, int datatype);
 virtual int GetNumberOfGhostArrays();
 virtual const char *GetGhostArray(int index);
 virtual int GetGhostArrayDataType(const char *name);
 virtual void RemoveGhostArray(const char *name);

When an actual array is added to the vtkFieldData (for example, with the AddArray method), it is checked against the ghost arrays and any matching ghost array is removed.
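The bookkeeping described above can be sketched as a small standalone class. This is not VTK code: FieldDataSketch, HasArray, and the meaning of the return values (an index on success, -1 otherwise) are assumptions made for illustration, since the proposal does not define them.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the proposed ghost-array bookkeeping.
// This is NOT vtkFieldData; it only mirrors the five proposed methods.
class FieldDataSketch
{
private:
  struct Ghost
  {
    std::string Name;
    int DataType;
  };
  std::vector<Ghost> Ghosts;
  std::vector<std::string> Arrays; // names of the real (loaded) arrays

public:
  // Assumed return value: index of the ghost entry, or -1 if a real array
  // with this name already exists (nothing left to advertise).
  int AddGhostArray(const std::string& name, int datatype)
  {
    if (this->HasArray(name))
    {
      return -1;
    }
    for (std::size_t i = 0; i < this->Ghosts.size(); ++i)
    {
      if (this->Ghosts[i].Name == name)
      {
        this->Ghosts[i].DataType = datatype; // re-adding updates the type
        return static_cast<int>(i);
      }
    }
    this->Ghosts.push_back(Ghost{ name, datatype });
    return static_cast<int>(this->Ghosts.size()) - 1;
  }

  int GetNumberOfGhostArrays() const
  {
    return static_cast<int>(this->Ghosts.size());
  }

  // Returns the ghost array name, or nullptr if the index is out of range.
  const char* GetGhostArray(int index) const
  {
    if (index < 0 || index >= this->GetNumberOfGhostArrays())
    {
      return nullptr;
    }
    return this->Ghosts[static_cast<std::size_t>(index)].Name.c_str();
  }

  // Assumed convention: -1 means "no ghost array with this name".
  int GetGhostArrayDataType(const std::string& name) const
  {
    for (const Ghost& g : this->Ghosts)
    {
      if (g.Name == name)
      {
        return g.DataType;
      }
    }
    return -1;
  }

  void RemoveGhostArray(const std::string& name)
  {
    this->Ghosts.erase(
      std::remove_if(this->Ghosts.begin(), this->Ghosts.end(),
        [&name](const Ghost& g) { return g.Name == name; }),
      this->Ghosts.end());
  }

  // Stand-in for AddArray: a real array replaces any matching ghost.
  void AddArray(const std::string& name)
  {
    if (!this->HasArray(name))
    {
      this->Arrays.push_back(name);
    }
    this->RemoveGhostArray(name);
  }

  bool HasArray(const std::string& name) const
  {
    return std::find(this->Arrays.begin(), this->Arrays.end(), name) !=
      this->Arrays.end();
  }
};
```

With this model, a reader would call AddGhostArray once per variable in the file, and the ghost entry for a variable disappears as soon as the variable is really loaded.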

In methods that pass data from one data set to another, such as vtkFieldData::PassData, vtkDataSetAttributes::CopyAllocate, and vtkDataSetAttributes::InterpolateAllocate, the ghost arrays are passed following the same rules as the regular arrays. This might mean keeping more state for the ghost arrays, such as copy flags.

Rationale: Since the selection of arrays to be loaded comes from the pipeline (see below), it initially seems to make sense to send information about available arrays in keys attached to the output information objects during a request information call. However, by specifying ghost arrays in vtkFieldData, which is in turn attached to the point data, cell data, etc. of actual data objects, we can simplify the processing of ghost arrays as they move down the pipeline. vtkFieldData and vtkDataSetAttributes will do the bulk of the work of providing ghost array information for (hopefully) the majority of filters. Any filter that uses one of the vtkDataSetAttributes helper methods to pass/copy/interpolate data will automatically have the ghost arrays handled.
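As a rough illustration of ghost arrays following the same copy rules as regular arrays, the sketch below filters a set of ghost arrays through per-array copy flags. GhostMap, CopyFlagsSketch, and PassGhostArrays are hypothetical names; the flag semantics are only loosely modeled on the existing copy flags and are an assumption, not the actual VTK logic.

```cpp
#include <map>
#include <set>
#include <string>

// Ghost arrays modeled as name -> data type code (hypothetical layout).
using GhostMap = std::map<std::string, int>;

// Hypothetical copy flags: a global default plus per-array overrides.
struct CopyFlagsSketch
{
  bool CopyAll = true;
  std::set<std::string> ForcedOn;
  std::set<std::string> ForcedOff;

  bool ShouldCopy(const std::string& name) const
  {
    if (this->ForcedOff.count(name) != 0)
    {
      return false; // an explicit "off" always wins in this sketch
    }
    if (this->ForcedOn.count(name) != 0)
    {
      return true;
    }
    return this->CopyAll;
  }
};

// Passes ghost arrays to the output using the same test a regular
// array would have to pass.
GhostMap PassGhostArrays(const GhostMap& input, const CopyFlagsSketch& flags)
{
  GhostMap output;
  for (const auto& entry : input)
  {
    if (flags.ShouldCopy(entry.first))
    {
      output.insert(entry);
    }
  }
  return output;
}
```

A filter that turns off copying for an array it consumes would then stop advertising the corresponding ghost array downstream without any extra code.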

If the passing of ghost arrays were to be handled in one of the pipeline requests, it would be problematic to provide defaults. What would be the default? Pass everything? That would probably lead to filters reporting ghost arrays that they cannot generate. Pass nothing? That would mean we would have to make edits to just about every filter in ParaView.


Requesting Variables

Because there is no data associated with them, ghost arrays do little good unless a downstream filter can request that they be created upstream. This is done by attaching a list of the arrays needed to a REQUIRED_TYPE_ARRAYS key (where TYPE is FIELD, POINT, CELL, etc.) during the request update information phase of the pipeline execution. This can be done automatically for a filter whose input arrays are specified with the SetInputArrayToProcess method, which is pretty much a given for filters in ParaView. In this case, the code need only set the input array and update the pipeline. The new array will automatically be loaded and used during the execution.
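The request step can be sketched without VTK as follows. ArrayRequest stands in for the set of REQUIRED_TYPE_ARRAYS keys, and FilterSketch for a filter whose input array was set with SetInputArrayToProcess; both names, and the choice to forward downstream requests unchanged, are assumptions for illustration only.

```cpp
#include <map>
#include <set>
#include <string>

// Stand-in for the TYPE part of REQUIRED_TYPE_ARRAYS.
enum class Association
{
  Point,
  Cell,
  Field
};

// Stand-in for the request keys: one set of array names per association.
using ArrayRequest = std::map<Association, std::set<std::string>>;

// Hypothetical filter that processes a single named input array.
struct FilterSketch
{
  Association InputAssociation = Association::Point;
  std::string InputArrayName; // empty means "no array selected"

  // Models request update information: start from what downstream
  // asked for and add the array this filter needs itself.
  ArrayRequest RequestUpdateInformation(const ArrayRequest& downstream) const
  {
    ArrayRequest upstream = downstream;
    if (!this->InputArrayName.empty())
    {
      upstream[this->InputAssociation].insert(this->InputArrayName);
    }
    return upstream;
  }
};
```

Chaining two such filters accumulates both of their input arrays in the request that finally reaches the reader. A real filter that creates arrays itself would also have to drop those names from the request before forwarding it, which this sketch does not model.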

ParaView will, of course, incorporate the ghost arrays when allowing the user to choose input variables to filter. It will also list them when selecting a field to color by.

When the user selects an array to color by, that is usually considered a "fast" operation. However, if a ghost array is selected, it could take quite a while to load the variable being displayed. How do you convey to the user that the operation can take a while, and how do you prevent long loads triggered by accidental mouse clicks? Popping up a dialog box is probably not a good idea: that can get real annoying real quick. --Ken 13:19, 20 February 2009 (EST)


Suggested Implementation

Some filters will have to pass ghost array information explicitly in their request data method (and possibly pass requests in the other direction during request update information). However, the majority of the work will be in the implementation of readers.

These changes basically affect readers that allow you to select input arrays through the methods GetNumberOfVarArrays, GetVarArrayName, GetVarArrayStatus, and SetVarArrayStatus. Many of these readers also manage the array status with a vtkDataArraySelection object. These readers should continue to do this and continue to expose the selection. (This may eventually be hidden in the ParaView GUI, but probably not.)

When the reader receives a REQUIRED_TYPE_ARRAYS key in its output information during a request update information call, it should check the arrays requested. Requested arrays that don't exist should probably be flagged with warnings. Note that arrays should not be turned off in response to a request. If the pipeline branches out, the separate branches could request different arrays, and turning arrays off could cause thrashing.
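A minimal sketch of this reader-side rule follows. ReaderSketch and HandleRequiredArrays are hypothetical names, and the status map merely stands in for a vtkDataArraySelection; the point is only that a request can turn arrays on, never off, and that unknown names produce warnings.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical reader that tracks an on/off status per available array.
class ReaderSketch
{
public:
  explicit ReaderSketch(const std::vector<std::string>& available)
  {
    for (const std::string& name : available)
    {
      this->Status[name] = false; // nothing is loaded by default
    }
  }

  // Direct selection, as readers already allow today.
  void SetArrayStatus(const std::string& name, bool enabled)
  {
    auto it = this->Status.find(name);
    if (it != this->Status.end())
    {
      it->second = enabled;
    }
  }

  bool GetArrayStatus(const std::string& name) const
  {
    auto it = this->Status.find(name);
    return it != this->Status.end() && it->second;
  }

  // Enables every requested array that exists and returns one warning
  // per requested array that does not. Arrays that are already on stay
  // on, so requests from different branches cannot undo each other.
  std::vector<std::string> HandleRequiredArrays(
    const std::set<std::string>& requested)
  {
    std::vector<std::string> warnings;
    for (const std::string& name : requested)
    {
      auto it = this->Status.find(name);
      if (it == this->Status.end())
      {
        warnings.push_back("Requested array not found: " + name);
      }
      else
      {
        it->second = true;
      }
    }
    return warnings;
  }

private:
  std::map<std::string, bool> Status;
};
```

If one branch requests pressure and another later requests velocity, both end up enabled, so alternating updates of the two branches do not repeatedly unload and reload either array.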

Because the pipeline could be changing the underlying state of loaded variables, the server manager/GUI will have to be more careful about updating the state. --Ken 13:19, 20 February 2009 (EST)

Rationale: Keeping around the mechanism to load variables as properties will be helpful for other VTK applications that will not use the pipeline to load variables (just as many readers still allow you to select the time step directly). It is also probably a good idea to still allow the user to select variables to load in the reader, at least in the short term. Unlike time, there are plenty of opportunities for the variable requests to get lost, and being able to load them directly on the reader may still be necessary.

Issues

Bad Ghost Array Support

Regardless of how good our default ghost array passing mechanism is, there will inevitably be cases where a filter does not support it correctly. In those cases, the new mechanism becomes even more difficult to use than the old one.

Fast Pass Implementation

When request data is called, that usually means the entire operation is redone. Requesting a new array therefore means that the entire data set is loaded and processed down the pipeline again. Much of that work could probably be skipped (for example, by reading only the new array from disk).

It would be much nicer if there were a convention to support that. The particle tracer has something sort of like that, but it has not gotten a lot of acceptance yet. Plus, this might require something slightly different.
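One possible shape for such a shortcut on the reader side is sketched below, assuming the mesh and time step are unchanged between requests. CachingReaderSketch and its methods are hypothetical, ReadFromDisk is a placeholder rather than a real file read, and nothing here addresses how downstream filters would avoid reprocessing.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical reader that keeps already-loaded arrays in memory and
// reads only the arrays that are newly requested.
class CachingReaderSketch
{
public:
  // Makes sure every requested array is in memory.
  // Returns the number of arrays that had to be read in this call.
  int Update(const std::set<std::string>& requested)
  {
    int readsThisCall = 0;
    for (const std::string& name : requested)
    {
      if (this->Cache.find(name) == this->Cache.end())
      {
        this->Cache[name] = this->ReadFromDisk(name);
        ++readsThisCall;
      }
    }
    this->TotalReads += readsThisCall;
    return readsThisCall;
  }

  int GetTotalReads() const { return this->TotalReads; }

  bool IsLoaded(const std::string& name) const
  {
    return this->Cache.find(name) != this->Cache.end();
  }

private:
  // Placeholder for the expensive read; returns dummy values.
  std::vector<double> ReadFromDisk(const std::string&)
  {
    return std::vector<double>(4, 0.0);
  }

  std::map<std::string, std::vector<double>> Cache;
  int TotalReads = 0;
};
```

Requesting pressure and then pressure plus velocity costs two reads in total rather than three, because the second update reads only velocity.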

Acknowledgments

This work was done in part at Sandia National Laboratories. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.