ServerManager 2.0

From ParaQ Wiki

Huh? What's this all about?

ParaView ServerManager is the backbone of the ParaView architecture. By ServerManager we mean the vtkSMProxy and subclasses, and all the surrounding infrastructure that makes it possible to use the proxies to create pipelines from a ParaView application. Given the client-server/parallel nature of ParaView, the proxies and the actual vtkObject subclasses that do the real data-processing and rendering work can be on different processes. All the communication between the proxies and the corresponding processing-objects is abstracted away by the ServerManager.

Currently, this communication uses raw binary streams called "ClientServerStreams". These are essentially scripts that are constructed in the vtkSMProxy (vtkSMProperty etc.) subclasses and sent to the server nodes to be executed. (These scripts could very well have been VTK Python scripts; we just use the binary ClientServerStream and our own interpreter implementation to achieve the same effect without having to pass textual messages around.) There are also mechanisms to fetch information from the server processes, via GetLastResult() and GatherInformation().

We are now working on adding support for collaboration in ParaView. Multiple clients will be able to connect to a pvserver and then collaborate with each other for visualization. Think of this as multiple ServerManagers kept in sync with each other via the shared resource, that is, the server.

Because the client-side proxies and properties construct these command streams themselves and then send them to the server, the stream messages cannot be interpreted and propagated to the other ServerManagers so that they can update themselves. This would force us to build higher-level synchronization scaffolding, which adds redundant communication. If only the messages between the ServerManager and the server were at a higher level of abstraction than raw streams, we could simply send those messages to all participating clients, and they could then update their ServerManagers to sync up. That is the underlying premise for this upgrade to the ServerManager.

Wow, that's a huge article! Do I have to read this?

This article is meant for people who are familiar with the underpinnings of the existing ParaView ServerManager. If your familiarity with ServerManager is limited to writing XML and then using the proxies in your client-side code, then you don't really have to worry about this, since for the most part you will be unaffected.

Note that this is just a design document. The actual implemented API is bound to be slightly different, but the gist will remain the same.

What do you mean by messages at a higher level of abstraction?

Okay, before we dive into that, let's see what happens when a simple proxy with a simple property gets created and updated. Typically, one starts by writing the XML for the proxy.

  
  <ProxyGroup name="my_group">
    <Proxy name="SimpleProxy" class="vtkSimpleObject">
       <IntVectorProperty name="Value"
                          command="SetValue"
                          number_of_elements="1"
                          default_values="0">
       </IntVectorProperty>
    </Proxy>
  </ProxyGroup>

Once the ProxyManager has loaded this XML, you can ask it to create the proxy for you using vtkSMProxyManager::NewProxy("my_group", "SimpleProxy"). The proxy manager creates a vtkSMProxy instance and then calls vtkSMProxy::ReadXMLAttributes() on it with the XML DOM for the proxy definition. The vtkSMProxy instance starts parsing the DOM and creates vtkSMProperty subclasses as needed, which in turn process the corresponding DOM elements to decide what the command is, how many values it takes, etc. Once all this parsing is done, you get a usable vtkSMProxy instance.

Now when you call UpdateVTKObjects() on this proxy instance, vtkSMProxy::CreateVTKObjects() constructs a vtkClientServerStream message to create an instance of vtkSimpleObject and associate it with this proxy's ID; something like "Instantiate 'vtkSimpleObject' and assign it to ID 12". Then every property constructs a message to call the corresponding method on this instance with the appropriate value. In this example, the vtkSMIntVectorProperty creates a message saying "call 'SetValue' on the vtkSimpleObject instance with ID 12 with argument '0'". All these messages are then sent to the server, where they are processed, and the vtkSimpleObject instance gets created and updated as stated in the messages.

Now imagine I have another client with a ServerManager connected to the same server. As a consequence of the above action happening on client A, I want client B to have a comparable instance of "SimpleProxy" with exactly the same property values and state. Look at the message: there's no way that client B can use it to create the proxy "SimpleProxy" and update its property, since it doesn't even know what proxy was created by client A (that information is not in the message; we only know that a vtkSimpleObject is being created).

Now consider the following messages for the same action:

  1. Create proxy 'SimpleProxy' with ID 12
  2. UpdateState of 'SimpleProxy::Value' to (0) for ID 12.

Now if client A sent such a message to the server, this message can be forwarded to client B, and client B will know exactly what should be done to ensure that its ServerManager is the same as client A's. This would, however, require the server to know that "Create proxy 'SimpleProxy'" means constructing an instance of vtkSimpleObject etc., i.e. the server also needs to parse the XML.
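
The difference between the two message styles can be sketched in a few lines of C++. This is only an illustration: `CommandMessage`, `StateMessage`, and the helper functions are hypothetical names invented here, not ParaView classes, and the two state messages listed above are folded into one struct for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of a command-level stream. It only
// names the VTK class or method involved, not the proxy it came from.
struct CommandMessage {
  std::string Text;
};

// Hypothetical stand-in for a state-level message. It names the proxy
// definition and carries property values instead of method calls.
struct StateMessage {
  unsigned int GlobalID;
  std::string XMLGroup;
  std::string XMLName;
  std::string PropertyName;
  std::vector<int> Values;
};

// Roughly what client A sends today for the SimpleProxy example.
std::vector<CommandMessage> BuildCommandStream() {
  std::vector<CommandMessage> stream(2);
  stream[0].Text = "Instantiate vtkSimpleObject and assign it to ID 12";
  stream[1].Text = "Call SetValue on ID 12 with argument 0";
  return stream;
}

// The same action expressed as state.
StateMessage BuildStateMessage() {
  StateMessage msg;
  msg.GlobalID = 12;
  msg.XMLGroup = "my_group";
  msg.XMLName = "SimpleProxy";
  msg.PropertyName = "Value";
  msg.Values.push_back(0);
  return msg;
}

// Another client can recreate the proxy only when the message names the
// proxy definition (group and name) that was used.
bool NamesProxyDefinition(const StateMessage& msg) {
  return !msg.XMLGroup.empty() && !msg.XMLName.empty();
}
```

Nothing in the command stream mentions "my_group" or "SimpleProxy", which is exactly the information client B would need.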

Seriously? You are doing all this just for collaboration?

There are a few more advantages to this:

  1. This will reduce the messages sent to the server. Typically these higher-level messages tend to be more compact than the expanded streams. You only have to look at vtkSMDoubleVectorProperty or something like that to see that a single value change can result in calling a bunch of commands on the underlying VTK object to remove old values and add new ones again. Constructing this stream closer to the object, i.e. on the server side, will avoid having to send large streams between processes. This is critical. Reduced communication means better scalability out of the box.
  2. The design moves the VTK-level logic to the server side, encouraging developers to build VTK-level pipelines and components rather than creating vtkSMProxy subclasses, e.g. use vtkView and subclasses instead of the vtkSMViewProxy class hierarchy. Moving some of this logic to the server side has the advantage that it can work directly with the data or algorithm without having to gather information first. This would avoid ridiculous restrictions in the current design, such as requiring the input to the representation proxy to be set before UpdateVTKObjects() is called on it.
  3. Since messages are state change messages, it will be easier to log and trace them for debugging purposes.
  4. It promotes a philosophy that moves more logic to the server rather than the client, which should keep us from making a mess on the client side -- just peek into a vtkSMOutputPort or vtkSMRenderViewProxy and its subclasses, or any vtkSMRepresentation and subclasses, and you'll know what I am talking about.
  5. This will help make UndoRedo more robust, since UndoRedo can now work with these core state-change messages rather than the ServerManager state XML which could result in problems based on when proxies were registered etc.
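
To make the first point concrete, here is a minimal sketch of how one compact state change could expand into several calls once it reaches the server. The type `PropertyState` and the method names "RemoveAllValues" and "AddValue" are assumptions chosen for illustration; the real command names depend on the property's XML definition.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical compact state change: one property and its full new value.
struct PropertyState {
  std::string Name;
  std::vector<double> Values;
};

// Server-side expansion of one state change into the individual calls
// made on the VTK object. "RemoveAllValues" and "AddValue" are made-up
// method names standing in for whatever the property's commands are.
std::vector<std::string> ExpandToCommands(const PropertyState& state) {
  std::vector<std::string> commands;
  commands.push_back("RemoveAllValues");
  for (std::size_t i = 0; i < state.Values.size(); ++i) {
    std::ostringstream call;
    call << "AddValue " << state.Values[i];
    commands.push_back(call.str());
  }
  return commands;
}
```

Only the `PropertyState` crosses the process boundary; the expanded list of calls is built and consumed on the server.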

Can you give a quick summary of the new design?

You bet. Let's chalk some of the core infrastructure out and we can look at the details later.

  • Think of vtkSMProxy and vtkSMProperty (and their subclasses). The code in these classes serves the following functions:
  1. State management
  2. Control - e.g. marking proxies dirty as properties are modified, marking consumer proxies dirty -- all that to avoid unnecessary information fetches and updates.
  3. Update -- updating VTK objects, fetching ivar values from vtk objects.
  • Currently, since vtkSM* classes are on the client, all this logic gets executed on the client. In the "Update" code, vtkClientServerStream command messages are created which are then sent to the server and are then processed.
  • The design proposes splitting these classes into two:
  1. The client-side component that handles state and control logic (vtkSMRemoteObject subclasses).
  2. The server-side component that handles update logic (vtkPMObject subclasses)
  • The communication between a vtkSMRemoteObject and the corresponding vtkPMObject is via a state message exchange. Note that a state message is a state (or a change in state), not a sequence of commands to update the state of the other object. Currently, we are planning to use Google Protocol Buffers for serializing and communicating messages across processes.

That's pretty much the crux. All vtkSMRemoteObject and subclasses can now have vtkPMObjects on the server processes which encapsulate server-side update logic. vtkSMProxy becomes a vtkSMRemoteObject subclass. We create a new vtkPMProxy, a vtkPMObject subclass, which has code similar to vtkSMProxy to parse the XML, create the property instances, and then create the corresponding VTK object.

Whenever a property on the client changes, calling vtkSMProxy::UpdateVTKObjects() constructs a state message and sends it to the corresponding vtkPMProxy objects on all the concerned processes. They parse the state and then invoke the appropriate methods on the vtkObject.

Can you explain the vtkSMRemoteObject and vtkPMObject in more detail?

Let's start with vtkSMRemoteObject.

  • vtkSMRemoteObject is a server-manager object. Any client-side entity that requires remoting capabilities will become a vtkSMRemoteObject subclass. (vtkSMRemoteObject doesn't necessarily have to involve remoting, but that's beside the point.)
  • vtkPMObject is a process-module object for the vtkSMRemoteObject. Every vtkSMRemoteObject can have one and only one vtkPMObject on all or some of the processes. Thinking of existing proxies, the vtkPMObject exists on all the processes where the vtkObject for that proxy existed. Think of it as the part of the vtkSMProxy that sits closer to the vtkObject that the proxy stands for. It has direct access to the vtkObject, so it can have code that discovers information from it or its data without resorting to GatherInformation() as the existing vtkSMProxy subclasses tended to do, which leads to less communication.
  • The default implementation of vtkPMObject acts merely as a store for the state of the vtkSMRemoteObject. Why would one want that? It comes in handy when you think of multiple clients connected to the same server. Having a server-side state enables the object-synchronization mechanisms to kick in and keep the vtkSMRemoteObject synchronized on all clients (more on the synchronization framework later).
  • Every vtkSMRemoteObject that has a corresponding vtkPMObject has a global-id (GID) that can be used to uniquely identify the vtkSMRemoteObject as well as vtkPMObject on all the processes involved. Thus all proxies, proxy-manager etc. have a global-id.
  • Whenever the state of a vtkSMRemoteObject changes, it dispatches a state message to the corresponding vtkPMObject (identified using the GID) on the "server" processes. "Server" processes are all the processes where the vtkPMObject exists (i.e. the processes identified by the ServersFlag on vtkSMProxy in the existing ServerManager). On the first such push, the server doesn't have a vtkPMObject associated with the GID, so it peeks into the state to learn what class of vtkPMObject to create, instantiates it, and then pushes the state to it. vtkPMProxy then has code to determine the xmlgroup and xmlname from the state. It goes to the vtkSMProxyDefinitionManager (a new class spawned from vtkSMProxyManager that handles the XML definitions for all proxies) to obtain the XML definition for the proxy, and then parses it to learn which VTK class to create as well as which corresponding properties to create, and initializes them. The internal vtkPMProxy properties correspond to vtkSMProperty/vtkSMPropertyHelper and hold the VTK-object update logic.
  • vtkSMRemoteObject's API will be something like the following:
  class vtkSMRemoteObject : public vtkObject
  {
  ...
  protected:
     // Description:
     // Subclasses can call this method to send a message to its state
     // object on  the server processes specified.
     void PushState(const protobuf::message& msg);
  
     // Description:
     // Subclasses can call this method to pull the state from the 
     // state-object on the server processes specified. Returns true on successful 
     // fetch. The message is updated with the fetched state.
     bool PullState(protobuf::message& msg);
     
     // Description:
     // Same as PushState() except that the msg is not treated as a state message;
     // it is just an instantaneous trigger that is not synchronized among processes.
     void Invoke(const protobuf::message& msg);

     // Description:
     // Destroys the vtkPMObject associated with this->GlobalID.
     void DestroyPMObject();
  
  private: 
     // Global-ID for this vtkSMRemoteObject. This is assigned when needed.  
     vtkClientServerID GlobalID;

     // Servers flag identifying the processes on which the vtkPMObject corresponding
     // to this vtkSMRemoteObject exist.
     vtkTypeUInt32 Servers;

     // Identifies the connection id.
     vtkIdType ConnectionID;
  };
  • vtkPMObject's API will be something like the following:
  class vtkPMObject : public vtkObject
  {
  ...
  protected:
     // Description:
     // Subclasses can call this method to send a message to its state
     // object on  the server processes specified.
     virtual void PushState(const protobuf::message& msg);
  
     // Description:
     // Subclasses can call this method to pull the state from the 
     // state-object on the server processes specified. Returns true on successful 
     // fetch. The message is updated with the fetched state.
     virtual bool PullState(protobuf::message& msg);
     
     // Description:
     // Same as PushState() except that the msg is not treated as a state message
     // instead just an instantaneous trigger that is not synchronized among
     // processes.
     virtual void Invoke(const protobuf::message& msg);
  };
  • The API basically allows vtkSMRemoteObject and vtkPMObject to communicate with each other using messages.
  • PushState and PullState enable a vtkSMRemoteObject to push its state to the remote object or fetch state from it (for information-only properties or during a collaborative session).
  • Invoke allows the vtkSMRemoteObject to trigger an event on the vtkPMObject which does not affect the state e.g. vtkAlgorithm::Update().
  • vtkSMRemoteObject's API basically forwards the call to the vtkPMObject through the vtkProcessModule. vtkProcessModule will relay that message to the vtkPMObject on the appropriate process.
  • A PushState means: take this state msg and push it to the corresponding vtkPMObject on all processes where it exists. The vtkPMObject is identified using the GlobalID. The association of the GlobalID and vtkPMObject happens after the first PushState().
  • A PullState goes to the vtkPMObject on the root process and fetches the state from there. This is not much different from what happens with information-only properties, except that we now support fetching state for anything. If nothing else, the fetch will return the last pushed state. (There will be some smarts to avoid redundant pushes/pulls and reduce message sizes, but let's not complicate this article more than it needs to be.)
  • Invoke sends a custom message to the vtkPMObject. By using subclasses of vtkSMRemoteObject and vtkPMObject one can pass arbitrary update messages around. Of course, note that we never change the state of vtkSMRemoteObject or vtkPMObject in Invoke.
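
The push/pull/invoke semantics described above can be illustrated with a toy, single-process sketch. It assumes a state message can be modelled as a plain string map (the design proposes protobuf instead), and `ToyPMObject` is a hypothetical name standing in for the default, state-storing vtkPMObject.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a protobuf state message: property -> value.
typedef std::map<std::string, std::string> Message;

// Toy counterpart of the default vtkPMObject, which merely stores state.
class ToyPMObject {
public:
  ToyPMObject() : InvokeCount(0) {}

  // Merge the pushed (possibly partial) state into the stored state.
  void PushState(const Message& msg) {
    for (Message::const_iterator it = msg.begin(); it != msg.end(); ++it) {
      this->State[it->first] = it->second;
    }
  }

  // Fill in the stored value for every key present in msg. Returns false
  // if any requested key has never been pushed.
  bool PullState(Message& msg) const {
    bool foundAll = true;
    for (Message::iterator it = msg.begin(); it != msg.end(); ++it) {
      Message::const_iterator stored = this->State.find(it->first);
      if (stored == this->State.end()) {
        foundAll = false;
      } else {
        it->second = stored->second;
      }
    }
    return foundAll;
  }

  // A trigger is acted upon but never recorded in the stored state.
  void Invoke(const Message&) { ++this->InvokeCount; }

  std::size_t StateSize() const { return this->State.size(); }
  int InvokeCount;

private:
  Message State;
};
```

A real subclass would react to the pushed state by updating its VTK object; the point here is only that PushState and PullState touch the stored state while Invoke does not.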

What's the vtkProcessModule's role?

  • All of the vtkSMRemoteObject API mentioned above uses vtkProcessModule to send the message to the corresponding vtkPMObject.
  • Whenever a vtkPMObject is created, the process module puts it in an internal map with the global-id as the key. When a Push/Pull/Invoke trigger gets sent to a process, the process module locates the vtkPMObject in the map and then calls the corresponding method on it.
  • We will further clean up vtkProcessModule to reduce the complexity esp. during process initialization.
  • vtkProcessModule will also have some new support to enable the server to communicate with the client.
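
The GID-keyed map and the create-on-first-push behaviour can be sketched as follows. This is a simplified model of one process, not the actual vtkProcessModule: `ToyProcessModule` and `ToyPMObject` are hypothetical, and the "pm_class" key used to carry the class name is an assumption made for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, std::string> Message;  // hypothetical state message
typedef unsigned int GlobalID;

// Toy server-side object: remembers which class was requested and the state.
struct ToyPMObject {
  std::string ClassName;
  Message State;
};

// Toy process module holding the GID -> object map on one process.
class ToyProcessModule {
public:
  // Deliver a pushed state. On the first push for a GID the object does not
  // exist yet, so it is created using the class name found in the message.
  // The "pm_class" key is an assumption made for this sketch.
  void PushState(GlobalID gid, const Message& msg) {
    std::map<GlobalID, ToyPMObject>::iterator it = this->Objects.find(gid);
    if (it == this->Objects.end()) {
      ToyPMObject created;
      Message::const_iterator cls = msg.find("pm_class");
      created.ClassName =
        (cls != msg.end()) ? cls->second : std::string("vtkPMObject");
      it = this->Objects.insert(std::make_pair(gid, created)).first;
    }
    for (Message::const_iterator kv = msg.begin(); kv != msg.end(); ++kv) {
      it->second.State[kv->first] = kv->second;
    }
  }

  // Look up the object for a GID; returns a null pointer if there is none.
  const ToyPMObject* Find(GlobalID gid) const {
    std::map<GlobalID, ToyPMObject>::const_iterator it = this->Objects.find(gid);
    if (it == this->Objects.end()) {
      return 0;
    }
    return &it->second;
  }

  // Counterpart of DestroyPMObject(): drop the object for this GID.
  void Destroy(GlobalID gid) { this->Objects.erase(gid); }

private:
  std::map<GlobalID, ToyPMObject> Objects;
};
```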

How does this affect the proxies and their properties?

  • We remove code from vtkSMProxy and vtkSMProperty that constructs the vtkClientServerStream messages for updating VTK objects or getting values from them (information only properties). This code moves to vtkPMObject subclasses, say vtkPMProxy and vtkPMProxy::Property.
  • vtkSMProxy construction remains somewhat similar. You call vtkSMProxyManager::NewProxy(). It locates the DOM, instantiates vtkSMProxy (or subclass) and calls vtkSMProxy::ReadXMLAttributes(). That further creates the properties.
  • vtkSMProxy no longer has SelfID and/or VTKObjectID. It has exactly one global ID (GID), thanks to vtkSMRemoteObject. vtkSMRemoteObject can either assign itself a new GID or have it set explicitly to a specific value (useful in collaboration).
  • vtkSMProxy::CreateVTKObjects() assigns the proxy a GID if none is already set. Next it constructs a state message. The initial state will have information about the xmlgroup, the xmlname, the classnames for the vtkSMRemoteObject as well as the vtkPMObject subclass, and the values of properties that have changed from their defaults. Note that defaults don't have to be pushed anymore, since the server-side vtkPMObject has access to the XML and can read the default values for itself. Then it pushes this state to the server.
  • When the ProcessModule receives this state, it tries to locate the vtkPMObject for the GID. Since none is found, it creates a new instance of the vtkPMObject subclass specified in the msg, and then passes the state to it. For vtkSMProxy, by default it will request the creation of a vtkPMProxy on the server. vtkPMProxy will look at the state to determine the xmlgroup and xmlname and then locate the XML definition. Once it has the XML definition, it will initialize itself and its subproxies and properties using the XML. Then any changes in values are read from the state message and applied.
  • When one calls vtkSMProxy::UpdatePropertyInformation(), vtkSMProxy requests a PullState() from the server side, sending it a message that includes the list of information-only properties that need to be updated. The response will be a state message with the state for all those properties.
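
The remark that defaults need not be pushed can be illustrated with a small sketch. It assumes each property holds a single integer and that both sides obtain the same defaults from the XML definition; `PropertyValues`, `BuildInitialState`, and `ApplyState` are hypothetical names, not part of the proposed API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical property table: property name -> single integer value.
typedef std::map<std::string, int> PropertyValues;

// Client side: keep only the properties whose current value differs from
// the default given in the XML definition.
PropertyValues BuildInitialState(const PropertyValues& defaults,
                                 const PropertyValues& current) {
  PropertyValues changed;
  for (PropertyValues::const_iterator it = current.begin();
       it != current.end(); ++it) {
    PropertyValues::const_iterator def = defaults.find(it->first);
    if (def == defaults.end() || def->second != it->second) {
      changed[it->first] = it->second;
    }
  }
  return changed;
}

// Server side: start from the XML defaults, then overlay the pushed values.
PropertyValues ApplyState(const PropertyValues& defaults,
                          const PropertyValues& pushed) {
  PropertyValues result = defaults;
  for (PropertyValues::const_iterator it = pushed.begin();
       it != pushed.end(); ++it) {
    result[it->first] = it->second;
  }
  return result;
}
```

Because the server overlays the pushed values on the same defaults, it ends up with the client's full property table even though only the changed entries were sent.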

How are objects synchronized among multiple clients?

  • vtkProcessModule has a map of GIDs and the corresponding vtkPMObjects. On the pvserver root node, this map has additional information about which client connections each GID exists on. So if a proxy on two clients has the same GID, then they share the vtkPMObject on the server, and the server is also aware that both clients have vtkSMRemoteObjects for this GID.
  • Whenever a PushState() message comes to the server ProcessModule, it knows the GID for which the PushState() is called. By looking at its map, the ProcessModule knows which other clients this GID exists on, so it sends the state message to all those clients.
  • When a client receives a PushState() message it handles that by locating the vtkSMProxy associated with the GID and updating its values. It merely updates the client-side values, it doesn't do PushState() since the server-side vtkPMObject is already updated.
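
A minimal sketch of this forwarding rule, assuming a single integer of state per GID and direct function calls in place of network messages. `ToyServer` and `ToyClient` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

typedef unsigned int GlobalID;

// Toy client: stores the last known value per GID. Receiving a forwarded
// state only updates local values; it never pushes back to the server.
struct ToyClient {
  std::map<GlobalID, int> Values;
  void ReceiveState(GlobalID gid, int value) { this->Values[gid] = value; }
};

// Toy server root node: stores state per GID and which clients hold the GID.
class ToyServer {
public:
  // Register a client connection; returns its index.
  std::size_t Connect(ToyClient* client) {
    this->Clients.push_back(client);
    return this->Clients.size() - 1;
  }

  // Record that the given client has a remote object for this GID.
  void Attach(std::size_t client, GlobalID gid) {
    this->ClientsForGID[gid].insert(client);
  }

  // Handle a push from one client: update the server-side state, then
  // forward the state to every other client that holds the same GID.
  void PushState(std::size_t sender, GlobalID gid, int value) {
    this->State[gid] = value;
    const std::set<std::size_t>& holders = this->ClientsForGID[gid];
    for (std::set<std::size_t>::const_iterator it = holders.begin();
         it != holders.end(); ++it) {
      if (*it != sender) {
        this->Clients[*it]->ReceiveState(gid, value);
      }
    }
  }

private:
  std::vector<ToyClient*> Clients;
  std::map<GlobalID, int> State;
  std::map<GlobalID, std::set<std::size_t> > ClientsForGID;
};
```

Since `ReceiveState` never calls back into the server, a forwarded update cannot bounce between clients.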

How do two clients end up with proxies with the same GID in the first place?

  • For that we use the vtkSMProxyManager.
  • vtkSMProxyManager is a vtkSMRemoteObject as well, i.e. it is synchronized among processes. vtkSMProxyManager has a GID reserved for it -- say GID=1.
  • vtkSMProxyManager uses the generic vtkPMObject on the server side as a mere state cache.
  • Consider ClientA and ClientB, with ClientA creating a new proxy. When a new proxy is created on ClientA, it gets assigned a new unique GID (say 121). ClientB is oblivious of this proxy and doesn't care. Let's call this proxy Proxy121.
  • Next we register Proxy121 as ("sources", "SphereSource1"). This changes the state of the ProxyManager on ClientA. It creates a state message and pushes it to the server. This state message is something like Register 121 as (sources, SphereSource1). This message gets sent to ClientB due to the synchronization described earlier.
  • ClientB's ProxyManager now tries to update its state using this message. It knows it needs to register Proxy121 as (sources, SphereSource1), but it doesn't know what Proxy121 is. It goes to a RemoteObjectFactory (not sure exactly where this is going to be, maybe just API on ProxyManager) that does a PullState for GID 121 to obtain the full state for the proxy. Now the state has classnames for the vtkSMRemoteObject as well as the vtkPMObject (as described earlier). Using the classname for the vtkSMRemoteObject, it creates the vtkSMRemoteObject subclass (which in this case will be a vtkSMProxy subclass) and tells it to load the state. vtkSMProxy realizes it hasn't been initialized yet, so it looks at the xmlgroup and xmlname in the state to locate the XML DOM and parse it. Once the proxy becomes usable, the RemoteObjectFactory sets the GID on it so that it doesn't assign itself a new GID. Voila! We have proxies on two clients with the same GID.
  • ClientB's ProxyManager now registers this proxy, and the two ProxyManagers are in sync.
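
The ClientB side of this exchange can be sketched as below. The sketch assumes the pulled state consists only of the xmlgroup and xmlname and replaces the RemoteObjectFactory with a few lines inside a hypothetical `ToyProxyManager`; none of these names are part of the proposed API.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef unsigned int GlobalID;

// Hypothetical full state of a proxy as stored on the server.
struct ProxyState {
  std::string XMLGroup;
  std::string XMLName;
};

// Toy client-side proxy.
struct ToyProxy {
  GlobalID GID;
  ProxyState Definition;
};

// Toy server: answers PullState requests for known GIDs.
class ToyServer {
public:
  void Store(GlobalID gid, const ProxyState& state) { this->States[gid] = state; }

  bool PullState(GlobalID gid, ProxyState& out) const {
    std::map<GlobalID, ProxyState>::const_iterator it = this->States.find(gid);
    if (it == this->States.end()) {
      return false;
    }
    out = it->second;
    return true;
  }

private:
  std::map<GlobalID, ProxyState> States;
};

// Toy proxy manager on the receiving client.
class ToyProxyManager {
public:
  // Handle "Register <gid> as <name>" forwarded from another client. If the
  // GID is unknown, pull its state and create a local proxy with that GID.
  bool HandleRegister(const ToyServer& server, GlobalID gid,
                      const std::string& name) {
    if (this->Proxies.find(gid) == this->Proxies.end()) {
      ProxyState state;
      if (!server.PullState(gid, state)) {
        return false;
      }
      ToyProxy proxy;
      proxy.GID = gid;  // set explicitly instead of assigning a new one
      proxy.Definition = state;
      this->Proxies[gid] = proxy;
    }
    this->Registered[name] = gid;
    return true;
  }

  // Find a registered proxy by name; returns a null pointer if unknown.
  const ToyProxy* Find(const std::string& name) const {
    std::map<std::string, GlobalID>::const_iterator reg =
      this->Registered.find(name);
    if (reg == this->Registered.end()) {
      return 0;
    }
    std::map<GlobalID, ToyProxy>::const_iterator it =
      this->Proxies.find(reg->second);
    if (it == this->Proxies.end()) {
      return 0;
    }
    return &it->second;
  }

private:
  std::map<GlobalID, ToyProxy> Proxies;
  std::map<std::string, GlobalID> Registered;
};
```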

How does this help with Undo/Redo?

The UndoRedo manager can now listen to state changes as they are being pushed around and capture them as undo-sets. Since this state is exactly how the client communicates with the server, it won't be as fragile as the XML state, which is currently saved separately. The details are a bit too much for the scope of this article, but we'll have a separate design document covering them.
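
As a rough illustration of the idea only (the actual undo/redo design is deferred to the separate document), capturing state changes could look like the sketch below. `StateChange` and `ToyUndoStack` are hypothetical, and each property is assumed to hold a single integer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one pushed state change.
struct StateChange {
  std::string Property;
  int Before;
  int After;
};

class ToyUndoStack {
public:
  // Called whenever a state change is pushed: remember old and new values.
  void Record(const std::string& property, int before, int after) {
    StateChange change;
    change.Property = property;
    change.Before = before;
    change.After = after;
    this->Changes.push_back(change);
  }

  // Undo the most recent change by writing the previous value back.
  // Returns false when there is nothing to undo.
  bool Undo(std::map<std::string, int>& state) {
    if (this->Changes.empty()) {
      return false;
    }
    const StateChange change = this->Changes.back();
    this->Changes.pop_back();
    state[change.Property] = change.Before;
    return true;
  }

private:
  std::vector<StateChange> Changes;
};
```

In a real implementation the restored value would itself be pushed as a state message, so the undo travels through the same path as any other change.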

What happens to XML state?

The only serialization proxies do now will be to protobuf messages. Actually, we will have some encapsulation of protobuf so we can swap out protobuf under the covers in the future, if needed. We will have an XML converter that converts protobuf state to XML and vice versa for serialization. We may even start supporting binary state saving.

Discussion

  • One advantage I see of the new technique is in using a debugger. Debugging is mentioned, but with respect to logging and tracing. However, I’m more excited about being able to look at a snapshot in a debugger and trace back what client code instantiated it. An additional feature could be a debug mode that “synchronized” the client and server. That is, it makes the client wait until the server is finished processing each PushState or Invoke -- Ken Moreland