[Paraview] Programmable filter in parallel

David E DeMarle dave.demarle at kitware.com
Tue Aug 23 18:55:26 EDT 2011


Hmm,

I haven't been able to reproduce it yet. I suspected a simple numpy
dependency problem, but the code in vtkPythonProgrammableFilter.cxx
where numpy is imported does so in a try block. That code should
suffice and doesn't look like the code you've posted.

Which version of ParaView, which OS and what python version do you
have? A good bug report with details on how to reproduce the problem
would help greatly. Please submit one and I can look at it more
deeply.

David E DeMarle
Kitware, Inc.
R&D Engineer
21 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-881-4909



On Mon, Aug 15, 2011 at 3:17 PM, Sean Ziegeler (Contractor)
<sean.ziegeler.ctr at nrlssc.navy.mil> wrote:
> David,
> Actually, I think the problem is that the RequestInformation script crashes
> no matter what.
>
> I put a Reader and ProgrammableFilter object in a pvbatch pipeline.
> Apparently, anytime I put *any* string in the
> ProgrammableFilter.RequestInformationScript property (even just a comment),
> it tries to load numpy, triggering that known error with numpy that I
> mentioned before, i.e.,
> ...
>    import numpy.core.numeric as NX
> AttributeError: 'module' object has no attribute 'core'
>
>
> Thanks,
> Sean
>
>
> On 08/15/11 13:50, Sean Ziegeler wrote:
>>
>> David,
>> Well, if I don't set the extent translator, it obviously duplicates all
>> the points by p (where p=# of processes). However, the Z extents get
>> updated to the new scaled z points.
>>
>> If I do set the extent translator, it fixes the duplication, but the Z
>> extents do not get updated.
>>
>> I've tried other variations on the script, including (1) removing the
>> ShallowCopy(), (2) removing the DeepCopy() & replacing SetValue() with
>> InsertValue(), and (3) enabling Copy Arrays.
>>
>> I also tried replacing the SetValue() with numpy operations instead but
>> numpy doesn't appear to completely work in 3.10.1 (I saw a post about
>> that from a while ago).
>>
>> Thanks,
>> Sean
>>
>>
>> On 08/12/11 16:35, David E DeMarle wrote:
>>>
>>> Howdy Sean,
>>>
>>> When you set the extent translator does each processor not get a
>>> different update extent?
>>>
>>> David E DeMarle
>>> Kitware, Inc.
>>> R&D Engineer
>>> 28 Corporate Drive
>>> Clifton Park, NY 12065-8662
>>> Phone: 518-371-3971 x109
>>>
>>>
>>>
>>> On Fri, Aug 12, 2011 at 12:40 PM, Sean Ziegeler
>>> <sean.ziegeler at nrlssc.navy.mil> wrote:
>>>>
>>>> I appear to be running into the same problem with my programmable
>>>> filter. Since the Transform filter cannot scale rectilinear data, I
>>>> wrote the following programmable filter to do it:
>>>>
>>>> zscale = 0.001
>>>> pdi = self.GetInput()
>>>> pdo = self.GetOutput()
>>>> pdo.ShallowCopy(pdi)
>>>> zsi = pdi.GetZCoordinates()
>>>> zso = vtk.vtkDoubleArray()
>>>> zso.DeepCopy(zsi)
>>>> zss = zso.GetSize()
>>>> for i in xrange(zss):
>>>>     zso.SetValue(i, zsi.GetValue(i)*zscale)
>>>> pdo.SetZCoordinates(zso)
>>>>
>>>> Obviously, I need to update it to use the newer input/output names and
>>>> numpy arrays for speed, but it does work in serial. However, it appears
>>>> to duplicate every point on every processor in parallel. I've been
>>>> poring over the docs and experimenting, but I've yet to find a way to
>>>> use UPDATE_EXTENT properly in parallel with rectilinear data. Any ideas?
>>>>
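A minimal sketch of the same z-scaling applied only to the slab of points a process has been asked for. It is an illustration only: plain Python lists stand in for the VTK coordinate arrays, the helper name `scale_slab` and its parameters are hypothetical, and whether this approach resolves the duplication described above is not established in this thread.

```python
def scale_slab(z_coords, z_lo, z_hi, zscale):
    """Return scaled copies of z_coords[z_lo..z_hi] (inclusive point indices).

    z_coords stands in for the full Z coordinate array; z_lo and z_hi stand in
    for the Z range of the extent this process was asked to produce.
    """
    if not 0 <= z_lo <= z_hi < len(z_coords):
        raise ValueError("z_lo and z_hi must be valid, ordered point indices")
    # Only the requested slab is copied and scaled; the input is left untouched.
    return [z * zscale for z in z_coords[z_lo:z_hi + 1]]


if __name__ == "__main__":
    whole_z = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
    print(scale_slab(whole_z, 1, 3, 0.001))
```

Iterating over the slab rather than over a stored size also avoids depending on GetSize(), which in VTK reports allocated storage and can differ from the number of values actually held.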
>>>> Thanks,
>>>> Sean
>>>>
>>>> On 08/11/11 10:33, David E DeMarle wrote:
>>>>>
>>>>> You should end up with one multiblock dataset on each processor, all
>>>>> of those should have eight children. On any given processor 7 of those
>>>>> children will be NULL and the remaining one will be unique to that
>>>>> processor. Use UPDATE_PIECE and possibly localprocessid to figure out
>>>>> which of the eight children the processor should fill in. The rest of
>>>>> the vtkCompositeDataPipeline that ParaView uses expects and knows how
>>>>> to handle that structure and filters downstream should have no problem
>>>>> handling it.
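The layout described above can be sketched in plain Python. This is only an illustration, assuming a round-robin assignment of blocks to pieces: the helper name `children_for_piece` is hypothetical, `piece` and `num_pieces` stand in for the values a filter would read from UPDATE_PIECE and UPDATE_NUMBER_OF_PIECES, and `None` plays the role of a NULL child.

```python
def children_for_piece(num_blocks, piece, num_pieces):
    """Return a list with one slot per block of the multiblock dataset.

    Slots owned by `piece` hold the block index to fill in; every other slot
    is None, standing in for a NULL child on this process.
    """
    if not 0 <= piece < num_pieces:
        raise ValueError("piece must be in [0, num_pieces)")
    return [b if b % num_pieces == piece else None for b in range(num_blocks)]


if __name__ == "__main__":
    # 8 blocks on 8 processes: every process sees 8 children, one of them filled.
    for piece in range(8):
        print(piece, children_for_piece(8, piece, 8))
```

Every process ends up with a structure of the same length, so the child index of a block is the same everywhere even though only one process holds its data.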
>>>>>
>>>>> And no, these aren't stupid questions. They are described fairly well
>>>>> in the most recent Kitware books and courses, but otherwise the
>>>>> information is widely scattered around the ParaView wiki, the Kitware
>>>>> Source magazine, and the mailing list archives.
>>>>>
>>>>> David E DeMarle
>>>>> Kitware, Inc.
>>>>> R&D Engineer
>>>>> 28 Corporate Drive
>>>>> Clifton Park, NY 12065-8662
>>>>> Phone: 518-371-3971 x109
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Aug 11, 2011 at 11:11 AM, Tim Gallagher
>>>>> <tim.gallagher at gatech.edu> wrote:
>>>>>>
>>>>>> David,
>>>>>>
>>>>>> Thanks for your response. It's much clearer how it all works, but I'm
>>>>>> still unsure how it fits together.
>>>>>>
>>>>>> I don't actually need to know the interprocess links -- I have a list
>>>>>> of blocks to read and that list needs to be split over the processors.
>>>>>> So each processor needs to identify itself and the total number of
>>>>>> procs, but that's all. I can definitely do that with mpi4py; I was
>>>>>> unaware that it would work inside the filter, and I didn't know that
>>>>>> paraview.vtk.parallel existed.
>>>>>>
>>>>>> I'm not actually splitting the structured data; I'm splitting the
>>>>>> vtkMultiBlockDataSet. So each processor is responsible for populating
>>>>>> a portion of the dataset. For instance, in serial when the file (say,
>>>>>> with 8 blocks) is read, we end up with one vtkMultiBlockDataSet with 8
>>>>>> vtkStructuredData's inside it. If I have a parallel reader (with 8
>>>>>> processes), I have a hunch I'll end up with 8 vtkMultiBlockDataSet's
>>>>>> with one vtkStructuredData under each. Is this correct? Will this
>>>>>> cause problems for other filters downstream? If, for fun, I wanted to
>>>>>> merge it such that each processor still only retains its block, but
>>>>>> they share a common parent vtkMultiBlockDataSet, is that possible?
>>>>>>
>>>>>> I appreciate your help with this. Maybe these are stupid questions
>>>>>> answered somewhere else, but I can't seem to find them!
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "David E DeMarle"<dave.demarle at kitware.com>
>>>>>> To: gtg085x at mail.gatech.edu
>>>>>> Cc: "ParaView list"<paraview at paraview.org>
>>>>>> Sent: Thursday, August 11, 2011 9:54:24 AM
>>>>>> Subject: Re: [Paraview] Programmable filter in parallel
>>>>>>
>>>>>> ParaView tries to do no aggregation other than rendering onto the same
>>>>>> screen. Each processor is told what portion it is responsible for via
>>>>>> the UPDATE_EXTENT or UPDATE_PIECE/UPDATE_NUMBER_OF_PIECES keys and is
>>>>>> supposed to produce only what it is asked for. (See
>>>>>> http://paraview.org/Wiki/Writing_ParaView_Readers for more of the
>>>>>> story.)
>>>>>>
>>>>>> Filters that need cross communication to work properly (beyond what
>>>>>> they can get from ghost cells) do so by accessing the
>>>>>> vtkMultiProcessController that connects all of the nodes in the server
>>>>>> (or sometimes via MPI directly but that isn't recommended).
>>>>>>
>>>>>> Try the following for two means of getting a hold of the interprocess
>>>>>> links.
>>>>>> import paraview.vtk.parallel
>>>>>> #print(dir(paraview.vtk.parallel))
>>>>>> #print(dir(paraview.vtk.parallel.vtkMultiProcessController))
>>>>>> controller = paraview.vtk.parallel.vtkMultiProcessController.GetGlobalController()
>>>>>> print controller.GetLocalProcessId()
>>>>>> print controller.GetNumberOfProcesses()
>>>>>>
>>>>>> from mpi4py import MPI
>>>>>> #print(dir(MPI))
>>>>>> #print(help(MPI))
>>>>>> print MPI.COMM_WORLD.Get_rank()
>>>>>> print MPI.COMM_WORLD.Get_size()
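Building on the mpi4py variant above, here is a minimal sketch of how a script might turn the rank and size into a share of a block list. It assumes mpi4py may be missing (for example in a serial run) and falls back to a single process in that case; the helper names `get_rank_and_size` and `my_blocks` and the block file names are hypothetical. It uses print() as a function rather than the Python 2 print statement shown above.

```python
def get_rank_and_size():
    """Return (rank, size), falling back to (0, 1) when mpi4py is unavailable."""
    try:
        from mpi4py import MPI
    except ImportError:
        # Assumption: without mpi4py we behave like a single serial process.
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def my_blocks(block_names, rank, size):
    """Return this rank's round-robin share of the block list."""
    return block_names[rank::size]


if __name__ == "__main__":
    rank, size = get_rank_and_size()
    # Hypothetical block file names, used only for illustration.
    blocks = ["block_%d.xmf" % i for i in range(8)]
    print(rank, size, my_blocks(blocks, rank, size))
```

Each process runs the same script, so the split has to be computed independently on every rank from values all ranks agree on.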
>>>>>>
>>>>>> Note also that there is a "feature" in the python programmable filter
>>>>>> that comes into play with structured data. That feature says that
>>>>>> structured data is not split at all by default. If you want structured
>>>>>> data to actually be parallel you need to put this code in your python
>>>>>> programmable filter.
>>>>>>
>>>>>> from paraview import util
>>>>>> self.GetExecutive().SetExtentTranslator(
>>>>>>     self.GetExecutive().GetOutputInformation(0),
>>>>>>     vtk.vtkExtentTranslator())
>>>>>>
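To make the idea of splitting structured data concrete, the following sketch divides a point extent into slabs along Z. It is an illustration of the concept only and does not reproduce the algorithm used by vtkExtentTranslator; the function name `split_extent_along_z` is hypothetical.

```python
def split_extent_along_z(whole_extent, piece, num_pieces):
    """Return the sub-extent of `piece` when slicing a point extent along Z.

    whole_extent is (x0, x1, y0, y1, z0, z1) in point indices. Neighbouring
    slabs share one layer of points so that no layer of cells is lost. If
    there are more pieces than cell layers, some slabs come out empty
    (z_lo == z_hi).
    """
    if not 0 <= piece < num_pieces:
        raise ValueError("piece must be in [0, num_pieces)")
    x0, x1, y0, y1, z0, z1 = whole_extent
    num_cell_layers = z1 - z0
    z_lo = z0 + (num_cell_layers * piece) // num_pieces
    z_hi = z0 + (num_cell_layers * (piece + 1)) // num_pieces
    return (x0, x1, y0, y1, z_lo, z_hi)


if __name__ == "__main__":
    whole = (0, 9, 0, 9, 0, 8)
    for piece in range(4):
        print(piece, split_extent_along_z(whole, piece, 4))
```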
>>>>>>
>>>>>> David E DeMarle
>>>>>> Kitware, Inc.
>>>>>> R&D Engineer
>>>>>> 28 Corporate Drive
>>>>>> Clifton Park, NY 12065-8662
>>>>>> Phone: 518-371-3971 x109
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 3, 2011 at 11:09 AM, Tim Gallagher
>>>>>> <tim.gallagher at gatech.edu> wrote:
>>>>>>>
>>>>>>> I guess I sort of answered my own question -- the entire script runs
>>>>>>> on each processor, so I ended up with 8 copies of my data in memory
>>>>>>> (or I would have, had I not filled the 12 GB of RAM and 20 GB of swap
>>>>>>> space and my system crashed).
>>>>>>>
>>>>>>> So is there some way to query the processor information? Probably
>>>>>>> something in the RequestInformation script -- find out how many
>>>>>>> processors there are and then the prog. filter determines based on
>>>>>>> processor ID and number of processors what section of the data to
>>>>>>> load.
>>>>>>>
>>>>>>> In that case, how does the aggregation of the data work? The exact
>>>>>>> pipeline is:
>>>>>>>
>>>>>>> DataObjectGenerator("MB{}")
>>>>>>> ProgrammableFilter
>>>>>>>
>>>>>>> In serial, the PF appends blocks into the input and passes that
>>>>>>> through to the output. In parallel, that same pipeline would create a
>>>>>>> MB{} on each CPU that gets filled with that CPU's data, but at the end
>>>>>>> of this step I would want a single MB{} object, not NCPU MB{}'s.
>>>>>>>
>>>>>>> Hopefully that makes sense... I've never used PV in parallel, so I'm
>>>>>>> not sure how it all works.
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: "Tim Gallagher"<tim.gallagher at gatech.edu>
>>>>>>> To: "ParaView list"<paraview at paraview.org>
>>>>>>> Sent: Wednesday, August 3, 2011 9:24:25 AM
>>>>>>> Subject: [Paraview] Programmable filter in parallel
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I know many of the built-in readers/filters already work in parallel,
>>>>>>> but how does one write a parallel programmable filter?
>>>>>>>
>>>>>>> Our data files are XDMF and split into blocks of data. We have a
>>>>>>> single XDMF file that we can read that reads all the blocks and
>>>>>>> generates a vtkMultiBlockDataSet (this works with the built-in XDMF
>>>>>>> reader).
>>>>>>>
>>>>>>> However, each block has some ghost cells around it that are needed to
>>>>>>> do the CellDataToPointData interpolation. For large numbers of blocks,
>>>>>>> this creates far too many grid points for our machines to load. So,
>>>>>>> I've written a programmable filter that does:
>>>>>>>
>>>>>>> start with empty vtkMultiBlockDataSet
>>>>>>> for each block in restart file
>>>>>>>     read block file with XDMFReader
>>>>>>>     CellDataToPointData
>>>>>>>     strip off the extra layers of cells
>>>>>>>     append to output vtkMultiBlockDataSet
>>>>>>>
>>>>>>> If I run this in parallel, what exactly is parallel? Is the reading
>>>>>>> and CD2PD done in parallel on each block? Is none of it parallel?
>>>>>>> Ideally, I would have the loop over blocks done in parallel, but I
>>>>>>> don't know how to indicate that in the programmable filter (if it's
>>>>>>> possible).
>>>>>>>
>>>>>>> Any advice would be great,
>>>>>>>
>>>>>>> Tim
>>>>>>> _______________________________________________
>>>>>>> Powered by www.kitware.com
>>>>>>>
>>>>>>> Visit other Kitware open-source projects at
>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>
>>>>>>> Please keep messages on-topic and check the ParaView Wiki at:
>>>>>>> http://paraview.org/Wiki/ParaView
>>>>>>>
>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>> http://www.paraview.org/mailman/listinfo/paraview
>>>>>>>
>>>>>>
>>>>
>

