[Paraview] cluster: works with -np 1, crashes when runs in parallel :-(

Stéphane Backaert stephanebackaert at gmail.com
Fri Nov 4 05:43:18 EDT 2011


Hello,

I have installed ParaView version 3.12.0-RC3-23-g712c45e on a cluster. It is compiled with the Intel compiler and openmpi-1.5.3, with MPI set to ON in ccmake. The cluster has no hardware acceleration, so I use Mesa-7.9 and pass the --use-offscreen-rendering flag at startup.

I launch the server with mpirun -np x ./pvserver --use-offscreen-rendering and, through an ssh connection, connect the client on my laptop (latest version too). With x = 1 that works fine :-) ... but it is slow :-(.
So I tried to use more processes, for example mpirun -np 2 ./pvserver --use-offscreen-rendering. Two processes are then running on the cluster (I can see them with the 'ps x' command).
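
For reference, here is a minimal sketch of the setup described above. Only the mpirun line is the command actually used; the login name, the port variable and the ssh port-forwarding command are assumptions added for illustration (the port is the 11111 that pvserver prints in its connection URL).

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
NP=2                      # number of MPI processes for pvserver
PORT=11111                # assumed: the port shown in pvserver's connection URL
LOGIN="user@cluster"      # hypothetical login, replace with your own

# On the cluster: start the parallel server (command from this post).
server_cmd="mpirun -np $NP ./pvserver --use-offscreen-rendering"

# On the laptop: forward the local port to the cluster over ssh (assumed
# tunnel setup), then connect the client to cs://localhost:$PORT.
tunnel_cmd="ssh -L $PORT:localhost:$PORT $LOGIN"

echo "$server_cmd"
echo "$tunnel_cmd"
```

Setting NP=1 gives the configuration that works; NP=2 gives the one that crashes.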

My problem: when I connect my client to this server, the connection is established but the server crashes immediately with the message:

Waiting for client
Connection URL: cs://localhost:11111
Client connected.
pvserver: /home/ucl/tfl/sbackaer/build/ParaView/ParaViewCore/ClientServerCore/vtkPVClientServerSynchronizedRenderers.cxx:76: virtual void vtkPVClientServerSynchronizedRenderers::SlaveEndRender(): Assertion `this->ParallelController->IsA("vtkSocketController")' failed.
[hmem00:08278] *** Process received signal ***
[hmem00:08278] Signal: Aborted (6)
[hmem00:08278] Signal code:  (-6)
pvserver: /home/ucl/tfl/sbackaer/build/ParaView/ParaViewCore/ClientServerCore/vtkPVClientServerSynchronizedRenderers.cxx:48: virtual void vtkPVClientServerSynchronizedRenderers::MasterEndRender(): Assertion `this->ParallelController->IsA("vtkSocketController")' failed.
[hmem00:08277] *** Process received signal ***
[hmem00:08277] Signal: Aborted (6)
[hmem00:08277] Signal code:  (-6)
[hmem00:08278] [ 0] /lib64/libpthread.so.0() [0x3f8a40f4c0]
[hmem00:08278] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3f89c329a5]
[hmem00:08278] [ 2] /lib64/libc.so.6(abort+0x175) [0x3f89c34185]
[hmem00:08278] [ 3] /lib64/libc.so.6(__assert_fail+0xf5) [0x3f89c2b935]
[hmem00:08278] [ 4] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkPVClientServerCore.so(_ZN38vtkPVClientServerSynchronizedRenderers14SlaveEndRenderEv+0x56) [0x7fa499c32d6e]
[hmem00:08278] [ 5] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(_ZN24vtkSynchronizedRenderers15HandleEndRenderEv+0xfe) [0x7fa493cc1e4c]
[hmem00:08278] [ 6] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(_ZN24vtkSynchronizedRenderers15HandleEndRenderEv+0x72) [0x7fa493cc1dc0]
[hmem00:08278] [ 7] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(+0x26c470) [0x7fa493cc4470]
[hmem00:08278] [ 8] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkCommon.so.pv3.12(+0x26ae71) [0x7fa48f102e71]
[hmem00:08278] [ 9] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkCommon.so.pv3.12(_ZN9vtkObject11InvokeEventEmPv+0x41) [0x7fa48f103381]
[hmem00:08278] [10] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN11vtkRenderer6RenderEv+0xdcb) [0x7fa492c18267]
[hmem00:08278] [11] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN21vtkRendererCollection6RenderEv+0xca) [0x7fa492c15fb4]
[hmem00:08278] [12] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN15vtkRenderWindow14DoStereoRenderEv+0xee) [0x7fa492c2c84c]
[hmem00:08278] [13] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN15vtkRenderWindow10DoFDRenderEv+0x54a) [0x7fa492c2c754]
[hmem00:08278] [14] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN15vtkRenderWindow10DoAARenderEv+0x7c3) [0x7fa492c2c1ff]
[hmem00:08278] [15] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkRendering.so.pv3.12(_ZN15vtkRenderWindow6RenderEv+0x868) [0x7fa492c2b7ca]
[hmem00:08278] [16] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkPVClientServerCore.so(_ZN30vtkPVSynchronizedRenderWindows6RenderEj+0x95) [0x7fa499c9caed]
[hmem00:08278] [17] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkPVClientServerCore.so(+0x1dab8e) [0x7fa499c9ab8e]
[hmem00:08278] [18] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(_ZN25vtkMultiProcessController10ProcessRMIEiPvii+0x3cb) [0x7fa493bc99bd]
[hmem00:08278] [19] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(_ZN25vtkMultiProcessController11ProcessRMIsEii+0x6a8) [0x7fa493bc957a]
[hmem00:08278] [20] /home/ucl/tfl/sbackaer/build/ParaView-bin/bin/libvtkParallel.so.pv3.12(_ZN25vtkMultiProcessController11ProcessRMIsEv+0x22) [0x7fa493bc8ed0]
[hmem00:08278] [21] ./bin/pvserver() [0x401a70]
[hmem00:08278] [22] ./bin/pvserver(main+0x25) [0x401aef]
[hmem00:08278] [23] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3f89c1ec5d]
[hmem00:08278] [24] ./bin/pvserver() [0x4017c9]
[hmem00:08278] *** End of error message ***
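
To make the mangled C++ names in the backtrace easier to read, they can be passed through c++filt. This is a small sketch assuming c++filt (part of GNU binutils) is installed; the demangle helper is a hypothetical name introduced here.

```shell
# Hypothetical helper: demangle one symbol with c++filt (GNU binutils).
demangle() {
    printf '%s\n' "$1" | c++filt
}

# Symbol taken from frame [4] of the backtrace above.
demangle _ZN38vtkPVClientServerSynchronizedRenderers14SlaveEndRenderEv
```

Frame [4] then reads as vtkPVClientServerSynchronizedRenderers::SlaveEndRender(), the same function named in the assertion message.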



I tried different MPI-related configuration options in ccmake, with no change. The cluster works fine for other MPI programs.

Any idea?

Thanks!

Stephane



