Python loop over cells for DG0 elements hangs in parallel
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
DOLFIN | Confirmed | High | Unassigned |
Bug Description
This is a follow-up of https:/
The following code works fine in parallel (4 processes, mpirun -np 4) for nx = 13, but hangs, on my machine, as soon as I increase it to nx = 14.
As noted by Johan Hake in his first answer, the code does not hang if the cells are equally distributed across the processes, while it hangs when they are not.
I understand that what I'm doing below should be discouraged; however, I see no reason why it should not work.
#-------------
from dolfin import *
nx = 13
mesh = UnitCube(nx, 10, 10)
DG = FunctionSpace(mesh, "DG", 0)
E = Function(DG)
i = 0
global_idx = mesh.parallel_data().global_entity_indices(3)
#print mesh
#print global_idx
print "Number of cells", mesh.num_cells()
for c in cells(mesh):
    i = i + 1
    if c.index() > mesh.num_cells():
        print "XXXXX", c.index()
        break
    E.vector()[c.index()] = 1./c.volume()
print i
V = VectorFunctionSpace(mesh, "CG", 1)  # arguments truncated in the original report
print "Done VF"
#-------------
I fear that nothing guarantees that the global vector index corresponds to the global cell index, even for a scalar function and piecewise-constant DG elements as here. For this reason the "correct" test (still hanging) should be the following (please correct me if I'm wrong):
#-------------
from dolfin import *
import numpy
nx = 14
mesh = UnitCube(nx, 10, 10)
DG = FunctionSpace(mesh, "DG", 0)
E = Function(DG)
v = E.vector()
global_idx = mesh.parallel_data().global_entity_indices(3)
dof_map = E.function_space().dofmap()
print "Number of cells", mesh.num_cells()
indices = numpy.array([0], dtype=numpy.uintc)
i = 0
for c in cells(mesh):
    dof_map.tabulate_dofs(indices, c)
    if not v.owns_index(indices[0]):
        print "WTF", c.index(), indices[0]
        break
    v[indices[0]] = 1./c.volume()
    i = i + 1
print cpp.MPI_process_number(), i
print "Done VF"
#-------------
I think that the problem with the above code is that every time
v[indices[0]] = 1./c.volume()
is executed, VecAssemblyBegin is called as well. So, if the assignment is not executed the same number of times by all processes, we end up with some processes waiting to exit and others blocked in a PMPI_Allreduce() call that will never return, because the remaining processes are already past the loop.
If I add a v.apply("replace") at the end, then all the processes do exit the loop but, for the same reason as before, the program still hangs; this seems to support the explanation.
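The deadlock mechanism can be reproduced without DOLFIN or PETSc: any collective operation issued a different number of times on different ranks blocks the ranks that keep calling it. A minimal sketch, using Python threads and a threading.Barrier as stand-ins for MPI ranks and the Allreduce hidden inside VecAssemblyBegin (the two-rank setup, names, and the timeout are mine, purely for illustration):

```python
import threading

# Each "collective operation" is modeled by a barrier both ranks must reach.
# The timeout turns the real code's infinite hang into a fast failure.
barrier = threading.Barrier(2, timeout=1.0)
results = {}

def rank(rid, n_collectives):
    try:
        for _ in range(n_collectives):
            barrier.wait()  # analogue of the Allreduce inside VecAssemblyBegin
        results[rid] = "done"
    except threading.BrokenBarrierError:
        results[rid] = "deadlock"

# "Rank" 0 issues two collective operations, "rank" 1 only one.
t0 = threading.Thread(target=rank, args=(0, 2))
t1 = threading.Thread(target=rank, args=(1, 1))
t0.start(); t1.start()
t0.join(); t1.join()
print(results)
```

Rank 0's second wait can never complete because rank 1 has already left its loop, which is exactly the situation when processes own different numbers of cells.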
The following code seems to work for me; I leave it to you to decide whether this bug should simply be closed as invalid or whether it suggests a re-design of the Python interface.
#-----------------
from dolfin import *
import numpy
nx = 14
mesh = UnitCube(nx, 10, 10)
DG = FunctionSpace(mesh, "DG", 0)
E = Function(DG)
v = E.vector()
global_idx = mesh.parallel_data().global_entity_indices(3)
dof_map = E.function_space().dofmap()
print "Number of cells", mesh.num_cells()
indices = numpy.array([0], dtype=numpy.uintc)
i = 0
a = numpy.zeros(v.local_size())
(n0, n1) = v.local_range()
print n0, n1
for c in cells(mesh):
    dof_map.tabulate_dofs(indices, c)
    if not v.owns_index(indices[0]):
        print "WTF", c.index(), indices[0]
        break
    a[indices[0] - n0] = 1./c.volume()
v.set_local(a)
print "Done VF"
#-----------------
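For completeness, the a[indices[0] - n0] offset in the last listing works because a process's owned vector indices form one contiguous block [n0, n1), even though the cells it owns are scattered by the mesh partitioner; that is why the dofmap lookup is needed at all. A sketch of the standard PETSc-style contiguous block split (the helper local_range and the 4-rank setup are mine; DOLFIN/PETSc may configure the layout differently):

```python
def local_range(rank, nranks, N):
    """Contiguous block partition of N global indices across nranks,
    spreading the remainder over the first ranks (PETSc-style)."""
    n, r = divmod(N, nranks)
    n0 = rank * n + min(rank, r)
    n1 = n0 + n + (1 if rank < r else 0)
    return n0, n1

# Assumes UnitCube(nx, ny, nz) yields 6*nx*ny*nz tetrahedra, hence as many DG0 dofs.
N = 6 * 14 * 10 * 10
ranges = [local_range(rk, 4, N) for rk in range(4)]
print(ranges)  # -> [(0, 2100), (2100, 4200), (4200, 6300), (6300, 8400)]
```

The blocks tile [0, N) exactly, so subtracting n0 maps any owned global index to a valid position in the local array passed to set_local.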