Title: Porting douar
Author: Douglas Guptill
Date: 2009-06-04


Porting to p690
---------------
  - frustrating and time-consuming
  - douar behaviour varies with 
     + compiler (xlf 8.1, xlf 10.1) 
     + compiler options
     + changes in noctreemax
  - I wonder if the execution time and/or results are being affected
    by the type mismatches at link time; see below for more.
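  - on the p690, using 16 processors, douar is running at roughly 1/300
    of the speed of grace, judging by the elapsed time at the start of
    the first non-linear iteration (about 57002 s, against 168 s on
    grace).  (David Whipp ran the input.txt file on grace, and sent me
    the stdout file.)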
  - no run yet has passed the point where douar calls wsmp.


======== (start) output from grace ===================================
 start of non-linear iteratio      1    168.0293
 -----------------------------------
                        build system    168.0294      0.0001      0.0001
                                                                        nelem per proc (min/max)       27059       42683
                                                                        viscosity range   1.00000E-04  3.59904E+02
                                                                        viscosity capped in    212992
                          wsmp solve    215.1212     47.0919     47.0918
======== (end) output from grace =====================================


======== (start) output from p690 (np=16) ============================
 start of non-linear iteratio      1  57001.9888
 -----------------------------------
                        build system  57001.9889      0.0001      0.0001
                                                                        nelem per proc (min/max)       13522       28627
                                                                        viscosity range   1.00000E-04  3.59904E+02
                                                                        viscosity capped in    212992
                          wsmp solve  61747.2775   4745.2886   4745.2886
======== (end) output from p690 (np=16) ==============================




Porting to mahone.ace-net.ca
----------------------------

For more about mahone, its hardware and software, see here:
  https://wiki.ace-net.ca/index.php/Main_Page

  - PGI compilers
  - quick and easy
  - runs about 10% faster than grace (using 4 processes on the head node).
  - no run yet has passed the point where douar calls wsmp.
  - no successful run yet; MPI problems, as yet undiagnosed:
    [clhead:03784] *** An error occurred in MPI_Send
    [clhead:03784] *** on communicator MPI_COMM_WORLD
    [clhead:03784] *** MPI_ERR_COUNT: invalid count argument
    [clhead:03784] *** MPI_ERRORS_ARE_FATAL (goodbye)


======== (start) output from mahone (np=4) ===========================
 start of non-linear iteratio      1    137.5392
 -----------------------------------
                        build system    137.5393      0.0000      0.0000
                                                                        nelem per proc (min/max)       54619       68563
                                                                        viscosity range   1.00000E-04  3.59904E+02
                                                                        viscosity capped in    196608
                          wsmp solve    179.2639     41.7246     41.7246
======== (end) output from mahone (np=4) =============================
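
The figures in the three stdout excerpts can be compared with a few
lines of arithmetic.  What follows is a minimal sketch in C, not part
of douar; the helper names are hypothetical.  It assumes that the first
numeric column of each line is elapsed time in seconds and that the
second is the time since the previous line (on grace, 215.1212 -
168.0294 = 47.0918, which supports this reading).

```c
#include <assert.h>

/* Elapsed time (s) at "start of non-linear iteratio 1", first column. */
static const double CLOCK_GRACE  = 168.0293;
static const double CLOCK_P690   = 57001.9888;   /* np=16 */
static const double CLOCK_MAHONE = 137.5392;     /* np=4  */

/* Second numeric column of the "wsmp solve" line: time (s) elapsed
   between the "build system" line and the "wsmp solve" line. */
static const double STEP_GRACE  = 47.0919;
static const double STEP_P690   = 4745.2886;
static const double STEP_MAHONE = 41.7246;

/* How many times longer `other` takes than `reference`.
   (hypothetical helper) */
double slowdown(double other, double reference)
{
    assert(reference > 0.0);
    return other / reference;
}

/* Fraction of the reference time saved; 0.10 means 10% less time.
   (hypothetical helper) */
double time_saved(double other, double reference)
{
    assert(reference > 0.0);
    return 1.0 - other / reference;
}
```

With these numbers, slowdown(CLOCK_P690, CLOCK_GRACE) comes to about
339 and slowdown(STEP_P690, STEP_GRACE) to about 101, while
time_saved(STEP_MAHONE, STEP_GRACE) comes to about 0.11.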



Type mismatches
---------------

The xlf Fortran compiler on the p690 will, if asked, check for
mismatches between calling sequences and subroutine definitions.  

What do I mean by a type mismatch?  An example: argument #3 in the
call is a scalar, but dummy argument #3 in the definition of the
subroutine is an array.
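
Such a mismatch can go unnoticed by the compiler because Fortran
compilers typically pass arguments by reference, so the called routine
receives only an address.  A rough C analogue, assuming that calling
convention; the routine sum_first_n and its callers are hypothetical
and are not taken from douar:

```c
#include <assert.h>
#include <stddef.h>

/* C analogue of a Fortran subroutine whose first dummy argument is an
   array:  subroutine sum_first_n(a, n, total)
   Every argument arrives as an address; the callee cannot tell whether
   `a` points at an array of n values or at a single scalar. */
void sum_first_n(const double *a, const int *n, double *total)
{
    int i;
    assert(a != NULL && n != NULL && total != NULL);
    *total = 0.0;
    for (i = 0; i < *n; ++i)
        *total += a[i];
}

/* Matching call: the actual argument is an array of n elements. */
double call_with_array(void)
{
    double a[3] = {1.0, 2.0, 3.0};
    int n = 3;
    double total;
    sum_first_n(a, &n, &total);
    return total;
}

/* Mismatched call: the actual argument is a scalar.  It is only safe
   here because n is 1; with n > 1 the callee would read past `x`. */
double call_with_scalar(void)
{
    double x = 1.0;
    int n = 1;
    double total;
    sum_first_n(&x, &n, &total);
    return total;
}
```

Both calls compile and link without complaint in this analogue, which
is why a link-time check of the kind xlf offers is useful.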

The xlf compiler found type mismatches in douar.  Some I could correct
easily; others will require a much closer examination of the code to
determine a fix that doesn't break douar.  There are type mismatches
in the calls to these routines:
  nn2d_setup
  nn2d
  octree_interpolate_many
  octree_interpolate_many_derivative
  show
  delaun
  indexx
  fluvial_erosion
  diffusion_erosion
  update_time_step

There are also mismatches for mpi_ routines; this is common on the
p690, and I believe they can be ignored.  See below for the complete
list.


The C code in NN
----------------
(this section is for detail fanatics)

The files stack.c, stackpair.c and volume.c had copies with a .cc
extension:
  stack.c      and stack.cc
  stackpair.c  and stackpair.cc
  volume.c     and volume.cc

The actual code in each pair of files (for example stack.c and
stack.cc) was identical, except that in one copy there was an
underscore at the end of some routine names.

The reason for these duplicates appears to be the variation in how
Fortran compilers link to non-Fortran routines; some add an underscore
to the name of a non-Fortran routine, some don't.  This results in two
possibilities at link time: a reference to, for example,
  "stackinit" 
or to 
  "stackinit_"

I believe that the duplicated code is a maintenance problem and a
potential source of nasty bugs.  So I removed the duplicate files, and
modified the remaining copy by adding, for each routine, a stub under
the other name which calls it.

The C routines which were untyped (declared without a return type)
caused link failures on the p690.  The cure for this was to add a type
(void) for those entry points.  For example, the code for "stackinit"
now looks like the snippet below.

================= (start) stackinit ======================
/* prototypes */
        void stackinit_(void);
        void stackinit(void);

/* provide an entry point with "_" at the end, for Fortran
   compilers which append an underscore to external names */
        void stackinit_(void) { stackinit(); }

/* the code: allocate the head and tail nodes of an empty stack;
   head, z and struct node are defined elsewhere in the file */
        void stackinit(void)
        {
          head = (struct node *) malloc(sizeof *head);
          z = (struct node *) malloc(sizeof *z);
          head->next = z; head->key = 0;
          z->next = z;
          z->key = 0;
        }
================= (end) stackinit ======================
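
If many routines need the same treatment, the stub could be generated
by a macro rather than written out by hand.  This is only a sketch of
that idea, not what is in the NN directory now: the macro name
FORTRAN_ALIAS is hypothetical, it handles only routines which take no
arguments, and the definitions of struct node, head and z are guesses
at what stack.c contains.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout; the real definitions live in stack.c. */
struct node { int key; struct node *next; };
struct node *head, *z;

/* Hypothetical helper: declare `name` and define an alias `name_`
   which forwards to it, for routines taking no arguments. */
#define FORTRAN_ALIAS(name)            \
    void name(void);                   \
    void name##_(void) { name(); }

FORTRAN_ALIAS(stackinit)

/* the code, unchanged apart from a check on malloc */
void stackinit(void)
{
    head = malloc(sizeof *head);
    z = malloc(sizeof *z);
    assert(head != NULL && z != NULL);
    head->next = z; head->key = 0;
    z->next = z;
    z->key = 0;
}
```

Either name, stackinit or stackinit_, then resolves at link time, and
both run the same code.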


============ (start) type mismatches from the p690 ====================
(ld): mismatch
ld: 0711-189 ERROR: Type mismatches were detected.
	The following symbols are in error:
 Symbol                    Hash                   Inpndx  TY CL Source-File(Object-File) OR Import-File{Shared-object}
 ------------------------- ---------------------- ------- -- -- ------------------------------------------------------
 .mpi_reduce               ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
       ** References **                           [249]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwssmp.o])
       ** References Without Matching Definitions **
                           Fort 1C031446 20202020 [499]   ER PR (do_leaf_measurements.o)
                                                  [497]   ER PR (compute_divergence.o)
                           Fort 883D2E6F 883D2E6F [520]   ER PR (build_system_wsmp.o)

 .mpi_allreduce            ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
       ** References **                           [60]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgcomm.o])
                                                  [26]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgcomm.o])
                                                  [496]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                  [167]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                  [1437]  ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[parsymb.o])
       ** References Without Matching Definitions **
                           Fort A0C85910 20202020 [577]   ER PR (update_cloud_fields.o)
                                                  [393]   ER PR (move_cloud.o)
                                                  [490]   ER PR (move_surface.o)
                                                  [390]   ER PR (interpolate_velocity_on_surface.o)
                                                  [477]   ER PR (interpolate_ov_on_osolve.o)
                                                  [538]   ER PR (improve_osolve.o)
                                                  [540]   ER PR (erosion.o)
                                                  [476]   ER PR (compute_pressure.o)
                                                  [528]   ER PR (build_system_wsmp.o)
                           Fort 7300E28E 20202020 [413]   ER PR (refine_surface.o)
                                                  [402]   ER PR (check_delaunay.o)

 .mpi_bcast                ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
       ** References **                           [376]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                  [295]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwssmp.o])
                                                  [24]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[porder.o])
                                                  [244]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[plda.o])
       ** References Without Matching Definitions **
                           Fort D8DDC68E 20202020 [519]   ER PR (solve_with_pwgsmp.o)
                                                  [528]   ER PR (solve_with_pwssmp.o)
                                                  [524]   ER PR (erosion.o)
                           Fort 4955A997 4955A997 [354]   ER PR (read_input_file.o)
                                                  [272]   ER PR (read_controlling_parameters.o)
                                                  [450]   ER PR (create_surfaces.o)

 .nn2d_setup               Fort 415F1726 DC32B44C [47]    LD PR nn.f(NN/libnn_f-q64.a[nn.o])
       ** References Without Matching Definitions **
                           Fort 5064C7F3 20202020 [536]   ER PR (erosion.o)
                                                  [466]   ER PR (create_surfaces.o)

 .nn2d                     Fort DAF314E0 477683E6 [182]   LD PR nn.f(NN/libnn_f-q64.a[nn.o])
       ** References Without Matching Definitions **
                           Fort 14D0B906 20202020 [538]   ER PR (erosion.o)

 .octree_interpolate_many  Fort 56666C28 C1BA756A [1853]  LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
       ** References Without Matching Definitions **
                           Fort 5C3279C7 20202020 [591]   ER PR (update_cloud_fields.o)
                                                  [391]   ER PR (move_cloud.o)
                                                  [480]   ER PR (move_surface.o)
                                                  [388]   ER PR (interpolate_velocity_on_surface.o)
                           Fort 8D36C986 20202020 [473]   ER PR (interpolate_ov_on_osolve.o)

 .octree_interpolate_many_derivative Fort 8F7B9C4E 70645C45 [2013]  LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
       ** References Without Matching Definitions **
                           Fort 7989C8EA 20202020 [482]   ER PR (move_surface.o)

 .mpi_wtime                ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
       ** References Without Matching Definitions **
                           Fort D298D1B5 D298D1B5 [272]   ER PR (toolbox.o)
                           Fort D8CCB3E4 D8CCB3E4 [501]   ER PR (solve_with_pwgsmp.o)
                                                  [508]   ER PR (solve_with_pwssmp.o)

 .show                     Fort 4955A997 0C4AC181 [738]   LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
       ** References Without Matching Definitions **
                           Fort F55DBCFA 20202020 [566]   ER PR (CASCADE/libcascade-q64.a[cascade.o])

 .delaun                   Fort FCC34CBE F539DC8A [34]    LD PR delaun.f(NN/libnn_f-q64.a[delaun.o])
       ** References Without Matching Definitions **
                           Fort FCC34CBE 20202020 [516]   ER PR (CASCADE/libcascade-q64.a[nn_remove.o])
                                                  [121]   ER PR (NN/libnn_f-q64.a[nn.o])
                           Fort 61BF4539 20202020 [210]   ER PR (CASCADE/libcascade-q64.a[check_mesh.o])
                           Fort F6CF495C 20202020 [126]   ER PR (CASCADE/libcascade-q64.a[find_neighbours.o])

 .indexx                   Fort B1C0D0F4 BDD1E49E [2404]  LD PR nn.f(NN/libnn_f-q64.a[nn.o])
       ** References Without Matching Definitions **
                           Fort 410FD164 20202020 [122]   ER PR (CASCADE/libcascade-q64.a[find_neighbours.o])
                           Fort B1C0D0F4 20202020 [1847]  ER PR (NN/libnn_f-q64.a[nn.o])
                                                  [1637]  ER PR (NN/libnn_f-q64.a[nn.o])
                                                  [1451]  ER PR (NN/libnn_f-q64.a[nn.o])

 .fluvial_erosion          Fort 05493BBF 38B5F94F [15]    LD PR fluvial_erosion.f(CASCADE/libcascade-q64.a[fluvial_erosion.o])
       ** References Without Matching Definitions **
                           Fort 6E2E3C6A 20202020 [546]   ER PR (CASCADE/libcascade-q64.a[cascade.o])

 .diffusion_erosion        Fort 9BE4E4A5 B1655465 [12]    LD PR diffusion_erosion.f(CASCADE/libcascade-q64.a[diffusion_erosion.o])
       ** References Without Matching Definitions **
                           Fort A199E5A4 20202020 [548]   ER PR (CASCADE/libcascade-q64.a[cascade.o])

 .update_time_step         Fort B1E8249C FE6F92E6 [14]    LD PR update_time_step.f(CASCADE/libcascade-q64.a[update_time_step.o])
       ** References Without Matching Definitions **
                           Fort 949623C9 20202020 [532]   ER PR (CASCADE/libcascade-q64.a[cascade.o])
MISMATCH: The return code is 8.
============ (end) type mismatches from the p690 ====================