    Title: Porting douar
    Author: Douglas Guptill
    Date: 2009-06-04
    
    
    Porting to p690
    ---------------
      - frustrating and time-consuming
      - douar behaviour varies with 
         + compiler (xlf 8.1, xlf 10.1) 
         + compiler options
         + changes in noctreemax
      - I wonder if the execution time and/or results are being affected
        by the type mismatches at link time; see below for more.
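      - on the p690, using 16 processors, douar is running at about 1/300
        of the speed of grace: the elapsed time at the start of non-linear
        iteration 1 is 57001.99 on the p690 versus 168.03 on grace.
        (David Whipp ran the input.txt file on grace, and sent me the
        stdout file.)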
      - no run yet has passed the point where douar calls wsmp.
    
    
    ======== (start) output from grace ===================================
     start of non-linear iteratio      1    168.0293
     -----------------------------------
                            build system    168.0294      0.0001      0.0001
                                                                            nelem per proc (min/max)       27059       42683
                                                                            viscosity range   1.00000E-04  3.59904E+02
                                                                            viscosity capped in    212992
                              wsmp solve    215.1212     47.0919     47.0918
    ======== (end) output from grace =====================================
    
    
    ======== (start) output from p690 (np=16) ============================
     start of non-linear iteratio      1  57001.9888
     -----------------------------------
                            build system  57001.9889      0.0001      0.0001
                                                                            nelem per proc (min/max)       13522       28627
                                                                            viscosity range   1.00000E-04  3.59904E+02
                                                                            viscosity capped in    212992
                              wsmp solve  61747.2775   4745.2886   4745.2886
    ======== (end) output from p690 (np=16) ==============================
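
    To put numbers on the comparison, here is a small sketch in C that
    computes the slowdown from the grace and p690 logs above.  The values
    are copied from those logs.  Reading the second column as an elapsed
    time stamp and the third as the increment since the previous line is
    my interpretation (it is consistent with 215.1212 - 168.0294 =
    47.0918).  The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of a p690 timing to the corresponding grace timing. */
double slowdown(double p690_time, double grace_time)
{
    assert(grace_time > 0.0);
    return p690_time / grace_time;
}

/* Print both ratios using the figures from the two logs. */
void print_slowdowns(void)
{
    /* elapsed time at "start of non-linear iteratio 1" */
    printf("start of iteration 1: %.0fx\n",
           slowdown(57001.9888, 168.0293));
    /* increment reported on the "wsmp solve" line */
    printf("wsmp solve increment: %.0fx\n",
           slowdown(4745.2886, 47.0919));
}
```

    The first ratio is roughly 339 and the second roughly 101, so the
    slowdown is not the same in every phase of the run.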
    
    
    
    
    Porting to mahone.ace-net.ca
    ----------------------------
    
    For more about mahone, its hardware and software, see here:
      https://wiki.ace-net.ca/index.php/Main_Page
    
      - PGI compilers
      - quick and easy
      - runs about 10% faster than grace (using 4 processes on the head
        node).
      - no run yet has passed the point where douar calls wsmp.
      - no successful run yet; MPI problems, as yet undiagnosed:
        [clhead:03784] *** An error occurred in MPI_Send
        [clhead:03784] *** on communicator MPI_COMM_WORLD
        [clhead:03784] *** MPI_ERR_COUNT: invalid count argument
        [clhead:03784] *** MPI_ERRORS_ARE_FATAL (goodbye)
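
    The cause has not been diagnosed.  MPI_ERR_COUNT means that the count
    argument passed to the call was invalid, for example negative.  As a
    debugging aid only, here is a sketch in C of a check that could be
    applied to a message length before it is handed to MPI_Send.  The
    names count_is_valid_for_mpi and checked_count are hypothetical;
    nothing like them exists in douar.  The sketch assumes the count
    parameter has the width of a C int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if n can be passed as an MPI count
   (assumed here to be a non-negative value that fits in an int),
   otherwise 0. */
int count_is_valid_for_mpi(long long n)
{
    return n >= 0 && n <= INT_MAX;
}

/* Hypothetical helper: narrow n to int, stopping the program in a
   debug build if the value would not be a valid count. */
int checked_count(long long n)
{
    assert(count_is_valid_for_mpi(n));
    return (int) n;
}
```

    A count computed in a wider integer type and then narrowed can wrap
    to a negative value, which is one way to get this error.  Whether
    that is what happens on mahone is unknown.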
    
    
    ======== (start) output from mahone (np=4) ===========================
     start of non-linear iteratio      1    137.5392
     -----------------------------------
                            build system    137.5393      0.0000      0.0000
                                                                            nelem per proc (min/max)       54619       68563
                                                                            viscosity range   1.00000E-04  3.59904E+02
                                                                            viscosity capped in    196608
                              wsmp solve    179.2639     41.7246     41.7246
    ======== (end) output from mahone (np=4) =============================
    
    
    
    Type mismatches
    ---------------
    
    The xlf Fortran compiler on the p690 will, if asked, check for
    mismatches between calling sequences and subroutine definitions.  
    
    What do I mean by a type mismatch?  An example: argument #3 in the
    call is a scalar, while dummy argument #3 in the subroutine
    definition is an array.
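
    To make the scalar/array case concrete, here is a small sketch.  It
    is written in C rather than Fortran, and the routine total and both
    callers are invented for illustration; they are not douar code.  The
    callee takes every argument by address, which is how Fortran
    compilers commonly pass arguments, so a scalar can be handed over
    where an array is expected without either side noticing.

```c
#include <assert.h>

/* Hypothetical callee. Like a Fortran subroutine, it receives every
   argument by address; argument #3 (x) is expected to be an array
   holding *n values. */
void total(const int *n, double *result, const double *x)
{
    double sum = 0.0;
    int i;

    assert(*n >= 0);
    for (i = 0; i < *n; i++)
        sum += x[i];
    *result = sum;
}

/* Matching call: argument #3 is an array, as the definition expects. */
double call_with_array(void)
{
    double values[3] = {1.0, 2.0, 3.0};
    int n = 3;
    double r;

    total(&n, &r, values);
    return r;
}

/* Mismatched call: argument #3 is a scalar. This stays inside valid
   memory only because n is 1; a larger n would read past the scalar. */
double call_with_scalar(void)
{
    double scalar = 5.0;
    int n = 1;
    double r;

    total(&n, &r, &scalar);
    return r;
}
```

    A call like the second one can appear to work for a long time and
    then give wrong results or crash once the callee indexes beyond the
    first element, which is why these mismatches are worth chasing.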
    
    The xlf compiler found type mismatches in douar.  Some I could correct
    easily; others will require a much closer examination of the code to
    determine a fix that doesn't break douar.  There are type mismatches
    in the calls to these routines:
      nn2d_setup
      nn2d
      octree_interpolate_many
      octree_interpolate_many_derivative
      show
      delaun
      indexx
      fluvial_erosion
      diffusion_erosion
      update_time_step
    
    There are also mismatches for mpi_ routines; this is common on the
    p690; I believe they can be ignored.  See below for the complete list.
    
    
    The C code in NN
    ----------------
    (this section is for detail fanatics)
    
    The files stack.c, stackpair.c and volume.c had copies with a .cc
    extension:
      stack.c      and stack.cc
      stackpair.c  and stackpair.cc
      volume.c     and volume.cc
    
    The actual code in each pair of files (for example, stack.c and
    stack.cc) was identical, except that in one copy there was an
    underscore at the end of some routine names.
    
    The reason for these duplicates appears to be the variation in how
    Fortran compilers name external routines at link time: some append
    an underscore to the name of a non-Fortran routine, some don't.
    This results in two possibilities at link time: a reference to,
    for example,
      "stackinit" 
    or to 
      "stackinit_"
    
    I believe that the duplicated code is a maintenance problem and a
    potential source of nasty bugs.  So I removed the duplicate files and
    modified the remaining copy, adding for each routine a stub with the
    underscore-suffixed name that calls the real routine.  For example,
    the code for "stackinit" now looks like the snippet below.
    
    The C routines that were declared without a return type caused link
    failures on the p690.  The cure was to give those entry points an
    explicit return type (void).
    
    ================= (start) stackinit ======================
    /* prototypes */
    void stackinit_(void);
    void stackinit(void);

    /* provide an entry point with "_" at the end, for Fortran
       compilers that append an underscore to external names */
    void stackinit_(void) { stackinit(); }

    /* the code; head, z and struct node are declared elsewhere
       in the same file */
    void stackinit(void)
    {
        head = (struct node *) malloc(sizeof *head);
        z = (struct node *) malloc(sizeof *z);
        head->next = z;
        head->key = 0;
        z->next = z;
        z->key = 0;
    }
    ================= (end) stackinit ======================
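
    If many routines need the same treatment, the stub can be generated
    by a macro instead of being written out by hand.  The sketch below is
    one possible way to do that, not what was done in NN.  The macro name
    FORTRAN_UNDERSCORE_ALIAS is invented here, and the definition of
    struct node (an int key and a next pointer) is assumed from the
    fields used in stackinit so that the example compiles without the
    rest of the file.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout; the real definition lives in the NN sources. */
struct node { int key; struct node *next; };
struct node *head, *z;

/* Hypothetical macro: define name_() as a stub that calls name(). */
#define FORTRAN_UNDERSCORE_ALIAS(name) \
    void name##_(void) { name(); }

void stackinit(void)
{
    head = malloc(sizeof *head);
    z = malloc(sizeof *z);
    if (head == NULL || z == NULL)
        abort();            /* sketch only: no recovery attempted */
    head->next = z;
    head->key = 0;
    z->next = z;
    z->key = 0;
    /* postcondition: empty list, sentinel z points to itself */
    assert(head->next == z && z->next == z);
}

/* Expands to: void stackinit_(void) { stackinit(); } */
FORTRAN_UNDERSCORE_ALIAS(stackinit)
```

    Either name can then be resolved at link time, and there is still
    only one copy of the real code to maintain.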
    
    
    ============ (start) type mismatches from the p690 ====================
    (ld): mismatch
    ld: 0711-189 ERROR: Type mismatches were detected.
    	The following symbols are in error:
     Symbol                    Hash                   Inpndx  TY CL Source-File(Object-File) OR Import-File{Shared-object}
     ------------------------- ---------------------- ------- -- -- ------------------------------------------------------
     .mpi_reduce               ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
           ** References **                           [249]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwssmp.o])
           ** References Without Matching Definitions **
                               Fort 1C031446 20202020 [499]   ER PR (do_leaf_measurements.o)
                                                      [497]   ER PR (compute_divergence.o)
                               Fort 883D2E6F 883D2E6F [520]   ER PR (build_system_wsmp.o)
    
     .mpi_allreduce            ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
           ** References **                           [60]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgcomm.o])
                                                      [26]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgcomm.o])
                                                      [496]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                      [167]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                      [1437]  ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[parsymb.o])
           ** References Without Matching Definitions **
                               Fort A0C85910 20202020 [577]   ER PR (update_cloud_fields.o)
                                                      [393]   ER PR (move_cloud.o)
                                                      [490]   ER PR (move_surface.o)
                                                      [390]   ER PR (interpolate_velocity_on_surface.o)
                                                      [477]   ER PR (interpolate_ov_on_osolve.o)
                                                      [538]   ER PR (improve_osolve.o)
                                                      [540]   ER PR (erosion.o)
                                                      [476]   ER PR (compute_pressure.o)
                                                      [528]   ER PR (build_system_wsmp.o)
                               Fort 7300E28E 20202020 [413]   ER PR (refine_surface.o)
                                                      [402]   ER PR (check_delaunay.o)
    
     .mpi_bcast                ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
           ** References **                           [376]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwgsmp.o])
                                                      [295]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[pwssmp.o])
                                                      [24]    ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[porder.o])
                                                      [244]   ER PR (/home/beaumnt1/software/wsmp/lib/Power4/libpwsmp64.a[plda.o])
           ** References Without Matching Definitions **
                               Fort D8DDC68E 20202020 [519]   ER PR (solve_with_pwgsmp.o)
                                                      [528]   ER PR (solve_with_pwssmp.o)
                                                      [524]   ER PR (erosion.o)
                               Fort 4955A997 4955A997 [354]   ER PR (read_input_file.o)
                                                      [272]   ER PR (read_controlling_parameters.o)
                                                      [450]   ER PR (create_surfaces.o)
    
     .nn2d_setup               Fort 415F1726 DC32B44C [47]    LD PR nn.f(NN/libnn_f-q64.a[nn.o])
           ** References Without Matching Definitions **
                               Fort 5064C7F3 20202020 [536]   ER PR (erosion.o)
                                                      [466]   ER PR (create_surfaces.o)
    
     .nn2d                     Fort DAF314E0 477683E6 [182]   LD PR nn.f(NN/libnn_f-q64.a[nn.o])
           ** References Without Matching Definitions **
                               Fort 14D0B906 20202020 [538]   ER PR (erosion.o)
    
     .octree_interpolate_many  Fort 56666C28 C1BA756A [1853]  LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
           ** References Without Matching Definitions **
                               Fort 5C3279C7 20202020 [591]   ER PR (update_cloud_fields.o)
                                                      [391]   ER PR (move_cloud.o)
                                                      [480]   ER PR (move_surface.o)
                                                      [388]   ER PR (interpolate_velocity_on_surface.o)
                               Fort 8D36C986 20202020 [473]   ER PR (interpolate_ov_on_osolve.o)
    
     .octree_interpolate_many_derivative Fort 8F7B9C4E 70645C45 [2013]  LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
           ** References Without Matching Definitions **
                               Fort 7989C8EA 20202020 [482]   ER PR (move_surface.o)
    
     .mpi_wtime                ** No Hash **          [IMPORT] -- PR {/usr/lpp/ppe.poe/lib/libmpi_r.a[mpifort64_r.o]}
           ** References Without Matching Definitions **
                               Fort D298D1B5 D298D1B5 [272]   ER PR (toolbox.o)
                               Fort D8CCB3E4 D8CCB3E4 [501]   ER PR (solve_with_pwgsmp.o)
                                                      [508]   ER PR (solve_with_pwssmp.o)
    
     .show                     Fort 4955A997 0C4AC181 [738]   LD PR OctreeBitPlus.f90(OCTREE/libOctree-q64.a[OctreeBitPlus.o])
           ** References Without Matching Definitions **
                               Fort F55DBCFA 20202020 [566]   ER PR (CASCADE/libcascade-q64.a[cascade.o])
    
     .delaun                   Fort FCC34CBE F539DC8A [34]    LD PR delaun.f(NN/libnn_f-q64.a[delaun.o])
           ** References Without Matching Definitions **
                               Fort FCC34CBE 20202020 [516]   ER PR (CASCADE/libcascade-q64.a[nn_remove.o])
                                                      [121]   ER PR (NN/libnn_f-q64.a[nn.o])
                               Fort 61BF4539 20202020 [210]   ER PR (CASCADE/libcascade-q64.a[check_mesh.o])
                               Fort F6CF495C 20202020 [126]   ER PR (CASCADE/libcascade-q64.a[find_neighbours.o])
    
     .indexx                   Fort B1C0D0F4 BDD1E49E [2404]  LD PR nn.f(NN/libnn_f-q64.a[nn.o])
           ** References Without Matching Definitions **
                               Fort 410FD164 20202020 [122]   ER PR (CASCADE/libcascade-q64.a[find_neighbours.o])
                               Fort B1C0D0F4 20202020 [1847]  ER PR (NN/libnn_f-q64.a[nn.o])
                                                      [1637]  ER PR (NN/libnn_f-q64.a[nn.o])
                                                      [1451]  ER PR (NN/libnn_f-q64.a[nn.o])
    
     .fluvial_erosion          Fort 05493BBF 38B5F94F [15]    LD PR fluvial_erosion.f(CASCADE/libcascade-q64.a[fluvial_erosion.o])
           ** References Without Matching Definitions **
                               Fort 6E2E3C6A 20202020 [546]   ER PR (CASCADE/libcascade-q64.a[cascade.o])
    
     .diffusion_erosion        Fort 9BE4E4A5 B1655465 [12]    LD PR diffusion_erosion.f(CASCADE/libcascade-q64.a[diffusion_erosion.o])
           ** References Without Matching Definitions **
                               Fort A199E5A4 20202020 [548]   ER PR (CASCADE/libcascade-q64.a[cascade.o])
    
     .update_time_step         Fort B1E8249C FE6F92E6 [14]    LD PR update_time_step.f(CASCADE/libcascade-q64.a[update_time_step.o])
           ** References Without Matching Definitions **
                               Fort 949623C9 20202020 [532]   ER PR (CASCADE/libcascade-q64.a[cascade.o])
    MISMATCH: The return code is 8.
    ============ (end) type mismatches from the p690 ====================