Travis CI containers only give me 4 GB of memory. Grappa is allocating all of that in the shared heap (or 4x that if the printout is the per-thread amount).
I read through the source and cannot see how to set this myself. Is that supported?
I guess I could run only 2 procs, but I like testing with 4 even though that oversubscribes the cores, because oversubscription tends to surface more bugs in parallel runtimes.
+/home/travis/PRK-deps/mpich/bin/mpirun -n 4 GRAPPA/Synch_p2p/p2p 10 1024 1024
I0107 00:46:42.978490 81814 Allocator.hpp:185] Allocator is responsible for addresses from 0 to 0xeb860000
I0107 00:46:42.978878 81814 GlobalMemory.cpp:67] Initialized GlobalMemory with 3951427584 bytes of shared heap.
I0107 00:46:42.990459 81815 GlobalMemory.cpp:67] Initialized GlobalMemory with 3951427584 bytes of shared heap.
I0107 00:46:42.990805 81816 GlobalMemory.cpp:67] Initialized GlobalMemory with 3951427584 bytes of shared heap.
I0107 00:46:43.006459 81817 GlobalMemory.cpp:67] Initialized GlobalMemory with 3951427584 bytes of shared heap.
I0107 00:46:43.009131 81814 Grappa.cpp:647]
-------------------------
Shared memory breakdown:
node total: 29.4405 GB
locale shared heap total: 14.7202 GB
locale shared heap per core: 3.68006 GB
communicator per core: 0.125 GB
tasks per core: 0.0156631 GB
global heap per core: 0.920013 GB
aggregator per core: 0.00247955 GB
shared_pool current per core: 4.76837e-07 GB
shared_pool max per core: 0.920015 GB
free per locale: 10.475 GB
free per core: 2.61876 GB
-------------------------
Parallel Research Kernels version 2.16
Grappa pipeline execution on 2D grid
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 81815 RUNNING AT testing-worker-linux-docker-8535467c-3182-linux-2
= EXIT CODE: 135
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Bus error (signal 7)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
You can get the memory-related options with $YOUR_APPLICATION_COMMAND --help | grep 'global\|locale\|heap\|memory' -A1
You may want to experiment with the --global_heap_fraction option. With your configuration of 4 cores (i.e. Grappa processes) on 1 node, it does look like roughly all of your 4 GB is going to the global heap alone (0.92 GB per core × 4 processes ≈ 3.7 GB).
There are three main flags that control the way the node memory gets divided up. The main one is
--locale_shared_fraction (Fraction of total node memory to allocate for
Grappa) type: double default: 0.5
There are a couple of other pools that are allocated out of that locale shared heap; they are controlled with
-global_heap_fraction (Fraction of locale shared memory to set aside for
global shared heap) type: double default: 0.25
-shared_pool_memory_fraction (Fraction of locale shared heap to use for
shared pool) type: double default: 0.25
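For reference, those defaults seem to account for the numbers in the log above (a rough back-of-the-envelope; I am assuming Grappa is seeing the host's full ~29.4 GB of physical memory, per the "node total" line, rather than the container's 4 GB limit):
0.5 × 29.44 GB node total ≈ 14.72 GB locale shared heap
14.72 GB ÷ 4 processes ≈ 3.68 GB locale shared heap per core
0.25 × 3.68 GB ≈ 0.92 GB global heap per core (and likewise for shared_pool max)
0.92 GB × 4 processes ≈ 3.68 GB, which appears to be the 3951427584 bytes reported by GlobalMemory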
I would suggest first setting --locale_shared_fraction=0.25 or slightly less and seeing what happens; see the sketch below. We did not design for oversubscription, though, so we may not have exposed all the necessary flags. I can take a look in a couple of days if this doesn't work for you immediately.
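For example, something like this might work for the p2p run from your log (a sketch, assuming Grappa parses its gflags-style options straight off the application's command line, so the flag can simply be appended after the kernel's own arguments):
/home/travis/PRK-deps/mpich/bin/mpirun -n 4 GRAPPA/Synch_p2p/p2p 10 1024 1024 --locale_shared_fraction=0.25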
(This is another thing I hope to simplify this month)