physl sometimes does not write newick tree or graph #1186

Open
stevenrbrandt opened this issue Jun 1, 2020 · 6 comments

@stevenrbrandt
Member

The command

['mpirun', '-np', '1', '-machinefile', 'hosts.txt', '/work/sbrandt/phylanx/build/bin/physl', '--dump-counters=py-csv.txt', '--dump-newick-tree=py-tree.txt', '--dump-dot=py-graph.txt', '--performance', '--print=result.py', 'call_lra_demo.physl']

applied to the physl code generated by this python code

import numpy as np


def lra_demo(x, y, alpha, iterations, enable_output):
    weights = np.zeros(np.shape(x)[1])
    transx = np.transpose(x)
    pred = np.zeros(np.shape(x)[0])
    error = np.zeros(np.shape(x)[0])
    gradient = np.zeros(np.shape(x)[1])
    step = 0
    while step < iterations:
        if (enable_output):
            print("step: ", step, ", ", weights)
        pred = 1.0 / (1.0 + np.exp(-np.dot(x, weights)))
        error = pred - y
        gradient = np.dot(transx, error)
        weights = weights - (alpha * gradient)
        step += 1
    return weights

and using the breast cancer data https://raw.githubusercontent.com/STEllAR-GROUP/phylanx/master/examples/algorithms/datasets/breast_cancer.csv

will sometimes generate the files py-tree.txt and py-graph.txt, and sometimes not.
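Since the failure is intermittent, a small driver that repeats the run and records which output files are missing can help quantify it. The following is only a sketch: `run_and_check` is a hypothetical helper name, not part of Phylanx, and it assumes the command is given as an argument list like the one above. On POSIX a negative return code from `subprocess.run` means the direct child was killed by a signal, but with `mpirun` in front the code comes from the launcher, and how a crashed rank is reported depends on the MPI implementation.

```python
import os
import subprocess


def run_and_check(cmd, expected_files, runs=10, cwd="."):
    """Run cmd several times; report the exit code and missing files per run."""
    results = []
    for i in range(runs):
        # Remove stale outputs so a file left by an earlier run is not counted.
        for name in expected_files:
            path = os.path.join(cwd, name)
            if os.path.exists(path):
                os.remove(path)
        proc = subprocess.run(cmd, cwd=cwd)
        missing = [
            name for name in expected_files
            if not os.path.exists(os.path.join(cwd, name))
        ]
        results.append((i, proc.returncode, missing))
    return results
```

Calling `run_and_check(cmd, ["py-tree.txt", "py-graph.txt"])` with the command list above returns one `(run, returncode, missing)` tuple per run, so runs where the tree or graph file was not written stand out.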

@stevenrbrandt
Member Author

Note there are no errors, nothing aborts, and result.py is written.

@stevenrbrandt
Member Author

It seems that there was, in fact, a segfault:

    at /usr/include/c++/7/ext/atomicity.h:81
81      if (__gthread_active_p())
#0  __gnu_cxx::__exchange_and_add_dispatch (__mem=0x7fff7819b818, __val=-1)
    at /usr/include/c++/7/ext/atomicity.h:81
#1  0x00000000004f2bfd in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fff7819b810) at /usr/include/c++/7/bits/shared_ptr_base.h:151
#2  0x00000000004ed015 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fff700c3f08, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:684
#3  0x00007ffff6c33bfe in std::__shared_ptr<apex::task_wrapper, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fff700c3f00, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#4  0x00007ffff6c33c1a in std::shared_ptr<apex::task_wrapper>::~shared_ptr (
    this=0x7fff700c3f00, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr.h:93
#5  0x00007fffef8057c2 in apex::task_wrapper::~task_wrapper (
    this=0x7fff700c3ee0, __in_chrg=<optimized out>)
    at /hpx/apex/src/apex/task_wrapper.hpp:29
#6  0x00007fffef8057e2 in __gnu_cxx::new_allocator<apex::task_wrapper>::destroy<apex::task_wrapper> (this=0x7fff700c3ee0, __p=0x7fff700c3ee0)
    at /usr/include/c++/7/ext/new_allocator.h:140
#7  0x00007fffef805785 in std::allocator_traits<std::allocator<apex::task_wrapper> >::destroy<apex::task_wrapper> (__a=..., __p=0x7fff700c3ee0)
    at /usr/include/c++/7/bits/alloc_traits.h:487
#8  0x00007fffef805625 in std::_Sp_counted_ptr_inplace<apex::task_wrapper, std::allocator<apex::task_wrapper>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
    this=0x7fff700c3ed0) at /usr/include/c++/7/bits/shared_ptr_base.h:535

@stevenrbrandt
Member Author

I'm wondering if this is the bug Kevin recently fixed?

@stevenrbrandt
Member Author

#9  0x00000000004f2c1e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fff700c3ed0) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#10 0x00000000004ed015 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fff940ac848, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:684
#11 0x00007ffff6c33bfe in std::__shared_ptr<apex::task_wrapper, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fff940ac840, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#12 0x00007ffff6c33c1a in std::shared_ptr<apex::task_wrapper>::~shared_ptr (
    this=0x7fff940ac840, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr.h:93
#13 0x00007fffef8057c2 in apex::task_wrapper::~task_wrapper (
    this=0x7fff940ac820, __in_chrg=<optimized out>)
    at /hpx/apex/src/apex/task_wrapper.hpp:29
#14 0x00007fffef8057e2 in __gnu_cxx::new_allocator<apex::task_wrapper>::destroy<apex::task_wrapper> (this=0x7fff940ac820, __p=0x7fff940ac820)
    at /usr/include/c++/7/ext/new_allocator.h:140
#15 0x00007fffef805785 in std::allocator_traits<std::allocator<apex::task_wrapper> >::destroy<apex::task_wrapper> (__a=..., __p=0x7fff940ac820)
    at /usr/include/c++/7/bits/alloc_traits.h:487
#16 0x00007fffef805625 in std::_Sp_counted_ptr_inplace<apex::task_wrapper, std::allocator<apex::task_wrapper>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
    this=0x7fff940ac810) at /usr/include/c++/7/bits/shared_ptr_base.h:535

@stevenrbrandt
Member Author

#17 0x00000000004f2c1e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7fff940ac810) at /usr/include/c++/7/bits/shared_ptr_base.h:154
#18 0x00000000004ed015 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fff900cefa8, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:684
#19 0x00007ffff6c33bfe in std::__shared_ptr<apex::task_wrapper, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fff900cefa0, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#20 0x00007ffff6c33c1a in std::shared_ptr<apex::task_wrapper>::~shared_ptr (
    this=0x7fff900cefa0, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr.h:93
#21 0x00007fffef8057c2 in apex::task_wrapper::~task_wrapper (
    this=0x7fff900cef80, __in_chrg=<optimized out>)
    at /hpx/apex/src/apex/task_wrapper.hpp:29
#22 0x00007fffef8057e2 in __gnu_cxx::new_allocator<apex::task_wrapper>::destroy<apex::task_wrapper> (this=0x7fff900cef80, __p=0x7fff900cef80)
    at /usr/include/c++/7/ext/new_allocator.h:140
#23 0x00007fffef805785 in std::allocator_traits<std::allocator<apex::task_wrapper> >::destroy<apex::task_wrapper> (__a=..., __p=0x7fff900cef80)
    at /usr/include/c++/7/bits/alloc_traits.h:487
#24 0x00007fffef805625 in std::_Sp_counted_ptr_inplace<apex::task_wrapper, std::allocator<apex::task_wrapper>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
    this=0x7fff900cef70) at /usr/include/c++/7/bits/shared_ptr_base.h:535
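In the frames posted so far, apex::task_wrapper::~task_wrapper appears at #5, #13 and #21, each reached through the shared_ptr release of the previous one, so the same eight-frame pattern repeats. To see how deep that nesting goes in a full backtrace saved from gdb, something like the sketch below could be used. `count_frames_containing` is a hypothetical helper, and the regular expression only assumes the `#N  0xADDR in function (...)` frame layout visible in this trace.

```python
import re

# Matches the start of a gdb frame line: "#5  0x... in func (" or "#0  func (".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.*)$")


def count_frames_containing(backtrace_text, needle):
    """Count gdb frames whose function text contains needle."""
    count = 0
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        # Continuation lines ("this=...", "at file:line") do not start with
        # "#N" and are skipped.
        if match and needle in match.group(2):
            count += 1
    return count
```

Passing the saved backtrace text and the string `"apex::task_wrapper::~task_wrapper"` gives the number of nested destructor calls. A very large count would be consistent with a long chain of task wrappers being torn down recursively, though the partial trace here does not confirm that.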

@khuck
Contributor

khuck commented Jun 2, 2020

I have seen something transient lately, but only with MPI / distributed runs. I'll see what I can find.


3 participants