Python lock up while using ADC #358

Open
petski73 opened this issue Mar 19, 2022 · 1 comment

Comments

@petski73

I am using a BeagleBone Black SBC as a data collector for a solar thermal installation and have been running into a hang.
A Python program collects and preprocesses ADC measurements of temperatures and motor currents for a high-level, web-enabled GUI front end, which spawns the Python program as a service. The Python program reads six ADC channels every 2 seconds, taking 5 conversions per channel (30 conversions every 2 seconds).
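For readers who want to reproduce the load, the sampling schedule described above can be sketched roughly as follows. This is not the actual collector program. The pin names, the averaging of the 5 conversions, and the helper name `sample_channels` are all assumptions made for illustration; the read function is passed in so the sketch can be exercised without the hardware.

```python
from statistics import mean

# Assumed pin names: the post does not say which six channels are used.
CHANNELS = ["P9_39", "P9_40", "P9_37", "P9_38", "P9_33", "P9_36"]
CONVERSIONS_PER_CHANNEL = 5  # from the post: 5 conversions per channel


def sample_channels(read, channels=CHANNELS, conversions=CONVERSIONS_PER_CHANNEL):
    """Call read(channel) `conversions` times per channel and return the mean per channel.

    Averaging is an assumption; the post only says the values are preprocessed.
    """
    return {ch: mean(read(ch) for _ in range(conversions)) for ch in channels}
```

On the board itself this would presumably be driven by something like `sample_channels(ADC.read)` after `import Adafruit_BBIO.ADC as ADC` and `ADC.setup()`, called once every 2 seconds.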

Problem: After running correctly for anywhere from 3 to 6 weeks, it hangs and no longer sends data to the GUI app.
Once hung, the Python program cannot be killed, even with "sudo kill -9". In this state, starting the program manually in a terminal hangs immediately and ^C has no effect; the terminal is dead. There are then two copies of the program running, neither of which can be killed. The only way out of this condition is a reboot. After a reboot, running the program manually works correctly, as does the GUI front end. Note that in normal operation two copies of the ADC collector program are never run at the same time.

The rest of the machine still operates normally, but the ADC cannot be accessed again once hung.
Question: is there a fail-safe timeout in the ADC.read() function while it waits for end of conversion?
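Whether ADC.read() has an internal timeout is not answered in this thread. As a caller-side workaround, one could run each read in a child process and give up waiting after a deadline. The sketch below assumes Linux (it uses the `fork` start method), and `call_with_timeout` is a hypothetical helper, not part of BBIO. Note the limitation: if the child is stuck in the kernel in a way that ignores signals, as the unkillable process described above suggests, the child itself will stay stuck. The parent can only detect the hang and carry on, for example by logging it or alerting.

```python
import multiprocessing as mp


def _worker(conn, func, args):
    """Child side: run func and send back either the result or the error text."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def call_with_timeout(func, args=(), timeout=1.0):
    """Run func(*args) in a forked child; raise TimeoutError if no result arrives in time."""
    ctx = mp.get_context("fork")  # Linux-only; the child inherits func without pickling
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child_conn, func, args), daemon=True)
    proc.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        if not parent_conn.poll(timeout):
            proc.terminate()  # best effort; has no effect on a child stuck in the kernel
            raise TimeoutError("no result within %.1f s" % timeout)
        try:
            status, value = parent_conn.recv()
        except EOFError:
            raise RuntimeError("child exited without sending a result")
    finally:
        parent_conn.close()
        proc.join(0.1)  # bounded wait so the parent never blocks on a stuck child
    if status == "error":
        raise RuntimeError(value)
    return value
```

A collector loop could then call `call_with_timeout(ADC.read, ("P9_40",), timeout=1.0)` (pin name assumed) and treat `TimeoutError` as the signal that the ADC has stopped responding.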

Since I cannot get any access to the Python program once it has hung, I am not sure how to approach debugging this issue.
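One possible starting point, since the rest of the machine stays usable, is to inspect the hung process from outside through `/proc`. The sketch below reads the `State:` line of `/proc/<pid>/status` and, where the kernel provides it, `/proc/<pid>/wchan` (the kernel function the process is sleeping in). A state of `D` would indicate uninterruptible sleep, which would be consistent with a process that ignores `kill -9`. The helper names are hypothetical, and whether `wchan` is informative on this kernel is an assumption.

```python
from pathlib import Path


def parse_state(status_text):
    """Extract the value of the 'State:' line from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def describe_process(pid, proc_root="/proc"):
    """Return the process state and wait channel, or None for fields that cannot be read."""
    base = Path(proc_root) / str(pid)
    info = {"state": parse_state((base / "status").read_text())}
    try:
        info["wchan"] = (base / "wchan").read_text().strip()
    except OSError:
        info["wchan"] = None  # not available on every kernel configuration
    return info
```

Running `describe_process(<pid of the hung collector>)` from a separate Python session that does not import BBIO, together with the output of `dmesg`, might show where the process is blocked.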

FYI, I have noticed that the GUI front end appears to kill and restart the data collection program once every 24 hours: "systemctl status MCJEDI" shows a new process ID for the collector program for each day it has been running. This restart may be causing the issue if it interrupts the ADC procedure at the wrong time. Even so, a program should not hang indefinitely for any reason.
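If the daily restart is delivered as SIGTERM (an assumption; the post does not say how the front end stops the collector), one way to test the "interrupted at the wrong time" theory is to make the collector defer shutdown until the current read cycle has finished. The sketch below uses hypothetical names (`StopFlag`, `run_until_stopped`) and does nothing if the process is killed with SIGKILL.

```python
import signal
import time


class StopFlag:
    """Records that a stop was requested; usable directly as a signal handler."""

    def __init__(self):
        self.requested = False

    def request(self, signum=None, frame=None):
        self.requested = True


def run_until_stopped(read_once, stop, period_s=2.0, sleep=time.sleep):
    """Call read_once() every period_s seconds; exit only between complete cycles."""
    cycles = 0
    while not stop.requested:
        read_once()  # a stop request arriving here is honoured after the cycle ends
        cycles += 1
        if not stop.requested:
            sleep(period_s)
    return cycles
```

The handler would be installed once at startup with `signal.signal(signal.SIGTERM, stop.request)`. If hangs still occur with this in place, the restart is less likely to be the trigger.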

git:/opt/scripts/:[1aa73453b2c980b75e31e83dab7dd8b6696f10c7]
eeprom:[A335BNLT00C03919BBBK01F8]
model:[TI_AM335x_BeagleBone_Black]
dogtag:[BeagleBoard.org Debian Image 2018-10-07]
bootloader:[eMMC-(default)]:[/dev/mmcblk1]:[U-Boot 2018.09-00002-g0b54a51eee]:[location: dd MBR]
kernel:[4.14.71-ti-r80]
nodejs:[v6.14.4]
uboot_overlay_options:[enable_uboot_overlays=1]
uboot_overlay_options:[uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-14-TI-00A0.dtbo]
uboot_overlay_options:[enable_uboot_cape_universal=1]
pkg check: to individually upgrade run: [sudo apt install --only-upgrade ]
pkg:[bb-cape-overlays]:[4.4.20180928.0-0rcnee0~stretch+20180928]
pkg:[bb-wl18xx-firmware]:[1.20180517-0rcnee0~stretch+20180517]
pkg:[kmod]:[23-2rcnee1~stretch+20171005]
pkg:[librobotcontrol]:[1.0.3-git20181005.0-0rcnee0~stretch+20181005]
pkg:[firmware-ti-connectivity]:[20170823-1rcnee1~stretch+20180328]
groups:[debian : debian adm kmem dialout cdrom floppy audio dip video plugdev users systemd-journal i2c bluetooth netdev cloud9ide gpio pwm eqep admin spi tisdk weston-launch xenomai]
cmdline:[console=ttyO0,115200n8 bone_capemgr.uboot_capemgr_enabled=1 root=/dev/mmcblk1p1 ro rootfstype=ext4 rootwait coherent_pool=1M net.ifnames=0 quiet]
dmesg | grep pinctrl-single
[ 1.119748] pinctrl-single 44e10800.pinmux: 142 pins at pa f9e10800 size 568
dmesg | grep gpio-of-helper
[ 1.131736] gpio-of-helper ocp:cape-universal: ready
END

This script should be present for any image downloaded from:
https://beagleboard.org/ or https://rcn-ee.com/

@petski73

petski73 commented Nov 20, 2022

Additional information:
These random hangs continue. I just upgraded BBIO from version 1.0.10 to 1.2.0; time will tell whether this helps, as it may take months to hang again.

Observations from the hung state:
- "sudo reboot now" does not complete (must power cycle to recover).
- "sudo shutdown now" never completes (must power cycle to recover).
- Writing directly to the ADC control register (disable and power-down bits) does not make it recover.
- None of the kill commands can terminate the hung Python program.
- Only complete power removal resolves the issue.

See the original post above. In the hung state, Python runs correctly for any program that does not use BBIO, but hangs as soon as it attempts to use the ADC via BBIO.
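To narrow down whether the hang is specific to BBIO or sits below it in the kernel ADC driver, one could read a channel through the Linux IIO sysfs interface without importing BBIO at all. This is a sketch under assumptions: the device directory `iio:device0` and the 12-bit, 1.8 V scaling are the usual values for the BeagleBone Black but are not confirmed for this setup, and the helper names are hypothetical.

```python
from pathlib import Path

IIO_DIR = "/sys/bus/iio/devices/iio:device0"  # assumed device index
ADC_MAX = 4095  # assumed 12-bit full scale
VREF = 1.8      # assumed reference voltage in volts


def read_raw(channel, iio_dir=IIO_DIR):
    """Read the raw count for an ADC channel (0-based) from sysfs."""
    path = Path(iio_dir) / "in_voltage{}_raw".format(channel)
    return int(path.read_text().strip())


def raw_to_volts(raw):
    """Convert a raw count to volts under the assumed scaling."""
    return raw * VREF / ADC_MAX
```

If `read_raw(0)` also blocks while the system is in the hung state, that would point at the driver or hardware rather than at the Python library.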
