[Workaround] v1.0.192 crashes with error "out of time for read" #39

Open
adnovea opened this issue Nov 9, 2023 · 3 comments

adnovea commented Nov 9, 2023

I have a CZ-TAW1B module and use the Ethernet link to monitor my Aquarea.
I have tried different versions: v1.0.166, v1.1.191, and the latest one, v1.0.192.
The latter two versions crash the same way after a variable period of time (a few minutes), with the same error message, "out of time for read":

71 C8 01 10 55 95 52 49 00 55 00 01 00 00 00 00 00 00 00 00 59 15 14 55 55 15 55 55 55 19 00 00 00 00 00 00 00 00 99 9C 85 80 B4 71 71 97 99 00 00 00 00 00 00 00 00 00 00 00 80 85 15 8A 85 85 D0 7B 78 1F 7E 1F 1F 79 79 8D 8D B7 A3 7B 8F B7 A3 7B 8F 98 85 80 8F 8A 94 9E 8F 8A 94 9E 85 8F 8A 11 3D 78 C1 0B 7E 7C 1F 7C 7E 00 00 00 55 55 55 21 73 15 55 05 09 11 65 00 00 00 00 00 00 00 00 C2 D3 0C 33 65 B2 D3 0B 94 65 95 00 00 8D 8B 8A 32 32 A5 AA 32 32 32 99 A5 8B 8B 96 8C 8B 61 8C 61 8E 38 01 01 01 00 00 22 00 01 01 01 01 79 79 01 01 43 02 00 2F 03 00 16 00 00 01 00 00 06 01 01 01 01 01 01 01 02 00 00 5A
Checksum and header received ok!
Total reads : 279.000000 and total good reads : 277.000000 (99.28 %)
received TOP50 Discharge_Temp: 11
Publikuje do  panasonic_heat_pump/main/Discharge_Temp warosc 11
read ma status true
sent bytes: 110 with checksum: 18
71 6C 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Received 203 bytes data

71 C8 01 10 55 95 52 49 00 55 00 01 00 00 00 00 00 00 00 00 59 15 14 55 55 15 55 55 55 19 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Checksum received false!
read ma status false
sent bytes: 110 with checksum: 18
71 6C 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
out of time for read :(
sent bytes: 110 with checksum: 18
71 6C 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
out of time for read :(
sent bytes: 110 with checksum: 18
71 6C 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Once the failure occurs, it repeats indefinitely.
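
The `checksum: 18` in the log lines above is consistent with a simple 8-bit scheme in which the checksum byte is the two's complement of the sum of the frame bytes, so that frame plus checksum sums to 0 modulo 256. That scheme is an assumption inferred from the log, not taken from the GoHeishaMon source, and `frame_checksum` and `frame_is_valid` are hypothetical helper names. A minimal sketch for checking a captured frame by hand:

```shell
#!/bin/sh
# Assumed scheme: checksum = two's complement of the 8-bit sum of the frame bytes.
# Bytes are passed as hex pairs without the 0x prefix, as they appear in the log.

# Print the checksum (in decimal) for the given frame bytes.
frame_checksum() {
  sum=0
  for byte in "$@"; do
    sum=$(( (sum + 0x$byte) & 0xFF ))
  done
  echo $(( (256 - sum) & 0xFF ))
}

# Succeed if the bytes, including the trailing checksum byte, sum to 0 modulo 256.
frame_is_valid() {
  sum=0
  for byte in "$@"; do
    sum=$(( (sum + 0x$byte) & 0xFF ))
  done
  [ "$sum" -eq 0 ]
}

# Query frame from the log: 71 6C 01 10 followed by zeros.
# The zero bytes do not change the sum, so they are omitted here.
frame_checksum 71 6C 01 10    # prints 18, matching "with checksum: 18"
```

Under this assumption, a received frame whose payload has been replaced by zeros would no longer sum to 0, which would match the "Checksum received false!" line that precedes the timeouts.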

Here is a chronology of the crashes over roughly 5 hours:

Thu Nov  9 11:29:42 CET 2023 Start GoHeishaMon
Thu Nov  9 11:30:58 CET 2023 Crashed
Thu Nov  9 11:37:43 CET 2023 Crashed
Thu Nov  9 11:44:01 CET 2023 Crashed
Thu Nov  9 11:50:27 CET 2023 Crashed
Thu Nov  9 12:03:11 CET 2023 Crashed
Thu Nov  9 12:07:25 CET 2023 Crashed
Thu Nov  9 12:17:05 CET 2023 Crashed
Thu Nov  9 12:19:19 CET 2023 Crashed
Thu Nov  9 12:24:53 CET 2023 Crashed
Thu Nov  9 12:30:14 CET 2023 Crashed
Thu Nov  9 12:33:00 CET 2023 Crashed
Thu Nov  9 12:42:20 CET 2023 Crashed
Thu Nov  9 12:48:07 CET 2023 Crashed
Thu Nov  9 12:51:53 CET 2023 Crashed
Thu Nov  9 12:56:51 CET 2023 Crashed
Thu Nov  9 12:58:34 CET 2023 Crashed
Thu Nov  9 13:05:40 CET 2023 Crashed
Thu Nov  9 13:20:07 CET 2023 Crashed
Thu Nov  9 13:21:39 CET 2023 Crashed
Thu Nov  9 13:28:31 CET 2023 Crashed
Thu Nov  9 13:37:08 CET 2023 Crashed
Thu Nov  9 13:40:48 CET 2023 Crashed
Thu Nov  9 13:45:53 CET 2023 Crashed
Thu Nov  9 13:48:17 CET 2023 Crashed
Thu Nov  9 14:08:54 CET 2023 Crashed
Thu Nov  9 14:17:38 CET 2023 Crashed
Thu Nov  9 14:26:07 CET 2023 Crashed
Thu Nov  9 14:37:23 CET 2023 Crashed
Thu Nov  9 14:50:44 CET 2023 Crashed
Thu Nov  9 14:54:55 CET 2023 Crashed
Thu Nov  9 14:57:09 CET 2023 Crashed
Thu Nov  9 14:57:53 CET 2023 Crashed
Thu Nov  9 15:07:08 CET 2023 Crashed
Thu Nov  9 15:08:19 CET 2023 Crashed
Thu Nov  9 15:41:12 CET 2023 Crashed
Thu Nov  9 15:46:54 CET 2023 Crashed
Thu Nov  9 16:00:41 CET 2023 Crashed
Thu Nov  9 16:03:58 CET 2023 Crashed
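
To put a number on how often the crashes occur, the chronology can be summarized with a short awk filter. This is a sketch, assuming every line uses the `date` format shown above (time in the fourth field) and that all entries fall on the same day; `mean_interval` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the number of intervals between consecutive log lines and their mean length.
# Assumes the time of day (HH:MM:SS) is the fourth whitespace-separated field
# and that the log does not cross midnight.
mean_interval() {
  awk '{
    split($4, t, ":")
    now = t[1] * 3600 + t[2] * 60 + t[3]
    if (NR > 1) { total += now - prev; n++ }
    prev = now
  }
  END { if (n) printf "%d intervals, mean %d s\n", n, total / n }'
}

# First four lines of the chronology as sample input.
mean_interval <<'EOF'
Thu Nov  9 11:29:42 CET 2023 Start GoHeishaMon
Thu Nov  9 11:30:58 CET 2023 Crashed
Thu Nov  9 11:37:43 CET 2023 Crashed
Thu Nov  9 11:44:01 CET 2023 Crashed
EOF
```

For the four sample lines this prints `3 intervals, mean 286 s`. Over the full chronology (38 crashes between 11:29:42 and 16:03:58) the mean interval works out to roughly 7 minutes.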
adnovea changed the title from: v1.0.192 crashes with error "out of time for read" to: [Workaround] v1.0.192 crashes with error "out of time for read" (Nov 15, 2023)

adnovea commented Nov 15, 2023

For a week now, I have had a working workaround to handle the read failures. It's not perfect, but it works!

I wrote a script, /etc/gh/daemon.sh, that kills and restarts GHM when a read fails:

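#!/bin/sh
# Watchdog: restart GoHeishaMon whenever it logs a read timeout.
echo -e "\n$(date) Start GoHeishaMon"
while true; do
  # Kill any instance left over from the previous iteration
  # (pidof may return several PIDs, so $pids is left unquoted on purpose).
  pids=$(pidof GoHeishaMon_MIPSUPX) && kill -9 $pids
  sleep 10
  # Block until the log shows a read timeout; grep -q exits on the first match.
  /usr/bin/GoHeishaMon_MIPSUPX | grep -q "out of time for read"
# echo -e "$(date) Crashed\n"
done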

I run the script at startup from the Local Startup file in LuCI:

# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.

echo "START WATCHDOG   =====" > /dev/ttyS0
echo 300 > /proc/sys/kernel/panic
echo 0 > /proc/sys/kernel/panic_on_oops

(/usr/bin/check_buttons.sh > /dev/null 2>&1) &
/etc/gh/nextboot.sh
echo "" > /etc/gh/nextboot.sh

echo "START GoHeishaMon APL=====" > /dev/ttyS0
#/usr/bin/GoHeishaMon_MIPSUPX > /dev/ttyS0

/etc/gh/daemon.sh > /dev/ttyS0

#/usr/bin/a2wmain > /dev/ttyS0
#exit 0
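
The restart loop in daemon.sh can be tried off-device before installing it. The sketch below replaces GoHeishaMon with a stand-in function that prints the same messages seen in the log, and caps the number of restarts so the test terminates. `run_until_match` and `fake_monitor` are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# run_until_match PATTERN CMD [ARGS...]
# Run CMD and succeed as soon as PATTERN appears in its output.
# Fails if CMD ends without ever printing PATTERN.
run_until_match() {
  pattern=$1
  shift
  "$@" 2>&1 | grep -q "$pattern"
}

# Stand-in for GoHeishaMon: one good read, one bad checksum, then the timeout.
fake_monitor() {
  echo "Checksum and header received ok!"
  echo "Checksum received false!"
  echo "out of time for read :("
}

restarts=0
while [ "$restarts" -lt 3 ]; do
  if run_until_match "out of time for read" fake_monitor; then
    restarts=$((restarts + 1))
    echo "$(date) Crashed (restart $restarts)"
  else
    # The command exited without a timeout message: stop instead of looping.
    break
  fi
done
```

Unlike daemon.sh, this variant leaves the loop when the monitored command exits without printing the timeout message. Whether that is desirable on the device is a judgment call; the original script restarts unconditionally.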

Over the course of the day, I saw the free space decreasing. I was afraid of running out of memory and crashing, but it seems there is some housekeeping daemon that clears the logs and recovers the free space every day.

vzamanillo commented

The "real" workaround is to check why reading from the serial port took more than 5 seconds or, if you don't want to do that, to increase the timeout value yourself and build a new image.


adnovea commented Feb 29, 2024

Dear vzamanillo, thanks for your answer.
Unfortunately, as explained above, reading works fine for a variable period of time (from a couple of seconds to 5 minutes or more), then "Checksum received false!" appears and the software must be killed and restarted.
I was not able to find the cause of the crash.

Building a new image requires installing a development toolchain for OpenWRT, which is beyond my skills.
For the time being, my "forced workaround" works with H.A. Maybe someone with more software experience will be able to tackle this issue in the future.
