UART/network interaction #72

Open
sbirrari opened this issue Sep 29, 2023 · 14 comments

@sbirrari

sbirrari commented Sep 29, 2023

Board: Omega2S+
Carrier: Omega2 PRO + Ethernet expansion
Image: onion_omega2-22.03.5-20230925.bin

In my project I extensively use the serial port to communicate between the board and an external data acquisition device.
I observe small data losses whenever the network connection state changes.

In detail in my test:

  • a PC is sending data at full speed at 230400/460800 baud on ttyS1, using a repetitive pattern from 0x00 to 0xFF
  • a sample program running on the board receives the data and checks the pattern continuity
  • the data acquisition works fine without any loss
  • during the data acquisition, the command "/etc/init.d/network restart" is issued.
    Just after the command ends, the communication is disturbed and some data is lost.
    The same thing happens, for example, when removing and reinserting the network cable.
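
For readers who do not have the attached testserial source at hand, the continuity check described in the list above can be sketched roughly as follows. This is only an illustration, not the attached program: the function name `check_continuity` and its return values are made up for this example. It works on whole chunks and keeps its state in `expected`, so it can be called once per read.

```python
def check_continuity(chunk, expected=None):
    """Check that `chunk` continues the repeating 0x00..0xFF pattern.

    Returns (gaps, lost, next_expected). `lost` is only an estimate,
    because the pattern repeats every 256 bytes and a longer gap wraps.
    Pass next_expected back in as `expected` on the following call.
    """
    gaps = 0
    lost = 0
    for value in bytearray(chunk):  # bytearray yields ints on Python 2 and 3
        if expected is not None and value != expected:
            gaps += 1
            lost += (value - expected) % 256
        expected = (value + 1) % 256
    return gaps, lost, expected


# Bytes 0x03 and 0x04 are missing from this chunk.
print(check_continuity(bytearray([0, 1, 2, 5, 6])))  # prints (1, 2, 7)
```

Because the estimate wraps at 256, a burst loss longer than one pattern period is under-reported. The driver's overrun counter is a useful cross-check for that case.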

Beyond the checks made by the receiving program itself, I can also confirm the data loss from the "overrun" counter of the "serial_icounter_struct" returned by the TIOCGICOUNT ioctl, which increments each time the loss occurs.
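
For anyone who wants to read the same counter without building the C program, here is a minimal Python sketch. Several things in it are assumptions rather than facts from this thread: the helper names are invented, the struct layout (eleven int counters plus nine reserved ints) is taken from linux/serial.h, and the ioctl request number is hard-coded because Python's termios module may not export it. As far as I know the value is 0x5492 on MIPS (the Omega2's architecture) and 0x545D on most other Linux targets; please verify it against your toolchain headers before trusting the result.

```python
import fcntl
import os
import platform
import struct

# struct serial_icounter_struct (linux/serial.h): 11 int counters + int reserved[9]
ICOUNTER_FORMAT = "20i"
ICOUNTER_FIELDS = ("cts", "dsr", "rng", "dcd", "rx", "tx",
                   "frame", "overrun", "parity", "brk", "buf_overrun")


def tiocgicount_request():
    """Return the TIOCGICOUNT request number (ASSUMED values, check your headers)."""
    if platform.machine().lower().startswith("mips"):
        return 0x5492  # assumption: value from the MIPS asm/ioctls.h
    return 0x545D      # assumption: value from asm-generic/ioctls.h


def parse_icounter(raw):
    """Unpack the raw ioctl buffer into a dict of named counters."""
    values = struct.unpack(ICOUNTER_FORMAT, raw)
    return dict(zip(ICOUNTER_FIELDS, values))  # reserved ints are dropped


def read_icounter(fd):
    """Issue TIOCGICOUNT on an open tty file descriptor."""
    buf = bytearray(struct.calcsize(ICOUNTER_FORMAT))
    fcntl.ioctl(fd, tiocgicount_request(), buf, True)  # kernel fills buf in place
    return parse_icounter(bytes(buf))


def overrun_count(device="/dev/ttyS1"):
    """Open the device just long enough to read its overrun counter."""
    fd = os.open(device, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        return read_icounter(fd)["overrun"]
    finally:
        os.close(fd)
```

Calling `overrun_count()` before and after a network restart and comparing the two values should show whether the hardware reported an overrun in between.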

I performed this test because I saw the same issue with the omega2pro-v0.3.4-b257.bin image (OpenWRT 18.06), which is the image I developed my project on.

I provide the source of the receiving program
testserial.txt

Do you know what causes the problem and can maybe suggest a workaround?

@greenbreakfast
Contributor

@sbirrari thanks for submitting a detailed issue. A few questions:

  1. How are you running this program? Are you connected via (UART0) serial to the command line or through SSH? Is the program running in the background or foreground? I ask because if you're connected with SSH, perhaps the program stops while the SSH connection is broken during the network reset
  2. Does the same issue occur at lower baud rates?

@sbirrari
Author

Thank you for reviewing my issue!

  1. I'm connected via UART0 and the program is running in the foreground.
  2. It happens at lower baud rates too (115200, 38400). It doesn't happen at 9600.
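
To put the baud-rate dependence in numbers, the short calculation below shows how long one byte takes on the wire and how long a receive FIFO lasts before it overflows if nothing drains it. Two values are assumptions: 8N1 framing (10 bits per byte) and a 16-byte FIFO, which is typical for 16550-style UARTs but has not been confirmed for this board in this thread. Whether delayed servicing of the UART is really the cause is still an open question; this only shows why such a delay would matter less at 9600.

```python
BITS_PER_BYTE = 10  # assumption: 8N1 framing (start bit + 8 data bits + stop bit)
FIFO_DEPTH = 16     # assumption: 16550-style receive FIFO, not confirmed for this SoC


def byte_time_us(baud):
    """Time on the wire for one byte, in microseconds."""
    return BITS_PER_BYTE * 1e6 / baud


def fifo_fill_time_us(baud, depth=FIFO_DEPTH):
    """Time until an undrained FIFO of `depth` bytes is full, in microseconds."""
    return depth * byte_time_us(baud)


for baud in (9600, 38400, 115200, 230400, 460800):
    print("{:>6} baud: {:7.1f} us per byte, FIFO full after {:8.1f} us".format(
        baud, byte_time_us(baud), fifo_fill_time_us(baud)))
```

Under these assumptions the receiver can be left unserviced for roughly 16.7 ms at 9600 baud, but only about 4.2 ms at 38400 and about 0.35 ms at 460800.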

@greenbreakfast
Contributor

@sbirrari Thanks for confirming. We'd like to reproduce this on our end.

Can you please provide an example sending program? And perhaps some info on how you're using the PC to send the data - for example, are you using a USB-to-serial?

@sbirrari
Author

I can provide a Windows executable written in Delphi, together with its source code. Would that be OK for you?
To send data we are using a USB-to-UART module, the FTDI FT234XD, with a pinout that follows the industry-standard FTDI TTL 3.3V cable interface.

@sbirrari
Author

We will try to reproduce the problem writing and using a sending program that can run under Linux.
I will let you know the result and send the program if the problem still persists.

@sbirrari
Author

TestUart.zip

Attached is the source code of a simple test sending program. At the moment it runs only on Windows, since that was the fastest way to provide a test program; we need more time if a Linux version is absolutely needed.
You can build it using the free Lazarus IDE for Windows.
The file to load in the IDE is the one named TestUart.lpr, to compile press CTRL+F9, the executable will be created in the same folder of the source file.
Anyway in the zip there also is the executable itself.

The executable is a console program, you can launch it using the syntax "TestUart [COM_port]:[serial parameters]", for example "TestUart COM2:38400,n,8,1".
Right after starting, it only prints the received characters; this is meant to test the connection, since the receiving program prints a test character on start. Pressing "G" then starts sending data, and pressing "ESC" stops it. Wait until the buffer is flushed, then stop the receiving program.

Using this executable the problem happens from 38400 baud.
The program sends data using the module mentioned before (FTDI FT234XD) on the Onion UART1, the receiving program is started from a console session on the UART0.
This is what we can provide at the moment, please let me know of any other request.
Thank you again.
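
In case a Linux sender is needed in the meantime, a rough stand-in could look like the sketch below. It is not a port of TestUart: there is no "G"/"ESC" control and it does not print received characters, it just sends the repeating 0x00 to 0xFF pattern. The function names, the `/dev/ttyUSB0` device path, and the `total_bytes` limit are placeholders chosen for this example, and it assumes pyserial is installed on the sending PC.

```python
def pattern_chunks(chunk_size=256, start=0):
    """Yield successive chunks of the repeating 0x00..0xFF pattern, forever."""
    value = start % 256
    while True:
        chunk = bytearray((value + i) % 256 for i in range(chunk_size))
        value = (value + chunk_size) % 256
        yield bytes(chunk)


def send_pattern(port="/dev/ttyUSB0", baudrate=38400, total_bytes=1000000):
    """Send the pattern at full speed and return the number of bytes written."""
    import serial  # pyserial; imported here so pattern_chunks works without it

    sent = 0
    ser = serial.Serial(port, baudrate)  # 8N1 and blocking writes by default
    try:
        for chunk in pattern_chunks():
            if sent >= total_bytes:
                break
            ser.write(chunk)
            sent += len(chunk)
        ser.flush()  # wait until everything has been handed to the device
    finally:
        ser.close()
    return sent
```

For example, `send_pattern("/dev/ttyUSB0", 38400)` would correspond to the "38400,n,8,1" setting used with TestUart.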

@greenbreakfast
Contributor

@sbirrari ok, we'll try to reproduce this. We'll get back to you if we run into issues

@greenbreakfast
Contributor

@sbirrari should have some updates on this within the next week

@greenbreakfast
Contributor

Hi @sbirrari, we've been able to reproduce your issue. After some additional testing, we've come to the conclusion that the overrun is caused by an unknown issue in the testserial C program.

First test - screen

As the first and simplest test, we used a serial tool on a PC to send data, and observed it on the Omega2 using the screen command.

✅ We observed no lost data when the network was reset.

See issue-72-testing-screen.pdf for full details.

Second Test - testserial C program

Second, we wanted to reproduce the behaviour you reported. Two versions of testserial were compiled, one running at 9600 baud and the other at 460800. The TestUart program you provided was used to transmit data from a PC.

✅ We confirmed normal operation at 9600 baud when pulling out or inserting the ethernet cable.
❌ We also confirmed the overrun at 460800 baud when pulling out or inserting the ethernet cable

Third Test - Python program

Given the two data points above, we wanted to rule out any issues with the testserial program.
We wrote a simple python program to receive data on UART1 and output it on stdout. The TestUart program you provided was used to transmit data from a PC.

✅ This program was tested at 9600, 115200, 230400, and 460800 baud rates. In all cases, there was no data loss when pulling out or inserting the ethernet cable

See issue-72-testing-c-python.pdf for details on the C and Python testing.

Recommendations

We recommend looking into the testserial C program to determine the root cause of the issue.
I would also recommend running our Python sample program to confirm no data loss on your end. More details on the python program below.


Python Program Details

Using the stable v0.3.4 firmware, run the following commands to install the required packages:

opkg update
opkg install python-pyserial python-codecs

Here is the source code, the baudrate can be adjusted by changing line 35:

import serial

def read_serial_data(port, baudrate=9600, timeout=1):
    ser = None  # stays None if the port cannot be opened
    try:
        ser = serial.Serial(port, baudrate, timeout=timeout)  # open the serial port

        print("Reading data from {}. Press Ctrl+C to exit.".format(port))

        while True:
            try:
                # Read data from the serial port
                data = ser.readline().decode('utf-8', errors='ignore').strip()

                # Print the received data to stdout
                print("Received data: {}".format(data))
            except serial.SerialException as e:
                if "overrun" in str(e).lower():
                    print("Overrun error occurred. Skipping this reading.")
                else:
                    raise

    except serial.SerialException as e:
        print("Error: {}".format(e))
    except KeyboardInterrupt:
        print("\nExiting the program.")
    finally:
        # Close the serial port when done (ser is None if opening failed)
        if ser is not None and ser.isOpen():
            ser.close()

if __name__ == "__main__":
    serial_port = '/dev/ttyS1'
    # change the baudrate variable below to set baudrate
    baudrate = 9600

    # Call the function to read data from the serial port
    read_serial_data(serial_port, baudrate)
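
A note on interpreting this receiver's output: one possible reason it shows no loss is that it does not check the pattern at all. The 0x00 to 0xFF test pattern contains many bytes that are not valid UTF-8, and decoding with `errors='ignore'` silently discards them, so a gap in that range cannot show up in the printed text. The small sketch below illustrates this with synthetic data only (no serial port involved); `as_printed` is a helper name invented for this example.

```python
# One period of the 0x00..0xFF test pattern.
pattern = bytes(bytearray(range(256)))


def as_printed(raw):
    """Apply the same decode/strip steps the receiver uses before printing."""
    return raw.decode("utf-8", errors="ignore").strip()


intact = as_printed(pattern)
# Simulate a loss: 20 bytes (0xC8..0xDB) removed from the stream.
damaged = as_printed(pattern[:200] + pattern[220:])

print("bytes in: {}, characters out: {}".format(len(pattern), len(intact)))
print("loss visible in output: {}".format(intact != damaged))
```

In addition, `readline()` only returns at a newline byte (0x0A) or on timeout, so each printed line covers a large and variable span of the stream. A receiver that compares every byte against the expected value is needed to confirm or rule out loss.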

@sbirrari
Author

sbirrari commented Nov 23, 2023

@greenbreakfast, thank you so much for the detailed tests and analysis.
I'll try to run your python code and contact you if I have any update.

@sbirrari
Author

@greenbreakfast
We reproduced the problem with the following code:

import serial

def SerialLoop():
	#port='COM13'
	port='/dev/ttyS1'
	#baudrate=230400
	baudrate=38400
	Cnt=0      # next expected byte of the 0x00..0xFF pattern
	Tot=0      # total bytes received
	Parz=0     # bytes received since the last progress line
	ByteIn=0
	Errors=0   # number of discontinuities seen
	ser=None   # stays None if the port cannot be opened

	print("Reading data from {}. Press Ctrl+C to exit.".format(port))
	try:
		# Open the serial port (timeout=0 makes read() non-blocking)
		ser = serial.Serial(port, baudrate, timeout=0)
		ser.write('Ready'.encode('ascii'))
		while True:
			data = ser.read()
			if (len(data)>0):
				ByteIn = ord(data)
				if (ByteIn!=Cnt):
					print("Read "+hex(ByteIn)+" instead of: "+hex(Cnt))
					Errors = Errors + 1
					# Resynchronize on the byte actually received
					Cnt = ByteIn
				Tot = Tot + 1
				Parz = Parz + 1
				if (Parz>=5000):
					Parz = 0
					print("Received "+str(Tot)+" bytes.")
				Cnt = Cnt + 1
				if (Cnt>255):
					Cnt = 0

	except serial.SerialException as e:
		print("Error: {}".format(e))
	except KeyboardInterrupt:
		print("\nExiting the program. Received {} bytes. ".format(Tot) + str(Errors) + " Errors.")
	finally:
		# Close the serial port when done (ser is None if opening failed)
		if ser is not None and ser.isOpen():
			ser.close()

SerialLoop()

Launch the TestUart program, then the Python one (let's call it PySer).
TestUart will see the control chars sent by PySer.
Start the data sending in TestUart, PySer will print a line for each 5000 chars received.
Stop the data sending in TestUart (it will print the sent chars count) and wait for PySer to get all the data.
Before closing, PySer prints the received chars count.
If the network cable is not connected everything works fine.
When the network cable is inserted or removed, PySer loses data.

Please take into consideration that the issue may be at the driver/kernel level.
We first saw the problem in our data acquisition software that works without any data loss on many SBC platforms.

@greenbreakfast
Contributor

@sbirrari with this new python program we were able to confirm that data is lost when the ethernet cable is connected or disconnected at baud rates above 9600.

@jempatel can you please try debugging this on the kernel driver level?

See the attached report for testing details and the previous comment in this issue for the updated python program source. You'll need to be running the stable v0.3.4 firmware and run these commands to install the python dependencies opkg update; opkg install python-pyserial python-codecs

issue-72-pyser-testing-report.pdf

Let me know if there's anything else you need.

@sbirrari
Author

sbirrari commented Dec 4, 2023

@greenbreakfast, @jempatel thank you again for your interest in this issue.
I confirm that it seems to happen on any network change: inserting or removing the network cable, stopping/starting the wifi interface, or stopping/starting the whole network service.

@sbirrari
Author

@greenbreakfast, @jempatel, do you have any update on this issue?
I want to let you know that I tested the latest published image (onion_omega2p-22.03.5-20240318.bin) and the problem is still present.
Thank you.
