Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tcpdump -v -w foo.out fails to capture on Solaris #1258

Open
normjacobs opened this issue Apr 7, 2014 · 12 comments
Open

tcpdump -v -w foo.out fails to capture on Solaris #1258

normjacobs opened this issue Apr 7, 2014 · 12 comments
Assignees
Labels

Comments

@normjacobs
Copy link

root@snappy:/builds/jacobs/tcpdump/components/tcpdump# tcpdump -v -w /tmp/dump.out
tcpdump: listening on net0, link-type EN10MB (Ethernet), capture size 65535 bytes
^C0 packets captured
15 packets received by filter
0 packets dropped by kernel
root@snappy:/builds/jacobs/tcpdump/components/tcpdump# 

This should have captured all 15 packets received, but it didn't.
Further investigation seems to indicate that this is because of the following:
tcpdump sets are read() timeout for libpcap to 1 second using pcap_set_timeout(1000)
tcpdump sets an alarm timer to 1 second (alarm(1)) to updated the displayed number of packets captured every second when -v and -w are used.
in the bpf module, bpf_read() uses cv_timedwait_sig() to wait for the configured timeout before it returns from read(). This is always interruppted by the alarm causing read() to return EINTR. Dropping the bpf read timeout down to 950 seems to provide enough time for it to work.

--- tcpdump-4.5.1/tcpdump.c Mon Apr  7 10:34:02 2014
+++ tcpdump-4.5.1/tcpdump.c.orig    Thu Nov  7 17:22:54 2013
@@ -1286,7 +1283,7 @@
                error("%s: Can't set monitor mode: %s",
                    device, pcap_statustostr(status));
        }
-       status = pcap_set_timeout(pd, 1000);
+       status = pcap_set_timeout(pd, 950);
        if (status != 0)
            error("%s: pcap_set_timeout failed: %s",
                device, pcap_statustostr(status));

root@snappy:/builds/jacobs/tcpdump/components/tcpdump# build/amd64/tcpdump -v -w /tmp/dump.out
tcpdump: listening on net0, link-type EN10MB (Ethernet), capture size 65535 bytes
^C16 packets captured
16 packets received by filter
0 packets dropped by kernel
root@snappy:/builds/jacobs/tcpdump/components/tcpdump# 
@guyharris
Copy link
Member

in the bpf module, bpf_read() uses cv_timedwait_sig() to wait for the configured timeout before it returns from read(). This is always interruppted by the alarm causing read() to return EINTR.

That's not unique to Solaris, other than the names of the kernel routines used in the BPF code.

I'm not sure why this is only showing up on Solaris, but what happens if you change setsignal() to mark SIGALRM as a "restart after the interrupt" signal?

diff --git a/setsignal.c b/setsignal.c
index 1bcc032..ba53a07 100644
--- a/setsignal.c
+++ b/setsignal.c
@@ -73,7 +73,7 @@ RETSIGTYPE

    memset(&new, 0, sizeof(new));
    new.sa_handler = func;
-   if (sig == SIGCHLD)
+   if (sig == SIGCHLD || sig == SIGALRM)
        new.sa_flags = SA_RESTART;
    if (sigaction(sig, &new, &old) < 0)
        return (SIG_ERR);

and revert the timeout change?

@normjacobs
Copy link
Author

On 04/07/2014 11:57 AM, Guy Harris wrote:

in the bpf module, bpf_read() uses cv_timedwait_sig() to wait for
the configured timeout before it returns from read(). This is
always interruppted by the alarm causing read() to return EINTR.

That's not unique to Solaris, other than the names of the kernel
routines used in the BPF code.

I'm not sure why this is only showing up on Solaris, but what happens
if you change |setsignal()| to mark SIGALRM as a "restart after the
interrupt" signal?

This works.

 -Norm

|diff --git a/setsignal.c b/setsignal.c
index 1bcc032..ba53a07 100644
--- a/setsignal.c
+++ b/setsignal.c
@@ -73,7 +73,7 @@ RETSIGTYPE

 memset(&new, 0, sizeof(new));
 new.sa_handler = func;
  • if (sig == SIGCHLD)
  • if (sig == SIGCHLD || sig == SIGALRM)
    new.sa_flags = SA_RESTART;
    if (sigaction(sig, &new, &old) < 0)
    return (SIG_ERR);
    |


Reply to this email directly or view it on GitHub
#1258.

@normjacobs
Copy link
Author

I take that back. I had left in the shorter timeout. It's still broken
with setsignal() change.

 -Norm

On 04/07/2014 06:56 PM, Norm Jacobs wrote:

On 04/07/2014 11:57 AM, Guy Harris wrote:

in the bpf module, bpf_read() uses cv_timedwait_sig() to wait for
the configured timeout before it returns from read(). This is
always interruppted by the alarm causing read() to return EINTR.

That's not unique to Solaris, other than the names of the kernel
routines used in the BPF code.

I'm not sure why this is only showing up on Solaris, but what happens
if you change |setsignal()| to mark SIGALRM as a "restart after the
interrupt" signal?

This works.

-Norm

|diff --git a/setsignal.c b/setsignal.c
index 1bcc032..ba53a07 100644
--- a/setsignal.c
+++ b/setsignal.c
@@ -73,7 +73,7 @@ RETSIGTYPE

 memset(&new, 0, sizeof(new));
 new.sa_handler = func;
  • if (sig == SIGCHLD)
  • if (sig == SIGCHLD || sig == SIGALRM)
    new.sa_flags = SA_RESTART;
    if (sigaction(sig, &new, &old) < 0)
    return (SIG_ERR);
    |


Reply to this email directly or view it on GitHub
#1258.

@guyharris
Copy link
Member

It's still broken with setsignal() change.

My goodness, Sun^WOracle certainly seem to have failed big time here; I'm not having that problem on, for example, OS X, and if it were happening on *BSD, we would probably have heard about it. If Larry hadn't decided to kill OpenSolaris, I could perhaps have looked at the code to see what the problem is.

The Single UNIX Specification page for sigaction() says:

SA_RESTART
This flag affects the behavior of interruptible functions; that is, those specified to fail with errno set to [EINTR]. If set, and a function specified as interruptible is interrupted by this signal, the function shall restart and shall not fail with [EINTR] unless otherwise specified. If an interruptible function which uses a timeout is restarted, the duration of the timeout following the restart is set to an unspecified value that does not exceed the original timeout value. If the flag is not set, interruptible functions interrupted by this signal shall fail with errno set to [EINTR].

Implementing that behavior, for character special files such as the /dev/bpf files, is a function of the driver on BSD-flavored systems. For example, the bpfread() loop in OS X Mavericks includes

            error = BPF_SLEEP(d, PRINET|PCATCH, "bpf",
                              d->bd_rtout);

...

            if (error == EINTR || error == ERESTART) {
                    if (d->bd_slen) {
                            /*
                             * Sometimes we may be interrupted often and
                             * the sleep above will not timeout.
                             * Regardless, we should rotate the buffers
                             * if there's any new data pending and
                             * return it.
                             */
                            ROTATE_BUFFERS(d);
                            break;
                    }
                    lck_mtx_unlock(bpf_mlock);
                    return (error);
            }

In FreeBSD, the equivalent code is

            error = msleep(d, &d->bd_lock, PRINET|PCATCH,
                 "bpf", d->bd_rtout);
            if (error == EINTR || error == ERESTART) {
                    BPFD_UNLOCK(d);
                    return (error);
            }

and in NetBSD is

            error = tsleep(d, PRINET|PCATCH, "bpf",
                            d->bd_rtout);
            if (error == EINTR || error == ERESTART) {
                    splx(s);
                    KERNEL_UNLOCK_ONE(NULL);
                    return (error);
            }

and in OpenBSD is

            if (d->bd_rtout == -1) {
                    /* User requested non-blocking I/O */
                    error = EWOULDBLOCK;
            } else {
                    if ((d->bd_rdStart + d->bd_rtout) < ticks) {
                            error = tsleep((caddr_t)d, PRINET|PCATCH, "bpf",
                                d->bd_rtout);
                    } else
                            error = EWOULDBLOCK;
            }
            if (error == EINTR || error == ERESTART) {
                    D_PUT(d);
                    splx(s);
                    return (error);  
            }

and in DragonFly BSD is

            error = tsleep(d, PCATCH, "bpf", d->bd_rtout);
            if (error == EINTR || error == ERESTART) {
                    lwkt_reltoken(&bpf_token);
                    return(error);
            }

which might not handle this case; I'll have to try it, but I think there would have been significant complaints if it were broken on *BSD.

@infrastation
Copy link
Member

@normjacobs, could you post the following additional information for completeness?

  • OS version and patchlevel (or equivalent)
  • the tcpdump release that fails and if the bug reproduces on the current git master build of tcpdump

Thank you.

@guyharris
Copy link
Member

See also the-tcpdump-group/tcpdump#155.

@infrastation
Copy link
Member

Given lack of feedback in the past 6 years, is there any good reason to keep this open?

@mcr
Copy link
Member

mcr commented Aug 29, 2020

reopen if still an issue with recent version.

@mcr mcr closed this as completed Aug 29, 2020
@infrastation
Copy link
Member

@guyharris, is there anything useful to take from this report? Would it make sense to leave a note in a man page or source code comment?

@guyharris
Copy link
Member

For what it's worth, I built the tip of the master branch of libpcap, and then built the tip of the master branch of tcpdump with that libpcap, and tested it with -v and -w, and it seemed to work on "SunOS 5.11", with a "version" of 11.3, which I guess means Solaris 11.3.

However, if I

  • change the SIGALRM handler not to print anything;
  • change the interval timer to 1 microsecond;

no packets get captured.

So there is a problem - and I'm not sure it's Solaris-only. It does not happen on macOS Catalina; the Catalina code is

	if (error == EINTR || error == ERESTART) {
		if (d->bd_hbuf != NULL) {
			/*
			 * Because we msleep, the hold buffer might
			 * be filled when we wake up.  Avoid rotating
			 * in this case.
			 */
			break;
		}
		if (d->bd_slen != 0) {
			/*
			 * Sometimes we may be interrupted often and
			 * the sleep above will not timeout.
			 * Regardless, we should rotate the buffers
			 * if there's any new data pending and
			 * return it.
			 */
			ROTATE_BUFFERS(d);
			break;
		}
		bpf_release_d(d);
		lck_mtx_unlock(bpf_mlock);
		if (error == ERESTART) {
			printf("%s: %llx ERESTART to EINTR\n",
			    __func__, (uint64_t)VM_KERNEL_ADDRPERM(d));
			error = EINTR;
		}
		return error;
	}

and the "Sometimes we may be interrupted often and..." code is probably what makes it work - if there's data in the store buffer, and there's no data in the hold buffer, if a signal happened during the read(), the buffers get rotated, so the store buffer becomes the hold buffer, and the next read will return that data immediately, so the data doesn't get stuck in the store buffer until the store buffer fills up.

On FreeBSD 12, however, no packets get captured in that case; FreeBSD does not have the "rotate the buffers if the store buffer is non-empty and the hold buffer is empty" code that macOS has. (If I were still working there, I could see whether this is the scenario that caused Apple to add that code.)

So there is an issue here, at least with systems using BPF and not having that code.

@infrastation
Copy link
Member

So there is an old small problem. A solution does not have to be implemented immediately. I would be OK if this discussion boiled down to a comment with detailed directions in the source code for any future volunteers.

Would making the periodic update a separate POSIX thread be an OK solution? The periodic update code would be conditional, so if something like HAVE_WORKING_PTHREADS is not defined, the optional feature will not compile and will not get in the way of essential code flow.

Would the following text in the man page be OK to document the problem for the time being?

          When writing to a file with the -w option and at the  same  time
          not  reading  from  a  file with the -r option, report, once per
          second, the number of packets captured.  On Solaris and possibly
          other platforms this periodic update currently can cause loss of
          packets on their way from the kernel BPF to tcpdump.

@infrastation infrastation reopened this Sep 3, 2020
@mcr
Copy link
Member

mcr commented Sep 4, 2020

So there is an old small problem. A solution does not have to be implemented immediately. I would be OK if this discussion boiled down to a comment with detailed directions in the source code for any future volunteers.

Would making the periodic update a separate POSIX thread be an OK solution?
The periodic update code would be conditional, so if something like HAVE_WORKING_PTHREADS is not defined, the optional feature will not compile and will not get in the way of essential code flow.

I'm okay with doing this in tcpdump. I hate having a library that uses threads, but I don't think we need to do that.

Would the following text in the man page be OK to document the problem for the time being?

so, we'd document the issue. I would go further and suggest that it not be used on Solaris.

infrastation referenced this issue in the-tcpdump-group/tcpdump Sep 15, 2020
Capture what Guy has figured out so far.
@guyharris guyharris transferred this issue from the-tcpdump-group/tcpdump Jan 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

4 participants