Using Linux cooked header version 2 always outputs link level data #1077

Open
AndreLuyer opened this issue Aug 20, 2023 · 17 comments

AndreLuyer commented Aug 20, 2023

Linux cooked header version 2 always outputs link level data, while version 1 only outputs extra information when the option -e is used (“Print the link-level header on each dump line.”).
The output of version 2 should be consistent with version 1 (and EN10MB Ethernet, etc.) and only output when -e is used.

For version 1, the function sll_if_print in print-sll.c is called, and it calls sll_print only when option -e is used (if (ndo->ndo_eflag)).
For version 2, the function sll2_if_print is called, and ifname and sll2_pkttype are printed unconditionally.
I believe this should be moved to the sll2_print function.
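
A minimal sketch of the behaviour being proposed here, assuming a simplified stand-in for tcpdump's options structure. The names `struct opts` and `sll2_prefix` are hypothetical and this is not the actual print-sll.c code; it only models "print the interface name and packet type when -e is set, otherwise print nothing":

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for tcpdump's netdissect options. */
struct opts {
    int eflag; /* non-zero when -e was given */
};

/*
 * Write the per-packet SLL2 prefix into buf and return its length.
 * Proposed behaviour: interface name and packet type appear only with -e.
 */
int
sll2_prefix(char *buf, size_t len, const struct opts *o,
            const char *ifname, const char *pkttype)
{
    if (!o->eflag) {
        if (len > 0)
            buf[0] = '\0';
        return 0;
    }
    /* Same field widths as the two ND_PRINT calls discussed in this issue. */
    return snprintf(buf, len, "%-5s %-3s ", ifname, pkttype);
}
```

Without -e the prefix is empty, so the line would start with the timestamp followed directly by the IP output, as it does for version 1.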

Then also the warning “interface names might be incorrect” (tcpdump.c(2080)) should only be printed if option -e is used.
Version 4.9.3 and higher supports link-type LINUX_SLL2 (Linux cooked v2).

If you agree I will try to make a PR for this.

sll-v1-v2.zip
Attached samples produce this output:

$ tcpdump -qn -r sll-v1.pcap -c3
reading from file sll-v1.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 262144
22:15:05.676722 IP 192.168.178.46.36096 > 192.168.178.1.80: tcp 0
22:15:05.677375 IP 192.168.178.1.80 > 192.168.178.46.36096: tcp 0
22:15:05.677400 IP 192.168.178.46.36096 > 192.168.178.1.80: tcp 0
$ tcpdump -qn -r sll-v2.pcap -c3
reading from file sll-v2.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
22:17:45.529693 ens33 Out IP 192.168.178.46.36100 > 192.168.178.1.80: tcp 0
22:17:45.531064 ens33 In  IP 192.168.178.1.80 > 192.168.178.46.36100: tcp 0
22:17:45.531087 ens33 Out IP 192.168.178.46.36100 > 192.168.178.1.80: tcp 0
@AndreLuyer
Author

I tried to subscribe to the tcpdump-workers mailing list, but I get the error:

- Results:
subscribe bad argument: tcpdump-workers

@guyharris
Member

So the specific piece of link-level data you're referring to here is the direction, not the interface name or link-layer address?

@AndreLuyer
Author

I mean both are printed; the interface name and direction. While in version 1 nothing is printed unless option -e is used. (This actually broke my existing script when version 2 was used after an upgrade of the servers.)
The link-layer address is only printed when option -e is used, thus when sll2_print function is called.

So I am referring to print-sll.c, function sll2_if_print, line 426: ND_PRINT("%-5s ", ifname);
and lines 429-430: ND_PRINT("%-3s ", tok2str(sll_pkttype_values, "?", GET_U_1(sllp->sll2_pkttype)));
In the example above it is shown as "ens33 Out" and "ens33 In".

I believe lines 422-430 should be moved to the sll2_print function.

With option -e the output is:

$ tcpdump -qn -r sll-v2.pcap -c3 -e
reading from file sll-v2.pcap, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144
Warning: interface names might be incorrect
22:17:45.529693 ens33 Out ifindex 2 00:0c:29:06:f9:85 192.168.178.46.36100 > 192.168.178.1.80: tcp 0
22:17:45.531064 ens33 In  ifindex 2 e8:df:70:3a:1e:27 192.168.178.1.80 > 192.168.178.46.36100: tcp 0
22:17:45.531087 ens33 Out ifindex 2 00:0c:29:06:f9:85 192.168.178.46.36100 > 192.168.178.1.80: tcp 0

Which is fine with me.

@guyharris
Member

guyharris commented Aug 22, 2023

I mean both are printed; the interface name and direction. While in version 1 nothing is printed unless option -e is used.

The interface name isn't printed with version 1 even if -e is used. It's only printed with version 2.

The direction is only printed with version 1 if -e is used. [corrected with "1" inserted after "version"]

Given that there is no guarantee that, in the future, only Linux cooked captures will ever show the direction or that Linux cooked captures v2 will show the interface, and that there may be cases where somebody doesn't want any link-layer addresses but does want the interface or direction, the right solution might be to adopt macOS's -k flag:

   -k metadata_arg
   --apple-md-print metadata_arg
          Control the display of packet metadata via an optional
          metadata_arg argument. This is useful when displaying packet
          saved in the pcap-ng file format or with interfaces that support
          the PKTAP data link type.

          By default, when the metadata_arg optional argument is not
          specified, any available packet metadata information is printed
          out.

          The metadata_arg argument controls the display of specific
          packet metadata information using a flag word, where each
          character corresponds to a type of packet metadata as follows:

                 I     interface name (or interface ID)
                 N     process name
                 P     process ID
                 S     service class
                 D     direction
                 C     comment
                 F     flags
                 U     process UUID (not shown by default)
                 V     verbose printf of pcap-ng blocks (not shown by default)
                 f     flow identifier
                 t     trace tag
                 A     display all types of metadata

so that the user can specify which packet metadata (as opposed to packet data, such as link-layer headers) they want.
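
To make the flag-word idea concrete, here is a rough sketch of parsing such an argument into a bitmask. Everything in it is an assumption for illustration: the names `MD_INTERFACE`, `MD_DIRECTION` and `parse_metadata_arg` are made up, only the `I`, `D` and `A` characters are handled (the two kinds of metadata a Linux cooked header can carry), and it is not Apple's implementation:

```c
/* Hypothetical bit flags for the metadata an SLL header can provide. */
enum {
    MD_INTERFACE = 1 << 0, /* 'I' */
    MD_DIRECTION = 1 << 1  /* 'D' */
};

/*
 * Parse a flag word such as "ID" into a bitmask.
 * Returns -1 on a character this sketch does not know about.
 */
int
parse_metadata_arg(const char *arg)
{
    int mask = 0;

    for (; *arg != '\0'; arg++) {
        switch (*arg) {
        case 'I': mask |= MD_INTERFACE; break;
        case 'D': mask |= MD_DIRECTION; break;
        case 'A': mask |= MD_INTERFACE | MD_DIRECTION; break;
        default:  return -1;
        }
    }
    return mask;
}
```

A printer could then test the mask for each field independently of -e.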

@AndreLuyer
Author

… the right solution might be to adopt macOS's -k flag:

I am in favor of this. But that is a major change.

That reminds me that libpcap support for Linux cooked can be enhanced as well. For example the filter "broadcast" results in the error "tcpdump: not a broadcast link". Now you have to use "link[0:2]=1" for version 1 or "link[10:1]=1" for version 2 instead. (Maybe add "sll broadcast"?)
I expect that Linux cooked will become more popular.
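
For illustration, this is what the two workaround filters test, written out as C against the raw header bytes. The function names and the `SLL_PKTTYPE_BROADCAST` macro are made up for this sketch; the offsets and the value 1 are taken from the filter expressions above:

```c
#include <stddef.h>
#include <stdint.h>

#define SLL_PKTTYPE_BROADCAST 1 /* value both filters compare against */

/* v1: packet type is a 2-byte big-endian field at offset 0, as in link[0:2]. */
int
sll1_is_broadcast(const uint8_t *hdr, size_t len)
{
    if (len < 2)
        return 0;
    return ((hdr[0] << 8) | hdr[1]) == SLL_PKTTYPE_BROADCAST;
}

/* v2: packet type is a single byte at offset 10, as in link[10:1]. */
int
sll2_is_broadcast(const uint8_t *hdr, size_t len)
{
    if (len < 11)
        return 0;
    return hdr[10] == SLL_PKTTYPE_BROADCAST;
}
```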

The direction is only printed with version if -e is used.

Maybe I misunderstood, I thought you meant by 'direction' that what is printed on line 430 (sllp->sll2_pkttype).
Anyway without -e option the output, for packets containing TCP, is:
<timestamp> <IP addresses and ports> etc. for Ethernet
<timestamp> <IP addresses and ports> etc. for Linux cooked v1
<timestamp> <sll info> <IP addresses and ports> etc. for Linux cooked v2
So without changing the command line (options) there are 2 extra fields printed. That was unexpected and was the reason why I opened this issue.

Please advise how to go forward.

@guyharris
Member

The direction is only printed with version if -e is used.

Maybe I misunderstood, I thought you meant by 'direction' that what is printed on line 430 (sllp->sll2_pkttype).

By "direction" I meant "that which is printed on line 164 (sllp->sll_pkttype) for version 1 and on line 430 (sllp->sll2_pkttype) for version 2", and by "printed with version" I meant "printed with version 1" (and fixed that comment to say "version 1").

The direction is printed only with -e with version 1 and printed unconditionally with version 2. That is inconsistent.

I don't consider the interface to be part of the link-layer header. Currently, with version 2 Linux cooked captures, the interface ID is part of the header constructed by libpcap, but it's not part of any link-layer header on the wire, or even a characteristic of either OSI layer 1 or 2 such as the data from the radiotap layer on 802.11 - it's not different for different packets on the same network segment.

And I can imagine somebody who is not interested in any link-layer information, and may not want MAC addresses cluttering up their tcpdump output, wanting to know, for a capture on more than one interface, on which interface a packet is sent or received. (That's why version 2 was introduced.)

So I am not in favor of tying the interface name to -e. If we're to tie it to a command-line option, I'd tie it to -k, to allow it to be controlled independently of -e.

The direction, however, could be considered part of the link layer. However, some people might want to see it even if they don't want to see the link-layer header, so I think that one's debatable; perhaps using -k would be the right answer.

@fxlb
Member

fxlb commented Aug 27, 2023

We could: Leave the sll2 version as is. For sll1, print the packet type (In, B, M, P, Out) without -e.

@AndreLuyer
Author

I don't consider the interface to be part of the link-layer header.

That makes sense as it contains data that is not sent over the wire. However it does replace the Ethernet header that is sent over the wire (for both NIC and localhost) and it contains the source MAC address.
The comment on line 84 states: "A DLT_LINUX_SLL fake link-layer header.". Apparently it was originally designed as a 'fake link-layer'.

The direction is printed only with -e with version 1 and printed unconditionally with version 2. That is inconsistent.

IMO it should be consistent.

We could: Leave the sll2 version as is. For sll1, print the packet type (In, B, M, P, Out) without -e.

That would make it less consistent: No output with Ethernet, only direction with sll1 and name and direction with sll2.

Looking at the man page the option -e should print, at least, the MAC address(es).

I propose to change it as follows:

  1. without using option -e or -k: no extra output.
  2. with option -e: for version 1 keep current output; for version 2 (at least) the same output as for version 1.
  3. have option -k to 'fine tune' what is outputted.

@guyharris
Member

guyharris commented Aug 27, 2023

However it does replace the Ethernet header that is sent over the wire (for both NIC and localhost) and it contains the source MAC address.
The comment on line 84 states: "A DLT_LINUX_SLL fake link-layer header.". Apparently it was originally designed as a 'fake link-layer'.

Yes, it was. The "cooked" header is used in three cases:

  1. For some interface link-layer types (which Linux, for historical(?) reasons, calls "ARP hardware types"), the drivers that generate them tweak some skbuff fields so that the real link-layer header is permanently stripped off, so that it's not presented to programs reading from PF_PACKET/SOCK_RAW sockets (which is what libpcap uses by default for packet capture), meaning that packet information necessary to figure out how to dissect the payload is missing. At least at one point, PPP interfaces were an example of this.
  2. Linux has, as of the current tip of the main branch of the official Git repository of the kernel, 68 link-layer types indicated by ARPHRD_ values. There aren't documents that describe all of them precisely (to the extent that they could be described at the level of precision with which the link-layer headers corresponding to DLT_/LINKTYPE_ should be described), and the list may increase without notification and without a new LINKTYPE_/DLT_ being requested, so, instead of rejecting those ARPHRD_ values, we map them to a cooked capture value and provide a warning (so some applications will report a warning, so the user knows that they might be losing information, and should perhaps request that a LINKTYPE_/DLT_ be assigned to them).
  3. When capturing on the "any" device, there would be only one linktype in a pcap file, and there would be only one filter program in the kernel. This requires that packets from all devices have, in effect, the same link-layer header, so cooked capturing is done.

Yes, this means that you lose whatever link-layer headers the packets have (which cannot be guaranteed to be Ethernet headers - even if all current interface drivers provide Ethernet headers, a new device might get attached during the capture process, and it might not provide Ethernet headers). This is an unfortunate consequence of

  1. annoying drivers that mess with the skbuff fields in question;
  2. supporting devices with an unknown link-layer header type;
  3. the inability of the pcap file format and the current libpcap API to support the notion of a capture on multiple devices that don't all have the same link-layer header type.

In order not to use cooked captures for those:

For 1), that'll take some checking to see whether kernel changes have fixed that issue.

For 2), there's not much we can do about that, other than, again, looking at the kernel source to assign LINKTYPE_/DLT_ values to current ARPHRD_ values that don't already have them, but there may always be new ones.

For 3), pcapng, libpcap API changes to fully support pcapng, and some hackery in the filter compiler to use, for example, "offset to the payload" values to try to generate a single filter program that could handle multiple link layer types would be necessary.

That would make it less consistent: No output with Ethernet, only direction with sll1 and name and direction with sll2.

You can't get name or direction with Ethernet and other non-cooked link-layer headers, because they don't provide that information. That requires some place to provide metadata such as that, which means using pcapng and libpcap API changes to fully support pcapng.

You can't get the interface name with SLL1 as the SLL1 header doesn't provide an indication of the interface on which packets arrived. Again, that requires some place to provide metadata such as that, which means, again, using pcapng and libpcap API changes to fully support pcapng, which would allow the interface information to be stored in capture files, allowing the right interface names to be reported if you do a multi-interface capture on one machine and read it on a different machine, and would also allow, as noted in the previous paragraph, direction information to be reported for all devices.
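
The difference between the two headers can be sketched as C structures. The struct and member names below are illustrative rather than the ones used in tcpdump, and all multi-byte fields are in network byte order on the wire; the point is only that the 16-byte v1 header has no interface field while the 20-byte v2 header carries an interface index:

```c
#include <stddef.h>
#include <stdint.h>

/* DLT_LINUX_SLL (v1): 16 bytes, no interface field. */
struct sll1_layout {
    uint16_t pkttype;   /* packet type (direction) */
    uint16_t hatype;    /* ARPHRD_ value */
    uint16_t halen;     /* link-layer address length */
    uint8_t  addr[8];   /* link-layer address */
    uint16_t protocol;  /* protocol type */
};

/* DLT_LINUX_SLL2 (v2): 20 bytes, adds the interface index. */
struct sll2_layout {
    uint16_t protocol;  /* protocol type */
    uint16_t reserved;  /* must be zero */
    uint32_t if_index;  /* interface index on the capturing machine */
    uint16_t hatype;    /* ARPHRD_ value */
    uint8_t  pkttype;   /* packet type (direction) */
    uint8_t  halen;     /* link-layer address length */
    uint8_t  addr[8];   /* link-layer address */
};
```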

That's what macOS's libpcap and tcpdump support, and what their -k flag uses for the metadata to report.

Looking at the man page the option -e should print, at least, the MAC address(es).

That's not going to happen with cooked captures, as cooked captures discard the device-specific link-layer header in favor of a standard link-layer header. Perhaps with an SLL3 header, which prepends a metadata header before the device-specific link-layer header, that could be done, but that would involve changes to libpcap, tcpdump, Wireshark, and other programs that read capture files.

without using option -e or -k: no extra output.

I.e., don't print the interface, even for SLL2 (the only link-layer type where the interface can be printed without the pcapng support changes to libpcap mentioned above), and don't print the direction.

with option -e: for version 1 keep current output; for version 2 (at least) the same output as for version 1.

I.e., don't print the direction for SLL1 and SLL2, and don't print the interface for SLL2.

have option -k to 'fine tune' what is outputted.

To the extent that's possible: without the pcapng changes, you can only get the direction when reading an SLL1 or SLL2 file and only get the interface when reading an SLL2 file, so you can choose which of those to print.

Note that, BTW, the output for the -k flag should match the macOS format:

$ sudo tcpdump -c 1 -i any -k I 

...

13:14:36.582464 (en0) IP XXX.XXX.XXX.XXX.https > 192.168.1.3.64723: Flags [P.], seq 1576073188:1576073248, ack 2552479664, win 8, options [nop,nop,TS val 2982863689 ecr 3258798788], length 60

...

$ sudo tcpdump -c 1 -i any -k D

...

13:14:43.371715 (in) IP XXX.XXX.XXX.XXX.https > 192.168.1.3.64615: Flags [P.], seq 1019456218:1019456278, ack 2984318233, win 8, options [nop,nop,TS val 2615934370 ecr 3097681973], length 60

...

$ sudo tcpdump -c 1 -i any -k ID

...

13:14:46.296686 (en0, in) IP XXX.XXX.XXX.XXX.https > 192.168.1.3.64615: Flags [P.], seq 1019456278:1019456338, ack 2984318233, win 8, options [nop,nop,TS val 2615937372 ecr 3097685002], length 60

...

for compatibility with the macOS tcpdump.

@fxlb
Member

fxlb commented Aug 27, 2023

We could: Leave the sll2 version as is. For sll1, print the packet type (In, B, M, P, Out) without -e.

That would make it less consistent: No output with Ethernet, only direction with sll1 and name and direction with sll2.

I mean with or without -e (not depending on -e), like for sll2.

@fxlb
Member

fxlb commented Aug 28, 2023

  1. without using option -e or -k: no extra output

I don't agree with this change. sll2 is now the default with the any interface and a quick look is easy with tcpdump -#ni any.
We need to keep that.

@fxlb
Member

fxlb commented Aug 28, 2023

In version 4.99.0 (December 30, 2020), SLL2 was made the default for the Linux "any" pseudo-device.
SLL1 is the past. Not sure we need to take care of this inconsistency.

@guyharris
Member

Here's the difference between tcpdump 4.99's print-sll.c and macOS's tip-of-the-main-branch tcpdump's print-sll.c:

$ diff print-sll.c /Volumes/Case-sensitive/src/macos/tcpdump/tcpdump/print-sll.c
29,33d28
< /*
<  * Include diag-control.h before <net/if.h>, which too defines a macro
<  * named ND_UNREACHABLE.
<  */
< #include "diag-control.h"

so it prints SLL1 and SLL2 the same way tcpdump 4.99 does.

(Of course, if you do an SLL2 capture, it's almost certainly on a Linux box, meaning that if you read the capture on a Mac, the two machines will be different - even if they're the same machine dual-booting - so the warning about the interface names being wrong will come true; the capture I did on a Linux VM, capturing on the "any" device and pinging over both ens33 and lo, reported traffic on gif0 and lo0 when I read it on the Mac on which the VM was running.)
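
The mismatch comes from the SLL2 header storing only an interface index, which the reader then resolves against its own interface table. A minimal sketch of that lookup using POSIX `if_indextoname()`; the helper name `local_ifname` and the "?" fallback are assumptions for illustration and may not match what tcpdump actually does:

```c
#include <net/if.h>
#include <stdio.h>

/*
 * Resolve an interface index against the machine doing the reading.
 * buf must hold at least IF_NAMESIZE bytes. An index that happens to
 * exist locally yields the local name, whatever it was at capture time.
 */
const char *
local_ifname(unsigned int if_index, char *buf)
{
    if (if_indextoname(if_index, buf) == NULL)
        snprintf(buf, IF_NAMESIZE, "?");
    return buf;
}
```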

My inclination is:

  • with no -k flag, print stuff the same way it's printed now, so as not to make changes;
  • with a -k flag, print stuff as specified by the -k flag (maybe add support for -k none, meaning "print no metadata"), in the macOS style (e.g., "(xyzzy0, in)" rather than "xyzzy0 In", with no metadata printing as "()"), rather than in the current style;
  • with a future version of libpcap with full pcapng support, add full support in tcpdump and, when reading a pcapng file with a -k flag, any metadata from the pcapng packet block (other than "this packet arrived on the 'any' interface") overrides metadata from the SLL1 or SLL2 header.

The first of those matches what any box running a 4.99-based tcpdump (including a macOS box running Apple's tcpdump) would print.

The second of those wouldn't match what the current macOS tcpdump would print, as, if you have a pcapng packet on an interface with LINUX_SLL2 as its link-layer type, it would print metadata from both the pcapng packet block and the SLL2 header. I'm not sure the current macOS tcpdump behavior is useful, so I hope Apple would just pick that up from us.

The third of those also wouldn't match what the current macOS tcpdump would print, for the same reason, and, again, I'm not sure the current macOS behavior is useful in that case.
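
For reference, the macOS-style formatting described in the second bullet could be sketched as follows. The function name `format_metadata` is hypothetical, and passing NULL for a field that is unavailable or not requested is a convention chosen for this sketch only:

```c
#include <stdio.h>

/*
 * Format metadata in the parenthesised style: "(xyzzy0, in)", "(xyzzy0)",
 * "(in)", or "()" when there is no metadata to print.
 * Returns the number of characters the full string needs.
 */
int
format_metadata(char *buf, size_t len, const char *ifname, const char *dir)
{
    if (ifname != NULL && dir != NULL)
        return snprintf(buf, len, "(%s, %s)", ifname, dir);
    if (ifname != NULL)
        return snprintf(buf, len, "(%s)", ifname);
    if (dir != NULL)
        return snprintf(buf, len, "(%s)", dir);
    return snprintf(buf, len, "()");
}
```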

@fxlb
Member

fxlb commented Aug 28, 2023

  • with no -k flag, print stuff the same way it's printed now, so as not to make changes;

Agreed.

  • with a -k flag, print stuff as specified by the -k flag (maybe add support for -k none, meaning "print no metadata"), in the macOS style (e.g., "(xyzzy0, in)" rather than "xyzzy0 In", with no metadata printing as "()"), rather than in the current style;

Like ?

$ tcpdump -#ne -r sll-v1.pcap
    1  22:15:05.676722 Out 00:0c:29:06:f9:85 ethertype IPv4 ...

$ tcpdump -k D -#n -r sll-v1.pcap
    1  22:15:05.676722 (out) 00:0c:29:06:f9:85 ethertype IPv4 ...
    
$ tcpdump -k D -#ne -r sll-v1.pcap
    1  22:15:05.676722 (out) 00:0c:29:06:f9:85 ethertype IPv4 ...

Not sure having 2 output strings Out and (out) makes things clear and easy to grep.
I think there is no need to copy the Apple format.

@guyharris
Member

Like ?

Yes.

Not sure having 2 output strings Out and (out) makes things clear and easy to grep.
I think there is no need to copy the Apple format.

If and when we do a libpcap and tcpdump with pcapng support, copying the Apple format for that means that 1) the output is the same for Apple's tcpdump and our tcpdump when reading a pcapng file, so it's "easy to grep" in that you don't need different grep patterns for macOS and other OSes and 2) Apple wouldn't have to decide whether to change their output to match ours or to patch our code to produce their output style.

For pcap files, currently we don't even support a -k flag, so it's not as if producing a different output format for -k output would break our output.

@fxlb
Member

fxlb commented Aug 31, 2023

For pcap files, currently we don't even support a -k flag, so it's not as if producing a different output format for -k output would break our output.

Maybe I didn't understand your proposal. Would the -k option only be used for pcapng files?

@infrastation
Member

One other potential space for improvement in the SLL2 decoder is to print exactly one space around the interface name when the latter is present. Currently sll2_if_print() uses the following, which produces more spaces for short interface names:

ND_PRINT("%-5s ", ifname);
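
The effect of the field width is easy to see with plain `snprintf`. In this sketch the helper names `padded_ifname` and `plain_ifname` are made up, and the unpadded `"%s "` format is only one possible way to get exactly one space:

```c
#include <stdio.h>

/* Current style: left-justify in a 5-column field, then one space. */
int
padded_ifname(char *buf, size_t len, const char *ifname)
{
    return snprintf(buf, len, "%-5s ", ifname);
}

/* Alternative: no padding, exactly one trailing space. */
int
plain_ifname(char *buf, size_t len, const char *ifname)
{
    return snprintf(buf, len, "%s ", ifname);
}
```

With "lo" the padded form emits four trailing spaces in total, while a five-character name such as "ens33" gets exactly one in both forms.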

Since commit 0e04b9d the "interface names might be incorrect" warning now appears only if the output includes interface names.
