Tuesday, September 5, 2017

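Computing TCP/IP Checksums

TCP/IP uses three checksums (IP, TCP, and UDP):
(Layer 3) IP header checksum: covers the IPv4 header (including options)
(Layer 4) TCP/UDP checksum: covers the TCP/UDP header + pseudo header + payload

How to compute them:

1. IP header checksum
It is always computed and verified by the kernel in software.
IP packet data
Sum the data in 2-byte groups (skipping the checksum field):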
45 00 00 30 cc 61 40 00 40 06 4c 02 0a 05 04 6b 0a 08 09 ed
4500 + 0030 + cc61 + 4000 + 4006 + 0a05 + 046b + 0a08 + 09ed
= 1b3fc

Add the carry back in:

1 + b3fc = b3fd (1011 0011 1111 1101)
Take the one's complement of the result:
0100 1100 0000 0010 -> 4c 02
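
The three steps above (sum 16-bit words, add the carry back in, take the one's complement) can be sketched in user-space C. This is a minimal illustration, not the kernel's implementation; the function name `inet_checksum` is hypothetical, and the caller is assumed to have zeroed the checksum field and to pass fewer than 64 KiB of data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Internet checksum over `len` bytes.
 * Assumes the checksum field inside `data` is already zero. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Step 1: add the data as big-endian 16-bit words. */
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero low byte. */
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Step 2: add the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    /* Step 3: one's complement of the result. */
    return (uint16_t)~sum;
}
```

Calling it on the 20 header bytes above, with bytes 10 and 11 set to 00 00, should give 0x4c02.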

2. TCP checksum

TCP packet data
(1) Pseudo header: source IP + destination IP + protocol + L4 length (TCP header + payload)
0a05 + 046b + 0a08 + 09ed + 0006 + 001c (28 bytes)
= 2287

(2) TCP header

Sum the data in 2-byte groups (skipping the checksum field):
f3 dd 0c d3 d9 fa f8 26 00 00 00 00 70 02 ff ff 8e e9 00 00 02 04 05 b4 04 02 00 00
f3dd + 0cd3 +d9fa +f826 + 7002 + ffff + 0204 + 05b4 + 0402
= 44e8b

2287 + 44e8b = 47112

Add the carry back in:
4 + 7112 = 7116 (0111 0001 0001 0110)
Take the one's complement of the result:
1000 1110 1110 1001 -> 8e e9
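
The TCP calculation differs from the IP one only in that the pseudo header is added to the sum first. A minimal user-space sketch follows; the names `sum_words` and `l4_checksum` are hypothetical, the addresses are assumed to be 4 bytes in network order, and the segment is assumed to have its checksum field zeroed and to be shorter than 64 KiB.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running 32-bit sum as big-endian 16-bit words. */
static uint32_t sum_words(const uint8_t *p, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;   /* pad odd byte with zero */
    return sum;
}

/* Hypothetical helper: TCP/UDP checksum over pseudo header + segment.
 * `seg` is the L4 header plus payload with the checksum field zeroed. */
uint16_t l4_checksum(const uint8_t src[4], const uint8_t dst[4],
                     uint8_t proto, const uint8_t *seg, size_t seg_len)
{
    uint32_t sum = 0;

    /* Pseudo header: source IP, destination IP, protocol, L4 length. */
    sum = sum_words(src, 4, sum);
    sum = sum_words(dst, 4, sum);
    sum += proto;                 /* zero byte followed by protocol */
    sum += (uint32_t)seg_len;     /* header + payload length */

    /* L4 header and payload. */
    sum = sum_words(seg, seg_len, sum);

    /* Add the carries back in, then take the one's complement. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

With source 0a 05 04 6b, destination 0a 08 09 ed, protocol 6, and the 28-byte TCP header above (checksum bytes zeroed), the result should be 0x8ee9.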

3. UDP checksum

UDP packet data
(1) Pseudo header: source IP + destination IP + protocol + L4 length (UDP header + payload)
0a05 + 046b + 0808 + 0808 + 0011 + 0028 (40 bytes)
= 1eb9

(2) UDP header

Sum the data in 2-byte groups (skipping the checksum field):
f3 42 00 35 00 28 73 c2
f342 + 0035 + 0028
= f39f

(3) UDP payload

eb 3c 01 00 00 01 00 00 00 00 00 00 03 77 77 77 06 67 6f 6f 67 6c 65 03 63 6f 6d 00 00 01 00 01
eb3c + 0100 + 0001 + 0377 + 7777 + 0667 + 6f6f + 676c + 6503 + 636f + 6d00 + 0001 + 0001 (all-zero words omitted)
= 379e1

1eb9 + f39f + 379e1 = 48c39

Add the carry back in:
4 + 8c39 = 8c3d (1000 1100 0011 1101)
Take the one's complement of the result:
0111 0011 1100 0010 -> 73 c2
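
One detail worth noticing in the UDP example is that the length 0028 is summed twice: once in the pseudo header and once in the UDP header itself. The sketch below builds both from the ports and the payload. The function name `udp_checksum` is hypothetical, and the header plus payload is assumed to fit in 16 bits of length. The final substitution of 0xffff for a computed zero mirrors the `CSUM_MANGLED_0` handling in the pktgen excerpt later in this post, since a zero UDP checksum field means "no checksum" on IPv4.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 16-bit word. */
static uint32_t be16(const uint8_t *p)
{
    return ((uint32_t)p[0] << 8) | p[1];
}

/* Hypothetical helper: UDP checksum from addresses, ports and payload. */
uint16_t udp_checksum(const uint8_t src[4], const uint8_t dst[4],
                      uint16_t sport, uint16_t dport,
                      const uint8_t *payload, size_t payload_len)
{
    uint32_t udp_len = 8 + (uint32_t)payload_len;  /* header + payload */
    uint32_t sum = 0;
    uint16_t csum;

    /* (1) Pseudo header: addresses, protocol 17, UDP length. */
    sum += be16(src) + be16(src + 2) + be16(dst) + be16(dst + 2);
    sum += 17;
    sum += udp_len;

    /* (2) UDP header, with the checksum field counted as zero. */
    sum += (uint32_t)sport + dport + udp_len;

    /* (3) UDP payload. */
    for (size_t i = 0; i + 1 < payload_len; i += 2)
        sum += be16(payload + i);
    if (payload_len & 1)
        sum += (uint32_t)payload[payload_len - 1] << 8;

    /* Add the carries back in, then take the one's complement. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    csum = (uint16_t)~sum;

    /* A computed zero is sent as all ones. */
    return csum ? csum : 0xffff;
}
```

For the DNS query above (10.5.4.107 to 8.8.8.8, source port 0xf342, destination port 0x0035, 32-byte payload) this should give 0x73c2.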

How the kernel computes the checksum
ip_send_check computes the IP checksum of an outgoing packet.
net/ipv4/ip_output.c
/* Generate a checksum for an outgoing IP datagram. */
void ip_send_check(struct iphdr *iph)
{
    iph->check = 0;
    iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
}
EXPORT_SYMBOL(ip_send_check);
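iph->check must be set to zero first, because the checksum must not cover the checksum field itself.
Because the algorithm is a simple sum, a zeroed field is effectively excluded from the resulting checksum.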

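How the kernel verifies the checksum
If the checksum is correct and the forwarding or receiving node runs the same algorithm over the entire header (leaving the original iph->check field in place), the result is zero. This is a faster way to detect corruption than recomputing the checksum and comparing it.
net/ipv4/ip_input.c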
int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
{
...
    if (unlikely(ip_fast_csum((u8 *)iph, iph->ihl)))
        goto csum_error;
}
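
The "result is zero" property can be tried outside the kernel. The following is a user-space sketch, not the kernel's `ip_fast_csum`; the function name `ipv4_header_ok` and its bounds checks are assumptions made for illustration. It sums the whole header, checksum field included, and accepts the header only if the folded, inverted sum is zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the IPv4 header checksum verifies. */
int ipv4_header_ok(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t hlen;

    if (len < 20)
        return 0;

    /* IHL is the low nibble of the first byte, in 32-bit words. */
    hlen = (size_t)(hdr[0] & 0x0f) * 4;
    if (hlen < 20 || hlen > len)
        return 0;

    /* Sum every 16-bit word, leaving the checksum field in place. */
    for (size_t i = 0; i < hlen; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

    /* Add the carries back in. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    /* A valid header folds to 0xffff, so its complement is zero. */
    return (uint16_t)~sum == 0;
}
```

For the sample header from section 1 (with 4c 02 in place) the sum is 1b3fc + 4c02 = 1fffe, which folds to ffff, so the check passes; changing any single byte makes it fail.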
----------------------------------------------------------------------------------
arch/x86/include/asm/checksum_64.h

/**
 * ip_compute_csum - Compute an 16bit IP checksum.
 * @buff: buffer address.
 * @len: length of buffer.
 *
 * Returns the 16bit folded/inverted checksum of the passed buffer.
 * Ready to fill in.
 */
extern __sum16 ip_compute_csum(const void *buff, int len);
A general-purpose checksum function. Its input is a buffer of arbitrary size.

➠ static inline __sum16 csum_fold(__wsum sum) 

/**
 * csum_fold - Fold and invert a 32bit checksum.
 * sum: 32bit unfolded sum
 *
 * Fold a 32bit running checksum to 16bit and invert it. This is usually
 * the last step before putting a checksum into a packet.
 * Make sure not to mix with 64bit checksums.
 */

extern __wsum csum_partial(const void *buff, int len, __wsum sum);

/**
 * csum_partial - Compute an internet checksum.
 * @buff: buffer to be checksummed
 * @len: length of buffer.
 * @sum: initial sum to be added in (32bit unfolded)
 *
 * Returns the 32bit unfolded internet checksum of the buffer.
 * Before filling it in it needs to be csum_fold()'ed.
 * buff should be aligned to a 64bit boundary if possible.
 */

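The checksum returned by csum_partial still lacks the final folding step performed by csum_fold. An L4 protocol first calls one of the csum_partial functions to checksum the L4 data, then calls a function such as csum_tcpudp_magic to add in the pseudo-header checksum; that function combines the two partial sums and folds them into the final 16-bit result.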
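
To make the split between "partial" and "fold" concrete, here is a user-space analogue. The names `partial_sum` and `fold_and_invert` are hypothetical stand-ins, not the kernel functions, and the sketch adds big-endian words for readability whereas the kernel works on the bytes as they sit in memory. The point is only that unfolded 32-bit sums can be chained across several buffers (each piece except the last must have even length) and folded once at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Analogue of csum_partial: add a buffer to an unfolded 32-bit sum. */
uint32_t partial_sum(const uint8_t *buf, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* Analogue of csum_fold: fold 32 bits to 16 and invert.
 * Two folding rounds are enough for any 32-bit input. */
uint16_t fold_and_invert(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Feeding the intermediate totals from the worked examples through `fold_and_invert` reproduces the final checksums: 1b3fc gives 4c02, 47112 gives 8ee9, and 48c39 gives 73c2.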


static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
/**
 * ip_fast_csum - Compute the IPv4 header checksum efficiently.
 * iph: ipv4 header
 * ihl: length of header / 4
 */
Given an IP header and its length, it computes and returns the IP checksum. It can be used both to verify input packets and to compute the checksum of outgoing packets.

➠ static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len, __u8 proto, __wsum sum)

/**
 * csum_tcpudp_nofold - Compute an IPv4 pseudo header checksum.
 * @saddr: source address
 * @daddr: destination address
 * @len: length of packet
 * @proto: ip protocol of packet
 * @sum: initial sum to be added in (32bit unfolded)
 *
 * Returns the pseudo header checksum the input data. Result is
 * 32bit unfolded.
 */

/**
 * csum_tcpudp_magic - Compute an IPv4 pseudo header checksum.
 * @saddr: source address
 * @daddr: destination address
 * @len: length of packet
 * @proto: ip protocol of packet
 * @sum: initial sum to be added in (32bit unfolded)
 *
 * Returns the 16bit pseudo header checksum the input data already
 * complemented and ready to be filled in.
 */
static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
                    __u32 len, __u8 proto,
                    __wsum sum)
{
    return csum_fold(csum_tcpudp_nofold(saddr, daddr, len, proto, sum));
}


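net/core/skbuff.c
➠ __wsum skb_checksum(const struct sk_buff *skb, int offset, int len, __wsum csum)
Used almost exclusively by L4 protocols, in specific situations.

Pseudo header: it is defined only for the purpose of computing the checksum; it does not exist in the packets that actually travel on the wire.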


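Related sk_buff and net_device fields

1. Fields that store checksum information
2. How devices tell the kernel about their hardware checksum capabilities
3. How L4 protocols use this information to decide whether to compute the checksum of ingress and egress packets themselves or to let the NIC do it


The meaning of skb->csum and skb->ip_summed depends on whether the skb points to a received packet or to a packet being transmitted.
When a packet is received (RX)
skb->csum           holds the L4 checksum
skb->ip_summed      holds the status of the L4 checksum
It represents what the device driver wants to tell the L4 layer. Once the L4 receive routine gets the buffer, it may change the initial value of skb->ip_summed.
include/linux/skbuff.h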
#define CHECKSUM_NONE 0
#define CHECKSUM_UNNECESSARY 1
#define CHECKSUM_COMPLETE 2
#define CHECKSUM_PARTIAL 3
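
As a rough illustration of how a receive path might act on these states, here is a simplified user-space model. Everything in it is hypothetical (the enum, `fold16`, `rx_l4_checksum_ok`); it only mirrors the idea that UNNECESSARY means the device already verified the checksum, COMPLETE means the device handed over a raw sum of the segment that still needs the pseudo header added, and NONE means software must sum the segment itself. The RX use of CHECKSUM_PARTIAL is left out of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the RX states discussed above. */
enum rx_csum_state { RX_CSUM_NONE, RX_CSUM_UNNECESSARY, RX_CSUM_COMPLETE };

/* Fold a 32-bit sum to 16 bits without inverting it. */
static uint32_t fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return sum;
}

/* Returns 1 if the L4 checksum is acceptable.
 * device_sum: raw sum of the whole segment (used for COMPLETE only).
 * pseudo_sum: unfolded pseudo-header sum.
 * seg/len:    L4 header + payload with the checksum field in place. */
int rx_l4_checksum_ok(enum rx_csum_state state, uint32_t device_sum,
                      uint32_t pseudo_sum, const uint8_t *seg, size_t len)
{
    uint32_t sum = fold16(pseudo_sum);

    switch (state) {
    case RX_CSUM_UNNECESSARY:
        return 1;                       /* device already verified it */
    case RX_CSUM_COMPLETE:
        sum += fold16(device_sum);      /* reuse the device's sum */
        break;
    case RX_CSUM_NONE:
    default:
        /* Software path: sum the segment ourselves. */
        for (size_t i = 0; i + 1 < len; i += 2)
            sum += ((uint32_t)seg[i] << 8) | seg[i + 1];
        if (len & 1)
            sum += (uint32_t)seg[len - 1] << 8;
        break;
    }

    /* A valid segment folds to 0xffff (its complement is zero). */
    return fold16(sum) == 0xffff;
}
```

With the TCP example from section 2, the raw sum of the segment including its checksum is 44e8b + 8ee9 = 4dd74; adding the pseudo-header sum 2287 and folding gives ffff, so both the COMPLETE and the NONE paths accept it.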
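When a packet is transmitted (TX)
skb->csum           an offset pointing to the place in the buffer where the NIC will store the checksum it computes (the kernel code quoted in this post expresses it as skb->csum_start and skb->csum_offset)
skb->ip_summed      holds the status of the L4 checksum
It is used by the L4 protocol to tell the device whether it needs to take care of computing the checksum.
When the IP layer knows that something has invalidated the L4 checksum (for example, a pseudo-header field was modified), it changes the value of this field.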

Checksum handling for ingress and egress segments
net/ipv4/tcp_ipv4.c
int tcp_v4_rcv(struct sk_buff *skb)
/* This routine computes an IPv4 TCP checksum. */
void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb)
void __tcp_v4_send_check(struct sk_buff *skb, __be32 saddr, __be32 daddr)
{
    struct tcphdr *th = tcp_hdr(skb);

    if (skb->ip_summed == CHECKSUM_PARTIAL) {
        th->check = ~tcp_v4_check(skb->len, saddr, daddr, 0);
        skb->csum_start = skb_transport_header(skb) - skb->head;
        skb->csum_offset = offsetof(struct tcphdr, check);
    } else {
        th->check = tcp_v4_check(skb->len, saddr, daddr,
                     csum_partial(th,
                              th->doff << 2,
                              skb->csum));
    }
}

include/net/tcp.h
/*
 * Calculate(/check) TCP checksum
 */
static inline __sum16 tcp_v4_check(int len, __be32 saddr,
                   __be32 daddr, __wsum base)
{
    return csum_tcpudp_magic(saddr,daddr,len,IPPROTO_TCP,base);
}
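
The CHECKSUM_PARTIAL branch of __tcp_v4_send_check can be modelled in user space to see why it works. In the sketch below, all names are hypothetical and offsets are relative to the start of a plain byte buffer rather than to skb->head. The "stack" stores the folded but not inverted pseudo-header sum in the checksum field (this is what `~tcp_v4_check(..., 0)` yields), and the "device" then sums everything from csum_start to the end, folds, inverts, and writes the result at csum_start + csum_offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum to 16 bits without inverting it. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Stack side: seed the checksum field with the pseudo-header sum. */
void stack_seed_checksum(uint8_t *pkt, size_t csum_start,
                         size_t csum_offset, uint32_t pseudo_sum)
{
    uint16_t seed = fold16(pseudo_sum);

    pkt[csum_start + csum_offset] = (uint8_t)(seed >> 8);
    pkt[csum_start + csum_offset + 1] = (uint8_t)(seed & 0xff);
}

/* Device side: sum from csum_start to the end, seed included,
 * then store the folded, inverted result over the seed. */
void device_finish_checksum(uint8_t *pkt, size_t pkt_len,
                            size_t csum_start, size_t csum_offset)
{
    uint32_t sum = 0;
    uint16_t csum;

    for (size_t i = csum_start; i + 1 < pkt_len; i += 2)
        sum += ((uint32_t)pkt[i] << 8) | pkt[i + 1];
    if ((pkt_len - csum_start) & 1)
        sum += (uint32_t)pkt[pkt_len - 1] << 8;

    csum = (uint16_t)~fold16(sum);
    pkt[csum_start + csum_offset] = (uint8_t)(csum >> 8);
    pkt[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
}
```

For the TCP header from section 2 (csum_start 0, csum_offset 16, pseudo-header sum 2287), the seed written is 22 87 and the device step turns it into 8e e9, the same value as the full software calculation.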

Checksum Offloads
net_device->features: device capabilities
NETIF_F_IP_CSUM
The device can compute the L4 checksum in hardware, but only for TCP and UDP over IPv4.


$ ethtool -k em3
Features for em3:
rx-checksumming: on
tx-checksumming: on



net/core/pktgen.c
static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
                    struct pktgen_dev *pkt_dev)
{
...
    if (!(pkt_dev->flags & F_UDPCSUM)) {
        skb->ip_summed = CHECKSUM_NONE;
    } else if (odev->features & (NETIF_F_HW_CSUM | NETIF_F_IP_CSUM)) {
        skb->ip_summed = CHECKSUM_PARTIAL;
        skb->csum = 0;
        udp4_hwcsum(skb, iph->saddr, iph->daddr);
    } else {
        __wsum csum = skb_checksum(skb, skb_transport_offset(skb), datalen + 8, 0);

        /* add protocol-dependent pseudo-header */
        udph->check = csum_tcpudp_magic(iph->saddr, iph->daddr,
                        datalen + 8, IPPROTO_UDP, csum);

        if (udph->check == 0)
            udph->check = CSUM_MANGLED_0;
    }

if (skb->ip_summed == CHECKSUM_PARTIAL)

References:
http://www.tcpipguide.com/free/t_TCPChecksumCalculationandtheTCPPseudoHeader-2.htm
Sample code
https://github.com/bruce690813/checksum
Checksum Offloads in the Linux Networking Stack
https://www.kernel.org/doc/Documentation/networking/checksum-offloads.txt



