Tuesday, December 12, 2017

File I/O

1. scatter/gather I/O (vectored)
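Compared with linear I/O, scatter/gather I/O has several advantages:
(1) a more natural coding pattern (2) efficiency (3) performance (4) atomicity
Inside the Linux kernel, all I/O is vectored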
readv, writev, preadv, pwritev - read or write data into multiple buffers
ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

Each iovec structure describes one independent buffer, called a segment (a set of segments is a vector)
           struct iovec {
               void  *iov_base;   /* Starting address */
               size_t iov_len;      /* Number of bytes to transfer */
           };

iovcnt must be ≥ 0 and ≤ IOV_MAX (1024)
If the sum of the iovcnt iov_len values is greater than SSIZE_MAX, no data is transferred
grep SSIZE_MAX -r /usr/include -n
SSIZE_MAX = 9223372036854775807
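
As a minimal sketch of the calls above, the following gathers three buffers into a pipe with a single writev() and scatters them back into two buffers with a single readv(). The helper name gather_scatter_demo() and the use of a pipe are assumptions made for illustration only.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Gather three segments into a pipe with one writev(), then scatter the
 * bytes back into two caller-supplied buffers with one readv().
 * Returns the number of bytes read back, or -1 on error.
 * (gather_scatter_demo is a hypothetical name used only in this sketch.)
 */
ssize_t gather_scatter_demo(char *head, size_t head_len,
                            char *tail, size_t tail_len)
{
    char a[] = "scatter/";
    char b[] = "gather ";
    char c[] = "I/O";
    struct iovec out[3] = {
        { .iov_base = a, .iov_len = strlen(a) },
        { .iov_base = b, .iov_len = strlen(b) },
        { .iov_base = c, .iov_len = strlen(c) },
    };
    struct iovec in[2] = {
        { .iov_base = head, .iov_len = head_len },
        { .iov_base = tail, .iov_len = tail_len },
    };
    ssize_t total = (ssize_t)(strlen(a) + strlen(b) + strlen(c));
    ssize_t nr = -1;
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    /* One system call writes all three segments, in array order. */
    if (writev(fds[1], out, 3) == total)
        /* One system call fills head completely before touching tail. */
        nr = readv(fds[0], in, 2);

    close(fds[0]);
    close(fds[1]);
    return nr;
}
```

With an 8-byte head and a 16-byte tail, head receives "scatter/" and tail receives "gather I/O"; neither buffer is NUL-terminated by readv().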

2. event poll (Epoll)
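Every call to poll() or select() must supply the complete list of file descriptors to watch, and the kernel must then process every descriptor in that list. When the list grows large (it may contain hundreds or thousands of file descriptors), processing it on every call becomes a scalability bottleneck.
epoll avoids this problem by decoupling the registration of monitors from the actual monitoring of events
(1) The epoll_create1 system call initializes an epoll context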
epoll_create1 - open an epoll file descriptor
int epoll_create1(int flags);

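It returns a file descriptor associated with the new epoll instance. This descriptor has no relationship to a real file; it is only a handle for subsequent calls that use the epoll facility. flags currently has a single valid flag, EPOLL_CLOEXEC, which enables close-on-exec behavior.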

(2) The epoll_ctl system call adds file descriptors to be watched to the context, or removes them from it
epoll_ctl - control interface for an epoll descriptor
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

          typedef union epoll_data {
               void       *ptr;
               int           fd;
               uint32_t   u32;
               uint64_t   u64;
           } epoll_data_t;

           struct epoll_event {
               uint32_t      events;      /* Epoll events */
               epoll_data_t data;        /* User data variable */
           };

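The op parameter specifies the operation to perform on the file associated with fd
EPOLL_CTL_ADD: add a monitor on the file (fd) to the epoll instance (epfd)
EPOLL_CTL_DEL: remove the monitor on the file (fd) from the epoll instance (epfd)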

epoll ET/LT
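The event parameter further describes the behavior of the operation
EPOLLIN: the file can be read from without blocking
EPOLLET: enable edge-triggered behavior for the monitor on the file (the default is level-triggered)
poll() / select(): level-triggered
epoll: level-triggered (default) and edge-triggered
Edge-triggered mode usually requires non-blocking I/O and careful checking for EAGAIN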

(3) The epoll_wait system call performs the actual wait for events.
epoll_wait, epoll_pwait - wait for an I/O event on an epoll file descriptor
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
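
The three steps can be sketched end to end as follows. This is only an illustration under simple assumptions: the watched descriptor is the read end of a pipe, the default level-triggered mode is used, and epoll_pipe_demo() is a hypothetical helper name.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Register the read end of a pipe with epoll, make it readable, and wait.
 * Returns the number of ready descriptors reported by epoll_wait()
 * (1 is expected here), or -1 on error.
 * (epoll_pipe_demo is a hypothetical name used only in this sketch.)
 */
int epoll_pipe_demo(void)
{
    struct epoll_event ev = { 0 };
    struct epoll_event ready[4];
    int fds[2];
    int epfd;
    int n = -1;

    if (pipe(fds) == -1)
        return -1;

    /* (1) create the epoll context */
    epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd == -1)
        goto out_pipe;

    /* (2) register interest in "readable" on the pipe's read end */
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1)
        goto out_epoll;

    /* make the descriptor readable */
    if (write(fds[1], "x", 1) != 1)
        goto out_epoll;

    /* (3) wait up to 1000 ms for events */
    n = epoll_wait(epfd, ready, 4, 1000);
    if (n == 1 &&
        !(ready[0].data.fd == fds[0] && (ready[0].events & EPOLLIN)))
        n = -1; /* an unexpected event was reported */

out_epoll:
    close(epfd);
out_pipe:
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Adding EPOLLET to ev.events would switch this descriptor to edge-triggered mode, in which case the reader should set O_NONBLOCK and drain the descriptor until read() fails with EAGAIN.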

3. memory-mapped I/O (mmap)
mmap, munmap - map or unmap files or devices into memory
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
prot:
PROT_READ  Pages may be read.
PROT_WRITE Pages may be written.
PROT_EXEC  Pages may be executed.

flags: 
MAP_SHARED
Share this mapping. Updates to the mapping are visible to other processes that map this file, and are carried through to the underlying file. The file may not actually be updated until msync(2) or munmap() is called.

MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the mapping are not visible to  other processes mapping the same file, and are not carried through to the underlying file. It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region.
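
To make the difference between the two flags concrete, here is a small sketch that maps the same temporary file once with MAP_SHARED and once with MAP_PRIVATE, writes through both mappings, and then reads the file back. The helper name mmap_flags_demo(), the /tmp path, and the 4096-byte length are assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 0 if only the write made through the MAP_SHARED mapping
 * reached the underlying file, -1 otherwise.
 * (mmap_flags_demo is a hypothetical name used only in this sketch.)
 */
int mmap_flags_demo(void)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";
    char buf[4];
    const size_t len = 4096;
    char *shared = MAP_FAILED;
    char *priv = MAP_FAILED;
    int fd;
    int rc = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                      /* the file lives on until close() */
    if (ftruncate(fd, (off_t)len) == -1)
        goto out;

    shared = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    priv = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (shared == MAP_FAILED || priv == MAP_FAILED)
        goto out;

    memcpy(shared, "AB", 2);           /* carried through to the file */
    if (msync(shared, len, MS_SYNC) == -1)
        goto out;
    memcpy(priv + 2, "CD", 2);         /* copy-on-write: stays private */

    /* The file holds "AB" followed by zero bytes; "CD" never arrives. */
    if (pread(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf) &&
        memcmp(buf, "AB\0\0", sizeof(buf)) == 0)
        rc = 0;

out:
    if (shared != MAP_FAILED)
        munmap(shared, len);
    if (priv != MAP_FAILED)
        munmap(priv, len);
    close(fd);
    return rc;
}
```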

Resizing a mapping
mremap - remap a virtual memory address
void *mremap(void *old_address, size_t old_size,
                    size_t new_size, int flags, ... /* void *new_address */);
mremap() uses the Linux page table scheme. mremap() changes the mapping between virtual addresses and memory pages. This can be used to implement a very efficient realloc(3).

Changing the protection flags of a mapping
mprotect - set protection on a region of memory
int mprotect(void *addr, size_t len, int prot);
prot:   
PROT_NONE  The memory cannot be accessed at all.
PROT_READ  The memory can be read.
PROT_WRITE The memory can be modified.
PROT_EXEC  The memory can be executed.

Synchronizing a file with a mapping
msync - synchronize a file with a memory map
int msync(void *addr, size_t length, int flags);
flags:
MS_SYNC
MS_ASYNC
MS_INVALIDATE

Giving advice about how a mapping is used
madvise - give advice about use of memory
int madvise(void *addr, size_t length, int advice);


4. file advice

5. asynchronous I/O

kqueue

Monday, October 30, 2017

Ftrace - function tracer

ftrace: /sys/kernel/debug/tracing
Ring buffer: /sys/kernel/debug/tracing/trace
debugfs:
mount | grep debugfs
debugfs on /sys/kernel/debug type debugfs (rw,relatime)


/sys/kernel/debug/tracing/
├── available_events
├── available_filter_functions
├── available_tracers
├── buffer_size_kb
├── buffer_total_size_kb
├── current_tracer
├── dyn_ftrace_total_info
├── enabled_functions
├── events
├── free_buffer
├── function_profile_enabled
├── hwlat_detector
├── instances
├── kprobe_events
├── kprobe_profile
├── max_graph_depth
├── options
├── per_cpu
├── printk_formats
├── README
├── saved_cmdlines
├── saved_cmdlines_size
├── saved_tgids
├── set_event
├── set_event_pid
├── set_ftrace_filter
├── set_ftrace_notrace
├── set_ftrace_pid
├── set_graph_function
├── set_graph_notrace
├── snapshot
├── stack_max_size
├── stack_trace
├── stack_trace_filter
├── trace
├── trace_clock
├── trace_marker
├── trace_marker_raw
├── trace_options
├── trace_pipe
├── trace_stat
├── tracing_cpumask
├── tracing_max_latency
├── tracing_on
├── tracing_thresh
├── uprobe_events
└── uprobe_profile

root@instance-1:/sys/kernel/debug/tracing# cat current_tracer
nop
root@instance-1:/sys/kernel/debug/tracing# echo function > current_tracer

root@instance-1:/sys/kernel/debug/tracing# cat available_tracers
hwlat blk mmiotrace function_graph wakeup_dl wakeup_rt wakeup function nop
root@instance-1:/sys/kernel/debug/tracing# cat set_graph_function
#### all functions enabled ####
root@instance-1:/sys/kernel/debug/tracing# cat buffer_size_kb
7 (expanded: 1408)

1. event
available_events: A list of events that can be enabled in tracing.
set_event: Echoing an event name into this file enables that event.
# cat available_events
net:netif_rx_ni_entry
net:netif_rx_entry
net:netif_receive_skb_entry
net:napi_gro_receive_entry
net:napi_gro_frags_entry
net:netif_rx
net:netif_receive_skb
net:net_dev_queue
net:net_dev_xmit
net:net_dev_start_xmit
skb:skb_copy_datagram_iovec
skb:consume_skb
skb:kfree_skb
syscalls:sys_exit_socket
syscalls:sys_enter_socket

# echo 0 > tracing_on
# echo net:netif_receive_skb > set_event
# echo 1 > tracing_on
# cat trace_pipe
# cat trace
# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          <idle>-0     [000] ..s.  4175.515810: netif_receive_skb: dev=ens4 skbaddr=ffff91034b184900 len=52
          <idle>-0     [000] ..s.  4175.556855: netif_receive_skb: dev=ens4 skbaddr=ffff91034b184900 len=52

2. function
available_filter_functions / set_ftrace_filter
This lists the functions that ftrace has processed and can trace.
These are the function names that you can pass to
"set_ftrace_filter" or "set_ftrace_notrace".
(See the "dynamic ftrace" section of ftrace.txt for more details.)

# cat available_filter_functions
netif_receive_skb
ip_rcv_finish
ip_rcv
ip_local_deliver_finish
ip_local_deliver
ip_forward_finish
ip_forward
ip_output
ip_finish_output2
ip_finish_output
dev_queue_xmit
dev_hard_start_xmit
[...]

3. kprobe: Kernel space;
kprobe_events: Enable dynamic trace points.
kprobe_profile: Dynamic trace points stats.

4. uprobe: User space; See uprobetrace.txt
uprobe_events: Add dynamic tracepoints in programs.
uprobe_profile: Uprobe statistics.

tracer: ftrace, perf, systemtap
debugger: gdb

References:
Debugging the kernel using Ftrace - part 1

https://www.kernel.org/doc/Documentation/trace/ftrace.txt
kernel/linux-4.13/kernel/trace/ftrace.c
http://www.brendangregg.com/blog/2015-07-08/choosing-a-linux-tracer.html
https://www.ibm.com/developerworks/cn/linux/1609_houp_ftrace/index.html

Thursday, October 19, 2017

Exclusive or: XOR


A ⊕ 1 = A'; // toggles specific bits
A ⊕ 0 = A;
A ⊕ A = 0; // quickly compares two values; resets a variable
A ⊕ A' = 1;

➠ Set a variable to zero:  a ^= a
Quickly compare two values:  if ((a ^ b) == 0) {}

include/linux/etherdevice.h
/**
 * ether_addr_equal - Compare two Ethernet addresses
 * @addr1: Pointer to a six-byte array containing the Ethernet address
 * @addr2: Pointer other six-byte array containing the Ethernet address
 *
 * Compare two Ethernet addresses, returns true if equal
 *
 * Please note: addr1 & addr2 must both be aligned to u16.
 */
static inline bool ether_addr_equal(const u8 *addr1, const u8 *addr2)
{
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
    u32 fold = ((*(const u32 *)addr1) ^ (*(const u32 *)addr2)) |
           ((*(const u16 *)(addr1 + 4)) ^ (*(const u16 *)(addr2 + 4)));

    return fold == 0;

#else
    const u16 *a = (const u16 *)addr1;
    const u16 *b = (const u16 *)addr2;

    return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) == 0;

#endif
}

➠ Toggling specific bits
status ^= 1 << x;                       //Toggling a bit
status |= 1 << x;                       //Setting a bit
status &= ~(1 << x);                    //Clearing a bit
bit = (status >> x) & 1;                //Checking a bit
status ^= (-x ^ status) & (1 << n);     //Changing the nth bit to x (x must be 0 or 1)
unsigned int a, b, mask = 1 << 6;
a = 0xA1; // 10100001
b = a ^ mask; /* flip bit 6 (counting from 0): b == 0xE1, 11100001 */
https://stackoverflow.com/questions/47981/how-do-you-set-clear-and-toggle-a-single-bit
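
The one-line idioms above can be wrapped in small helpers so they are easier to test. The function names below are hypothetical, and the sketch assumes 32-bit unsigned values with a bit index n in the range 0..31.

```c
#include <stdint.h>

/* Hypothetical helper names; each wraps one of the single-bit idioms. */

/* Flip bit n. */
static inline uint32_t bit_toggle(uint32_t v, unsigned n)
{
    return v ^ (UINT32_C(1) << n);
}

/* Force bit n to 1. */
static inline uint32_t bit_set(uint32_t v, unsigned n)
{
    return v | (UINT32_C(1) << n);
}

/* Force bit n to 0. */
static inline uint32_t bit_clear(uint32_t v, unsigned n)
{
    return v & ~(UINT32_C(1) << n);
}

/* Return bit n as 0 or 1. */
static inline unsigned bit_test(uint32_t v, unsigned n)
{
    return (v >> n) & 1u;
}

/* Force bit n to x, where x must be 0 or 1. */
static inline uint32_t bit_assign(uint32_t v, unsigned n, unsigned x)
{
    /* -x is all ones when x == 1 and all zeros when x == 0 */
    return v ^ ((-(uint32_t)x ^ v) & (UINT32_C(1) << n));
}
```

Using unsigned types avoids the undefined behavior of shifting a 1 into the sign bit of a signed int.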

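Is the number of 1s in a binary number odd or even? (Parity Check)
Example: is the number of 1s in 10100001 odd or even?
Answer: 1 ^ 0 ^ 1 ^ 0 ^ 0 ^ 0 ^ 0 ^ 1 = 1.
A result of 1 means an odd number of 1s; a result of 0 means an even number of 1s.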
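
Rather than XOR-ing one bit at a time, the same result can be obtained by repeatedly folding the word onto itself. This is a sketch of that folding approach, not code from the original notes, and parity32() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Returns 1 if v contains an odd number of 1 bits, 0 if even.
 * Each step XORs the upper half of the remaining bits into the lower
 * half, so after five steps bit 0 holds the XOR of all 32 bits.
 */
unsigned parity32(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}
```

For the example value 10100001 (0xA1), which has three 1 bits, parity32(0xA1) evaluates to 1.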

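Checksum and recovery
if (a ^ b == c) then a ^ c == b
RAID5: build a RAID5 array from three disks (A, B, C). When writing, split the data into two parts, write them to disk A and disk B, and write the result of A ^ B to disk C.
When reading data from A, B ^ C can be used to verify it.
When disk A fails, B ^ C can also be used to recover the data on disk A.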
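
A byte-wise version of this idea can be sketched as below. It models each disk as a plain byte buffer, which is a simplification for illustration, and xor_blocks() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * dst[i] = x[i] ^ y[i] for every byte.
 * Called with (C, A, B) it computes the parity block;
 * called with (A', B, C) it rebuilds a lost block A.
 */
void xor_blocks(uint8_t *dst, const uint8_t *x, const uint8_t *y, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = x[i] ^ y[i];
}
```

The same function serves both directions because XOR is its own inverse: (A ^ B) ^ B = A.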

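➠ [Interview] XOR swap  #define swap(a,b) do { (a) ^= (b); (b) ^= (a); (a) ^= (b); } while (0)
Swaps two values without declaring a new variable (a and b must be distinct objects, otherwise the value is zeroed)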
x = a ^ b
y = x ^ b
z = x ^ y

y = (a ^ b) ^ b = a ^ b ^ b = a ^ (b ^ b) = a ^ 0 = a
z = (a ^ b) ^ a = a ^ b ^ a = (a ^ a) ^ b = 0 ^ b = b
https://en.wikipedia.org/wiki/XOR_swap_algorithm
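
A function form makes the aliasing pitfall explicit: if both pointers refer to the same object, the first XOR zeroes it and the value is lost. The sketch below guards against that case; xor_swap() is a hypothetical name.

```c
/*
 * Swap *a and *b without a temporary.
 * The early return matters: when a == b, "*a ^= *b" would set the
 * object to zero and the original value could not be recovered.
 */
void xor_swap(int *a, int *b)
{
    if (a == b)
        return;
    *a ^= *b;   /* *a now holds a ^ b */
    *b ^= *a;   /* *b now holds the original *a */
    *a ^= *b;   /* *a now holds the original *b */
}
```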

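➠ [Interview] Swapping the odd and even bits of a binary number
#define N(n) ((((n) << 1) & 0xAAAAAAAA) | (((n) >> 1) & 0x55555555))
Swaps the odd and even bits of an int value
For example, 6 is 00000110 in binary; after the swap it becomes 00001001, so the output should be 9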
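
The same expression written as a function on an unsigned 32-bit value, which sidesteps the sign-extension questions a signed int would raise. swap_odd_even_bits() is a hypothetical name.

```c
#include <stdint.h>

/*
 * 0x55555555 selects bits 0, 2, 4, ... and 0xAAAAAAAA selects bits
 * 1, 3, 5, ...; shifting each group by one position in opposite
 * directions exchanges every adjacent pair of bits.
 */
uint32_t swap_odd_even_bits(uint32_t n)
{
    return ((n << 1) & 0xAAAAAAAAu) | ((n >> 1) & 0x55555555u);
}
```

Applying the function twice returns the original value, since each pair is swapped back.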

➠ OddOccurrencesInArray
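An array holds a number of integers. One value occurs an odd number of times and every other value occurs an even number of times. Find the value that occurs an odd number of times.
Commutativity: A ^ B = B ^ A
Associativity: A ^ (B ^ C) = (A ^ B) ^ C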

A ^ B ^ C ^ B ^ C ^ D ^ A

= A ^ A ^ B ^ B ^ C ^ C ^ D
= 0 ^ 0 ^ 0 ^ D
= 0 ^ D
= D
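int solution(int A[], int N)
{
    int iterator;
    int result;
    /* XOR all elements together: equal pairs cancel, the odd one remains */
    for (result = A[0], iterator = 1; iterator < N; result ^= A[iterator], iterator++);
    return (result);
}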
https://github.com/avrahamcohen/CodilitySolutions/blob/master/OddOccurrencesInArray.c

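Problem: in an integer array every number appears twice except for one number. Find that number.
Problem: in an integer array every number appears twice except for two numbers. Find those two numbers.
Problem: in an integer array every number appears twice except for three numbers. Find those three numbers.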
https://www.lijinma.com/blog/2014/05/29/amazing-xor/
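
For the two-number variant, one common approach (sketched here as an assumption about the intended solution, not taken from the linked post) is to XOR everything to get p ^ q, pick any bit where p and q differ, and use that bit to split the array into two groups that each contain exactly one of the singles. find_two_singles() is a hypothetical name.

```c
#include <stddef.h>

/*
 * v holds n values in which every value appears twice except two
 * distinct values p and q. Stores them in *first and *second
 * (in no particular order).
 */
void find_two_singles(const unsigned *v, size_t n,
                      unsigned *first, unsigned *second)
{
    unsigned all = 0;
    unsigned low;
    unsigned x = 0;
    size_t i;

    for (i = 0; i < n; i++)
        all ^= v[i];            /* pairs cancel: all == p ^ q */

    low = all & (0u - all);     /* lowest bit in which p and q differ */

    for (i = 0; i < n; i++)
        if (v[i] & low)
            x ^= v[i];          /* pairs cancel again inside this group */

    *first = x;
    *second = all ^ x;
}
```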

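➠ The numbers 1 to 1000 are stored in an array of 1001 elements. Exactly one value is duplicated; every other value appears only once.
Each array element may be accessed only once, and no auxiliary storage may be used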
http://www.atove.com/Article/Details/635F056EDD0150910AEC759527EFD539
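
One way to meet both constraints is to XOR every element together with every number from 1 to 1000; each value then appears an even number of times except the duplicate, which appears three times. This is a sketch of that idea, and find_duplicate() is a hypothetical name.

```c
/*
 * v holds the values 1..n-1 plus one repeated value, n elements in all
 * (n == 1001 in the problem statement). Returns the repeated value.
 */
int find_duplicate(const int *v, int n)
{
    int acc = 0;

    for (int i = 0; i < n; i++) {
        acc ^= v[i];        /* each element is read exactly once */
        if (i > 0)
            acc ^= i;       /* fold in the numbers 1..n-1 */
    }
    return acc;
}
```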

Wednesday, October 18, 2017

[Codility] Iterations: BinaryGap

Find longest sequence of zeros in binary representation of an integer.


A binary gap within a positive integer N is any maximal sequence of consecutive zeros that is surrounded by ones at both ends in the binary representation of N.
For example, number 9 has binary representation 1001 and contains a binary gap of length 2. The number 529 has binary representation 1000010001 and contains two binary gaps: one of length 4 and one of length 3. The number 20 has binary representation 10100 and contains one binary gap of length 1. The number 15 has binary representation 1111 and has no binary gaps.
Write a function:
int solution(int N);
that, given a positive integer N, returns the length of its longest binary gap. The function should return 0 if N doesn't contain a binary gap.
For example, given N = 1041 the function should return 5, because N has binary representation 10000010001 and so its longest binary gap is of length 5.
Assume that:
  • N is an integer within the range [1..2,147,483,647].
Complexity:
  • expected worst-case time complexity is O(log(N));
  • expected worst-case space complexity is O(1).

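int solution(int N)
{
    int i;
    int first = 0, gap = 0, max_value = 0;

    /* N <= 2,147,483,647, so only bits 0..30 can be set;
       this also avoids shifting a 1 into the sign bit */
    for (i = 0; i < 31; i++) {

        if ((N >> i) & 1) {
            /* a 1 closes the current gap, if one was open */
            if (first) {
                max_value = (gap > max_value) ? gap : max_value;
                gap = 0;
            }
            first = 1;
        } else if (first) {
            /* zeros only count once the first 1 has been seen */
            gap++;
        }
    }

    return max_value;

}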

https://codility.com/programmers/lessons/1-iterations/

Thursday, October 12, 2017

DPDK KNI interface

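Because DPDK bypasses the kernel network stack, its ports do not show up in ifconfig, so tcpdump cannot be used to debug them.
A DPDK application creates a virtual network interface (vEthX) through the Kernel Network Interface (KNI);
only then can it use the regular Linux kernel TCP/IP stack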

The benefits of using the DPDK KNI are:
➠ Faster than existing Linux TUN/TAP interfaces (by eliminating system calls and copy_to_user()/copy_from_user() operations).
➠ Allows management of DPDK ports using standard Linux net tools such as ethtool, ifconfig and tcpdump.
➠ Allows an interface with the kernel network stack.

Elements of KNI in DPDK



Kernel Module (rte_kni.ko)
CONFIG_RTE_KNI_KMOD
lib/librte_eal/linuxapp/kni
lib/librte_eal/linuxapp/kni/ethtool
├── compat.h
├── ethtool
├── kni_dev.h
├── kni_ethtool.c
├── kni_fifo.h
├── kni_misc.c
└── kni_net.c
//private information for a kni device
struct kni_dev
struct kni_net

//
struct rte_kni_request
struct rte_kni_fifo
struct rte_kni_mbuf
struct rte_kni_device_info
Static Library (librte_kni.a)
CONFIG_RTE_LIBRTE_KNI
lib/librte_kni
├── rte_kni.c
├── rte_kni_fifo.h
└── rte_kni.h
// KNI context
struct rte_kni

//Structure for configuring KNI device.
struct rte_kni_conf

//Structure which has the function pointers for KNI interface.
struct rte_kni_ops

//KNI memzone pool slot
struct rte_kni_memzone_slot

//KNI memzone pool
struct rte_kni_memzone_pool

//
struct rte_kni_fifo
struct rte_kni_device_info
struct rte_mempool
struct rte_memzone
struct rte_mbuf
rte_kni.c
rte_kni_init()
librte_eal (librte_eal.a)
lib/librte_eal/common/include
rte_memzone.h
//A structure describing a memzone, which is a contiguous portion of physical memory identified by a name.
struct rte_memzone
linuxapp/eal/
eal.c
//Launch threads, called at application init().
rte_eal_init()
common/
eal_common_launch.c
rte_eal_mp_remote_launch()
librte_mempool (librte_mempool.a)
lib/librte_mempool
rte_mempool.h
//The RTE mempool structure.
struct rte_mempool
librte_mbuf (librte_mbuf.a)
lib/librte_mbuf
rte_mbuf.h
//The generic rte_mbuf, containing a packet mbuf.
struct rte_mbuf

rte_pktmbuf_alloc()
rte_mbuf.c
//helper to create a mbuf pool
rte_pktmbuf_pool_create()
//Free a packet mbuf back into its original mempool.
rte_pktmbuf_free()
librte_ether (librte_ethdev.a)
lib/librte_ether
rte_ethdev.h
//A structure used to configure an Ethernet port. Depending upon the RX multi-queue mode, extra advanced configuration settings may be needed.
struct rte_eth_conf

//A structure used to retrieve link-level information of an Ethernet port.
struct rte_eth_link

//Ethernet device information
struct rte_eth_dev_info

rte_eth_rx_burst()
rte_ethdev.c
rte_eth_dev_count()
rte_eth_dev_configure()
rte_eth_dev_socket_id()
rte_eth_rx_queue_setup()
rte_eth_tx_queue_setup()
rte_eth_promiscuous_enable()
rte_eth_link_get_nowait()
rte_eth_dev_start()
rte_eth_dev_stop()
rte_eth_dev_info_get()
Drivers (librte_pmd_kni.a)
CONFIG_RTE_LIBRTE_PMD_KNI
drivers/net/kni
└── rte_eth_kni.c
struct eth_kni_args
struct pmd_queue_stats
struct pmd_queue
Others (rte_kni.ko / librte_kni.a)
lib/librte_eal/linuxapp/eal/include/exec-env
└── rte_kni_common.h
struct
//Structure for KNI request.
  rte_kni_request
//Fifo struct mapped in a shared memory.
  rte_kni_fifo
//The kernel image of the rte_mbuf struct
  rte_kni_mbuf
//Struct used to create a KNI device.
//Passed to the kernel in IOCTL call
  rte_kni_device_info
DPDK Application (kni)
CONFIG_RTE_LIBRTE_KNI
examples/kni
└── main.c
//Structure of port parameters
struct kni_port_params

//Structure type for recording kni interface specific stats
struct kni_interface_stats

//
struct rte_mempool
struct rte_eth_conf
struct rte_eth_link
struct rte_eth_dev_info
struct rte_mbuf
struct rte_kni_conf
struct rte_kni_ops

Kernel Module
# wget http://fast.dpdk.org/rel/dpdk-17.05.2.tar.xz
# tar -Jxvf dpdk-17.05.2.tar.xz
# cd dpdk-stable-17.05.2/
# make config T=x86_64-native-linuxapp-gcc
Configuration done
# make install T=x86_64-native-linuxapp-gcc
# insmod ./x86_64-native-linuxapp-gcc/kmod/rte_kni.ko kthread_mode=multiple
DPDK Application (kni)
# export RTE_SDK=`pwd`
# export RTE_TARGET=x86_64-native-linuxapp-gcc
# make -C examples/kni
--config="(port,lcore_rx, lcore_tx[,lcore_kthread, ...]) [, port,lcore_rx, lcore_tx[,lcore_kthread, ...]]":
# ./examples/kni/build/app/kni -c 0xFFFFF -n 4 -- -P -p 0x3 --config="(0,0,1),(1,2,3)"

ifconfig -a
vEth0     Link encap:Ethernet  HWaddr 52:3c:fd:56:18:63
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:29 errors:0 dropped:3 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1993 (1.9 KB)  TX bytes:0 (0.0 B)

vEth1     Link encap:Ethernet  HWaddr 56:54:a3:50:02:0c
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:23 errors:0 dropped:3 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1577 (1.5 KB)  TX bytes:0 (0.0 B)
ifconfig vEth0 up; ifconfig vEth1 up
top
...
%Cpu0  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 77109 root      20   0 36.178g   6220   3408 R 400.0  0.0 575:00.07 kni
 77128 root      20   0       0      0      0 S   1.0  0.0   1:17.20 kni_vEth0
 77136 root      20   0       0      0      0 S   0.7  0.0   1:16.38 kni_vEth1
tcpdump -i vEth0

Test 
    ➠ test/test/ 
./test/test/test_kni.c


DPDK KNI Kernel Module (rte_kni.ko)

# modinfo rte_kni.ko
filename:       /home/bh0322/workspace/dpdk-stable-17.05.2/./build/kmod/rte_kni.ko
description:    Kernel Module for managing kni devices
author:          Intel Corporation
license:         Dual BSD/GPL
srcversion:     C1BCE1852D37B5F833BB878
depends:
vermagic:       3.16.0-30-generic SMP mod_unload modversions
parm:           lo_mode:KNI loopback mode (default=lo_mode_none):
    lo_mode_none        Kernel loopback disabled
    lo_mode_fifo          Enable kernel loopback with fifo
    lo_mode_fifo_skb    Enable kernel loopback with fifo and skb buffer
 (charp)
parm:           kthread_mode:Kernel thread mode (default=single):
    single     Single kernel thread mode enabled.
    multiple  Multiple kernel thread mode enabled.

➠ Transmit a packet (called by the kernel)
static int
kni_net_tx(struct sk_buff *skb, struct net_device *dev)
(1) dequeue a mbuf from alloc_q
(2) enqueue mbuf into tx_q
(3) Free skb and update statistics
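
The three steps can be modeled in user space as follows. This is not the rte_kni.ko source: every demo_* name is hypothetical, the FIFOs are simplified single-threaded rings, and the skb is replaced by a plain byte buffer, so there is nothing to free in step (3).

```c
#include <stddef.h>
#include <string.h>

#define DEMO_FIFO_LEN 8u        /* assumed ring size, a power of two */
#define DEMO_BUF_SIZE 2048u     /* assumed mbuf data room */

/* Hypothetical stand-ins for rte_kni_mbuf and rte_kni_fifo. */
struct demo_mbuf { size_t len; unsigned char data[DEMO_BUF_SIZE]; };
struct demo_fifo { unsigned head, tail; struct demo_mbuf *slot[DEMO_FIFO_LEN]; };
struct demo_stats { unsigned long tx_packets, tx_bytes, tx_dropped; };

int demo_fifo_put(struct demo_fifo *f, struct demo_mbuf *m)
{
    if (f->head - f->tail == DEMO_FIFO_LEN)
        return -1;                              /* ring is full */
    f->slot[f->head++ & (DEMO_FIFO_LEN - 1)] = m;
    return 0;
}

struct demo_mbuf *demo_fifo_get(struct demo_fifo *f)
{
    if (f->head == f->tail)
        return NULL;                            /* ring is empty */
    return f->slot[f->tail++ & (DEMO_FIFO_LEN - 1)];
}

/* Returns 0 when the packet was queued, -1 when it was dropped. */
int demo_net_tx(struct demo_fifo *alloc_q, struct demo_fifo *tx_q,
                const void *pkt, size_t len, struct demo_stats *st)
{
    struct demo_mbuf *m;

    /* Drop early if the packet cannot fit or tx_q has no room. */
    if (len > DEMO_BUF_SIZE || tx_q->head - tx_q->tail == DEMO_FIFO_LEN)
        goto drop;

    m = demo_fifo_get(alloc_q);                 /* (1) take a free mbuf */
    if (m == NULL)
        goto drop;

    memcpy(m->data, pkt, len);                  /* copy the payload */
    m->len = len;
    demo_fifo_put(tx_q, m);                     /* (2) room was checked above */

    st->tx_packets++;                           /* (3) update statistics */
    st->tx_bytes += len;
    return 0;

drop:
    st->tx_dropped++;
    return -1;
}
```

The point of the model is the ownership flow: empty buffers travel to the transmit side through alloc_q, and filled buffers travel back through tx_q, so no buffer is allocated on the transmit path itself.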

➠ Struct used to create a KNI device. Passed to the kernel in IOCTL call
lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
struct rte_kni_device_info
(1) The interface name.
(2) Physical addresses of the corresponding memzones for the relevant FIFOs.
(3) Mbuf mempool details, both physical and virtual (to calculate the offset for mbuf pointers).
(4) PCI information.
(5) Core affinity (force_bind, core_id).

# ls -al /dev/kni
crw------- 1 root root 10, 57 Sep 26 12:15 /dev/kni
# cat /proc/misc
57 kni  //minor number

# dmidecode -t memory | grep Size

Alternative Solutions

➠ Tun/Tap  
➠ TAP PMD (patch sent recently)  
➠ af_packet  
➠ virtio-user + vhost-net  
➠ Bifurcated driver

System Environment

Server Platform: Dell PowerEdge R630
CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz Number of cores 20
Memory: 256 GB total over 24 channels @ 2133 MHz
NICs: 2x Intel® 82599ES 10-Gigabit SFI/SFP+ Network Connection
Driver ixgbe DPDK PMD
Operating System: Ubuntu 14.04.2 LTS
Linux kernel version: 3.16.0-30-generic
GCC version: Ubuntu 4.8.4-2ubuntu1~14.04.3
DPDK version: 17.05.2


References