
NVMe Native vs NVMe Over Fabrics

6th October 2016

Update

The results in this article were updated on 30th November: the Ethernet tests were re-run using a Dell Force10 S6000 switch.

Background

It's a technology that has been getting a lot of press recently. So what exactly is "NVMe over fabrics"?

NVMe over fabrics (NVMeF for short) allows high performance NVMe flash devices to be connected across RDMA capable networks.

At present that means RDMA based Ethernet or InfiniBand.

This is the first truly new block based networked storage technology to be developed in over 20 years. Since the introduction of Fibre Channel in 1994, all block based network storage has been SCSI based.

NVMeF allows connected devices to function as if they are locally attached to the system. You are no longer restricted to installing NVMe devices locally inside each system. You can create a shared pool of NVMe devices to be accessed by different systems.
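
As a rough illustration of what this looks like from the initiator side on Linux, using the in-kernel NVMeF host driver and the nvme-cli tool (the address, port and subsystem name below are placeholders, not the configuration used in these tests):

# Load the RDMA transport for the NVMe host (initiator) side.
modprobe nvme-rdma

# Ask the remote target what it exports, then connect to a subsystem.
nvme discover -t rdma -a 192.168.0.10 -s 4420
nvme connect -t rdma -n nqn.2016-10.com.example:nvme1 -a 192.168.0.10 -s 4420

# The remote namespace now shows up as a local block device
# (e.g. /dev/nvme1n1) and can be used like a locally installed drive.
nvme list

Once connected, the initiator issues NVMe commands directly to the remote device over RDMA, with no SCSI translation layer in the path.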

Ethernet or InfiniBand networks of up to 100 Gigabit can be used for NVMeF.

Put simply, it is the fastest network based storage available today, bar none.

So, the big question.

If we connect an NVMe device over Ethernet or InfiniBand, how much performance do we lose compared to running the device natively?

Hardware

For both the initiator and the target, we used systems with a single Xeon E5-1620 v4 CPU @ 3.50GHz and 16GB of RAM.

An Intel DC P3700 400GB PCIe NVMe device was used. Part number: SSDPEDMD400G401. Firmware version 8DV10171.

Ethernet

For the Ethernet tests, we used Mellanox ConnectX-3 Pro adapters running at 40 Gigabit. Part number: MCX314A-BCCT. Firmware version 2.40.5000.

A Dell Force10 S6000 switch was used.

InfiniBand

For the InfiniBand tests, we used Mellanox ConnectX-3 adapters running at 56 Gigabit (FDR). Part number: MCX354A-FCBT. Firmware version 2.40.5000.

A Mellanox SX6036 switch was used.

Software

Ubuntu 14.04 LTS was used as the OS for both target and initiator.

Linux kernel version 4.8.10 was used.

We used FIO version 2.15 for the benchmark.

The Test

For the native test, we ran FIO on the system where the NVMe device was installed.

For the Ethernet and InfiniBand tests, we connected the initiator to the target over the relevant fabric, and the exported device appeared on the initiator as a native NVMe device.
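
For reference, on this kernel the NVMeF target can be set up through the nvmet configfs interface. A minimal sketch of exporting an NVMe device over RDMA looks something like this (the subsystem name, device path and IP address are placeholders rather than our exact configuration):

# Load the NVMeF target RDMA transport (this pulls in the core nvmet module).
modprobe nvmet-rdma

# Create a subsystem and allow any host to connect to it.
SUBSYS=/sys/kernel/config/nvmet/subsystems/nqn.2016-10.com.example:nvme1
mkdir -p $SUBSYS
echo 1 > $SUBSYS/attr_allow_any_host

# Export the local NVMe device as namespace 1 of that subsystem.
mkdir $SUBSYS/namespaces/1
echo /dev/nvme0n1 > $SUBSYS/namespaces/1/device_path
echo 1 > $SUBSYS/namespaces/1/enable

# Create an RDMA port on the fabric-facing address and bind the subsystem to it.
PORT=/sys/kernel/config/nvmet/ports/1
mkdir $PORT
echo rdma > $PORT/addr_trtype
echo ipv4 > $PORT/addr_adrfam
echo 192.168.0.10 > $PORT/addr_traddr
echo 4420 > $PORT/addr_trsvcid
ln -s $SUBSYS $PORT/subsystems/

The initiator then connects with nvme-cli as shown earlier, and the exported namespace appears to it as a regular /dev/nvme* block device.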

The test is direct to the device. No filesystem or volume manager is used.

We ran each test a total of 20 times and then averaged them.

Both the read and write tests are random, not sequential. Queue depth is 32 per worker.

This is the FIO test that was run:

fio -name [device] -rw=[test] -bs=[blocksize] -runtime=2 -numjobs=[1/2/4] -ioengine=libaio \
-direct=1 -iodepth=32 -thread -minimal -group_reporting -output=[log]
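
To make the placeholders concrete, here is roughly what one run of the 4K random read test with 2 workers looks like, repeated 20 times as described above (the device path and log file names are illustrative):

for i in $(seq 1 20); do
    fio -name /dev/nvme0n1 -rw=randread -bs=4k -runtime=2 -numjobs=2 -ioengine=libaio \
        -direct=1 -iodepth=32 -thread -minimal -group_reporting -output=randread-4k-2w-$i.log
done

The -minimal flag produces fio's terse, semicolon-separated output, which makes it easy to pull the bandwidth, IOPS and latency fields out of the 20 log files and average them.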

Availability

We are adding NVMeF target support to version 3.28 of Zetavault. It is due for release in December.

For now, we have the following benchmarks to share with you.

The Results

Throughput values are in MB/s.

Latency values are in milliseconds (ms).
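
As a quick sanity check when reading the tables, the three metrics are tied together: throughput is IOPS multiplied by the block size (the figures line up if 1 MB is taken as 2^20 bytes, which matches how this version of FIO reports bandwidth), and average latency is roughly the number of outstanding I/Os (queue depth multiplied by the number of workers) divided by IOPS. For example, using the native 4K random read result with 1 worker:

# 221,207 IOPS at a 4K block size with a queue depth of 32
awk 'BEGIN { iops = 221207; bs = 4096; qd = 32;
             printf "%.1f MB/s  %.2f ms\n", iops * bs / 1048576, qd / iops * 1000 }'

This prints 864.1 MB/s and 0.14 ms, which matches the throughput and latency reported for that test below.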

 

Random Read — 1 Worker

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          314.2       204.0       222.6
4K          864.1       731.1       754.8
16K         1,782.4     1,726.2     1,693.6
64K         2,268.6     2,248.1     2,252.2

IOPS

Block size  Native      Ethernet    InfiniBand
1K          321,769     208,852     227,968
4K          221,207     187,173     193,235
16K         114,071     110,475     108,389
64K         36,296      35,969      36,035

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.10        0.15        0.14
4K          0.14        0.17        0.16
16K         0.28        0.29        0.29
64K         0.87        0.88        0.88

 

Random Write — 1 Worker

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          159.4       161.5       160.2
4K          965.6       840.4       794.5
16K         1,003.2     924.3       1,013.5
64K         925.4       1,027.7     943.4

IOPS

Block size  Native      Ethernet    InfiniBand
1K          163,188     165,415     164,045
4K          247,187     215,136     203,385
16K         64,204      59,157      64,861
64K         14,806      16,443      15,093

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.19        0.19        0.19
4K          0.13        0.15        0.15
16K         0.49        0.54        0.49
64K         2.16        1.93        2.12

 

Random Read — 2 Workers

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          447.5       336.4       339.2
4K          1,438.9     1,184.1     1,201.6
16K         2,260.6     2,258.4     2,235.8
64K         2,342.2     2,268.1     2,312.2

IOPS

Block size  Native      Ethernet    InfiniBand
1K          458,258     344,494     347,340
4K          368,349     303,137     307,607
16K         144,679     144,539     143,088
64K         37,475      36,290      36,994

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.14        0.18        0.18
4K          0.17        0.21        0.20
16K         0.44        0.44        0.44
64K         1.70        1.91        1.72

 

Random Write — 2 Workers

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          156.3       156.4       150.5
4K          931.1       943.4       911.4
16K         1,010.5     957.9       1,019.2
64K         921.9       922.7       1,015.8

IOPS

Block size  Native      Ethernet    InfiniBand
1K          159,998     160,158     154,057
4K          238,365     241,501     233,317
16K         64,671      61,304      65,227
64K         14,750      14,763      16,252

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.39        0.18        0.18
4K          0.27        0.21        0.20
16K         0.98        0.44        0.44
64K         4.35        1.91        1.72

 

Random Read — 4 Workers

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          510.9       506.9       498.4
4K          1,821.7     1,842.0     1,625.5
16K         2,347.0     2,348.8     2,334.6
64K         2,277.3     2,219.5     2,305.3

IOPS

Block size  Native      Ethernet    InfiniBand
1K          523,194     519,014     510,367
4K          466,365     471,544     416,137
16K         150,208     150,323     149,417
64K         36,437      35,511      36,884

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.24        0.24        0.25
4K          0.27        0.27        0.30
16K         0.85        0.85        0.85
64K         3.50        3.80        3.46

 

Random Write — 4 Workers

Throughput (MB/s)

Block size  Native      Ethernet    InfiniBand
1K          154.1       143.3       139.8
4K          927.8       951.0       985.9
16K         1,033.2     947.4       1,036.3
64K         887.6       1,025.0     965.5

IOPS

Block size  Native      Ethernet    InfiniBand
1K          157,776     146,738     143,167
4K          237,528     243,451     252,382
16K         66,124      60,632      66,320
64K         14,202      16,399      15,447

Latency (ms)

Block size  Native      Ethernet    InfiniBand
1K          0.80        0.87        0.89
4K          0.54        0.52        0.50
16K         1.93        2.12        1.92
64K         9.01        7.79        8.32

Observations

If multiple workers are running (think multiple virtual machines), it is possible to get very close to native performance.

While the InfiniBand network we used has more usable bandwidth (54.3 Gigabit vs 40 Gigabit), the NVMe device is not capable of getting anywhere near that level of throughput, so the lower Ethernet bandwidth is not an issue in this test. It would become an issue with higher-performing NVMe devices that can exceed 5 GB/s, which is roughly the raw bandwidth of 40 Gigabit Ethernet.

Conclusion

NVMeF is quickly becoming a commodity technology, if it isn't one already.

It's the first genuinely new block based network storage technology in a long time, with no ties to legacy SCSI based SANs.

As with SCSI based network storage, vendors are free to build NVMeF products without having to license the technology. Expect an explosion of vendors shipping NVMeF based storage or integrating it into their existing storage products.

NVMeF will enable the creation of the next generation of high performance network storage. Think greater than 10 GB/s throughput and well over a million IOPS.

Bring it on.