Zetavault Blog

Insights, comments, tips and random ramblings.

iSCSI vs iSER vs SRP on Ethernet & InfiniBand

21st September 2016

Background

We use SRP (RDMA based SCSI over InfiniBand) to build ZFS clusters from multiple nodes.

We tested iSER — an alternative RDMA based SCSI transport — several years ago. A lot has happened since then, so we wanted to retest. Has iSER closed the gap? Or is SRP still superior?

We also wanted to know how viable iSER is on Ethernet. Does it work? Is it any good?

To make this blog more relevant, we also tested standard iSCSI on 10 GbE and 40 GbE.

SRP

Stands for "SCSI RDMA Protocol" [link].

Allows an initiator to access a SCSI target (and its devices) via RDMA. We've been using it in production systems since 2008. We've only ever run it on an InfiniBand network. Getting it to work on an Ethernet network would be very difficult, and it's not something we have ever attempted.

We use it to connect head nodes with storage nodes so we can build large (and small) ZFS based clusters. It works beautifully with ZFS since each SRP disk works exactly as if it was plugged directly into the head node. In fact, even better: because the disk is served by a target, we can do SCSI-3 persistent reservations on the disks.

Target support is provided by SCST and LIO on Linux, and COMSTAR on Solaris/illumos based systems.

SRP was first published as a standard in 2002. It's very mature and completely stable. As Ashford & Simpson said, "solid as a rock" [link].

iSER

Stands for "iSCSI Extensions for RDMA" [link]. It basically extends the iSCSI protocol to include RDMA support.

With iSER based setups, you use the same software as IP based iSCSI. That is, you use the same initiator and target. You just configure the initiator side to use an iSER interface when performing iSCSI discovery. You don't need to learn anything new. As long as your hardware, initiator and target support iSER, it will work.
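As a rough illustration of "same initiator, different interface", the sketch below builds the two Open-iSCSI command lines typically used for discovery and login over iSER. The portal address and target IQN are made-up placeholders, and the flags are the commonly documented `iscsiadm` ones; check the man page for your version before relying on them.

```python
import shlex


def iser_login_commands(portal, target_iqn, iface="iser"):
    """Build iscsiadm command lines for discovery and login via an iSER iface.

    Nothing is executed here; the commands are only returned as argument lists.
    """
    discovery = [
        "iscsiadm", "-m", "discovery", "-t", "sendtargets",
        "-p", portal, "-I", iface,
    ]
    login = [
        "iscsiadm", "-m", "node", "-T", target_iqn,
        "-p", portal, "-I", iface, "--login",
    ]
    return [discovery, login]


if __name__ == "__main__":
    # Hypothetical portal and IQN, for illustration only.
    for cmd in iser_login_commands("192.168.10.20:3260",
                                   "iqn.2016-09.com.example:ramdisk"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing `iface="default"` instead would produce the equivalent commands for ordinary IP based iSCSI, which is the only difference on the initiator side.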

Unlike SRP, you can run it on Ethernet.

Target support is provided by SCST and LIO on Linux, COMSTAR on Solaris/illumos and StarWind on Windows.

The protocol specifications were first published in 2004. Like SRP, it is very mature and completely stable.

Hardware

For the initiator, we used a system with a single Xeon E3-1220 v3 @ 3.10GHz and 32GB of RAM.

For the target, we used a system with a single Xeon CPU E5-1620 v4 @ 3.50GHz and 256GB of RAM.

For the 10 GbE test, we used Intel X710-DA2 adapters [link].

For the 40 GbE and InfiniBand tests, we used Mellanox ConnectX-3 adapters which can be configured as 40 Gbit Ethernet or 56 Gbit InfiniBand. Part number: MCX354A-FCBT [link].

Target & Initiator

We used SCST version 3.3.0 on Linux. Both target and initiator used Ubuntu 14.04 LTS for the OS with kernel 4.4.20.

This test is about comparing the transport protocols with each other. It is not a test of which Linux/FreeBSD/illumos/Windows target is best. Hence we want to use the same target for all tests and the same OS platform.

At first we used an Intel DC P3700 NVMe adapter for the target device. We could easily get 2,500 MB/s of random read performance out of it. However, we soon reached the peak performance of the adapter in the InfiniBand tests.

So we switched to using a 200 GB RAM disk instead for the target device. We are only interested in comparing the transport protocols here, not NVMe performance. So the RAM disk is perfect.

For the SRP tests, we used Bart Van Assche's SRP initiator. Version 2.0.37.

For the iSCSI and iSER tests, we used the standard Open-iSCSI initiator which is provided by all Linux distributions. Version 2.0.873.

The Test

We connected the RAM disk based target to the initiator where it appeared as a SCSI device. The test is direct to the block. No filesystem or volume manager is used.

We used FIO version 2.13 [link] for the benchmark. Is there anything else worth using these days on Linux?

We ran a lot of tests. Several hundred in fact. We tested blocksizes up to 1MB, different queue depths, and multiple workers. The percentage differences were pretty much the same as they were for a single worker at queue depth 32. So to keep things simple, we have shown the results for a single worker, queue depth of 32, and block sizes of 4K, 16K and 64K.

We ran each test a total of 20 times and then averaged the results.

Both the read and write tests are random, not sequential.

The Results

Throughput values are in MB/s.

 

Random Read — Throughput

[Chart: random read throughput]

Block size   iSCSI (10 GbE)   iSCSI (40 GbE)   iSER (Eth)   iSER (IB)   SRP (IB)
4K                    294.9            463.8      1,433.6     1,458.3    1,159.5
16K                   934.5          1,476.7      4,030.8     4,084.9    3,609.5
64K                 1,178.3          2,861.2      4,595.2     5,800.8    5,748.2

 

Random Read — IOPS

[Chart: random read IOPS]

Block size   iSCSI (10 GbE)   iSCSI (40 GbE)   iSER (Eth)   iSER (IB)   SRP (IB)
4K                   75,482          118,725      366,993     373,330    296,824
16K                  59,808           94,506      257,970     261,435    231,010
64K                  18,854           45,779       73,524      92,813     91,972

 

Random Write — Throughput

[Chart: random write throughput]

Block size   iSCSI (10 GbE)   iSCSI (40 GbE)   iSER (Eth)   iSER (IB)   SRP (IB)
4K                    338.3            404.7      1,324.8     1,331.7    1,219.2
16K                 1,003.5          1,231.9      3,988.5     4,125.2    2,806.5
64K                 1,175.7          2,371.2      4,485.3     5,152.3    4,835.7

 

Random Write — IOPS

[Chart: random write IOPS]

Block size   iSCSI (10 GbE)   iSCSI (40 GbE)   iSER (Eth)   iSER (IB)   SRP (IB)
4K                   86,615          103,595      339,151     340,919    312,125
16K                  64,225           78,840      255,263     264,014    179,615
64K                  18,811           37,940       71,765      82,437     77,371
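The throughput and IOPS tables are two views of the same measurements: throughput is IOPS multiplied by block size. The short check below uses the iSCSI (10 GbE) random read figures and assumes "MB/s" means 2^20 bytes per second, which is inferred from the numbers rather than stated anywhere above.

```python
def throughput_mib_s(iops, block_size_kib):
    """Convert IOPS at a given block size (KiB) to MiB/s."""
    return iops * block_size_kib / 1024


# (block size in KiB, IOPS, reported MB/s) for iSCSI (10 GbE), random read
ISCSI_10GBE_READ = [
    (4, 75482, 294.9),
    (16, 59808, 934.5),
    (64, 18854, 1178.3),
]


def consistent(rows, tolerance=0.1):
    """True if every computed throughput is within tolerance of the table."""
    return all(abs(throughput_mib_s(iops, bs) - reported) <= tolerance
               for bs, iops, reported in rows)


if __name__ == "__main__":
    for bs, iops, reported in ISCSI_10GBE_READ:
        computed = throughput_mib_s(iops, bs)
        print(f"{bs}K: computed {computed:.1f}, reported {reported}")
    print("consistent:", consistent(ISCSI_10GBE_READ))
```

Under that assumption the computed and reported values agree to within rounding, so either table can be used for comparisons.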

Conclusion

How things have changed since the last time we tested. Not only has iSER on InfiniBand closed the gap with SRP, it is now showing higher performance.

But what was most surprising to us was the performance of iSER on Ethernet. At lower blocksizes (4K and 16K) it performs better than InfiniBand based SRP. It's not until we get to larger blocksizes that SRP shows its edge. This is due to the data rate advantage of 56 Gbit InfiniBand over 40 Gbit Ethernet at maximum transfer rates.
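The link-rate explanation can be sanity-checked with some simple arithmetic. The sketch below compares the 64K random read results with the nominal link rates. It ignores line encoding and protocol overhead, and it assumes the reported MB/s are 2^20 bytes per second, so treat the percentages as rough.

```python
MIB = 2 ** 20


def link_ceiling_mib_s(gbit_per_s):
    """Nominal link rate in MiB/s, ignoring encoding and protocol overhead."""
    return gbit_per_s * 1e9 / 8 / MIB


def utilisation(measured_mib_s, gbit_per_s):
    """Fraction of the nominal link rate that a measurement represents."""
    return measured_mib_s / link_ceiling_mib_s(gbit_per_s)


if __name__ == "__main__":
    # 64K random read throughput from the results tables.
    cases = [
        ("iSER (Eth), 40 Gbit", 4595.2, 40),
        ("iSER (IB), 56 Gbit", 5800.8, 56),
        ("SRP (IB), 56 Gbit", 5748.2, 56),
    ]
    for name, measured, rate in cases:
        print(f"{name}: {utilisation(measured, rate):.0%} of nominal link rate")
```

By this estimate iSER on Ethernet is already at roughly 96% of the nominal 40 Gbit rate at 64K, while the InfiniBand results sit at roughly 86% to 87% of 56 Gbit. That fits the idea that Ethernet runs out of wire at large blocksizes rather than running out of protocol efficiency.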

The iSER performance is testament to the excellent work by Yan Burman's team at Mellanox on the iSER target.

We've read some articles which state that iSER on Ethernet provides marginally better performance than standard iSCSI over Ethernet. This is simply not true. The performance difference between IP based iSCSI and RDMA based iSCSI on Ethernet is huge.
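To put a number on "huge", the sketch below divides the iSER (Eth) throughput by the iSCSI (40 GbE) throughput for each test, using the figures from the results tables. It assumes both columns were measured on the same 40 GbE adapters, which seems likely given that the 10 GbE Intel cards have no RDMA support.

```python
# Throughput in MB/s, copied from the results tables.
ISCSI_40GBE = {
    "read": {"4K": 463.8, "16K": 1476.7, "64K": 2861.2},
    "write": {"4K": 404.7, "16K": 1231.9, "64K": 2371.2},
}
ISER_ETH = {
    "read": {"4K": 1433.6, "16K": 4030.8, "64K": 4595.2},
    "write": {"4K": 1324.8, "16K": 3988.5, "64K": 4485.3},
}


def speedups(baseline, candidate):
    """Ratio candidate/baseline for every operation and block size."""
    return {
        op: {bs: candidate[op][bs] / baseline[op][bs] for bs in baseline[op]}
        for op in baseline
    }


if __name__ == "__main__":
    for op, by_size in speedups(ISCSI_40GBE, ISER_ETH).items():
        for bs, ratio in by_size.items():
            print(f"random {op} {bs}: iSER is {ratio:.2f}x iSCSI")
```

The ratios come out at about 2.7x to 3.3x for 4K and 16K, and about 1.6x to 1.9x at 64K, where the link itself starts to become the limit.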

It's not all rosy though. Getting iSER and SRP working on InfiniBand is very simple. RDMA has been a completely standard feature of InfiniBand since it was released 15 years ago.

On Ethernet that is not the case. Specific Ethernet adapters are required. For example, none of the latest Intel X710 and XL710 cards support RDMA. Intel has given up on iWARP/RDMA support for its latest adapters.

Specific Ethernet switches are also required. Choosing RDMA on Ethernet means getting the switch vendor to guarantee compatibility not only with the iSER RDMA requirements but also with the adapters you plan to build the network with.

That said, iSER is an excellent bit of technology that for many use cases will be an ideal choice. It's certainly easier to deal with than FCoE, which is looking pretty obsolete.

We've been fed the hype about converged Ethernet for several years now. With iSER on Ethernet, maybe the hype is now reality.