Why is there so much buzz about SmartNICs? One reason may be VMware’s announcement of Project Monterey, an industry-wide collaboration to integrate SmartNICs into VMware Cloud Foundation. SmartNICs have been around for a while; we’re just now seeing them emerge as a platform. To get started, let’s define some terms!
Quick Definition
Traditionally, we have used NICs (Network Interface Cards) as a pipe. They bring data into a server, and then take data away from the server to another destination. The server itself was responsible for all compute performed on that data.
As we’ll see in the next section, our infrastructure components have evolved. Developers need these modern tools to build applications that would not have been possible five or ten years ago. Building modern data applications requires something more than a passive data pipe! That’s where SmartNICs come in.
Here is PC Magazine’s definition of a SmartNIC:
A network interface card (network adapter) that offloads processing tasks that the system CPU would normally handle. Using its own on-board processor, the smartNIC may be able to perform any combination of encryption/decryption, firewall, TCP/IP and HTTP processing.
The VMware announcement blog post written by Kit Colbert has a great image of a SmartNIC.
Why Is There Buzz About SmartNICs?
SmartNICs are needed now because modern applications demand better network and compute performance. The main principle is that SmartNICs can offload everything from network capabilities to many data-plane and control-plane functions, freeing the server CPU for application work. Thankfully, hardware engineers started thinking about this a few years ago. Here are the top business reasons for using SmartNICs.
1. Moore’s Law and the Cost of Cores
Moore’s Law is finally running out of steam for servers. CPU performance is no longer doubling every two years, probably because each further shrink in process geometry makes it harder to double performance.
Adding cores is one answer, but cores are expensive. How do you balance the cost of a server with a higher core count against the performance that the applications running on it may need?
2. Networking Performance is More Complex Than Ever
According to this Mellanox-sponsored NextPlatform article (2019), in 2018 over 70% of Ethernet ports shipped with speeds above 10G/sec. The article continues:
At that performance level, “the processing overhead required to serve the network interface starts to become an issue – a fact exacerbated by the increasing complexity of modern-day data center networks which now may include support for performance acceleration techniques, virtualization, and overlay networks. One answer has become to offload some of the processing to the network interface controller (NIC). CPU costs are increasing”.
Additionally, SDN (software-defined networking) has changed the game when it comes to networking. People don’t want to wait until hardware companies are finally able to roll out new features. Customers want a NIC that can be programmed to support functions and protocols as soon as they become available.
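To put the processing-overhead point quoted above into rough numbers, here is a minimal back-of-the-envelope sketch. It is not from the NextPlatform article. It assumes worst-case 64-byte Ethernet frames and a single 3 GHz core doing all packet processing (both values are assumptions chosen for illustration), and the function names are hypothetical.

```python
# Back-of-the-envelope: CPU cycles available per packet at Ethernet line rate.
# Assumptions (not from the article): worst-case 64-byte frames, one 3 GHz core.

PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum idle gap between frames


def packets_per_second(link_gbps: float, frame_bytes: int = 64) -> float:
    """Maximum frames per second on an Ethernet link of the given speed."""
    wire_bits = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_gbps * 1e9 / wire_bits


def cycles_per_packet(link_gbps: float, core_ghz: float = 3.0,
                      frame_bytes: int = 64) -> float:
    """Clock cycles one core can spend on each packet before falling behind."""
    return core_ghz * 1e9 / packets_per_second(link_gbps, frame_bytes)


for speed in (10, 25, 100):
    pps = packets_per_second(speed)
    print(f"{speed:>3} Gb/s: {pps / 1e6:7.2f} Mpps, "
          f"{cycles_per_packet(speed):6.1f} cycles/packet")
```

Under these assumptions, a 10 Gb/s link can deliver about 14.88 million packets per second, which leaves a single core roughly 200 cycles per packet. That budget shrinks in proportion as link speeds rise, which is the pressure that makes offloading to the NIC attractive.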
3. Fully Utilizing Multi-Socket Servers
Most servers host two or four sockets, but are these sockets well utilized? Many times, the second socket is used only for memory expansion, leaving several powerful cores idle. Additionally, a NIC attaches over PCIe to one socket, so data bound for the other socket must cross the interconnect between the sockets, which has a performance impact. For example, data may come into one socket at a certain speed. Ultimately, latency is introduced as the data flows from the first socket to the second. Storage in particular may be impacted by this latency.
4. Storage Fault Domains
According to Fazil Osman of Broadcom, many customers use these powerful servers to build scale-out storage systems. With storage systems, the fault domain is always important. For example, how many drives are behind a node? Storage people always think: what happens if that node fails? How much data will I lose? Can the data be replicated? The question becomes: can a SmartNIC be programmed to offload common storage tasks in a scale-out system?
Osman’s presentation on SmartNICs is quite comprehensive. I used it to start getting myself up to speed on this topic, and would recommend it as a primer.
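The fault-domain questions above (how many drives are behind a node, and how much data is affected if that node fails) come down to simple arithmetic. The sketch below is only an illustration and is not taken from Osman’s presentation. Every number in it (24 drives, 8 TB each, 70% full, a 25 Gb/s link of which half is usable for recovery) is hypothetical, as are the function names.

```python
# Rough fault-domain arithmetic for one node in a scale-out storage system.
# All numbers are hypothetical, chosen only for illustration.


def data_behind_node_tb(drives: int, drive_tb: float, fill_ratio: float) -> float:
    """Data (in TB) that becomes unavailable if this node fails."""
    return drives * drive_tb * fill_ratio


def rebuild_hours(data_tb: float, network_gbps: float,
                  usable_fraction: float = 0.5) -> float:
    """Hours needed to re-replicate data_tb over one network link.

    usable_fraction is the share of link bandwidth assumed to be available
    for recovery traffic (an assumption, not a measured value).
    """
    bytes_per_second = network_gbps * 1e9 / 8 * usable_fraction
    return data_tb * 1e12 / bytes_per_second / 3600


at_risk = data_behind_node_tb(drives=24, drive_tb=8.0, fill_ratio=0.7)
print(f"Data behind the node: {at_risk:.1f} TB")
print(f"Re-replication over 25 Gb/s: {rebuild_hours(at_risk, 25):.1f} hours")
```

A real scale-out system spreads recovery across many nodes, so the single-link figure is a pessimistic simplification. The point is only that fewer drives behind each node means less data at risk and a shorter recovery window.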
Different Types of SmartNICs
So far, we’ve covered the definition of a SmartNIC and what’s driving the buzz about SmartNICs.
Let’s talk briefly about the different types of SmartNICs. They are grouped in different ways, depending on the vendor. If there is an industry-wide definition for these, I’d love to see it! Let me know in the comments.
1. Multicore SmartNICs
Multicore SmartNICs are based on ASICs containing multiple CPU cores. These cores are usually Arm processors. According to this Achronix guide, these SmartNICs “may also incorporate fixed function hardware engines that can offload well-defined tasks such as standardized security and storage protocols.”
This is how the public clouds are using SmartNICs. These newer SmartNICs have a Linux-based operating environment, so you don’t need to learn a specialized function processor language to program them.
2. FPGA + NICs
FPGA stands for Field Programmable Gate Array. An FPGA lets you program any data-plane function you want to offload to the NIC. It also lets you change that programming as needed, meaning you don’t need to wait for the vendor to provide an update for something you need changed.
This architecture has very good performance. However, the person designing it must have expertise in FPGA programming, typically in a hardware description language such as Verilog or VHDL, to get the performance right. Additionally, larger devices can be expensive.
3. Network Function Processor
Network Function Processors are used mostly by telecom companies. Programming them requires someone who knows the languages specific to those processors.
Real Talk
SmartNICs are not new; public cloud providers have used them for years. However, this is also an approach that can be used in on-premises datacenters. We just need to understand the possibilities. SmartNIC vendors include Broadcom, Ethernity Networks, Intel, Marvell, Mellanox (now Nvidia), Napatech, Netronome, Pensando, and Xilinx. Look for these companies in the Tech Field Day archives to hear directly from them.
I’m still in the early days of my learning curve about SmartNICs. The earliest information I found is from 2018. That means SmartNICs are a relatively nascent technology. Make sure you dig into the reality of this new technology. Don’t just rely on the buzz about SmartNICs!
I also used these sources while writing this post:
- https://www.design-reuse.com/articles/46833/how-to-design-smartnics-using-fpgas-to-increase-server-compute-capacity.html
- https://searchnetworking.techtarget.com/tip/An-introduction-to-smart-NICs-and-their-benefits