Recent research has shown the potential for using programmable network components such as switches for distributed data processing. Opportunities include in-network caching and the execution of distributed SQL operations such as joins or aggregations. However, a major weakness of the current generation of programmable switches is that the hardware still has many limitations not only with regard to what type of operations are supported in a switch (e.g., no loops), but also that the switches can often not sustain processing at line-rate.
As a first contribution of this paper, we propose a new switch architecture that can be employed as an in-network co-processor for analytical SQL workloads. Different from existing commercial switches, our switch architecture is based on an FPGA design and supports complex operations at line-rate. As a second contribution, we discuss how a typical distributed database architecture has to be changed to efficiently leverage the new switch architecture. In our evaluation we show that our new switch architecture can significantly speed-up distributed query processing by up to 7× compared to traditional shuffled-based approaches without in-network processing capabilities.