Papers accepted at Journal of Systems Research (JSys) & Machine Learning and Systems (EuroMLSys)
2024/05/01
TK researchers, in collaboration with partners from Paderborn University, Queen Mary University of London, and the University of North Carolina, published two papers: “IPA: Inference Pipeline Adaptation for High Accuracy and Cost-Efficiency” in the Journal of Systems Research (JSys) and “Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling” at the Workshop on Machine Learning and Systems (EuroMLSys). In IPA, they tackle the challenge of optimizing multi-model inference pipelines to meet strict requirements in machine learning production systems by introducing an online system for adapting inference pipelines. In Sponge, they present an inference serving system that guarantees end-to-end request latency under unstable network conditions using in-place vertical scaling, dynamic batching, and request reordering. Interested parties can find the presentation here.
Citation information:
- Saeid Ghafouri, Kamran Razavi, Mehran Salmani, Alireza Sanaee, Tania Lorido Botran, Lin Wang, Joseph Doyle, and Pooyan Jamshidi. 2024. [Solution] IPA: Inference Pipeline Adaptation to achieve high accuracy and cost-efficiency. In Journal of Systems Research, 4(1).
- Kamran Razavi, Saeid Ghafouri, Max Mühlhäuser, Pooyan Jamshidi, and Lin Wang. 2024. Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling. In Proceedings of the 4th Workshop on Machine Learning and Systems (EuroMLSys '24).