New Research Enables Faster, More Efficient Machine Learning Models
Whether it’s ChatGPT or Meta’s AI assistant, machine learning (ML) services aim to maximize throughput, serving as many people as possible. At the same time, these companies want to minimize latency, answering each user’s query as quickly as possible.
Anyone running an ML-powered service must contend with these two fundamental aspects of computer systems. The problem is that the two goals conflict with one another.
Two new papers co-authored by School of Computer Science Assistant Professor Anand Iyer explore how both goals can be achieved. Through his research, he discovered methods that can enable companies to save money while providing faster responses to users of interactive models.