
RESEARCH @ HKUST
Making Data Centers
Communicate Faster
Enter a single web search query and it can
set in motion communication between
thousands of physical machines in a data
center as the machines quickly retrieve
and collect the information corresponding
to your keywords. A major goal of Prof
Kai Chen and his research group is to
accelerate such communication (data
flow) between machines to help deliver
the cloud and big data applications that
now facilitate our way of life, including
search engines, financial services, social
networking, and many others.
Like Prof Ke Yi’s work on databases,
Prof Chen’s approach to networked
system design is theoretically significant
and also practical. He seeks to achieve
very high throughput and low latency
(faster speed at the millisecond/
microsecond level) while not requiring
industry to make hard-to-adopt changes
to applications or customize hardware.
He is proving successful. In 2016, his
team became the first in Hong Kong to
have first-author papers accepted for the
eminent ACM SIGCOMM conference.
In these two papers, Prof Chen’s group
designed two systems, KARUNA and
CODA, that answered key problems
through innovative solutions that could
be directly implemented.
KARUNA optimizes cloud application
performance by delivering the first
mix-flow (data flows with and without
deadlines) scheduling solution for
data centers. Unlike existing solutions,
KARUNA maximizes the deadline meet
rate of data flows with deadlines while
minimizing the transmission time of
data flows without deadlines. This is
achieved by prioritizing deadline flows
while controlling their sending rates,
using the remaining bandwidth to
schedule non-deadline flows. The system
does not require prior knowledge of
data size or impractical switch hardware
modifications, filling an important gap in
data center flow scheduling.

“Data centers provide the infrastructure
for big data and cloud computing. Our
goal is to make data center
communication faster.”
PROF KAI CHEN
Associate Professor of Computer Science
and Engineering
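The allocation idea behind KARUNA — give each deadline flow just enough bandwidth to finish on time, and hand all leftover capacity to non-deadline flows, shortest first — can be sketched as follows. This is a minimal illustration with hypothetical flows and rates, not KARUNA's actual algorithm:

```python
LINK_CAPACITY = 10.0  # Gbps on a single link (assumed for illustration)

# (name, size in gigabits, deadline in seconds or None)
flows = [
    ("d1", 4.0, 1.0),    # deadline flow: needs at least 4 Gbps
    ("d2", 2.0, 2.0),    # deadline flow: needs at least 1 Gbps
    ("n1", 6.0, None),   # non-deadline flow
    ("n2", 1.0, None),   # non-deadline flow
]

def schedule(flows, capacity):
    """Return {flow: rate}. Deadline flows are rate-limited to the
    minimum rate that still meets their deadline; the remaining
    bandwidth goes to non-deadline flows, shortest flow first, to
    minimize their average completion time."""
    rates = {}
    remaining = capacity
    # 1. Deadline flows: allocate exactly size/deadline, no more.
    for name, size, deadline in flows:
        if deadline is not None:
            rate = min(size / deadline, remaining)
            rates[name] = rate
            remaining -= rate
    # 2. Non-deadline flows: all leftover bandwidth to the shortest
    #    pending flow; the rest wait their turn.
    pending = sorted((f for f in flows if f[2] is None), key=lambda f: f[1])
    for name, size, _ in pending:
        rates[name] = remaining if name == pending[0][0] else 0.0
    return rates

print(schedule(flows, LINK_CAPACITY))
# → {'d1': 4.0, 'd2': 1.0, 'n2': 5.0, 'n1': 0.0}
```

Capping deadline flows at their minimum required rate, rather than letting them grab the whole link, is what frees bandwidth for the non-deadline flows.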
CODA is a system that can automatically
identify and schedule coflows
(a collection of parallel flows sharing
a common performance goal) without
requiring changes to applications –
an impractical requirement of other
coflow-based solutions. This is achieved by
employing a machine learning algorithm
to rapidly identify coflows, complemented
by a coflow scheduler which is tolerant
of identification errors. Testbed and
large-scale simulations showed CODA’s
performance to be equivalent to solutions
requiring applications to be adapted.
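CODA's two-stage structure — infer which flows belong together, then schedule whole coflows in a way that tolerates inference mistakes — can be sketched as below. The start-time clustering here is a crude hypothetical stand-in for CODA's machine-learning identifier, and the flows are invented for illustration:

```python
# (flow_id, start_time in seconds, size in gigabits)
flows = [
    ("a1", 0.00, 2.0), ("a2", 0.02, 3.0),   # likely one coflow
    ("b1", 1.00, 1.0), ("b2", 1.03, 1.0),   # likely another coflow
]

def identify_coflows(flows, gap=0.5):
    """Cluster flows whose start times fall within `gap` seconds of
    the previous flow. Identification errors are survivable because
    the scheduler ranks whole clusters: one misplaced flow only
    slightly skews its cluster's rank."""
    clusters, current = [], []
    for f in sorted(flows, key=lambda f: f[1]):
        if current and f[1] - current[-1][1] > gap:
            clusters.append(current)
            current = []
        current.append(f)
    if current:
        clusters.append(current)
    return clusters

def schedule_order(clusters):
    """Smallest-coflow-first: schedule the coflow with the least
    total data first to reduce average coflow completion time."""
    return sorted(clusters, key=lambda c: sum(f[2] for f in c))

order = schedule_order(identify_coflows(flows))
print([[f[0] for f in c] for c in order])
# → [['b1', 'b2'], ['a1', 'a2']]
```

The key property is that no application had to label its flows: grouping is inferred from observable traffic attributes alone.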
Prof Chen has also worked closely
with major companies. Collaborations
with Huawei have led to technologies
for Software Defined Networks (SDN),
prototypes that use machine learning for
efficient communication, and patents.
Prof Chen received Huawei’s inaugural
Distinguished Collaborator Award in 2016.
At Tencent, the HKUST team has
contributed to new-generation machine
learning system Angel by designing an
efficient data flow scheduling scheme that
can improve the machine learning algorithm
convergence time by up to 90%. The
overall performance of Angel is 70 times
faster than previous systems tested. Tencent
has deployed it to support advertisement
and video recommendation services.
Making these advances possible is the
100-plus machine data center that Prof
Chen and his team have built from scratch,
providing an essential in-house testbed
at HKUST for testing the feasibility of
proposed solutions. Once outcomes reach
the required performance levels, large-scale
simulations can be used to demonstrate
scalability. Meanwhile, Prof Chen is
setting his sights on further frontier work,
including optical networking and artificial
intelligence (AI)-enabled networking,
currently undergoing testing at HKUST.
Prof Chen’s testbed implemented at HKUST.