This paper analyzes a production trace from an Alibaba cluster that co-locates different workloads (latency-critical and batch-processing applications) to improve resource efficiency.
Trace: homogeneous cluster. Each server has 96 cores, and memory is normalized to 1 unit per server.
Key insights
Memory has become the new bottleneck in datacenters => efficient memory reclamation techniques are required
Batch-processing applications are treated as second-class citizens and restricted to a limited share of resources => latency-critical applications are potentially overprotected
More than 90% of latency-critical applications are written in Java => massive self-contained JVMs complicate resource management
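
To make the "memory is the bottleneck" insight concrete, here is a minimal sketch of how per-server utilization could be compared for the two resources, using the server shape noted above (96 cores, memory normalized to 1 unit). The record layout `(server_id, cores_used, memory_used)` and the sample values are hypothetical and made up for illustration; they are not the trace's actual schema or data.

```python
from collections import defaultdict

CORES_PER_SERVER = 96     # server shape from the trace description
MEMORY_PER_SERVER = 1.0   # memory is normalized to 1 unit per server


def server_utilization(records):
    """Sum usage per server and return {server_id: (cpu_util, mem_util)}.

    `records` is an iterable of (server_id, cores_used, memory_used)
    tuples. This layout is an assumption, not the real trace schema.
    """
    cpu = defaultdict(float)
    mem = defaultdict(float)
    for server_id, cores_used, memory_used in records:
        cpu[server_id] += cores_used
        mem[server_id] += memory_used
    return {
        sid: (cpu[sid] / CORES_PER_SERVER, mem[sid] / MEMORY_PER_SERVER)
        for sid in cpu
    }


def bottleneck(cpu_util, mem_util):
    """Name the resource with the higher utilization (ties count as cpu)."""
    return "memory" if mem_util > cpu_util else "cpu"


# Made-up sample records, not values from the trace.
sample = [
    ("s1", 24, 0.50),
    ("s1", 12, 0.25),
    ("s2", 48, 0.25),
]

util = server_utilization(sample)
for sid, (c, m) in sorted(util.items()):
    print(sid, f"cpu={c:.2f}", f"mem={m:.2f}", bottleneck(c, m))
```

On a real trace, the same comparison would be run per timestamp rather than over a single snapshot, since the bottleneck resource can change over time.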
Questions
Alibaba runs many Java applications, which may contribute to these issues. Is this a common scenario in other datacenters?