This paper presents a machine learning approach that predicts, with a high level of accuracy, the amount of compute memory needed by jobs submitted to Load Sharing Facility (LSF®).

LSF® is the compute resource manager and job scheduler for the Qualcomm chip design process. It schedules jobs based on available resources: CPU, memory, storage, and software licenses. Memory is one of the key resources, and using it properly saves substantial machine resources, which in turn significantly reduces overall job pending time. In addition, efficient memory utilization helps reduce operational costs by decreasing the number of servers needed for the end-to-end design process.

In this paper, we explore a suite of statistical and machine learning techniques to develop a Compute Memory Recommender System for the Qualcomm chip design process that predicts the amount of memory a job needs with over 90% accuracy. The system also demonstrates the potential to significantly reduce job pending time.
