Optimizing resource utilization is a critical issue in cloud and cluster-based computing systems. In such systems, computing resources often include one or more GPU devices, and much research has already been conducted on maximizing compute resource utilization through shared execution strategies. However, one of the most severe resource constraints in these scenarios is the data transfer channel between the host (i.e., CPU) and the device (i.e., GPU). Contention for this channel has been shown to have a significant impact on performance, yet methods for mitigating it have not been thoroughly studied. The techniques that have been examined make assumptions that limit their effectiveness in the general case. In this paper, we introduce a heuristic that selectively aggregates transfers in order to maximize system performance by optimizing use of the transfer channel bandwidth. We compare this heuristic to a traditional first-come-first-served approach, and apply Monte Carlo reinforcement learning to find an optimal policy for message aggregation. Finally, we evaluate the performance of Monte Carlo reinforcement learning with an arbitrarily initialized policy. We demonstrate its effectiveness in learning an optimal data transfer policy without detailed system characterization, which will enable a general, adaptable solution for resource management in future systems.