Multi-Process Service
Since each MPS client process has a fully isolated address space, each client context allocates its own context storage and scheduling resources.
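Volta allocates an independent address space and context storage for each client, and the amount of these resources grows with the number of threads made available to that client.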
A near-optimal provisioning strategy is to partition the available threads non-uniformly based on the workload of each MPS client (e.g., set the active thread percentage to 30% for client 1 and 70% for client 2 if the ratio of the client 1 workload to the client 2 workload is 30%:70%). This strategy concentrates the work submitted by different clients onto disjoint sets of SMs and effectively minimizes interference between work submissions from different clients.
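As a minimal sketch of the proportional split described above, the following Python derives an active thread percentage for each client from its relative workload and puts it into that client's environment. `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` is the environment variable MPS reads for this limit; the helper names `thread_percentages` and `client_env` and the 3:7 workload figures are assumptions made for illustration.

```python
import os


def thread_percentages(workloads):
    """Split 100% of the active threads in proportion to each client's workload."""
    total = sum(workloads)
    if total <= 0:
        raise ValueError("workloads must sum to a positive value")
    # Rounding is per client, so the shares may not add up to exactly 100.
    return [round(100 * w / total) for w in workloads]


def client_env(percentage):
    """Return a copy of the current environment with the MPS thread limit set."""
    env = dict(os.environ)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(percentage)
    return env


# Hypothetical workloads in a 3:7 ratio, matching the 30%:70% example.
shares = thread_percentages([3, 7])
envs = [client_env(p) for p in shares]
```

Each entry of `envs` would then be passed as the `env` argument when launching the corresponding client (for example with `subprocess.Popen`), since the limit has to be in place before the client process creates its CUDA context.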
Work is assigned to different SMs.
The optimal provisioning strategy is to precisely limit the number of SMs used by each MPS client, given knowledge of each client's execution resource requirements (e.g., 24 SMs for client 1 and 60 SMs for client 2 on a device with 84 SMs). This strategy provides finer-grained and more flexible control over the set of SMs the work runs on than the active thread percentage does.
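The SM split in the example can be worked out with a small helper. The sketch below, which assumes each client's requirement is expressed as a relative weight, distributes a fixed SM count using largest-remainder rounding so the per-client counts always sum to the device total. The function name `partition_sms` is hypothetical, and applying the resulting limits to the clients (for example through execution affinity in the CUDA driver API) is outside this sketch.

```python
def partition_sms(total_sms, weights):
    """Divide total_sms among clients in proportion to weights.

    Uses largest-remainder rounding so the counts sum to total_sms.
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    exact = [total_sms * w / total_weight for w in weights]
    counts = [int(x) for x in exact]  # floor of each exact share
    leftover = total_sms - sum(counts)
    # Give the remaining SMs to the clients with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Weights 2:5 on an 84-SM device reproduce the 24/60 split from the text.
print(partition_sms(84, [2, 5]))
```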
Limiting the number of SMs directly is the better option.
Volta MPS Device Memory Limit
Memory can also be limited. But here