Optimal round and sample-size complexity for partitioning in parallel sorting