Efficient In-DRAM Near-Bank Processing for Emerging Parallel Computing Workloads