In the context of video segmentation with a depth sensor, prior work maps the Metropolis algorithm, a key simulated-annealing-based routine in segmentation, onto an Nvidia Graphics Processing Unit (GPU) and achieves real-time performance for 320×256 video sequences. However, that work uses depth information in a very limited manner. This paper presents a new GPU-based method that expands the use of depth information during segmentation and shows improved segmentation quality over the prior work. In particular, we discuss various ways to restructure the segmentation flow and evaluate the impact of several design choices on throughput and quality. We introduce a scaling factor that amplifies the interaction strength between two spatially neighboring pixels and sharpens the borderlines between segments. This allows us to reduce the number of required Metropolis iterations by over 50%, at the cost of over-segmentation. We evaluate two design choices to overcome this problem. First, we incorporate depth information into the perceived color difference calculation between two pixels, and show that the interaction strengths between neighboring pixels can be modeled more accurately this way. Second, we pre-process the frames with a bilateral filter instead of a Gaussian filter, and show its effectiveness in reducing the difference between similar colors. Both approaches help improve the quality of the segmentation, and the reduction in Metropolis iterations improves the throughput from 29 fps to 34 fps for 320×256 video sequences.
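As a rough illustration of how a scaling factor and a depth term could enter the interaction strength between two neighboring pixels, the following minimal sketch may help. It is not the formulation used in this paper: the function names, the Euclidean difference, the `depth_weight` and `mean_difference` parameters, and the linear coupling form are all assumptions chosen for illustration.

```python
import math


def perceived_difference(color_a, color_b, depth_a, depth_b, depth_weight=1.0):
    """Euclidean color distance augmented with a weighted depth term.

    Hypothetical form: depth_weight = 0.0 gives a color-only difference.
    """
    color_sq = sum((a - b) ** 2 for a, b in zip(color_a, color_b))
    depth_sq = (depth_a - depth_b) ** 2
    return math.sqrt(color_sq + depth_weight * depth_sq)


def interaction_strength(difference, mean_difference, scale=1.0):
    """Coupling between two neighboring pixels (assumed linear form).

    Positive when the pixels are more similar than average, negative
    otherwise; `scale` amplifies the coupling in both directions.
    """
    return scale * (1.0 - difference / mean_difference)


# Two pixels with identical color that lie on different depth planes.
same_color = (120.0, 80.0, 60.0)
d_color_only = perceived_difference(same_color, same_color, 1.0, 3.0, depth_weight=0.0)
d_with_depth = perceived_difference(same_color, same_color, 1.0, 3.0, depth_weight=4.0)

j_color_only = interaction_strength(d_color_only, mean_difference=2.0, scale=2.0)
j_with_depth = interaction_strength(d_with_depth, mean_difference=2.0, scale=2.0)

print(j_color_only, j_with_depth)
```

In this toy case the color-only difference is zero, so the two pixels are strongly coupled, whereas adding the depth term turns the coupling negative and places a border between them. A larger `scale` pushes both values further from zero, which is the kind of amplification that makes borders clearer but can also encourage over-segmentation.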