We present a graphics processing unit (GPU) implementation of a maximum-likelihood (ML) contracting-grid algorithm that performs joint estimation of the amplitude and arrival time of scintillation pulses. ML estimation consists of finding the parameter values that maximize the likelihood. We propose a multivariate normal model for the likelihood, which requires precomputed mean pulses and covariance matrices for every parameter combination. In previous work, we performed a Fisher information analysis to determine the minimum number of samples of a digital pulse that retains most of the timing information at a given acquisition rate. Here, we use just that number of samples to reduce the size of the mean pulses and covariance matrices. To further reduce the number of precomputed mean pulses and covariance matrices, we take advantage of the linearity of scintillation pulses, which allows us to scale a single normalized pulse to obtain a mean pulse of any amplitude. We use this property to limit the amount of computation and shared memory used in our GPU code while preserving full timing performance. We developed our code on an Nvidia Tesla C2075 GPU and were able to process 1.5 million events per second. The estimation algorithm and GPU implementation can be used in real time for systems that require high temporal resolution, such as time-of-flight positron emission tomography, or in processes with a high gamma-ray count rate.
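To make the contracting-grid idea concrete, here is a minimal CPU sketch in NumPy, not the paper's CUDA implementation. It jointly estimates amplitude and time by repeatedly evaluating the likelihood on a small 2-D grid and shrinking the grid around the best point. The bi-exponential pulse shape, the identity noise covariance (under which the multivariate-normal log-likelihood reduces to a sum of squared residuals), and all numerical values are illustrative assumptions, not the paper's detector model.

```python
import numpy as np

def pulse_shape(t, tau_r=2.0, tau_d=40.0):
    # Hypothetical normalized bi-exponential scintillation pulse (assumption).
    tt = np.asarray(t, dtype=float)
    s = np.where(tt > 0, np.exp(-tt / tau_d) - np.exp(-tt / tau_r), 0.0)
    # Normalize by the analytic peak so the shape has unit amplitude,
    # letting a single normalized pulse be scaled to any amplitude.
    t_peak = tau_r * tau_d / (tau_d - tau_r) * np.log(tau_d / tau_r)
    peak = np.exp(-t_peak / tau_d) - np.exp(-t_peak / tau_r)
    return s / peak

rng = np.random.default_rng(0)
samples = np.arange(0.0, 200.0, 5.0)      # digitizer time grid (assumed units)
true_amp, true_t0 = 3.5, 17.0
noisy = true_amp * pulse_shape(samples - true_t0) \
        + rng.normal(0.0, 0.05, samples.size)

def neg_log_like(amp, t0):
    # With an identity covariance, the multivariate-normal negative
    # log-likelihood reduces to half the sum of squared residuals.
    r = noisy - amp * pulse_shape(samples - t0)
    return 0.5 * (r @ r)

def contracting_grid(n_iter=10, n_grid=7):
    amp_c, t_c = 4.0, 100.0     # initial grid centers (assumed search range)
    amp_w, t_w = 3.5, 100.0     # initial grid half-widths
    for _ in range(n_iter):
        amps = np.linspace(amp_c - amp_w, amp_c + amp_w, n_grid)
        ts = np.linspace(t_c - t_w, t_c + t_w, n_grid)
        costs = np.array([[neg_log_like(a, t) for t in ts] for a in amps])
        i, j = np.unravel_index(np.argmin(costs), costs.shape)
        amp_c, t_c = amps[i], ts[j]
        amp_w *= 0.5            # contract the grid around the best point;
        t_w *= 0.5              # the new half-width still exceeds one spacing
    return amp_c, t_c

amp_hat, t_hat = contracting_grid()
print(f"amp={amp_hat:.2f}, t0={t_hat:.2f}")
```

On a GPU, the inner double loop maps naturally onto threads, with each thread evaluating one grid point and the precomputed normalized pulse held in shared memory; the version above keeps everything serial for clarity.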