We consider a piecewise-linear, discontinuous Galerkin method for the time discretization of a fractional diffusion equation involving a parameter $\alpha$ in the range $-1 < \alpha < 0$. Our analysis shows that, for a time interval $(0, T)$ and a spatial domain $\Omega$, the uniform error in $L_\infty((0, T); L_2(\Omega))$ is of order $k^{\rho}$, where $\rho = \min(2, 5/2 + \alpha)$ and $k$ denotes the maximum time step. Thus, if $-1/2 < \alpha < 0$, then we have optimal $O(k^2)$ convergence, just as for the classical diffusion (heat) equation.
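
As a worked illustration of the rate stated above, the exponent can be split into its two regimes. This is only a restatement of the formula for rho under the stated range of alpha, not an additional result; the sample value alpha = -3/4 is chosen here purely for illustration.

```latex
% Case split of the convergence exponent (uses the amsmath "cases" environment).
\[
  \rho = \min\bigl(2, \tfrac{5}{2} + \alpha\bigr) =
  \begin{cases}
    \tfrac{5}{2} + \alpha, & -1 < \alpha < -\tfrac{1}{2}, \\[2pt]
    2, & -\tfrac{1}{2} \le \alpha < 0.
  \end{cases}
\]
% Illustrative value (an assumption, not taken from the source):
\[
  \alpha = -\tfrac{3}{4}
  \quad\Longrightarrow\quad
  \rho = \tfrac{5}{2} - \tfrac{3}{4} = \tfrac{7}{4},
  \qquad \text{i.e. an error of order } k^{7/4}.
\]
```

In the first regime the exponent lies strictly between 3/2 and 2, so the rate falls below the optimal second order only when alpha is less than -1/2.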