We describe an approach to solving a generic time-dependent differential equation (DE) that recasts the problem as one of functional optimization. The techniques employed to find the functional minimum, which we relate to the Sobolev gradient method, allow for large-scale parallelization in time, and therefore potentially faster "wall-clock" computation on machines with significant parallel computing capacity. We derive numerous discretizations and approximations of our optimization-derived equations, each of which either places an existing approach, the Parareal method, in an optimization context or yields a new time-parallel (TP) scheme with potentially faster convergence to the DE solution. We describe how the approach is particularly effective for solving multiscale DEs and present TP schemes that incorporate two different solution scales. Sample results are provided for three differential equations solved with TP schemes, and we discuss how the choice of TP scheme can have an orders-of-magnitude effect on accuracy or convergence rate.
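For readers unfamiliar with the Parareal method that the paper reinterprets in an optimization context, the sketch below shows the classic Parareal iteration on a scalar ODE. The propagator choices (a single explicit Euler step for the coarse solver G, RK4 substeps for the fine solver F) and the test problem u' = -u are illustrative assumptions, not details taken from this paper; the fine solves per time slice are independent, which is the source of the parallelism in time.

```python
import numpy as np

def coarse(u, t0, t1, f):
    """Coarse propagator G: one explicit Euler step over [t0, t1]."""
    return u + (t1 - t0) * f(t0, u)

def fine(u, t0, t1, f, m=100):
    """Fine propagator F: m RK4 substeps over [t0, t1]."""
    h = (t1 - t0) / m
    t = t0
    for _ in range(m):
        k1 = f(t, u)
        k2 = f(t + h / 2, u + h / 2 * k1)
        k3 = f(t + h / 2, u + h / 2 * k2)
        k4 = f(t + h, u + h * k3)
        u = u + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return u

def parareal(f, u0, T, n_slices=10, n_iters=5):
    """Classic Parareal: a cheap serial coarse sweep plus fine solves on
    each slice that could run concurrently on separate processors."""
    ts = np.linspace(0.0, T, n_slices + 1)
    # Serial coarse pass gives the initial guess at the slice boundaries.
    U = [u0]
    for n in range(n_slices):
        U.append(coarse(U[n], ts[n], ts[n + 1], f))
    for _ in range(n_iters):
        # Parallelizable step: propagate the current boundary values
        # with both the fine and the coarse solver on every slice.
        Fu = [fine(U[n], ts[n], ts[n + 1], f) for n in range(n_slices)]
        Gu = [coarse(U[n], ts[n], ts[n + 1], f) for n in range(n_slices)]
        # Cheap serial correction sweep.
        Unew = [u0]
        for n in range(n_slices):
            Unew.append(coarse(Unew[n], ts[n], ts[n + 1], f) + Fu[n] - Gu[n])
        U = Unew
    return ts, np.array(U)

# Example: u' = -u, u(0) = 1, with exact solution exp(-t).
ts, U = parareal(lambda t, u: -u, 1.0, T=2.0)
print(np.max(np.abs(U - np.exp(-ts))))  # max error at slice boundaries
```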