A direct sum theorem for two parties and a function $f$ states that the communication cost of solving $k$ copies of $f$ simultaneously with error probability $1/3$ is at least $k \cdot R_{1/3}(f)$, where $R_{1/3}(f)$ is the communication required to solve a single copy of $f$ with error probability $1/3$. We improve this for a natural family of functions $f$, showing that the 1-way communication required to solve $k$ copies of $f$ simultaneously with probability $2/3$ is $\Omega(k \cdot R_{1/k}(f))$. Since $R_{1/k}(f)$ may be as large as $\Omega(R_{1/3}(f) \cdot \log k)$, we asymptotically beat the direct sum bound for such functions, showing that the trivial upper bound of solving each of the $k$ copies of $f$ with probability $1 - O(1/k)$ and taking a union bound is optimal! In order to achieve this, our direct sum involves a novel measure of information cost that allows a protocol to abort with constant probability, but otherwise requires it to be correct with very high probability. Moreover, for the functions considered, we show strong lower bounds on the communication cost of protocols with these relaxed guarantees; indeed, our lower bounds match those for protocols that are not allowed to abort.

In the distributed and streaming models, where one wants to be correct not only on a single query but simultaneously on a sequence of $n$ queries, we obtain optimal lower bounds on the communication or space complexity. Lower bounds obtained from our direct sum result show that a number of techniques in the sketching literature are optimal, including the following:
(JL transform) Lower bound of $\Omega(\epsilon^{-2} \log(n/\delta))$ on the dimension of (oblivious) Johnson-Lindenstrauss transforms.
($\ell_p$-estimation) Lower bound of $\Omega(n \epsilon^{-2} \log(n/\delta) (\log d + \log M))$ on the size of encodings of $n$ vectors in $[\pm M]^d$ that allow $\ell_1$- or $\ell_2$-estimation.
(Matrix sketching) Lower bound of $\Omega(\epsilon^{-2} \log(n/\delta))$ on the dimension of a matrix sketch $S$ satisfying the entrywise guarantee $|(A S S^T B)_{i,j} - (AB)_{i,j}| \le \epsilon \|A_i\|_2 \|B^j\|_2$.
(Database joins) Lower bound of $\Omega(n \epsilon^{-2} \log(n/\delta) \log M)$ for sketching frequency vectors of $n$ tables in a database, each with $M$ records, in order to allow join size estimation.
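
The trivial upper bound mentioned above (solve each of the $k$ copies with error $O(1/k)$, then take a union bound) can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a Boolean-valued $f$ and a hypothetical single-copy protocol with error probability $1/3$ whose runs are independent, and it amplifies by majority vote, a standard technique. The function names `majority_error` and `repetitions_needed` are introduced here for illustration only. The number of repetitions grows roughly like $\log k$, which is the source of the $\log k$ factor separating $R_{1/k}(f)$ from $R_{1/3}(f)$.

```python
from fractions import Fraction
from math import comb


def majority_error(r, p=Fraction(1, 3)):
    """Exact probability that a majority vote over r independent runs is wrong.

    Assumes r is odd, the answer is Boolean, and each run errs
    independently with probability p.
    """
    # The vote is wrong exactly when at least (r + 1) / 2 runs are wrong.
    return sum(
        comb(r, j) * p**j * (1 - p) ** (r - j)
        for j in range((r + 1) // 2, r + 1)
    )


def repetitions_needed(k, p=Fraction(1, 3)):
    """Smallest odd r whose majority vote errs with probability <= 1/(3k)."""
    target = Fraction(1, 3 * k)
    r = 1
    while majority_error(r, p) > target:
        r += 2
    return r


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        r = repetitions_needed(k)
        per_copy = majority_error(r)
        # Union bound: probability that any of the k copies is wrong.
        total = k * per_copy
        print(f"k={k:5d}  repetitions={r:4d}  "
              f"per-copy error={float(per_copy):.2e}  "
              f"union bound={float(total):.3f}")
```

Under these assumptions the total cost is about $k \cdot r \cdot R_{1/3}(f)$ with $r = O(\log k)$ repetitions per copy, and the lower bound stated in the abstract says that, for the family of functions considered, this cannot be improved by more than a constant factor.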