Because motion estimation represents a major computational load in typical video encoding systems (e.g., around 50% of computation time when full search is used, as in [11]), there has been extensive research into fast motion estimation techniques; with current state-of-the-art fast algorithms, e.g., [2], this percentage can be reduced to around 10%. Given the nature of the process, two major classes of complexity reduction techniques have been proposed. These seek to speed up the search by (i) reducing the cost of each matching operation or (ii) reducing the number of points considered in the search region. In fast matching (FM) techniques, a typical approach is to compute the cost function (e.g., the sum of absolute differences, SAD) based on a subset of the pixels in a block. In fast search (FS) approaches, the complexity reduction comes from restricting the number of points examined in the search region, based either on fixed rules (e.g., three-step search) or on initialization from motion vectors already computed for other blocks or for the previous frame. In this paper we use as a baseline algorithm the initialize-technique (a modification of the algorithm of [2]), which belongs to the FS class. We concentrate on the case of real-time software video encoding, which allows the flexibility of using variable complexity algorithms (VCAs). Thus, we modify our baseline algorithm using a Lagrange multiplier approach similar to that of [1], which allows us to explicitly take into account the trade-offs between search complexity and residual frame energy. Furthermore, we combine this algorithm with a novel fast matching method for SAD estimation, which estimates the SAD from successive subsets of the pixels in a particular block. This method is naturally computationally scalable because we can stop the computation once we have sufficient confidence in our estimate.
This can be easily done in a hypothesis testing framework and gives us one more degree of freedom to control the complexity/residual energy trade-off. We show that the combined algorithm achieves reductions of around 25% in computation time with respect to the original algorithm without SAD estimation. As in [12], these results are further improved by designing a test structure that is optimized for typical sequences and in which tests for early termination of the matching process are included only if they are expected to be worthwhile in terms of overall complexity.
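To make the idea of SAD estimation from successive pixel subsets concrete, the following is a minimal sketch, not the algorithm evaluated in this paper. It assumes blocks are given as flat lists of pixel values, that the subsets are formed by simple interleaving, and that the stopping rule is a plain threshold on the extrapolated SAD rather than a properly designed hypothesis test. The names `sad_with_early_stop`, `num_subsets`, and `margin` are hypothetical and introduced only for illustration.

```python
def sad_with_early_stop(block, candidate, best_sad, num_subsets=4, margin=1.0):
    """Estimate the SAD between two equally sized blocks from successive
    pixel subsets, stopping early when the candidate looks worse than the
    best match found so far.

    Returns (sad_or_estimate, pixels_used, completed).
    Assumption: a simple threshold on the extrapolated SAD stands in for
    the hypothesis test; it is not the test structure used in the paper.
    """
    if len(block) != len(candidate):
        raise ValueError("blocks must have the same number of pixels")

    total = len(block)
    partial = 0   # SAD accumulated over the pixels visited so far
    used = 0      # number of pixels visited so far

    for s in range(num_subsets):
        # Subset s holds pixels s, s + num_subsets, s + 2 * num_subsets, ...
        for i in range(s, total, num_subsets):
            partial += abs(block[i] - candidate[i])
            used += 1

        if used < total:
            # Extrapolate the partial sum to a full-block SAD estimate.
            estimate = partial * total / used
            if estimate > margin * best_sad:
                # Candidate is unlikely to beat the current best: stop here.
                return estimate, used, False

    # All pixels were visited, so this is the exact SAD.
    return partial, used, True


if __name__ == "__main__":
    current = [10] * 16          # hypothetical 4x4 block, flattened
    close_match = [11] * 16      # candidate that differs by 1 per pixel
    poor_match = [60] * 16       # candidate that differs by 50 per pixel

    print(sad_with_early_stop(current, close_match, best_sad=100))
    print(sad_with_early_stop(current, poor_match, best_sad=100))
```

In this sketch the poor candidate is rejected after the first subset, so only a quarter of its pixels are visited, while the close candidate is evaluated exactly. Lowering `margin` makes early termination more aggressive, trading a higher risk of discarding a good candidate for less computation, which is the kind of complexity/residual energy trade-off discussed above.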