Sphere decoding for Multiple-Input Multiple-Output (MIMO) wireless systems is a computationally complex operation that usually demands custom accelerators to achieve real-time performance. The cost of these accelerators is disproportionately influenced by channel matrix preprocessing, which accounts for a relatively small fraction of the overall computational cost of detecting an OFDM MIMO frame in standards such as 802.11n but consumes a very large amount of hardware resources. Modified Squared Givens' Rotations has been proposed to resolve this issue and has been shown to dramatically reduce accelerator cost. However, no analysis of the complexity of this algorithm, or of its detection performance, is on record. This paper shows that, despite offering only modest reductions in operational complexity, MFSD-SQRD enables dramatic cost reductions by explicitly addressing the overhead of matrix permutation steps. Further, it shows that for most SNR values of practical interest, the detection performance of MFSD-SQRD is not appreciably diminished relative to the standard SQRD approach to preprocessing. To the best of the authors' knowledge, the proposed modified SQRD preprocessing approach is the highest-performance sub-optimal preprocessing approach on record.