Most DSP synthesis tools perform limited architectural transformations to optimize hardware and power. Multiplications are often implemented with shift-and-add operations for hardware efficiency. In this paper, we propose an optimization that combines a numerical transformation called number-splitting with a shift-and-add decomposition scheme. The numerical transformation "globally" changes the constant multipliers and the data-flow graph of the system under design, enabling implementations with fewer shifts and adds. The multiplications are decomposed into shifts and adds so that as many intermediate results (partial products) as possible are reused. The total number of operations can be reduced to 30% for two's complement encoding, yielding significant power and hardware savings.
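As a rough illustration of the shift-and-add idea mentioned above, the following Python sketch multiplies by a constant using only shifts and additions, then shows one hand-picked case of reusing a partial product. It is not the decomposition scheme or the number-splitting transformation proposed in the paper. The function names (`shift_positions`, `multiply_shift_add`, `multiply_by_45_shared`) and the constant 45 are assumptions chosen for the example.

```python
def shift_positions(c):
    """Return the positions of the set bits in the non-negative constant c."""
    if c < 0:
        raise ValueError("this sketch handles non-negative constants only")
    return [i for i in range(c.bit_length()) if (c >> i) & 1]


def multiply_shift_add(x, c):
    """Compute x * c with one shift per set bit of c, summing the shifted terms."""
    result = 0
    for s in shift_positions(c):
        result += x << s
    return result


def multiply_by_45_shared(x):
    """Compute 45 * x while reusing the partial product 5 * x (hypothetical example)."""
    t = (x << 2) + x      # t = 5x, one addition
    return (t << 3) + t   # 45x = 8 * (5x) + 5x, one more addition


# 45 = 101101 in binary, so the direct form sums four shifted terms (three additions),
# while the shared form needs only two additions.
print(shift_positions(45))
print(multiply_shift_add(7, 45), multiply_by_45_shared(7))
```

In the direct form, the number of additions is one less than the number of set bits in the constant. The shared form factors 45 as 5 × 9 so that the intermediate value 5x is computed once and used twice, which is the kind of partial-product reuse the decomposition aims for.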