Extremely large-scale multiple-input multiple-output (XL-MIMO) is an emerging design trend for base stations of future wireless communication systems, as it can offer pencil-like beamforming that counteracts path loss in an energy-efficient manner. However, wideband wireless applications with XL-MIMO antenna arrays are usually subject to near-field signal propagation conditions, frequency selectivity, and the spatial-wideband effect; neglecting any of these in the beamforming optimization process severely degrades the achievable performance. In this paper, we present an algorithmic framework for designing near-field reception beamforming for wideband multi-user XL-MIMO systems realized with holographic metasurface-based antenna arrays (HMAs). We first present a spherical-wave propagation channel model that captures the near-field effect, frequency selectivity, and the spatial-wideband effect. Based on this model, we formulate an HMA-based reception beamforming optimization problem for the uplink of multi-user XL-MIMO communications, whose optimal solution is challenging to obtain due to the nonlinear coupling between the high-dimensional analog combining weights and the digital combiner. To solve this problem efficiently with a convergent iterative approach, we transform the considered sum-rate design objective into a sum-mean-square-error minimization one. Our extensive numerical investigations show that the proposed HMA-based combining scheme can effectively deal with the practical effects under investigation, achieving a higher sum rate than conventional phase-shifter-based hybrid analog and digital combiners with the same antenna aperture.
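To make two of the effects named above concrete, the following Python sketch compares a spherical-wave array response with its plane-wave (far-field) approximation, and then checks how well a combiner matched at the carrier frequency still fits the response at a band-edge frequency. It is an illustration only and not the channel model or algorithm of this paper: the uniform linear array, the 28 GHz carrier, the 256 half-wavelength-spaced elements, the 1 GHz frequency offset, the source positions, and all function names are assumptions chosen for the example.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def element_positions(num_elements, spacing):
    """x-coordinates of a uniform linear array centred at the origin (assumed geometry)."""
    offset = (num_elements - 1) / 2.0
    return [(n - offset) * spacing for n in range(num_elements)]


def spherical_response(positions, distance, angle, freq):
    """Unit-modulus response built from the exact element-to-source distances."""
    sx, sy = distance * math.sin(angle), distance * math.cos(angle)
    return [
        cmath.exp(-2j * math.pi * freq * math.hypot(sx - x, sy) / C)
        for x in positions
    ]


def planar_response(positions, distance, angle, freq):
    """Plane-wave approximation: element distance ~ distance - x * sin(angle)."""
    return [
        cmath.exp(-2j * math.pi * freq * (distance - x * math.sin(angle)) / C)
        for x in positions
    ]


def correlation(a, b):
    """Normalised |a^H b|; both vectors have unit-modulus entries."""
    return abs(sum(x.conjugate() * y for x, y in zip(a, b))) / len(a)


# Assumed example parameters (hypothetical, not taken from the paper).
FC = 28e9                      # carrier frequency in Hz
WAVELENGTH = C / FC
POS = element_positions(256, WAVELENGTH / 2)

# Near-field effect: the plane-wave model fits a distant source but not a close one.
match_near = correlation(
    spherical_response(POS, 5.0, 0.0, FC), planar_response(POS, 5.0, 0.0, FC)
)
match_far = correlation(
    spherical_response(POS, 1e4, 0.0, FC), planar_response(POS, 1e4, 0.0, FC)
)

# Spatial-wideband effect: a combiner matched at FC is reused 1 GHz away.
F_EDGE = FC + 1e9
oblique = math.radians(60.0)
squint_broadside = correlation(
    spherical_response(POS, 5.0, 0.0, FC), spherical_response(POS, 5.0, 0.0, F_EDGE)
)
squint_oblique = correlation(
    spherical_response(POS, 5.0, oblique, FC),
    spherical_response(POS, 5.0, oblique, F_EDGE),
)

print("plane-wave fit, source at 5 m  :", round(match_near, 3))
print("plane-wave fit, source at 10 km:", round(match_far, 3))
print("band-edge fit, broadside       :", round(squint_broadside, 3))
print("band-edge fit, 60 degrees      :", round(squint_oblique, 3))
```

Under these assumptions, the plane-wave model should match the spherical-wave response closely only for the distant source, and the frequency mismatch should cost far more combining gain for the oblique source than for the broadside one. This is the kind of loss that a design ignoring the near-field and spatial-wideband effects would incur.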