When we digitize data from a hyperspectral imager, we do so in three dimensions: the radiometric dimension, the spectral dimension, and the spatial dimension(s). The output can be regarded as a random variable taking values from a discrete alphabet, which allows simple estimation of the variable's entropy, i.e., its information content. By modeling the target/background state as a binary random variable and the corresponding measured spectra as a function thereof, we can compute the information capacity of a given sensor or sensor configuration. This capacity can be used as a measure of the separability of the two classes, and it also gives a bound on the sensor's performance. Changing the parameters of the digitizing process, basically how many bits and bands to spend, affects the information capacity, and we can thus try to find parameters where as few bits/bands as possible give as good class separability as possible. The parameters to be optimized in this way (with respect to the chosen target and background) are the spatial, radiometric, and spectral resolution, i.e., which spectral bands to use and how to quantize them. In this paper, we focus on the band selection problem, describe an initial approach, and show early results of target/background separation.
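To make the idea concrete, the following is a minimal sketch, not the method of this paper, of how the information shared between a binary target/background label and a quantized measurement might be estimated. It assumes a uniform quantizer and a simple plug-in (histogram) entropy estimate, and it uses the identity I(C; Y) = H(C) + H(Y) - H(C, Y). The function names `quantize`, `entropy`, and `mutual_information` are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize(x, bits, lo, hi):
    """Uniformly quantize values in [lo, hi] to 2**bits integer levels (assumed quantizer)."""
    levels = 2 ** bits
    scaled = (np.asarray(x, dtype=float) - lo) / (hi - lo)
    return np.clip(np.floor(scaled * levels), 0, levels - 1).astype(int)


def entropy(symbols):
    """Plug-in entropy estimate, in bits, of a sequence of discrete symbols.

    Each row of a 2-D input is treated as one (possibly multi-band) symbol.
    """
    _, counts = np.unique(np.asarray(symbols), return_counts=True, axis=0)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_information(labels, symbols):
    """Estimate I(C; Y) = H(C) + H(Y) - H(C, Y) from paired samples."""
    labels = np.asarray(labels).reshape(len(labels), -1)
    symbols = np.asarray(symbols).reshape(len(symbols), -1)
    joint = np.hstack([labels, symbols])
    return entropy(labels) + entropy(symbols) - entropy(joint)


if __name__ == "__main__":
    # Synthetic illustration only: band 0 depends on the class, band 1 is pure noise.
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=5000)
    band0 = rng.normal(loc=0.3 + 0.4 * labels, scale=0.1)
    band1 = rng.normal(loc=0.5, scale=0.1, size=labels.size)
    for name, band in (("band0", band0), ("band1", band1)):
        q = quantize(band, bits=3, lo=0.0, hi=1.0)
        print(name, mutual_information(labels, q))
```

Because the label is binary, the estimate cannot exceed H(C), which is at most 1 bit. A band or bit allocation whose estimate approaches that limit separates the two classes almost perfectly. Note that the plug-in estimate is biased upward when the number of samples is small relative to the alphabet size, so comparisons across many bits or bands should be read with caution.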