The sampling rate needs to be at least twice the highest frequency left in the signal after anti-aliasing filtering.
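To make the sampling requirement concrete, here is a minimal Python sketch. The function names and the 8 kHz example rate are my own assumptions for illustration, not anything from a particular library. It computes the minimum rate for a given top frequency and shows where a tone lands if it is sampled too slowly.

```python
def min_sample_rate(f_max_hz):
    """Lowest sampling rate (Hz) that satisfies the 2x rule for f_max_hz."""
    return 2.0 * f_max_hz


def aliased_frequency(f_hz, fs_hz):
    """Apparent frequency (Hz) of a real tone at f_hz after sampling at fs_hz.

    The spectrum of a sampled real signal repeats every fs_hz and is
    mirrored about fs_hz / 2, so the tone folds into the range [0, fs_hz / 2].
    """
    f_mod = f_hz % fs_hz
    return min(f_mod, fs_hz - f_mod)


# Hypothetical example: sampling at 8 kHz, so anything above 4 kHz aliases.
fs = 8000.0
for f in (1000.0, 3000.0, 5000.0):
    print(f"{f:.0f} Hz tone appears at {aliased_frequency(f, fs):.0f} Hz")
```

A 5 kHz tone sampled at 8 kHz is indistinguishable from a 3 kHz tone, which is why the anti-aliasing filter has to remove it before the sampler.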
Generally we would much rather demodulate all the signals first and then digitize them: each one is then at baseband, so it can be sampled at the lowest rate and takes the least space per stream. The alternative is to digitize the composite signal, whose sampling rate is set by the full band it occupies rather than by any single stream. Demodulation does add noise, but it is generally assumed to be insignificant compared to other noise sources.
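A rough way to see the trade-off is to compare sample rates for the two approaches. The sketch below is only illustrative: the channel plan (three 4 kHz channels stacked from 60 kHz) is a made-up example, the function names are hypothetical, and it assumes plain lowpass sampling of the composite, ignoring bandpass (under)sampling, which could reduce the composite figure.

```python
def per_stream_rates(channels):
    """Minimum sample rate (Hz) for each stream once demodulated to baseband.

    channels is a list of (low_edge_hz, bandwidth_hz) tuples; after
    demodulation only the bandwidth matters.
    """
    return [2.0 * bandwidth for (_low_edge, bandwidth) in channels]


def composite_rate(channels):
    """Minimum sample rate (Hz) to digitize the composite signal directly.

    Assumes lowpass sampling, so the rate is set by the highest frequency
    occupied by any channel.
    """
    top = max(low_edge + bandwidth for (low_edge, bandwidth) in channels)
    return 2.0 * top


# Hypothetical channel plan: three 4 kHz channels at 60, 64 and 68 kHz.
channels = [(60_000.0, 4_000.0), (64_000.0, 4_000.0), (68_000.0, 4_000.0)]

print("per-stream rates:", per_stream_rates(channels))
print("sum of baseband rates:", sum(per_stream_rates(channels)))
print("composite rate:", composite_rate(channels))
```

Under these assumptions each demodulated stream needs 8 kHz, 24 kHz in total, while the composite needs 144 kHz, which is the space saving the baseband approach is after.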