In this paper, training-based transmissions over a priori unknown Rayleigh block-fading channels are considered. The input signals are assumed to be subject to peak power constraints. Prior to data transmission, the channel fading coefficients are estimated in the training phase with the aid of pilot symbols. In this setting, the capacity and the capacity-achieving input distribution are studied. The magnitude distribution of the optimal input is shown to be discrete with a finite number of mass points. The capacity, bit energy requirements, and optimal resource allocation strategies are obtained through numerical analysis. Because of the peakedness constraints, the bit energy is shown to grow without bound as the SNR decreases to zero. Capacity and energy per bit are also analyzed under the assumptions that the transmitter interleaves the data symbols before transmission over the channel and that per-symbol peak power constraints are imposed. The performance of training-based and noncoherent transmission schemes is compared.