From www.faqs.org/.../section-6.html:
1. Transform the image into a suitable color space. This is a no-op for grayscale, but for color images you generally want to transform RGB into a luminance/chrominance color space (YCbCr, YUV, etc). The luminance component is grayscale and the other two axes carry the color information. The reason for doing this is that you can afford to lose a lot more information in the chrominance components than you can in the luminance component: the human eye is not as sensitive to high-frequency chroma info as it is to high-frequency luminance. (See any TV system for precedents.) You don't have to change the color space if you don't want to, since the remainder of the algorithm works on each color component independently and doesn't care what the data represents. However, you will get less compression, since you will have to code all the components at luminance quality. Note that colorspace transformation is slightly lossy due to roundoff error, but the amount of error is much smaller than what we typically introduce later on.
2. (Optional) Downsample each component by averaging together groups of pixels. The luminance component is left at full resolution, while the chroma components are often reduced 2:1 horizontally and either 2:1 or 1:1 (no change) vertically. In JPEG-speak these alternatives are usually called 2h2v and 2h1v sampling, but you may also see the terms "411" and "422" sampling. This step immediately reduces the data volume by one-half (2h2v) or one-third (2h1v), since two of the three components shrink to one-quarter or one-half of their original size. In numerical terms it is highly lossy, but for most images it has almost no impact on perceived quality, because of the eye's poorer resolution for chroma info. Note that downsampling is not applicable to grayscale data; this is one reason color images are more compressible than grayscale.
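The two steps above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular JPEG library: the function names rgb_to_ycbcr and downsample_2h2v are made up here, the conversion assumes the full-range YCbCr constants used by JFIF (the text above does not say which variant is meant), and the rounding and clamping choices are assumptions as well.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Assumes the full-range JFIF constants; other luminance/chrominance
    spaces (e.g. YUV) use different coefficients.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Rounding to integers is the source of the small roundoff loss;
    # clamping keeps each result inside the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def downsample_2h2v(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    `plane` is a list of rows of integers. For brevity this sketch
    assumes the width and height are both even.
    """
    out = []
    for row in range(0, len(plane), 2):
        out.append([
            # "+ 2" rounds the average to nearest instead of truncating.
            (plane[row][col] + plane[row][col + 1]
             + plane[row + 1][col] + plane[row + 1][col + 1] + 2) // 4
            for col in range(0, len(plane[row]), 2)
        ])
    return out
```

For a 4x2 image, the luminance plane keeps its 8 samples while each chroma plane passed through downsample_2h2v drops from 8 samples to 2, giving 12 samples in place of the original 24, which is the one-half reduction mentioned in step 2.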
Andreas Kriegl 2003-07-23