AUGUST 1996

Image File Formats

by Wayne Corbin

I was recently asked which format is best for storing images. Unfortunately, the answer I give to this question is the same one I give to most questions: it depends on what you are doing with it.

Recently I started publishing a list of file extensions, and as you can see from that document there are a huge number of different file formats around. The common considerations used to decide on your format are:

What formats will the expected viewer programs accept?
How important is image size?
What quality of picture do I require?

As an example, I prefer GIF images over PCX images because you can save anywhere between 30 and 80 percent of the space without loss of image quality.

RASTER VERSUS VECTOR IMAGES

There are two kinds of images in use on computers - vector and raster. Raster images, used with paint programs, are composed of pixels, and each pixel has qualities (such as color) associated with it. The bigger the image, the more pixels it has, and the larger the file size. Vector images are not composed of pixels. These images include instructions used to reconstruct the objects that make up the complete image. For example, a circle might be described in terms of its center, radius, colour, shading, and the line weight used to draw it.

Both types of image file have their uses. The primary difference between them has to do with pixels. If you try to enlarge an image made of pixels (a raster image), you just turn the pixels into larger blocks. This enlarges rough edges, and at large sizes the jagged edges are obvious. A vector image, on the other hand, does not use pixels; you can enlarge the image to any size you need and still get smooth lines. Reducing a raster image also compromises the quality of the picture (data is lost), while a vector image can be scaled down without any quality loss (unless scaling is done to an extreme).
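To see why enlarged raster images look blocky, here is a minimal Python sketch of the simplest kind of enlargement, where every pixel is just repeated. The function name `enlarge` and the tiny test image are made up for illustration; real paint programs may use smoother methods.

```python
def enlarge(pixels, factor):
    """Enlarge a raster image by repeating each pixel `factor` times
    across and `factor` times down (pixel replication)."""
    enlarged = []
    for row in pixels:
        # Stretch the row horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the stretched row vertically.
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged

# A 3 x 3 diagonal line: 1 is ink, 0 is background.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

for row in enlarge(diagonal, 3):
    print("".join("#" if value else "." for value in row))
```

The printed result is a staircase of 3 by 3 blocks rather than a smooth diagonal, which is the jagged edge described above.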

Trying to convert a photograph to a vector image would be an exercise in stupidity. You should use raster images for photographic or freehand graphics and vector graphics for drawings and diagrams. If you are designing a logo, a vector format works better because you can scale the image to any size without the problems inherent with pixels. You can use a small version of the logo on letterhead, and a large version as the heading on a poster. If you plan to combine raster and vector images, you can either use a program that accepts both as imports or use a format that can incorporate both kinds of images, such as EPS or CGM files.

IMAGE COMPRESSION

High-resolution images use large amounts of memory and hard disk space. For example, a full color image of a single A4 page can occupy more than 20M!

Although hard disks big enough to store this kind of data are becoming more common, most users need some form of compression to make practical use of high-resolution images.
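The A4 figure above can be checked with a little arithmetic. The sketch below assumes a 300 dpi scan at 24 bits per pixel; the article does not say which resolution it had in mind, so treat those numbers (and the function name `uncompressed_bytes`) as illustrative.

```python
MM_PER_INCH = 25.4

def uncompressed_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Storage needed for an uncompressed raster image of the given
    physical size, resolution, and color depth."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8

# An A4 page is 210 mm by 297 mm.
a4_bytes = uncompressed_bytes(210, 297, dpi=300, bits_per_pixel=24)
print(f"A4 at 300 dpi, 24-bit: {a4_bytes:,} bytes")
```

Under those assumptions the page works out to roughly 26 million bytes, comfortably past the 20M mark.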

There are two different kinds of compression available. One method looks for repeating patterns in the source file. Instead of storing all the source data, it stores the patterns plus instructions for combining them to re-create the original file. Using this technique, the restored file is identical to the source file. This is the approach used by formats such as GIF and TIFF.

The other kind of compression is called lossy compression (not lousy!). As the name implies, some data is lost during compression. Instead of looking for repeating patterns, lossy compression methods perform sophisticated mathematical analysis on the source data. These methods vary, but they are all complicated, full of references to cosines, DCT (discrete cosine transform), forward DCTs, Q-factors, and quantization matrices.
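The first kind of compression, where the restored file is identical to the source, can be tried with Python's standard `zlib` module. This is only a stand-in: `zlib` is a pattern-based compressor of the same general family, not the exact method used inside GIF or TIFF files, and the striped test image is invented for the example.

```python
import zlib

# A made-up 100 x 50 greyscale image with vertical stripes:
# every row repeats the same pattern, so there is plenty to find.
row = bytes([255] * 40 + [0] * 20 + [255] * 40)
image = row * 50

packed = zlib.compress(image)
restored = zlib.decompress(packed)

print("original size:  ", len(image))
print("compressed size:", len(packed))
print("identical after restoring:", restored == image)
```

An image with little repetition (a noisy photograph, for instance) would shrink far less, which is why pattern-based methods do best on drawings and screen shots.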

The better the image, the more data you have to deal with, and the amount of data grows rapidly as image quality increases. It is not unusual to see file sizes of 24M for large, 24-bit color images. Even the best compression techniques begin to falter with this much data.

There are many ways to store image data. The most common method used on PCs involves RGB: the proportions of red, green, and blue that make up the image. Looking at an image as a combination of red, green, and blue is not efficient. However, there are other ways of looking at color. For example, you can separate a color into its hue, saturation, and brightness. Any such description is a mathematical model for a color, and each model has advantages and disadvantages.

One of the most efficient and well-known compression methods is the fractal method. Very few programs support it (Fractal Painter, ...) because it is a patented algorithm and royalties are required for each program sold which incorporates the logic.
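The hue, saturation, and brightness description mentioned above can be explored with Python's standard `colorsys` module, which calls the model HSV. The wrapper name `rgb_to_hsb` and the choice of degrees and percentages for the result are my own conventions, not part of any file format.

```python
import colorsys

def rgb_to_hsb(red, green, blue):
    """Convert 0-255 RGB values to hue (degrees), saturation (%),
    and brightness (%)."""
    hue, saturation, brightness = colorsys.rgb_to_hsv(
        red / 255, green / 255, blue / 255
    )
    return round(hue * 360), round(saturation * 100), round(brightness * 100)

for name, rgb in [("red", (255, 0, 0)),
                  ("blue", (0, 0, 255)),
                  ("mid grey", (128, 128, 128))]:
    print(name, rgb, "->", rgb_to_hsb(*rgb))
```

Both models describe the same colors; the second simply keeps brightness apart from the color information, which is the kind of separation lossy methods take advantage of.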

Lossy compression techniques do have one saving grace: many offer a way to control the amount of loss. You can specify a factor that controls how severely the compression affects picture quality. If you can accept some image degradation, compression ratios of 40:1 or more are possible.

The most commonly supported lossy compression method is JPEG, named after the Joint Photographic Experts Group that defined it. JPEG first converts the image from RGB to a model that separates brightness from color, which makes compression simpler. It then breaks the image down into small blocks (8 by 8 pixels), applies a DCT to each block, and compresses that block before moving on to the next. By specifying a Q factor, you determine how coarsely the detail in each block is stored: the harder you compress, the more fine detail is dropped. This gives you some control over the degree of image loss involved in the compression. This explanation is oversimplified, but it gives you some idea how lossy compression can be varied.

Working block by block allows for compression-on-the-fly, but it introduces some problems too. Block boundaries can become over-emphasized, distorting an image at high compression ratios. JPEG also usually stores color detail more coarsely than brightness detail. Because this limitation matches the eye's reduced capability to distinguish fine color variations, this isn't usually a problem.

It should be noted that JPEG is a standard for compressing image data; it does not describe how the compressed data should be stored in a file. JFIF (JIF) is the standard for how JPEG-compressed data should be stored.
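The trade between quality and size can be sketched in a few lines of Python. This is a toy, not JPEG itself: it works on a single row of eight brightness values instead of a two-dimensional block, uses one `step` value where JPEG uses a whole quantization matrix, and skips the final packing stage. All names are made up for the example.

```python
import math

def dct(samples):
    """Forward discrete cosine transform (orthonormal DCT-II)."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        total = sum(
            value * math.cos(math.pi * (i + 0.5) * k / n)
            for i, value in enumerate(samples)
        )
        scale = math.sqrt((1 if k == 0 else 2) / n)
        coefficients.append(scale * total)
    return coefficients

def inverse_dct(coefficients):
    """Inverse of dct() above."""
    n = len(coefficients)
    samples = []
    for i in range(n):
        total = 0.0
        for k, value in enumerate(coefficients):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * value * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples

def lossy_round_trip(samples, step):
    """Transform, quantize with the given step, then restore.
    A larger step plays the role of a harsher quality setting."""
    quantized = [round(c / step) for c in dct(samples)]
    restored = inverse_dct([q * step for q in quantized])
    return quantized, [round(value) for value in restored]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # eight invented brightness values
for step in (1, 50):
    quantized, restored = lossy_round_trip(row, step)
    print(f"step {step}: stored {quantized}, restored {restored}")
```

With the small step nearly every coefficient survives and the row comes back almost unchanged. With the large step most coefficients become zero, which compresses very well, but the restored row is flattened and the detail is gone for good.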

Lossy compression has tremendous advantages when compared to conventional compression techniques. If nothing else, the ability to reach compression ratios of 10:1, 20:1, and beyond without seriously distorting an image dramatically overshadows the paltry 2:1 and 3:1 compression ratios achieved on text files. Because lossy compression involves sophisticated data handling, it takes time to perform the calculations. Several manufacturers offer hardware that has built-in image compression, but its use is limited to special situations. The benefits of hardware compression are significant: as much as 20 times faster than software compression. However, unless hardware compression/decompression becomes part of most PCs, software has to do the job.

Lossy compression has its problems. Compressing a restored file a second time (for example, after additional editing) results in further loss of data, and additional image degradation. Certain applications, such as medical imaging, cannot tolerate any loss at all.

IMAGE SIZES/RESOLUTIONS

There are many ways to store image data. The most common method used on PCs involves RGB: the proportions of red, green, and blue that make up the image. Using 8 bits (1 byte) to store the color information enables you to choose from as many as 256 colors for each pixel. Using fewer than 256 colors results in a low-quality image. Expanding the number of bits to 16 or 24 gives you 65,536 or 16.7 million colors, respectively (some 16-bit modes use only 15 of the bits, giving 32,768 colors). These images are much more photo-realistic, particularly at higher resolutions.

Bits per Pixel	Maximum colours per pixel
1	2 (Black & White)
4	16
8	256
16	65,536
24	16,777,216
32	16,777,216
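The figures in the table follow from a simple rule: each extra bit doubles the number of values a pixel can hold. A short Python check (the function name `max_colours` is just for illustration):

```python
def max_colours(bits_per_pixel):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits_per_pixel

for bits in (1, 4, 8, 16, 24):
    print(f"{bits:>2} bits per pixel: {max_colours(bits):,} colours")
```

The 32-bit row of the table is the exception to the rule. My understanding is that 32-bit formats commonly use 24 of the bits for color and keep the remaining 8 for other purposes, which would explain why the color count matches the 24-bit row.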

Different image file formats are capable of holding different quantities of colors. Each file format will have a reference to the number of bits-per-pixel that the format is capable of supporting.

I was once told by an “image expert” in our club that colour is more important than resolution. A true colour low res image is easier for people to interpret than a black & white high resolution image.

There is no advantage in storing an A4 raster image at 300 dpi if you wish to display it only on a video screen. The viewer will have to throw away data to shrink the image, and the results will be doubtful. It is better to scan the image at 75 dpi in the first instance. Likewise, scanning an image at 75 dpi to print on a 400 dpi printer will give dreadful results.
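To put numbers on the scanning advice, the Python sketch below works out how many pixels an A4 page produces at a given resolution, and how many printer dots each scanned pixel must be stretched over. The function names are hypothetical, and the page size of 210 mm by 297 mm is the standard A4 size.

```python
MM_PER_INCH = 25.4

def scan_size(width_mm, height_mm, dpi):
    """Pixel dimensions produced by scanning a page at the given dpi."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))

def printer_dots_per_pixel(scan_dpi, printer_dpi):
    """How many printer dots (in each direction) one scanned pixel
    covers when the print is the same physical size as the original."""
    return printer_dpi / scan_dpi

for dpi in (75, 300):
    width, height = scan_size(210, 297, dpi)
    print(f"A4 at {dpi} dpi: {width} x {height} pixels")

print(f"75 dpi scan on a 400 dpi printer: "
      f"{printer_dots_per_pixel(75, 400):.1f} dots per pixel")
```

The 75 dpi scan is already about the size of a typical video screen, while the 300 dpi scan has sixteen times as many pixels that a viewer would simply discard. On the 400 dpi printer, each pixel of the 75 dpi scan becomes a visible block more than five dots wide.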

RECOMMENDED IMAGE FORMATS TO EVALUATE FOR YOUR NEEDS:

JFIF, GIF, PCX, TIFF, MPEG

Guide to Image File Formats

BMP / DIB FILE FORMATS

OS/2 FORMAT: The OS/2 formats were the first of the two different formats designed. Images saved using this format may be used with OS/2's Presentation Manager. OS/2 BMP and DIB files are not compressed (RGB encoded).
WINDOWS FORMAT: An enhanced "DIB" file format was released with Microsoft Windows. Windows BMP and DIB files may be saved using no compression (RGB encoded) or using run length encoded compression (RLE encoded).
This format is very portable; however, it requires a reasonable amount of storage.

Format characteristics:
BMP-OS/2-RGB - Bits-per-pixel: 1, 4, 8, 24.
BMP-Windows-RGB - Bits-per-pixel: 1, 4, 8, 24.
BMP-Windows-RLE - Bits-per-pixel: 4, 8.
DIB-OS/2-RGB - Bits-per-pixel: 1, 4, 8, 24.
DIB-Windows-RGB - Bits-per-pixel: 1, 4, 8, 24.
DIB-Windows-RLE - Bits-per-pixel: 4, 8.
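Run length encoding, as used by the RLE variants above, replaces a run of identical pixels with a count and a single value. The Python sketch below shows the general idea with simple (count, value) byte pairs. It is not the actual byte layout of a Windows RLE file, which has extra escape codes; the function names are made up for the example.

```python
def rle_encode(data):
    """Encode bytes as (count, value) pairs, with runs capped at 255."""
    encoded = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        encoded += bytes([run, data[i]])
        i += run
    return bytes(encoded)

def rle_decode(encoded):
    """Expand (count, value) pairs back into the original bytes."""
    decoded = bytearray()
    for j in range(0, len(encoded), 2):
        count, value = encoded[j], encoded[j + 1]
        decoded += bytes([value]) * count
    return bytes(decoded)

# One row of an 8-bit image: ten background pixels, three of color 7,
# then five more background pixels.
row = bytes([0] * 10 + [7] * 3 + [0] * 5)
packed = rle_encode(row)
print(len(row), "bytes ->", len(packed), "bytes:", list(packed))
```

Note that a row with no repeats at all would double in size under this scheme, which is one reason RLE suits flat-colored drawings far better than photographs.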

CLP FILE FORMAT

The CLP file format is used by the Windows Clipboard viewer. These files may be saved or loaded into the clipboard. The clipboard supports many different internal formats.

Format characteristics:
CLP - Bitmap and DIB - Bits-per-pixel: 1, 4, 8, 24.

CUT FILE FORMAT

The CUT format comes from the Dr. Halo program. The CUT format does not contain palette information. The palette information for a CUT file is contained in a PAL file that has the same name (but with the PAL file extension). If no PAL file with the same name is contained in the same directory, the file is assumed to be a greyscale image.

Format characteristics:
CUT - Bits-per-pixel: 8.

EPS FILE FORMAT

EPS is short for Encapsulated PostScript. PostScript is a device-independent page description language for both text and graphics. This device independence allows the files to cross platforms and produce identical output on any PostScript printer. The language is very complex and can include more than just graphics. An EPS file can hold both vector and raster images.

Format characteristics:
EPS - Bits-per-pixel: 1, 4, 8, 24.

GIF FILE FORMAT

GIF files were designed to create the smallest possible image files for uploading and downloading from electronic Bulletin Board Systems (BBS). There are two GIF file versions: 87a and 89a. Version 87a was the first of the two versions to appear. Version 89a added new features to the 87a format.

Both versions may use an encoding method referred to as interlacing. When an image is saved by using four passes instead of just one, it is called interlacing. On each pass, certain lines of the image are saved to the file. If the program decoding a GIF file displays the image as it is decoded, the user will be able to see the four passes of the decoding cycle. This will allow the user to get a good idea of what the image will look like before even half of the image is decoded.
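The four passes can be sketched in Python. As I understand the GIF specification, the passes store every 8th row starting at row 0, every 8th row starting at row 4, every 4th row starting at row 2, and finally every 2nd row starting at row 1. The function name `interlaced_row_order` is my own.

```python
def interlaced_row_order(height):
    """Return the order in which the rows of an interlaced GIF are stored."""
    # (first row, spacing between rows) for each of the four passes
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order

print(interlaced_row_order(16))
```

After the first two passes only a quarter of the rows have arrived, yet they are spread evenly over the whole picture, which is why a rough preview appears so early.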

Some communication programs allow the user to download GIF files and view them as they are downloaded. If the image is interlaced, the user will be able to decide if the image is one they like before half of the download is complete. If the user does not like the image, the download can be aborted. This saves time and money for the person downloading the image. Both versions 87a and 89a may be interlaced and may contain more than one image.

Format characteristics:
All the GIF formats support bits-per-pixel: 1, 4, 8.

IFF FILE FORMAT

The IFF file format was developed by Electronic Arts for the Amiga computer. The file may contain more information than just the image. This extra information is generally for multimedia purposes. This is not a heavily supported format but is useful for corresponding with an Amiga.

Format characteristics:
IFF - Bits-per-pixel: 1, 4, 8.

IMG FILE FORMAT

IMG files were designed to work with the GEM environment. The files were originally the result of the GEM Paint program. Since the application Ventura Publisher worked in the GEM environment, it also supported the IMG file format. This image format is used in order to maintain compatibility with various desktop publishing applications.

Format characteristics:
IMG-Old Style - Bits-per-pixel: 1, 4, 8.
IMG-New Style - Bits-per-pixel: 1, 4, 8.
(Images that are more than 1 bit-per-pixel are greyscale images.)

JAS FILE FORMAT

The JAS file format was used before the JPEG file format had been sufficiently standardised. Now that the JPEG file format has been standardised, the JAS file format has become obsolete. If you want to take advantage of a lossy format for higher compression, use the JIF/JPG file format.

Format characteristics:
JAS - Bits-per-pixel: 8, 24.

JIF/JPG FILE FORMATS

The Joint Photographic Experts Group created a new standard known as JPEG. For some time, JPEG existed only as a series of required steps to compress an image. No standard was given as to how the resulting compressed image should be saved to a file. As a result, many JPEG files were being created that could not be read by any other application. Finally, a group of computer industry leaders developed a standard known as the JPEG File Interchange Format (JFIF). Originally these new JFIF files used the extension JPG. The latest standard by the RIF group calls for the use of JIF as the file name extension. Unless you have an application that requires the use of JPG, you should use the JIF file name extension. The JPEG file format only supports 24 bits-per-pixel images and 8 bits-per-pixel greyscale images. Since the JPEG file format is a lossy format, you should load your image after saving it to ensure that the amount of loss is acceptable.

Format characteristics:
JIF/JPG - Bits-per-pixel: 8, 24.

LBM FILE FORMAT

The LBM file format comes from Deluxe Paint. The file format uses a run length encoding compression to help reduce the size of the files. This is not a very popular image format.

Format characteristics:
LBM - Bits-per-pixel: 1, 4, 8.

MAC FILE FORMAT

MAC files come from the Macintosh program MacPaint. Large libraries of clip art exist in the MAC format. When the MAC files started migrating from the Macintosh to the PC world, a header was added to the file format. The MAC format requires an image width of 576 pixels and a height of 720 lines.

Format characteristics:
MAC-No Header - Bits-per-pixel: 1.
MAC-Header - Bits-per-pixel: 1.

MSP FILE FORMAT

MSP files come from the Microsoft Paint program (that came with Windows versions prior to version 3.0). There are two versions of the MSP file format.

Format characteristics:
MSP-Old Version - Bits-per-pixel: 1.
MSP-New Version - Bits-per-pixel: 1.

PCD FILE FORMAT

The PCD file is the Kodak Photo CD file format. This allows photographs to be placed on CDs for use with compact disc players that are connected to computers and televisions. The images are placed on the CDs by photofinishers that use the Kodak Photo CD imaging workstation. Five different sizes of each image are placed on the CD. When you open a PCD file, you will be prompted for which of the five sizes you would like to open. The largest size, 2048x3072 at 24 bits per pixel, requires 18.9 megs of memory to load.

Format characteristics:
PCD - Bits-per-pixel: 8, 24.
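The memory figure for the largest Photo CD size is easy to verify, and the same arithmetic gives an idea of the other four. The Python sketch below assumes each smaller size halves the width and height of the one above it; that halving is my assumption rather than something stated here, and the function name is invented.

```python
def photo_cd_sizes(largest=(2048, 3072), count=5, bits_per_pixel=24):
    """List (width, height, bytes needed) for each size, largest first.
    Assumes each step down halves both dimensions."""
    width, height = largest
    sizes = []
    for _ in range(count):
        sizes.append((width, height, width * height * bits_per_pixel // 8))
        width, height = width // 2, height // 2
    return sizes

for width, height, size in photo_cd_sizes():
    print(f"{width:>4} x {height:<4}  {size:>10,} bytes")
```

The largest entry comes to 18,874,368 bytes, which matches the 18.9 megs quoted above. Each step down needs only a quarter of the memory of the one before it.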

PCX FILE FORMAT

PCX files were originally created for use with the ZSoft Paintbrush program. With no standard in the industry, this format became the standard by default. This format is supported by more applications than any other format. Version 3 files do not contain palette information, which can result in a different-looking image depending on the viewer and the palette it uses. If you don’t mind the space usage, then this is the best format for compatibility with other people.

Format characteristics:
PCX Version 0 - Bits-per-pixel: 1.
PCX Version 2 - Bits-per-pixel: 1, 4.
PCX Version 3 - Bits-per-pixel: 1, 4.
PCX Version 5 - Bits-per-pixel: 1, 4, 8, 24.

PIC FILE FORMAT

The PIC files that are supported come from Pictor/PC Paint. This PIC file format is not compatible with the Lotus PIC files.

Format characteristics:
PIC - Bits-per-pixel: 1, 4, 8.

RAS FILE FORMAT

RAS files use the Sun Microsystems raster file format. There are three types of RAS files:

Type 0 - Old style.
Type 1 - Modern style.
Type 2 - Experimental.

Unless you are playing with Unix, forget this format.

Format characteristics:
RAS-Type 1-Modern Style - Bits-per-pixel: 1, 8, 24, 32.

RLE FILE FORMATS

The RLE format comes in two types: Windows and CompuServe. The CompuServe format is very limited in what it can hold. Images must always be 1 bit-per-pixel and the size of the image must be either 256 by 192 or 128 by 96. Windows RLE files are Windows "DIB" files that use one of the RLE compression routines. Saving an image as a DIB or BMP using one of the RLE compressions produces a file identical to the one produced by saving the image as an RLE file. The only difference is the file name extension. An RLE image file may be used as a replacement opening screen for Windows.

Format characteristics:
RLE - CompuServe - Bits-per-pixel: 1.
RLE - Windows - Bits-per-pixel: 4, 8.

TGA FILE FORMAT

The Targa TGA format was developed by Truevision for their Targa and Vista products. It is an industry standard, although not as widely supported as the PCX or TIFF formats. TGA files may be saved as non-compressed or compressed (run-length encoded).

Format characteristics:
TGA - No Compression - Bits-per-pixel: 8, 16, 24, 32.
TGA - Compressed - Bits-per-pixel: 8, 16, 24, 32.

TIFF FILE FORMAT

The Tagged Image File Format (TIFF) was designed to become the standard format. To that end, it was designed to handle just about any possibility. The resulting flexibility means there is a practically unlimited number of ways a TIFF image can be saved, so no application can claim to support all TIFF variations. The best that an application can do is to support as many TIFF variations as possible, but there will always be an obscure variation that will cause a problem. The TIFF format differentiates between types of images. These categories are: black and white, greyscale, and color.

Format characteristics:
TIFF-No Compression - Bits-per-pixel: 1, 4, 8, 24.
TIFF-Huffman - Bits-per-pixel: 1.
TIFF-Pack Bits - Bits-per-pixel: 1, 4, 8, 24.
TIFF-LZW - Bits-per-pixel: 1, 4, 8, 24.
TIFF-Fax Group 3 - Bits-per-pixel: 1.
TIFF-Fax Group 4 - Bits-per-pixel: 1.

WMF FILE FORMAT

The WMF format is the Windows Meta File. A meta file is a vector image rather than a raster image like most of the other formats. When you open a WMF file, the image may be rescaled to a new size. If the original size of the image is known, it will be displayed in the dialog box for your reference. You may then enter the width and height at which you would like the meta file displayed.

Format characteristics:
WMF - Bits-per-pixel: 1, 4, 8, 24.

WPG FILE FORMAT

The WPG format is the format used by WordPerfect. It first appeared with the release of WordPerfect 5.0. With the release of version 5.1, the format was changed. A WPG file may contain an image made up of vector data or raster data (a bitmapped image).

Format characteristics:
WPG Version 5.0 - Bits-per-pixel: 1, 4, 8.
WPG Version 5.1 - Bits-per-pixel: 1, 4, 8.