Most file formats, standardized and proprietary alike, consist of a binary stream of data (in contrast to ASCII text).
Examples include:
Word (.doc) files
PDF files
Audio (.wav, .aif, .snd) files
MIDI (.mid) files
A standard Unix utility for "reading" binary files is hexdump (the older od utility serves the same purpose).
When reading binary data, it is generally necessary to read a certain number of raw bytes and then convert or cast those bytes to a specific data type (long, short, etc.).
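The read-then-convert step can be sketched in Python with the standard struct module. The byte values below are made up for illustration, and the "<" prefix in each format string selects little-endian byte order (endianness is discussed next).

```python
import io
import struct

# Eight raw bytes standing in for the start of a binary file (made-up data).
raw = bytes([0x01, 0x00, 0x02, 0x00, 0x00, 0x00, 0xFF, 0xFF])
stream = io.BytesIO(raw)

# Read 2 raw bytes and convert them to an unsigned 16-bit integer ("short").
(short_value,) = struct.unpack("<H", stream.read(2))

# Read 4 raw bytes and convert them to a signed 32-bit integer ("long").
(long_value,) = struct.unpack("<i", stream.read(4))

# Read the final 2 bytes as a signed 16-bit integer.
(signed_short,) = struct.unpack("<h", stream.read(2))

print(short_value, long_value, signed_short)  # 1 2 -1
```

The same two bytes 0xFF 0xFF would give 65535 if unpacked as an unsigned short, so the data type matters as much as the byte count.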
With binary files, it is almost always necessary to account for the "endianness" of the file data.
Some processors, such as those by Intel, store multi-byte data with the lowest order bytes first (or left-to-right order). For example, a 32-bit integer, with byte0 as its lowest order byte, would be arranged in memory as byte0 byte1 byte2 byte3. These systems are referred to as "little endian".
Other processors, such as those by Motorola, store multi-byte data with the highest order bytes first (or right-to-left order). For example, the same 32-bit integer would be arranged as byte3 byte2 byte1 byte0. These systems are referred to as "big endian".
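As a small illustration of the two layouts, Python's built-in int.to_bytes can serialize one value both ways. The value 0x0A0B0C0D is an arbitrary choice, picked so that each byte is easy to recognize.

```python
# byte3 = 0x0A is the highest order byte, byte0 = 0x0D the lowest.
value = 0x0A0B0C0D

little = value.to_bytes(4, byteorder="little")  # lowest order byte first
big = value.to_bytes(4, byteorder="big")        # highest order byte first

print(little.hex(" "))  # 0d 0c 0b 0a
print(big.hex(" "))     # 0a 0b 0c 0d

# Interpreting big-endian bytes as little endian yields a different number.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))     # 0xd0c0b0a
```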
A binary file format must likewise specify a particular endianness, unless the file standard allows either.
It is critical that you know the endianness associated with a particular file type before attempting to read or write such a file. For example, audio WAV files are little endian. MIDI files are big endian. By convention, data transmitted over networks is big endian ("network byte order").
Is one way better than the other? Little-endian formats are typically more efficient for math routines because of the 1-to-1 correspondence of byte number and address offset. With big-endian formats, the sign of a number is easy to determine (from the highest byte, which comes first) and numbers are stored in the same order as they are printed, allowing binary to decimal conversions to be efficient.
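To see why this matters, here is a sketch that reads two 4-byte size fields, one as it would appear in a WAV file and one as it would appear in a MIDI file. The byte values are typical ones (a 16-byte WAV "fmt " chunk and the 6-byte MIDI header length), written out by hand rather than read from real files.

```python
import struct

wav_size_bytes = bytes([0x10, 0x00, 0x00, 0x00])   # little endian: 16
midi_size_bytes = bytes([0x00, 0x00, 0x00, 0x06])  # big endian: 6

wav_size = struct.unpack("<I", wav_size_bytes)[0]    # "<" = little endian
midi_size = struct.unpack(">I", midi_size_bytes)[0]  # ">" = big endian
print(wav_size, midi_size)  # 16 6

# Reading the WAV field with the wrong byte order gives a nonsense size.
wrong_size = struct.unpack(">I", wav_size_bytes)[0]
print(wrong_size)  # 268435456

# "!" selects network byte order, which is big endian.
port_bytes = struct.pack("!H", 80)
print(port_bytes.hex(" "))  # 00 50
```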
Byte-swapping routines are provided in the Stk base class, as well as preprocessor definitions that identify the endianness of various operating systems.
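The idea behind such routines can be sketched in a few lines. The helpers below are illustrative only and are not the Stk implementations; the names swap16 and swap32 and their integer-in, integer-out signatures are assumptions made for this example. In Python, sys.byteorder plays the role of an endianness definition for the host system.

```python
import sys

def swap16(value):
    """Reverse the byte order of an unsigned 16-bit integer."""
    return ((value & 0x00FF) << 8) | ((value & 0xFF00) >> 8)

def swap32(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))

# True on Intel-style hosts, False on big-endian hosts.
host_is_little_endian = (sys.byteorder == "little")

print(hex(swap16(0x1234)))      # 0x3412
print(hex(swap32(0x0A0B0C0D)))  # 0xd0c0b0a
```

A reader for a big-endian format such as MIDI would apply the swap only when the host is little endian, and swapping twice restores the original value.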
It is possible to use routines for reading and writing multi-byte data that require no "special" swapping code (see What's this business about endianness?).
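One way to write such routines is to assemble each value from individual bytes using shifts, so that the result depends only on the byte order of the file and never on that of the host. The function names below are hypothetical, chosen for this sketch.

```python
def read_u32_be(data, offset=0):
    """Build an unsigned 32-bit value from 4 big-endian bytes."""
    b = data[offset:offset + 4]
    if len(b) != 4:
        raise ValueError("need 4 bytes")
    return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]

def read_u32_le(data, offset=0):
    """Build an unsigned 32-bit value from 4 little-endian bytes."""
    b = data[offset:offset + 4]
    if len(b) != 4:
        raise ValueError("need 4 bytes")
    return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24)

def write_u32_be(value):
    """Split an unsigned 32-bit value into 4 bytes, highest order first."""
    return bytes([(value >> 24) & 0xFF, (value >> 16) & 0xFF,
                  (value >> 8) & 0xFF, value & 0xFF])

data = bytes([0x00, 0x00, 0x00, 0x06])
print(read_u32_be(data))  # 6
print(read_u32_le(data))  # 100663296
```

Because the shifts operate on values rather than on memory layout, the same code gives the same answer on little-endian and big-endian machines.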
The Standard MIDI File format was adopted in 1988 as an extension to the MIDI specification, primarily to allow the exchange of sequence data created by different programs.
There are three types of MIDI files:
Format 0: the MIDI data is represented in a single track, though perhaps using several MIDI channels.
Format 1: the MIDI data is represented by multiple tracks, all synchronized to a common time representation (the first track should provide a tempo map).
Format 2: the MIDI data is represented with multiple independent tracks, perhaps a collection of Format 0 sequences.
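The format type is stored in the header chunk at the start of the file: the ASCII identifier "MThd", a 32-bit chunk length (6), and then three 16-bit fields for format, number of tracks, and time division, all big endian. A minimal sketch of reading it follows; the function name parse_midi_header and the hand-written example bytes are assumptions for illustration.

```python
import struct

def parse_midi_header(data):
    """Decode the 14-byte header chunk of a Standard MIDI File."""
    if len(data) < 14:
        raise ValueError("header chunk is 14 bytes long")
    # ">" = big endian: 4-char id, 32-bit length, three 16-bit fields.
    chunk_id, length, fmt, ntracks, division = struct.unpack(
        ">4sIHHH", data[:14])
    if chunk_id != b"MThd":
        raise ValueError("not a Standard MIDI File")
    if fmt not in (0, 1, 2):
        raise ValueError("unknown MIDI file format")
    return {"length": length, "format": fmt,
            "tracks": ntracks, "division": division}

# Hand-built header: Format 1, 2 tracks, 480 ticks per quarter note.
header = b"MThd" + bytes([0, 0, 0, 6, 0, 1, 0, 2, 0x01, 0xE0])
info = parse_midi_header(header)
print(info["format"], info["tracks"], info["division"])  # 1 2 480
```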
In addition to note data, MIDI files can contain Meta-Events, which include specifications for tempo, time signature, key signature, sequence and track names, lyrics, cue points, score markers, timing resolution, copyright notices, and sequencer-specific information.
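As an example of how a Meta-Event is laid out, a Set Tempo event consists of the status byte 0xFF, the type byte 0x51, a length byte of 3, and a 24-bit big-endian tempo in microseconds per quarter note. The sketch below decodes one such event; parse_set_tempo is a hypothetical helper name, and the event bytes are written by hand.

```python
def parse_set_tempo(event):
    """Return microseconds per quarter note from a Set Tempo meta-event."""
    if (len(event) != 6 or event[0] != 0xFF
            or event[1] != 0x51 or event[2] != 0x03):
        raise ValueError("not a Set Tempo meta-event")
    # The tempo is a 3-byte big-endian integer.
    return int.from_bytes(event[3:6], byteorder="big")

tempo_event = bytes([0xFF, 0x51, 0x03, 0x07, 0xA1, 0x20])
usec_per_quarter = parse_set_tempo(tempo_event)
bpm = 60_000_000 / usec_per_quarter

print(usec_per_quarter, bpm)  # 500000 120.0
```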
All events in a MIDI file are time stamped with delta time values of variable length (up to 28 bits represented in 4 bytes of data). These values represent clock ticks, the exact duration of which is controlled via other MIDI file parameters or Meta-Events.
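In a variable-length value, each byte carries 7 bits of data, and its top bit is set on every byte except the last. The following sketch decodes such a value and converts ticks to seconds. The function names are hypothetical, and the tick conversion assumes the header's time division is given in ticks per quarter note.

```python
def read_delta_time(data, offset=0):
    """Decode a variable-length value; return (value, next_offset)."""
    value = 0
    for count in range(4):  # at most 4 bytes, 7 data bits each = 28 bits
        byte = data[offset + count]
        value = (value << 7) | (byte & 0x7F)
        if (byte & 0x80) == 0:  # top bit clear marks the last byte
            return value, offset + count + 1
    raise ValueError("delta time longer than 4 bytes")

def tick_seconds(usec_per_quarter, ticks_per_quarter):
    """Duration of one clock tick in seconds."""
    return usec_per_quarter / (ticks_per_quarter * 1_000_000)

print(read_delta_time(bytes([0x00])))                    # (0, 1)
print(read_delta_time(bytes([0x81, 0x00])))              # (128, 2)
print(read_delta_time(bytes([0xFF, 0xFF, 0xFF, 0x7F])))  # (268435455, 4)
```

The largest value, 268435455, is 0x0FFFFFFF, which is the 28-bit limit mentioned above.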