OptimFROG Lossless + DualStream + IEEE Float Audio Codec v5.xxx
Copyright (C) 1996-2015 Florin Ghido, all rights reserved.
Visit http://LosslessAudio.org/ for updates and more information.
@OptimFROG is also on Twitter. E-mail: florin.ghido@gmail.com




OptimFROG 4.5xx and later compressed file format specification
==============================================================

This document contains the compressed file format specification for
OptimFROG, and may be subject to change without prior notice.

The OptimFROG compressed file format was designed to be flexible and
efficient, and to permit adding new file formats, algorithm
improvements and additional (user) data without breaking backwards
compatibility.


Throughout this document, a sample means a single sample value, not a
sample point (one value for each channel). For example, the duration
of an audio file in seconds is samples / (channels * frequency).
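
As a small illustration of the formula above, the sketch below computes
a duration in seconds. The function name duration_seconds and its
parameter names are hypothetical and are not part of the format.

```python
def duration_seconds(total_samples: int, channels: int, sample_rate: int) -> float:
    """Duration in seconds, where total_samples counts single sample values."""
    if channels <= 0 or sample_rate <= 0:
        raise ValueError("channels and sample_rate must be positive")
    return total_samples / (channels * sample_rate)


# One minute of 44.1 kHz stereo audio holds 60 * 44100 * 2 sample values.
print(duration_seconds(5_292_000, 2, 44_100))  # 60.0
```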


All the data in an OptimFROG file is stored in blocks.

Every block has the following general fixed structure:

    4 bytes block ID
    4 bytes block length in bytes = N
    N bytes block data
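
The general structure above could be walked with a reader along the
following lines. This is only a sketch: the name read_blocks is
hypothetical, the block length is assumed to be unsigned, and the tag
blocks described later (ID3v1, ID3v2, APEv2) are not handled because
they do not carry this 8 byte header.

```python
import io
import struct


def read_blocks(stream):
    """Yield (block_id, data) pairs from a stream of regular blocks."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated block header")
        # 4 bytes block ID, then 4 bytes little-endian length
        # (treated as unsigned here, which is an assumption).
        block_id, length = struct.unpack("<4sI", header)
        data = stream.read(length)
        if len(data) < length:
            raise ValueError("truncated block data")
        yield block_id, data


# Synthetic input: a 3 byte 'HEAD' block followed by an empty 'TAIL' block.
demo = b"HEAD" + struct.pack("<I", 3) + b"abc" + b"TAIL" + struct.pack("<I", 0)
for block_id, data in read_blocks(io.BytesIO(demo)):
    print(block_id, len(data))
```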

The following block types are currently defined:


1. Main block, containing general information about the OFR file and
   the original audio file:

    4 bytes 'OFR '
    4 bytes main size = S >= 15
    6 bytes total number of samples in the audio file
    1 byte  sample type (see appendix A)
    1 byte  channel configuration (see appendix B)
    4 bytes sample rate
    2 bytes encoder ID (see appendix C)
    1 byte  compression (see appendix D)
    X bytes optional additional fields, where X = S - 15

If the main size is greater than 15, additional fields are present,
which must be ignored (skipped).
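
A minimal sketch of decoding the main block payload (the S bytes that
follow the block ID and size) is shown below. The names MainInfo and
parse_main_block are hypothetical, and all integer fields are assumed
to be unsigned.

```python
import struct
from typing import NamedTuple


class MainInfo(NamedTuple):
    total_samples: int
    sample_type: int
    channel_config: int
    sample_rate: int
    encoder_id: int
    compression: int


def parse_main_block(data: bytes) -> MainInfo:
    """Decode the first 15 bytes of the main block payload."""
    if len(data) < 15:
        raise ValueError("main block payload must be at least 15 bytes")
    # 6 byte little-endian sample count (no struct code exists for 6 bytes).
    total_samples = int.from_bytes(data[0:6], "little")
    # 1 + 1 + 4 + 2 + 1 bytes; "<" disables alignment padding.
    sample_type, channel_config, sample_rate, encoder_id, compression = (
        struct.unpack_from("<BBIHB", data, 6)
    )
    # Any bytes beyond the first 15 are optional fields and are skipped.
    return MainInfo(total_samples, sample_type, channel_config,
                    sample_rate, encoder_id, compression)


payload = ((88_200).to_bytes(6, "little") + bytes([3, 1])
           + struct.pack("<IHB", 44_100, 0x0010, 0x10) + b"\x00\x00")
print(parse_main_block(payload))
```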


2. Head block, containing all the data from the beginning of the audio
   file until the start of sample data:

    4 bytes 'HEAD'
    4 bytes head size = H
    H bytes audio file header data


3. Data block, containing compressed sample data and having all the
   information needed for independently decoding the block:

    4 bytes 'COMP'
    4 bytes data size = D
    4 bytes CRC32 of all the data after it (D - 4 bytes)
    4 bytes number of samples in the block
    1 byte  sample type
    1 byte  channel configuration
    2 bytes reserved for internal use
    2 bytes encoder ID
    P bytes compressed data stream, where P = D - 14
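
The data block payload (the D bytes after the block ID and size) could
be split as sketched below. This document does not state which CRC32
variant is used; the sketch assumes the common one computed by zlib,
which may not match real files, so the check can be switched off. The
name parse_data_block is hypothetical.

```python
import struct
import zlib


def parse_data_block(data: bytes, verify_crc: bool = True):
    """Return (fields, compressed_stream) for one data block payload."""
    if len(data) < 14:
        raise ValueError("data block payload must be at least 14 bytes")
    (stored_crc,) = struct.unpack_from("<I", data, 0)
    # ASSUMPTION: the stored CRC32 is the zlib-style CRC-32 of the
    # remaining D - 4 bytes. The specification does not confirm this.
    if verify_crc and zlib.crc32(data[4:]) != stored_crc:
        raise ValueError("CRC32 mismatch")
    num_samples, sample_type, channel_config, _reserved, encoder_id = (
        struct.unpack_from("<IBBHH", data, 4)
    )
    fields = {
        "num_samples": num_samples,
        "sample_type": sample_type,
        "channel_config": channel_config,
        "encoder_id": encoder_id,
    }
    # The remaining P = D - 14 bytes are the compressed data stream.
    return fields, data[14:]


body = struct.pack("<IBBHH", 1024, 3, 1, 0, 16) + b"\x01\x02\x03"
fields, stream = parse_data_block(struct.pack("<I", zlib.crc32(body)) + body)
print(fields["num_samples"], len(stream))
```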


4. Tail block, containing all the data from the end of the sample data
   to the end of the audio file:

    4 bytes 'TAIL'
    4 bytes tail size = T
    T bytes audio file footer data


5. Recovery block, containing recovery information, used to repair
   header errors and sector level errors:

    4 bytes 'RECV'
    4 bytes recovery size = R
    R bytes recovery data


6. ID3v1.1/ID3v1.0 block, recognized by the start data 'TAG' and
   exactly 128 bytes in length. If present, it must be the last block
   in the OFR file. The decoder is able to read or skip this tag.


7. ID3v2.x.x block, recognized by the start data 'ID3' and of
   variable length. If present, it must be the first block in the OFR
   file. The decoder is able to skip this tag.


8. APEv2 block, recognized by the start data 'APET' and of variable
   length. If present, it must be the last block in the OFR file, or
   come just before the ID3v1.1/ID3v1.0 block. The decoder is able to
   read or skip this tag.


9. MD5 signature block, containing the MD5 hash of the original raw
   PCM input data:

    4 bytes 'MD5 '
    4 bytes hash size = M = 16
    M bytes MD5 hash (stored in the displayed order)

The raw PCM data is packed exactly as in the original file. Currently,
OptimFROG supports only sample formats in little-endian byte order.

In OptimFROG versions up to and including 5.001, an MD5 hash value not
conforming to RFC 1321 was computed for files whose raw audio data
size in bytes, byteLength, is from 56 to 63 modulo 64. No file from
Audio CDs is affected. Instead of appending byteLength * 8 as the
message length in bits, the value (byteLength + 64) * 8 was used.
These hashes can still be checked normally using any version of the
command-line encoders.
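
The condition above can be tested with a few lines of code. The sketch
below uses hypothetical names and also illustrates why Audio CD data
is unaffected, under the assumption that such data always consists of
whole 2352 byte sectors: any multiple of 2352 is 0, 16, 32 or 48
modulo 64.

```python
CD_SECTOR_BYTES = 2352  # one Audio CD sector of 16-bit stereo PCM


def md5_affected(byte_length: int) -> bool:
    """True if versions up to 5.001 stored a non-conforming MD5 hash."""
    return byte_length % 64 >= 56


def appended_length_in_bits(byte_length: int) -> int:
    """Message length that those versions appended before hashing."""
    if md5_affected(byte_length):
        return (byte_length + 64) * 8  # the non-conforming value
    return byte_length * 8  # the RFC 1321 value


print(md5_affected(120), appended_length_in_bits(120))
# ASSUMPTION: CD-derived files contain a whole number of sectors.
print(any(md5_affected(n * CD_SECTOR_BYTES) for n in range(1, 10_000)))
```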


The layout of the OFR file is not fixed, but the order of the blocks
must satisfy the following relation:
['ID3'] 'OFR ' [...] 'HEAD' 'COMP'* 'TAIL' [...] ['MD5 '] ['RECV']
                                           [...] ['APET'] ['TAG']
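
One way to check a sequence of block IDs against this relation is to
map each ID to a single letter and use a regular expression. The
sketch below is an interpretation, not a normative check: it assumes
that each [...] stands for zero or more blocks of other (unlisted)
types, and the names used are hypothetical.

```python
import re

# One letter per known block; anything else is treated as "X".
_CODES = {"ID3": "I", "OFR ": "O", "HEAD": "H", "COMP": "C", "TAIL": "T",
          "MD5 ": "M", "RECV": "R", "APET": "A", "TAG": "G"}

# ['ID3'] 'OFR ' [...] 'HEAD' 'COMP'* 'TAIL' [...] ['MD5 '] ['RECV']
#                                            [...] ['APET'] ['TAG']
# ASSUMPTION: each [...] is zero or more unlisted blocks (X*).
_ORDER = re.compile(r"I?OX*HC*TX*M?R?X*A?G?")


def order_is_valid(block_ids) -> bool:
    """Check a list of block ID strings against the ordering relation."""
    encoded = "".join(_CODES.get(block_id, "X") for block_id in block_ids)
    return _ORDER.fullmatch(encoded) is not None


print(order_is_valid(["OFR ", "HEAD", "COMP", "COMP", "TAIL", "MD5 "]))
print(order_is_valid(["HEAD", "OFR ", "TAIL"]))
```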


Appendix A - sample type
~~~~~~~~~~~~~~~~~~~~~~~~

Sample type extends the number of bits per sample by specifying only
the type of the container used to store the data. The compressor can
analyze the data and optimally handle situations where the effective
number of bits is smaller (for example, 20-bit samples with the 4 LSBs
set to zero in a SINT24 container).

The following sample type values are currently defined:

    UINT8 = 0
    SINT8
    UINT16
    SINT16
    UINT24
    SINT24
    UINT32
    SINT32
    FLOAT32_0
    FLOAT32_16
    FLOAT32_24


Appendix B - channel configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Channel configuration extends the number of channels by also
specifying the channel mapping, which helps the compressor select the
optimal channel decorrelation scheme. The only information required
to be correct is the number of channels (for example, that the audio
has 3 channels, even if the correct mapping is unknown).

The following channel configuration values are currently defined:

    MONO = 0
    STEREO_LR


Appendix C - encoder ID
~~~~~~~~~~~~~~~~~~~~~~~

The encoder ID is composed of the encoder version information and
the encoder build system. They are obtained as follows:

    version = (encoderID >> 4) + 4500
    system = encoderID & 0xF

The version is represented in the x.xxx format. The build system is an
identifier for the specific OFR executable used to create the file.
This is used to easily track system dependent problems and bugs.

The following build system values are currently defined:

    Win/x86 or Win/x86+SSE2 = 0
    Linux/x86 or Linux/x86+SSE2
    Linux/x64
    OSX/ppc
    OSX/x86+SSE2
    OSX/x64
    Win/x64
    unknown = 0xF
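
The two formulas and the build system list can be combined as in the
sketch below. The names BUILD_SYSTEMS and decode_encoder_id are
hypothetical; values 7 to 14 are not defined in this document and are
reported as undefined.

```python
BUILD_SYSTEMS = {
    0: "Win/x86 or Win/x86+SSE2",
    1: "Linux/x86 or Linux/x86+SSE2",
    2: "Linux/x64",
    3: "OSX/ppc",
    4: "OSX/x86+SSE2",
    5: "OSX/x64",
    6: "Win/x64",
    0xF: "unknown",
}


def decode_encoder_id(encoder_id: int):
    """Return (version_text, build_system_name) for a 16-bit encoder ID."""
    version = (encoder_id >> 4) + 4500
    system = encoder_id & 0xF
    # Render e.g. 5100 in the x.xxx format as "5.100".
    version_text = f"{version // 1000}.{version % 1000:03d}"
    return version_text, BUILD_SYSTEMS.get(system, f"undefined ({system})")


# (5100 - 4500) << 4 | 2 == 0x2582
print(decode_encoder_id(0x2582))  # ('5.100', 'Linux/x64')
```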


Appendix D - compression
~~~~~~~~~~~~~~~~~~~~~~~~

The compression is composed of the mode and the speedup parameters.
They are obtained as follows:

    mode = compression >> 3
    speedup = compression & 7

The following mode values are currently defined:

    fast = 0
    normal
    high
    extra
    best
    ultra
    insane
    highnew
    extranew
    bestnew

The following speedup values are currently defined:

    1x = 0
    2x
    4x
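
As an illustration, the compression byte can be decoded as sketched
below. The names MODES, SPEEDUPS and decode_compression are
hypothetical; values outside the lists above are reported as
undefined.

```python
MODES = ["fast", "normal", "high", "extra", "best",
         "ultra", "insane", "highnew", "extranew", "bestnew"]
SPEEDUPS = ["1x", "2x", "4x"]


def decode_compression(compression: int):
    """Return (mode_name, speedup_name) for the 1 byte compression field."""
    mode = compression >> 3
    speedup = compression & 7
    mode_name = MODES[mode] if mode < len(MODES) else f"undefined ({mode})"
    speedup_name = (SPEEDUPS[speedup] if speedup < len(SPEEDUPS)
                    else f"undefined ({speedup})")
    return mode_name, speedup_name


# mode high (2) with speedup 2x (1): (2 << 3) | 1 == 17
print(decode_compression(17))  # ('high', '2x')
```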


Additional notes
~~~~~~~~~~~~~~~~

All values are stored in little-endian order, i.e. the least
significant byte first.

When there is an enumeration, the symbolic constants receive
consecutive integer values.
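
For instance, the sample type and channel configuration lists from
appendices A and B could be turned into enumerations as sketched
below, counting up from the explicit "= 0" on the first entry. The
class names and the CHANNEL_COUNT table are hypothetical.

```python
from enum import IntEnum

# start=0 because each list in this document begins with "= 0".
SampleType = IntEnum(
    "SampleType",
    ["UINT8", "SINT8", "UINT16", "SINT16", "UINT24", "SINT24",
     "UINT32", "SINT32", "FLOAT32_0", "FLOAT32_16", "FLOAT32_24"],
    start=0,
)
ChannelConfig = IntEnum("ChannelConfig", ["MONO", "STEREO_LR"], start=0)

# Number of channels implied by each defined configuration.
CHANNEL_COUNT = {ChannelConfig.MONO: 1, ChannelConfig.STEREO_LR: 2}

print(SampleType.SINT16.value, SampleType(10).name)  # 3 FLOAT32_24
print(CHANNEL_COUNT[ChannelConfig(1)])  # 2
```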
