Parsing IFF-style data using Python

Question

I have an IFF-style file (see below) whose contents I need to inspect in Python.

https://en.wikipedia.org/wiki/Interchange_File_Format

I can iterate through the file using the following code

from chunk import Chunk

def chunks(f):
    while True:
        try:
            c=Chunk(f, align=False, bigendian=False)
            yield c
            c.skip()
        except EOFError:
            break

if __name__ == "__main__":
    # open() works in both Python 2 and 3; file() exists only in Python 2
    with open("sample.iff", "rb") as f:
        for c in chunks(f):
            name, sz, value = c.getname(), c.getsize(), c.read()
            print(name, sz, value)

Now I need to parse the different values. I have had some success using Python's 'struct' module, unpacking different fields as follows

struct.unpack('<I', value)

or

struct.unpack('BBBB', value)

by experimenting with different formatting characters shown in the struct module documentation

https://docs.python.org/2/library/struct.html

This works with some of the simpler fields but not with the more complex ones. It is all very trial-and-error. What I need is some systematic way of unpacking the different values, some way of knowing or inspecting the type of data they represent. I am not a C datatype expert.

Any ideas? Many thanks.

SVOXVERS  BVER BPM }SPEDTGRDGVOL`NAME2017-02-15 16-38MSCLMZOOMXOFMYOFLMSKCURLTIMESELSLGENPATNPATTPATLPDTAa � 1pQ  10 `q !@QP! 0A �`A PCHNPLIN PYSZ PFLGPICO �m�!�a��Q�1:\<<<<:\�1�Q��a�!�mPFGCPBGC���PFFFPXXXPYYYPENDSFFFCSNAM OutputSFINSRELSXXXDSYYYhSZZZSSCLSVPRSCOL���SMICSMIB����SMIP����SLNK����SENDSFFFISNAM FMSTYPFMSFINSRELSXXX�SYYY8SZZZSSCLSVPRSCOL��SMICSMIB����SMIP����SLNKCVAL�CVAL0CVAL�CVALCVALCVALCVALCVALGCVALnCVAL\CVALCVAL&CVALoCVALDCVALCVALCVALCMID������������������SENDSFFFQSNAM EchoSTYPEchoSFINSRELSXXX�SYYY SZZZSSCLSVPRSCOL��SMICSMIB����SMIP����SLNK����CVALCVALCVAL�CVALCVALCVALCMID0������SENDSFFFQSNAM ReverbSTYPReverbSFINSRELSXXX\SYYY�SZZZSSCLSVPRSCOL��SMICSMIB����SMIP����SLNK����CVALCVALCVAL�CVAL�CVALCVALCVALCVALCVALCMIDH���������SENDSENDSENDSENDSEND

bigendian=False, are you sure? Those files are Amiga-related, right? They should be big-endian.
– Jean-François Fabre ♦, Commented Sep 3, 2017 at 11:35
If you say so. I am not a C datatype expert. All input gratefully received :-)
– Justin, Commented Sep 3, 2017 at 12:16
I would like to, but the question is difficult to answer. It doesn't have a minimal reproducible example, which is difficult because of the binary input.
– Jean-François Fabre ♦, Commented Sep 3, 2017 at 12:25

Jerry101 · Accepted Answer · 2018-12-10 00:22:37Z

If it's really an IFF file, it needs alignment and big-endian turned on (align=True, bigendian=True, which are the Chunk defaults), and the file would contain a single FORM chunk that in turn contains the FORM type, such as SVOX, and the content chunks. (Or it could contain a LIST or CAT container chunk.)

An IFF chunk has:

A four-character chunk-type code
A four-byte big-endian integer: length
length number of data bytes
A pad byte for alignment if length is odd
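The layout above can be sketched by hand with the struct module. This is a minimal illustration, not a replacement for the Chunk class; the function name read_chunk and the sample bytes are made up for the example, and the big-endian, padded layout is the strict IFF one, which your file may not follow.

```python
import io
import struct

def read_chunk(f):
    """Read one strict-IFF chunk from a binary file object.

    Returns (chunk_type, data), or None when no full header is left.
    """
    header = f.read(8)
    if len(header) < 8:
        return None
    # ">4sI": big-endian, four raw bytes of type code, then a 32-bit length
    chunk_type, length = struct.unpack(">4sI", header)
    data = f.read(length)
    if len(data) < length:
        raise EOFError("chunk data is shorter than its length field")
    if length % 2:
        f.read(1)  # skip the pad byte that follows odd-length data
    return chunk_type, data

# Hypothetical sample: an odd-length chunk (with pad byte), then an even one
sample = (b"NAME" + struct.pack(">I", 3) + b"abc" + b"\x00"
          + b"BPM " + struct.pack(">I", 2) + b"\x00\x7d")
stream = io.BytesIO(sample)
first = read_chunk(stream)
second = read_chunk(stream)
third = read_chunk(stream)
```

Here first is the NAME chunk, second is the BPM chunk, and third is None because the stream is exhausted.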

This is documented in "EA IFF 85". See the "EA IFF-85" Repository for the original IFF docs. [I wrote them.]

Some file formats like RIFF and PNG are variations on the IFF design, not conforming applications of the IFF standard. They vary the chunk format details, which is why Python's Chunk reader library lets you pick alignment and endianness, and leaves it to you to decide when to recurse into chunks.

By looking at your file in a hex/ascii dump and mapping out the chunk spans, you should be able to deduce whether it uses big-endian or little-endian length fields, whether each odd-length chunk is followed by a pad byte for alignment, and whether there are chunks within chunks.
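If mapping the spans by eye gets tedious, the same deduction can be roughed out in code: walk the data under each combination of byte order and padding and see which one reaches the end without overrunning. This is a sketch under assumptions: walk is a hypothetical helper, the sample bytes are invented, and a clean walk is only evidence for a layout, not proof. It also does not look inside container chunks.

```python
import struct

def walk(data, big_endian, align):
    """Return [(offset, type, length), ...] if the walk stays inside data, else None."""
    fmt = ">4sI" if big_endian else "<4sI"
    pos, spans = 0, []
    while pos < len(data):
        if pos + 8 > len(data):
            return None  # not enough bytes left for a chunk header
        ctype, length = struct.unpack_from(fmt, data, pos)
        end = pos + 8 + length
        if end > len(data):
            return None  # length field points past the end of the data
        spans.append((pos, ctype, length))
        # with alignment on, an odd-length chunk is followed by one pad byte
        pos = end + (length % 2 if align else 0)
    return spans

# Invented sample: little-endian lengths, no pad byte after the odd-length chunk
sample = (b"BVER" + struct.pack("<I", 4) + b"\x01\x02\x03\x04"
          + b"NAME" + struct.pack("<I", 3) + b"abc"
          + b"BPM " + struct.pack("<I", 4) + struct.pack("<I", 125))

results = {(be, al): walk(sample, be, al)
           for be in (True, False) for al in (True, False)}
for (be, al), spans in sorted(results.items()):
    print("big_endian=%s align=%s -> %s" % (be, al, spans))
```

On this sample only the little-endian, unaligned walk succeeds; on a real file, read the bytes with open(path, "rb").read() and compare the four results against what the hex dump shows.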

Now to the contents. A chunk's type signals the format and semantics of its contents. Those contents could be a simple C struct or could contain variable-length strings. IFF itself does not provide metadata on that level of structure, unlike JSON and TIFF.
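Because the chunk type is the only hint about the layout, one workable pattern is a table that maps each type code to a decoder you have confirmed. The sketch below is illustrative only: the two layouts (a little-endian 32-bit integer for "BPM " and a NUL-terminated string for "NAME") are assumptions made for the example, not documented facts about this file format.

```python
import struct

def decode_u32(value):
    # Assumed layout: one little-endian unsigned 32-bit integer
    return struct.unpack("<I", value)[0]

def decode_cstring(value):
    # Assumed layout: text that ends at the first NUL byte, if there is one
    return value.split(b"\x00", 1)[0].decode("ascii", errors="replace")

# Hypothetical mapping; extend it as each chunk type is understood
DECODERS = {
    b"BPM ": decode_u32,
    b"NAME": decode_cstring,
}

def decode(chunk_type, value):
    """Decode value with the registered decoder, or return the raw bytes."""
    decoder = DECODERS.get(chunk_type)
    return decoder(value) if decoder else value

bpm = decode(b"BPM ", struct.pack("<I", 125))
name = decode(b"NAME", b"2017-02-15 16-38\x00")
unknown = decode(b"XXXX", b"\x01")
```

Unknown types fall through as raw bytes, so the table can grow one chunk type at a time.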

So try to find the documentation for the file format (SVOX?).

Otherwise try to reverse engineer the data. If you put sample data into an application that generates these files, you can try special cases, look for the expected values in the file, change just one parameter, then look for what changed in the file.
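The change-one-parameter approach is easier with a small byte-level diff of two saved files. A minimal sketch, assuming both files fit in memory; changed_ranges is a hypothetical helper and the two byte strings stand in for files saved before and after changing a single setting.

```python
import struct

def changed_ranges(a, b):
    """Return [(start, end), ...] byte ranges (end exclusive) where a and b differ."""
    ranges, start = [], None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i  # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))
    if len(a) != len(b):
        # anything past the shorter input counts as changed
        ranges.append((n, max(len(a), len(b))))
    return ranges

# Invented stand-ins for two saves that differ in one value (125 vs 140)
before = b"BPM " + struct.pack("<I", 4) + struct.pack("<I", 125)
after = b"BPM " + struct.pack("<I", 4) + struct.pack("<I", 140)
diff = changed_ranges(before, after)
```

If a change alters a chunk's length, everything after it shifts and the raw diff becomes noisy; in that case compare the files chunk by chunk instead.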

Finally, your code should call c.close(). c.close() will call c.skip() for you and also handle chunk closing, which includes safety checks for attempts to read the chunk afterwards.
