
How To Modify A Compressed Itxt Record Of An Existing File In Python?

I know this looks too simple, but I couldn't find a straightforward solution. Once saved, the iTXt chunk should be compressed again.

Solution 1:

It's not as simple as it looks; if it were, you would probably have found a straightforward solution already.

Let's start with the basics.

Can PyPNG read all chunks?

An important question, because modifying an existing PNG file is a large task. PyPNG's documentation doesn't start out well:

PNG: Chunk by Chunk

Ancillary Chunks

.. iTXt Ignored when reading. Not generated.

(https://pythonhosted.org/pypng/chunk.html)

But lower on that page, salvation!

Non-standard Chunks

Generally it is not possible to generate PNG images with any other chunk types. When reading a PNG image, processing it using the chunk interface, png.Reader.chunks, will allow any chunk to be processed (by user code).

So all I have to do is write this 'user code', and PyPNG can do the rest. (Oof.)

What about the iTXt chunk?

Let's take a peek at what you are interested in.

4.2.3.3. iTXt International textual data

.. the textual data is in the UTF-8 encoding of the Unicode character set instead of Latin-1. This chunk contains:

Keyword:             1-79 bytes (character string)
Null separator:      1 byte
Compression flag:    1 byte
Compression method:  1 byte
Language tag:        0 or more bytes (character string)
Null separator:      1 byte
Translated keyword:  0 or more bytes
Null separator:      1 byte
Text:                0 or more bytes

(http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.iTXt)

Looks clear to me. The optional compression ought not to be a problem, since

.. [t]he only value presently defined for the compression method byte is 0, meaning zlib ..

and I am pretty confident there is something existing for Python that can do this for me.
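
That something is the standard zlib module. A minimal sketch of the round trip, assuming Python 3 (where the data are bytes objects) and a made-up sample text:

```python
import zlib

# Hypothetical sample text; iTXt text is UTF-8, so encode before compressing.
text = "Dat was leuk. " * 4
raw = text.encode("utf-8")

compressed = zlib.compress(raw)          # produces a zlib stream (method 0)
restored = zlib.decompress(compressed)   # back to the original bytes

print(len(raw), len(compressed))
print(restored.decode("utf-8") == text)
```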

Back to PyPNG's chunk handling then.

Can we see the chunk data?

PyPNG offers an iterator, so indeed checking if a PNG contains an iTXt chunk is easy:

chunks() Return an iterator that will yield each chunk as a (chunktype, content) pair.

(https://pythonhosted.org/pypng/png.html?#png.Reader.chunks)

So let's write some code in interactive mode and check. I got a sample image from http://pmt.sourceforge.net/itxt/, repeated here for convenience. (If the iTXt data is not conserved here, download and use the original.)

itxt sample image

>>> import png
>>> imageFile = png.Reader("itxt.png")
>>> print imageFile
<png.Reader instance at 0x10ae1cfc8>
>>> for c in imageFile.chunks():
...     print c[0], len(c[1])
...
IHDR 13
gAMA 4
sBIT 4
pCAL 44
tIME 7
bKGD 6
pHYs 9
tEXt 9
iTXt 39
IDAT 4000
IDAT 831
zTXt 202
iTXt 111
IEND 0

Success!
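
For reference, the same listing can be produced with the standard library alone. The sketch below is not PyPNG's implementation; it only assumes the chunk layout from the PNG specification (4-byte big-endian length, 4-byte type, data, 4-byte CRC). The helper name iter_chunks is hypothetical, and the code is written for Python 3.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data):
    """Yield (chunk_type, content) pairs from the bytes of a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        content = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        # The CRC covers the type and the data, not the length field.
        if zlib.crc32(ctype + content) & 0xFFFFFFFF != crc:
            raise ValueError("bad CRC in %r chunk" % ctype)
        yield ctype, content
        pos += 12 + length

# Usage, assuming the sample file is in the current directory:
# with open("itxt.png", "rb") as f:
#     for ctype, content in iter_chunks(f.read()):
#         print(ctype.decode("ascii"), len(content))
```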

What about writing back? Well, PyPNG is usually used to create complete images, but fortunately it also offers a method to explicitly create one from custom chunks:

png.write_chunks(out, chunks) Create a PNG file by writing out the chunks.

So we can iterate over the chunks, change the one(s) you want, and write back the modified PNG.
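
To show what writing a chunk list involves, here is a standard-library sketch of the serialisation step. It is an illustration under the PNG specification's chunk layout, not PyPNG's own code; the names build_chunk and build_png are hypothetical, and Python 3 is assumed.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def build_chunk(ctype, content):
    """Serialise one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + content) & 0xFFFFFFFF
    return struct.pack(">I", len(content)) + ctype + content + struct.pack(">I", crc)

def build_png(chunks):
    """Join the signature and every (type, content) pair into one bytes object."""
    return PNG_SIGNATURE + b"".join(build_chunk(t, c) for t, c in chunks)

# An empty IEND chunk always serialises to the same 12 bytes.
print(build_chunk(b"IEND", b"").hex())
# 0000000049454e44ae426082
```

Because the CRC is recomputed from the chunk type and data, a modified iTXt chunk gets a valid checksum without any extra work.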

Unpacking and packing iTXt data

This is a task in itself. The data format is well described, but it is not suitable for Python's native struct.unpack and struct.pack functions, because the strings have variable length. So we have to invent something ourselves.

The text strings are stored in ASCIIZ format: a string ending with a zero byte. We need a small function to split on the first 0:

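def cutASCIIZ(data):
   # Split at the first zero byte; the zero itself is discarded.
   end = data.find(chr(0))
   if end >= 0:
      return [data[:end], data[end+1:]]
   # No zero byte found: nothing before, everything after.
   return ['', data]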

This quick-and-dirty function returns a [before, after] list and discards the zero itself.
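
Under Python 3 the chunk data are bytes rather than str, so the same idea can be sketched with bytes.partition. This is an assumed Python 3 variant under the hypothetical name cut_asciiz, not the function used in the rest of this answer:

```python
def cut_asciiz(data):
    """Split bytes at the first zero byte; return [before, after]."""
    before, sep, after = data.partition(b"\x00")
    if sep:
        return [before, after]
    # No zero byte found: mirror the original and treat everything as 'after'.
    return [b"", data]

print(cut_asciiz(b"Author\x00\x00\x00rest"))
# [b'Author', b'\x00\x00rest']
print(cut_asciiz(b"no terminator"))
# [b'', b'no terminator']
```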

To handle the iTXt data as transparently as possible, I make it a class:

class Chunk_iTXt:
  def __init__(self, chunk_data):
    # keyword, followed by the two single-byte compression fields
    tmp = cutASCIIZ(chunk_data)
    self.keyword = tmp[0]
    if len(tmp[1]):
      self.compressed = ord(tmp[1][0])
    else:
      self.compressed = 0
    if len(tmp[1]) > 1:
      self.compressionMethod = ord(tmp[1][1])
    else:
      self.compressionMethod = 0
    # language tag and translated keyword, both zero-terminated
    tmp = tmp[1][2:]
    tmp = cutASCIIZ(tmp)
    self.languageTag = tmp[0]
    tmp = tmp[1]
    tmp = cutASCIIZ(tmp)
    self.languageTagTrans = tmp[0]
    # whatever remains is the text, possibly zlib-compressed
    if self.compressed:
      if self.compressionMethod != 0:
        raise TypeError("Unknown compression method")
      self.text = zlib.decompress(tmp[1])
    else:
      self.text = tmp[1]

  def pack(self):
    result = self.keyword+chr(0)
    result += chr(self.compressed)
    result += chr(self.compressionMethod)
    result += self.languageTag+chr(0)
    result += self.languageTagTrans+chr(0)
    if self.compressed:
      if self.compressionMethod != 0:
        raise TypeError("Unknown compression method")
      result += zlib.compress(self.text)
    else:
      result += self.text
    return result

  def show(self):
    print 'iTXt chunk contents:'
    print '  keyword: "'+self.keyword+'"'
    print '  compressed: '+str(self.compressed)
    print '  compression method: '+str(self.compressionMethod)
    print '  language: "'+self.languageTag+'"'
    print '  tag translation: "'+self.languageTagTrans+'"'
    print '  text: "'+self.text+'"'

Since this uses zlib, it requires an import zlib at the top of your program.

The class constructor accepts 'too short' strings, in which case it will use defaults for everything undefined.

The show method lists the data for debugging purposes.
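
The class above is Python 2 code, where chunk data are ordinary strings. For Python 3, a comparable round trip could be sketched with two plain functions working on bytes. The names parse_itxt and pack_itxt and the dict layout are hypothetical, and unlike the class this sketch does not tolerate truncated input:

```python
import zlib

def parse_itxt(data):
    """Decode the body of an iTXt chunk (bytes) into a dict of its fields."""
    keyword, _, rest = data.partition(b"\x00")
    compressed, method = rest[0], rest[1]      # indexing bytes yields ints
    language, _, rest = rest[2:].partition(b"\x00")
    translated, _, text = rest.partition(b"\x00")
    if compressed:
        if method != 0:
            raise ValueError("Unknown compression method")
        text = zlib.decompress(text)
    return {
        "keyword": keyword.decode("latin-1"),
        "compressed": compressed,
        "language": language.decode("ascii"),
        "translated": translated.decode("utf-8"),
        "text": text.decode("utf-8"),
    }

def pack_itxt(fields):
    """Encode a dict as produced by parse_itxt back into chunk bytes."""
    text = fields["text"].encode("utf-8")
    if fields["compressed"]:
        text = zlib.compress(text)
    return (fields["keyword"].encode("latin-1") + b"\x00"
            + bytes([1 if fields["compressed"] else 0, 0])
            + fields["language"].encode("ascii") + b"\x00"
            + fields["translated"].encode("utf-8") + b"\x00"
            + text)

chunk = pack_itxt({"keyword": "Custom", "compressed": 1, "language": "nl",
                   "translated": "Aangepast", "text": "Dat was leuk."})
print(parse_itxt(chunk)["text"])
```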

Using my custom class

With all of this, examining, modifying, and adding iTXt chunks is finally straightforward:

import png
import zlib

# insert helper and class here

sourceImage = png.Reader("itxt.png")
chunkList = []
for chunk in sourceImage.chunks():
  if chunk[0] == 'iTXt':
    itxt = Chunk_iTXt(chunk[1])
    itxt.show()
    # modify existing data
    if itxt.keyword == 'Author':
      itxt.text = 'Rad Lexus'
      itxt.compressed = 1
    chunk = [chunk[0], itxt.pack()]
  chunkList.append (chunk)

# append new data
newData = Chunk_iTXt('')
newData.keyword = 'Custom'
newData.languageTag = 'nl'
newData.languageTagTrans = 'Aangepast'
newData.text = 'Dat was leuk.'
chunkList.insert (-1, ['iTXt', newData.pack()])

with open("foo.png", "wb") as outFile:
  png.write_chunks(outFile, chunkList)

When adding a totally new chunk, be careful not to append it to the end of the list, because then it will appear after the required last IEND chunk, which is an error. I did not try it, but you should probably also not insert it before the required first IHDR chunk or (as commented by Glenn Randers-Pehrson) in between consecutive IDAT chunks.
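
One way to guard against the first mistake is a small helper that refuses to work on a list that does not end with IEND. The name insert_before_iend is hypothetical. The sketch accepts chunk names as either str or bytes, on the assumption that which one you get depends on the Python and PyPNG version in use:

```python
def insert_before_iend(chunks, new_chunk):
    """Return a copy of chunks with new_chunk placed just before the final IEND."""
    if not chunks or chunks[-1][0] not in ("IEND", b"IEND"):
        raise ValueError("chunk list does not end with IEND")
    return list(chunks[:-1]) + [new_chunk, chunks[-1]]

# Placeholder chunk contents, for illustration only.
chunks = [("IHDR", b"..."), ("IDAT", b"..."), ("IDAT", b"..."), ("IEND", b"")]
result = insert_before_iend(chunks, ("iTXt", b"payload"))
print([name for name, _ in result])
# ['IHDR', 'IDAT', 'IDAT', 'iTXt', 'IEND']
```

For a well-formed list this puts the chunk in the same place as insert(-1, ...), but it fails loudly when IEND is missing instead of silently producing a broken file.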

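Note that according to the specification, the text and the translated keyword in an iTXt chunk must be UTF-8 encoded; the keyword itself is Latin-1 and the language tag is plain ASCII.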
