APNG is an extension of the PNG format that adds support for animated images. Modern browsers support it well, so we can use an APNG file directly in an img element. But if you need more control over playback, a deeper understanding is required. In this article, let's explore how APNG works by decoding it manually and playing it on a canvas.
File format
Let's see the file format of APNG first.
APNG is an extension of the PNG format, so the two share the same magic header and the same chunk structure. I assume you already have a basic understanding of the PNG file format. If not, please refer to my previous article, An Intro to PNG Decoder.
For the APNG spec, we normally refer to the two documents below:
APNG works by introducing a few new chunks that carry the animation information. Let's look at them one by one.
1. acTL
This chunk tells us how many frames the animation has and how many times it should be played. Its structure is as follows:
byte  field       type          description
0     num_frames  unsigned int  Number of frames
4     num_plays   unsigned int  Number of times to loop this APNG. 0 indicates infinite looping.
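As a small illustration of this layout, the sketch below reads both fields from the data part of an acTL chunk. The name parseAcTL is made up for this example, and it assumes the input holds only the 8 data bytes, without the length, type, and CRC fields. PNG stores integers in big-endian order, which is also the DataView default.

```javascript
// Hypothetical helper: parse the 8 data bytes of an acTL chunk.
function parseAcTL(data) {
  if (data.length !== 8) {
    throw new Error("acTL data must be exactly 8 bytes");
  }
  // respect byteOffset in case `data` is a view into a larger buffer
  const dv = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const numFrames = dv.getUint32(0); // big-endian by default
  const numPlays = dv.getUint32(4);
  return {
    numFrames,
    // 0 means "loop forever"
    numPlays: numPlays === 0 ? Infinity : numPlays,
  };
}

// Example: 3 frames, infinite looping
const acTLData = new Uint8Array([0, 0, 0, 3, 0, 0, 0, 0]);
const info = parseAcTL(acTLData);
console.log(info.numFrames, info.numPlays); // 3 Infinity
```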
2. fcTL
This chunk provides information about a single frame. For example, each frame has its own width and height. Let's look at its structure.
byte  field            type            description
0     sequence_number  unsigned int    Sequence number of the animation chunk, starting from 0
4     width            unsigned int    Width of the following frame
8     height           unsigned int    Height of the following frame
12    x_offset         unsigned int    X position at which to render the following frame
16    y_offset         unsigned int    Y position at which to render the following frame
20    delay_num        unsigned short  Frame delay fraction numerator
22    delay_den        unsigned short  Frame delay fraction denominator
24    dispose_op       byte            Type of frame area disposal to be done after rendering this frame
25    blend_op         byte            Type of frame area rendering for this frame
As you can see, the above information is mainly used to control how to play each frame.
3. fdAT
In PNG files, IDAT chunks contain the pixel data of the image. The fdAT chunk serves the same purpose for animation frames. It has the same structure as an IDAT chunk, except that the data is preceded by a 4-byte sequence number.
Those are all the new chunks we need to know. One more thing to pay attention to: to stay compatible with plain PNG decoders, an APNG file contains both IDAT chunks and the new fdAT chunks. So we need to know whether the IDAT data is the first frame of the animation.
This is decided by the position of the fcTL chunk. If an fcTL chunk appears before the IDAT chunks, it describes the animation behavior of the IDAT data, which means the IDAT data is the first frame. If not, the IDAT data is not part of the animation, and the animation starts with the fdAT chunks.
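The rule above can be sketched as a small check over the chunk types in file order. The helper name idatIsFirstFrame and the two example chunk lists are made up for illustration.

```javascript
// Hypothetical helper: given the chunk types in file order, report whether
// the IDAT data belongs to the animation (an fcTL chunk comes before it).
function idatIsFirstFrame(chunkTypes) {
  const firstIdat = chunkTypes.indexOf("IDAT");
  const firstFcTL = chunkTypes.indexOf("fcTL");
  if (firstIdat === -1) throw new Error("no IDAT chunk");
  return firstFcTL !== -1 && firstFcTL < firstIdat;
}

// fcTL before IDAT: the default image is also frame 0
const animatedDefault = ["IHDR", "acTL", "fcTL", "IDAT", "fcTL", "fdAT", "IEND"];
// IDAT before any fcTL: the default image is only a fallback
const fallbackDefault = ["IHDR", "acTL", "IDAT", "fcTL", "fdAT", "fcTL", "fdAT", "IEND"];

console.log(idatIsFirstFrame(animatedDefault)); // true
console.log(idatIsFirstFrame(fallbackDefault)); // false
```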
Strategy to play
Now let's talk about the strategy for playing APNG files. PNG files have a pretty complicated structure, and we do not want to implement a decoder from scratch. What we want is to control the animation ourselves while still relying on the browser's PNG decoder.
The strategy is this: we parse the APNG file, identify the frames and the animation data, extract each frame as an independent PNG image, and finally draw each image on the canvas according to the animation control info.
Let's see code
Enough with theory, let's see some code.
Let's look at the main process first. It has four steps.
async function main() {
  // 1. load APNG image
  const bytes = await loadImg(targetPath);
  // 2. parse the image
  const apng = parseApng(bytes);
  // 3. load each frame into an HTML img element
  await Promise.all(apng.frames.map(loadImgElement));
  // 4. play it on canvas
  play(apng);
}
Loading the image as a typed array is pretty simple.
async function loadImg(targetPath) {
  const response = await fetch(targetPath);
  if (!response.ok) {
    throw new Error(`failed to load ${targetPath}: ${response.status}`);
  }
  const arrayBuffer = await response.arrayBuffer();
  return new Uint8Array(arrayBuffer);
}
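Parsing is the key part, so let's break it down.
First, we check the magic number.
// PNG signature, shared by PNG and APNG
const magic = new Uint8Array([137, 80, 78, 71, 13, 10, 26, 10]);
let offset = 0;
if (
  bytes.length < 8 ||
  !bytes.slice(offset, offset + 8).every((v, i) => v === magic[i])
) {
  throw new Error("magic number check fail");
}
// skip the signature so that offset points at the first chunk
offset += 8;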
Then we iterate over all the chunks and collect the ones we need.
const dv = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
const otherChunks = [];
const frames = [];
let frame;
let ihdrChunk;
let acTLChunk;
let IENDChunk;
while (offset < bytes.length) {
  // full chunk size: 4(data-length) + 4(type) + data-bytes + 4(crc)
  const chunkLength = dv.getUint32(offset) + 12;
  const chunkType = new TextDecoder().decode(
    bytes.slice(offset + 4, offset + 8)
  );
  const chunk = bytes.slice(offset, offset + chunkLength);
  switch (chunkType) {
    case "IHDR": {
      ihdrChunk = chunk;
      break;
    }
    case "acTL": {
      acTLChunk = chunk;
      break;
    }
    case "fcTL": {
      // a new fcTL starts a new frame, so store the previous one
      if (frame) frames.push(frame);
      frame = {
        fcTLChunk: chunk,
        dataBytes: [],
      };
      break;
    }
    case "IDAT": {
      // IDAT belongs to the animation only if an fcTL came before it
      if (frame) {
        // only data-bytes
        // 4(data-length) + 4(type) + data-bytes + 4(crc)
        frame.dataBytes.push(chunk.slice(8, chunkLength - 4));
      }
      break;
    }
    case "fdAT": {
      if (!frame) throw new Error("unexpected fdAT chunk");
      // same structure as IDAT chunk
      // but with an extra sequence field at first
      frame.dataBytes.push(chunk.slice(12, chunkLength - 4));
      break;
    }
    case "IEND": {
      IENDChunk = chunk;
      break;
    }
    default:
      otherChunks.push(chunk);
  }
  offset += chunkLength;
}
// the last frame has no following fcTL, so store it here
if (frame) frames.push(frame);
Then we read the width and height from the IHDR chunk.
const apng = {};
if (!ihdrChunk) {
  throw new Error("no ihdr chunk");
}
// skip length (4) and type (4) to reach the chunk data
const ihdrChunkDv = new DataView(ihdrChunk.slice(8).buffer);
apng.width = ihdrChunkDv.getUint32(0);
apng.height = ihdrChunkDv.getUint32(4);
Then we read the number of plays from the acTL chunk.
if (!acTLChunk) {
  throw new Error("no acTL chunk");
}
// skip length (4) and type (4); num_plays sits at byte 4 of the data
const acTLChunkDv = new DataView(acTLChunk.slice(8).buffer);
apng.numPlays = acTLChunkDv.getUint32(4);
if (apng.numPlays === 0) {
  // 0 means infinite looping
  apng.numPlays = Infinity;
}
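Next, we iterate over the frames and rebuild a standalone PNG image from each one.
// runs once per frame; fcTLChunk and dataBytes come from the current frame
const chunks = [magic];
// copy the frame width and height from the fcTL data into the IHDR data
// both are full chunks, so their data starts at byte 8 (after length and type)
ihdrChunk.set(fcTLChunk.slice(12, 16), 8); // width
ihdrChunk.set(fcTLChunk.slice(16, 20), 12); // height
chunks.push(encodeChunk("IHDR", ihdrChunk.slice(8, ihdrChunk.length - 4)));
chunks.push(...otherChunks);
chunks.push(...dataBytes.map((dataPart) => encodeChunk("IDAT", dataPart)));
chunks.push(IENDChunk);
const image = new Blob(chunks, { type: "image/png" });
// loadImgElement reads the blob from frame.image later
frame.image = image;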
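The snippet above calls encodeChunk, which is not shown in this article. Here is one possible sketch of it, assuming it takes a 4-character chunk type and the data bytes as a Uint8Array and returns the complete chunk. The CRC is the standard PNG CRC-32, calculated over the type and data fields. The names CRC_TABLE and crc32 are made up for this example.

```javascript
// CRC-32 lookup table (reflected polynomial 0xEDB88320, as used by PNG).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Assumed signature: type is a 4-character string, data is a Uint8Array.
// Returns length (4) + type (4) + data + CRC (4) as one Uint8Array.
function encodeChunk(type, data) {
  const chunk = new Uint8Array(data.length + 12);
  const dv = new DataView(chunk.buffer);
  dv.setUint32(0, data.length);
  chunk.set(new TextEncoder().encode(type), 4);
  chunk.set(data, 8);
  // the CRC covers the type and data fields, but not the length field
  dv.setUint32(8 + data.length, crc32(chunk.subarray(4, 8 + data.length)));
  return chunk;
}

// An empty IEND chunk is a handy check because its bytes are always the same.
const iend = encodeChunk("IEND", new Uint8Array(0));
console.log(Array.from(iend, (b) => b.toString(16).padStart(2, "0")).join(" "));
```

The last line should print the 12 bytes of the standard IEND chunk, which ends with the CRC ae 42 60 82. Recomputing the CRC is needed here because the data of an fdAT chunk loses its sequence number when it is rewritten as IDAT, so the original CRC no longer matches.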
Then we read the animation control info for each frame.
// control info
const dv = new DataView(fcTLChunk.slice(8).buffer);
const sequenceNum = dv.getUint32(0);
const width = dv.getUint32(4);
const height = dv.getUint32(8);
const xOffset = dv.getUint32(12);
const yOffset = dv.getUint32(16);
const delayNum = dv.getUint16(20);
const delayDen = dv.getUint16(22);
const disposeOp = dv.getUint8(24);
const blendOp = dv.getUint8(25);
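The play function later passes frame.delay to setTimeout, so the delay fraction has to be turned into milliseconds at some point. A minimal sketch of that conversion is below. The helper name frameDelayMs is made up; the rule that a denominator of 0 is treated as 100 comes from the APNG spec.

```javascript
// Hypothetical helper: convert the fcTL delay fraction to milliseconds.
function frameDelayMs(delayNum, delayDen) {
  // per the APNG spec, a denominator of 0 is treated as 100
  const den = delayDen === 0 ? 100 : delayDen;
  // multiply first to avoid a fractional intermediate value
  return (delayNum * 1000) / den;
}

console.log(frameDelayMs(1, 10)); // 100
console.log(frameDelayMs(5, 0)); // 50
```

With this, the parsing step could store the result as frame.delay next to the other control fields.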
That is all for the parsing part.
The next step is to load the images into HTML img elements. This is trivial.
function loadImgElement(frame) {
  return new Promise((resolve, reject) => {
    const url = URL.createObjectURL(frame.image);
    const imageElement = document.createElement("img");
    imageElement.onload = () => {
      frame.imageElement = imageElement;
      resolve();
    };
    // without this, a frame that fails to decode leaves the promise pending
    imageElement.onerror = () => reject(new Error("failed to load frame image"));
    imageElement.src = url;
  });
}
And lastly, we can play all the images on the canvas.
function play(apng) {
  const canvas = document.createElement("canvas");
  canvas.width = apng.width;
  canvas.height = apng.height;
  document.getElementById("container").appendChild(canvas);
  const ctx = canvas.getContext("2d");
  let frameIndex = 0;
  let prevFrame = null;
  let numFramePlays = apng.numPlays * apng.frames.length;
  const draw = () => {
    const frame = apng.frames[frameIndex];
    frameIndex = frameIndex >= apng.frames.length - 1 ? 0 : frameIndex + 1;
    // dispose_op describes what happens to a frame's area after it has been
    // rendered, so apply the previous frame's dispose_op before drawing this one
    if (prevFrame && prevFrame.disposeOp === 1) {
      ctx.clearRect(
        prevFrame.xOffset,
        prevFrame.yOffset,
        prevFrame.width,
        prevFrame.height
      );
    }
    ctx.drawImage(
      frame.imageElement,
      frame.xOffset,
      frame.yOffset,
      frame.width,
      frame.height
    );
    prevFrame = frame;
    if (--numFramePlays <= 0) return;
    setTimeout(draw, frame.delay);
  };
  draw();
}
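The play function only handles the dispose_op value 1 and ignores blend_op. A fuller sketch of drawing one frame is below. The constant values are the ones defined by the APNG spec; the helper renderFrame and the state object are made up for this example, and edge cases such as a first frame with dispose_op 2 are left out.

```javascript
// Values defined by the APNG spec.
const DISPOSE_OP = { NONE: 0, BACKGROUND: 1, PREVIOUS: 2 };
const BLEND_OP = { SOURCE: 0, OVER: 1 };

// Hypothetical helper: draw one frame, honoring dispose_op and blend_op.
// `state` is a plain object that carries the previous frame between calls.
function renderFrame(ctx, frame, state) {
  const { xOffset: x, yOffset: y, width: w, height: h } = frame;
  const prev = state.prevFrame;
  if (prev) {
    if (prev.disposeOp === DISPOSE_OP.BACKGROUND) {
      // clear the previous frame's area to transparent
      ctx.clearRect(prev.xOffset, prev.yOffset, prev.width, prev.height);
    } else if (prev.disposeOp === DISPOSE_OP.PREVIOUS && state.saved) {
      // restore what was there before the previous frame was drawn
      ctx.putImageData(state.saved, prev.xOffset, prev.yOffset);
    }
  }
  // remember what is under this frame if it must be restored afterwards
  state.saved =
    frame.disposeOp === DISPOSE_OP.PREVIOUS ? ctx.getImageData(x, y, w, h) : null;
  if (frame.blendOp === BLEND_OP.SOURCE) {
    // SOURCE replaces the area, so clear it before drawing
    ctx.clearRect(x, y, w, h);
  }
  // OVER is what drawImage does by default (source-over compositing)
  ctx.drawImage(frame.imageElement, x, y, w, h);
  state.prevFrame = frame;
}
```

In the draw callback, the clearRect and drawImage calls could be replaced with renderFrame(ctx, frame, state), where state starts as an empty object.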
We should probably replace this setTimeout with requestAnimationFrame for better timing accuracy. You can see the result here: https://yaox023.github.io/apng/.
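One way to approach the requestAnimationFrame version is to derive the visible frame from the elapsed time, so that timer drift does not add up. The sketch below is only an outline under that assumption: frameIndexAt, playWithRaf, and the drawFrame callback are made-up names, delays are in milliseconds, and the num_plays limit is not handled.

```javascript
// Hypothetical helper: which frame should be visible after `elapsedMs`?
// `delays` holds each frame's delay in milliseconds.
function frameIndexAt(elapsedMs, delays) {
  const total = delays.reduce((sum, d) => sum + d, 0);
  if (total <= 0) return 0;
  let t = elapsedMs % total; // position inside the current loop
  for (let i = 0; i < delays.length; i++) {
    if (t < delays[i]) return i;
    t -= delays[i];
  }
  return delays.length - 1;
}

// Browser-only sketch: `drawFrame(index)` is an assumed callback.
function playWithRaf(delays, drawFrame) {
  const start = performance.now();
  let shown = -1;
  const tick = (now) => {
    const index = frameIndexAt(Math.max(0, now - start), delays);
    if (index !== shown) {
      drawFrame(index);
      shown = index;
    }
    requestAnimationFrame(tick);
  };
  requestAnimationFrame(tick);
}

console.log(frameIndexAt(250, [100, 100, 100])); // 2
console.log(frameIndexAt(320, [100, 100, 100])); // 0
```

Because a frame can depend on the frames drawn before it, a real player would still need to render any skipped frames in order instead of jumping straight to the computed index.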
At last
I got the idea from reading the source code of this repo: apng-js. If you want to check the working code and play with it yourself, see my repo: apng.