EPUB 2.0.1 Structure: A Simplified Overview

Author: Alyssa Riceman

Posted: 2021-09-10

Updated: 2023-10-20

Introduction

This is an overview of the internal structure of EPUB 2.0.1 files. (Which I’ll henceforth call just ‘EPUB’, no version number specified, for the sake of conciseness.) My goal in writing this is to provide a useful resource for programmers who want to create programs which generate well-formed EPUB files; I intend to summarize all essential information about the format’s internal structures for that purpose, while hopefully being briefer and less intimidating to read than the format’s official specifications (archive 1, archive 2, archive 3).

While I will be going into a fair amount of detail, this is an overview, not a fully detailed exposition. In particular, if you’re designing an EPUB reader, I’d recommend looking at the official specifications instead; my summary will convey the information necessary to write well-formed EPUB files, but the format has various optional frills—deprecated features, optional but not-typically-used file format support, and so forth—which a writer doesn’t need to know how to generate but which a reader does need to know how to parse, and my summary won’t cover all of those.

Basic Structure

At its highest level, an EPUB file is a ZIP file. Specifically, it’s a ZIP file with the following broad structure:

[Zip file root]/
    mimetype
    META-INF/
        container.xml
        [Optionally some other metadata files]
    [The part with the actual book content]

The other metadata files are pretty noncentral to the EPUB format; I’ll discuss them briefly, later, but not in great depth.

The part with the actual book content isn’t rigidly defined in terms of folder/file structure, but it consists of an OPF file describing the overall shape of the book content, an NCX file serving as the table of contents, plus the book content being summarized. A relatively conventional structure has the metadata files and the book content in a folder titled OEBPS, and the OPF and NCX files respectively named content.opf and toc.ncx, leading to this overall structure:

[Zip file root]/
    mimetype
    META-INF/
        container.xml
    OEBPS/
        content.opf
        toc.ncx
        [The various files making up the book, including optional subfolder structure for e.g. separating out text and images and so forth]

XML files stored within the ZIP file should all be well-formed XML 1.0 files. When they include references to one another or to other files in the ZIP, the references should always be relative, rather than absolute.

The ZIP file housing all of this has to meet a few broad format criteria in order to be a valid EPUB:

It has to be a single file; it can’t use the multi-file zip option.
It has to encode its filenames as UTF-8.
No filename can be longer than 255 bytes.
No file’s canonical path from the zip file’s root—counting both file and directory names—can be longer than 65535 bytes.
Filenames can’t contain the characters /, ", *, :, <, >, ?, or \, and they can’t end with a period (.).
Filenames are case-sensitive, but nonetheless you’re not allowed to include multiple files in one directory whose names differ only by case. No OEBPS/chapter1.xhtml and OEBPS/Chapter1.xhtml files together in the same EPUB, or anything along those lines.
Files can be uncompressed or deflated, but they can’t be compressed (by the zip program) via means other than deflate.
It can’t be encrypted; while parts of the internal EPUB content are allowed to be encrypted, that has to happen internally to the EPUB, not at the ZIP-file level.

…as well as having a properly placed mimetype file. Which brings us to:

mimetype

The mimetype file needs to be the first file in the ZIP archive’s linear file-order, and it needs to be uncompressed, unencrypted, and not contain any extra fields within its ZIP header. Its contents should be the ASCII string:

application/epub+zip

Taking all of these requirements together, the mimetype file serves as an easy way for readers to check whether the ZIP file they’re looking at is an EPUB file. However, it’s somewhat inconvenient for whoever is building the file, since normal use of zip tools doesn’t involve manually specifying the order in which files should be placed in the ZIP or whether they should be compressed. Make sure, when zipping the EPUB, that you use a tool which supports those bits of functionality and that the resulting file has its mimetype in the right place.

(If you did it right, then, looking at the file in a hex editor, you should see the string mimetypeapplication/epub+zip starting at offset 30.)
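As a rough sketch of the requirements above, here’s one way the archive could be written using Python’s standard zipfile module. The names write_epub and check_name are my own inventions for illustration, and the sketch only checks the per-name rules (length, forbidden characters, trailing period); it doesn’t check total path length or case-only collisions.

```python
import io
import zipfile

MIMETYPE = b"application/epub+zip"
FORBIDDEN = set('/"*:<>?\\')


def check_name(name: str) -> None:
    """Raise ValueError if one file or directory name breaks the rules."""
    if len(name.encode("utf-8")) > 255:
        raise ValueError(f"name longer than 255 bytes: {name!r}")
    if name.endswith(".") or FORBIDDEN & set(name):
        raise ValueError(f"forbidden character or trailing period: {name!r}")


def write_epub(target, files: dict[str, bytes]) -> None:
    """Write `files` (archive path -> content) to `target` as an EPUB-shaped ZIP.

    `files` should hold everything except the mimetype file, which is
    added here so that it always comes first.
    """
    for path in files:
        for part in path.split("/"):
            check_name(part)
    with zipfile.ZipFile(target, "w") as zf:
        # mimetype goes first, stored (uncompressed), with no extra field.
        info = zipfile.ZipInfo("mimetype")
        info.compress_type = zipfile.ZIP_STORED
        zf.writestr(info, MIMETYPE)
        # Everything else is allowed to be deflated.
        for path, data in files.items():
            zf.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
```

Passing an io.BytesIO() as the target makes it easy to inspect the resulting bytes and run the offset-30 check without touching the disk.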

META-INF/container.xml

The container.xml file is an XML file letting the reader know where the main files are that describe the rest of the book. A minimal container file looks like this:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
    <rootfiles>
        <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml" />
    </rootfiles>
</container>

The XML version declaration, container root element with the displayed version and xmlns attribute values, and rootfiles element are mandatory. The rootfiles element can contain one or more rootfile elements, each of which needs a full-path attribute and a media-type attribute. There shouldn’t be any elements or attributes besides those.

Of the rootfile elements, at least one—and, ideally, only one—should have a media type of application/oebps-package+xml. That one (or the first of them, if there’s more than one) is the core EPUB rootfile. But there can be other rootfiles besides that one, specifying different renditions of the text; for instance, you can point to an EPUB rootfile for ordinary reading use plus a PDF rendition of the book for use by printers. Whether your reader will do anything with non-core rootfiles is a different question, and the answer is probably “no”, so you mostly shouldn’t worry about this and should just define the one core rootfile.

Within each rootfile element, the full-path attribute should define a path from the ZIP file’s root (NOT from the location of container.xml) to the file defining one rendition of the text. (It can be either a standalone file, as with the PDF example above, or a file which points in turn to a bunch of others, as with the OPF file (whose structure will be discussed in greater depth below).) The media-type attribute should list the media type for that file.
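To make the “first rootfile with the OPF media type wins” rule concrete, here’s a small Python sketch using the standard xml.etree.ElementTree module. The function name find_opf_path is hypothetical, and it assumes the container file is well-formed.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_MEDIA_TYPE = "application/oebps-package+xml"


def find_opf_path(container_xml: bytes) -> str:
    """Return the full-path of the first rootfile with the OPF media type."""
    root = ET.fromstring(container_xml)
    # iter() walks the tree in document order, so the first match wins.
    for rootfile in root.iter(f"{{{CONTAINER_NS}}}rootfile"):
        if rootfile.get("media-type") == OPF_MEDIA_TYPE:
            return rootfile.get("full-path")
    raise ValueError("no OPF rootfile found in container.xml")
```

The returned path is relative to the ZIP file’s root, so it can be used directly as an archive member name.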

Other Files in META-INF

There are several other files which can optionally be included in META-INF, per the EPUB specification. These are:

manifest.xml
metadata.xml
signatures.xml
encryption.xml
rights.xml

None of these files are essential to an EPUB book, and most books will have no need of them. But, to briefly summarize what they’re each for:

manifest.xml is an OpenDocument Manifest file meeting this schema. It’s never made clear, within the EPUB specification, what it’s supposed to be useful for.

metadata.xml is very vaguely defined, but it’s an XML file, with all its elements namespaced, and it’s supposed to be used to hold some sort of metadata about the overall EPUB file. (Not about the core OPF book content; the OPF file has its own metadata section. This is for metadata for the overall EPUB file, on a higher level than just the OPF.) In practice readers will generally ignore this, given its lack of standardization, so you’re unlikely to have much use for it.

signatures.xml is an XML file listing digital signatures for files in the EPUB, for use if you want to sign your EPUB’s files. See Section 3.5.4 of the OCF specification for details, if you want them.

encryption.xml is an XML file listing off encryption information for files in the EPUB, for use if you want to encrypt your EPUB’s files. See Section 3.5.5 of the OCF specification for details, if you want them.

rights.xml is very vaguely defined, but it has to be a well-formed XML file, and it’s supposed to be used to list DRM-related information.

In all of these files, as in container.xml, any paths you might want to include should be relative to the ZIP file’s root rather than to META-INF.

Any files in META-INF other than these five and container.xml will be ignored by readers, so there’s no point putting them in.

The OPF File

The OPF file serves as the core file defining the structure of the EPUB. It should have a .opf extension. Technically, you can string together multiple XML files into a single OPF publication, with one (the main point of entry) getting the .opf extension and the rest getting ordinary .xml extensions; but, in practice, this is rarely helpful and you’re better off sticking with just a single file.

The broad structure of an OPF file is as follows:

<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="AN_APPROPRIATE_ID">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        [Metadata content]
    </metadata>
    <manifest>
        [Manifest content]
    </manifest>
    <spine toc="ANOTHER_APPROPRIATE_ID">
        [Spine content]
    </spine>
    <guide>
        [Guide content]
    </guide>
</package>

The XML version declaration, package element with the displayed version and xmlns attribute values and with a defined unique-identifier attribute, metadata element, manifest element, and spine element with a defined toc attribute are mandatory. The guide element is optional, but is useful often enough that in practice you’ll still usually end up including it.

Note that, while the package element needs a unique-identifier attribute and the spine element needs a toc attribute, those attributes’ values can be arbitrary strings; they don’t need to be exactly as shown in the overview here, and in fact they usually won’t. (Within the official EPUB specification, the given examples show their values as, respectively, "BookId" and "ncx".)

Each of the metadata, manifest, spine, and guide elements has a bunch of internal content which I skipped over in that structural overview for the sake of clarity. I will now go into each in turn.

(I won’t go over the tours element, which is listed in the OPF format specification and which readers support, but which is deprecated such that you probably shouldn’t use it.)

Metadata

The metadata element lists metadata for the book. The broad structure of the metadata element is as follows:

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>[The title of this book]</dc:title>
    <dc:identifier id="AN_APPROPRIATE_ID" opf:scheme="UUID">[IMAGINE A UUID HERE]</dc:identifier>
    <dc:language>[An RFC 3066/ISO639 language code; if you're writing in English, probably 'en']</dc:language>
    [Optionally a bunch of other metadata elements (which can also go before or interspersed with the above three, there are no hard limits on ordering)]
</metadata>

The metadata element, the dc:title element, the dc:identifier element with a defined id attribute, and the dc:language element are mandatory. The xmlns attributes on the metadata element, and the opf:scheme attribute on the dc:identifier element, are optional but highly recommended. (More detail on the opf:scheme attribute momentarily.)

There are fifteen metadata elements, standardized by Dublin Core, which serve as the primary basis of the EPUB metadata structure. With the exception of the previously-noted dc:title, dc:identifier, and dc:language elements, you can have zero or more of each. You need at least one of each of those three. To briefly summarize each DC element:

dc:title: the title of your book. You need at least one; you can have more than one if you want, but probably then your reader will just default to the first one you define.
dc:identifier: a string or number to uniquely identify your book. Examples could include an ISBN, a UUID, or an ASIN. You need at least one ID, and at least one of your IDs must have an id attribute specified whose content is the same as the OPF package element’s unique-identifier attribute. (Thus ensuring that each package has at least one unique identifier.) You can optionally stick an opf:scheme attribute on each identifier in order to explicitly note what sort of identifier it is; there’s no formal standardized listing of possible scheme attribute values, but a few I’ve encountered in the wild are "UUID", "calibre", "MOBI-ASIN", and "EbookISBN".
dc:language: a language code, compliant with RFC 3066 and ISO639, to identify your book’s language. In practice, you should generally just use one of the ISO language codes from here, preferring two-character codes over three-character codes when both options are present.
dc:creator: lists an author or other creator of your book. If you’ve got more than one of those, use more than one of these elements, ordered by intended display order. (First author gets the first creator element, in other words.) You can optionally stick an opf:file-as attribute on the element to specify an alternate name rendering for use in alphabetical listing and other such categorization systems; for example, <dc:creator opf:file-as="Riceman, Alyssa">Alyssa Riceman</dc:creator> would get me sorted by last name rather than first. You can also optionally stick an opf:role attribute on the element to describe what role the listed person had in the creation of the book; its value should be one of the MARC relator codes, which are defined on the two subpages of this page. (You can find a shortlist of those codes, selected for likely applicability to EPUB creator tags, in Section 2.2.6 of the OPF specification.) If none of those codes works, you can put an arbitrary custom role into that slot, with its value prefaced with oth.. For instance, oth.rubber_duck to credit the person who you used as a rubber duck when figuring out the code in your book.
dc:subject: used for arbitrary tagging. This is where you’d put the sorts of things you’d put in AO3 tags, where I (in my own EPUB collection) put notes on whether a given book is original fiction or original nonfiction or fanfiction, and so forth. Very versatile.
dc:description: a description of the book’s content. This is where you’d put e.g. the book’s back-cover synopsis.
dc:publisher: the book’s publisher. (As usual, if there’s more than one, use one tag for each.)
dc:contributor: like dc:creator except used to note less-significant contributions. The sorts of people you’d list on an acknowledgements page somewhere rather than putting their names on the front cover. Less likely than dc:creator to be displayed anywhere prominent in readers’ interfaces. Supports the same opf:file-as and opf:role attributes that dc:creator does.
dc:date: date of creation, publication, or modification. (But most typically publication.) Formatted per the specification here. In practice, you probably don’t need to go more fine-grained than YYYY-MM-DD. You can optionally stick an opf:event attribute on it, which supports the values "creation", "publication", and "modification", in order to make explicit which sort of date a given dc:date element is noting.
dc:type: a general category to which the book belongs (genre or suchlike).
dc:format: the media type of the book. Thoroughly unnecessary in EPUB; you and your reader already know it’s an EPUB.
dc:source: a link or reference to “a prior resource from which the publication was derived”.
dc:relation: a link or reference to a resource related to your book.
dc:coverage: a description of “the extent or scope of the content of the resource”, e.g. a place, period of time, or jurisdiction.
dc:rights: a copyright notice or similar declaration. Can be done as a link, but it’s recommended to inline it instead.

All of these with the exceptions of dc:identifier, dc:language, dc:date, dc:type, and dc:format can be optionally tagged with an xml:lang attribute, whose value is set to a language code of the same sort used in dc:language, in order to note that bit of metadata as being in a specific language. There’s no standardized implementation for what this means, but if your reader is smart enough to try to display content based on language settings, it lets you (for example) have multiple dc:title tags in different languages and let the reader pick the appropriate display title for the user’s language settings.

In practice, you won’t need to use most of these, especially the later ones, and most readers won’t display most of the later ones anywhere convenient. dc:creator, dc:subject, dc:description, dc:publisher, and the mandatory ones are all pretty useful; the rest are mostly skippable.

If you want to tag a bit of metadata that doesn’t fit neatly into any of the DC categories, you can do so via a meta element with name and content attributes. For instance, the element <meta name="translated_from" content="ja" /> would indicate that your book’s translated_from metadata value is ja (or, in human terms, that the book is translated from Japanese), where none of the DC elements would neatly support listing that information. (Of course, you’re at the mercy of your reader when it comes to displaying these nonstandard metadata elements, so in practice it might not help very much.)
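For a sense of how a generator might emit the three mandatory elements, here’s a minimal Python sketch. The function name metadata_xml and its default values ("en", "BookId", "UUID") are my own choices for illustration; the id value just needs to match the package element’s unique-identifier attribute. It builds the XML by string formatting, escaping the values so that characters like & don’t break well-formedness.

```python
from xml.sax.saxutils import escape, quoteattr


def metadata_xml(title: str, identifier: str, language: str = "en",
                 id_name: str = "BookId", scheme: str = "UUID") -> str:
    """Return a metadata element holding only the three mandatory entries."""
    return (
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" '
        'xmlns:opf="http://www.idpf.org/2007/opf">\n'
        f"    <dc:title>{escape(title)}</dc:title>\n"
        # quoteattr() escapes the value and adds the surrounding quotes.
        f"    <dc:identifier id={quoteattr(id_name)} "
        f"opf:scheme={quoteattr(scheme)}>{escape(identifier)}</dc:identifier>\n"
        f"    <dc:language>{escape(language)}</dc:language>\n"
        "</metadata>"
    )
```

Optional elements such as dc:creator or dc:subject could be appended in the same way; since ordering doesn’t matter, they can go anywhere inside the metadata element.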

Manifest

The manifest element lists the files making up your book. (Where, by ‘your book’, I mean the main EPUB body of your book, the part that the OPF file is describing. The manifest isn’t going to list your mimetype file or other higher-level structural elements, or your OPF file itself; but it lists everything below the OPF in the book structure.)

The broad structure of the manifest element is as follows:

<manifest>
    <item id="a_manifest_item" href="example.html" media-type="application/xhtml+xml" />
    <item id="another_manifest_item" href="example2.xml" media-type="application/xhtml+xml" />
    <item id="ANOTHER_APPROPRIATE_ID" href="toc.ncx" media-type="application/x-dtbncx+xml" />
    [Probably a bunch more items, for most books, although all you strictly need are the NCX and one file to go into the spine]
</manifest>

The manifest element is mandatory. It contains a list of item elements, each of which needs an id attribute (which should be unique among id attributes within the overall OPF file), an href attribute (pointing to a file, with path relative to the location of the OPF file in which the item is defined, and no fragment identifiers in the path), and a media-type attribute appropriate to the pointed-to file’s type.

There should be one item listing for each file in the book. The order in which the items are listed doesn’t matter, but no file should have more than one href pointing to it within the manifest.

Anything listed in the manifest needs to be contained within the EPUB’s ZIP archive; conversely, any files in the archive which aren’t listed in the manifest shouldn’t be referenced from anywhere in the files that are.
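Since manifest hrefs are relative to the OPF file while archive member names are relative to the ZIP root, checking the “everything listed must be present” rule involves a path conversion. Here’s a sketch of that in Python; resolve_href and missing_files are hypothetical names, and the percent-decoding step is an assumption on my part about how hrefs with escaped characters should be matched against archive names.

```python
import posixpath
from urllib.parse import unquote


def resolve_href(opf_path: str, href: str) -> str:
    """Turn a manifest href into a path relative to the ZIP root."""
    base = posixpath.dirname(opf_path)
    # normpath() collapses any "../" segments in the href.
    return posixpath.normpath(posixpath.join(base, unquote(href)))


def missing_files(opf_path: str, hrefs: list[str],
                  archive_names: list[str]) -> list[str]:
    """Return the manifest hrefs that have no matching file in the archive."""
    present = set(archive_names)
    return [h for h in hrefs if resolve_href(opf_path, h) not in present]
```

The archive_names argument is meant to be something like the output of zipfile.ZipFile.namelist().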

There’s a relatively limited list of file types assumed to be natively supported by all EPUB readers, which the readers are guaranteed to know how to deal with or fall back from. Those file types are:

GIF 89a (media-type="image/gif")
JPEG (media-type="image/jpeg")
PNG 1.0 (media-type="image/png")
SVG 1.1 (media-type="image/svg+xml")
XHTML 1.1 (media-type="application/xhtml+xml")
DTBook XML (2005 version) (media-type="application/x-dtbook+xml")
CSS 2.0 (with some modifications, more on this later) (media-type="text/css")
NCX (2005 version) (media-type="application/x-dtbncx+xml")
OTF (version not specified) (media-type="application/vnd.ms-opentype")
DTD (version not specified) (media-type="application/xml-dtd")
RELAX NG (version not specified) (media-type="application/relax-ng-compact-syntax")
OEBPS 1.2 Document (deprecated) (media-type="text/x-oeb1-document")
OEBPS 1.2 Stylesheet (deprecated) (media-type="text/x-oeb1-css")

(Any of these which contain text, rather than binary, data need to have the text encoded as either UTF-8 or UTF-16, incidentally.)

However, you can include files of formats other than these in the manifest. To do so, you need to define a fallback chain for that file, ending in a file of a supported format, so that readers which don’t know how to handle the unsupported formats can still display something when display of an unsupported file is called for by the book flow.

To mark the fallback option for an item in the manifest, set its fallback attribute’s value to the value of the id of the item it’s supposed to fall back on if needed. You can chain these together, so that item 1 falls back on item 2, item 2 falls back on item 3, and so forth. You can’t make the fallback chains loop, though; they need to terminate eventually, and they need to terminate on files of supported types.
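The fallback rules amount to walking a linked list and making sure it ends somewhere sensible. Here’s a Python sketch of that walk. The name resolve_fallback and the shape of the items dictionary are my own inventions, and CORE_TYPES is only a subset of the supported types listed above, kept short for the example.

```python
# A subset of the natively supported media types, for illustration only.
CORE_TYPES = {
    "image/gif", "image/jpeg", "image/png", "image/svg+xml",
    "application/xhtml+xml", "application/x-dtbook+xml",
    "text/css", "application/x-dtbncx+xml",
}


def resolve_fallback(items: dict, item_id: str) -> str:
    """Follow fallbacks from item_id to the first item of a supported type.

    `items` maps each manifest id to a (media_type, fallback_id_or_None) pair.
    Raises ValueError if the chain loops or ends on an unsupported type.
    """
    seen = set()
    current = item_id
    while current is not None:
        if current in seen:
            raise ValueError(f"fallback chain loops at {current!r}")
        seen.add(current)
        media_type, fallback = items[current]
        if media_type in CORE_TYPES:
            return current
        current = fallback
    raise ValueError(f"fallback chain from {item_id!r} ends on an unsupported type")
```

Running this over every item in a manifest before writing the OPF is one way to catch a broken chain early.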

For XML files which aren’t SVG, XHTML, DTBook, or NCX, the rules are a bit different. For an item pointing to one of those files, you need to give it a required-namespace attribute, whose value should be the XML namespace used by that file. If rendering the XML properly requires any modules which aren’t part of the specified namespace’s defaults, then you also need to give it a required-modules attribute whose content is a comma-separated list of the names of those modules (with spaces, if present, replaced with hyphens). You also have the option to, instead of or in addition to giving it a fallback attribute, give it a fallback-style attribute whose value is the id of a stylesheet which can be used to render the XML if the reader has no native knowledge of how to do so. (I don’t know how stylesheet-based XML rendering works, and thus can’t with any confidence recommend ever doing it; but the option nonetheless exists and seems worth noting, for the sake of completeness.)

Spine

The spine element lists a linear reading order for the XML files (XHTML, DTBook XML, or unsupported XML with appropriate fallbacks in place) which make up your core book content. The broad structure of the spine is as follows:

<spine toc="ANOTHER_APPROPRIATE_ID">
    <itemref idref="a_manifest_item" />
    <itemref linear="no" idref="another_manifest_item" />
    [Probably a bunch more itemrefs, for most books, although all you strictly need is a single itemref not marked as nonlinear]
</spine>

The spine element with a defined toc attribute is mandatory. The toc attribute’s value needs to be identical to the value of the id of the manifest entry for an NCX file, which will serve as the book’s table of contents. The spine needs to contain one or more itemref elements, listed in the order in which they should appear in the book; each one needs an idref attribute whose value is identical to the id of the manifest entry for one of the XML files which comprise the book’s content. The linear attribute is optional; more on it momentarily.

A given idref shouldn’t appear in the spine more than once. Any XML in the manifest which is reachable by the readers—via the table of contents, via the guide, via hyperlink from a file in the spine or reachable via one of the aforementioned methods, et cetera—has to be listed in the spine.

If an item in the spine should be outside of the main reading flow—a footnote, for example, which is linked but which readers probably don’t want to have to page through without following the link—it can be marked with the attribute linear="no"; not all readers respect this, but many do, and will take non-linear spine members out of the main reading flow and only show them when the reader follows a link to them. But the spine has to contain at least one linear element; you can’t have an entire spine full of non-linear itemrefs. If an item is marked as non-linear, there should be some sort of reference to it (hyperlink, TOC, et cetera) so that you can be confident readers will have a way to get to it at all.
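Several of the spine rules can be checked mechanically against the OPF file. Here’s a Python sketch that does so using xml.etree.ElementTree. The function name spine_problems is hypothetical, and it only covers the rules that can be checked from the OPF alone (the toc reference, idrefs matching manifest ids, no repeated idrefs, and at least one linear itemref).

```python
import xml.etree.ElementTree as ET

OPF = "{http://www.idpf.org/2007/opf}"
NCX_TYPE = "application/x-dtbncx+xml"


def spine_problems(opf_xml: str) -> list[str]:
    """Return human-readable problems found in the spine (empty if none)."""
    root = ET.fromstring(opf_xml)
    media_types = {i.get("id"): i.get("media-type")
                   for i in root.iter(OPF + "item")}
    spine = root.find(OPF + "spine")
    if spine is None:
        return ["no spine element"]
    problems = []
    # The toc attribute must name the manifest entry for the NCX file.
    if media_types.get(spine.get("toc")) != NCX_TYPE:
        problems.append("toc does not name an NCX item in the manifest")
    seen = set()
    linear_count = 0
    for ref in spine.findall(OPF + "itemref"):
        idref = ref.get("idref")
        if idref not in media_types:
            problems.append(f"idref {idref!r} is not in the manifest")
        if idref in seen:
            problems.append(f"idref {idref!r} appears more than once")
        seen.add(idref)
        # An itemref counts as linear unless explicitly marked linear="no".
        if ref.get("linear", "yes") != "no":
            linear_count += 1
    if linear_count == 0:
        problems.append("spine has no linear itemref")
    return problems
```

Checking that non-linear items are actually linked from somewhere would require looking inside the content files, so the sketch leaves that out.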

Guide

The guide element lists links to various significant parts of your book. Unlike the metadata, manifest, and spine elements, it’s optional, and you don’t need to include it. But you can, if you want.

The broad structure of the guide is as follows:

<guide>
    <reference type="text" title="Main text" href="example.html#start" />
    [Optionally a bunch more references]
</guide>

If the guide exists, it needs to contain one or more reference elements, each of which needs a type attribute and an href attribute with defined values. The title attribute is optional, and the specific values I used here for all three attributes are just examples, not mandatory.

The list of valid types is as follows:

cover: book cover image
title-page: title page
toc: table of contents
index: index
glossary: glossary
acknowledgements: acknowledgements page
bibliography: bibliography
colophon: apparently these are a thing?
copyright-page: copyright page
dedication: dedication
epigraph: epigraph
foreword: foreword
loi: list of illustrations
lot: list of tables
notes: …notes of some sort, I guess?
preface: preface
text: start of the book’s main body, after all the frontmatter and prefaces and so forth (e.g. the prologue or the first chapter, in a chapter-based book)

If you want to include a type other than these ones, you can use arbitrary type names as long as they’re prefaced with other.. So you could, for example, have a reference element whose type is other.tldr to point at the TL;DR summary of your book. (Usual disclaimers apply, readers will plausibly ignore this.)

The EPUB format specification makes no mention of what happens if you include multiple guide elements of the same type. Undefined behavior is scary, so I’d recommend avoiding doing so.

The href points at the file being referenced. The cover image for the cover reference, the foreword for the foreword reference, and so forth. Unlike the href attributes in the manifest, you are allowed (although not required) to use fragment identifiers in the guide’s href attributes, as shown in the example above.

The title attribute is mentioned nowhere in the EPUB format specification except in its example code. (I have the impression the IDPF put less effort into defining the guide than the other parts, probably on account of how it’s not mandatory like the other parts are.) Nonetheless it seems to be standard to include it, and to have it be a simple human-readable description of the type.

The NCX File

The NCX file serves as the EPUB’s table of contents. As mentioned in the previous section, it has to be listed in the OPF’s manifest, and be identified by the toc attribute of the OPF’s spine.

The NCX format is unusual in that it was originally specified for the DTBook format, rather than either for EPUB or for general web use. If you want to read the specification, it’s the eighth section of the specification here (archive). The NCX version used in the EPUB format is slightly different from the original DTBook version, but not substantially so.

A minimal NCX file would look like this:

<?xml version="1.0"?>
<ncx version="2005-1" xmlns="http://www.daisy.org/z3986/2005/ncx/">
    <head>
        <meta name="dtb:uid" content="[IMAGINE A UUID HERE]" />
    </head>
    <docTitle>
        <text>[The title of this book]</text>
    </docTitle>
    [Optionally a docAuthor here]
    <navMap>
        <navPoint>
            <navLabel>
                <text>The first TOC entry!</text>
            </navLabel>
            <content src="example.html" />
            [Optionally one or more navPoints nested under this one]
        </navPoint>
        [Optionally more navPoints]
    </navMap>
    [Optionally a pageList here]
    [Optionally a navList here]
</ncx>

…which is kind of a lot! But necessary, so let’s summarize it.

The XML version declaration is mandatory. In theory, you could put a doctype declaration under it; in practice, I actively recommend not doing so, for reasons I’ll get into shortly.

The root element is the ncx. It needs version and xmlns attributes with the values shown in that example. It contains:

a head element containing one or more meta elements, each of which has a name and a content value much like the ones usable in the OPF file’s metadata section. More on these below.
a docTitle element containing a text element containing the book’s title. This is redundant with the mandatory dc:title element in the OPF’s metadata, but it’s mandatory in the NCX nonetheless.
zero or more docAuthor elements, each containing a text element containing one of the book’s authors. Redundant with the OPF’s dc:creator elements, and unlike the docTitle it’s not required, so I’d suggest skipping it.
a navMap element containing zero or more navInfo elements, zero or more navLabel elements, and one or more navPoint elements. More on this below; it’s the main body of the table of contents, and as such is the central important bit here.
optionally, a pageList element containing zero or more navInfo elements, zero or more navLabel elements, and one or more pageTarget elements. You probably don’t need this—most EPUBs don’t need to list contents by page number—but I’ll go into it more below, just in case.
zero or more navList elements, each containing zero or more navInfo elements, one or more navLabel elements, and one or more navTarget elements. These are like mini-TOCs, which unlike the main TOC are flat and un-nested. You probably don’t need to use these, but I’ll go into them below just in case.

As I mentioned, you probably want to not include a doctype declaration for the NCX. This is because, in the original NCX specification, it was mandatory to include a playOrder attribute in all navPoint, pageTarget, and navTarget elements, defining (in integer order, starting with 1) linear order through the book. The EPUB specification allows skipping this, since it’s redundant with the OPF’s spine and is kind of inconvenient; but compliance with the NCX DTD requires inclusion of the playOrder attributes, and the EPUB specification does require that you comply with the DTD if you include it. Thus, better not to include it.

The meta Elements

There are four natively-supported metadata names in an NCX file. They are:

dtb:uid: a unique identifier for your book. Unlike the others, this one is mandatory. Just use the main identifier from your OPF file, probably.
dtb:depth: an integer listing how deeply the navMap is nested. 1 indicates a flat listing, 2 indicates that some navPoints contain other navPoints, 3 indicates that some navPoints contain other navPoints which, themselves, contain other navPoints, and so forth.
dtb:totalPageCount: an integer listing the number of pageTargets in the pageList. This is probably going to be 0, since you’re probably not including a pageList.
dtb:maxPageNumber: an integer listing the largest page number in the pageList. This is probably going to be 0, since you’re probably not including a pageList.

If you want to include more for some reason, you can include whatever other metadata names you want, albeit without the dtb: prefix (which is reserved for those four). But in practice there’s not much reason to, since your real metadata-hub is the OPF, not the NCX.

(In the original NCX specification, all four dtb: meta names had to have defined values. But the EPUB version of NCX strips that requirement out, fortunately.)
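If you do want to include dtb:depth, its value can be computed from whatever nested structure you’re using to build the table of contents. Here’s a small Python sketch; the name nav_depth and the (label, src, children) tuple layout are assumptions of mine, not anything the NCX format prescribes.

```python
def nav_depth(points: list) -> int:
    """Return the nesting depth of a table of contents.

    `points` is a list of (label, src, children) tuples, where `children`
    is a list of the same shape. A flat listing has depth 1.
    """
    return max((1 + nav_depth(children) for _, _, children in points),
               default=0)
```

The result can then be dropped into the content attribute of a meta element whose name is dtb:depth.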

The navMap Element

As mentioned, a navMap contains zero or more navInfo elements, zero or more navLabel elements, and one or more navPoint elements.

navInfo elements contain text elements which contain comments on their parent elements. You mostly don’t need to worry about them. navLabel elements will be important momentarily, but are ignorable within the immediate context of the navMap. So mostly the bit that matters is the part with the one or more navPoints.

Each navPoint represents an entry in your table of contents. A navPoint element contains one or more navLabel elements, a content element, and zero or more additional navPoint elements, allowing for arbitrary nesting. (This allows for nested tables of contents. I can have a navPoint for Chapter 1 and then additional navPoints nested inside that one for each individual section of Chapter 1, for example.)

Putting nesting aside: the navLabel elements contain text elements which contain whatever text you want the table of contents to associate with a given link. (<text>Chapter 1</text>, for instance, if your navPoint is pointed at Chapter 1.) The content element has no contents, but has a src attribute containing a path (relative to the NCX file) to the XML item it’s referencing. (And it should be an XML item. One listed in the OPF file’s spine, specifically.) Unlike the OPF manifest but like the OPF guide, fragment identifiers are allowed in the path.

The reason you can have multiple navLabel elements in a given navPoint is that they can, optionally, be given xml:lang attributes, same as are used in various metadata elements in the OPF, so as to offer the label in different languages. Also like the metadata elements in the OPF: there’s a good chance your reader won’t be smart enough to do anything beyond using the first one, so this option is a lot more useful in theory than in practice.
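As a rough sketch of how a generator might build a nested navMap along these lines: the helper names and the (labels, src, children) input shape below are hypothetical. The id and playOrder attributes aren’t described above; they’re included on the assumption that the NCX schema expects them on each navPoint, so double-check that against the specification.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def ncx(tag):
    """Qualify a tag name with the NCX namespace."""
    return f"{{{NCX_NS}}}{tag}"


def add_nav_points(parent, entries, counter):
    """Append one navPoint per entry to parent, recursing into children.

    Each entry is (labels, src, children); labels maps a language code
    (or None for an untagged label) to the label text.
    """
    for labels, src, children in entries:
        counter[0] += 1
        point = ET.SubElement(parent, ncx("navPoint"), {
            # Assumption: id and playOrder are not covered in the text above.
            "id": f"navpoint-{counter[0]}",
            "playOrder": str(counter[0]),
        })
        for lang, text in labels.items():
            label = ET.SubElement(point, ncx("navLabel"))
            if lang is not None:
                label.set(XML_LANG, lang)
            ET.SubElement(label, ncx("text")).text = text
        # src is a path relative to the NCX file; fragments are allowed.
        ET.SubElement(point, ncx("content"), {"src": src})
        add_nav_points(point, children, counter)


def build_nav_map(entries):
    """Build a navMap element from a nested list of entries."""
    if not entries:
        raise ValueError("a navMap needs at least one navPoint")
    nav_map = ET.Element(ncx("navMap"))
    add_nav_points(nav_map, entries, [0])
    return nav_map


entries = [
    ({None: "Chapter 1", "fr": "Chapitre 1"}, "ch1.xhtml", [
        ({None: "Section 1.1"}, "ch1.xhtml#section-1", []),
    ]),
    ({None: "Chapter 2"}, "ch2.xhtml", []),
]
nav_map = build_nav_map(entries)
ET.register_namespace("", NCX_NS)  # serialize without an ns0: prefix
print(ET.tostring(nav_map, encoding="unicode"))
```

Inside each navPoint, the navLabel elements come first, then the content element, then any nested navPoints, which is the order described above.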

The pageList Element

A pageList contains a list of numbered pages. As mentioned, this is optional and you probably don’t want to include one of these. If you do, though, it’s relatively similar to the navMap. Like the navMap, it can have navInfo and navLabel elements but you have very little reason to give it either. Instead of having one or more navPoint elements, though, it has one or more pageTarget elements.

Each pageTarget element needs an id attribute, unique within the NCX file. It also needs a type attribute, whose value can be either "front" (for roman-numeral-numbered pages in the front of the book), "normal" (for normal Arabic-numeral-numbered pages in the main book body), or "special" (for other pages). It also should have—not strictly mandatory, but highly recommended—a value attribute, containing an integer representation of the page number being targeted. Then it needs one or more navLabel elements and a content element, same as in a navPoint except minus the possibility of nesting.
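If you do decide to include a pageList, a generator for it might look something like the following sketch. The build_page_list function and its (page_type, value, label, src) tuples are hypothetical; it emits only the attributes described above, and leaves out anything else your toolchain might want (such as playOrder).

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"
PAGE_TYPES = ("front", "normal", "special")


def ncx(tag):
    """Qualify a tag name with the NCX namespace."""
    return f"{{{NCX_NS}}}{tag}"


def build_page_list(pages):
    """Build a pageList from (page_type, value, label, src) tuples."""
    if not pages:
        raise ValueError("a pageList needs at least one pageTarget")
    page_list = ET.Element(ncx("pageList"))
    for index, (page_type, value, label, src) in enumerate(pages, start=1):
        if page_type not in PAGE_TYPES:
            raise ValueError(f"unknown page type: {page_type!r}")
        target = ET.SubElement(page_list, ncx("pageTarget"), {
            "id": f"page-{index}",  # must be unique within the NCX file
            "type": page_type,
            "value": str(value),    # recommended rather than mandatory
        })
        nav_label = ET.SubElement(target, ncx("navLabel"))
        ET.SubElement(nav_label, ncx("text")).text = label
        ET.SubElement(target, ncx("content"), {"src": src})
    return page_list


# Hypothetical pages: one roman-numeral front page, two body pages.
pages = [
    ("front", 1, "i", "front.xhtml#page-i"),
    ("normal", 1, "1", "ch1.xhtml#page-1"),
    ("normal", 2, "2", "ch1.xhtml#page-2"),
]
page_list = build_page_list(pages)
targets = page_list.findall(ncx("pageTarget"))
total_page_count = len(targets)                              # for dtb:totalPageCount
max_page_number = max(int(t.get("value")) for t in targets)  # for dtb:maxPageNumber
```

The last two lines show where the dtb:totalPageCount and dtb:maxPageNumber values in the head would come from once a pageList exists.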

The navList Element

Very much like a navMap, except that instead of nestable navPoint elements it contains non-nestable navTarget elements, which contain only navLabel and content elements, with no option of additional nested navTargets. Also, the navList itself requires a navLabel, in order to label what it’s a list of.

These are designed for use with things like lists of illustrations, which might be worth listing but which don’t belong in the main table of contents. Probably most of the time you don’t need to bother with these, but I’m including them here anyway for the sake of completeness.
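For completeness, here’s a comparable sketch for a navList, using a list of illustrations as the example. As before, build_nav_list and its (label, src) pairs are hypothetical, and the id attribute on each navTarget is an assumption on my part (it isn’t described above).

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def ncx(tag):
    """Qualify a tag name with the NCX namespace."""
    return f"{{{NCX_NS}}}{tag}"


def add_label(parent, text):
    """Append a navLabel wrapping a text element to parent."""
    nav_label = ET.SubElement(parent, ncx("navLabel"))
    ET.SubElement(nav_label, ncx("text")).text = text


def build_nav_list(title, targets):
    """Build a navList labeled title from (label, src) pairs."""
    nav_list = ET.Element(ncx("navList"))
    add_label(nav_list, title)  # the navList's own, required label
    for index, (label, src) in enumerate(targets, start=1):
        # Assumption: the id attribute is not covered in the text above.
        target = ET.SubElement(nav_list, ncx("navTarget"), {"id": f"navtarget-{index}"})
        add_label(target, label)
        ET.SubElement(target, ncx("content"), {"src": src})
    return nav_list


illustrations = build_nav_list("List of Illustrations", [
    ("Figure 1: The map", "ch1.xhtml#fig-1"),
    ("Figure 2: The castle", "ch3.xhtml#fig-2"),
])
```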

Miscellaneous Notes on File Formats

So. That’s all of the book’s key structural elements summarized. But then you also need the pieces actually making up the book.

In practice, your book is probably going to consist mostly of GIF, JPEG, PNG, and SVG images, XHTML text, CSS stylesheets, and the one NCX file. Also officially supported are DTBook XML, XML 1.0 (in the marginal “must include fallback information” way described in my summary of the OPF manifest), and the deprecated OEBPS 1.2 document and stylesheet formats and XML 1.1 format; but, in practice, you’re probably not going to use those much.

(There are various formats treated as supported by the manifest which aren’t on the above list; that’s an eccentricity of the manifest. When I talk about ‘supported formats’ from here on out, I mean the twelve formats (if we count XML 1.0 and 1.1 separately) listed in the previous paragraph.)

So here’s a list of oddities in how EPUB files handle the aforementioned seven relatively-common formats:

XHTML img elements should only be used to point to the four supported image formats; don’t rely on fallbacks here. Their alt attributes need to be defined. Any images embedded need to be in the manifest. Tracing back the implications, this last point means that you should only embed images contained within the EPUB, not remote ones from the web; the OPF’s manifest, after all, is only allowed to contain relative links, not absolute links, and there’s no way to do a relative link to a web resource from an EPUB file.
XHTML object elements, when listing objects not in the pile of supported media formats or using a classid attribute, need to include fallback information and bottom out with objects which are of supported formats. Any param elements within objects need to appear before the objects’ renderable content. If you use codetype or type attributes in your objects, the types need to match the types listed for those objects in the OPF manifest.
XHTML script elements need defined type attributes. Also, there’s a good chance they won’t run, since it’s anti-recommended in the specification for readers to support running scripts.
XHTML style elements need type attributes. Putting aside deprecated options, their value should always be "text/css".
Contrary to the XHTML specification, the align="char" attribute is banned in EPUBs. You can still use style="text-align:<string>", or equivalent CSS, where needed.
SVG file animation and scripting are disabled in EPUB and won’t be rendered. (Also, tangentially: you can’t put SVG files inline in DTBook XML, only in XHTML.)
You can inline non-supported XML formats in XHTML, with appropriate fallbacks; this is useful for, for example, MathML markup. More on this below.
CSS values which take URIs need to point at files contained within the OPF manifest, much like XHTML img elements and with the same implication of “no pointing to remote resources”.
CSS size values need units specified; they can’t be unitless. (This seems like the sort of thing which might well be already part of the CSS specification, but I don’t know CSS well enough to be sure, so I’m mentioning it here.)
There’s a limited set of CSS properties supported by the EPUB specification. It’s not precisely the same as those in the CSS 2.0 spec; it’s missing some properties, and it adds others. You can still mostly use CSS in normal fashion and trust it to work, but if you really need to be sure a given property is supported, see the big table in Section 3.3 of the OPS specification.
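A few of these rules are easy to check mechanically. As one example, here’s a hedged sketch of a checker for the img rules (alt defined, no remote images, target present in the manifest with a supported image type). The check_images function and the shape of its manifest argument are hypothetical, and the sketch assumes the XHTML parses as plain XML (no undefined named entities) and that paths aren’t percent-encoded.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

XHTML_NS = "http://www.w3.org/1999/xhtml"
IMAGE_TYPES = {"image/gif", "image/jpeg", "image/png", "image/svg+xml"}


def check_images(xhtml_text, doc_href, manifest):
    """Return a list of problems with the img elements in one XHTML document.

    doc_href is the document's path relative to the OPF file; manifest maps
    each manifest href (also relative to the OPF file) to its media type.
    """
    problems = []
    root = ET.fromstring(xhtml_text)
    for img in root.iter(f"{{{XHTML_NS}}}img"):
        src = img.get("src", "")
        if img.get("alt") is None:
            problems.append(f"{src}: missing alt attribute")
        parts = urlsplit(src)
        if parts.scheme or parts.netloc:
            problems.append(f"{src}: remote image")
            continue
        # Resolve the src against the document's own folder.
        folder = posixpath.dirname(doc_href)
        target = posixpath.normpath(posixpath.join(folder, parts.path))
        media_type = manifest.get(target)
        if media_type is None:
            problems.append(f"{src}: not in manifest")
        elif media_type not in IMAGE_TYPES:
            problems.append(f"{src}: unsupported image type {media_type}")
    return problems


page = """<html xmlns="http://www.w3.org/1999/xhtml"><body><p>
<img src="../images/a.png" alt="A"/>
<img src="http://example.com/b.png" alt="B"/>
<img src="c.bmp"/>
</p></body></html>"""
manifest = {"images/a.png": "image/png", "text/c.bmp": "image/bmp"}
problems = check_images(page, "text/ch1.xhtml", manifest)
```

In this example the first image passes, the second is flagged as remote, and the third is flagged both for its missing alt attribute and for its unsupported type.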

Inline non-supported XML

As mentioned above, it’s possible to inline non-supported XML formats within your XHTML documents. An instance of this might look like:

<ops:switch>
    <ops:case required-namespace="http://www.w3.org/1998/Math/MathML">
        [MathML representing the equation "2 + 2 = 4"]
    </ops:case>
    [More cases, if you've got more unsupported XML formats you might want to fall back on]
    <ops:default>
        <p>2 + 2 = 4</p>
    </ops:default>
</ops:switch>

The ops:switch element is the root of the inline non-supported XML block. It contains zero or more ops:case elements, each with a required-namespace attribute and (if applicable) required-modules attribute of the same sorts previously described in my discussion of fallbacks for unsupported XML files in the manifest. Each ops:case element should then contain XML of whatever unsupported sort is identified by those attributes. The ops:switch also contains a single ops:default element, whose contents need to be well-formed XHTML.

The fallback chain is then implemented as follows: the reader looks at the ops:case elements in the order in which they’re defined. If it hits one whose namespace it knows how to display (if the reader supports MathML, in this example), then it displays the contents of that case. (So the first case it knows how to display is the one that gets displayed, in other words.) If it doesn’t know how to display any of the cases, then it displays the contents of the ops:default instead, as the final definitely-supported fallback.
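Although this post is aimed at writers rather than readers, the selection logic is short enough to sketch, which may help make the fallback chain concrete. The resolve_switch function is hypothetical, and it deliberately ignores the required-modules attribute; the OPS namespace URI is the one the ops: prefix is conventionally bound to.

```python
import xml.etree.ElementTree as ET

OPS_NS = "http://www.idpf.org/2007/ops"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def resolve_switch(switch, supported_namespaces):
    """Return the child of an ops:switch that a reader would display.

    The first ops:case whose required-namespace is supported wins;
    otherwise the ops:default is used. (required-modules is ignored
    in this sketch.)
    """
    for case in switch.findall(f"{{{OPS_NS}}}case"):
        if case.get("required-namespace") in supported_namespaces:
            return case
    return switch.find(f"{{{OPS_NS}}}default")


SAMPLE = """<ops:switch xmlns:ops="http://www.idpf.org/2007/ops"
            xmlns="http://www.w3.org/1999/xhtml">
    <ops:case required-namespace="http://www.w3.org/1998/Math/MathML">
        <math xmlns="http://www.w3.org/1998/Math/MathML">
            <mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn>
        </math>
    </ops:case>
    <ops:default>
        <p>2 + 2 = 4</p>
    </ops:default>
</ops:switch>"""

switch = ET.fromstring(SAMPLE)
with_mathml = resolve_switch(switch, {MATHML_NS})  # the ops:case element
without_mathml = resolve_switch(switch, set())     # the ops:default element
```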

Conclusion

…and that’s it! Writing this, I tried to make it the sort of summary which I wish I had been able to find when I was doing my own research on the EPUB format a few weeks back. At that, I think I succeeded, and I hope this post will prove as useful to others as it would have been to my past self.

Tags: EPUB