EPUB
EPUB is an e-book file format that uses the ".epub" file extension. The term is short for electronic publication and is sometimes styled ePub. EPUB is supported by many e-readers, and compatible software is available for most smartphones, tablets, and computers. EPUB is a technical standard published by the International Digital Publishing Forum. It became an official standard of the IDPF in September 2007, superseding the older Open eBook standard.
The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. The EPUB format is implemented as an archive file consisting of XHTML files carrying the content, along with images and other supporting files. EPUB is the most widely supported vendor-independent XML-based e-book format; that is, it is supported by almost all hardware readers, except for Kindle.
History
A successor to the Open eBook Publication Structure, EPUB 2.0 was approved in October 2007, with a maintenance update approved in September 2010. The EPUB 3.0 specification became effective in October 2011 and was superseded by a minor maintenance update in June 2014. New major features included support for precise layout or specialized formatting, such as for comic books, and MathML support. EPUB 3.1 became effective on January 5, 2017. In that version the format specification underwent reorganization and clean-up; the format gained support for remotely hosted resources and new font formats, and it uses more pure HTML and CSS.
In May 2016, IDPF members approved a merger with the World Wide Web Consortium, "to fully align the publishing industry and core Web technology".
Version 2.0.1
EPUB 2.0 was approved in October 2007, with a maintenance update, intended to clarify and correct errata in the specifications, approved in September 2010. EPUB version 2.0.1 consists of three specifications:
- Open Publication Structure (OPS) 2.0.1, which contains the formatting of its content.
- Open Packaging Format (OPF) 2.0.1, which describes the structure of the .epub file in XML.
- Open Container Format (OCF) 2.0.1, which collects all files as a ZIP archive.
Open Publication Structure 2.0.1
An EPUB file uses XHTML 1.1 to construct the content of a book as of version 2.0.1. This is different from previous versions, which used a subset of XHTML. There are, however, a few restrictions on certain elements. The mimetype for XHTML documents in EPUB is application/xhtml+xml.
Styling and layout are performed using a subset of CSS 2.0, referred to as OPS Style Sheets. This specialized syntax requires that reading systems support only a portion of CSS properties and adds a few custom properties. Custom properties include oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can be accomplished using the @font-face rule, as well as including the font file in the OPF's manifest. The mimetype for CSS documents in EPUB is text/css.
EPUB also requires that PNG, JPEG, GIF, and SVG images be supported using the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml. Other media types are allowed, but creators must include alternative renditions using supported types. For a table of all required mimetypes, see the specification.
Unicode is required, and content producers must use either UTF-8 or UTF-16 encoding. This is to support international and multilingual books. However, reading systems are not required to provide the fonts necessary to display every Unicode character, though they are required to display at least a placeholder for characters that cannot be displayed fully.
An example skeleton of an XHTML file for EPUB looks like this:
...
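As a rough illustration only, a minimal content document of this kind can be generated and checked for well-formedness with Python's standard library. The function name make_xhtml_skeleton, the sample title, and the stylesheet path css/style.css are assumptions made for this sketch, not names taken from the specification.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"


def make_xhtml_skeleton(title, body_xhtml, css_href="css/style.css"):
    """Return a minimal XHTML 1.1 content document as a string.

    body_xhtml is assumed to be well-formed XHTML markup already.
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
        '  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">\n'
        f'<html xmlns="{XHTML_NS}" xml:lang="en">\n'
        "<head>\n"
        f"  <title>{escape(title)}</title>\n"
        f'  <link rel="stylesheet" type="text/css" href={quoteattr(css_href)}/>\n'
        "</head>\n"
        f"<body>\n{body_xhtml}\n</body>\n"
        "</html>\n"
    )


doc = make_xhtml_skeleton("Chapter 1", "<h1>Chapter 1</h1>\n<p>Text goes here.</p>")
# Parsing only confirms well-formedness; it does not validate against the DTD.
root = ET.fromstring(doc.encode("utf-8"))
print(root.tag)
```

The parse step is a cheap sanity check, since a content document that is not well-formed XML cannot be valid XHTML.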
Open Packaging Format 2.0.1
The OPF specification's purpose is to define "... the mechanism by which the various components of an OPS publication are tied together and provides additional structure and semantics to the electronic publication." This is accomplished by two XML files with the extensions .opf and .ncx.
.opf file
The OPF file, traditionally named content.opf, houses the EPUB book's metadata, file manifest, and linear reading order. This file has a root element package and four child elements: metadata, manifest, spine, and guide. Furthermore, the package node must have the unique-identifier attribute. The .opf file's mimetype is application/oebps-package+xml.
The metadata element contains all the metadata information for a particular EPUB file. Three metadata tags are required: title, language, and identifier. title contains the title of the book, language contains the language of the book's contents in RFC 3066 format or its successors, such as the newer RFC 4646, and identifier contains a unique identifier for the book, such as its ISBN or a URL. The identifier's id attribute should equal the unique-identifier attribute from the package element.
The manifest element lists all the files contained in the package. Each file is represented by an item element, which has the attributes id, href, and media-type. All XHTML, stylesheets, images or other media, embedded fonts, and the NCX file should be listed here. Only the .opf file itself, the container.xml, and the mimetype files should not be included. Note that in the example below, an arbitrary media-type is given to the included font file, even though no mimetype exists for fonts.
The spine element lists all the XHTML content documents in their linear reading order. Also, any content document that can be reached through linking or the table of contents must be listed as well. The toc attribute of spine must contain the id of the NCX file listed in the manifest. Each itemref element's idref is set to the id of its respective content document.
The guide element is an optional element for the purpose of identifying fundamental structural components of the book. Each reference element has the attributes type, title, and href. Files referenced in href must be listed in the manifest, and are allowed to have an element identifier.
An example OPF file:
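The relationships described above (unique-identifier matching the identifier's id, the spine's toc pointing at the NCX item, and each itemref pointing at a manifest item) can be sketched in Python. This is an illustrative sketch, not the specification's own example: the function build_opf, the id value BookId, and the sample file names are assumptions, and the optional guide element is omitted.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_opf(title, language, identifier, items, spine_ids, ncx_id="ncx"):
    """Build an EPUB 2 style package document and return it as a string.

    items is a list of (id, href, media_type) tuples for the manifest;
    spine_ids lists manifest ids in linear reading order.
    """
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    package = ET.Element(
        f"{{{OPF_NS}}}package",
        {"version": "2.0", "unique-identifier": "BookId"},
    )
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    ET.SubElement(metadata, f"{{{DC_NS}}}language").text = language
    # The id here must equal the package's unique-identifier attribute.
    ident = ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "BookId"})
    ident.text = identifier

    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    for item_id, href, media_type in items:
        ET.SubElement(
            manifest,
            f"{{{OPF_NS}}}item",
            {"id": item_id, "href": href, "media-type": media_type},
        )

    # The toc attribute names the manifest id of the NCX file.
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine", {"toc": ncx_id})
    for item_id in spine_ids:
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
    return ET.tostring(package, encoding="unicode")


opf = build_opf(
    title="Example Book",
    language="en",
    identifier="urn:uuid:00000000-0000-0000-0000-000000000000",  # placeholder
    items=[
        ("ncx", "toc.ncx", "application/x-dtbncx+xml"),
        ("chapter1", "chapter1.xhtml", "application/xhtml+xml"),
        ("css", "css/style.css", "text/css"),
    ],
    spine_ids=["chapter1"],
)
print(opf)
```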
.ncx file
The NCX file, traditionally named toc.ncx, contains the hierarchical table of contents for the EPUB file. The specification for NCX was developed for Digital Talking Book, is maintained by the DAISY Consortium, and is not a part of the EPUB specification. The NCX file has a mimetype of application/x-dtbncx+xml.
Of note here is that the values for the docTitle, docAuthor, and meta name="dtb:uid" elements should match their analogs in the OPF file. Also, the meta name="dtb:depth" element is set equal to the depth of the navMap element. navPoint elements can be nested to create a hierarchical table of contents. navLabel's content is the text that appears in the table of contents generated by reading systems that use the .ncx. navPoint's content element points to a content document listed in the manifest and can also include an element identifier.
A description of certain exceptions to the NCX specification as used in EPUB is given in the specification. The complete specification for NCX can be found in the Specifications for the Digital Talking Book.
An example .ncx file:
"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
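To make the nesting and the dtb:depth rule concrete, the following Python sketch builds a small NCX document from a nested list of entries and derives the depth from that nesting. It is a simplified sketch under stated assumptions: the helper names toc_depth and build_ncx, the (label, src, children) tuple layout, and the sample file names are hypothetical, and other head metadata that a complete NCX file may carry is left out.

```python
import itertools
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def q(tag):
    """Qualify a tag name with the NCX namespace."""
    return f"{{{NCX_NS}}}{tag}"


def toc_depth(entries):
    """Depth of a nested TOC: 0 if empty, 1 for a flat list, and so on."""
    return max((1 + toc_depth(children) for _, _, children in entries), default=0)


def build_ncx(uid, title, entries):
    """entries is a list of (label, src, children) tuples, nested arbitrarily."""
    ET.register_namespace("", NCX_NS)
    ncx = ET.Element(q("ncx"), {"version": "2005-1"})
    head = ET.SubElement(ncx, q("head"))
    # dtb:uid should match the identifier used in the OPF file.
    ET.SubElement(head, q("meta"), {"name": "dtb:uid", "content": uid})
    ET.SubElement(
        head, q("meta"), {"name": "dtb:depth", "content": str(toc_depth(entries))}
    )
    doc_title = ET.SubElement(ncx, q("docTitle"))
    ET.SubElement(doc_title, q("text")).text = title
    nav_map = ET.SubElement(ncx, q("navMap"))
    counter = itertools.count(1)

    def add_points(parent, items):
        for label, src, children in items:
            n = next(counter)  # numbered in document order
            point = ET.SubElement(
                parent, q("navPoint"), {"id": f"navPoint-{n}", "playOrder": str(n)}
            )
            nav_label = ET.SubElement(point, q("navLabel"))
            ET.SubElement(nav_label, q("text")).text = label
            ET.SubElement(point, q("content"), {"src": src})
            add_points(point, children)  # nested navPoints form the hierarchy

    add_points(nav_map, entries)
    return ET.tostring(ncx, encoding="unicode")


toc = [
    ("Part I", "part1.xhtml", [("Chapter 1", "chapter1.xhtml#start", [])]),
    ("Appendix", "appendix.xhtml", []),
]
ncx_xml = build_ncx("urn:uuid:00000000-0000-0000-0000-000000000000", "Example Book", toc)
print(ncx_xml)
```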
Open Container Format 2.0.1
An EPUB file is a group of files that conform to the OPS/OPF standards and are wrapped in a ZIP file. The OCF specifies how to organize these files in the ZIP, and defines two additional files that must be included.
The mimetype file must be a text document in ASCII that contains the string application/epub+zip. It must also be uncompressed, unencrypted, and the first file in the ZIP archive. This file provides a more reliable way for applications to identify the mimetype of the file than just the .epub extension.
Also, there must be a folder named META-INF, which contains the required file container.xml. This XML file points to the file defining the contents of the book. This is the OPF file, though additional alternative rootfile elements are allowed.
Apart from mimetype and META-INF/container.xml, the other files are traditionally put in a directory named OEBPS.
An example file structure:
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
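The container rules described above can be checked mechanically. The sketch below uses Python's zipfile module to inspect an archive held in memory; the function name check_epub_container and the wording of the problem messages are assumptions made for illustration, and the check relies on zipfile reporting entries in the order they were added, which is assumed here to match their order in the file.

```python
import io
import zipfile


def check_epub_container(data):
    """Return a list of problems found in an EPUB archive given as bytes."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        else:
            first = infos[0]
            if first.compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype is compressed")
            if first.flag_bits & 0x1:  # bit 0 marks an encrypted entry
                problems.append("mimetype is encrypted")
            elif zf.read(first) != b"application/epub+zip":
                problems.append("mimetype has unexpected content")
        if "META-INF/container.xml" not in zf.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems


# Build a tiny archive in memory to exercise the check (placeholder contents).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("mimetype", b"application/epub+zip", compress_type=zipfile.ZIP_STORED)
    zf.writestr("META-INF/container.xml", b"<container/>")
print(check_epub_container(buf.getvalue()))
```

A dedicated validator such as epubcheck performs far more thorough checks; this only covers the two files the OCF requires.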
An example container.xml, given the above file structure:
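As a hedged illustration, a container.xml for this layout could be generated as follows. The function name build_container_xml is hypothetical, and the default path OEBPS/content.opf simply follows the file structure shown above.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def build_container_xml(opf_path="OEBPS/content.opf"):
    """Return container.xml text whose single rootfile points at the OPF file."""
    ET.register_namespace("", CONTAINER_NS)
    container = ET.Element(f"{{{CONTAINER_NS}}}container", {"version": "1.0"})
    rootfiles = ET.SubElement(container, f"{{{CONTAINER_NS}}}rootfiles")
    ET.SubElement(
        rootfiles,
        f"{{{CONTAINER_NS}}}rootfile",
        {"full-path": opf_path, "media-type": "application/oebps-package+xml"},
    )
    body = ET.tostring(container, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


container_xml = build_container_xml()
print(container_xml)
```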
Version 3.0.1
The EPUB 3.0 Recommended Specification was approved on 11 October 2011. On June 26, 2014, EPUB 3.0.1 was approved as a minor maintenance update to EPUB 3.0. EPUB 3.0 supersedes the previous release 2.0.1.
EPUB 3 consists of a set of four specifications:
- EPUB Publications 3.0, which defines publication-level semantics and overarching conformance requirements for EPUB Publications
- EPUB Content Documents 3.0, which defines profiles of XHTML, SVG and CSS for use in the context of EPUB Publications
- EPUB Open Container Format 3.0, which defines a file format and processing model for encapsulating a set of related resources into a single-file EPUB Container.
- EPUB Media Overlays 3.0, which defines a format and a processing model for synchronization of text and audio
EPUB 3.0 was intended to address the following criticisms of earlier versions:
- While good for text-centric books, EPUB was rather unsuitable for publications that require precise layout or specialized formatting, such as comic books.
- A major issue hindering the use of EPUB for most technical publications was the lack of support for equations formatted as MathML. They were included as bitmap or SVG images, precluding proper handling by screen readers and interaction with computer algebra systems. Support for MathML is included in the EPUB 3.0 specification.
- Other criticisms of EPUB were the specification's lack of detail on linking within or between EPUB books, and its lack of a specification for annotation. Such linking is hindered by the use of a ZIP file as the container for EPUB. Furthermore, it was unclear if it would be better to link by using EPUB's internal structural markup or directly to files through the ZIP's file structure. The lack of a standardized way to annotate EPUB books led to difficulty in sharing and transferring annotations and therefore limited the use scenarios of EPUB, particularly in educational settings, because it cannot provide a level of interactivity comparable to the web.
In November 2014, EPUB 3.0 was published by the International Organization for Standardization as ISO/IEC TS 30135.
In January 2020, EPUB 3.0.1 was published by the International Organization for Standardization as ISO/IEC 23736.
Version 3.2
EPUB 3.2 was announced in 2018, and the final specification was released in 2019.
Features
The format and many readers support the following:
- Reflowable document: optimize text for a particular display
- Fixed-layout content: pre-paginated content can be useful for certain kinds of highly designed content, such as illustrated books intended only for larger screens, such as tablets.
- Like an HTML web site, the format supports inline raster and vector images, metadata, and CSS styling.
- Page bookmarking
- Passage highlighting and notes
- A library that stores books and can be searched
- Re-sizable fonts, and changeable text and background colors
- Support for a subset of MathML
- Digital rights management: can be included as an optional layer
Digital rights management
The EPUB specification does not enforce or suggest a particular DRM scheme. This could affect the level of support for various DRM systems on devices and the portability of purchased e-books. Consequently, such DRM incompatibility may segment the EPUB format along the lines of DRM systems, undermining the advantages of a single standard format and confusing the consumer.
DRMed EPUB files must contain a file called rights.xml within the META-INF directory at the root level of the ZIP container.
Adoption
EPUB is widely used on software readers such as Google Play Books on Android and Apple Books on iOS and macOS, but not by Amazon Kindle's e-readers or associated apps for other platforms. Kindle uses mainly the Mobipocket format, or Amazon's proprietary formats AZW, AZW3 or KFX. iBooks also supports the proprietary iBook format, which is based on the EPUB format but depends upon code from the iBooks app to function.
Data interchange: EPUB is a popular format for e-book creation because it is an open format based on HTML, as opposed to Amazon's proprietary format for Kindle readers. Popular EPUB producers of public domain and open licensed content include Project Gutenberg, PubMed Central, SciELO and others.
Security and privacy concerns
EPUB requires readers to support the HTML5, JavaScript, CSS, and SVG formats, so EPUB readers use the same technology as web browsers. Because of their complexity and flexibility, such formats are associated with various types of security issues and privacy-breaching behaviors, e.g. web beacons and CSRF.
Such vulnerabilities can be used to implement web tracking and cross-device tracking through EPUB files.
Security researchers have also identified attacks leading to local files and other user data being uploaded.
The "EPUB 3.1 Overview" document provides a security warning:
EPUB also requires PNG, JPEG and GIF.
Implementation
An EPUB file is an archive that contains, in effect, a website. It includes HTML files, images, CSS style sheets, and other assets. It also contains metadata. EPUB 3 is the latest version. By using HTML5, publications can contain video, audio, and interactivity, just like websites in web browsers.
Container
An EPUB publication is delivered as a single file. This file is an unencrypted zipped archive containing a set of interrelated resources.
An OCF Abstract Container defines a file system model for the contents of the container. The file system model uses a single common root directory for all contents in the container. All resources for publications are in the directory tree headed by the container's root directory, though EPUB mandates no specific file system structure for this. The file system model includes a mandatory directory named META-INF that is a direct child of the container's root directory. META-INF stores container.xml.
The first file in the archive must be the mimetype file. It must be unencrypted and uncompressed so that non-ZIP utilities can read the mimetype. The mimetype file must be an ASCII file that contains the string "application/epub+zip". This file provides a more reliable way for applications to identify the mimetype of the file than just the .epub extension.
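One way to satisfy the ordering and compression requirements when producing a container is sketched below with Python's zipfile module. The function name write_epub and the placeholder file contents are assumptions for illustration; a real publication would supply complete documents.

```python
import io
import zipfile


def write_epub(target, files):
    """Write an EPUB container to target (a path or a binary file object).

    files maps archive paths such as "META-INF/container.xml" to bytes.
    """
    with zipfile.ZipFile(target, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr(
            "mimetype", b"application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        for name, data in files.items():
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


out = io.BytesIO()
write_epub(
    out,
    {
        "META-INF/container.xml": b"<container/>",  # placeholder content
        "OEBPS/content.opf": b"<package/>",  # placeholder content
    },
)
with zipfile.ZipFile(io.BytesIO(out.getvalue())) as zf:
    print([info.filename for info in zf.infolist()])
```

Writing the mimetype entry before anything else matters because entries are laid out in the order they are added.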
An example file structure:
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
  toc.ncx
There must be a META-INF directory containing container.xml. This file points to the file defining the contents of the book, the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml, the other files are traditionally put in a directory named OEBPS. An example container.xml:
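A reading system starts from container.xml to locate the package document. The following Python sketch parses a sample container.xml and lists its rootfile entries; the sample document and the function name find_rootfiles are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Illustrative sample matching the file structure shown above.
SAMPLE_CONTAINER = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_rootfiles(container_xml):
    """Return (full-path, media-type) pairs for every rootfile element."""
    root = ET.fromstring(container_xml)
    ns = {"c": CONTAINER_NS}
    return [
        (rf.get("full-path"), rf.get("media-type"))
        for rf in root.findall("c:rootfiles/c:rootfile", ns)
    ]


rootfiles = find_rootfiles(SAMPLE_CONTAINER)
print(rootfiles[0][0])  # OEBPS/content.opf
```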
Publication
The EPUB container must contain:
- At least one content document.
- One navigation document.
- One package document listing all publication resources. This file should use the file extension .opf. It contains metadata, a manifest, fallback chains, bindings, and a spine, which is an ordered sequence of ID references defining the default reading order.
The EPUB container may contain:
- Style sheets
- Pronunciation Lexicon Specification documents
- Media overlay documents
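Because the spine holds only ID references, a reading system has to resolve each one against the manifest to obtain the default reading order. A minimal Python sketch of that lookup follows; the sample package document (with metadata omitted for brevity) and the function name reading_order are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Illustrative package document; the metadata element is omitted for brevity.
SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def reading_order(opf_xml):
    """Resolve spine itemrefs to manifest hrefs, in default reading order."""
    root = ET.fromstring(opf_xml)
    ns = {"opf": OPF_NS}
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", ns)
    }
    # A KeyError here would mean the spine references an id missing from the manifest.
    return [hrefs[ref.get("idref")] for ref in root.findall("opf:spine/opf:itemref", ns)]


print(reading_order(SAMPLE_OPF))  # ['chapter1.xhtml', 'chapter2.xhtml']
```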
Contents
Contents also include CSS and PLS documents. The navigation document supersedes the NCX grammar used in EPUB 2.
Media overlays
Books with synchronized audio narration are created in EPUB 3 by using media overlay documents to describe the timing for the pre-recorded audio narration and how it relates to the EPUB Content Document markup. The file format for Media Overlays is defined as a subset of SMIL.
Software
Many editors exist, including calibre and Sigil, both of which are open source. Another open source tool, called epubcheck, can be used for validating and detecting errors in the structural markup, image, and XHTML files.
Readers exist for all major hardware platforms except Amazon Kindle. Examples include Adobe Digital Editions and calibre on desktop platforms, Google Play Books and Aldiko on Android and iOS, and Apple Books on macOS and iOS.
Reading software
The following software can read and display EPUB files:
Software | License | Platform | DRM formats supported | Notes |
Adobe Digital Editions | Proprietary | Microsoft Windows, Apple Mac OS X, Android, iOS | Adobe Content Server | Requires online activation for ePub files with DRM. |
Aldiko | Proprietary | Android | Adobe Content Server | Supports ePub for Android devices. |
Apple Books | Proprietary | OS X, iOS | FairPlay | Supports EPUB 2 and EPUB 3. Books not readable directly on computers other than Macs. |
Bluefire Reader | Proprietary | Apple iOS, Android | Adobe Content Server | Supports ePub for Android and iOS devices. |
calibre | GPL | Windows, OS X, Linux | - | Primarily for library management, conversion, and transferring to devices, it includes an EPUB reader and editor. |
FBReader | Proprietary | Windows, Linux, Android, PDAs, OS X | - | |
Foliate | GPL | Linux | - | Supports also Mobi, AZW |
Google Play Books | Proprietary | Web application, Android, Apple iOS | Lektz DRM | Supports downloading purchased books as ePub and/or PDF. |
Kitabu | Proprietary | OS X | - | Supports ePub3, ePub2, Fixed layout. |
Kobo | Proprietary | Windows, OS X, Android, Apple iOS, Kobo eReader Software | Adobe Content Server | Supports EPUB 2 and EPUB 3. |
Lector | GPL | Linux | - | Supports also Mobi, AZW, CBR/CBZ, PDF, DjVu, FB2 |
Lektz Readers | Proprietary | Web application, Google Android, OS X, iOS, Windows | Lektz | eBook Readers for PDF, ePUB/2 and ePUB3 providing uniform experience across different platforms - iOS, Android, Windows PC, Mac Desktop and Web. |
Libby | Proprietary | Windows, Android, Apple macOS, iOS, iPadOS | | Free app for eBooks and audiobooks from local libraries. |
Lucifox | GPL | Windows, OS X, Linux | - | Ebook reader add-on with annotations for Firefox. Supports open standard ebooks in EPUB 3 and EPUB 2 format and retrieval of books from OPDS book catalogues. |
Microsoft Edge | Proprietary | Windows 10 | | Microsoft Edge used to support EPUB books but no longer does. |
Okular | GPL | Windows, OS X, Linux | | |
Snapplify | Proprietary | All Web browsers, Apple iOS, Android | Adobe Content Server, Snapplify SnappSafe DRM | Supports downloading purchased books as ePub and/or PDF. Supports PDF, ePUB2 and ePUB3 standard of eBooks. |
Sora | Proprietary | Windows, Android, Apple macOS, iOS, iPadOS | | Free app for eBooks and audiobooks from schools. |
STDU Viewer | Freeware | Windows | | Supports many document formats including ePub. |
Sumatra PDF | GPL | Windows | Adobe Content Server | Supports ePub for devices. |
Editing software
Software | Platform | License | Notes |
ABBYY FineReader | Microsoft Windows | Proprietary | Version 11 exports to EPUB format. |
Abiword | FreeBSD, Linux, Windows | GPL | Supports EPUB 2.0 format export since the 2.9.1 release. |
Adobe InDesign | Windows, OS X | Proprietary | Exports to EPUB format. Versions prior to 5.5 create EPUBs that require significant editing to pass ePubCheck or ePubPreFlight. As of InDesign CC 2014, InDesign can export in ePub3 fixed-layout format. |
Adobe RoboHelp | Windows | Unknown | Online documentation tool that supports export to EPUB format |
Atlantis Word Processor | Windows, Portable app | Shareware | Converts any document to EPUB; supports multilevel TOCs, font embedding, and batch conversion. |
Booktype | Web | GPL | Book production platform that outputs to many formats, including ePub. The platform can import content in various formats and supports collaborative editing. |
calibre | Windows, OS X, FreeBSD, Linux | GPL | Conversion software and e-book organizer. Allows plugins, including for editing EPUB files; there is for instance a plugin to merge several EPUB files into one. |
eLML | Windows, OS X, FreeBSD, Linux | Unknown | The eLesson Markup Language is a platform-independent XML-based open source framework to create eLearning content. It supports various output formats like SCORM, HTML, PDF and also eBooks based on the ePub format. |
Feedbooks | Web | Unknown | Free cloud service for downloading public domain works and for self-publishing. |
Help & Manual | Windows | Proprietary | Single source publishing tool that generates ePUB amongst several other documentation formats. |
HelpNDoc | Windows | Free for personal use, commercial otherwise. | Help authoring tool that generates EPUB files and other formats. |
iBooks Author | OS X | Unknown | Desktop publishing and page layout application. Free from Apple. Can export .ibooks format, which is a proprietary format based on EPUB. There are restrictions on the commercial distribution of works created with iBooks in the .ibooks format. These restrictions apply to the .ibooks format only and it can be argued that a file renamed to .epub is not distributed in the .ibooks format. |
iStudio Publisher | OS X | Proprietary | Desktop publishing and page layout application. |
LibreOffice | Windows, OS X, Linux | Mozilla Public License, GNU Lesser General Public License | Word processor that can export to ePub3 format since version 6.0. Can also export to ePub format through an installed extension, such as eLaix. |
Lulu.com | Web | Proprietary | Converts .doc, .docx, or PDF manuscripts to an ePub so that they can be sold on the Lulu.com website. |
Madcap Flare | Windows | Proprietary | Single source publishing tool that can export content as ePUB. |
oXygen XML Editor | OS X, Windows, FreeBSD, Linux | Proprietary | oXygen XML Editor is the first tool that supports creating, transforming, and validating the documents that comprise the EPUB package. |
Pages | OS X | Unknown | Word processor that can export to EPUB format. |
Pages | Apple | Unknown | Word processor for mobile devices that can export to EPUB format |
Pandoc | Unix-like, Windows | GPLv2 | Can output EPUB Versions 2 and 3 |
Playwrite | OS X | Proprietary | Native EPUB-based word processor. Native to EPUB 3 with EPUB 2 compatibility. |
QuarkXPress | OS X, Windows | Proprietary | Desktop publishing tool, page layout application. Exports also to the ePUB format. |
Serif PagePlus | Windows | Proprietary | Desktop publishing program that can export to the EPUB 2 and EPUB 3 format. Comes with built-in output conversion profiles for targeting specific devices, as well as generic devices. Also includes pre-tested blank eBook templates, or can open and edit existing PDF files and publish as EPUB. |
Scrivener | Windows, OS X | Proprietary | Program for writers. Includes organization capabilities for fiction writers. Publishes to multiple formats. |
Sigil | Windows, FreeBSD, Linux, OS X | GPL | Can open and edit EPUB books, instead of just converting from other formats to EPUB. Since version 0.7, supports embedding video or audio in EPUB. |
eXeLearning | Windows, Linux, OS X | GPL | Can be used to create educational interactive Web content, HTML5, IMS, SCORM and EPUB3 books |