Extreme load times experienced when reading O'Reilly ePub books w/ huge .ncx Table of Contents files

I just purchased a new Kobo Glo device to replace a Sony eReader PRS-T1 and attempted to test read some O'Reilly books, such as JavaScript: The Definitive Guide, 6th ed (5.2 MB file size). I ended up having to wait almost two minutes just to navigate to a chapter (for a constant point of reference, let's use Chapter 19, the chapter on jQuery)! I experienced performance degradation of a similar magnitude on the Sony eReader I recently replaced. After some research and experimentation (for brevity, the full details are included below), I discovered that the root cause of these extreme load times is the combined length and complexity of the .ncx Table of Contents files O'Reilly includes w/ their books.

To prove this point, I modified a copy of the above-mentioned JavaScript book and flattened the TOC, compressing it to about 6% of its original size. The resulting seek times were on par with every other eBook (3--5 seconds).

What I would like O'Reilly to do to fix these problems is to modify the program used to generate the ePub files such that (1) the NCX TOC is flat, containing only the Parts and Chapters of a book and not the sections of each chapter and (2) the ePub files contain a real, "printed" TOC that contains all the details currently included in the NCX TOC. In this way, those of us who wish to dig into our favorite O'Reilly titles can do so without having to wait inordinate amounts of time to navigate to a chapter or page in a book.

Full details of Research and Experimentation

I did some experimentation on my end to try and isolate the cause of the problem, first trying another O'Reilly book (C# 4.0 in a Nutshell [3 MB file size], Ch. 12; waited 20--30 seconds for the chapter to load), then a couple of books published by Informit (Code Reading: An open source perspective [9.3 MB file size] & Domain-Specific Languages [27 MB file size]; average seek time of 3--5 seconds in each book). I also tried a (Urp!) DRM-protected book published by Wrox called "Microsoft SQL Server 2008 Bible" (34 MB file size) and experienced load times of 3--5 seconds.
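For anyone who wants to run the same kind of comparison on their own books, here is a minimal Python sketch that reports how large the NCX file inside an EPUB is and roughly how many entries it holds. It is not the procedure used above, just one possible way to get the numbers. The function name `ncx_report` is made up for illustration, and the entry count is a crude byte search for `<navPoint`, so it will miss NCX files that use a namespace prefix on their elements.

```python
import zipfile


def ncx_report(epub):
    """Return (name, uncompressed_size_in_bytes, navpoint_count) for each .ncx in an EPUB.

    `epub` may be a file path or a binary file-like object; an EPUB is a ZIP archive.
    """
    rows = []
    with zipfile.ZipFile(epub) as zf:
        for info in zf.infolist():
            if info.filename.lower().endswith(".ncx"):
                data = zf.read(info)
                # Crude count: every TOC entry opens with a <navPoint ...> tag.
                rows.append((info.filename, info.file_size, data.count(b"<navPoint")))
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python ncx_report.py book1.epub book2.epub ...
    for path in sys.argv[1:]:
        for name, size, entries in ncx_report(path):
            print(f"{path}: {name} is {size} bytes with about {entries} TOC entries")
```

Running it against a slow book and a fast book side by side should show whether the TOC size differs as much as the load times do.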

Having demonstrated that the problem was not caused by the size of the ePub files, and having observed that the NCX TOC files for the non-O'Reilly eBooks are much smaller than those of O'Reilly eBooks, I decided to pull out Sigil and create a hack-ish copy of the JavaScript book w/ a full in-book TOC and a flat NCX TOC. After using Visual Studio 2012 to flatten the NCX TOC (the new TOC was about 6% of the original size), I saved the new NCX TOC into the JavaScript book, loaded the book onto my Kobo Glo, and navigated to a couple of chapters. The resulting load times were roughly 3--5 seconds: exactly on par w/ the other eBooks.
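The flattening step could also be scripted rather than done by hand in an editor. The following is a minimal Python sketch, assuming a standard EPUB 2 NCX in the usual DAISY namespace; it is an illustration, not the exact edit made in Visual Studio. The function name `flatten_ncx` and the `max_depth` parameter are hypothetical. It drops every `navPoint` nested deeper than `max_depth` and renumbers `playOrder` so the remaining entries stay sequential. It does not update the `dtb:depth` meta value in the NCX head, which would still need adjusting by hand.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"
# Keep the NCX namespace as the default one when serializing.
ET.register_namespace("", NCX_NS)
NAV_POINT = f"{{{NCX_NS}}}navPoint"


def flatten_ncx(ncx_xml: str, max_depth: int = 1) -> str:
    """Return the NCX document with navPoints deeper than max_depth removed."""
    root = ET.fromstring(ncx_xml)
    nav_map = root.find(f"{{{NCX_NS}}}navMap")
    if nav_map is None:
        raise ValueError("no <navMap> element found in the NCX document")

    def prune(parent, depth):
        # Copy the child list first because we remove items while looping.
        for child in list(parent.findall(NAV_POINT)):
            if depth > max_depth:
                parent.remove(child)
            else:
                prune(child, depth + 1)

    prune(nav_map, 1)

    # Renumber playOrder in document order so there are no gaps.
    for order, nav_point in enumerate(nav_map.iter(NAV_POINT), start=1):
        nav_point.set("playOrder", str(order))

    return ET.tostring(root, encoding="unicode")
```

With `max_depth=1` only the top-level entries survive; for a book whose top level is Parts with Chapters nested inside, `max_depth=2` would keep both. The result still has to be written back into the EPUB in place of the original .ncx file, for example with Sigil.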

Therefore, the cause of the extreme load times for O'Reilly books on my eReader is the size of the NCX TOC files O'Reilly includes in their eBooks.
  • Hi Robert,

    Apologies that you had a negative experience reading JavaScript: The Definitive Guide, 6e on your Kobo Glo. Thank you very much for your excellent feedback, and for taking the time to troubleshoot this issue and offer some suggested fixes. Your feedback is most appreciated.

    For some context on our current EPUB specs, several years ago we made a decision to exclude a "printed" version of the book's Table of Contents in the main book flow, and to solely use the NCX file to communicate TOC data. There were two main reasons we settled on this approach:

    1. The NCX file is the canonical mechanism in the EPUB 2.0.1 specification for communicating Table of Contents data to the ereader, and all major ereaders have a built-in UI for navigating the book via the NCX TOC. As such, we felt having an additional HTML version of the TOC in the book content was redundant.

    2. Not only was having the additional HTML version of the TOC redundant, but it was also potentially an annoyance to the reader, as he/she would have to flip through these extra TOC pages to get to the beginning of the actual book content.

    So that's the rationale behind our present TOC policies for EPUB. However, the most fundamental precept of O'Reilly's ebook development group is that we continuously strive to make our ebooks the best experience possible for all our customers, and we are willing to revisit pretty much any of our existing policies when new data comes to light that militates against them. Your findings on how a large NCX file degrades the reader experience on Kobo Glo certainly make a compelling case that we should revisit our TOC specs. So we are going to do just that and look into how we can better optimize the file size of our NCX documents, as well as reconsider inclusion of an additional HTML TOC in the main content flow.

    I can't promise that we will ultimately implement your exact suggestion, because when optimizing our EPUB content, we need to do so in light of the full ecosystem of ereaders that our customers use to consume our ebooks, which includes dozens of different hardware devices and software platforms that can vary quite a bit in terms of technical specs and rendering capabilities. What works best for Kobo Glo may or may not work well on NOOK Simple Touch, and we do our best to take tradeoffs like this into account when designing our EPUBs to ensure that our content is as device-agnostic as possible. But what I can promise is that as we continue to iterate and improve our EPUB-generation toolchain, we are going to make TOC optimization a top priority.

    Finally, I just want to add, on a more general note, that O'Reilly welcomes and encourages its customers to "hack" their EPUB files if there are customizations they want to make, just as you've done using Sigil. We make all O'Reilly ebooks available for purchase completely DRM-free, because we believe that when you buy an ebook from us, you own that content and should feel free to tweak it however you see fit for your personal use.

    Please let us know if you have any further questions about our EPUB specs or policies, or if you have any further feedback about your ebook reading experience on Kobo Glo or other ereaders. Again, we greatly appreciate all your feedback.

    Best,
    Sanders Kleinfeld

    ---
    Sanders Kleinfeld
    Publishing Technology Engineer
    O'Reilly Media Inc.
    sanders@oreilly.com
    @sandersk