Converting the Book to an HTML web site

July 1st, 2009 by marti

Several people have asked how I produced the html pages for the book website. What a perfect topic for a blog post!

(NB: Much of this material appears in a response I made to a comment from Will Fitzgerald.)

The answer is that I spent a lot of time on the conversion, because there wasn’t any out-of-the-box solution that I could live with.

You may ask: why html rather than just making pdfs of each chapter? I feel that html is often more readable online, although sites like scribd and slideshare are doing a nice job of embedding pdf files. Also, I wanted to make it easy to navigate from the mentions of figures to the figures themselves, to pop up the citations, and to link the citations to places to find them in the literature. And oh yes, I needed to provide search over the book!

So on to the conversion. I wrote the manuscript in LaTeX. There is a program called latex2html that does the conversion automatically, but the version installed on my system, at least, is from 2002 and doesn’t generate modern html. Therefore, I wrote special-purpose code to convert the LaTeX to html (although I did use bibtex2html for the references).
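
To give a flavor of what such special-purpose conversion code can look like, here is a minimal sketch in Python. It is not the code used for the book site: the function name `latex_to_html`, the `references.html` target, the `cite` class, and the citation key in the usage example are all assumptions made for illustration, and it handles only three commands with no nested braces.

```python
import html
import re


def latex_to_html(text: str) -> str:
    """Convert a tiny, illustrative subset of LaTeX markup to html."""
    # Escape &, < and > first so that literal characters survive in html.
    out = html.escape(text, quote=False)
    # \emph{...} becomes <em>...</em>.
    out = re.sub(r"\\emph\{([^{}]*)\}", r"<em>\1</em>", out)
    # \cite{key} becomes a link into a (hypothetical) references page.
    out = re.sub(
        r"\\cite\{([^{}]*)\}",
        r'<a class="cite" href="references.html#\1">[\1]</a>',
        out,
    )
    # \ref{label} becomes a link to an anchor on the same page.
    out = re.sub(r"\\ref\{([^{}]*)\}", r'<a href="#\1">\1</a>', out)
    return out
```

For example, `latex_to_html(r"See \emph{this} in \cite{smith2001}")` yields an `<em>` element followed by a link to `references.html#smith2001`. Real manuscripts need much more than this (nested commands, environments, math), which is where most of the conversion time goes.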

In retrospect, because of all the cross-referencing (to figures, tables, citations, etc.), I probably should have started with the output of latex2html on everything and then modified that output.

I put a lot of thought into whether to break the text up into chapters or sections, because I didn’t want each page to be too long but I wanted the context retained. I think chapters work best in the end, and that helps a lot with the cross-referencing. I broke the text up into sections for the search tool, though.
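
As a rough illustration of section-level splitting for a search index, here is a small Python sketch. It assumes the split is done on the LaTeX source at each `\section{...}` command; the post does not say whether the actual split was done on the LaTeX or on the generated html, and the function name and the "Introduction" label for text before the first section are made up for this example.

```python
import re

# Matches \section{Title} with no nested braces in the title.
SECTION_RE = re.compile(r"\\section\{([^{}]*)\}")


def split_into_sections(chapter: str) -> list[tuple[str, str]]:
    """Split one chapter into (title, body) pairs, one per section."""
    # With one capturing group, re.split returns
    # [text before first section, title1, body1, title2, body2, ...].
    parts = SECTION_RE.split(chapter)
    sections = []
    intro = parts[0].strip()
    if intro:
        # Text before the first \section gets a placeholder title.
        sections.append(("Introduction", intro))
    for title, body in zip(parts[1::2], parts[2::2]):
        sections.append((title.strip(), body.strip()))
    return sections
```

Each pair can then be indexed as its own small document, so a search hit points at a section while the reader still lands on the full chapter page.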

I was a bit concerned about having comments appear on the chapters, as that text is frozen and should always look the same, so I decided on a link from the chapter pages to comment pages on the blog. I am also going to make a link from each chapter page to a list of relevant blog posts as they appear. This way, again, the chapters will remain pretty much static while linking in a straightforward way to new content and corrections (in the unlikely event there are any … ha ha ha). We’ll see if that works or not. I did make a category for each chapter. I could also do it for sections but I think that is overkill.

I also wanted to use some of the whizzy features that JavaScript makes easy to employ now. For years I’ve avoided JavaScript in my interfaces because so many browsers were incompatible (and let’s face it, so I didn’t have to learn how to code with it) but now most browsers can handle it and there are terrific libraries like jQuery. I was careful to make sure that anything done in JavaScript was either duplicated in title attributes or else inessential, so that screen readers don’t miss anything important.

So I have the pop-ups for the citations, implemented using qTip, and the cool figure reveals, implemented in slimbox2. I did have a problem with one script clobbering another, but that will be described in a subsequent post on implementing the search function.

I know that if I were reading this book I would want to have links to the actual articles that the citations refer to. But everyone has different access rights to journals and so on, so I wrote a hack that runs a search over Google Scholar instead: in the script, for each reference, I grab the text between the start of the citation and the journal name (that is, the authors and title) and stick that in a query to Scholar. It sometimes doesn’t work right, when I grab too much text or weird characters get in there.
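
The idea can be sketched in a few lines of Python. This is an illustration rather than the script used on the site: the function name `scholar_url` is made up, the sketch assumes the journal name is already known for each reference, and the character filter is just one guess at how to keep "weird characters" out of the query.

```python
import re
from urllib.parse import urlencode


def scholar_url(reference: str, journal: str) -> str:
    """Build a Google Scholar search URL from the authors and title."""
    # Keep only the text before the journal name (authors and title).
    head, found, _ = reference.partition(journal)
    if not found:
        # Journal name not present: fall back to the whole reference.
        head = reference
    # Replace anything other than word characters, whitespace and a
    # little punctuation, then collapse runs of whitespace.
    cleaned = re.sub(r"[^\w\s.,'-]", " ", head)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .,")
    return "https://scholar.google.com/scholar?" + urlencode({"q": cleaned})
```

Given a reference string such as "A. Author and B. Writer. An example title. Journal of Examples, 1(2):3-4, 2000." and the journal name "Journal of Examples", the query contains only the authors and the title. When the journal name is missing or spelled differently in the reference, the fallback sends the whole string, which is one way the "grab too much text" failure can arise.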

The graphic design took a long time. I wanted to make it readable, and not too website-ish. I also wanted to incorporate the image from the cover of the book. So although I don’t think the design is as cool and cutting-edge as some sites are, I do think it is aesthetically pleasing and leads to a good online-reading experience. If you think otherwise, please let me know how to improve it!

Also, I realize I need to improve the main landing page and integrate blog posts into it. That is future work. Suggestions welcome on that, though.

4 Comments

  1. Thanks for explaining the process you went through. And thanks for doing all that extra work – not very many authors go to this length to make their book useful online.

    And at first glance, the information architecture decisions you made in structuring the book as a web site make sense. After a year of using it, then we shall see. (^:

  2. Marti says:

    Yes, and if you decide you think it should be changed in some way in particular, please do let me know.

  3. Erinah says:

    It’s great that the online version of the book was published this way, in html, rather than as, e.g., pdfs or embedded readers. It is easier to read and search on any of my devices, and much more comfortable.

  4. Marti says:

    Thank you, Erinah. I too feel that html is generally easier to work with, and of course it is better for search engine crawling!
