Search User Interfaces named a best book of 2009

December 6th, 2009 by marti | 4 Comments »

Avi Rappoport alerted me to the fact that Chris Sherman of Search Engine Land named Search User Interfaces one of the
best SEO books of 2009.

I was pleased with how Chris explained this choice, given that my book does not talk about SEO directly. He says:

“With our focus on search marketing at Search Engine Land, it’s unusual for me to review an academic textbook, but Search User Interfaces is an unusual book—and I mean that as high praise.”

He goes on to explain what he found valuable about the book in more detail. I am really glad to know the book is of value to practitioners as well as academics.

Search User Interfaces is Shipping!

October 4th, 2009 by marti | 7 Comments »

I’ve heard from colleagues that they are receiving their copies of the physical book in the mail from Amazon. If you’ve been waiting to order a copy until it is in stock, procrastinate no longer! And of course I encourage people to write a review on Amazon and other sites.
Order here

User-Generated Query Suggestions Found to be Better than Machine-Generated

July 26th, 2009 by marti | 1 Comment »

At SIGIR 2009, an excellent paper was presented by Diane Kelly, Karl Gyllstrom, and Earl W. Bailey that compares user-generated versus algorithmically-generated query suggestions. As far as I know, this is the first paper to do such a comparison within a usability study. With a healthy-sized pool of 55 participants and 20 TREC topics, they found that query reformulation suggestions derived from human-issued search logs performed better than system-generated queries (at least using their system), in terms of number of suggestions used, number of relevant documents saved, and a precision score (although the latter was not a statistically significant difference).

They also compared multi-term queries to single-term suggestions (where clicking on the term added it to the current query, thus refining it; query suggestions replaced the current query), finding the query-based suggestions were better and subjectively preferred over the single-term suggestions, with some participants noting that query-like suggestions presented whole ideas. Query-based suggestions were seen as useful for a cold-start or when the searcher had run out of ideas. Terms were seen as more useful for refining an already specific query. Note that the result for single terms may have been affected by the presentation of this option, in a paragraph-style list, which some participants regarded as “jumbled” looking.
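
To make the difference between the two suggestion types concrete, here is a minimal Python sketch of the click behavior described above. It is not the system from the paper; the function name and the "term"/"query" labels are hypothetical, chosen only for illustration.

```python
def apply_suggestion(current_query: str, suggestion: str, kind: str) -> str:
    """Return the query that results from clicking a suggestion.

    `kind` is a hypothetical label: "term" for a single-term suggestion,
    "query" for a whole-query suggestion.
    """
    if kind == "term":
        # A term suggestion refines: it is added to the current query.
        return f"{current_query} {suggestion}".strip()
    if kind == "query":
        # A query suggestion replaces the current query outright.
        return suggestion
    raise ValueError(f"unknown suggestion kind: {kind!r}")
```

Under these assumptions, clicking the term "habitat" while the query is "jaguar" refines it to "jaguar habitat", whereas clicking a query suggestion discards "jaguar" and substitutes the suggested query.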

The reference is: Kelly, Gyllstrom, Bailey, A Comparison of Query and Term Suggestion Features for Interactive Searching, Proceedings of ACM SIGIR 2009 (no link available yet).

This is relevant to Section 6.3: Automated Term Suggestions

Putting a bit of Bing in the book

July 20th, 2009 by marti | 3 Comments »

I finished writing the book and sent the files to my publisher towards the end of January 2009. Towards the end of April they sent me a round of copy-editing comments which I then either retained or rejected. On June 24th the publisher gave me the final proofs, to which I was instructed only to make small corrections of fact and typos. Also on June 24th, I launched the online version of the book.

BUT something happened between the end of April and the end of June. Yes, a day after I put the book online I was criticized for not mentioning Bing, which had only appeared on the face of the internet two weeks earlier! Of course I did have numerous mentions of Microsoft’s search. (Actually, there was a Bing in my book, but it was an author’s name in the references.)

Last week I completed my final round of small edits, and even though it is against the rules, I’ve added a little reference to Bing, since I think the name will have staying power. It appears next to the first image from the old Microsoft search. This way searches on the name won’t come up empty.

I also added corrections to errors that readers of this column so kindly pointed out to me. All of the changes in the proofs are now reflected in the online version of the book, and I don’t expect it to change again after this.

I wish there were a bit more time before the final proofs are needed, in case any of you kind readers find more errors. But that would delay production of the hardcopy book, so now things are frozen. Fortunately for the dead-tree version, Google is highly unlikely to rebrand itself any time soon.

How did you convince your publisher to let you put your book online for free

July 6th, 2009 by marti | 14 Comments »

Interestingly, no one so far has asked me why I am putting my book online for free. Rather, they’ve been asking how I got my publisher to agree to it. Assumptions have really changed!

The short answer to how is that I chose Cambridge University Press in part because I saw they had allowed Chris, Prabhakar, and Hinrich to put their book Introduction to Information Retrieval freely online before publishing it in dead tree form.

CUP hasn’t done this with all that many books, but so far it seems to lead to more sales. So let’s hope that happens with Search User Interfaces as well.

The second reason I chose CUP is they have published the bulk of the books related to human information seeking, including Gary Marchionini’s Information Seeking in Electronic Environments and Rik Belew’s Finding Out About. (BTW, I just noticed you can search these books and see hits in context at the CUP website!)

As for why, I’ve long been of the mindset that my writing has value so far as it has impact, and the best way to do that is to maximize how many people read it. Along with many others, I’ve long been an advocate of freeing the content of academic journals, since the knowledge produced by research should be dispersed as widely as possible, and in my field at least the academics do all the writing, reviewing, and editing. Not to mention that the government pays for a large chunk of the research being reported. People used to laugh at me when I started saying this, but thanks to the hard work of lots of other people, that ship is in the slow process of turning.

But I suspect my primary reason for putting the book up free is laziness: now when I’m having a conversation about a search interfaces topic I can just say “if you want to know more, go to section S.X of my book.” It’s like a mental subroutine call.

Mama, don’t let your babies name their books with garden path noun compounds

July 4th, 2009 by marti | 3 Comments »

When I realized that I was writing a book and not just a few chapters for a more general book, I instantly knew that the title had to be Search User Interfaces. It is a phrase that I use a lot in my work, and I was trying to summarize all of the literature on that topic.

However, usually when I tell people the title of the book, they look at me a bit blankly. This surprised me because everyone uses search on the web, although I guess they call it “googling” (a term I never use outside of scare quotes) rather than “searching” or “web searching” or “using web search engines”. (Ok, I admit “googling” is more concise and clear, but many of us were working on search before Google arose. And from my time at Xerox PARC I learned never to use the product name (“xerox”) to describe the process of using the product, lest you destroy the trademark claims of your employer. So to this day I say “photocopy”, not that anyone xeroxes any more.)

Anyhow, I eventually came up with a better theory for why the book title throws people off. The phrase “search user interfaces” is a garden path noun compound.

What does that mean? Well, the only other time I’ve written blog posts was during a brief stint when I consulted for Powerset. While there I wrote a post called noun-noun compound is like a chocolate box in which I explained how my research group was working on programs to automatically interpret phrases that consist of a series of nouns. The key issue is how do the different nouns relate to one another semantically? For instance, a steel knife is a knife made of steel, while a butter knife is not a knife made of butter, but rather a knife used to spread butter. Sometimes it can be ambiguous: is a chocolate box a Gumpian box filled with chocolates, or a whimsical box made of chocolate? This is the kind of analysis our research tried to do.

The other relevant blog post I wrote was called search engines leaking oil for holes and discussed the linguistic phenomenon of garden path sentences. These sentences are confusing because the first few words make the reader assume they are going along one path, and then the next word switches the meaning in a new direction. So a phrase like blog posts digest stories leads you, the reader, down the garden path by making you think it is talking about “blog posts”, but then you expect a verb and see “digest”, which isn’t really what blog posts typically do, in the eating sense. You have to back up one word and see that this is more of a headline, saying, this blog posted a digest of stories. Ok, a hokey example, but it illustrates the point. Same goes for “search engines leaking oil for holes”. The trick is that here the word “search” is used as a verb, but because you see the word “engines” right after it, and you’re used to reading about the noun compound “search engines”, you think that is what is being talked about, and then you get this weird picture of search engines leaking oil, which would be ok except for the “for holes” part. The sentence is really a command to auto mechanics to search any engines they encounter that are leaking oil to see if the cause is holes.

Ok, returning to the original topic. I suspect that when people hear me say “search user interfaces” they think I’m telling them to do a security check of their computer screen. I made the title into a garden path noun compound.

Converting the Book to an HTML web site

July 1st, 2009 by marti | 4 Comments »

Several people have asked how I produced the html pages for the book website. What a perfect topic for a blog post!

(NB: Much of this material appears in a response I made to a comment from Will Fitzgerald.)

The answer is that I spent a lot of time on the conversion, because there wasn’t any out-of-the-box solution that I could live with.

You may ask: why html rather than just making pdfs of each chapter? I feel that html is often more readable online, although sites like scribd and slideshare are doing a nice job of embedding pdf files. Also, I wanted to make it easy to navigate from the mentions of figures to the figures themselves, to pop up the citations, and to link the citations to places to find them in the literature. And oh yes, I needed to provide search over the book!

So on to the conversion. I wrote the manuscript in latex. There is a program called latex2html that automatically does the conversion, but the version installed on my system at least is from 2002 and doesn’t generate modern html. Therefore, I wrote special purpose code to convert the latex to html (although I did use bibtex2html for the references).

In retrospect, because of all the cross-referencing (in figures, tables, citations, etc), I probably should instead have started with the output of latex2html on everything and then modified the output.
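
For readers curious what the cross-referencing part of such a conversion involves, here is a minimal Python sketch. It is not the special purpose code used for the book site, which is not shown here; the function name, the anchor scheme, and the references.html target are all hypothetical, and real LaTeX needs far more cases than these two.

```python
import re


def link_cross_references(latex: str) -> str:
    """Rewrite LaTeX ref and cite commands as HTML links (illustrative only)."""
    # A figure or table mention such as \ref{fig:facets} becomes a link
    # to an anchor with the same label on the page (hypothetical scheme).
    html = re.sub(r"\\ref\{([^}]+)\}", r'<a href="#\1">\1</a>', latex)
    # A citation such as \cite{key} becomes a link into a separate
    # references page (hypothetical file name).
    html = re.sub(
        r"\\cite\{([^}]+)\}",
        r'<a class="cite" href="references.html#\1">[\1]</a>',
        html,
    )
    return html
```

The sketch hints at why cross-references dominate the work: every label has to resolve to a stable anchor, and a pop-up or figure reveal can only be attached once those links exist.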

I put a lot of thought into whether to break the text up into chapters or sections, because I didn’t want each page too long but I wanted the context retained. I think chapters work best in the end, and it helps a lot with the cross-referencing. I broke the text up into sections for the search tool, though.
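
Splitting a chapter into section-sized units for a search index can be sketched in a few lines of Python. This assumes the chapter source marks sections with a plain \section{...} command and ignores subsections and starred variants; the function name is hypothetical and this is not the code behind the site's search tool.

```python
import re
from typing import List, Tuple


def split_into_sections(chapter_latex: str) -> List[Tuple[str, str]]:
    """Return (title, body) pairs, one per \\section in the chapter."""
    # re.split with one capturing group yields:
    # [text before first section, title1, body1, title2, body2, ...]
    parts = re.split(r"\\section\{([^}]*)\}", chapter_latex)
    titles = parts[1::2]
    bodies = parts[2::2]
    return [(title.strip(), body.strip()) for title, body in zip(titles, bodies)]
```

Each pair can then be indexed as its own search hit while the reader still lands on the full chapter page, which keeps the surrounding context.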

I was a bit concerned about having comments appear on the chapters, as that text is frozen and should always look the same, so I decided on a link from the chapter pages to comment pages on the blog. I am also going to make a link from each chapter page to a list of relevant blog posts as they appear. This way, again, the chapters will remain pretty much static while linking in a straight-forward way to new content and corrections (in the unlikely event there are any … ha ha ha). We’ll see if that works or not. I did make a category for each chapter. I could also do it for sections but I think that is overkill.

I also wanted to use some of the whizzy features that javascript makes easy to employ now. For years I’ve avoided javascript in my interfaces because so many browsers were incompatible (and let’s face it, so I didn’t have to learn how to code with it) but now most browsers can handle it and there are terrific libraries like jQuery. I was careful to make sure that anything javascript was duplicated in title tags, or else irrelevant, so that screen readers don’t miss anything important.

So I have the pop-ups for the citations, implemented using qTip, and the cool figure reveals, implemented in slimbox2. I did have a problem with one script clobbering another, but that will be described in a subsequent post on implementing the search function.

I know that if I were reading this book I would want to have links to the actual articles that the citations refer to. But everyone has different access rights to journals and so on, so I wrote a hack to make the search over Google scholar; in the script, for each reference, I grab the text between the start of the cite and the journal name (the authors and title) and stick that in a query to scholar. It sometimes doesn’t work right when I grab too much text or weird characters get in there.
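
The idea can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the script used on the site: the function name is hypothetical, the venue name is passed in by hand rather than detected, and the Scholar URL pattern is assumed to accept the query in a q parameter.

```python
from urllib.parse import urlencode

# Assumed base URL for a Google Scholar search.
SCHOLAR_URL = "https://scholar.google.com/scholar"


def scholar_link(reference: str, venue: str) -> str:
    """Build a search link from the authors-and-title part of a reference."""
    # Keep only the text before the venue name (the authors and title).
    head, found, _ = reference.partition(venue)
    # If the venue is not found, fall back to the whole reference, which
    # is the "grabbed too much text" failure mode mentioned above.
    query = (head if found else reference).strip(" ,.")
    return f"{SCHOLAR_URL}?{urlencode({'q': query})}"
```

Letting urlencode escape the query takes care of spaces and punctuation, though unusual characters left over from the LaTeX source could still produce poor matches.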

The graphic design took a long time. I wanted to make it readable, and not too website-ish. I also wanted to incorporate the image from the cover of the book. So although I don’t think the design is as cool and cutting-edge as some sites are, I do think it is aesthetically pleasing and leads to a good online-reading experience. If you think otherwise, please let me know how to improve it!

Also, I realize I need to improve the main landing page and integrate blog posts into it. That is future work. Suggestions welcome on that, though.

Launching “Search User Interfaces”

June 24th, 2009 by marti | 11 Comments »

This blog is a companion for my new book, which I am making available online today.

My intended audience is practitioners and researchers who spend a lot of time thinking about user interfaces for search engines, or who want to bone up on the field.

In this book I try to stick to the evidence. My intention is for all claims to be backed up by usability studies, log studies, or some other form of proof, unless otherwise noted.

Ten years ago when I wrote the first chapter on search interfaces for the textbook Modern Information Retrieval, little was known about what works from a usability perspective and what does not. Although we as a community still have more questions than answers, there now is enough known to fill a book.

In putting this information into a book there is a danger of rapid obsolescence. Search is a fast-paced field and many examples will soon become out of date. I hope that despite this, much of what I wrote is fundamental and will stand the test of time.

Even so, it is important to update the text with discussions of new developments and how they relate to what has come before, and that is what this blog is for.

And of course I want to hear from search aficionados about what you think about these developments! And what you think about the contents of the book. I hope you’ll join the conversation!