At SIGIR 2009, Diane Kelly, Karl Gyllstrom, and Earl W. Bailey presented an excellent paper comparing user-generated and algorithmically generated query suggestions. As far as I know, this is the first paper to make such a comparison within a usability study. With a healthy pool of 55 participants and 20 TREC topics, they found that query reformulation suggestions derived from human-issued search logs performed better than system-generated queries (at least using their system) in terms of the number of suggestions used, the number of relevant documents saved, and a precision score (although the difference in precision was not statistically significant).
They also compared multi-term query suggestions with single-term suggestions. Clicking a term added it to the current query, thus refining it, whereas clicking a query suggestion replaced the current query. The query-based suggestions performed better and were subjectively preferred over the single-term suggestions, with some participants noting that query-like suggestions presented whole ideas. Query-based suggestions were seen as useful for a cold start or when the searcher had run out of ideas, while terms were seen as more useful for refining an already specific query. Note that the result for single terms may have been affected by how this option was presented: in a paragraph-style list, which some participants regarded as “jumbled” looking.
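To make the difference between the two interaction styles concrete, here is a minimal sketch in Python. It is not the authors' implementation; the function names and the example queries are hypothetical, and the only behavior taken from the paper's description is that a term suggestion is added to the current query while a query suggestion replaces it.

```python
def apply_term_suggestion(current_query: str, term: str) -> str:
    """Refine the current query by appending the clicked term.

    Assumption: terms are simply appended, separated by a single space.
    """
    words = current_query.split()
    words.append(term.strip())
    return " ".join(words)


def apply_query_suggestion(current_query: str, suggestion: str) -> str:
    """Replace the current query with the clicked query suggestion."""
    return suggestion.strip()


if __name__ == "__main__":
    query = "hubble telescope"  # hypothetical starting query
    print(apply_term_suggestion(query, "repairs"))
    print(apply_query_suggestion(query, "space shuttle servicing mission"))
```

Seen this way, a term click can only narrow what the searcher already typed, while a query click can take the search in a new direction, which fits the participants' comments about when each kind of suggestion was useful.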
The reference is: Kelly, Gyllstrom, Bailey, A Comparison of Query and Term Suggestion Features for Interactive Searching, Proceedings of ACM SIGIR 2009 (no link available yet).
This is relevant to Section 6.3: Automated Term Suggestions.