National Level Resource Discovery services?

My favourite discovery game from childhood. Far easier to play than starting your own webscale service...

I’ve spent a large portion of the past couple of years working with a local discovery layer (Aquabrowser) and am currently investigating equivalent ‘webscale’ discovery index solutions such as Summon, Primo Central or EBSCO Discovery that may supplement or replace it.

I’ve occasionally found myself explaining the two solutions to non-library techy or developer colleagues. When we discuss the large webscale indexes such as Summon, folk have on more than one occasion asked me – why not do this yourself, it’s just Lucene and Solr (or Elasticsearch/Sphinx) scaled up … right?

Not exactly. Here are three reasons why …

1) Data

For this to happen, we would firstly need to get our hands on the data / full text indexed by the commercial solutions. This is no easy task.

Web scale suppliers have signed up publishers to partnership programmes to allow for harvesting and crawling of content. Agreements are most likely bilateral rather than universal and no real standard yet exists for this interchange. It’s easier and probably cheaper (at least initially) for me to buy into someone else’s hard work here. But this is itself rather dangerous: it amounts to quite a serious outsource and potential loss of control. The only real influence on change could be by switching vendor.

The Library Loon has recently commented on two important recent Open Data releases from Nature and OCLC and the potential impact these can have on Discovery services. If Open Data in libraries really needs a better use case, this is surely it.

The problem is, the data that is most valuable is the stuff libraries themselves do not own. It would be great to see more publishers follow Nature’s example and an end to the silly games of withholding data from competitor services.

2)  Infrastructure

Take your ‘just Lucene / Solr’, ingest and normalize varied data from 200-300+ different sources, scale for hundreds of thousands of concurrent users and accommodate well over fifty million records. Then keep it mirrored worldwide with 24×7 uptime.
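To give a flavour of just the ‘ingest and normalize’ part, here is a minimal Python sketch. Everything in it is my own assumption for illustration: the source names, the field mappings and the `Record` shape are hypothetical, and no vendor's real pipeline is implied. It maps two differently shaped feeds onto one common record and collapses duplicates on a shared identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical common schema that every source is mapped onto."""
    source: str
    title: str
    identifier: str


# Hypothetical per-source field mappings; a real service would need
# one of these (and usually much more) for each of its sources.
FIELD_MAPS = {
    "publisher_a": {"title": "article_title", "identifier": "doi"},
    "catalogue_b": {"title": "dc:title", "identifier": "dc:identifier"},
}


def normalize(source, raw):
    """Map one raw record from a named source onto the common schema."""
    mapping = FIELD_MAPS[source]
    # Collapse runs of whitespace in the title.
    title = " ".join(str(raw[mapping["title"]]).split())
    # Trim and lower-case the identifier so the same item matches across feeds.
    identifier = str(raw[mapping["identifier"]]).strip().lower()
    return Record(source=source, title=title, identifier=identifier)


def merge(records):
    """Keep the first record seen for each identifier."""
    seen = {}
    for record in records:
        seen.setdefault(record.identifier, record)
    return list(seen.values())
```

Two mappings are trivial. The hard part is writing and maintaining a few hundred of them, each with its own quirks, while publishers keep changing their formats.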

Again, for a single library, the cost of an annual subscription and the startup costs of a DIY service are simply not comparable.

So why not seek a partner, I may be asked? This brings me onto …

3) Management and ownership

Collaboration is hard, especially at an institutional level (2+2=3 etc.). I recently read this long but fascinating insight into the running of various web Portal services such as Intute, and how with a bit of dynamic thinking they ‘could’ have morphed into a service such as Summon. It’s a very personal piece, although I do agree about the inherent silliness of trying to catalogue even ‘the best of the web’; it is simply not relevant to search-engine-centric web usage.

From what is described there, the budgets and resources were in place to potentially attempt this. But this was six years ago, pre-crash when we had money. Things in the UK HE sector are very different now.

With a change in operation of the JISC, it’s not exactly clear who could take this on, although it is possible that if some future combination of Archives Hub and COPAC started to absorb open data from publishers, it may evolve on its own.

Does it matter?

Right now we have three large commercial players in the library web-scale market, all in close competition. Hopefully, this should surely be enough to keep things fresh and current.

I have argued that the lack of development over the past 20 years in LMS products, especially the OPAC, has assisted in the marginalization of library services. So that this is not repeated, I would again agree with the Library Loon and hope web scale discovery service vendors continue to grow and innovate with their products and rely less on the coverage of material to act as a selling point. Summon has recently launched discipline-centric searches and Primo Central has some fascinating ideas around relevancy ranking that understands user context. I hope this trend at least continues.


2 thoughts on “National Level Resource Discovery services?”

  1. Pingback: More on what went wrong with Intute. And – is there a can of worms? « Roddy Macleod's Blog

  2. I think I agree with most of this, but the thing that worries me most is “Right now we have three large commercial players in the library web-scale market, all in close competition. Hopefully, this should surely be enough to keep things fresh and current.”

    Experience suggests that three large players in a relatively small market, with relatively small margins for suppliers, is not enough to keep things fresh or current. See integrated library management systems.

    As you highlight, the key is the Data side – and why libraries and publishers need to be making data for discovery services available openly to all comers – to level the playing field and at least take away one of the barriers to entering the market. This obviously doesn’t guarantee new entries, but it opens up the possibility and would allow existing vendors to focus on improving functionality.
