A Critique of the British Library's Newspaper Archive, 1600-1900
In 2004 the
British Library was funded to set up a database of newspapers from 1600-1900 with
a grant of two million pounds from the Joint Information Systems Committee.[1] It
represents one percent, or 2.2 million pages, of the BL archive and will only be
added to in the future if funding permits. The British Library has used a
secondary partner to produce this database: Gale, which is part of an American
group called Cengage Learning. This group manages over six hundred sites for a
variety of different applications such as eBooks and large print media as well
as databases of this kind. The layout is fairly standard and perhaps already a
little dated when compared to more commercial sites (please see below). A
specific article can be called up or the whole page viewed at one time, with
the usual controls to magnify each item. There are also options to mark items and
bookmark them; however, marked items appear not to be saved from one visit to the
next, and bookmarks require the recipient to have access to the database as well
in order to view them. The basic search can be by date, publication, location
or a specific collection. The advanced search, however, offers not only Boolean
operators but also a 'fuzzy' search option with three levels: low, medium
or high (please see below). Image quality is reasonable, with options not
only to download as a PDF but also to print direct from the site (as opposed to
copying an internet image and printing it yourself). This could be a limitation, as
not all the scanned images are of the same standard and they also depend on the
quality of the original document. This leads on to the OCR, which can be erratic
at times; certainly, the higher the fuzzy search option you use, the more
miscellaneous hits you will receive (which is to be expected). However, as far as I can
ascertain, there is no option to look at the underlying metadata or XML,
although each document page has an individual number which you could
refer to if you wished to report a mistake to the management company.
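To illustrate why a higher fuzzy setting brings in more miscellaneous hits, here is a minimal sketch in Python. It is not the archive's actual matching method, which is not published; the similarity cut-offs, the `fuzzy_hits` helper and the sample OCR words are all assumptions made up for the example, using the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical similarity cut-offs for each fuzzy level (1.0 = exact match).
# The archive does not document its own values; these are assumptions.
FUZZY_LEVELS = {"low": 0.87, "medium": 0.80, "high": 0.55}


def fuzzy_hits(term, words, level):
    """Return the words whose similarity to `term` meets the cut-off for `level`."""
    cutoff = FUZZY_LEVELS[level]
    term = term.lower()
    return [
        word
        for word in words
        if SequenceMatcher(None, term, word.lower()).ratio() >= cutoff
    ]


# Invented OCR output: one correct reading, some misreadings, some other words.
ocr_words = ["Norfolk", "Norfo1k", "Nortolk", "Suffolk", "Norwich", "Northfolk"]

for level in ("low", "medium", "high"):
    print(level, fuzzy_hits("Norfolk", ocr_words, level))
```

With these assumed cut-offs, the medium level recovers the misread 'Norfo1k' and 'Nortolk', while the high level also lets in the unrelated 'Suffolk', which is the trade-off described above.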
This site looks quite dated when
compared to the British Library's newer site, The British Newspaper Archive, run by the company
Brightsolid, which is a subscription site that the University does not seem to
have access to (no Athens login).[2] Although
I could not look at the underlying text for the site being criticised, if it is
similar to the Burney Collection (which this site uses as well), then a chance
that every other word is incorrect does not seem to be best practice.[3] If
fifty percent of search terms are missed due to poor OCR recognition, at best
this could be thought of as negligent and at worst as possible fraud. This would
be even more disturbing if it also applied to the newer subscription site.
However, is it better to have fifty percent of some primary sources rather
than a hundred percent of none?
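The effect of a fifty percent word accuracy on searching can be put into rough numbers. The sketch below is only a back-of-envelope model: it assumes each word is recognised independently with the same probability, which real OCR errors will not follow exactly, and the function names are my own.

```python
def phrase_recall(word_accuracy, words_in_phrase):
    """Chance that every word of a search phrase was recognised correctly."""
    return word_accuracy ** words_in_phrase


def article_found(word_accuracy, occurrences):
    """Chance that at least one of several occurrences of a word was recognised."""
    return 1 - (1 - word_accuracy) ** occurrences


# Fifty percent word accuracy, the figure discussed above.
accuracy = 0.5
for length in (1, 2, 3):
    print(f"{length}-word phrase matched: {phrase_recall(accuracy, length):.1%}")
for count in (1, 3, 5):
    print(f"word appearing {count} time(s) found: {article_found(accuracy, count):.1%}")
```

Under these assumptions, longer search phrases are missed far more often than single words, while a word that appears several times in an article still has a fair chance of being found once.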
Word count 504
[1] Information from the official 'About this Site' pages; please note that you must be logged into the University site and into Athens to use this link: http://find.galegroup.com/bncn/page.do?page=/bncn_about.jsp (consulted 10/3/12).
[3] Information from Week One Worksheet, week commencing 23/1/12.
[Screenshot: Basic Search]
[Screenshot: Page from Basic Search, Norfolk Incendiary]
[Screenshot: Advanced Search]
[Screenshot: Advanced Search, Erroneous Result]
[Screenshot: No Results? Ranter Norfolk]
[Screenshot: Fuzzy Search (Med) on Primitive Methodist Norfolk]
Looks good to me - on the images, have you tried resizing them before you upload them?