
As the new century dawned, Marie Curie informed the scientific world of
her discoveries of the radioactive elements polonium and radium, and
provided the key to a basic change in the way we study matter and
energy.
In 1905 Einstein described the special theory of relativity, and ten years
later the general theory of relativity, completely overturning
previous thought and replacing it with a startling new framework.
In 1922 Frederick Banting and Charles Best discovered a pancreatic extract,
insulin, which had antidiabetic characteristics.
In 1928 Sir Alexander Fleming observed that colonies of the bacterium
Staphylococcus aureus could be destroyed by the mold Penicillium
notatum.
Backfile data: The foundation for today's knowledge
These are just a few of the discoveries from the first half of the
twentieth century that caused fundamental shifts in scientific
understanding and ushered in new eras of scientific development. But other,
less heralded achievements from this era also continue to have a deep
impact on research. Until recently, many of the published records of these
achievements lay buried in storehouses or restricted collections,
accessible to few if any researchers.
This wealth of historical research data is now available online via
the Century of Science initiative within Web of Science.
Over the last two years, nearly 120 Thomson Scientific staff have been
dedicated to selecting, finding, and indexing this important research
material. The first question (was there a demand for this
information?) was clearly answered by Thomson customers worldwide
with a resounding "yes!"
The educated use of citation data:
Discerning which journals should be included in the collection
The second question was tougher: which journals should be included?
As Jim Testa, Director of Editorial Development, Thomson Scientific, said,
"We started with an endless list of journals. The question was, which were
the most useful?" The Thomson Scientific editorial team set out
to include research that was relevant, significant, and useful to today's
researcher. In order to do this, they turned to the most powerful tool
within Web of Science: citation navigation.
"Everything begins with ISI® data. The only way you can
successfully create the file that makes up Century of Science is
with ISI cited reference data. That is how you determine what is useful,
influential, important," stated Testa.
"Data does not reveal its truths automatically," added Keith MacGregor,
Executive Vice President, Academic & Government Markets, Thomson
Scientific. "Cited reference searching puts data into context. This makes it
an invaluable tool not only for the researcher, but also for the
information provider trying to compile decades of significant research."
Building the collection: A step-by-step process
The first selection criteria were based on citation patterns within the
current Web of Science database from 1945 to 2004. Which articles
from 1900 to 1944 were highly cited? And in which journals were they
published? This initial dataset comprised 200,000 journal items,
including articles, reports, editorials, reviews, and commentaries.
Clearly, the content needed further refinement. The next step was to
identify the journal items that had at least 50 citations to them and to
use this criterion to select the most important journals of the day. This
step allowed the editorial team to focus on a collection of 2,000 journals.
Then, journal title abbreviations were identified and unified. "Before the
era of government funding and the advent of citation indexing, there were
fewer journals being published, with less standardization," said Testa.
"People would often cite a source cryptically and inexactly because
researchers in the field knew their principal source publications well,"
added Maureen Handel, Manager of Journal Selection, Thomson Scientific. "The
older the material got, the more the challenges grew. Journal names
changed, abbreviations changed, naming conventions varied widely."
The daunting task of identifying journals based on their citations was
greatly aided by using Web of Science. "For example," said Handel, "a
cited reference search based on the title variant MNRAS reveals that it
represents the Monthly Notices of the Royal Astronomical Society."
This unification process further narrowed the selection down to
approximately 1,500 journals.
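The unification step described above can be pictured as a lookup from cited title variants to one canonical journal title. The following is only a minimal sketch under stated assumptions, not the editorial team's actual procedure: the function names, the normalization rule, and the variant table are hypothetical, and only the MNRAS example comes from this article.

```python
import re
from typing import Optional

# Hypothetical variant table. Only the MNRAS entry is taken from the article;
# the second variant is added purely for illustration.
CANONICAL_TITLES = {
    "MNRAS": "Monthly Notices of the Royal Astronomical Society",
    "MON NOT R ASTRON SOC": "Monthly Notices of the Royal Astronomical Society",
}


def normalize_variant(raw: str) -> str:
    """Replace punctuation with spaces, trim the ends, and uppercase."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", " ", raw).strip()
    return cleaned.upper()


def unify_title(raw: str) -> Optional[str]:
    """Return the canonical title for a cited variant, or None if unknown."""
    return CANONICAL_TITLES.get(normalize_variant(raw))
```

Under these assumptions, `unify_title("Mon. Not. R. Astron. Soc.")` and `unify_title("MNRAS")` resolve to the same journal, while an unrecognized variant returns `None` and would need manual review.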
A second, article-based dataset was then brought into play, consisting of
titles that published at least one article with 100+ citations. After the
same unification process, these two datasets were combined and refined,
yielding a dataset of journal titles that published five or more articles
with 100+ citations, or that had a total of 1,500+ citations. This was the
base for the final selection of highly cited titles.
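The final citation rule above (five or more articles with 100+ citations, or 1,500+ citations in total) can be sketched as a simple filter. This is an illustrative reading rather than the actual selection code: it assumes the input is one (journal title, citation count) pair per article and that the 1,500 figure is a per-journal total. The function and constant names are hypothetical.

```python
from collections import defaultdict

HIGHLY_CITED = 100             # citations needed for an article to count as highly cited
MIN_HIGHLY_CITED_ARTICLES = 5  # such articles needed per journal
MIN_TOTAL_CITATIONS = 1500     # alternative threshold: total citations per journal


def select_journals(items):
    """Select journals from (journal_title, citation_count) pairs, one per article."""
    highly_cited = defaultdict(int)  # journal -> number of articles with 100+ citations
    totals = defaultdict(int)        # journal -> total citations (assumed per journal)
    for journal, citations in items:
        totals[journal] += citations
        if citations >= HIGHLY_CITED:
            highly_cited[journal] += 1
    return sorted(
        journal
        for journal in totals
        if highly_cited[journal] >= MIN_HIGHLY_CITED_ARTICLES
        or totals[journal] >= MIN_TOTAL_CITATIONS
    )
```

With made-up data, a journal with five articles of 120 citations each qualifies under the first condition, a journal with a single article cited 1,600 times qualifies under the second, and a journal with ten articles of 99 citations each qualifies under neither.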
Beyond citations: Ensuring full geographic and disciplinary coverage
Now, the editorial team had a carefully chosen collection of highly cited
journals. But citation patterns only told part of the story. In order to
ensure a truly representative and comprehensive collection, they had to go
beyond the journal level, beyond the straightforward crunching of numbers, to
make sure they were including the crucial seminal articles of the time
period.
Geographic patterns and a meaningful balance across scientific disciplines
were important considerations, as was ensuring that significant articles
published in short-lived or otherwise obscure journals, or in journals
covering seemingly unrelated disciplines, were included. A good example is
the Transactions of the Ophthalmological Society of Australia,
which included a critical paper describing the association of infection
with German measles in early pregnancy with cataracts and lesions of the
newborn's heart.
Since worldwide communications weren't the standard 100 years ago, what
happened in one part of North America or Europe or Asia might not be
disseminated outside that region. As the 20th century progressed, English
became the language of science, but in earlier years many significant
papers were published only in regional journals and were never translated into
English. Century of Science staff traveled worldwide to find these
journals, and over 30 translators at the Limerick office translated the
items into English. The final count: over 850,000 items from more than 250
journals. The next challenge: how to find these older publications.
Libraries, societies, and publishers: partners behind the scenes
Who was keeping these archives? Where could the information be found?
Almost everywhere, it turned out. Valuable yet almost forgotten journals
were found in boxes in storage facilities. In renowned libraries' archives.
In professional associations' historical files. Thomson's partners
immediately saw the value of compiling and disseminating this backfile data
and helped our editorial team retrieve and compile the information. "Their
efforts were key to the success of this massive undertaking," stated
MacGregor. "They helped Thomson take valuable research information from
scattered archives and rare book collections and turn it into fully
indexed, accessible, linked data available for global circulation."
Our global partners include:
Temple University
University of Pennsylvania
Princeton University
Trinity College (The University of Dublin)
University College Cork
Heidelberg College
Waseda University
The College of Physicians of Philadelphia
University of Uppsala
Royal Society of Chemistry Library and Information Centre
It was a true mutual commitment, a mutual effort. For
example:
- Two universities in Ireland, Trinity College and University College Cork,
were especially involved in the retrieval and indexing of material.
Graduate students served as photocopiers and translators. Since Trinity
College Library in Dublin is the largest library in Ireland and one of five
Royal libraries used for deposit of all British and Irish publications, it
had rare journals that couldn't be found elsewhere. Some of the material
was very rare and leather bound; this necessitated additional care and
cost.
- Thomson Scientific staff had to debind issues of Yale University's
American Journal of Science, which had incomplete pagination. The
journal's editorial staff provided a thorough list of where each plate
belonged, so that each journal could be rebound in the correct
order.
- University College Cork was in the process of moving their warehoused
archives, but still wanted to participate in this project. Thomson
Scientific staff worked with a graduate student assigned to this project to
gather the necessary journals and send them to Limerick.
This mutual commitment of time and effort was facilitated by the
composition of the Thomson Scientific editorial team: librarians,
scientists, and information specialists, professionals whose interests,
knowledge, and values engendered confidence and trust that these old,
rare, and valuable resources would be handled with the care and respect
they deserved.
"Our partners kept these meticulous collections for years, and they were
pleased that someone was finally going to use them," said Marian Gloninger,
Manager, Publisher Relations, Thomson Scientific. "We visited each library
to see their collection. We asked them, 'What can we do
to help you provide us with this material?' In many cases, we took as much
as 45 years of content out of their building, to be shipped to our Limerick
office."
The Thomson Limerick office:
built, outfitted, and staffed especially for the Century of Science initiative
The Limerick office and its dedicated Century of Science staff were
another factor that encouraged our partners' cooperation. This facility,
built, outfitted, and staffed especially for this project, had a skilled
full-time work force of over 100 individuals, as well as an additional 150
part-time employees.
"Materials had to be handled specially," explained Gloninger. For instance,
University College Cork in Ireland has hardbound volumes of The
Lancet, bound by hand in traditional cork binding. Thomson Scientific
funded the preservation of these volumes: they were debound so the information
could be scanned, then rebound and fully restored.
"We encountered surprises every day"
As content arrived at Limerick, staff members encountered unexpected
challenges. "As meticulous as our normal indexing procedures are, we had to
add special procedures for these older archives," said Phil Heller,
Publisher Relations and Senior Director, Linkage Business Development,
Thomson Scientific. "The production staff was ready to go, but the
preparation was extensive. Translating older foreign-language journals.
Keeping track of journals that frequently changed their name — or
even their language. Countries that changed names."
Translation skills were very important, as bibliographic data was
translated from German, French, Russian, Dutch, and Swedish into English.
Many staff members had scientific backgrounds, which were critical for
properly translating technical data.
Gloninger said: "There was no standardization of publication practices like
there is now. We found more bibliographic references per article than
expected. Sometimes references were embedded in the articles, sometimes
not. How many issues were in a volume? How many articles per issue? Did the
table of contents match the actual content?"
"The content came in the door as an unknown. We encountered surprises every
day."
"A testament to scientific achievement"
The addition of Century of Science material further deepens the
content covered in Web of Science, and increases the value of its
most powerful search tool — cited reference searching. The
Century of Science initiative brings together the modern library
and remote storage of archives and rare book collections, and makes the
content of both immediately available to researchers worldwide.
Keith MacGregor summed it up aptly: "The breadth, scope and timetable for
the project — not to mention the age of the journals — have
created many unique challenges. Despite the complexities, we remain
dedicated to this initiative because the end result will be a testament to
scientific achievement as well as a significant enhancement to Web of
Science."