Wednesday, 12 November 2008
Outputs available: "Semantic Analysis Technology" event, 3 November 2008
This half-day event included presentations by Luca Scagliarini of Expert System, Jeremy Bentley of SmartLogic, Rob Lee of Rattle Research and linked presentations by BBC information architects Helen Lippell, Karen Loasby and Silver Oliver, followed by a rather interesting panel discussion. There were more than a hundred people in attendance.
The talks represented the different approaches in text processing and advanced techniques in automatic resource indexing that help to resolve ambiguities in content searching and linking:
Luca Scagliarini (Expert System): "Whales & cat fur: using a semantic net to improve precision & recall" [pdf] [mp3]
Luca pointed out that information discovery currently suffers from both information overload and information underload due to a lack of meaning-based text processing. He reviewed current technologies and illustrated the problems with shallow automatic linguistic analysis, which lacks any 'understanding' of the meaning encoded in the relationships between verbs, prepositions and nouns. He illustrated how 'deep semantic analysis', based on the analysis of these relationships, works in Expert System's new semantic intelligence software, Cogito. Cogito utilizes an innovative 'semantic network' to achieve improved machine 'understanding'. The semantic network contains 350,000 definitions and 2.8 million relationships for the English language vocabulary.
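As a rough illustration of how relationships in a semantic network can help resolve ambiguity, the sketch below picks the sense of an ambiguous word whose related concepts overlap most with the surrounding words. The tiny network, the sense labels and the function name are all invented for this example; nothing here is taken from Cogito, which works on a far richer analysis than simple word overlap.

```python
# Toy semantic network: each sense of a word is linked to related concepts.
# All entries are invented for illustration only.
SEMANTIC_NET = {
    "jaguar": {
        "jaguar (animal)": {"cat", "fur", "jungle", "prey", "predator"},
        "jaguar (car)": {"engine", "drive", "dealer", "road", "model"},
    },
}


def disambiguate(word, context_words):
    """Return the sense of `word` sharing the most related concepts with the context.

    Returns None if the word is unknown or no sense overlaps with the context.
    """
    senses = SEMANTIC_NET.get(word.lower(), {})
    context = {w.lower() for w in context_words}
    best_sense, best_overlap = None, 0
    for sense, related in senses.items():
        overlap = len(related & context)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


sentence = "The jaguar has spotted fur and stalks its prey in the jungle"
print(disambiguate("jaguar", sentence.split()))
```

A keyword search for "jaguar" would return both kinds of document; choosing a sense first is one way that precision can improve without losing recall.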
Jeremy Bentley (SmartLogic): "It’s just semantics" [pdf] [mp3]
Jeremy provided an overview of issues in information organization: unstructured information, the doubling of the number of resources every 19 months, the problem of 'findability' and the issues with black box solutions. He illustrated the relevance of metadata and of taxonomies built specifically to reflect the way a business works. He explained how this could be exploited in managing the semantic layer of an information and content architecture, and how an ontology can be used for automatic analysis of contexts and semantics in queries and search engines.
Rob Lee (Rattle Research): "Connecting concepts: joining up the BBC" [slideshare] [mp3]
Rob Lee talked about Muddy Boots, a BBC project dealing with linked data and the creation of dynamic semantic richness. The BBC's remit to link to external sources has provoked lots of thinking and doing in the area of dynamic linking. Rob illustrated how datasets in the public domain such as MusicBrainz or DBpedia (which structures content from Wikipedia so that it can be used in semantic web systems) can be used to contextualise and index BBC resources as well as to extend them with external links.
Helen Lippell, Karen Loasby, Silver Oliver: "Tales from the trenches of auto-categorisation: three case studies in the implementation of auto-categorisation systems" [pdf] [mp3]
Helen, Karen and Silver presented three different implementations of auto-categorisation systems at the BBC. They demonstrated the advantages and issues with each of these approaches. Helen's presentation, entitled "Teaching computers to read newspapers Aka Automatic classification at FT.com in the early noughties", was about her experience in a joint project by the FT, Lexis-Nexis and Dialog. The goal was to connect thousands of resources through a single interface. The tool used was Verity Intelligent Classifier (VIC), and the classification process used a taxonomy with a set of rules that could be finely tuned. Karen spoke about "Content Management Culture in the BBC", a metadata-oriented project to produce BBC content that could be described in detail. The approach applied was a rule-based automatic classification system combined with the author's review and corrections. Silver talked about a "Statistical-based auto-categorisation" project designed to connect and cross-reference distributed BBC content and resources horizontally.
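To make the rule-based approach more concrete, here is a minimal sketch of a taxonomy whose categories carry weighted term rules and a tunable threshold, followed by an author review step. The categories, terms, weights and function names are hypothetical; this is not how VIC or the BBC system was actually configured.

```python
# Hypothetical taxonomy: each category has weighted term rules and a threshold.
# Tuning a rule means adjusting its terms, weights or threshold.
TAXONOMY = {
    "Economy/Banking": {
        "terms": {"bank": 2, "interest rate": 3, "loan": 1},
        "threshold": 3,
    },
    "Sport/Football": {
        "terms": {"football": 3, "goal": 1, "league": 2},
        "threshold": 3,
    },
}


def classify(text):
    """Return {category: score} for every category whose score meets its threshold."""
    lowered = text.lower()
    results = {}
    for category, rule in TAXONOMY.items():
        # Each matching term contributes its weight once (plain substring match).
        score = sum(w for term, w in rule["terms"].items() if term in lowered)
        if score >= rule["threshold"]:
            results[category] = score
    return results


def review(suggested, corrections):
    """Apply an author's corrections: drop rejected categories, add missing ones."""
    accepted = set(suggested) - set(corrections.get("remove", []))
    return sorted(accepted | set(corrections.get("add", [])))


suggested = classify("The bank raised its interest rate on every loan")
print(suggested)
print(review(suggested, {"add": ["Politics/UK"]}))
```

Substring matching is deliberately crude here (it would also match "bank" inside "embankment"); a real rule set would need word boundaries, stemming and exclusion rules, which is where the fine tuning comes in.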
See outputs from other ISKO UK events.
CFP: IFLA Satellite Meeting: Classification and Indexing Section, August 2009, Florence
Classification and Indexing Section
Florence, Italy
20-21 August 2009
Theme: "Looking at the Past and Preparing for the Future"
The IFLA Classification and Indexing Section is pleased to announce
a satellite pre-conference which will explore the theoretical and
methodological aspects of rethinking semantic access to information
and knowledge and will offer a general survey of innovative projects
deployed to cope with the challenges of the future, offering a unique
opportunity for librarians, academics and other information
professionals to be informed about the state of the art in subject indexing.
Librarians, academics and other information professionals around the
world are invited to submit paper proposals for the satellite
meeting, focusing on:
- Systems, tools and standards in subject indexing
- Retrieval in multilingual, multicultural environments
- Web indexing and social indexing
If you are interested in contributing, please send:
An abstract of 300-500 words in English including a title.
An outline of the presentation.
Brief biographical information of the author(s)/presenter(s) with
current employment information.
Your mailing address.
Please send all of this by December 15, 2008 to Patrice Landry:
e-mail : patrice.landry@nb.admin.ch
fax: +41 31 322 84 63
The submissions will be reviewed by a selection committee of the
Classification and Indexing Section Standing Committee. The selection
will be based on the abstracts and rated on how well they fit the
programme theme. Authors will be contacted by February 15, 2009.
For successful applicants the deadline for submission of full papers
is June 15, 2009 to allow time for review of papers and all other
organization needs. The papers must be original submissions, not
published elsewhere, and should be no longer than 15 pages,
double-spaced. Papers should be in English.
Presentations at the satellite meeting will be limited to approx. 20
minutes and will be a summary of the original paper and may use
PowerPoint. The conference will be conducted in English and all
presentations will be required to be in English.
Please note that no financial support can be provided. The expenses
of attending the meeting in Florence will be the responsibility of
the author(s) / presenter(s) of accepted papers.
For information on the IFLA Classification and Indexing Section,
please see http://www.ifla.org/VII/s29/index.htm.
For additional information on this call for papers, you may contact
Leda Bultrini ( leda.bultrini@arpalazio.it) or Patrice Landry
( patrice.landry@nb.admin.ch) by e-mail.
Thursday, 30 October 2008
CFP: Classification at a Crossroads, The Hague, 29-30 October 2009
The conference aims at exploring how new developments in information standards and technology influence and affect applications and services using classification, Universal Decimal Classification in particular, and its relationships to other systems.
The programme will highlight many ways in which the use of classification can be improved. Attention will be paid to the applications of classification in supporting multilingual access, user-friendly representations of classification in resource discovery, semantic search expansion, and the application of classification across distributed systems.
Papers and posters are now invited covering the following topics:
- Classification and semantic technologies, e.g. experiences with vocabulary standards for expressing and porting classification data into the Semantic Web, vocabulary registries, terminology services
- Classification in supporting information integration, e.g. classification use in alignment of vocabularies, classification as a common subject language in co-operative systems, experiences in multi-database systems, classification mapping to other subject languages, classification enhancement with social tagging
- Verbal and multilingual access to classification, e.g. textual searching and display, management of subject-alphabetical indexes, extraction of thesauri from classification schemes
- Classification authority control and library systems, e.g. issues with MARC formats, authority file development, maintenance and sharing of data
- Visual representations/interface to classification, e.g. issues in classification browsing and faceted representation in classification tools and information systems
- Experiences with classification outside the traditional library environment, e.g. use in different types of digital repositories (eprints, VLE), resource discovery on the Web, alerting services, specialised bibliographic services and databases, organization of physical objects etc.
The International UDC Seminar 2009 is organized by the UDC Consortium and hosted by the National Library of the Netherlands (Koninklijke Bibliotheek). The UDCC is a self-funded, non-commercial organization, based in The Hague, established to maintain and distribute the Universal Decimal Classification (UDC) and to support its use and development.
To read more about the conference and to submit abstracts (300-500 words), go to the conference website.
Thursday, 23 October 2008
RDA Full Draft Out W/C 3 November
The Joint Steering Committee has announced that the full draft of RDA will be available for constituency review the week of 3 November.
The plan is to make it available in a preliminary version of the software.
Full post here.
Monday, 20 October 2008
Back to basics: BC2 then, now, and in the future
24 OCTOBER 2008 at 2.15 p.m.
University of London Library, Senate House, Malet Street, London, WC1E 7HU
in the Palaeography Seminar Room
‘Back to basics: BC2 then, now, and in the future’
The original Bibliographic Classification of H. E. Bliss was widely acclaimed as the finest of the general classification schemes of the early twentieth century. Its second edition has also been regarded as the model of a modern subject indexing and retrieval tool, embracing as it does the developed classification theory of the next generation. This event takes a comprehensive look at the fundamentals of BC2, the only fully faceted system of classification in the western world. The speakers will cover principles of BC2, how and why it was conceived, its use as a pattern for faceted vocabularies, its influence on other retrieval tools, and plans for the further development of BC2 as a thesaurus and in a web-enabled format.
Speakers include Jack Mills, Vanda Broughton, Jean Aitchison, and Leonard Will
The BCA Annual Lecture will take place at 3.15 p.m., immediately after the 2008 AGM of the Bliss Classification Association. The Lecture is open to anyone interested in matters relating to classification, indexing, and the problems of subject access and retrieval generally, and you are warmly invited to attend.
Entry is free, but if you would like to come, please email Vanda Broughton at v.broughton[at]ucl.ac.uk
Wednesday, 1 October 2008
Invitation: Semantic Analysis Technology, London, 3 November 2008
Wednesday, 17 September 2008
IVOA recommending SKOS
A few interesting excerpts from the document explaining the context and the rationale:
"Astronomical information of relevance to the Virtual Observatory (VO) is not confined to quantities easily expressed in a catalogue or a table. Fairly simple things such as position on the sky, brightness in some units, times measured in some frame, redshifts, classifications or other similar quantities are easily manipulated and stored in VOTables and can currently be identified using IVOA Unified Content Descriptors (UCDs). However, astrophysical concepts and quantities use a wide variety of names, identifications, classifications and associations, most of which cannot be described or labelled via UCDs.
There are a number of basic forms of organised semantic knowledge of potential use to the VO. Informal “folksonomies” are at one extreme, and are a very lightly coordinated collection of labels chosen by users. A slightly more formal structure is a “vocabulary”, where the label is drawn from a predefined set of definitions which can include relationships to other labels; vocabularies are primarily associated with searching and browsing tasks. At the other extreme are “ontologies”, where the domain is formally captured in a set of logical classes, typically related in a subclass hierarchy. More formal definitions are presented later in this document.
An astronomical ontology is necessary if we are to have a computer (appear to) “understand” something of the domain. There has been some progress towards creating an ontology of astronomical object types to meet this need. However there are distinct use cases for letting human users find resources of interest through search and navigation of the information space..."
"As the astronomical information processed within the Virtual Observatory becomes more complex, there is an increasing need for a more formal means of identifying quantities, concepts, and processes not confined to things easily placed in a FITS image (Flexible Image Transport System), or expressed in a catalogue or a table. We propose that the IVOA adopt a standard format for vocabularies based on the W3C's Resource Description Framework (RDF) and Simple Knowledge Organization System (SKOS). By adopting a standard and simple format, the IVOA will permit different groups to create and maintain their own specialised vocabularies while letting the rest of the astronomical community access, use, and combine them. The use of current, open standards ensures that VO applications will be able to tap into resources of the growing semantic web. Several examples of useful astronomical vocabularies are provided, including work on a common IVOA thesaurus intended to provide a semantic common base for VO applications."
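For readers unfamiliar with SKOS, the sketch below shows roughly what a small vocabulary in that format looks like, by serialising two concepts to Turtle (an RDF syntax). The skos: namespace and the properties used (skos:Concept, skos:prefLabel, skos:altLabel, skos:broader) are part of the W3C standard; the ex: namespace, the two concepts and the helper names are made up for illustration and are not taken from the IVOA document.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
EX_PREFIX = "@prefix ex: <http://example.org/vocab#> ."  # placeholder namespace


@dataclass
class Concept:
    local_name: str                                   # identifier within ex:
    pref_label: str                                   # preferred label
    alt_labels: list = field(default_factory=list)    # synonyms
    broader: list = field(default_factory=list)       # local names of broader concepts


def to_turtle(concepts):
    """Serialise concepts as SKOS in Turtle syntax.

    Assumes English labels that contain no double quotes (no escaping is done).
    """
    lines = [SKOS_PREFIX, EX_PREFIX, ""]
    for c in concepts:
        statements = ["a skos:Concept", f'skos:prefLabel "{c.pref_label}"@en']
        statements += [f'skos:altLabel "{label}"@en' for label in c.alt_labels]
        statements += [f"skos:broader ex:{name}" for name in c.broader]
        lines.append(f"ex:{c.local_name} " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


vocabulary = [
    Concept("Galaxy", "galaxy"),
    Concept("SpiralGalaxy", "spiral galaxy",
            alt_labels=["spiral nebula"], broader=["Galaxy"]),
]
print(to_turtle(vocabulary))
```

Because every group publishes the same few properties, vocabularies maintained separately can be loaded into one RDF store and combined, which is the benefit the proposal describes.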