Open Library

adding vernacular fields from marc records

Bug #325854 reported by Karen Coyle on 2009-02-05

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Open Library	Confirmed	Medium	Edward Betts	

Bug Description

Some MARC records for books in languages written in non-Latin scripts have parallel fields: one for the transliteration into Latin script, and one in the original (vernacular) script. The languages and scripts covered in MARC to date are: Chinese, Japanese, Korean, Hebrew, Arabic, Cyrillic, Greek. In MARC records coming from US libraries, the transliterated information will be in the 'regular' MARC field (e.g. a transliterated title will be in a 245 field). The vernacular form will be in an 880 field. The two are linked through the $6 subfield in both fields:

245 10 $6 880-02 $a Hung Jen-kan /$c Shen Wei-pin chu.
880 10 $6 245-02 $a[Chinese characters]

The $6 has the tag of the parallel field, and an incremented occurrence number. The key thing is the occurrence number - regardless of the field tag, the fields with the same occurrence number are linked. (The tag helps you figure out where to display them if you are displaying directly from the MARC record.)
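A minimal sketch of the linking rule described above, in Python. The `(tag, subfields)` tuple shape and the names `parse_linkage` and `pair_linked_fields` are assumptions made up for illustration; this is not Open Library's actual MARC parser. It groups fields purely by the occurrence number in $6 and keeps the tag only so a caller can decide where to display each field.

```python
import re
from collections import defaultdict

# Assumed shape of a parsed field: (tag, {subfield_code: value}).
# A $6 value starts with "<linked tag>-<occurrence number>", e.g. "880-02".
# Anything after that prefix is ignored by this sketch.
LINKAGE = re.compile(r"^(\d{3})-(\d{2})")


def parse_linkage(value):
    """Return (linked_tag, occurrence) for a $6 value, or None if malformed."""
    match = LINKAGE.match(value.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def pair_linked_fields(fields):
    """Group fields that share the same $6 occurrence number."""
    by_occurrence = defaultdict(list)
    for tag, subfields in fields:
        link = parse_linkage(subfields.get("6", ""))
        if link is None:
            continue  # field has no usable $6, so it is not part of a pair
        _, occurrence = link
        if occurrence == "00":
            continue  # MARC 21 uses 00 for an 880 with no associated field
        by_occurrence[occurrence].append((tag, subfields))
    return dict(by_occurrence)


# The example from this report, with the vernacular text left as a placeholder.
record = [
    ("245", {"6": "880-02", "a": "Hung Jen-kan /"}),
    ("880", {"6": "245-02", "a": "[Chinese characters]"}),
    ("260", {"a": "no linkage here"}),
]
linked = pair_linked_fields(record)
```

Here `linked` has a single key, `"02"`, holding the 245 and its 880 in record order. Matching on the occurrence number alone follows the description above; the tag inside $6 is parsed but not used for matching.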

There is nothing explicit in the fields to say which script is used. In theory, the 245 could be vernacular Chinese while the 880 could be transliterated, or even in another character set. The language code in the MARC record could be a clue, or one would need to use the Unicode range to determine the script (which is not totally accurate).
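A rough sketch of the Unicode-based guess mentioned above, in Python. The function name `guess_script` is hypothetical, and taking the first word of each character's Unicode name is a shortcut assumed here in place of real range tables. It returns the most frequent script among the letters in a string, which is only a heuristic: Han characters come back as `CJK` whether the text is Chinese or Japanese, and a field mixing scripts is reduced to whichever script has more letters.

```python
import unicodedata
from collections import Counter


def guess_script(text):
    """Return the most common script label among the letters of text, or None."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        if not name:
            continue  # character has no Unicode name
        # The first word of the name is used as a script label,
        # e.g. "CYRILLIC SMALL LETTER A" gives "CYRILLIC".
        counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


transliterated = guess_script("Hung Jen-kan")
cyrillic = guess_script("\u041c\u043e\u0441\u043a\u0432\u0430")
hebrew = guess_script("\u05e9\u05dc\u05d5\u05dd")
```

Running this on both fields of a linked pair and comparing the labels is one way to tell which side is the transliteration, without assuming the 880 is always the vernacular one.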

Edward Betts (edwardbetts) wrote on 2009-03-02:

http://catalogue.nla.gov.au/Record/9/Details?&#details has:

In this example some of the 880 text is in Chinese characters, some is the same as 651.

Karen Coyle (kcoyle) wrote on 2009-03-02: Re: [Bug 325854] Re: adding vernacular fields from marc records

Yes, it can be a mixture, it depends on the contents of the field. In
this case, the name is in Chinese characters, but the subject heading is
made up from the LCSH, therefore the other subfields are in English.
Also, as you can see, you can mix character sets in a single subfield.
But it's still considered an equivalent field to the one it is replacing.


Edward Betts (edwardbetts) on 2009-11-16

Changed in openlibrary:
assignee:	nobody → Edward Betts (edwardbetts)
status:	New → Confirmed
importance:	Undecided → Medium

Edward Betts (edwardbetts) on 2010-07-07

tags:	added: marc

Yaron (sh-yaron) on 2010-11-09

tags:	added: rtl
tags:	removed: rtl

Edward Betts (edwardbetts) on 2010-11-30

tags:	added: language
