June 26, 2007

Hypermedia is a term coined by Ted Nelson and used in his 1965 article "Complex information processing: a file structure for the complex, the changing and the indeterminate". It is a logical extension of the term hypertext, in which graphics, audio, video, plain text and hyperlinks intertwine to create a generally non-linear medium of information. This contrasts with the broader term multimedia, which may describe non-interactive linear presentations as well as hypermedia. Hypermedia should not be confused with hypergraphics or super-writing, which are unrelated subjects.

The World Wide Web is a classic example of hypermedia, whereas a non-interactive cinema presentation is an example of standard multimedia due to the absence of hyperlinks.

The first hypermedia system was the Aspen Movie Map, while the first truly universal hypermedia was HyperCard. Most modern hypermedia is delivered via electronic pages from a variety of systems. Audio hypermedia is emerging with voice command devices and voice browsing.

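July 21, 2006

I originally set out to study digital objects, but ended up having to sort out preservation strategies for digital resources. In fact, the preservation strategy is an important part of the digital object model.

Preserving digital resources is a very important and very complex problem in building a digital library. It touches almost every system in the digital library: acquisition and processing of digital resources, packaging of digital resources, archival storage, data management, resource services, database systems, metadata management, search engines, mass file systems, unique identifiers, mass storage systems, storage management systems, media monitoring systems, identity authentication systems, security management, disaster recovery and backup, statistical analysis of data, and so on.

Digital information is everywhere today, but can you find data from 20 years ago? Who can still read a 5-inch floppy disk from 10 years ago? This is extremely difficult for anyone, and the undeniable fact is that we have neglected the preservation of digital resources. Yet books, calligraphy, and paintings from 20 years ago, or even 1,000 years ago, are easy for us to read; for the paper records we are familiar with, this is nothing special. With digital resources, even if we can read them, can we tell that they are not forgeries and that nobody has modified them? That is just as hard. And even if we can read them, can we reproduce them completely? That is also in doubt. To preserve digital resources we must resolve these doubts; only then can we guarantee that the resources we store remain safe and trustworthy over a long period. This is the essence of a digital preservation strategy.

Viewed from the standpoint of use, long-term preservation requires a preservation mechanism that meets five requirements: resources can be stored (存得上), can be found (找得到), can be read (读得出), can be trusted (信得过), and can be afforded (用得起). (These are the "5d"; E. Fox has the 5S.)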

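Can be stored: this requires a complete strategy for collecting and ingesting resources, including ingest workflow specifications, validation standards, submission formats, and a receipt and feedback mechanism. It also requires strict asset management, a data backup strategy, a mass file system, a packaging scheme for resources, and defined logical and physical forms for resources in their different states.

Can be found: this requires a clear description of each resource (what is it?) and an unambiguous way of locating it that survives migration and change. It involves the metadata scheme, the identifier scheme, resource organization, multidimensional clustering, virtual resource organization, and interoperability between data repositories. Maintaining the location of massive, distributed resources is a challenge for digital library builders.

Can be read: we may be able to find a digital resource, but restoring its original presentation is an important problem. For example, WPS was once widely used but now has few users, and opening a file in WPS 1.0 format is a hard problem for many of them. This is a question of standards for processing and organizing resources: which formats to preserve is something library managers must consider carefully. It also involves patents; we should adopt formats that are not encumbered by patents. For example, the once widely used GIF format ran into patent problems: the patent holder announced that it would charge systems that used the format, so many systems dropped support for it, which became an obstacle for readers who wanted to keep using such resources. At present, reading the resources of a single library can require a reader to install seven or eight different viewers, which hurts both security and convenience. "Can be read" requires that the preservation format suit the technical environment for a long time to come, and that this environment be one we can afford.

Can be trusted: the system can exist only if those who deposit resources, those who manage them, and those who use them all accept it as trustworthy. For depositors, it is important that the system protects their rights, fulfils the purpose for which they deposited the resources, and keeps their intellectual property from being infringed. For managers, it is important that the system can carry out their management tasks, faithfully records every change and access related to a resource, copes with foreseeable and unforeseeable changes in the environment and can recover resources afterwards, and keeps resources from being tampered with while under management. Users must be confident that what they found is the resource they were looking for and that it is the original. This requirement involves rights management, resource packaging, the resource metadata scheme, and disaster recovery and backup.

Can be afforded: a digital library system obviously needs economic support to run, so it must be a system one can afford. This covers both the construction cost and the future operating and maintenance cost. Construction cost is usually not the problem; operating and maintenance cost is the key factor that decides whether the system can keep running. It includes the choice of software, the ease of operating the system, and the unit cost and maintenance cost of the mass storage system.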

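Permanent preservation of resources is a very complex problem, but the relevant standards give some guidance within their scope, especially the OAIS archival reference model, which describes each stage of resource preservation. At least the following technical issues are involved:

  1. Choice of preservation strategy: what content to preserve, and in what way;
  2. Preservation planning: strategies for monitoring integrity and consistency, data migration mechanisms, etc.;
  3. Preservation workflow management: complete workflow control for digital preservation is needed;
  4. Migration of storage media: backup, migration, recovery, and conversion of media data;
  5. Data formats: format standards, format specifications, format registries, etc.;
  6. Packaging: packaging, compression, and encoding of media data;
  7. Security monitoring: virus scanning and removal of malicious code;
  8. Integrity checking: checks of integrity, consistency, and readability/writability;
  9. Functional verification: verifying that functional data still functions;
  10. Digital object model: the key to organizing digital resources;
  11. Metadata preservation: keeping metadata understandable;
  12. Preservation identifiers: defining naming standards and a resolution system;
  13. Content management: storage and management of data objects;
  14. Metadata management: storage and management of metadata;
  15. Indexing: indexing of object data and metadata;
  16. Mass file storage: which storage architecture to choose;
  17. Search and browsing: the architecture used for search and browsing;
  18. Location: locating resources by resolving unique identifiers;
  19. Authentication and authorization: ensuring legitimate access and protecting intellectual property;
  20. Interoperability: interoperating with other systems.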
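
Item 8 (integrity checking) is often done as a fixity check: record a cryptographic digest when an object is ingested and recompute it at each audit. The following is only a minimal Python sketch, assuming SHA-256 as the digest and in-memory bytes as the object; the function names are illustrative and are not taken from OAIS.

```python
import hashlib


def digest_of(data: bytes) -> str:
    """Return the SHA-256 digest of the object's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """True if the bytes still match the digest recorded at ingest time."""
    return digest_of(data) == recorded_digest


# At ingest: keep the digest alongside the object (here, just a variable).
payload = b"scanned page 1 of a thesis"
recorded = digest_of(payload)

# At a later audit: an unchanged copy passes, a modified copy fails.
print(is_intact(payload, recorded))                 # True
print(is_intact(payload + b" (edited)", recorded))  # False
```

A digest only detects change; recovering the original still depends on the backup and migration items in the list.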

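Programming Languages, Part 1

It has been more than 20 years since I first came into contact with computer programming, and I still find it endlessly fun. Over that time, study, work, and even daily life have brought me into contact with many programming languages, some that I like and some that I do not. Like them or not, they have to be used. It is like learning English: if you need it, you have to learn it. In my experience a programming language is like a foreign language: only by using it can you understand and master it. I have studied English for decades, and my spoken English is still not as good as that of a 10-year-old American child, because I rarely get to use it. In China many authors turn out one book after another on programming languages while having written very few programs themselves, so it is hardly surprising that they mislead their readers.

Some programming languages are orthodox and popular, such as Java today; most software graduates can write some code in it (although on a resume that may well become "expert"). Other languages, scripting languages in particular, are mastered by few people because they supposedly "do not show off the programmer's skill". JavaScript is an example: programmers do not deign to learn it and interface designers cannot manage to, and this shows directly in the gap in design quality between our websites and those of Europe and America.

Below are the programming languages I have used.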

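basic: the language taught in my first university semester, and the first programming language I learned. We debugged our programs on a domestic DJS-130; the memory was a Wang An magnetic drum, 64K. It was a minicomputer, yet it filled a 30-square-meter lab, and the monitor was a TTY (teletypewriter). The assignment was a program that printed a calendar automatically; when it worked I was as proud as when I received my university admission letter. Looking at today's computing environment, one cannot help marvelling: IT deserves to be called the fastest-growing and greatest industry in human history. basic was later carried forward by a man named Bill Gates, especially in China. It is a language that is quick to pick up, so it became the vernacular of computer languages, and its style bears a deep Anglo-Saxon stamp. basic is still very much alive, even though many people dislike it.

Assembly language / machine code: few people use (or learn) it now, but when I was a student it was a required course for computer science majors. For my practical I wrote a small word-guessing program on a PDP-11, and later I often used assembly on microcontrollers. In the 8088 era of the PC, assembly was a very useful tool that made it easy to call the various DOS interrupts. And of course, learning ASM gives you a good understanding of how a computer actually computes.

cobol: the language developed specifically for report processing. Many of the earliest word processing, business management, and LAS applications were written in it. I learned it on an Apple II, where compiling meant swapping two floppy disks back and forth. Later I used it on a VAX-11 to write my first MIS software. I really disliked its error listings, which left you baffled. cobol is about to pass into history, and it should; to me it is the clumsiest programming language in the world. To this day I cannot tell which nation's style it has. Surely not Japanese?

pascal: the most beautiful and most intuitive programming language. It feels French in style, somewhat perfectionist, although things often turn out the opposite way and perfection is impossible. I think it is the best teaching language, especially for data structures and compiler principles. It is the language that showed me the joy of programming and gave me a deep feel for program design and the implementation of algorithms. Much of the credit goes to my university teacher, Wu Ping (吴萍), who could make a dry programming language captivating; nobody skipped her class even when it was the fourth period of the morning. I still have my pascal lecture notes (I was a student who rarely took complete notes; I kept full, careful notes for only two courses). None of my classmates knows where Ms. Wu is now; I heard the whole family moved south. Most programming language names have a clear origin, but I am not sure about pascal. That does not matter for mastering a language, of course. Perhaps it commemorates a French mathematician?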

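fortran: the formula translation language. In its style you can feel the situation of an engineer: dedicated (rigid), hard-working, and sometimes forced to face reality despite being creative. In my student days it was the most widely used language and teaching took it seriously; we were given a large course project that took more than a month, implemented on a machine similar to the IBM 360 (a Fujitsu). After I finished my bachelor's degree I rarely touched it. In the late 1980s, the years when "building missiles paid less than selling eggs", I was a graduate student in software, and my boss (my advisor) told us a true story. A senior expert from the defense industry asked him whether programming could be automated. Their ignition control program was written in fortran by a few young designers. Once, on the eve of a critical moment, the young men bargained with this director: he must promise to allocate them housing, or they would stop work. The director was furious. He could try to get them a bonus, but housing was beyond his power. In the end the young men did not walk out and completed the task honorably, and of course the housing did not come right away. The expert never got over it. fortran is also a language that China's older generation of software engineers can be proud of: they built compilers on a par with the rest of the world, and these were widely used in our military engineering.

ada: a name very much to web2.0 taste, taken from Augusta Ada Lovelace, who is recognized by scholars as the first real programmer. It is the most rigorous language, designed for software engineering; before components existed, ada made team programming very convenient. Its syntax is pascal-like. It was the work of several Frenchmen but designed for the Americans, and for a time it was mandatory for defense projects. It suits other fields too, but never caught on. I spent nearly two years on it and eventually completed a national project of more than 100,000 lines of code. The project was done on a minicomputer, and several times we had to restore the system from bare metal. ada keeps improving, but it has never been widely adopted and has stayed within defense projects, although it really is an excellent programming language. What is excellent is not necessarily what the public accepts, and ada is living proof.

delphi: a language that performed very well in the client/server era. It carried pascal forward and once held a large market share. Its success owed much to the many source-based extension components available. For the programmer, delphi can go as deep or as shallow as needed: deep enough to embed assembly directly, shallow enough to replace vb. It is still an excellent tool for building clients. A delphi program I wrote 10 years ago is still running in a company. My impression is that Borland was not good at marketing; its products were excellent, but the company often lost its bearings. delphi was once a door-opener for programmers looking for work, and hiring an expert cost a great deal.

c/c++: without doubt the most powerful language in today's IT world. There is no computer platform for which you cannot find a c compiler, above all the famous GCC. c has become a culture; it is everywhere and belongs to no individual or organization. c++ is the darling of the object-oriented era. If you want to be an excellent software designer, you should master c. c is like English: picking up a little is easy, mastery is hard, and what matters most is practice. Which of today's software prodigies has not mastered c? By c I do not mean vc; Google is certainly not programmed in vc. It is an indisputable fact that Gates's IDE has made Chinese programmers dull. I think c/c++ suits small projects and large projects but not medium-sized ones, and for programs that need high run-time efficiency c is still the economical choice. The biggest drawback of c is its high debugging cost; a fault such as a null pointer can be fatal. I once spent nearly a month tracking one down, which nearly drove me mad.

To be continued.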

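June 1, 2006

[Note] Text in blue is my own commentary, not the original meaning.



title:Application of Concept Maps in Knowledge Representation and Knowledge Evaluation (概念地图在知识表示和知识评价中的应用)
title subtitle:The basic meaning of concept maps
creator:Ma Feicheng (马费城), Hao Jinxing (郝金星)
source:Journal of Library Science in China (中国图书馆学报), 2006, No. 3, pp. 5-9
description:A concept map is a graphical method for representing an individual's structured knowledge of a specific topic. Its basic elements are relations, propositions, hierarchy, chunks, cross-links, and examples.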


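title:Research on Unattended Reference Services (无人值守参考咨询服务研究)
creator:Wu Yunbiao (吴云标), Li Hui (栗慧), et al.
source:Journal of Library Science in China (中国图书馆学报), 2006, No. 3, pp. 88-92
description:
A basic paradigm of reference work is the conceptual and methodological foundation for studying reference services. John V. Richardson Jr. has proposed such a basic paradigm for reference service.

This paper is worth reading.

Inspired by it, my own understanding is that a reference transaction can be broken into six steps:

  1. Stating and developing the question: some form of interaction can take place here, an exploratory exchange to pin the question down further.
  2. Answering the question: an agent carries out the answering, and heuristic interaction may take place along the way.
  3. Generating the answer: composing the answer.
  4. Automatic evaluation: an evaluation agent assesses the result; this may trigger backtracking to the answering step, or serve as a reference for the patron.
  5. Presenting the result: the result is shown in multimedia form.
  6. Patron feedback: the patron can report satisfaction and so on.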


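title:Research on Virtual Reference Service Standards and Their Application (虚拟参考咨询服务规范研究及其应用)
creator:Zhang Chunhong (张春红), Xiao Long (肖珑), Liang Nanyan (梁南燕)
source:Journal of Academic Libraries (大学图书馆学报), ISSN 1002-1027, 2006, No. 2, pp. 57-61
description:
Introduces the common international standards and guidelines for virtual reference:
  1. IFLA Digital Reference Guidelines, 2003-11. Two parts: administration of the service (nine aspects of management) and the practice of digital reference.
  2. RUSA Guidelines for Implementing and Maintaining Virtual Reference Services, 2004-06. Three parts: preparing for virtual reference, clientele and content of the service, and organization of the service. Focused on support and management.
  3. Question Point Member Guidelines, 2002-06. Six parts: general guidelines, quality and accuracy, response time, appropriate answers, effective monitoring, and expected behavior. Touches on law and intellectual property.
  4. Guidelines for Information Specialists of K-12 Digital Reference Services, 1999. Mainly defines the question-answering workflow; not comprehensive enough as an overall service standard.
  5. NISO Question/Answer Transaction Protocol, 2004-03. Defines the syntax and semantics of the messages exchanged and the rules governing them. It supports processing and tracking of questions and answers, as well as packaging of other information that needs to be exchanged.
  6. VRD QuIP, Question Interchange Profile, 1999. A linear metadata format for storing, updating, and exchanging virtual reference data; a data exchange standard in the form of structured metadata.
  7. KnowledgeBit: designed by AnswerBase with help from the Library of Congress. A data format designed for reference-desk support and publishing systems; it evaluates the quality of the information sources behind answers.
  8. AskERIC selection criteria for web resources: evaluates resources by authority, affiliation, content, purpose, advocacy, currency, and comparison.
  9. CALIS virtual reference service standard: scope of service, librarian qualifications, reference conduct, reference workflow, and answer quality.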


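title:Research on Adaptive Chinese Word Segmentation Based on a Given Word List (基于既定词表的自适应汉语分词技术研究)
creator:Huang Shuiqing (黄水清), Cheng Chong (程冲); College of Information Science and Technology, Nanjing Agricultural University
source:New Technology of Library and Information Service (现代图书情报技术), ISSN 1003-3513, 2006, No. 5, pp. 13-17
description:
Methods for automatic Chinese word segmentation fall roughly into two classes: those based on matching against word lists and lexicons combined with word-frequency statistics, and those based on syntactic and grammatical analysis combined with semantics.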
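
The first class (matching against a word list) can be illustrated with forward maximum matching, a common textbook technique. This is only a sketch and not the adaptive method of the paper: the function name, the toy word list, and the `max_len` parameter are all assumptions made for the example.

```python
def forward_max_match(text: str, vocab: set[str], max_len: int = 4) -> list[str]:
    """Segment text greedily, always taking the longest word found in vocab.

    Characters that start no known word are emitted as single-character tokens.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Toy word list (hypothetical); a real system would load a full lexicon.
vocab = {"数字", "图书馆", "数字图书馆", "资源", "保存"}
print(forward_max_match("数字图书馆资源保存", vocab, max_len=5))
# ['数字图书馆', '资源', '保存']
```

Greedy matching is fast but cannot resolve ambiguous boundaries, which is where the frequency statistics and the grammar-based methods mentioned above come in.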


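title:Research on Constructing Pattern Automata for Automatic Chinese Word Segmentation (汉语自动分词模式自动机构造研究)
source:New Technology of Library and Information Service (现代图书情报技术), ISSN 1003-3513, 2006, No. 5, pp. 47-49
creator:Wu Shaogen (吴绍根), Guangdong Industry Technical College
description:
The question is how to build a functionally complete automaton; my impression is that no large-scale experiment was carried out.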


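title:Overview of the Development of the Electronic Edition of the Chinese Classified Thesaurus (《中国分类主题词表》电子版研制概述)
creator:Bu Shuqing (卜书庆), He Lingyong (贺玲勇)
source:Journal of the National Library of China (国家图书馆学刊), 2006, No. 2, pp. 10-14
description:
The thesaurus database is a data package conforming to UNIMARC and ISO 2709. If the description is accurate, the database can be used directly. I wonder whether the CD that is sold includes this data package.
The paper also describes an interface which, from the description, is Windows-based, so there is no question of cross-platform or web2.0 use; I wonder whether a software upgrade is planned. Turning this system into a web-based service that can be called online would surely be a great benefit to the country and the public. The publisher could no longer sell it, but never mind: have we not entered the 2.0 era?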


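title:The Chinese Classified Thesaurus from the Perspective of Organizing Network Information (从网络信息组织看《中国分类主题词表》)
creator:Chen Shunian (陈树年), Liu Huimin (刘惠敏); East China University of Science and Technology
source:Journal of the National Library of China (国家图书馆学刊), 2006, No. 2, pp. 21-27
description:
The first part of the paper shows that publishing the thesaurus was a huge undertaking that embodies many years of work by many people.
The greatest value of the thesaurus is that it builds a complete knowledge map: it describes the hierarchy of knowledge, the semantic associations between subject concepts, and the correspondence between the two. That is itself a kind of knowledge.

What can the thesaurus be used for?

  1. Organizing documents in traditional libraries: classification and subject indexing, document retrieval;
  2. Organizing information in digital libraries: …, in combination with FRBR;
  3. Organizing information in virtual libraries;
  4. Organizing network information: in combination with search engines;
  5. Organizing information in subject portals;
  6. Use in the Semantic Web: it will play a role in ontology applications, but how exactly?

How can the thesaurus be improved?

  1. Keep it highly dynamic; could it be opened up and maintained like a wiki?
  2. Move toward ontologies;
  3. Create new classification schemes: make simple improvements to existing schemes and put them into practice;
  4. Draw on many sources: it could be combined with folksonomy to create a knowledge classification system that keeps up with the times.

If you want to do something with the thesaurus, this paper will give you ideas.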


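title:Rethinking the Digitization of Ancient Chinese Books (中文古籍数字化的再思考)
creator:Chen Li (陈力), National Library of China
source:Journal of the National Library of China (国家图书馆学刊), 2006, No. 2, pp. 42-49
description:
Differences in full-text retrieval: classical Chinese consists mainly of monosyllabic words, modern Chinese mainly of polysyllabic words, which bears on the word segmentation used in full-text retrieval. Differences in readership also place different demands on full-text retrieval.

"Thorough, inside and out" digitization: a massive body of documents, with commonalities and differences. What we actually need is a set of guidelines for digitizing ancient books; I wonder whether Director Chen has begun to lead such an effort.

The problem of representing the Chinese characters of ancient books in Unicode.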

May 14, 2006

What is REST?

REST:REpresentational State Transfer
SOAP:Simple Object Access Protocol

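In a sense, REST can be seen as a lighter-weight alternative to SOAP: it offers a lighter programming interface and a lower implementation cost. When designing a software architecture, though, do you choose SOAP or REST? You can of course support both (as FEDORA 2.0 does). This is a question the designer must consider.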


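title:Resource-oriented vs. activity-oriented Web services (面向资源与面向活动的 Web 服务)

title subtitle:An overview of the relationship between REST-style and SOAP-style Web services

source url:http://www-128.ibm.com/developerworks/cn/webservices/ws-restvsoap/?ca=dwcn-newsletter-webservices
creator:James M. Snell, Software Engineer, IBM

date:November 1, 2004


With the rise of web2.0, REST has come into wide use and seems poised to upstage SOAP. How do the two relate? This article answers that well.


Whether a Web service should be REST-style or SOAP-style depends on whether the application is resource-oriented or activity-oriented.

The point is simple: the choice between REST and SOAP comes down to understanding the most important part of your particular application. If your application centers on access to information resources (as the Bloglines service does), then you are dealing mainly with a resource-oriented service, and your application should follow the REST-style design pattern. Here the service APIs offered by Amazon, del.icio.us, Flickr, and some other vendors (see Resources) are the first to consider. If, however, your application centers on activities that are performed (activities that are not tied to the resources they rely on), then your service is activity-oriented and should use the SOAP-style design pattern.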
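
The difference in shape between the two styles can be made concrete without touching the network. In the sketch below, a resource-oriented call names the resource in the URL and lets the HTTP verb carry the intent, while an activity-oriented call posts a message naming an operation to a single endpoint. The host `api.example.org`, the `bookmarks` path, and the `ConvertCurrency` operation are hypothetical, chosen only for illustration; the requests are built but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def rest_request(base_url: str, resource_id: str) -> urllib.request.Request:
    """Resource-oriented: the URL identifies the resource, GET retrieves it."""
    return urllib.request.Request(f"{base_url}/bookmarks/{resource_id}", method="GET")


def soap_request(endpoint: str, operation: str, params: dict[str, str]) -> urllib.request.Request:
    """Activity-oriented: one endpoint, the operation is named inside the message."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = value
    payload = ET.tostring(envelope, encoding="utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


rest = rest_request("https://api.example.org", "42")
soap = soap_request("https://api.example.org/soap", "ConvertCurrency",
                    {"amount": "10", "to": "EUR"})
print(rest.get_method(), rest.full_url)
print(soap.get_method(), soap.full_url, len(soap.data), "bytes")
```

Note how the REST request says everything in the verb and URL, whereas the SOAP request carries its meaning in the body.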


title:Architectural Styles and the Design of Network-based Software Architectures
creator:Roy Thomas Fielding
source url:http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
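description:
This is the earliest work to describe the REST style of software design systematically; it is also the author's doctoral dissertation. It is well worth reading for anyone interested in software architecture design.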

May 13, 2006

The Digital Object Model in Fedora

Fedora site:http://www.fedora.info/

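Fedora is a resource repository system built on a digital object model.

System features:
1. A layered digital object model;
2. A layered repository system;
3. Basic operation interfaces: two kinds are provided, one based on SOAP and one based on REST.

Data storage structure:

1. Metadata files and datastream files are stored separately as ordinary files in the file system;

2. Metadata is wrapped in METS;

3. A relational database is used for management;

4. The directories that hold a digital object's stream files are spread out by time: year/month/day/hour/minute/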
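
Spreading files across directories by time keeps any single directory from growing too large. A minimal sketch of the idea, assuming zero-padded path components and a hypothetical storage root (neither is confirmed by the Fedora documentation quoted here):

```python
from datetime import datetime
from pathlib import PurePosixPath


def datastream_dir(root: str, ingested_at: datetime) -> PurePosixPath:
    """Build a year/month/day/hour/minute directory under the storage root."""
    return PurePosixPath(root) / ingested_at.strftime("%Y/%m/%d/%H/%M")


# "/data/datastreams" is a made-up root used only for this example.
print(datastream_dir("/data/datastreams", datetime(2006, 5, 13, 9, 7)))
# /data/datastreams/2006/05/13/09/07
```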


title:The Fedora Project:An Open-source Digital Object Repository Management System
source:D-Lib Magazine,April 2003,Volume 9 Number 4,ISSN 1082-9873

source url:http://www.dlib.org/dlib/april03/staples/04staples.html

Notes on the Fedora object model

The Fedora architecture is based on object models that by definition are templates for units of content, called data objects, which can include digital resources, metadata about the resources, and linkages to software tools and services that have been configured to deliver the content in desired ways. These software connections are provided as methods encoded into two kinds of inter-related behavior objects as described below. A Fedora repository provides access to the data objects by leveraging tools and services that are described by the behavior objects. The behavior objects store metadata that describes the operations of the tool/service and the runtime bindings for running the operations. The Web Services Description Language (WSDL) is used to describe the tool/service bindings.
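In short: the Fedora architecture is built on object models, which are templates for units of content called data objects. A data object bundles digital resources, their metadata, and links to the software tools and services configured to deliver the content in the desired ways.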

Figure 1. The object model

The digital resources and the metadata are datastreams in an object model, definitions of which connect the content model either to internal content under the direct control of the repository or to external content that is delivered via HTTP servers. The content of a datastream is identified using a URL. When an object is ingested into a Fedora repository, a URL for a managed datastream is used by the repository system to retrieve the content and store it in the file space under its control; the datastream in the object is updated to be this internal address. When an object contains a datastream defined as external, the URL is stored in the datastream and used by the repository to access the data whenever necessary. An in-line metadata datastream is a bytestream that is name-spaced XML encoded data stored in the XML instantiation of the object directly, rather than as remote or managed content.

From the user’s point of view, the linkages to software tools and services (via disseminators) are seen as behaviors upon the units of content. These behaviors can be exploited to deliver varieties of prepared content directly to a web browser. They can also be used to prepare or configure content to be used through some external software application. In a sense, these object models can be thought of as containers that give a useful shape to information poured into them; if the information fits the container, it can immediately be used in predefined ways.

Fedora makes it possible to describe abstract sets of behaviors that constrain a corresponding set of specific processes or mechanisms delivering the behavior described for a given unit of content. One abstract set of behaviors, a behavior definition (bdef) object, can be used to constrain many mechanism sets, or behavior mechanism (bmech) objects, ensuring a standardization of behaviors for different units of content that are equivalent in type, but differing in format. A bdef object formally defines the terms of a behavior contract that must be upheld by any bmech object to be paired with it. In turn, the bmech object contains a data contract, the terms of which any data object model subscribing to it must meet. Bdef objects and bmech objects are analogous to interfaces and implementations in object-oriented programming.

Figure 2. Behavior object contracts

A data object model subscribes to a set of behaviors by linking to a bdef object and pairing it with a link to an appropriate bmech object. This pair of links defines a disseminator; an object model can contain any number of disseminators. In practical terms, this means a specific data object conforming to the model can have sets of behaviors for a variety of purposes, or sets of behaviors equivalent in purpose but that prepare the object’s content to be delivered to applications with different format requirements. In summary, a data object model specifies the number and types of datastreams as well as the set of disseminators every conforming data object will have.
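
The analogy with interfaces and implementations can be sketched in Python. This is not Fedora code: the class names, the `get_preview` method, and the `required_datastreams` attribute are hypothetical stand-ins for a bdef (behavior contract), a bmech (mechanism with a data contract), and a disseminator (the pairing attached to a data object).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class ImageBehavior(ABC):
    """Plays the role of a bdef: the abstract behavior contract."""

    # Data contract: datastream IDs a mechanism needs (none in the abstract class).
    required_datastreams: tuple = ()

    @abstractmethod
    def get_preview(self, datastreams: dict) -> bytes:
        ...


class JpegMechanism(ImageBehavior):
    """Plays the role of a bmech: one concrete way to honor the contract."""

    required_datastreams = ("JPEG",)

    def get_preview(self, datastreams: dict) -> bytes:
        # Placeholder "preview": just the first 8 bytes of the JPEG datastream.
        return datastreams["JPEG"][:8]


@dataclass
class DataObject:
    pid: str
    datastreams: dict
    disseminators: dict = field(default_factory=dict)

    def attach(self, name: str, mechanism: ImageBehavior) -> None:
        """Subscribe to a behavior, enforcing the mechanism's data contract."""
        missing = [d for d in mechanism.required_datastreams if d not in self.datastreams]
        if missing:
            raise ValueError(f"{self.pid} lacks datastreams {missing}")
        self.disseminators[name] = mechanism

    def disseminate(self, name: str) -> bytes:
        return self.disseminators[name].get_preview(self.datastreams)


obj = DataObject("demo:1", {"JPEG": b"\xff\xd8\xff\xe0 fake image bytes"})
obj.attach("preview", JpegMechanism())
print(len(obj.disseminate("preview")))  # 8
```

A second mechanism class for another format could implement the same `ImageBehavior`, so objects that differ in format would still expose the same behavior.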


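title:Analysis of the Extension and Reuse Mechanisms of Fedora Digital Objects (Fedora数字对象的扩展和重用机制分析)
creator:Jiang Qi (江淇); Wang Qun (王群);
source:Library Theory and Practice (图书馆理论与实践), 2006, No. 2
subject:Fedora; digital library; digital object;
description abstract:
As open-source digital library software that has been applied successfully, Fedora has attracted wide attention in the field. Focusing on Fedora's digital objects, and based on an analysis of Fedora's software, technical documentation, and related papers, this paper examines the extension and reuse mechanisms of its digital objects.

A digital object consists of four constituents:

1. PID (global)

2. disseminator

3. datastream

4. metadata

Structural kernel: the cell nucleus

Interface layer: the cytoplasm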


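title:A Study of the Fedora Repository Architecture and an Analysis of Extension Cases (Fedora仓储体系研究及其扩展案例分析)
creator:Lin Ying (林颖); Library of the Chinese Academy of Sciences
source:New Technology of Library and Information Service (现代图书情报技术), 2005, No. 8
subject:digital repository; digital object; Fedora; Tufts; university digital library;
description abstract:
As one solution for managing digital content, digital repositories are widely used in information exchange, publishing, digital libraries, long-term preservation, and other areas. Among the many repository technologies, Fedora has attracted wide attention from researchers for its flexibility and extensibility. Addressing the system's architecture, this paper explains in detail the concepts and basic principles of digital objects and the repository framework, analyzes extended applications that build on Fedora's component-based design, and finally gives a concrete analysis using the Tufts University digital library as an example.


title:Research and Implementation of a Fedora-Based Digital Resource Management Solution (基于Fedora的数字资源管理方案的研究与实现)
creator:Zhang Bei (张蓓); Dong Li (董丽); Li Xinwei (李新伟); Xing Chunxiao (邢春晓); Tsinghua University Library; Research Institute of Information Technology, Tsinghua University;
subject:long-term preservation; Fedora; digital resource management;
description abstract:
Against the background of Tsinghua University's work on long-term preservation of digital resources and on building its digital library, and through a study of the open-source Fedora framework and a comparison with DSpace, the paper proposes a Fedora-based digital resource management solution, offering ideas and methods for unified storage and management of different kinds of digital resources.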

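May 12, 2006

What is a Digital Object?

Note: this article is still being written and its content may change at any time. It is being written entirely with web2.0 tools: writely for note-taking and writing, google and delicious for gathering material, and delicious for tagging references.

This article explains what a digital object is made of, the forms it takes in different environments, and how it is reflected in the OAIS model. The digital object model shows up differently in the organization and building of digital resources, in preservation management, and in access services, and a clearly defined digital object model is the foundation for doing all of this well. A library often runs several different digital resource management systems: some handle acquisition and processing, some handle preservation and rights management, and some handle access services. Exchanging resources among these systems inevitably depends on how digital objects are implemented. Building a digital object model that fits the library's actual circumstances is vital to building a digital library.

This article covers the following aspects of digital objects:
Completeness of content: composition, and accommodating change across different contexts;
Completeness of structure: a physical structure that adapts to different application environments;
Completeness of function: how OAIS is reflected in it;
Requirements of resource exchange and system interoperability: data packaging, exchange of single objects, batch exchange;
Requirements of long-term preservation;
Requirements of object creation: continual enrichment with microcontent;
Requirements of service: the WEB2.0 service model, and serving microcontent;

Application examples: e-books, dissertations, news items, blog entries, calligraphy, paintings, and rubbings, electronic editions of ancient books, courseware.

What is a trusted digital repository?

References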


title:Digital Preservation Architecture and Technology for Trusted Digital Repositories
creator: Ronald Jantz,Michael J. Giarlo
source:D-Lib Magazine,June 2005,Volume 11 Number 6,ISSN 1082-9873
source uri:http://www.dlib.org/dlib/june05/jantz/06jantz.html
description Abstract:
Developing preservation processes for a trusted digital repository will require the integration of new methods, policies, standards, and technologies. Digital repositories should be able to preserve electronic materials for periods at least comparable to existing preservation methods. Modern computing technology in general is barely fifty years old and few of us have seen or used digital objects that are more than ten years old. While traditional preservation practices are comparatively well-developed, lack of experience and lack of consensus raise some questions about how we should proceed with digital-based preservation processes. Can we preserve a digital object for at least one-hundred years? Can we answer questions such as “Is this object the digital original”? or “How old is this digital object”? What does it mean to be a trusted repository of digital materials? A basic premise of this article is that there are many technologies available today that will help us build trust in a digital preservation process and that these technologies can be readily integrated into an operational digital preservation framework.

Digital preservation is defined as the managed activities necessary: 1) For the long term maintenance of a byte stream (including metadata) sufficient to reproduce a suitable facsimile of the original document and 2) For the continued accessibility of the document contents through time and changing technology.

We will further define the digital object as the basic unit of both access and digital preservation and one that contains all of the relevant pieces of information required to reproduce the document including metadata, byte streams, and special scripts that govern dynamic behavior. This data is encapsulated in the digital object and should be managed as a whole. If our archive is organized in such a way that bits and pieces of the object are scattered throughout the storage system, it becomes difficult, perhaps impossible, to keep track of all these pieces, and the digital archivist risks the possibility of not migrating all the relevant material as one unit.
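In short: the digital object is the basic unit of both access and preservation. It holds everything needed to reproduce the document (metadata, byte streams, and the special scripts that govern dynamic behavior), encapsulated in one object that should be managed as a whole.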

A reliable digital repository is one whose mission is to provide long-term access to managed digital resources; that accepts responsibility for the long-term maintenance of digital resources on behalf of its depositors and for the benefit of current and future users; that designs its system(s) in accordance with commonly accepted conventions and standards to ensure the ongoing management, access, and security of materials deposited within it; that establishes methodologies for system evaluation that meet community expectations of trustworthiness; that can be depended upon to carry out its long-term responsibilities to depositors and users openly and explicitly; and whose policies, practices, and performance can be audited and measured.

Digital signatures and persistent identifiers for digital objects.

Rutgers University Libraries (RUL) is building a digital repository system based on Fedora.


title:Research Libraries Group. (2002). Trusted digital repositories: Attributes and responsibilities.
source:http://www.rlg.org/longterm/repositories.pdf

This report describes a framework of attributes and responsibilities for trusted repositories
for digital content capable of handling the range of materials held by large and small research
institutions. It builds on the foundations laid down in the CPA/RLG report, including its
concept of a “deep infrastructure” and on the more recent work on the OAIS Reference
Model, which provides a high-level, generic model for the environment, producers, users,
data types, and information flows of a digital repository.


title:An Architecture for Information in Digital Libraries

creator:William Y. Arms ,Christophe Blanchi ,Edward A. Overly

source:D-Lib Magazine, February 1997,ISSN 1082-9873

source url:http://www.dlib.org/dlib/february97/cnri/02arms1.html

description abstract:

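A classic article. Although nearly ten years have passed, it is still worth reading today. It describes clearly both the structure of information and the organization of digital objects. The significance of the constituent elements of a digital object: handle, metadata

meta-object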

  • An element is a bit sequence comprising an elementary unit of information. An element has its own ID.
  • A package is a collection of elements and other packages, with its own ID.
  • A digital object is a package with key-metadata for use in a networked environment. The ID is a handle.
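
The three definitions nest naturally, which a small data-structure sketch makes visible. The class and field names below are assumptions for illustration (the article gives definitions, not code), and the handle string is invented.

```python
from dataclasses import dataclass, field
from typing import Iterator, Union


@dataclass
class Element:
    """An elementary unit of information: a bit sequence with its own ID."""
    id: str
    bits: bytes


@dataclass
class Package:
    """A collection of elements and other packages, with its own ID."""
    id: str
    members: list[Union[Element, "Package"]] = field(default_factory=list)

    def elements(self) -> Iterator[Element]:
        """Walk nested packages and yield every element, depth first."""
        for member in self.members:
            if isinstance(member, Element):
                yield member
            else:
                yield from member.elements()


@dataclass
class DigitalObject(Package):
    """A package plus key-metadata; its ID is a handle."""
    key_metadata: dict[str, str] = field(default_factory=dict)


# "example.lib/report-1" is a made-up handle, not a registered one.
page1 = Element("e1", b"page one")
page2 = Element("e2", b"page two")
obj = DigitalObject(
    id="example.lib/report-1",
    members=[page1, Package("appendix", [page2])],
    key_metadata={"type": "report"},
)
print([e.id for e in obj.elements()])  # ['e1', 'e2']
```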

The Repository Access Protocol

All interactions with the repository use the Repository Access Protocol (RAP). For the pilot repository, the following RAP commands were implemented. Each is implemented as a method on the repository class.

  • VerifyHandle. Confirm that a handle has been registered in the handle system.
  • AccessRepoMeta. Access the repository metadata.
  • Verify_DO. Confirm that a repository stores a digital object with a specified handle.
  • AccessMeta. Access the metadata for a specified digital object.
  • Access_DO. Access the digital object.
  • Deposit_DO. Deposit a digital object in a repository.
  • Delete_DO. Deletes a digital object from a repository.
  • MutateMeta. Edit the metadata for a digital object.
  • Mutate_DO. Edit a digital object.

In addition, a small number of methods have been implemented to administer the repository. These methods are not part of RAP.
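
Since each RAP command is described as a method on the repository class, a toy in-memory version shows how the commands fit together. This sketch covers only some of the commands; the Python method names, argument shapes, and the set standing in for the handle system are assumptions, not the protocol's actual signatures.

```python
class Repository:
    """Toy in-memory repository exposing a few RAP-style operations."""

    def __init__(self, registered_handles):
        self._handles = set(registered_handles)  # stand-in for the handle system
        self._objects = {}                       # handle -> {"meta": ..., "content": ...}

    def verify_handle(self, handle: str) -> bool:   # VerifyHandle
        return handle in self._handles

    def verify_do(self, handle: str) -> bool:       # Verify_DO
        return handle in self._objects

    def deposit_do(self, handle: str, metadata: dict, content: bytes) -> None:  # Deposit_DO
        if not self.verify_handle(handle):
            raise KeyError(f"handle not registered: {handle}")
        self._objects[handle] = {"meta": dict(metadata), "content": content}

    def access_meta(self, handle: str) -> dict:     # AccessMeta
        return dict(self._objects[handle]["meta"])

    def access_do(self, handle: str) -> bytes:      # Access_DO
        return self._objects[handle]["content"]

    def mutate_meta(self, handle: str, **changes) -> None:  # MutateMeta
        self._objects[handle]["meta"].update(changes)

    def delete_do(self, handle: str) -> None:       # Delete_DO
        del self._objects[handle]


repo = Repository(registered_handles={"example.lib/doc-1"})  # made-up handle
repo.deposit_do("example.lib/doc-1", {"title": "Draft"}, b"content")
repo.mutate_meta("example.lib/doc-1", title="Final")
print(repo.access_meta("example.lib/doc-1"))  # {'title': 'Final'}
```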

Identifying Repositories: the naming of object repositories is closely tied to object access.



title:Key Concepts in the Architecture of the Digital Library

creator:William Y. Arms

source:D-Lib Magazine, July 1995

source url:http://www.dlib.org/dlib/July95/07arms.html
description:

1. The technical framework exists within a legal and social framework
2. Understanding of digital library concepts is hampered by terminology
3. The underlying architecture should be separate from the content stored in the library
4. Names and identifiers are the basic building block for the digital library
5. Digital library objects are more than collections of bits
6. The digital library object that is used is different from the stored object
7. Repositories must look after the information they hold
8. Users want intellectual works, not digital objects



title:5SGraph demo: a graphical modeling tool for digital libraries
creator:Qinwei Zhu; Goncalves, M.A.; Fox, E.A.; Virginia Polytech. & State Univ., Blacksburg, VA, USA
source:This paper appears in: Digital Libraries, 2003. Proceedings. 2003 Joint Conference on Digital Libraries ; Houston, TX, USA, 27-31 May 2003
Abstract
We present a domain-specific visual modelling tool, 5SGraph, aimed at modelling digital libraries. 5SGraph is based on a metamodel that describes DLs using the 5S theory [M.A. Goncalves et al., 2003]. The output from 5SGraph is a digital library model that is an instance of the metamodel, expressed in the 5S description language (5SL) [M.A. Goncalves et al., 2002]. 5SGraph presents the metamodel in a structured toolbox, and provides a top-down visual building environment for designers. The visual proximity of the metamodel and instance model facilitates requirements gathering and simplifies the modelling process. Furthermore, 5SGraph maintains semantic constraints specified by the 5S metamodel and enforces these constraints over the instance model to ensure semantic consistency and correctness. 5SGraph enables component reuse to reduce the time and efforts of designers. 5SGraph also is designed to be flexible and extensible, able to accommodate and integrate several other complementary tools (e.g., to model scenarios or complex digital objects), reflecting the interdisciplinary nature of digital libraries. The tool has been tested with real users and several modelling tasks in a usability experiment [Zhu, Q., 2002] and its usefulness and learnability have been demonstrated.


title:An architectural design for digital objects
creator:Fishwick, P.A. ;Dept. of Comput. & Inf. Sci. & Eng., Florida Univ., Gainesville, FL, USA;
This paper appears in: Simulation Conference Proceedings, 1998. Winter
Publication Date: 13-16 Dec. 1998
Volume: 1
On page(s): 359 - 365 vol.1
Number of Pages: 2 vol. (xxxiii+xix+1765)
Meeting Date: 12/13/1998 - 12/16/1998
Location: Washington, DC
INSPEC Accession Number:6147637
Digital Object Identifier: 10.1109/WSC.1998.745009
Posted online: 2002-08-06 21:49:18.0
Abstract
Defines the term “digital object” and specifies a variety of qualities that are important during the object design phase. A digital object contains a set of models, and is meant to serve as a reusable entity to be used on the World Wide Web or over the Internet. An example using a two-link robot arm is presented. We have found the digital object design methodology to provide an information schema to describe where to locate information about the object, for simulation of dynamic models as well as the execution of other model types


title:The law of possession of digital objects: dominion and control issues for digital forensics investigations and prosecutions
creator:Losavio, M.M.; Dept. of Comput. Eng. & Comput. Sci., Louisville Univ., KY, USA
This paper appears in: Systematic Approaches to Digital Forensic Engineering, 2005. First International Workshop on
Publication Date: 7-9 Nov. 2005
On page(s): 177 - 183
Number of Pages: ix+279
INSPEC Accession Number:8749638
Digital Object Identifier: 10.1109/SADFE.2005.25
Posted online: 2006-02-13 09:02:47.0
Abstract
The possession of digital objects defines rights and liabilities of the possessor. The nature of digital data, networked systems and data security suggest review of the fundamental concept as applied to digital objects. Possession of digital objects may be separate and distinct from physical possession of storage media and systems. Failure to address this risks error based on misleading evidence as to possession.


title:Digital object identifiers and their role in the implementation of electronic publishing
creator:Davidson, L.A.; Douglas, K.; Seeley G. Mudd Libr. for Sci. & Eng., Northwestern Univ., Evanston, IL, USA
This paper appears in: Socioeconomic Dimensions of Electronic Publishing Workshop, 1998. Proceedings
Publication Date: 23-25 April 1998
On page(s): 59 - 65
Number of Pages: ix+153
Meeting Date: 04/23/1998 - 04/25/1998
Location: Santa Barbara, CA
INSPEC Accession Number:6097208
Digital Object Identifier: 10.1109/SEDEP.1998.730709
Posted online: 2002-08-06 22:06:24.0
Abstract
The major scientific, technical and medical (STM) publishers have developed the Digital Object Identifier (DOI) system that will potentially offer improved order and reliability for the discovery and access of information on the global network. However, the system, as it is currently envisioned, does not meet the needs of all types of publishers, commands unknown overhead costs, is vague in application, and has features that potentially hinder access rather than facilitate it. The DOI development will need the input of not just publishers, but librarians, scholarly societies, and researchers to be effectively implemented on a worldwide scale and attract the system participants and users it will need to prevail


title:Spatiotemporal annotation graph (STAG): a data model for composite digital objects
creator:Smriti Yamini, Amarnath Gupta; Dept. of Comput. Sci. & Eng., California Univ., San Diego, CA, USA
This paper appears in: Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference on
Publication Date: 5-8 April 2005
On page(s): 1129 - 1130
Number of Pages: xxx+1154
ISSN: 1084-4627
INSPEC Accession Number:8556276
Digital Object Identifier: 10.1109/ICDE.2005.136
Posted online: 2005-04-18 09:11:49.0
description Abstract
In this demonstration, we present a database over complex documents, which, in addition to a structured text content, also has update information, annotations, and embedded objects. We propose a new data model called spatiotemporal annotation graphs (STAG) for a database of composite digital objects and present a system that shows a query language to efficiently and effectively query such database. The particular application to be demonstrated is a database over annotated MS Word and PowerPoint presentations with embedded multimedia objects.


title:P2P-4-DL: digital library over peer-to-peer
creator:Walkerdine, J. Rayson, P. ;Dept. of Comput., Lancaster Univ., UK
This paper appears in: Peer-to-Peer Computing, 2004. Proceedings. Fourth International Conference on
Publication Date: 25-27 Aug. 2004
On page(s): 264 - 265
Number of Pages: x+286
INSPEC Accession Number:8272574
Digital Object Identifier: 10.1109/PTP.2004.1334957
Posted online: 2004-09-20 11:00:33.0
Abstract
The P2P-4-DL project aims to investigate and build a DL system that would operate over a P2P structure. Rather than storing digital objects centrally they remain the responsibility of the individual peers that provide them. This allows the system to utilise network resources more efficiently as well as providing users with a greater sense of control over the digital objects they share. Our prototype also draws upon natural language processing (NLP) techniques in an attempt to increase the usability of the system. Other related work within this area includes EDUTELLA, a RDF based P2P infrastructure that can support the development of DL’s.


title:The OpenDOOR project: a Digital Object Observatory and Repository on the World Wide Web

creator:Nous, A.P. ;NASA Educator Resource Center, Pittsburgh Univ., PA, USA;
This paper appears in: Geoscience and Remote Sensing Symposium Proceedings, 1998. IGARSS ‘98. 1998 IEEE International
Publication Date: 6-10 July 1998
Volume: 1
On page(s): 536 - 538 vol.1
Number of Pages: 5 vol. cxxxii+2754
Meeting Date: 07/06/1998 - 07/10/1998
Location: Seattle, WA
INSPEC Accession Number:6034191
Digital Object Identifier: 10.1109/IGARSS.1998.702963
Posted online: 2002-08-06 22:00:22.0
Abstract
The University of Pittsburgh-NASA Educator Resource Center collaboration for imaging and visualization has developed a website from general guidelines suggested in visualization of space data. It is an Internet website supporting an open Digital Object Observatory and Repository, (OpenDOOR) (www.pitt.edu/~nasa/opendoor/openDOOR.html). The author describes a website displaying images, providing lessons, abstracts of technical reports, and concept papers, and eventually involving sharing of data in real time, control of instruments, and tools allowing students to work together over a wide area network.


title:A semi-automated digital preservation system based on semantic Web services
creator:Hunter, J. Choudhury, S. ;DSTC PTY Ltd, Brisbane, Qld., Australia
This paper appears in: Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
Publication Date: 7-11 June 2004
On page(s): 269 - 278
Number of Pages: xiv+429
INSPEC Accession Number:8179679
Digital Object Identifier: 10.1109/JCDL.2004.1336136
Posted online: 2004-09-27 13:20:38.0
description Abstract
We describe a Web-services-based system, which we have developed to enable organizations to semiautomatically preserve their digital collections by dynamically discovering and invoking the most appropriate preservation service, as it is required. By periodically comparing preservation metadata for digital objects in a collection with a software version registry, potential object obsolescence can be detected and a notification message sent to the relevant agent. By making preservation software modules available as Web services and describing them semantically using a machine-processable ontology (OWL-S), the most appropriate preservation service(s) for each object can then be automatically discovered, composed and invoked by software agents (with optional human input at critical decision-making steps). We believe that this approach represents a significant advance towards providing a viable, cost-effective solution to the long term preservation of large-scale collections of digital objects.


title:Principles for digital preservation
creator:H. M. Gladney. Association for Computing Machinery. Communications of the ACM.

Document types: Feature
Publication title: Association for Computing Machinery. Communications of the ACM. New York: Feb 2006. Vol. 49, Iss. 2; pg. 111
ISSN/ISBN: 00010782
ProQuest document ID: 980874011
DOI: 10.1145/1113034.1113038
Document URL: http://proquest.umi.com/pqdweb?did=980874011&sid=1&Fmt=2&clientId=42365&RQT=309&VName=PQD

Abstract (Document Summary)

Most preservation literature emphasizes the perspectives of archiving institutions. This article and supporting Trustworthy Digital Object (TDO) reports focus on end users' needs because these have precedence over repository needs. Principles for a TDO design have been articulated here to address every technical problem and requirement identified in the literature. The central elements are an encapsulation scheme for digital preservation objects and encoding using extended Turing-complete virtual machines. Correct TDO implementations will allow preservation of any type of digital information and will be as efficient as any competing solution. Critical examination of this work by readers is encouraged and public discussion is called for because getting it right is too important for anything short of complete transparency.

title:aDORe: A Modular, Standards-Based Digital Object Repository
creator:Herbert Van de Sompel, Jeroen Bekaert, Xiaoming Liu, Luda Balakireva, Thorsten Schwander. The Computer Journal.

Subjects: Digital libraries, Computer architecture, Extensible Markup Language, Information storage, Coding standards
Companies: Los Alamos National Laboratory (NAICS: 541710 )
Author(s): Herbert Van de Sompel, Jeroen Bekaert, Xiaoming Liu, Luda Balakireva, Thorsten Schwander
Publication title: The Computer Journal. London: 2005. Vol. 48, Iss. 5; pg. 514
Supplement: Special Focus-Working with Multimedia Standards: MPEG-7,
ISSN/ISBN: 00104620
ProQuest document ID: 893266741
Document URL: http://proquest.umi.com/pqdweb?did=893266741&sid=1&Fmt=2&clientId=42365&RQT=309&VName=PQD

Abstract (Document Summary)

This paper describes the aDORe repository architecture designed and implemented for ingesting, storing, and accessing a vast collection of Digital Objects at the Research Library of the Los Alamos National Laboratory. The aDORe architecture is highly modular and standards-based. In the architecture, the MPEG-21 Digital Item Declaration Language is used as the XML-based format to represent Digital Objects that can consist of multiple datastreams as Open Archival Information System Archival Information Packages (OAIS AIPs). Through an ingestion process, these OAIS AIPs are stored in a multitude of autonomous repositories. A Repository Index keeps track of the creation and location of all the autonomous repositories, whereas an Identifier Locator reflects in which autonomous repository a given Digital Object or OAIS AIP resides. A front-end to the complete environment–the OAI-PMH Federator–is introduced for requesting OAIS Dissemination Information Packages (OAIS DIPs). These OAIS DIPs can be the stored OAIS AIPs themselves, or transformations thereof. This front-end allows OAI-PMH harvesters to recurrently and selectively collect batches of OAIS DIPs from aDORe, and hence to create multiple, parallel services using the collected objects. Another front-end–the OpenURL Resolver–is introduced for requesting OAIS Result Sets. An OAIS Result Set is a dissemination of an individual Digital Object or of its constituent datastreams. Both front-ends make use of an MPEG-21 Digital Item Processing engine to apply those services to OAIS AIPs, Digital Objects, or constituent datastreams that were specified in a dissemination request.[PUBLICATION ABSTRACT]


title:Archives and manuscripts: Digital assets for the next millennium
creator:Elizabeth Yakel. OCLC Systems and Services.

Subjects: Archives & records, Digital libraries, Technological change, Information management
Author(s): Elizabeth Yakel
Publication title: OCLC Systems and Services. Bradford: 2004. Vol. 20, Iss. 3; pg. 102
ISSN/ISBN: 1065075X
ProQuest document ID: 715699771
Document URL: http://proquest.umi.com/pqdweb?did=715699771&sid=1&Fmt=2&clientId=42365&RQT=309&VName=PQD

Abstract (Document Summary)

Over the past decade, a variety of digital imaging projects have been carried out in archives, libraries, and museums. This paper discusses the difficulties in moving from a series of digital projects to a digitization program, and the ensuing transformation in thinking from digital objects to digital assets that needs to occur. It also discusses the problems archives and museums face in managing, preserving, and providing continuing access to these digital assets and potential models for their long-term management. [PUBLICATION ABSTRACT]


title:Developing a digital preservation strategy at Edinburgh University Library

creator:Najla Semple.

Subjects: Studies, Colleges & universities, Digital libraries, Information storage, Pilot projects
Classification Codes: 9130 Experimental/theoretical, 5200 Communications & information management, 8306 Schools and educational services, 9175 Western Europe
Locations: United Kingdom, UK
Companies: Edinburgh University Data Library (NAICS: 514120 )
Author(s): Najla Semple
Publication title: VINE. Bradford: 2004. Vol. 34, Iss. 1; pg. 33
ISSN/ISBN: 03055728
ProQuest document ID: 624162501
Document URL: http://proquest.umi.com/pqdweb?did=624162501&sid=1&Fmt=4&clientId=42365&RQT=309&VName=PQD

Abstract (Document Summary)

Digital preservation poses an increasing cause for concern in UK higher education institutions. This paper provides a general overview of the development of a digital preservation pilot project within a university library, including the future integration of the METS and OAIS standards. It also considers how it is planned to automate these digital preservation practices in Edinburgh University Library’s new digital object management system.


title:Pulling it all together: use of METS in RLG cultural materials service
creator:Merrilee Proffitt. Library Hi Tech.

Subjects: Studies, Libraries, Standards, Metadata, Software, Catalogs
Classification Codes: 9540 Non-profit institutions, 9130 Experimental/theoretical, 9190 United States, 5240 Software & systems
Locations: United States, US
Author(s): Merrilee Proffitt
Publication title: Library Hi Tech. Bradford: 2004. Vol. 22, Iss. 1; pg. 65
ISSN/ISBN: 07378831
ProQuest document ID: 621960011
Document URL: http://proquest.umi.com/pqdweb?did=621960011&sid=1&Fmt=2&clientId=42365&RQT=309&VName=PQD

Abstract (Document Summary)

RLG has used METS for a particular application, that is as a wrapper for structural metadata. When RLG cultural materials was launched, there was no single way to deal with “complex digital objects”. METS provides a standard means of encoding metadata regarding the digital objects represented in RCM, and METS has now been fully integrated into the workflow for this service.[PUBLICATION ABSTRACT]


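title:Mechanisms for distributed digital libraries (分布式数字图书馆机制)

creator:Zhang Xiaolin (张晓林)

source:Journal of the China Society for Scientific and Technical Information (情报学报), 2000, Vol. 21, No. 1, ISSN 1007-7634

description abstract:

After briefly analyzing the practical reality of distributed digital library systems, their interoperability requirements, and the ways interoperability can be achieved, this paper discusses distributed digital library models based on distributed digital objects and on external coordination frameworks, and briefly introduces specific distributed digital library systems such as NCSTRL, OAI, DNER and NSDL.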


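title:The concept and role of digital objects in digital libraries, and their naming and identification (数字图书馆中数字对象的概念、作用及其命名与标识)

creator:Liu Youhua (刘友华), Zhang Fuyan (张福炎)

source:Journal of the China Society for Scientific and Technical Information (情报学报), 2002, Vol. 20, No. 12, ISSN 1007-7634

description abstract:

Digital objects are the fundamental guarantee of a digital library architecture. This paper first argues that the technical issues surrounding digital objects need to be discussed, then gives an interpretive description of the concept of a digital object and briefly analyzes and explains the central position and key role of digital objects in digital libraries. Using examples, it also describes the current state of identifying digital objects through naming systems, and finally offers some suggestions and methods for addressing digital object issues.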


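title:Open digital information service systems: concepts, structure and technology (开放数字信息服务体系:概念、结构与技术)

creator:Zhang Xiaolin (张晓林)

source:Journal of Library Science in China (中国图书馆学报), 2002, Vol. 28, No. 3, ISSN 1001-8867

description abstract:An open digital information service system has its own specific concepts, principles, requirements and functional framework. The open description of open systems is based on an extended concept of metadata: information systems use open languages and specification mechanisms to describe the content of every layer of the system openly. The service publishing mechanism derives from distributed object technology; it treats each system as a digital object and describes its interface, functions, data flows, transport protocols and so on. The open integration of an open system includes horizontal and vertical open integration, each with its own technical approach.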


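title:A study of compound digital objects (复合数字对象研究)

creator:Li Chunwang (李春旺), Zhang Xiaolin (张晓林)

source:Journal of the China Society for Scientific and Technical Information (情报学报), 2004, Vol. 23, No. 4, pp. 444-451

After a comprehensive analysis of the state of digital object research abroad, this paper discusses the attributes and characteristics of compound digital objects. On that basis it proposes a reference model for compound digital objects and describes the principles for dividing its layered structure and the functional components of each layer.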


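title:A digital object architecture for digital libraries (数字图书馆的数字对象体系结构)

creator:Ou Jie (欧洁), Luo Zhiguo (罗治国)

source:Journal of the Graduate School of the Chinese Academy of Sciences (中国科学院研究生院学报), 2000, Vol. 17, No. 1, ISSN 1002-1175

description abstract:

The paper introduces a digital object architecture for digital libraries that provides persistent storage of, secure access to, management of, and indexing services for distributed digital objects. Its components are a directory (naming) service system, a repository service system, an indexing service system, and a user interface gateway. The directory service system provides users across the whole Internet with secure name resolution and distributed handle management services. The basic storage unit of the repository service system is the digital object; the whole system is designed around how to store, access and manage digital objects. The indexing service system performs resource discovery, so that users can easily find and discover the objects (content) they need in the collection. The user interface gateway provides a human-centered entry point to digital library functions.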


2006, May 11

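I came across a fellow online who is really impressive: he works across several fields and has wide-ranging interests. His name is Nova Spivack, and he proposed "Web2" (not Web 2.0) very early on. He describes himself as the grandson of the famous management scholar Peter F. Drucker.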

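P.F. Drucker was the first scholar to argue that humanity would inevitably enter the knowledge society. In his management writings he listed innovation as one of the two core competencies of an enterprise, and that was back in the 1950s, 50 years ahead of us!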

Home page of P.F. Drucker's grandson: http://novaspivack.typepad.com/nova_spivacks_weblog/

2006, May 10

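After the May 1 holiday, the National Library's network was hit by a serious virus infection. The network is so slow that it is almost impossible to work, and the connection keeps dropping, which badly affects someone like me who depends on the network for work. This shows just how important public network security is! I hope the National Library's network administrators solve the problem soon, so that all of us poor scholars of the network can smile again!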

2006, April 19

What is DC?

Dublin Core Metadata Initiative: DCMI
DCMI is also abbreviated as DC.
Dublin Core Metadata Element Set: rendered in Chinese as “杜柏林核心元数据元素集”, or the “DC element set”. When we say DC, we usually mean the DC element set.

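The complete DCMI documentation is at http://dublincore.org , including the definitions of the 15 core elements as well as the definitions of the qualifiers. ISO 15836:2003, by contrast, defines only the 15 elements and does not include definitions of the qualifiers.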

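The biggest difference between the DCMI definitions and those of a typical metadata scheme is that DCMI defines only the semantics and names of the elements; how the data is represented and how the metadata is encoded is left to each application. For example, OAI-PMH defines its own encoding of DC.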
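
To make the point about encoding concrete, here is a minimal Python sketch of one such application-level encoding: the oai_dc XML container used by OAI-PMH. The function name to_oai_dc and the dict-of-lists record shape are my own assumptions for illustration, not part of any standard API; the sample values come from the Gladney entry in the reading list above. A real oai_dc record also carries an xsi:schemaLocation attribute, which is left out here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the OAI-PMH oai_dc metadata format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_oai_dc(record):
    """Serialize {element name: [values]} as an oai_dc XML string.

    Hypothetical helper: DC fixes only the element names and semantics,
    so this is just one possible encoding of the same record.
    """
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not one of the 15 DC elements: {name}")
        # Every DC element is optional and repeatable.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


sample = {
    "title": ["Principles for digital preservation"],
    "creator": ["H. M. Gladney"],
    "identifier": ["10.1145/1113034.1113038"],
}
xml_text = to_oai_dc(sample)
print(xml_text)
```

The same dictionary could just as well be written out as HTML meta tags or RDF; the element semantics stay the same and only the encoding changes.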

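In practice DC is oriented mainly toward resource discovery; it is clearly not sufficient for information mining or for a complete description of a resource, nor should DC be expected to take on that role. When adopting a metadata scheme, a library should combine DC with other schemes such as MODS and METS in order to build a complete metadata solution.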

DC has been around for a long time by the standards of the IT life cycle, but it is not outdated; its application has only just begun.



Overview of Documentation for DCMI Metadata Terms

Creator: DCMI Usage Board
Identifier: http://dublincore.org/usage/documents/overview
Last updated: 2003-05-27
Description: This page provides an overview of official documentation of all DCMI metadata terms.
Date Valid: 2003-03-04

DCMI Metadata Terms

This document includes the up-to-date specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including elements; element refinements and encoding schemes (“qualifiers”); and controlled vocabulary terms (the DCMI type vocabulary). Additional links below lead to subsets of DCMI metadata terms that are maintained as separate documents for legacy purposes: Dublin Core Metadata Element Set, Version 1.1 and DCMI Type Vocabulary. The DCMI Metadata Terms document incorporates and supersedes the legacy documents Dublin Core Qualifiers, DCMI Elements and Element Refinements - a current list, and DCMI Encoding Schemes - a current list.


Dublin Core Metadata Element Set, Version 1.1

All terms in this document are also described in the complete listing, “DCMI Metadata Terms” (above). This document describes the original fifteen Dublin Core elements. These elements have been formally endorsed as the CEN Workshop Agreement CWA 13874 and as the NISO Standard Z39.85-2001, which was used as the basis for the Draft International Standard balloted by ISO as DIS 15836. This document has been brought into line with the NISO standard and is maintained separately for public access and historical reference. It is expected that this document will be brought into line with the more authoritative DCMI Metadata Terms document prior to the formal publication of ISO 15836.


DCMI Type Vocabulary

All terms in this document are also described in the complete listing, “DCMI Metadata Terms” (above). This document describes a general, cross-domain list of approved terms that may be used as values for the Type element to identify the genre of a resource.
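
As a small illustration of how such a controlled vocabulary might be used, the Python sketch below checks the values of a record's Type element against the vocabulary. The term list is the DCMI Type Vocabulary as I understand it and should be checked against the current DCMI document (it may differ from the 2003 snapshot quoted here); the function name invalid_types is made up for this example.

```python
# DCMI Type Vocabulary terms (assumed list; verify against dublincore.org).
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})


def invalid_types(values):
    """Return the Type values that are not DCMI Type Vocabulary terms."""
    return [value for value in values if value not in DCMI_TYPES]


# "Book" is not a vocabulary term, so it is reported.
rejected = invalid_types(["Text", "StillImage", "Book"])
print(rejected)
```

The check is case-sensitive on purpose, since the vocabulary terms are defined with fixed capitalization.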


DC has also become an ISO standard and a NISO standard:
