The Database Center for Life Science (DBCLS) / Aiming for the integration and international standardization of life science data

NEED FOR THE DATABASE

Since the late 1990s, the rapid growth of genomics has generated a large amount of life science information, which has accumulated in a variety of formats. To maintain these varied, independent databases and to make good use of the life science knowledge they contain, a dedicated organization that researches, develops, and provides services based on the latest database integration technology is indispensable. Established in April 2007, the Database Center for Life Science (DBCLS) is such an organization.

History of the Database Center for Life Science (DBCLS)

Fiscal Year 2006: The Integrated Database Project of the Ministry of Education, Culture, Sports, Science, and Technology (the Integrated Database Project) was launched.
April 2007: DBCLS was established within the Research Organization of Information and Systems. In the same year, it was commissioned as the core organization of the Integrated Database Project (a four-year period).
April 2011: The National Bioscience Database Center (NBDC) was established in the Japan Science and Technology Agency (JST), and the first phase of the Life Science Database Integration Coordination Program was launched. DBCLS was selected as the principal leader of the "Core Technology Development Program" (a three-year period).
April 2014: The NBDC program "Joint Research on Core Technology Development and Database Operation for Database Integration" was started.

DBCLS promotes R&D on the integration of life science databases, enabling the integrated use of the various databases held at universities and research institutes throughout Japan. To this end, it shares a future vision of the "integrated database" with international research communities and continues to build the core technology while staying deeply involved in its standardization. Because the most recent results developed through the program are applicable to various fields, the center is enhancing its function as a center of excellence in life science database integration.

Main Projects at the Database Center for Life Science (DBCLS)

(Project Director: Yuji Kohara [DBCLS Center, National Institute of Genetics])

R&D OVERVIEW

1. International standardization of database integration using RDF conversion technology

In the Semantic Web, the next-generation web, semantic content is added to data so that machines can process it. DBCLS is promoting the development and international standardization of core technology for database integration based on RDF, the standard description format for the Semantic Web. In collaboration with leading research institutions and researchers in Japan and abroad, DBCLS hosts hackathon-type international workshops that combine the knowledge of top-level researchers to work on important issues, such as program development (including RDF conversion technology, advanced search technology, and the establishment of ontologies) and the creation of guidelines. It also continuously explores new research topics.
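
As a rough illustration of why RDF suits database integration, the sketch below models RDF statements as plain (subject, predicate, object) triples and shows that two independently maintained datasets can be combined by a simple set union once they share identifiers. All URIs, predicate names, and records are hypothetical and chosen only for illustration; they are not DBCLS vocabularies or data, and a real system would use an RDF library and a SPARQL endpoint rather than Python sets.

```python
# Hypothetical namespace for illustration only; not an actual DBCLS identifier scheme.
EX = "http://example.org/"

# Triples (subject, predicate, object) as they might come from two independent databases.
gene_db = {
    (EX + "gene/G1", EX + "label", "geneA"),
    (EX + "gene/G1", EX + "encodes", EX + "protein/P1"),
}
protein_db = {
    (EX + "protein/P1", EX + "label", "proteinA"),
    (EX + "protein/P1", EX + "function", "kinase activity"),
}


def merge(*graphs):
    """Merge RDF-style graphs: the result is the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def functions_of_gene(graph, gene):
    """Follow gene -> encodes -> protein -> function across the graph."""
    result = []
    for protein in objects(graph, gene, EX + "encodes"):
        result.extend(objects(graph, protein, EX + "function"))
    return result


merged = merge(gene_db, protein_db)
print(functions_of_gene(merged, EX + "gene/G1"))
```

Neither dataset alone can answer the question "what is the function of the product of gene G1?"; the merged graph can, because both sources refer to the protein by the same URI.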

2. Accommodating the increasing volume of life science data

As life science databases grow in scale, diversify, and become more personal and quantified, DBCLS examines technical challenges in new types of data processing and integration, image processing, security of genome and clinical data, and harmonization with the model. It also examines systematic requirements for data publication and works on more advanced search functions in order to develop sustainable database integration technologies.

3. Functional enhancement by reinforcing links to literature

R&D is under way to link relevant data and literature using technologies such as natural language processing, in order to integrate data and enhance the functionality available to users.
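
One very simple way to picture linking literature to data is dictionary-based entity linking: scan a text for known names and attach the corresponding database identifiers. The sketch below is a minimal illustration under that assumption; the dictionary, the identifiers, and the function name `link_mentions` are hypothetical, and it does not represent the natural language processing methods actually used by DBCLS.

```python
import re

# Hypothetical dictionary mapping lower-cased gene names to made-up database identifiers.
GENE_DICT = {"brca1": "gene:0001", "tp53": "gene:0002"}


def link_mentions(text, dictionary):
    """Return (mention, identifier, start offset) for each dictionary term found in text."""
    links = []
    # Tokenize on runs of letters and digits, then look each token up case-insensitively.
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        key = match.group().lower()
        if key in dictionary:
            links.append((match.group(), dictionary[key], match.start()))
    return links


abstract = "Mutations in BRCA1 and TP53 are frequently studied."
for mention, identifier, offset in link_mentions(abstract, GENE_DICT):
    print(mention, identifier, offset)
```

Keeping the character offset with each link lets a user interface highlight the mention in the original text and jump from it to the database record.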

4. Human resource development for research and technology development

Through collaborative research with universities and research institutions throughout the country, the center is working toward establishing a human resource development network to significantly reinforce research and technology development.

PROMOTION OF DATABASE DEVELOPMENT

In coordination with the National Bioscience Database Center (NBDC) under the Japan Science and Technology Agency, RIKEN, the National Institute of Advanced Industrial Science and Technology, the Institute for Chemical Research, Kyoto University, and the University of Tokyo, the center is conducting R&D mainly on the core technology necessary for integration.

EMERGING RESEARCH TRENDS AND PROSPECTS

The center is proceeding with the integration of microorganism databases using Semantic Web technologies, and the basic integrated system, TogoGenome, is now publicly available. TogoGenome supports faceted searching with multiple ontologies, and the TogoStanza system was developed to display the search results. These are the fruits of joint research among domestic research institutes and of international cooperation, which flourishes at the international workshop "BioHackathon" that the center plans to continue to host.
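
The idea of faceted searching with multiple ontologies can be sketched as follows: each record is annotated with a term from several ontologies (the facets), and selecting a term in one facet keeps every record whose annotation is that term or one of its descendants. The ontologies, records, and function names below are hypothetical toy data, not TogoGenome's actual data model or implementation.

```python
from collections import Counter

# Hypothetical mini-ontologies, one per facet: child term -> parent term.
PARENT = {
    "taxonomy": {
        "Escherichia coli": "Bacteria",
        "Bacillus subtilis": "Bacteria",
        "Sulfolobus": "Archaea",
    },
    "environment": {
        "gut": "host-associated",
        "soil": "terrestrial",
        "hot spring": "aquatic",
    },
}

# Hypothetical genome records, each annotated with one term per ontology.
GENOMES = [
    {"id": "g1", "taxonomy": "Escherichia coli", "environment": "gut"},
    {"id": "g2", "taxonomy": "Bacillus subtilis", "environment": "soil"},
    {"id": "g3", "taxonomy": "Sulfolobus", "environment": "hot spring"},
]


def is_a(facet, term, ancestor):
    """True if term equals ancestor or reaches it by walking up the facet's ontology."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT[facet].get(term)
    return False


def faceted_search(records, selection):
    """Keep records that match every selected facet term (logical AND across facets)."""
    return [
        record for record in records
        if all(is_a(facet, record[facet], term) for facet, term in selection.items())
    ]


def facet_counts(records, facet):
    """Count how many of the given records carry each term of one facet."""
    return Counter(record[facet] for record in records)


hits = faceted_search(GENOMES, {"taxonomy": "Bacteria"})
print([record["id"] for record in hits])
print(facet_counts(hits, "environment"))
```

Recomputing the facet counts on the current hits is what lets a user narrow a search step by step, for example first by taxonomy and then by environment.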

As part of the development of large-scale data access technologies, the following services have been created to promote access to publicly available data: GGRNA, a search engine for the base sequence database RefSeq; GGGenome; DBCLS SRA, a search engine for rapidly growing NGS data; and RefEx, an integrated gene expression database.

To develop human resources, DBCLS hired research administrators, mainly postgraduate students, who work on topics in biology, bioinformatics, and other areas through on-the-job training (OJT), primarily by developing TogoTV, DBCLS's original video content. DBCLS has also started joint research with the National Cancer Center in order to support human genome data, which is expected to grow to Big Data scale.

Related articles

Research View 014

BioHackathon 2014, Road to Data Integration

[Life Science Data (DBCLS)] Toshiaki Katayama (Project Assistant Professor, DBCLS), and Atsuko Yamaguchi (Project Associate Professor, DBCLS)

A "hackathon," in short, is a coding event where spirited programmers gather for a short period to work on development and solve problems through highly intensive interaction. BioHackathon adopted this style in 2008.

Research View 003

Internationally standardized data are beginning to unravel the story of life

[Database Center for Life Science (DBCLS)] DBCLS Center Chief Yuji Kohara (Project Professor, National Institute of Genetics)

The field of biological sciences has been an early advocate of the creation of databases.