Genetic Function Systems / We aim to develop a new method to elucidate the interrelationship between the phenotype and the genetic network.

Research outline

Since the completion of human genome sequencing, research in the life sciences has undergone a paradigm shift toward a data-driven style, made possible by the large amount of genomic information now available. Recently developed massively parallel sequencing technologies have accelerated this shift globally; however, Japanese universities and other research institutions have been slow to respond. Infrastructure is urgently needed to build a research community and a scientific base, and to help society recognize the importance of massive data management and information and knowledge extraction, which will play a major role in the future development of information science and genome science.

In this project, therefore, the latest genome-based technologies will be used to systematically generate and collect large-scale genomic and genetic information, together with pluralistic phenotype data, in order to develop methods for the statistical analysis of these various types of information. A generic description of genomic function and genetic networks will be prepared by integrating all of the data obtained, and a statistical method will be developed to describe genetic correlation structures. This method will be refined by applying it to a large volume of genomic and phenotypic information obtained from a specific model organism. Through this work, we aim to understand the biodiversity that arises in a system from the high-order association of many genetic factors.

The project includes the exchange of information among the groups investigating the three subthemes, in order to develop a new method for researching life phenomena. The aim is to propose novel interpretations and principles unique to data-centric science.
(Project Director: Nori Kurata - National Institute of Genetics).

A platform for analyzing variation in allelic gene expression based on SNP information.

Purpose of the project

The major purpose of this project is to develop a statistical method for modeling the multidimensional diversity of biological phenotypes, as well as a method for analyzing massive genome sequence and gene expression data. Integrating these two methods is expected to yield a novel method for visualizing complex genetic correlation structures. This method will ultimately be applied to a model organism in order to extract genomic functions and networks.

Project Promotion System

Studies that analyze large volumes of data from genetic experiments and genomic information using various informatics or statistical processing methods, as well as those that decipher, reconstruct, and utilize massive amounts of genomic information, have become increasingly popular in the West. These studies are essential for solving problems in the food, environmental, and medical sciences; therefore, a comprehensive research system must be established quickly.

In this project, therefore, massive and pluralistic genetic information will be comprehensively analyzed using genetic, informatics, and statistical methods in order to establish a “genetic function system science”. In addition, a study will be conducted to understand the complex principles of life and genetic phenomena as a system. Initially, exhaustive data on multidimensional and diverse genetic factors, such as massive genome sequence polymorphism information, gene expression variation, phenotype variation, and temporal changes in those characteristics, will be obtained using the genetic resources owned by the National Institute of Genetics. Finally, the information processing technology of the National Institute of Informatics and the statistical modeling technology of the Institute of Statistical Mathematics will be used to determine genomic functions and genetic networks.

Introduction of subthemes

1. Large-scale production of genome-related information through next-generation sequencing, and the development of a method to analyze massive information

The latest genome technology will be used to systematically and exhaustively produce massive-scale genomic information, as well as data derived from complex genome systems and other gene-related data. Integrative research using these data will be conducted with the goal of comprehensively understanding the principles of living systems. To this end, this subtheme incorporates multiple topics. The research will be performed in collaboration with the groups investigating subthemes 2 and 3, and will also be coordinated with other integrated projects focusing on massive-scale sequencing.
(Principal Investigator: Asao Fujiyama - National Institute of Genetics/National Institute of Informatics).

2. Development and optimization of a statistical method for the visualization of genetic correlation structures, through the integration of a large volume of genome-related and pluralistic biological phenotype diversity data.

The main aim of this set of studies is to develop probability and statistical models, including correlation analysis, regression analysis, permutation tests, robust inference, impact analysis, graphical models, and multiplicity adjustment, for use in linkage analysis, quantitative trait loci (QTL) analysis, expression QTL (eQTL) analysis, genetic network structure identification, and genetic diversity analysis. Specifically, this includes the following topics:

(Principal Investigator: Satoshi Kuriki - The Institute of Statistical Mathematics).
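
Two of the techniques named in this subtheme, the permutation test and multiplicity adjustment, can be illustrated with a minimal sketch. The example below is not the project's method: the function names, the marker genotypes, and the trait values are all hypothetical toy values made up for illustration. It tests each marker for association with a quantitative trait by shuffling the trait values, then applies a Bonferroni correction across markers.

```python
import random
from statistics import mean


def mean_difference(trait, genotype):
    """Absolute difference in mean trait value between genotype groups 0 and 1."""
    group0 = [t for t, g in zip(trait, genotype) if g == 0]
    group1 = [t for t, g in zip(trait, genotype) if g == 1]
    return abs(mean(group0) - mean(group1))


def permutation_p_value(trait, genotype, n_perm=2000, seed=0):
    """Permutation p-value for the association between one marker and the trait.

    Trait values are shuffled across individuals, which breaks any real
    marker-trait link while keeping both marginal distributions intact.
    """
    rng = random.Random(seed)
    observed = mean_difference(trait, genotype)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_difference(shuffled, genotype) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Bonferroni multiplicity adjustment: scale by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical toy data: 12 individuals, 3 biallelic markers coded 0/1.
# The trait values were constructed so that only marker 0 separates them.
markers = [
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
]
trait = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 12.1, 11.8, 12.3, 11.9, 12.0, 12.2]

raw = [permutation_p_value(trait, g) for g in markers]
adjusted = bonferroni(raw)

for i, (p, q) in enumerate(zip(raw, adjusted)):
    print(f"marker {i}: raw p = {p:.4f}, Bonferroni-adjusted p = {q:.4f}")
```

Bonferroni is used here only because it is the simplest multiplicity adjustment to state; with the many correlated markers of a real genome scan it is conservative, and other adjustments may be preferable.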

3. Extraction of genetic networks and genomic functions by applying informatics and statistical methods on a large amount of pluralistic data.

Mice, zebrafish, Drosophila, and rice are model organisms that have been uniquely developed and collected by the National Institute of Genetics. They are valuable research models with a wealth of genomic and phenotypic information. The informatics and statistical methods established by the investigators of subthemes 1 and 2 will be applied to a large amount of genomic and phenotypic data from these model organisms. By further evaluating and improving these methods, we aim to expand the foundation of “genetic function system science”, in which the generic extraction of genetic networks and genomic functions becomes possible.

(Principal Investigator: Nori Kurata - National Institute of Genetics)

Related articles

Research View 029

The NEXT STEP of Genome X Statistics

[Genetic Function Systems] Satoshi Kuriki (Professor, The Institute of Statistical Mathematics), Hironori Fujisawa (Professor, The Institute of Statistical Mathematics)

In March 2016, the Research Commons Project will end following three fruitful years. One of its research projects, "Genetic Function Systems," has evolved over 10 years, carrying out integrated research since the founding of the Institute.

Research View 028

Repatriation of the “Bean-Spotted” Fancy Mouse

[Genetic Function Systems] Toshihiko Shiroishi (National Institute of Genetics, Vice Director & Professor)

In addition to having an ideal genetic system for experimentation, the laboratory mouse, Mus musculus, has over 100 years of history of being used in the laboratory alongside the common fruit fly (Drosophila).

Research View 009

What is the Mechanism by Which a "Species" is Formed?

[Genetic Function Systems] Ayako Oka (Project researcher, Transdisciplinary Research Integration Center)

A “mule,” the offspring of a male donkey and a female horse, cannot reproduce, and therefore cannot leave any descendants.

2014-04-18 Press Release

A mechanism of genetic regulation evolving in different directions leads to new species!

[Genetic Function Systems] Ayako Oka (Project researcher, Transdisciplinary Research Integration Center), etc.

It is known that two populations of common ancestry cannot reproduce when they have been geographically separated for a long time.

Research View 002

Whole genome sequence tells a story: where does cultivated rice come from?

[Genetic Function Systems] Nori Kurata (Professor, National Institute of Genetics)

Patiently confirming one’s own primary findings through continued and persistent hands-on experimentation has been the traditional way of “practicing science” in biological research.