Data-Centric Chemistry / Development in data chemistry for discovery of new chemical compounds and reactions


To date, approximately 90 million chemical substances have been confirmed to exist, and the number grows by several tens of thousands to one million every year. However, advanced quantum chemical techniques are revealing that far more chemical substances are theoretically predicted to exist but have not yet been obtained experimentally. In this project, we will explore such potentially revolutionary "new molecules" at the QM (quantum mechanical) level and develop a database system of the chemical reaction route networks for these molecules, together with the physicochemical properties related to their energies and electronic structures. One goal of the project is a web system accessible to research and educational institutions and to the general public. We plan to make it a self-developing database system that provides data resources for a wide variety of data-centric, interdisciplinary chemical research, including materials science and drug design. (Project Director: Hiroko Satoh [National Institute of Informatics])


The key points of the development are efficient search and visualization of large-scale data and faster exploration of chemical reaction route data. We began the project by designing the system and developing the basic functions needed for future development, focusing on the following points:

  1. Database system design
  2. Preparation of a data processing scheme that supports high interactivity
  3. Extraction of primary user use cases and construction of an interaction flow for developing web community environments
  4. Design of a data registration/management system and its adaptation to the output of the current version of the GRRM (Global Reaction Route Mapping) software, the engine that explores global chemical reaction route networks

Following the system design, we have developed software to visualize and analyze the explored chemical reaction route network data, which GRRM outputs in a text format. The design and implementation of this interactive system will help scientists analyze chemical reaction network data effectively. The visualization system is also designed to work with data search algorithms for chemical reaction networks.
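
As a rough illustration of the kind of search such a network supports, the sketch below treats equilibrium structures as nodes and transition states as edges, then looks for the route whose highest transition state is as low as possible. It is only a minimal sketch under stated assumptions: the names (TransitionState, lowest_barrier_route, energy_profile, EQ0, TS0, and so on) are hypothetical, the energies are made-up placeholder values rather than calculated data, and neither the GRRM output format nor the project's actual search algorithms are reproduced here.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionState:
    """Hypothetical record: a transition state linking two equilibrium structures."""
    name: str
    energy: float  # relative energy in arbitrary units (placeholder value)
    a: str         # name of one connected equilibrium structure
    b: str         # name of the other connected equilibrium structure


def lowest_barrier_route(eq_energy, ts_list, start, goal):
    """Find the route from start to goal whose highest point is lowest.

    Returns (peak_energy, [EQ, TS, EQ, ...]) or None if goal is unreachable.
    This is a Dijkstra-style search using max() instead of a sum.
    """
    # Build an undirected adjacency list: EQ name -> [(neighbour EQ, TS), ...]
    graph = {}
    for ts in ts_list:
        graph.setdefault(ts.a, []).append((ts.b, ts))
        graph.setdefault(ts.b, []).append((ts.a, ts))

    best = {start: eq_energy[start]}
    heap = [(eq_energy[start], start, [start])]
    while heap:
        peak, node, path = heapq.heappop(heap)
        if node == goal:
            return peak, path
        if peak > best.get(node, float("inf")):
            continue  # stale heap entry; a better route to node was found
        for nxt, ts in graph.get(node, []):
            new_peak = max(peak, ts.energy)
            if new_peak < best.get(nxt, float("inf")):
                best[nxt] = new_peak
                heapq.heappush(heap, (new_peak, nxt, path + [ts.name, nxt]))
    return None


def energy_profile(route, eq_energy, ts_list):
    """Pair each structure on a route with its energy, ready for plotting."""
    ts_energy = {ts.name: ts.energy for ts in ts_list}
    return [(name, eq_energy.get(name, ts_energy.get(name))) for name in route]


# Placeholder network: three equilibrium structures, three transition states.
eq_energy = {"EQ0": 0.0, "EQ1": 5.0, "EQ2": -3.0}
transition_states = [
    TransitionState("TS0", 20.0, "EQ0", "EQ1"),
    TransitionState("TS1", 12.0, "EQ1", "EQ2"),
    TransitionState("TS2", 35.0, "EQ0", "EQ2"),
]

if __name__ == "__main__":
    peak, route = lowest_barrier_route(eq_energy, transition_states, "EQ0", "EQ2")
    print(peak, route)
    print(energy_profile(route, eq_energy, transition_states))
```

In this placeholder network the direct step through TS2 is passed over in favour of the two-step route through EQ1, because the highest transition state on that route lies lower. The list returned by energy_profile is the sort of data an energy profile view could plot.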

Towards discovering "new molecules" that are theoretically predicted but have yet to be obtained experimentally (under investigation…)
A global reaction map of C2H2O2: a global map view (left) and an energy profile (right).
A 3D animation associated with the reaction profile: a minimum energy path from benzene to fulvene.
The first "new molecules" that we found from the global reaction maps!


In this project, individual functions are developed independently with the overall system design in mind, and they will be consolidated at the end.

Affiliated research institutes of the collaborating researchers:
The National Institute of Informatics, The Institute of Statistical Mathematics, Tohoku University, The Institute for Quantum Chemical Exploration, Kyoto University, The University of Tokyo, Kyoto Sangyo University, and The Swiss Federal Institute of Technology (ETH) Zurich


We have independently developed visualization software that runs on a personal computer. The first implementation is aimed at chemists who study chemical reaction mechanisms and who are expected to be the main GRRM users. We also plan to incorporate user feedback so that the visualization software can serve a wider audience.

Related articles

Research View 011

Chemical space travel on your desktop - Towards molecular discovery

[Data Centric Chemistry] Hiroko Satoh (Associate Professor, National Institute of Informatics)

The benzene ring (C6H6), consisting of six carbon atoms and six hydrogen atoms, is probably one of the most familiar chemical structures, and not only to chemists.