WG6 Nonclinical - Data Interconnectivity

From PHUSE Wiki

Data Integration versus Data Interconnectivity

Welcome to the Wiki for the Nonclinical Data Integration Project, part of the FDA/PhUSE Computational Sciences Nonclinical Working Group. Learn more about the larger working group at Non-Clinical Road-map and Impacts on Implementation. This Project overlaps and coordinates with the FDA/PhUSE Computational Sciences Challenges of Integrating and Converting Data across Studies Working Group.

At the 2013 FDA/PhUSE CSS, it was decided to merge the Endpoint Predictivity project and the Data Interconnectivity project to take advantage of the synergy between the two groups. The combined group was renamed Nonclinical Data Interconnectivity for Clinical Endpoint Predictivity (NICE).

Overview & Scope


Need ID 0017. Encompassing 0037. Most of the focus on nonclinical data exchange and standards has been on toxicology studies (SEND standards), and some progress has been made on analyzing and searching these data. However, the interconnectivity between Toxicology, ADME, Safety Pharmacology (including cardiac/ECG), Reproductive Toxicology and other silos that go beyond the typical siloing within a toxicology study (path, clinpath, clin obs, etc.) has not been explored in great detail. In addition, genomic or other biomarker data may be tied to a submission. The ability to "connect" or "relate" data from one study type to another across the development pipeline can significantly change the questions that may be asked of the data. For example, for a drug belonging to a known pharmacological compound class, how does the safety profile from clinical toxicological endpoint studies compare with ADME behavior and histopathology in animal studies, and does that relate to the mechanism of action at the molecular or cellular level? Answers to such questions might allow us to compare physiological effects with molecular/cellular causes and effects, bringing us closer to understanding the translational behavior of the drug and taking a step in the direction of "translational medicine". This likely affects internal (pharma) compound decisions and project management, the efficiency of submission package preparation, and the effectiveness of Regulatory (FDA) reviewers. A particularly high-value case is that of integrating the various attributes of pharmacological class with other data types.


  1. Identify new ways of integrating, analyzing and interpreting trends across traditional data silos, disciplines and organizations.
  2. Standardize a means to associate pharmacological classes with experimental compounds to enable connecting and relating data across pharmacological class.
  3. Relate nonclinical data across the continuum of drug discovery and development with a view to clinical relevance and eventual correlation with clinical data.


Identified Projects/Pilots/Activities

  1. Define data "interconnectivity" versus "integration" for nonclinical datasets.
  2. Develop the scope of the project (e.g., where are the integration needs? Should the focus be on the information model or on technical interconnectivity/integration?).
  3. List out questions that would be useful to ask of the "interconnected" data.
  4. List all data silos of interest, both nonclinical and clinical: discovery, lead generation, safety, chemical.
  5. Categorize data silos by (i) study objectives such as ADME, Tox, Biomarker, etc.; (ii) study type such as EEG, pathology, etc.; and (iii) data types and specific data formats.
  6. Use Established Pharmacological Class (EPC) as a model trait to link data from different drugs either within study types or across different study types.
  7. Determine implementation methods, schema etc., taking some specific examples of nonclinical data silos.
  8. Coordinate efforts with WG3 on data integration of clinical data, and future integration of nonclinical data with clinical data.


May - June 2012: Define interconnectivity/data integration; start defining the data to be connected and the rationale for doing so.

July - November 2012: Define example use cases for data interconnectivity across studies or data types not typically connected.

December 2012 - March 2013: Focus on the white paper deliverable, with a view to possible journal publication.

March 2013: Present poster at PhUSE CSS meeting.



  1. Define criteria and rationale for interconnectivity of context-specific nonclinical datasets
  2. Define business benefits, problem space, risks
  3. Collect attribute types for describing pharmacological class
  4. Create networks and mappings of possible connections between various data silos with examples
  5. Gather sample data needed for developing schema and use cases
  6. Get feedback and input from subject-matter experts


  1. Defined Interconnectivity and contextual interpretations of nonclinical data interconnectivity.
  2. Specified the rationale for performing such data interconnectivity.
  3. Created several use cases to showcase examples of interconnectivity within and across disparate nonclinical data silos, such as in vivo, in vitro and ADME; safety pharmacology, toxicology and ADME; carcinogenicity and pharmacological classes.
  4. Develop a white paper ready for journal submission.
  5. Create poster on accomplishments to present at PhUSE CSS meeting in March 2013.


March 2012 - March 2013

  • Definition of “interconnectivity” in the context of nonclinical data
  • Create connectivity heatmap or scoring matrix to prioritize specific silo connections
  • Develop initial schema of certain study types
  • Publish white paper describing findings, concerns, suggestions and future plans.

Participation needs

We are looking for participation from a maximum of about 10-12 individuals to advise and contribute to this effort with relevant expertise as listed below:

  • Toxicologists (Paul, Jeremy, Alan)
  • Pathologists
  • ADME scientists
  • Late stage discovery scientists
  • Informaticists/database experts (Suresh, Donna)
  • Bioinformatics (Latha, Jyotsna)
  • Statisticians (Jyotsna)

Commitment required from participants

  • Every member of the group will be required to actively participate and contribute towards action items and milestone deliverables.
  • Time commitment will vary depending on the task at hand (roughly biweekly 1-hour meetings, email correspondence, and "homework").
  • Develop and review example use cases.
  • Contribute test data (diverse nonclinical data and possibly clinical as well).
  • Contribute to the design of schema and to implementation plans and issues.
  • Contribute to presentations & publications as needed.

If you are interested in participating or learning more, please contact the subgroup co-leads:

  • Jyotsna Kasturi: jkasturi@its.jnj.com
  • Paul Brown: Paul.Brown@fda.hhs.gov


Past teleconferences and initial discussions

  • 30 Mar 2012: Project co-lead discussion
  • 03 Apr 2012: WG6 "Data Integration" first co-lead meeting
  • 23 Apr 2012: WG6 "Data Integration" first subgroup meeting
  • 26 Apr 2012: WG6 co-leads meeting

Regular telecons are planned for subgroup discussions, usually every other week on Wednesday 10am-11am (EST) to accommodate the timezone differences of participants. Please contact Jyotsna Kasturi for telecon details. Links to conference call minutes are below.

Work Group Participants

Jyotsna Kasturi, Johnson & Johnson (co-lead)

Paul Brown, CDER FDA (co-lead)

Suresh Madhavan, PointCross

Jeremy Wally, CBER FDA

Latha Prabhakar, PointCross

Alan Brown, Novartis

Former Participants:
Henrik Drews, SixSteps AB
Donna Danduone, Instem

Conference Calls and Minutes

WG6 Data Interconnectivity meeting minutes: 23 Apr 2012
WG6 Data Interconnectivity meeting minutes: 09 May 2012
WG6 Data Interconnectivity meeting minutes: 30 May 2012
WG6 Data Interconnectivity meeting minutes: 27 June 2012
WG6 Data Interconnectivity meeting minutes: 11 July 2012
WG6 Data Interconnectivity meeting minutes: 25 July 2012
WG6 Data Interconnectivity meeting minutes: 15 August 2012
WG6 Data Interconnectivity meeting minutes: 05 September 2012
WG6 Data Interconnectivity meeting minutes: 19 September 2012
WG6 Data Interconnectivity meeting minutes: 03 October 2012
WG6 Data Interconnectivity meeting minutes: 17 October 2012
WG6 Data Interconnectivity meeting minutes: 05 December 2012
WG6 Data Interconnectivity meeting minutes: 19 December 2012
WG6 Data Interconnectivity meeting minutes: 16 January 2013
WG6 Data Interconnectivity meeting minutes: 30 January 2013
WG6 Data Interconnectivity meeting minutes: 13 February 2013

Working Group Progress


Interconnectivity of Nonclinical Safety Data Silos

One of the four sub-groups formed within the PhUSE/FDA (March 19th/20th in Silver Spring, Maryland) Working Group 6 for Nonclinical Development was chartered to develop concepts and a way forward for creating value from the multiple silos of nonclinical data. The specific expectations are described in the Vision and Objectives above.

The sub-group agreed to develop a prescriptive white paper as the deliverable. To that end, an initial deliverable would be to establish an understanding and definition of interconnectivity. To better define this, we would identify the list of typical data silos and discuss the nature of the “inter-connectivity” among these silos in terms of scientific and business value. This paper is intended to kick off this discussion and help lead the sub-group towards its first deliverable. It opens up the topic in three sections:

  1. Discussion about interconnectivity and what it means in the context of nonclinical information in general and safety study data specifically
  2. The nature of the data and a preliminary list of typical data sources
  3. How to consider interconnecting them to extract business or scientific value using some sample use cases


We should consider what interconnectivity could mean in different contexts:

  • Access to multiple data silos from an application, simply by making it possible to connect to the data source from one or more applications or portals. In this context there is no expectation that the data in each silo can be combined or that the application provides an integrated experience. For example, while looking at liver function results the researcher may be interested in looking at the drug concentration levels as a function of dosing, but without a need to correlate the ADME data with the clinical pathology data directly.
  • Integrate data from multiple silos, where an application is able to pick up data from one silo and apply it in combination with data from another silo, in a mathematical or other formulation, to create a single integrated, usable data set. This bears discussion with subject matter experts or scientists to look at examples where this would be important. Offhand it appears that this case is not likely to be of value, because silos such as ADME, Clinical Pathology, Histopathology, safety pharm etc. have each evolved to collect data that is normally associated with a particular kind of analysis or method of collection.
  • Inter-operate with data from multiple silos, where higher-level analysis calls for the use of data from these disparate silos. Because the data are normally of substantially different types and formats, and usually have no direct link or common reference key connecting them, such inter-connectivity will most likely be achieved through extracted or enriched metadata from each of the silos. The formulation used to extract these metadata will be specific and tuned to the intended use. This process may have to be repeated as new kinds of extraction models are conceived. The metadata would have to be structured in formats that are both meaningful and referenceable by the applications. <A few examples should be considered: perhaps how the ADME profile, measured through AUC across disparate delivery mechanisms, affects a safety pharm measurement in the cardio-vascular system?>
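The third, metadata-based context can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the record layouts, the field names, the compound identifier, the numeric values, and the choice of (compound, species, dose) as the extracted metadata are hypothetical and do not come from any real study or standard. The idea shown is simply that each silo gets its own extraction function, and records are related through the extracted metadata rather than through a shared key.

```python
from collections import defaultdict

# Hypothetical raw records. Field names and formats differ by silo on
# purpose, and all values are made-up placeholders, not real study data.
adme_records = [
    {"cmpd": "CMPD-001", "animal_species": "rat", "dose_mg_kg": 10, "auc": 1250.0},
    {"cmpd": "CMPD-001", "animal_species": "rat", "dose_mg_kg": 30, "auc": 4100.0},
]
safety_pharm_records = [
    {"test_article": "cmpd-001", "species": "Rat", "dose": "10 mg/kg", "qtc_change_ms": 4.0},
    {"test_article": "cmpd-001", "species": "Rat", "dose": "30 mg/kg", "qtc_change_ms": 11.0},
]


def adme_metadata(rec):
    """Silo-specific extraction: normalize to (compound, species, dose in mg/kg)."""
    return (rec["cmpd"].upper(), rec["animal_species"].lower(), float(rec["dose_mg_kg"]))


def safety_pharm_metadata(rec):
    """Silo-specific extraction; assumes dose strings look like '10 mg/kg'."""
    dose_value = float(rec["dose"].split()[0])
    return (rec["test_article"].upper(), rec["species"].lower(), dose_value)


def interconnect(silos):
    """Relate records across silos through their extracted metadata.

    silos maps a silo name to a (records, extractor) pair. The result maps
    each metadata key found in more than one silo to the matching records,
    grouped by silo name.
    """
    linked = defaultdict(lambda: defaultdict(list))
    for name, (records, extract) in silos.items():
        for rec in records:
            linked[extract(rec)][name].append(rec)
    return {key: dict(by_silo) for key, by_silo in linked.items() if len(by_silo) > 1}


linked = interconnect({
    "adme": (adme_records, adme_metadata),
    "safety_pharm": (safety_pharm_records, safety_pharm_metadata),
})
for key, by_silo in linked.items():
    print(key, "->", sorted(by_silo))
```

A new analysis question would typically mean writing new extraction functions, which matches the point above that the process may have to be repeated as new kinds of extraction models are conceived.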

Are there other types of inter-connectivity?

In nonclinical development alone there are at least 8 typical study types that end up in their own data silos: ADME/DMPK, General Toxicology, Genetic Toxicology, Safety Pharmacology, Reproductive and Development Toxicology, Irritation, Toxicogenomics, and Toxicoproteomics. This means there are 28 possible combinations taken 2 at a time. It is quite possible that three or more areas can be considered in concert in an analysis or research scenario, adding another 56 combinations (for three at a time) or more.
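The counts above follow from the binomial coefficient: C(8, 2) = 28 pairs and C(8, 3) = 56 triples. As a quick check, the short sketch below enumerates the combinations of the eight study types named in this section; the variable names are our own.

```python
from itertools import combinations
from math import comb

# The eight study types named in the paragraph above.
STUDY_TYPES = [
    "ADME/DMPK",
    "General Toxicology",
    "Genetic Toxicology",
    "Safety Pharmacology",
    "Reproductive and Development Toxicology",
    "Irritation",
    "Toxicogenomics",
    "Toxicoproteomics",
]

# All unordered combinations of two and of three study types.
pairs = list(combinations(STUDY_TYPES, 2))
triples = list(combinations(STUDY_TYPES, 3))

# The enumerated counts agree with the closed-form binomial coefficients.
assert len(pairs) == comb(len(STUDY_TYPES), 2) == 28
assert len(triples) == comb(len(STUDY_TYPES), 3) == 56

for a, b in pairs[:3]:
    print(f"{a} <-> {b}")
```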

Therefore it would be worthwhile for subject matter experts and scientists to evaluate each of the two-at-a-time combinations and to list as many ways as they can in which these data types could be used together. It would be useful to rank or score these and to use the rankings to decide what kind of interconnectivity should be proposed and whether there is room for a metadata standard that allows inter-operability.
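One possible way to turn such expert assessments into a ranking is sketched below. This is an illustration only, not a method adopted by the group: the function name `rank_silo_pairs`, the use of a simple mean as the aggregate, and the example scores are all assumptions, and the scores are placeholders rather than real expert input.

```python
from itertools import combinations
from statistics import mean


def rank_silo_pairs(silos, expert_scores):
    """Return every silo pair with its mean expert score, highest first.

    expert_scores maps a frozenset of two silo names to a list of numeric
    scores. Pairs that nobody scored get a score of None and sort last.
    """
    ranked = []
    for a, b in combinations(silos, 2):
        scores = expert_scores.get(frozenset((a, b)), [])
        ranked.append(((a, b), mean(scores) if scores else None))
    # Scored pairs first in descending order, unscored pairs at the end.
    ranked.sort(key=lambda item: (item[1] is None, -(item[1] or 0)))
    return ranked


# Placeholder scores for illustration only (hypothetical, not real input).
silos = ["ADME/DMPK", "General Toxicology", "Safety Pharmacology"]
example_scores = {
    frozenset(("ADME/DMPK", "Safety Pharmacology")): [4, 5],
    frozenset(("ADME/DMPK", "General Toxicology")): [3, 3, 4],
}

ranking = rank_silo_pairs(silos, example_scores)
for (a, b), score in ranking:
    print(f"{a} + {b}: {score}")
```

Keying the scores by an unordered pair (a `frozenset`) means an expert can record a score for "A with B" or "B with A" and it lands in the same cell, which is the property a symmetric scoring matrix would need.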

An Excel worksheet listing some of the typical nonclinical (and other) study types that generate their own data silos is also attached here.

File:DDD PhUSE.xlsx

Related Literature

File:Biologics in the pipeline.pdf

File:FDA Established Pharmacologic Class Legend 01July10 FINAL.doc

File:FDA Pharm Class Table 20100701.xls



FDA comments are an informal communication and represent the individual's best judgment. These comments do not bind or obligate FDA. The contents of this wiki are from the individual contributors and do not necessarily reflect the view and/or policies of the Food and Drug Administration, the employers of the individuals involved or any of their staff.

Last revision by Pacorn,10/7/2013