Open Source Technologies in Clinical Research

From PHUSE Wiki

Project Overview

Mission: Open Source Technologies in Clinical Research aims to provide guidance on the use of open source technologies in regulatory environments within the pharmaceutical industry, including but not limited to R and Python. Our intent is to be a repository of knowledge for:
• Use Cases
• Implementation and validation guidance
• Best Practices

Our goal is to broaden acceptance of, and comfort with, these technologies across the industry and thereby increase their adoption.

Scope: As open source languages and tools become more common across the data science landscape, opportunities to leverage them grow as well. Open source languages tend to be on the cutting edge, which among other things enables teams to explore the latest statistical techniques. R and Python are also prominent in the world of big data, connecting to powerful distributed computing tools such as Spark and Hadoop. Both languages also have impressive packages for machine learning and deep learning, bringing modern analytical tools to programming teams at little or no cost. Furthermore, R offers Shiny and R Markdown as advanced reporting tools that enable delivery of static or dynamic data dashboards. At the most basic level, Python offers automation capabilities, and it is likely already available in most programming environments. These technologies have a great deal to offer well before an organization considers them as its submission or reporting language of choice.

This project will investigate the methods an organization can use to ensure the thoughtful use of these tools within regulatory environments. These methods will include the proper installation and deployment of open source languages, validation of frameworks and packages used in development and analysis, quality control of in-house software development, and assurance of GxP compliance. Topics will be discussed at a high level, covering the basic concepts behind techniques for installation, deployment, and validation, as well as the specifics of how these things can be put into practice. The project will walk through proper installation, the choice of development environment, and the implications of those choices. The risks of misunderstanding what these tools can do in the environment where they have been implemented can be subtle and can result in unexpected application failures; proper validation is therefore essential within a regulated space.

The third element of this project involves the provision of best practices and guidelines that practitioners can use as resources when working with open source technologies in clinical research. One example is the development of PHUSE-specific coding style guides for open source programming languages, built on Google’s R Style Guide or the tidyverse style guide for R, or on the PEP 8 style guide for Python. Most clinical programming departments have well-defined Good Programming Practices (GPP) for SAS programming, as does PHUSE, and adoption of open source languages in the pharmaceutical industry will require similar guidance. A second example is guidance on identifying common packages or common repositories that are deemed safe to use within a programming environment when reporting on clinical trials, leveraging the work already being done by PharmaR. Furthermore, a clear understanding of package selection and version control is essential to this process. The open source nature of these technologies, juxtaposed against the high standards of validation within the industry, necessitates the establishment of a trust mechanism for any open source element of the technology.
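As a rough illustration of what package selection and version control can look like in practice, the sketch below compares installed Python packages against an approved list. It is a minimal example under stated assumptions, not a validated procedure: the function name `check_environment` and the idea of a simple name-to-version allow-list are hypothetical and are not taken from PHUSE or PharmaR guidance. Only the standard-library `importlib.metadata` module is used.

```python
from importlib import metadata


def check_environment(approved, get_version=metadata.version):
    """Compare installed packages against an approved name -> version mapping.

    `approved` is a hypothetical allow-list, e.g. {"numpy": "1.26.4"}.
    `get_version` defaults to importlib.metadata.version and can be
    replaced in tests. Returns a dict of package -> description of the
    deviation; an empty dict means every listed package matches.
    """
    problems = {}
    for name, expected in approved.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            # The package is on the allow-list but absent from the environment.
            problems[name] = "not installed"
            continue
        if installed != expected:
            # Exact matching is assumed here; a real policy might allow ranges.
            problems[name] = f"installed {installed}, approved {expected}"
    return problems
```

Calling `check_environment` with an organization's own allow-list returns only the deviations, which could then be recorded as part of an environment qualification step. Exact version matching is a deliberate simplification for the sketch.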

Project Leads

Name Role Organisation Deliverable Sub-team
Gayathri Kolandaivelu Sub-Team Lead JNJ Community Priority Pulse Survey
Frank Menius Sub-Team Lead YPrime
Eli Miller Sub-Team Lead Atorus Research
Michael Rimler Sub-Team Lead GSK
Michael Stackhouse Sub-Team Lead Atorus US Connect 2020

How to Get Involved

We have an All Hands Call on the first Tuesday of every month at 11am Eastern Time (New York).

A link to the meeting can be found here:

The All Hands Call is open to all who want to attend.

Please contact the Project Leads to be added to the calendar invite, mailing list, and other resources.

Sub-teams will have meetings as needed and set by the team leads. Please contact sub-team leads to participate in projects.

To propose a new Sub-team or project please contact the Project Leads and attend the All Hands Call.

Project Updates

Community Priority Pulse Survey

US Connect 2020

Open Source Technologies in Regulatory Submissions

Objectives and Timelines


Recruit Volunteers July 1st 2019
Deliverable – Community Priority Pulse Survey Sep
Deliverable – Paper Presentation US PHUSE Connect 2020 (March 2020)

Meeting Minutes

Month Speaker Topic
January Paulo Bargo, J&J R Training Organizational Framework
February Internal Presentation of the Survey Analysis Team
March Katja Glaß, Consultant Open Source Technology in Clinical Research
April Ellis Hughes, Fred Hutch R Package Validation Frameworks
May Doug Kelkhoff, Genentech PHUSE Docker Deliverables Pilot
June Terek Peterson, YPrime The Randomizer: An IRT Solution Using R
July TBA Topic
August TBA Topic
September TBA Topic
October TBA Topic
November TBA Topic
December TBA Topic

Archive: Meeting minutes, recordings, and slides are collected on the PHUSE Teamwork application. Please contact the Project Leads to become involved with the project and to be added to the Teamwork application and other resources.

Open Source Training and Knowledge Base

Available Training for Open Source Technology
Links to Resources and Knowledge

Helpful Links

Slack Workspace: For informal discussions within our community. Please email the steering committee for access.

Project Members

Delivery Sub-Teams: 1 Community Priority Pulse Survey; 2 US Connect 2020

Clinical Programming

Name (Organisation) Name (Organisation)
James Kim (Pfizer) Benno Kurch (Covance)
Yuichi Nakajima (Novartis) Gayathri Kolandaivelu Project Lead (JNJ) 1
Harivardhan Jampala (Covance) James Gunter (Covance)
Matthew Travell (GSK) Michael Rimler Project Lead (GSK) 2
Michael Stackhouse Project Lead (Atorus) 2 Nicholas Masel (JNJ)
Steven Nicholas (Covance) Sonakshi Shankar (OCS Life Sciences for Danone Nutricia Utrecht)
Tulasi Marrapu (GSK) Eli Miller Project Lead (Atorus)
Alexey Kuznetsov (GSK)


Name (Organisation) Focus Area
Andy Nicholls (GSK) Statistical Data Sciences
Brian Di Pace (GSK) Clinical Statistics
Eanna Kiely (ClinBuild) Data Standards Governance
Frank Menius Project Lead (YPrime) Clinical Programming, Data Science, and Data Standards
Terek Peterson (YPrime) 2 Clinical Analytics & Data Strategies
Tatiana Scetinina (AstraZeneca) Clinical Programming / Data Science


Name (Organisation) Name (Organisation)
Aldir Medeiros Filho Bruce Wienckowski (GSK) 2
Hanming Tu (Frontage) Hatim Qais (FDA)
Jorine Putter Katja Glaß
Kristy Lauderdale (CSG) Niels Gronning (SAS)
Naga Madeti (Covance) Ajay Yalwar (Covance)
Nurcan Coskun Russell Gibson (CROS NT)
Sangram Parbhane Ashwin Venkat (BuddhiMed Technologies)
Walter Jessen (LabCorp) Yufei Du (GSK)
Douglas Kelkhoff (Gene) Marko Zivkovic (Genesis)
Sam Hume (CDISC) Martijn.X.Van- Beelen (GSK)
Teckla Akinyi (GDK) Nithiya Ananthakrishnan (Algorics)
David Pressley (Clear Creek Analytics) Amol Waykar (d-wise)
Vamshi Matta (Covance) Jingyuan Chen (Roche)
Ross Didenko (Covance) Sasikumar Palanisamy (Covance)
Srinivas Veeragoni (Bayer) Tim Williams (UCB)
Anbu Damodaran (Covance) Babru Hottengada (Eliassen)
Nagadip Rao (EG Life Sciences) Ganesh Vaidyanathan
Kiran Boddu ( William Noble (Sarah Cannon)
Ashley Tarasiewicz (Atorus Research) Nathan Kosiba (Atorus Research)