Summary of Traceability References

From PhUSE Wiki

The table below summarizes and interprets traceability references found in the public domain (e.g. conference papers), including FDA documents (e.g. the Common Issues Document).

Document Title Source Summary/Interpretation
Methods of Building Traceability for ADaM Data PharmaSUG 2011 Four methods of building traceability in ADaM datasets through examples of questionnaires
CDER Common Data Standards Issues Document (Version 1.1/December 2011) FDA Page #6 -

Analysis datasets should be derivable from the SDTM datasets, in order to enable traceability from analysis results presented in the study reports back to the original data elements collected in the case report form and represented in the SDTM datasets.

Comment: FDA seems to expect sponsors to create analysis datasets from SDTM, not from raw data.

CDER/CBER’s Top 7 CDISC Standards Issues FDA #18 and #19:

6. Traceability
  • Need linkage: CRF -> SDTM -> ADaM -> CSR
  • SDTM datasets should be created from CRFs
  • If instead CRFs -> Raw -> SDTM, your analysis (and hopefully ADaM) datasets should be created from those same SDTM datasets, not the raw datasets
  • Features exist in the ADaM standard that allow for traceability of analyses to ADaM to SDTM
  • Creating SDTM and Analysis data from the raw data is incorrect (especially when submitting only SDTM and analysis data)
  • Raw data should create SDTM, and SDTM should then create Analysis

Comment: FDA seems to expect sponsors to create analysis datasets from SDTM, not from raw data.

Traceability between the clinical database and analysis datasets for a submission public meeting Legacy data conversion process well described
Traceability between SDTM and ADaM converted analysis datasets PhUSE 2010 QC process of ADaM conversion well described
ADaM Implications from the “CDER Data Standards Common Issues” and SDTM Amendment 1 Documents SCSUG 2012 Relationship between CDER Data Standards Common Issues document and SDTM IG well elaborated.
ADaM or SDTM? A Comparison of Pooling Strategies for Integrated Analyses in the Age of CDISC PhUSE 2012 Data pooling strategy well described. Details traceability from single study SDTM to single study results by adding additional records to the integrated database with a harmonized parameter.
Electronic Common Technical Document Specification V3.2.2 ICH Multiple citation of navigation and its functional purposes.
ICH GCP Guideline for Good Clinical Practice E6(R1) ICH Traceability is one of the 13 principles of ICH GCP, Section 2.10, page 9: All clinical trial information should be recorded, handled, and stored in a way that allows its accurate reporting, interpretation and verification.
Reflection paper on expectations for electronic source data and data transcribed to electronic data collection tools in clinical trials EMA Comments on traceability:
  • Section 3, page 5: The fundamental issues to be demonstrated remain common in many cases to both paper and electronic systems (e.g. traceability, change-control...), though electronic systems present additional challenges in providing an adequate level of confidence in the data and should be validated.
  • Section 6.1, page 7: Data should be traceable and an unambiguous subject identification code should be used to allow identification of all data reported for each subject.
EMEA implementation of electronic-only submission and eCTD submission: Questions and answers relating to practical and technical aspects of the implementation EMA Hyperlinks (navigation) are useful only when they add value (Question 3, page 28):
  • Paragraph 1: In general, hypertext links are encouraged within the eCTD to facilitate swift navigation around the dossier, but should not be overused.
  • Paragraph 2: The greater the number of hyperlinks contained in an eCTD dossier, the longer it takes to technically validate the submission, and the greater the likelihood of non-functioning hyperlinks.
  • Paragraph 4: For Non-clinical/Clinical, there is a less defined structure within Modules 4 and 5 and the placement of studies and their names may vary across submissions, meaning that a larger amount of linking from summaries is of benefit.
Data Standards Strategy V1.0 FDA Traceability is mentioned in the context of legacy data conversion. The agency will "provide technical guidance (nonbinding) to industry for conversion from legacy data to SDTM compliant datasets." (Section 8.2, page 8).
Define.xml Version 2.0 CDISC Traceability is mentioned in the context of (1) annotated CRF page numbers, (2) Computational Method Definitions, and (3) def:Origin Element:
  • Section 4.5. Links to Supporting Documents: Annotated CRF page numbers may be included in an ItemDef's def:Origin element to provide traceability between the data collection CRF and submission dataset variables. See section for details of the def:Origin element and section for details of the def:PDFPageRef element.
  • Section 4.6. Computational Method Definitions: "To enhance traceability users are encouraged to provide descriptions that include accurate and consistent references to source variables and derivations."
  • Section def:Origin Element: "Predecessor: Data that is copied from a variable in another dataset. For example, predecessor is used to link ADaM data back to SDTM variables to establish traceability."
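As a rough illustration of how def:Origin carries traceability, the sketch below reads a small hand-written Define-XML 2.0 style fragment with Python's standard library. The OIDs, variable names, page number, and leaf ID are invented for illustration, and the fragment is not a complete or validated define.xml.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ODM 1.3 and Define-XML 2.0.
NS = {
    "odm": "http://www.cdisc.org/ns/odm/v1.3",
    "def": "http://www.cdisc.org/ns/def/v2.0",
}

# Hypothetical fragment: one CRF-origin variable, one predecessor variable.
FRAGMENT = """
<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.3"
                 xmlns:def="http://www.cdisc.org/ns/def/v2.0">
  <ItemDef OID="IT.VS.VSORRES" Name="VSORRES" DataType="text">
    <def:Origin Type="CRF">
      <def:DocumentRef leafID="LF.blankcrf">
        <def:PDFPageRef PageRefs="12" Type="PhysicalRef"/>
      </def:DocumentRef>
    </def:Origin>
  </ItemDef>
  <ItemDef OID="IT.ADVS.USUBJID" Name="USUBJID" DataType="text">
    <def:Origin Type="Predecessor">
      <Description>
        <TranslatedText xml:lang="en">VS.USUBJID</TranslatedText>
      </Description>
    </def:Origin>
  </ItemDef>
</MetaDataVersion>
"""


def describe_origins(xml_text):
    """Return {variable name: (origin type, CRF pages or predecessor text)}."""
    root = ET.fromstring(xml_text)
    origins = {}
    for item in root.findall("odm:ItemDef", NS):
        origin = item.find("def:Origin", NS)
        if origin is None:
            continue
        page = origin.find("def:DocumentRef/def:PDFPageRef", NS)
        text = origin.find("odm:Description/odm:TranslatedText", NS)
        if page is not None:
            detail = page.get("PageRefs")   # annotated CRF page(s)
        elif text is not None:
            detail = text.text              # source variable description
        else:
            detail = None
        origins[item.get("Name")] = (origin.get("Type"), detail)
    return origins


origins = describe_origins(FRAGMENT)
for name, (kind, detail) in origins.items():
    print(name, kind, detail)
```

A listing like this gives a quick view of which variables point back to a CRF page and which point back to a predecessor variable.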
CDISC ADaM IG version 1.0 CDISC Traceability is required, not optional.

ADaM has 2 levels of traceability: metadata and data-point.

  • Metadata traceability is required.
  • Data-point traceability should also be provided whenever feasible and practical.

Intermediate analysis datasets can be included to help describe complex derivations.
One of the nicest features of the vertical structure of BDS is that it allows us to easily include data-point traceability via variables --SEQ and SRC*.
The sole purpose of some ADaM variables and rows is to provide traceability, not for direct use in analysis. For example:

  • Variables can be copied from SDTM even if not used for analysis, such as to show how they contrast with similar ADaM variables.
  • Variables can be copied from ADSL to any other dataset, even if only for traceability.
  • Whenever feasible, use a full analysis dataset with all records used in determining analysis parameters. If the full dataset becomes too cumbersome to work with, create a subset in addition to the full analysis dataset.

Because ADaM requires we include all observed and derived rows for a parameter, we can use ANLzzFL to distinguish which are used in analysis. Doing so provides traceability between the dataset and the results.
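The two ideas above, data-point traceability and analysis flags, can be sketched with plain Python dictionaries standing in for real datasets. The records, values, and the helper `source_value` are hypothetical; only the variable names SRCDOM, SRCVAR, SRCSEQ, and ANL01FL (one instance of ANLzzFL) come from the ADaM conventions summarized here.

```python
# Hypothetical SDTM VS records, keyed by (USUBJID, VSSEQ).
sdtm_vs = {
    ("S01-001", 1): {"VSTESTCD": "SYSBP", "VISITNUM": 1, "VSSTRESN": 128.0},
    ("S01-001", 2): {"VSTESTCD": "SYSBP", "VISITNUM": 1, "VSSTRESN": 132.0},
}
sdtm = {"VS": sdtm_vs}

# Hypothetical BDS rows. Both observed rows are kept; ANL01FL marks the
# one chosen for analysis (the selection rule itself is assumed here).
advs = [
    {"USUBJID": "S01-001", "PARAMCD": "SYSBP", "AVISITN": 1, "AVAL": 128.0,
     "SRCDOM": "VS", "SRCVAR": "VSSTRESN", "SRCSEQ": 1, "ANL01FL": ""},
    {"USUBJID": "S01-001", "PARAMCD": "SYSBP", "AVISITN": 1, "AVAL": 132.0,
     "SRCDOM": "VS", "SRCVAR": "VSSTRESN", "SRCSEQ": 2, "ANL01FL": "Y"},
]


def source_value(row):
    """Follow SRCDOM/SRCVAR/SRCSEQ back to the SDTM value behind a BDS row."""
    record = sdtm[row["SRCDOM"]][(row["USUBJID"], row["SRCSEQ"])]
    return record[row["SRCVAR"]]


# Data-point traceability: each AVAL matches the SDTM value it points to.
traced = [source_value(row) == row["AVAL"] for row in advs]

# Dataset-to-results traceability: only flagged rows feed the analysis.
analysis_rows = [row for row in advs if row["ANL01FL"] == "Y"]

print(traced)
print([row["SRCSEQ"] for row in analysis_rows])
```

Keeping the unflagged row in the dataset is what lets a reviewer see which observations were available and which one was actually analyzed.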

ADaM Datasets for Graphs PharmaSUG 2013 Example of a clear path from SDTM VISITNUM to ADaM AVISITN and X-Axis on summary figure is provided in Table 2 and Figures 1 and 2.
Traceability in the ADaM Standard PharmaSUG 2013 Basic SDTM to ADaM traceability practices described. ADaM to ADaM traceability practices described. ADaM to table/figure output traceability practices described through the use of analysis flags.
Considerations in Data Modeling when Creating Supplemental Qualifiers Datasets in SDTM-Based Submissions PharmaSUG 2013 Explains the basics of Supplemental Qualifiers in SDTM data. The paper also provides alternative methods for representing data in custom SDTM domains while preserving the relationship between the data in the custom domain and the other collected data.
Clinical Data Acquisition Standards Harmonization (CDASH) User Guide CDISC 2.4.2 Using Fields That Do Exist in CDASH

The goal is to have end-to-end traceability of the variable name from the data capture system to the SDTM datasets.

2.4.3 Creating Fields That Do Not Exist in CDASH
The naming conventions and other variable creation recommendations in CDASH are designed to aid implementers and facilitate traceable mapping to submission datasets.

Tame the Traceability Monster PhUSE SDE 2013

Presentation at the PhUSE SDE in Copenhagen, 28 May 2013.
Traceability from reviewer’s and sponsor’s perspectives. Short introduction of CDISC project at Novo Nordisk – SDTM initial implementation including legacy conversion to SDTM. Our approach to ‘tame the traceability monster’ for legacy data.

Study Data Specifications FDA

Page #6:
- Variable names and codes should be consistent across studies and where feasible, the NCI CDISC Vocabulary should be used. For example, if glucose is collected in a number of studies, use the CDISC Submission Value “GLUC” for the laboratory test code in all of the studies.
Page #8:
- For textual data that have been mapped to numeric codes, provide two variables, one with text and one with numeric codes.
- Dates should be formatted as numeric in the analysis datasets even if dates are in ISO8601 or another character format in the raw data. This formatting will facilitate calculations, such as duration.
- Any submitted programs (scripts) generated by an analysis tool should be provided as ASCII text files or PDF and should include sufficient documentation to allow a reviewer to understand the submitted programs. If the programs created by the analysis tool use a file extension other than .txt, the file name should include the native file extension generated by the analysis tool for ASCII text program files, e.g. adsl_r.txt or adsl_sas.txt, etc.
Page #10:
- The internal dataset label should clearly describe the contents of the dataset. For example, the label for an efficacy dataset might be "TIME TO RELAPSE (EFFICACY)".
Page #15:
- The annotated CRF is a blank CRF that includes treatment assignment forms and maps each item on the CRF to the corresponding variables in the database. The annotated CRF should provide the variable names and coding for each CRF item included in the data tabulation datasets. All of the pages and each item in the CRF should be included. The sponsor should write "not entered in database" in all items where this applies. The annotated CRF should be provided as a PDF file. Name the file blankcrf.pdf.
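The Page #8 recommendation on numeric dates can be illustrated with a small sketch. It assumes complete ISO 8601 dates (YYYY-MM-DD), a SAS-style numeric date (days since 1960-01-01), and a duration convention in which the start day counts as day 1. These choices and the function names are assumptions for illustration, not part of the FDA text, and partial dates are not handled.

```python
from datetime import date

# Assumed reference date for the numeric representation (SAS convention).
SAS_EPOCH = date(1960, 1, 1)


def iso_to_numeric(iso_text):
    """Convert a complete ISO 8601 date string to days since 1960-01-01."""
    return (date.fromisoformat(iso_text) - SAS_EPOCH).days


def duration_days(start_iso, end_iso):
    """Duration in days, counting the start day as day 1 (assumed convention)."""
    return iso_to_numeric(end_iso) - iso_to_numeric(start_iso) + 1


# Hypothetical start and end dates stored as character values in the raw data.
start_num = iso_to_numeric("2013-05-28")
end_num = iso_to_numeric("2013-06-03")
print(start_num, end_num, duration_days("2013-05-28", "2013-06-03"))
```

Once both dates are numeric, the duration is a simple subtraction, which is the kind of calculation the specification has in mind.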

Draft eStudy Data Guidance FDA

Page #10, line 251:
Clinical and nonclinical study data that were previously collected in a nonstandard format are not always easily amenable to complete standardization. Typically, a conversion to a standard format will map every data element as originally collected to a corresponding data element described in a standard. Some study data conversions will be straightforward and will result in complete data in a standardized format. In some instances, it may not be possible to represent a collected data element as a standard data element. In these cases, the submission should document why certain data elements could not be fully standardized or were otherwise not included in the standardized data submission. In cases where the data were collected on a CRF but not included in the converted datasets, the omitted data should be apparent on the annotated CRF. The tabular listing of studies in a submission should indicate which studies contained previously collected nonstandard data that were subsequently converted to standard format.

Considerations in Creating Transparent SDTM-Based Datasets PhUSE 2014 Looks at a number of specific areas where incorrect or misleading mapping may compromise the goal of 'traceability'.
Traceability: Plan Ahead for Future Needs PhUSE 2014 Describes some simple ways to incorporate traceability into the dataset and output development process, and elaborates on some of the benefits seen when traceability is incorporated. It includes references to documents that we, as two of the co-leads of the Computational Sciences Symposium working group on Traceability and Data Flow, helped develop and post to the wiki.
A Day in the Life of a Data Sharing Specialist PhUSE 2014 website allows data requests/transfers between and among Bayer, Boehringer Ingelheim, GSK, Lilly, Novartis, Roche, Sanofi, Takeda, UCB and ViiV Healthcare.
Data Transparency: Moving from Bad Pharma to Good Science PhUSE 2014 Enhanced data sharing with researchers, public access to data, patients who participate in trials, certifying procedures to share data and reaffirming commitments to publish clinical trial results.
Data Transparency Through Metadata Management PhUSE 2014 Data transparency requires assurance that reported data are accurate and are coming from the official source. Currently however, the data resides in a myriad of systems and formats, making it difficult to maintain the lineage from data collection through analysis and reporting. By managing data about the data or metadata across the enterprise, organizations can provide full data lineage for regulatory compliance and improve business efficiency at the same time.
The Capish Information Model - Simplify Access to Your Data PhUSE 2014 Outlines opportunities for creating clinical data transparency by integrating data to a well-defined, source-independent information model. Also how challenges in protecting patient privacy and intellectual properties can be overcome.
Transparency in the Time of Constant Change PhUSE 2014 Discusses some currently trending ways, such as reporting results to CT.GOV, of creating and presenting data so that it is consumable and understandable not only for analysis/submission purposes but also post-approval, when it should be transparent and reviewable in an appropriate way by everyone who has a vested stake.
Multi-Sponsor Data Transparency: A Group Approach to Sharing PhUSE 2014 Decisions to implement transparency systems were initially guided by proposed EMA regulations, and are now proceeding under their own momentum as biopharmaceutical companies strive to show that they have nothing to hide regarding their clinical research programs. Additionally, many of the companies that have launched their corporate transparency implementations are going one giant step further, and offering independent researchers the opportunity to readily investigate clinical trials data that spans multiple manufacturers. Fewer than 18 months ago, most biopharmaceutical companies viewed their clinical trial data as strictly proprietary. Today, these companies are actively enabling independent researchers to explore the trial data directly. During this session, you’ll learn more about this important industry initiative, and how your organization can support its success.
Draft Functional specifications for the EU portal and EU database to be audited EMA The EU portal and the EU database and associated workspace are all designed to provide users with access to data. This will provide traceability and transparency to submitted data.
Building Traceability for End Points in Analysis Datasets Using SRCDOM, SRCVAR, and SRCSEQ Triplet SAS Global Forum 2013 To be compliant with ADaM Implementation Guide V1.0, traceability features should be incorporated in study analysis datasets to the extent possible. There are two types of traceability: (1) Metadata Traceability (2) Data Point Traceability. Data Point Traceability provides a clear link in the dataset to the specific input data values used to derive analysis values. The SRCDOM, SRCVAR, and SRCSEQ triplet is one of many ways suggested by CDISC to establish data point traceability in ADaM datasets. This poster provides various examples of applying the SRCDOM, SRCVAR, and SRCSEQ triplet to establish traceability in efficacy ADaM datasets from the Cystic Fibrosis therapeutic area.
– User friendly ways to get traceability PhUSE 2012 Presentation looks at different ways to obtain and document traceability. Final recommendation: Embed SAS programs within RTRACELOC option to document all input and output files.
Examples of Building Traceability in CDISC ADaM Datasets for FDA Submission SAS Global Forum 2012 This paper provides examples of applying the inherent traceability features available in the ADaM Basic Data Structure (BDS), of adding SRCDOM, SRCVAR, and SRCSEQ variables, and of adding Relation Criteria and Relation Factor variables in ADaM datasets.

Traceability & Data Flow wiki page