SEND Implementation Wiki - FAQ
- 1 Basics
- 2 Timing / Regulatory
- 3 Protocol, Report, QA, QC Impact
- 4 Working with SEND Files
- 5 SEND Implementation
- 6 Controlled Terminology (CT) and Formats
- 7 Numerical Data (Measurements Data)
- 8 Categorical Data (Findings Data)
- 9 Define File
- 10 SDRG
- 11 Domain Specifics
- 12 Use of SEND Packages by the FDA/Industry
- 13 SEND Future
- 14 Open Questions
This page covers a knowledge base of Frequently Asked Questions regarding SEND.
- Known Issues
- Getting SEND-ready
- SEND Fundamentals
- CT Fundamentals
- Define.xml Fundamentals
- SEND Implementation Forum - Don't see your question? Ask it here!
|Note: The content of this page was prepared by PhUSE working group and SEND team members and should not be considered as official FDA responses. This content exists to concisely summarize answers that are usually available buried within other documents or pages, to provide implementers with quick, unofficial, and useful answers to their questions.|
What is SEND?
SEND, or the Standard for Exchange of Nonclinical Data, is an implementation of the CDISC Study Data Tabulation Model (SDTM) which specifies a way to present nonclinical data in a consistent format.
Timing / Regulatory
When Was SEND Released?
- 2011, July: SEND Implementation Guide 3.0 (SENDIG 3.0) was released. http://www.cdisc.org/send
- 2011, December: FDA CDER announced they accept SEND Implementation Guide 3.0. “CDER strongly encourages IND sponsors and NDA applicants to consider the implementation and use of data standards for the submission of applications...” See the FDA Data Standards page for more details.
When Is SEND Mandatory?
The FDA submission requirement for SEND depends on when the study starts and the type of submission:
|NDA, ANDA, and certain BLA submissions||Studies which start after 2016-12-18 (December 18th, 2016)|
|Commercial INDs and amendments, except for submissions described in section 561 of the Federal Food, Drug, and Cosmetic Act||Studies which start after 2017-12-18 (December 18th, 2017)|
- Since studies included in an IND are always, or nearly always, included in the subsequent NDA, many organizations are preparing to have SEND for all studies intended for an NDA or IND submission that start on or after the 2016-12-18 date.
- These milestones apply to each study individually. Some submissions may span many years; for these, only studies that start after the dates above are mandated to be in SEND.
- That said, the preference is that where feasible, all repeat dose, single dose and carcinogenicity studies would be submitted with SEND datasets even if not technically required, since it can improve the review process.
- NDA submitted to CDER before 18 Dec 2016 - does not need to be submitted in SEND format (although it is preferred).
- NDA submitted to CDER after 18 Dec 2016 - needs any studies which started after 18 Dec 2016 to be submitted in SEND. The other studies in the submission do not need to be submitted in SEND format (although it is preferred).
- IND submitted to CDER before 18 Dec 2017 - does not need to be submitted in SEND format (although it is preferred).
- IND submitted to CDER after 18 Dec 2017 - needs any studies which started after 18 Dec 2017 to be submitted in SEND. The other studies in the submission do not need to be submitted in SEND format (although it is preferred).
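The date rules above can be sketched as a small lookup. This is only an illustration of the cutoff logic as summarized on this page: the function name `send_required` and the submission-type keys are hypothetical, and the sketch ignores the other conditions (study type, the section 561 exception, which BLAs are covered).

```python
from datetime import date

# Cutoff dates as listed in the table above. A study falls under the
# requirement only if it starts AFTER the cutoff for its submission type.
# The keys are hypothetical labels chosen for this sketch.
CUTOFFS = {
    "NDA": date(2016, 12, 18),
    "ANDA": date(2016, 12, 18),
    "BLA": date(2016, 12, 18),   # "certain BLA submissions" only
    "IND": date(2017, 12, 18),   # commercial INDs and amendments
}


def send_required(submission_type: str, study_start: date) -> bool:
    """Return True if the study start date is after the cutoff.

    Assumption: only the date rule is checked here; whether the study
    type and endpoints are covered must be checked separately.
    """
    return study_start > CUTOFFS[submission_type.upper()]
```

For example, a study starting on 2016-12-18 itself would not be flagged for an NDA, while one starting on 2016-12-19 would.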
These requirements to produce SEND datasets for FDA hinge around the following documents:
- The Food and Drug Administration Safety and Innovation Act (FDASIA), effective the 1st of October, 2012, including the fifth authorization of the Prescription Drug User Fee Act (PDUFA V). (Note that electronic format for submissions is addressed in FDASIA Sec 1136 which amended Sec 745A of 21 U.S.C. 379k)
- The “Guidance for Industry ‐ Providing Regulatory Submissions in Electronic Format ‐ Submissions Under Section 745A(a) of the Federal Food, Drug, and Cosmetic Act” (aka "final guidance"), which was finalized December 18, 2014.
- The “Guidance for Industry ‐ Providing Regulatory Submissions in Electronic Format ‐ Standardized Study Data” (aka "eStudy Data guidance"), which was finalized December 18, 2014.
For more information, see the following FDA pages:
The Japanese Pharmaceuticals and Medical Devices Agency (PMDA) is working to adopt CDISC standards including SEND. At the April 2014 CDISC Europe Interchange, Yuki Ando presented their plans for adoption at some point after FY2017. For more information see the following PMDA page:
When new study types or versions of the SEND Implementation Guide are brought online, when will they be required?
When will Safety Pharm be required?
When will Respiratory and Cardiovascular be required?
When will Repro be required?
When is SENDIG 3.1 required?
How long is SENDIG 3.0 valid?
(Note: this pertains to updates made after the initial SEND requirement laid out in the previous question, e.g., IG updates, new study types, etc.) After new standards or updates are published, pending an evaluation by CDER, CDER will add the standard to the Study Data Standards Resources page with a timeframe for requirement. The timeframe for these will be at least 12 months after the standard/version is added to the page, and will apply only to new studies. It is expected that larger scale additions (such as completely new subject areas) will have a longer timeframe for Sponsors to implement and ramp up before it becomes required. Note:
- To get updates (highly recommended), click the "Sign up for email updates" link at the top of the Study Data Standards Resources page.
- At any given time, within a class of submission (e.g., NDA vs IND), only one version of a document will be officially required. For instance, as soon as SENDIG 3.1 becomes required for NDAs, SENDIG 3.0 is obsolete for NDAs.
- For studies which started when an older version was required (compared to what is required at the time of submission), sponsors have the option to submit with the newer version. For example, say a chronic study starts before the 3.1 requirement date, but by the time the study finalizes, 3.1 is the required version for newly starting studies. In this example, the Sponsor may choose to submit the study in 3.1 (instead of 3.0, which is what is technically required).
Example: say a new/revised standard is published by CDISC in August 2017, reviewed by CDER and added to the Study Data Standards Resources page in September 2017, and then marked as required in October 2018:
- Sponsors have the opportunity to take part in the development and public comment periods on the new standard/version leading up to its publication by CDISC in August 2017, and can expect CDER to add it to the Study Data Standards Resources page soon after.
- Sponsors have, at the least, from the time the standard is added to the page (in this case, September 2017) up to the requirement date (in this case, October 2018) to gear up implementation for the updated version.
- For a new submission submitted to CDER before October 2018, studies within do not need to adhere to the new standard, although it is encouraged/preferred.
- For a new submission submitted to CDER after October 2018, studies within the submission which started after October 2018 need to adhere to the new standard. The other studies in the submission which started before October 2018 are not required to be submitted according to the new standard, although it is encouraged/preferred.
When will new Controlled Terminology be required?
New CT will be required for studies "within a reasonable timeframe" from the new CT's release date. What is "reasonable" is open to interpretation, but we recommend keeping within a year of the CT's release date when packaging a study.
Note that this stipulation applies to CT active at the time of the creation of the SEND package for the study. For instance, if a SEND package is created for a study in 2013 and not submitted until 2017, the CT to which it must adhere is the CT active at the time of the packaging (e.g., 2013 or shortly before it). There is no requirement to retroactively update past studies with CT that comes out after finalization.
Visit the SEND CT page to get the most recent CT.
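The "within a year" recommendation above can be checked mechanically at packaging time. The following is a minimal sketch, assuming you know the release date of the CT version used; the function name and the 365-day default are our own illustration of the recommendation, not an official rule.

```python
from datetime import date, timedelta


def ct_is_reasonably_current(ct_release: date,
                             packaging_date: date,
                             max_age_days: int = 365) -> bool:
    """Return True if the CT was released no more than max_age_days
    before the SEND package was created.

    Assumption: "reasonable" is taken to mean one year, following the
    recommendation in the text; adjust max_age_days as needed.
    """
    age = packaging_date - ct_release
    return timedelta(0) <= age <= timedelta(days=max_age_days)
```

Note that the check is made against the packaging date, not the submission date, so a package created in 2013 and submitted in 2017 is judged against the CT available in 2013.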
When can SEND replace TUMOR.XPT in FDA submissions?
The intent is to phase out tumor.xpt in the future, since it can be generated from a SEND package. However, it is currently still in use. As always, consult the Technical Conformance Guide (as referenced in the Study Data Standards page) to see what is or isn't required.
How do I know whether SEND is mandatory for any given endpoint?
Do I need to include data for unmodeled endpoints?
Is endpoint ___ required for a study type that isn't required?
An endpoint must meet two criteria to be considered mandatory for FDA submissions:
- The study type is one explicitly called out in the FDA Study Data Standards
- The endpoint is one that is currently covered/modeled by the domains in the SEND Implementation Guide
In other words:
- If the endpoints are not currently modeled, it doesn’t matter if it happens to be under the right study type – it doesn’t have a hard requirement.
- If the study type is not one asked for, then it doesn’t matter if it happens to be a modeled endpoint – it doesn’t have a hard requirement.
However, this is purely speaking to what is mandatory. The FDA has stated on numerous occasions that they still would prefer to receive data for both of the above exceptions (e.g., via custom domains or to have packages for non-required study types if they have fitting domains). Please contact email@example.com for additional advice about such a submission.
Is SEND mandatory for study types that were not piloted?
For example, if there is a safety pharm study with body weights, will the body weights need to be submitted?
The Study Data Standards page provides current expectations/recommendations. Note that while typical study types will be piloted, a study type need not be piloted for the Study Data Standards page to specify it as mandatory.
However, electronic submissions of data are encouraged, even when the study type is not yet mandatory. Please contact firstname.lastname@example.org for additional advice about such a submission.
For studies I submit with SEND datasets, what is the FDA's recommendation for including non-SEND datasets? (e.g., custom domains)
Is it required to submit data not modeled in a domain yet?
Generally speaking, from the industry side, it is not considered valuable to provide custom domains, given the issues with nonstandardized data (nonstandard format; can't use across organizations, etc.). Additional domains not part of SENDIG would still be present in individual tabulations (e.g., PDF) for submission coverage purposes.
However, in general, this is encouraged by the FDA. Please contact email@example.com for additional advice about such a submission.
What is the status of the FDA pilots (CDER, CVM, CBER)?
Please see the Study Data Standards page for pilot status.
How will the FDA use SEND files?
The FDA uses the files for the review process, via the Nonclinical Information Management System (NIMS) suite. This suite provides tools that are built to use the information in SEND datasets, such that reviewers are able to review a submission more efficiently than when they receive only PDF or printed submissions that contain the individual animal data. Before loading the files, the FDA gateway will first perform validation checks against the data to make sure they are SEND-compliant. Note that validation in this sense is not computer validation but rather a series of nuts and bolts checks, such as whether required variables are missing or have an unsupported value in a field that requires Controlled Terminology.
Is SEND only a U.S. requirement?
Will the EMA (EU) require SEND?
Will the PMDA (Japan) require SEND?
SEND will only be a requirement in the United States for certain FDA submissions. However, it has operational use, such as transfer between organizations, sponsor warehousing, etc., such that it is a good idea to produce SEND datasets, even if not technically required for submission.
As far as the European Medicines Agency (EMA) goes, the Clinical Trial Advisory Group on clinical trial data formats (CTAG2) is working on advising the EMA on clinical data formats, where it is leaning toward CDISC standards (although if it adopts them, it would likely follow a similar progression to the FDA, with a 2-3 year pilot). Here is a link to the recommendations CTAG2 provided the EMA: Final advice to the European Medicines Agency from the clinical trial advisory group on Clinical trial data formats.
As of 2016, PMDA has put forward a schedule for requiring SDTM on the clinical side and plans to explore the nonclinical side as well (with possible pilot).
Technical Rejection Criteria
As a reminder, the Technical Rejection Criteria are available on the FDA Study Data Standards page.
When do the Technical Rejection Criteria go into effect?
30 days after publication on the FDA Study Data Standards page.
Which TSPARMCDs are required by FDA?
Just TSPARMCD=STSTDTC is explicitly required (for legacy studies under covered study types, e.g., single dose, carc, etc.). All other studies should have a full TS.
There were several variables added in SDTM v1.3, e.g., TSVALNF, TSVALCD, TSVCDREF, and TSVCDVER. Will FDA accept submissions with these not included?
If it is for legacy data submission, you don’t need to include those variables for TS. For data submissions, please create TS according to the IG you use.
Is the TS.xpt file for studies that start before 17 December 2016 required beginning on 17 December 2016?
If you submit SEND after 2016-12-17, TS is required. For legacy cases (e.g., only the report), then only the short version TS is required. All other studies should have a full TS.
Protocol, Report, QA, QC Impact
How will my study reports from a CRO change when SEND files are used?
For the near future, they will not likely change. However, it is a longer term goal for the SEND datasets to eventually replace the individual tabulated datasets. The main body of the study report, the summary tables and other appendices would remain in their current format, though.
Should SEND be listed in the Protocol or treated only as a contract deliverable?
We recommend SEND be listed in the Protocol to ensure that the endeavor is resourced properly and that expectations are set and met.
Please see the Handling of SEND in Study Documentation page for more details and discussion around this question.
Should SEND datasets be sent through QA review?
Depends on whether they are bound by GLP.
Please see the Handling of SEND in Study Documentation page for more details and discussion around this question.
What QC should be applied to SEND datasets?
Depends on whether they are bound by GLP.
Please see the Handling of SEND in Study Documentation page for more details and discussion around this question.
Will the study director's signature on the report also indicate accountability for the SEND datasets?
Depends on whether they are bound by GLP.
What documentation is required for SEND datasets created retrospectively (after report finalization)?
Depends on whether they are bound by GLP.
Working with SEND Files
What's in a SEND Package? What is a SEND File? What is a SEND dataset?
A SEND package consists of a number of dataset files (in XPT format, a.k.a., SAS v5 Transport format) and a define.xml file (which provides information about what's in the datasets).
If I receive a SEND file (i.e. from a CRO), how do I open it/view my data?
How do I open XPT files?
To open XPT files, you have a few options:
- Download and use the free Universal Xpt File Viewer (Open Source). This was previously known as the SAS Viewer, so if you have the SAS Viewer tool, that works in the same way. Note that this tool is very limited in features.
- Open in SAS using the XPORT libname option, e.g.:
libname MyLib XPORT "C:\test.xpt";
- Open in R (see the Open XPT File with R page for instructions)
When opened, the datasets appear similar to Excel worksheets.
Various other vendors have products that will allow you to view SEND datasets as well. These tend to be more robust in visualization/analysis capabilities and enabling of review.
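Before trying any of the viewers above, it can help to confirm that a file you received really is in SAS v5 Transport format. The sketch below uses only the Python standard library and checks the fixed text that begins a v5 transport file; the function name is hypothetical, and the check says nothing about whether the content is SEND-compliant.

```python
# A SAS v5 transport file starts with an 80-byte library header record;
# this is the fixed leading text of that record.
XPT_V5_PREFIX = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def looks_like_xpt_v5(path: str) -> bool:
    """Return True if the file begins with the v5 library header text.

    This only inspects the first bytes of the file; it does not parse
    or validate the datasets inside it.
    """
    with open(path, "rb") as handle:
        return handle.read(len(XPT_V5_PREFIX)) == XPT_V5_PREFIX
```

If you prefer to read the data in Python rather than R or SAS, the pandas library offers `pandas.read_sas(path, format="xport")`, which returns the dataset as a DataFrame.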
How can I create a ts.xpt file with the Study Start Date?
The nonclinical script group has prepared an R script to help do this. It is available from https://github.com/phuse-org/phuse-scripts/tree/master/contributed/Nonclinical/R/CreatingXPT. To run this script, you will first need to obtain R and confirm you can use it to read an XPT file (see the Open XPT File with R page for instructions). Once you are successful with that, do the following:
- Open a command prompt and enter the command: java -version
- If Java is installed, the version number should be displayed. If it isn't displayed, install Java. If you have a 64-bit installation of R, you also need a 64-bit installation of Java.
- Copy the files from the aforementioned GitHub folder to "c:\temp\r testing".
- Create an empty sub-folder named "C:\Temp\r testing\xpt output".
- Launch RGui
- Run the script by selecting the menu options "File" -> "Source R code..." and selecting the file "c:\temp\r testing\CreateTsXPT.R"
- Confirm you have xpt files in the appropriate sub-folders of the "xpt output" folder.
- Modify the Excel file and/or script as needed to get the results you need.
What should the catalog (dataset) name be for the SAS Transport files?
The catalog (dataset) name should be the same as the name of the xpt file. For example, a SAS transport file named BG.XPT should have a catalog name BG.
Will the XPT files be replaced with something easier to work with, like XML?
Yes. Please see the "When will the XPT files be replaced with XML?" question further below under the SEND Future section.
Does the FDA mandate or endorse use of a specific validator like the one from OpenCDISC?
It has been stated in open forums that FDA CDER does not, and does not intend to, have a required or preferred validator. However, the validation rules developed jointly by the FDA and industry will be published in the future. A variant of these rules is currently used by the OpenCDISC Validator tool and can be viewed through its configuration. Organizations are free to build their own validator tools to build on the rules, though, such as to validate against organization-specific data cases, provide additional checks for incoming data that are to be consumed, etc. Validation rules aside, the SENDIG provides the official rules for what comprises a SEND-compliant package, so the implementation guide takes precedence over any discrepancies between the implementation guide and validation rules.
Are there publicly available sample SEND datasets?
Here is a list of places to obtain sample datasets:
- Contact firstname.lastname@example.org with a subject line of “Send me SEND” to get an FDA validated SEND dataset for an example 28 day toxicity study
- Go to SENDDataSet.org to download an FDA validated multi-organizational SEND dataset.
- Many CROs have sample datasets they can provide to evaluate their capabilities or prepare for your own implementation
There are also many examples in the SENDIG of domain records.
For more information on implementation and SDTM/SEND basics, check out the SEND Implementation Wiki - SEND Fundamentals page.
What to Include in a SEND Dataset or SEND Package
Should I include all variables shown in the implementation tables in my dataset?
The "Core" column in the SEND implementation guide defines for each variable whether you must include the variable in your dataset. You must include all variables that are listed as Req (a.k.a., required, meaning not nullable) and Exp (a.k.a. expected, meaning with or without data, but should have a good reason if not populated). Include variables listed as Perm (a.k.a., permissible, meaning can be excluded from the dataset if you do not collect it) if you have data to report in them in the domain dataset. Note that a few variables listed as permissible are either/or (e.g., AGE and AGETXT in DM); in these cases, you are expected to provide at least one of two, although they are defined as permissible because either is acceptable. Such a case will be noted in the CDISC notes for the variable(s) and/or the assumptions for the domain.
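The Req/Exp/Perm rule above can be sketched as a simple filter. The function name and the small Core table in the usage note are illustrative only; always take the actual Core values from the SENDIG domain tables. The either/or cases (such as AGE and AGETXT) are not handled here.

```python
def variables_to_include(core_by_variable, populated):
    """Pick the variables to keep in a domain dataset.

    core_by_variable: dict mapping variable name -> "Req", "Exp" or "Perm"
                      (hypothetical input, transcribed from the SENDIG).
    populated:        set of variable names that have data to report.

    Req and Exp variables are always kept; Perm variables are kept only
    when there is data to report in them.
    """
    included = []
    for name, core in core_by_variable.items():
        if core in ("Req", "Exp"):
            included.append(name)
        elif core == "Perm" and name in populated:
            included.append(name)
    return included
```

For instance, with `{"STUDYID": "Req", "BWORRES": "Exp", "BWSTAT": "Perm", "BWREASND": "Perm"}` and only BWSTAT populated among the permissible variables, BWREASND would be dropped and the other three kept.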
Should I include pretest/stock animals in a SEND dataset?
The animals included in a SEND dataset should be consistent with the animals included in your study data report. Generally this means including pretest/stock animals if they have been assigned to a test treatment group and have study data collected.
What study types can/should I use SEND for or include in my SEND package?
As far as what you can use SEND for, the domains modeled in SEND can be applied to any study that has them (SEND covers many standard endpoints), although study types not yet piloted may have some endpoints not yet covered.
As far as what you should use SEND for, per the SENDIG, section 1.1: "This version of the SENDIG is designed to support single-dose general toxicology, repeat-dose general toxicology, and carcinogenicity studies." These are the types of studies covered in the CDER pilot. A pilot is open with the CVM group, and pilot plans are being discussed for CBER and other FDA divisions. Additional study types (repro, safety pharm) are currently being modeled; pilots will be conducted during that work, and the SENDIG will be updated to include those study types.
Does the order of rows in a domain file matter?
Does the order of columns in a domain file matter?
Columns should be included in a SEND domain file in the order they are listed in the SEND Implementation Guide's domain tables.
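One way to apply this is to sort your dataset's columns by their position in the SENDIG domain table. The sketch below assumes you have transcribed that order into a list yourself; the function name is hypothetical.

```python
def order_columns(columns, ig_order):
    """Return columns sorted into the order given by ig_order.

    columns:  the variable names present in your dataset.
    ig_order: the variable names in the order they appear in the SENDIG
              domain table (transcribed by you; not bundled here).
    Raises ValueError for a column that is not in ig_order, since its
    position would be undefined.
    """
    position = {name: index for index, name in enumerate(ig_order)}
    unknown = [name for name in columns if name not in position]
    if unknown:
        raise ValueError("Columns not found in the IG order: %s" % unknown)
    return sorted(columns, key=lambda name: position[name])
```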
Does the value of --SEQ matter?
The only requirements on --SEQ are that it is (1) an integer and (2) unique for each record within an animal. This means that each animal could have *a* record with a --SEQ value of 1, but no two rows for that animal could have the same --SEQ. One option for populating --SEQ is to make it unique for the entire dataset. This method covers the requirements of --SEQ and also provides a unique key for the dataset.
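Both requirements, and the dataset-wide numbering option, can be sketched as follows. Records are represented as plain dictionaries and the function names are hypothetical; this is an illustration, not a replacement for a validator.

```python
from collections import Counter


def seq_problems(records, seq_var, subject_var="USUBJID"):
    """Return (subject, seq) pairs that break the --SEQ rules:
    the value is not an integer, or it repeats within a subject."""
    counts = Counter((r[subject_var], r[seq_var]) for r in records)
    return [
        (subject, seq)
        for (subject, seq), n in counts.items()
        if n > 1 or not isinstance(seq, int)
    ]


def assign_dataset_unique_seq(records, seq_var):
    """Return copies of the records numbered 1..N across the whole
    dataset, which is automatically unique within each animal."""
    return [{**r, seq_var: i} for i, r in enumerate(records, start=1)]
```

Calling `seq_problems(rows, "BWSEQ")` on rows where one animal has two records with BWSEQ 1 returns that (animal, 1) pair; after `assign_dataset_unique_seq`, the same check returns an empty list.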
What is --GRPID used for?
--GRPID is used to link together records in a single domain for a single subject. The meaning behind a --GRPID value is entirely up to the sponsor. While specific to a domain, --GRPID can also be used in conjunction with a RELREC relationship to link those records to records in other domains. For example, a --GRPID could be set in the CL domain to link together 3 clinical observations. A single RELREC relationship could then pair that --GRPID with a record from another domain, instead of relating each of the 3 records individually.
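As a rough illustration of the CL example, the pair of RELREC records for such a relationship could be built as below. The RELREC variable names (RDOMAIN, IDVAR, IDVARVAL, RELTYPE, RELID) come from the SDTM RELREC dataset; the function name, the identifier values and the choice to leave RELTYPE blank are assumptions of this sketch, so check the SENDIG RELREC section before relying on it.

```python
def relate_group_to_record(studyid, usubjid, group_domain, grpid,
                           other_domain, other_seq, relid):
    """Build two RELREC records that share one RELID:
    one pointing at a --GRPID group, one at a single --SEQ record."""
    common = {
        "STUDYID": studyid,
        "USUBJID": usubjid,
        "RELTYPE": "",      # assumption: left blank for record-level links
        "RELID": relid,
    }
    return [
        {**common, "RDOMAIN": group_domain,
         "IDVAR": group_domain + "GRPID", "IDVARVAL": str(grpid)},
        {**common, "RDOMAIN": other_domain,
         "IDVAR": other_domain + "SEQ", "IDVARVAL": str(other_seq)},
    ]
```

With hypothetical values, `relate_group_to_record("STUDY1", "STUDY1-101", "CL", 1, "MA", 5, "R1")` yields one record identifying the group by CLGRPID and one identifying the other record by MASEQ, so the 3 clinical observations never need individual RELREC rows.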
I have to make a SUPP domain entry because the entry in my domain field exceeds 200 characters. What do I use for QNAM if the domain variable is already 8 characters (like LBREASEX)?
In cases where the standard domain variable name is already 8 characters in length, sponsors should replace the last character with a digit when creating values for QNAM. As an example, for Reason for exclusion in LB (LBREASEX), values for QNAM for the SUPPLB records would have the values LBREASE1, LBREASE2 and so on.
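The naming rule above can be sketched as follows. The 8-character case follows the text; appending the digit to shorter variable names and splitting the text into fixed 200-character pieces are simplifying assumptions of this sketch, and the function names are hypothetical.

```python
def qnam_for(variable: str, n: int) -> str:
    """Return the QNAM for the n-th overflow piece of a long value."""
    if not 1 <= n <= 9:
        raise ValueError("This sketch supports a single digit (1-9) only")
    if len(variable) >= 8:
        # Name is already 8 characters: replace the last one with the digit.
        return variable[:7] + str(n)
    # Assumption: shorter names simply get the digit appended.
    return variable + str(n)


def split_long_value(variable: str, text: str, width: int = 200):
    """Split text into the parent-domain value and SUPP-- pieces.

    Returns (parent_value, [(QNAM, QVAL), ...]). The split is a plain
    character count and does not try to break at word boundaries.
    """
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    parent = chunks[0] if chunks else ""
    supp = [(qnam_for(variable, i), chunk)
            for i, chunk in enumerate(chunks[1:], start=1)]
    return parent, supp
```

A 450-character LBREASEX value would therefore keep its first 200 characters in LB and produce two SUPPLB records, with QNAM values LBREASE1 and LBREASE2.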
How are records which are scored (like some clinpath, dermal, or neurotox tests) supposed to be presented?
Include the score as the result (--ORRES, --STRESC) as you would on your tables. These scores typically do not represent a true number, so --STRESN would usually not be populated. The meaning behind the scores (e.g., that 1 means "Quivers of limbs, ears, head or skin") would be provided in the define.xml file as a CodeList with their coded and decoded values.
Why is QEVAL expected in SUPP-- domains, but --EVAL is permissible in all other domains where it is present?
This allows all SUPP-- domain files to have a consistent structure: all variables will be present in all SUPP-- files since all of the variables are either Required or Expected.
If animals are scheduled to have findings in the future (e.g., at terminal sacrifice) and the animals are removed from the study early (e.g., unscheduled sacrifice), how should findings be represented (e.g., with VISITDY, --STAT, and --REASND)?
Only in the instance where you carry out a planned assessment according to the plan (within the grace days that you have defined for that VISITDY), can you populate a record with VISITDY information. All other instances must be considered unscheduled activities and cannot have a populated VISITDY.
Findings that were originally planned for animals removed early from study are not required to be artificially inserted into the SEND datasets (e.g., with a --STAT of NOT DONE).
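The VISITDY rule above amounts to a window check. Here is a minimal sketch, assuming study days are integers and that you have defined the grace days for each planned day yourself; the function name is hypothetical.

```python
from typing import Optional


def visitdy_for(actual_day: int,
                planned_day: Optional[int],
                grace_days: int = 0) -> Optional[int]:
    """Return the VISITDY to report, or None to leave it blank.

    VISITDY is populated only when a planned assessment was carried out
    within the grace days around its planned day; anything else is
    treated as unscheduled and gets no VISITDY.
    """
    if planned_day is None:
        return None
    if abs(actual_day - planned_day) <= grace_days:
        return planned_day
    return None
```

For example, with a terminal sacrifice planned for day 29 and one grace day, a necropsy on day 30 reports VISITDY 29, while an unscheduled sacrifice on day 15 leaves VISITDY blank.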
When the protocol is amended during the study to change planned activities, when/how should VISITDY be populated?
As soon as the schedule changes by amendment, the amended schedule is the plan, and so VISITDY should be populated with the amended planned day(s). For example, if a protocol amendment calls for a group to be sacrificed early, then the corresponding disposition records would have VISITDY populated with the early sacrifice day(s).
How should the --DTC/--DY variables be populated in cases where --STAT=NOT DONE?
When --ORRES cannot be populated, --STAT and --REASND take the place of --ORRES and thereby represent the outcome of the --TEST. In that instance, --DTC and --DY describe the outcome indicated by --STAT and --REASND. If you are collecting timing information about missing results for planned tests, then you should indeed populate the --DTC and --DY with this information (e.g., the date when the planned test was marked missing).
For outsourced studies, should the STUDYID be the study identifier used by the CRO test facility, or should it be the sponsor's study identifier, if they are different?
The following statement is from the CDER Data Standards Questions Team: "It is our position that the STUDYID variable should be populated with the identifier used during the course of the conduct of the study (in this example, the CRO study ID) and that the TSVAL when TSPARMCD=SSPONSOR should be the sponsor’s identifier."
Working with CROs
What Should I Ask My CRO?
A number of considerations arise when initiating a conversation with another organization around SEND file production. The SEND between Organizations page has an extensive "Points to Consider" question list to help smooth this process.
Working with Multiple Files/Studies/Versions
Should the SENDIG and/or CT version used be the same for all studies in a submission?
Not necessarily. The only requirements are that the same SENDIG/CT version be used within a study, and that the version is what is (reasonably) up-to-date as of the creation of the package. Especially with submissions spanning many years, it is very possible for there to be different SENDIG versions across studies as well as different versions of CT. The only expectation per the draft guidance is that the versions used at the time of creation are current within a reasonable timeframe (what is "reasonable" is being formulated and is expected to be stipulated in the final guidance).
What do I need to do when collating datasets for a domain?
For instance, when two independent labs contribute lab test data toward LB...
In some cases, pieces of a domain may come from different sources, such as different labs or different systems. When bringing the data together into a single dataset for the domain, here are some things to keep in mind:
- --SEQ may need to be re-sequenced so that no two rows have the same value within an animal
- And if you do, RELREC records based on --SEQ values may also need to be updated in the same way
- --NAM may need to be populated to distinguish between labs performing the testing
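The points above can be sketched as one collation step. This is an illustration under simple assumptions: records are dictionaries, each source is labeled with the lab name, and the returned mapping is what you would use to update any RELREC records that refer to the old --SEQ values. The function name is hypothetical.

```python
def collate_and_resequence(sources, seq_var, nam_var, subject_var="USUBJID"):
    """Merge records from several sources into one domain dataset.

    sources: dict mapping a source label (e.g., lab name) -> list of records.
    Returns (merged_records, seq_map) where seq_map maps
    (source label, subject, old seq) -> new seq.
    """
    merged = []
    seq_map = {}
    next_seq = {}  # last --SEQ handed out per subject
    for source_name, records in sources.items():
        for record in records:
            subject = record[subject_var]
            new_seq = next_seq.get(subject, 0) + 1
            next_seq[subject] = new_seq
            seq_map[(source_name, subject, record[seq_var])] = new_seq
            merged.append({
                **record,
                seq_var: new_seq,
                # Fill --NAM from the source label when it is missing.
                nam_var: record.get(nam_var) or source_name,
            })
    return merged, seq_map
```

If two labs each sent an LB record with LBSEQ 1 for the same animal, the second one is renumbered, and `seq_map` tells you which old value became which new value.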
How is versioning handled for different versions of a SEND package for a study?
When providing interim datasets, do you provide deltas or full loads?
First, SEND packages are technically only considered valid or complete when they contain all data - there are no specifications on providing partial data (e.g., deltas between versions). Subsequent versions of a package (version 2, 3, etc.) would be cumulative from past versions, including the existing data and any data collected since the last package.
In a number of operational cases, it is necessary or useful to be able to tie together records between a current version and the version prior, such as for the detection of deltas. There are a couple options for this:
- To definitively identify a record as being the same record across versions of a package, use the RECID variable (invariant record ID), new in SENDIG 3.1. Values might typically source from internal database IDs for the source records which do not change over time. If this is feasible to provide, this is preferable, since it then gives the consuming organization a definitive way to detect which records are new vs updated vs removed.
- If the consuming body(-ies) is not interested in figuring out deltas, you could just submit the new version of the package in its entirety.
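When an invariant record ID is provided, a consuming organization can work out the deltas between two full packages along these lines. This is a minimal sketch: records are dictionaries, the key name is passed in (shown here with the RECID name used above), and the function name is hypothetical.

```python
def package_delta(previous, current, key="RECID"):
    """Compare two versions of a domain dataset by invariant record ID.

    Returns a dict with sorted lists of IDs that are new, removed, or
    present in both versions with different content.
    """
    prev = {record[key]: record for record in previous}
    curr = {record[key]: record for record in current}
    return {
        "new": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        "updated": sorted(
            rid for rid in curr.keys() & prev.keys()
            if curr[rid] != prev[rid]
        ),
    }
```

Because each package is cumulative, both inputs are full loads; the delta is derived by the consumer rather than shipped by the producer.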
Are calculated results reported in SEND?
Calculated results are reported in SEND if you would have included those results in your individual animal appendix in the submission.
Why are there entries in SEND files such as body weight gains and organ weight ratios, when that information can be derived from information in other SEND files?
BG and the relative organ weights in OM were added because they are used in most submissions as individually reported endpoints (and SEND is focused on the individual animal data). Including them as separate endpoints removes any ambiguity or duplicated calculation on the reviewers' part. In the future, these endpoints may be removed as analysis modeling matures for nonclinical data (e.g., through ADaM).
How do I manage unscheduled data (i.e., data collected in an unscheduled interval)?
For unscheduled findings, leave the VISITDY (the planned study day) blank. TPT and TPTNUM values would also be null.
How do I indicate derived records?
When should I use --DRVFL?
What is --DRVFL used for?
Can I use --DRVFL to indicate any derived records?
There are two flavors of derived to consider.
- When the record in the dataset is derived from other records in the dataset, then (and only then) you can use the --DRVFL variable to distinguish the derived record from the ones going into the derivation. One example is blood pressure readings, where the result record is picked or averaged from 3 constituent readings; the flag indicates which is the derived record (and in this case, the one to keep). It cannot be used if the record was derived from information outside the dataset; this constraint means it has limited use.
- When the value is calculated through a calculation, algorithm, etc. from data often outside the dataset, then this status can be indicated via the define file (i.e., whether the test was collected, derived, etc.).
SENDIG 3.1 and later versions describe this in more detail.
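The blood pressure example from the first case can be sketched as below: three readings in the same dataset are averaged into one extra record, and only that record carries the derived flag. The variable names passed in the usage note are illustrative, the function name is hypothetical, and how the other result variables should be populated on a derived record is outside the scope of this sketch.

```python
def add_mean_record(readings, result_var, flag_var, seq_var):
    """Return the constituent readings plus one derived (mean) record.

    The derived record copies the identifying fields of the first
    reading, takes the next free sequence number, and is flagged "Y";
    the constituent records keep an empty flag.
    """
    values = [r[result_var] for r in readings]
    derived = {
        **readings[0],
        result_var: sum(values) / len(values),
        flag_var: "Y",
        seq_var: max(r[seq_var] for r in readings) + 1,
    }
    constituents = [{**r, flag_var: ""} for r in readings]
    return constituents + [derived]
```

For instance, readings of 120, 122 and 124 passed with `result_var="VSSTRESN"`, `flag_var="VSDRVFL"` and `seq_var="VSSEQ"` produce a fourth record holding the mean and flagged as derived.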
Controlled Terminology (CT) and Formats
For more information on Controlled Terminology, check out the SEND Implementation Wiki - CT Fundamentals page.
Working with Versions of CT
How frequently are controlled terminology files updated?
CDISC/NCI releases controlled terminology in "packages." New packages are released as needed throughout the year, generally 2-4 per year.
Where can I find the most recent controlled terminology?
SEND terminology is available for direct download from the CDISC SEND directory on an NCI File Transfer Protocol (FTP) site in Excel, text, odm.xml, pdf and html formats.
Where can I find old controlled terminology versions?
All published versions of SEND controlled terminology are in the Archive folder of the CDISC SEND directory on the NCI File Transfer Protocol (FTP) site. The date included in the file name is the date of publication.
Should the controlled terminology version used be the same across a single study?
Are multiple controlled terminology versions okay within a study?
The same version must be used throughout a study: it is required that, within a study, only one version of controlled terminology is used. Additionally, it is expected that data from multiple contributors (e.g., different CROs contributing data to a study) is aligned for the study.
Should the controlled terminology version used be the same for all studies in a submission?
Not necessarily. The only requirements are that the same controlled terminology version be used within a study, and that the version is what is (reasonably) up-to-date as of the creation of the package. Especially with submissions spanning many years, it is very possible for there to be different CT versions across studies and even different versions of the SENDIG. The only expectation per the draft guidance is that the versions used at the time of creation are current within a reasonable time frame (what is "reasonable" is being formulated and is expected to be stipulated in the final guidance). The same considerations apply to versions of the SENDIG as well.
How do I know what version of controlled terminology was used with a dataset?
In the TS (Trial Summary) domain, when TSPARMCD=SENDCTVER, the TSVAL variable contains the SEND controlled terminology version.
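A minimal sketch of that lookup, assuming the TS dataset has already been read into a list of dictionaries (the loading step is not shown, and the TSVAL shown is a placeholder rather than a prescribed format):

```python
def send_ct_version(ts_rows):
    """Return TSVAL for the row where TSPARMCD is SENDCTVER, or None if absent."""
    for row in ts_rows:
        if row.get("TSPARMCD") == "SENDCTVER":
            return row.get("TSVAL")
    return None

# Hypothetical TS content for illustration only.
ts_rows = [
    {"TSPARMCD": "SPECIES", "TSVAL": "RAT"},
    {"TSPARMCD": "SENDCTVER", "TSVAL": "2016-12-16"},  # placeholder value
]

ct_version = send_ct_version(ts_rows)
```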
How do I get the OpenCDISC Validator to validate against a particular version of CT?
The OpenCDISC Validator runs off the same terminology files that you can download (e.g., "SEND Terminology.txt") and comes automatically packaged with whatever version is active the last time the Validator was published (and so it can be out of date). To update the CT against which it validates, or to swap in a particular version of the CT, please see the CT Fundamentals page, under the "Updating the OpenCDISC Validator Configuration" section.
Does the case of the controlled terms matter?
Yes. When using controlled terminology in a SEND dataset, you must use the submission terms exactly as they are listed in the controlled terminology file.
Do I need to change my units when mapping to ORRESU?
Do I need to convert units to map to ORRESU?
When mapping to ORRESU, the key is to map to the same unit concept (but the label might be different). The specific unit label used by an organization may differ from the SEND CT preferred label; however, the same concept is there (just as a synonym). For example, for the gram unit, a sponsor might use a label of "grams" or "G" internally, but the SEND preferred term is "g" - this is still the same conceptual unit, just with a different preferred label. Another example is the unit label of "ng/mL", whose submission value is "ug/L" - same unit, just a different label ("ng/mL" is in the synonyms list for "ug/L"). If you are mapping to SEND and do not see your unit represented in the CDISC Submission Value column, check the CDISC Synonym(s) column as well.
As this is just a label change, at no point should a value conversion be performed (or needed) on your original result value. For example, if you collected a value of "30 ng/mL", your value is still 30, just with a different label for the unit.
If you have a unit that is simply not on the Unit codelist in any shape or form, then you have a case for a new term request, which can be submitted through the new CT term request form.
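The relabeling described above can be sketched as a lookup against the submission values first and the synonyms second. The codelist excerpt below is hypothetical and hand-typed for illustration; a real implementation would read the published terminology file, and the function names are assumptions.

```python
# Hypothetical excerpt of the Unit codelist: submission value -> synonyms.
UNIT_CODELIST = {
    "g": ["gram"],
    "ug/L": ["ng/mL"],
    "g/L": ["mg/mL"],
}

def map_unit(label):
    """Return the submission value for a unit label, or None if it is not listed."""
    if label in UNIT_CODELIST:  # already the submission value (case-sensitive)
        return label
    for submission_value, synonyms in UNIT_CODELIST.items():
        if label in synonyms:
            return submission_value
    return None  # candidate for a new term request

def relabel(value, unit):
    """Swap the unit label only; the collected value is never converted."""
    return value, map_unit(unit)

result = relabel("30", "ng/mL")
```

Note that the value passes through untouched; only the label changes.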
Are all controlled terminology for tissues in the singular form (KIDNEY)?
In general, yes. The --SPEC variable to which tissues are mapped represents the material type of the specimen, which is typically inherently singular (there are some exceptions for cases where the tissue is generally considered as a unit, e.g., MENINGES of the brain). Plurality for most tissues is defined through the qualifying variables of --LAT (e.g., LEFT, RIGHT, BILATERAL, UNILATERAL) and, less commonly, --PORTOT (e.g., "MULTIPLE", "SEVERAL").
What do I do if I have terms that are used in our organization that do not map to controlled terminology?
If the variable that you will report the term in uses controlled terminology, and the associated controlled term list is not extensible, then you must find a controlled term to which to map your term (if you do not, you will get validator errors). If the controlled term list is extensible, then you can report the term that you use, so long as it meets the basic requirements of the variable field for length and characters used, and you should also suggest your term for inclusion in Controlled Terminology. The CDISC New Term Request web page handles suggestions for both new terminology and changes to existing terminology.
The implementation guide mentions "ISO 8601 format" and "ISO 8601 character format". Is there a difference?
No. ISO 8601 format as it is referenced in SDTM refers to a standardized way to specify a date or datetime in character format. All date/datetime fields in SEND adhere to this format. The SENDIG provides some guidance on how to do this in the "4.4.1 FORMATS FOR DATE/TIME VARIABLES" section.
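As a small illustration, Python's standard library can produce ISO 8601 character strings directly; the dates below are arbitrary examples, and the SENDIG remains the reference for which precision to report.

```python
from datetime import date, datetime

# Arbitrary example values, not taken from any study.
collected_date = date(2016, 12, 18)
collected_datetime = datetime(2016, 12, 18, 9, 30, 0)

date_text = collected_date.isoformat()          # date only
datetime_text = collected_datetime.isoformat()  # date and time joined by "T"
```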
How are --TEST and --TESTCD related?
How do I link a --TEST term to a --TESTCD term?
Certain variables in the SENDIG are paired, in which case their values strongly interrelate. In the case of --TEST and --TESTCD, the pairing is enforced by the IG, and it is a 1:1 relationship. However, especially in domains whose paired variables have hundreds of terms, it can be unclear how to link terms that are intended to be paired with one another. In these cases, the synonym is set to the same value for the two terms, and so it can be used to determine which corresponding term should be used. Typically, you would start by finding the term of interest in the --TEST list (as the long text is more human readable). From there, you would note what it has in the synonym field. Finally, within the --TESTCD list, you find the --TESTCD term that has the same value in its synonym field.
What do I do when I have two different --STRESU values for the same --TEST?
Currently, some validator applications raise a warning/error if you have differing --STRESU values within a --TEST. Unfortunately, this is a very real possibility, such as when the standard unit differs based on --CAT or --METHOD, as is the case with the LB domain. This is a known issue; for now, populate with the values you intend and be prepared to explain the valid reason why this rule triggers.
Numerical Data (Measurements Data)
I collect Body Weight in kilograms for Male and in grams for Female. How do I have to export the data in SEND?
SEND has different variables for the original versus standardized results:
- --ORRES reflects your original result in whatever units under which it was collected
- --STRESC/--STRESN reflect your result in the units of your choosing (e.g., if units are standardized for reporting)
So, in the example given in the question, --ORRES would be the result as collected (in kg for males and g for females), and if your reported tables would all be in terms of kg (in other words, you choose to standardize to kg), then your standardized results would be the results in kg form (e.g., males as is and females converted).
Units (--ORRESU and --STRESU) are whatever units apply to the results above, but mapped to their scientifically synonymous Controlled Terminology preferred term. For instance, the original unit of "kilograms" for males would be mapped to "kg" in --ORRESU, and the original unit of "grams" for females would use the CT term of "g" in --ORRESU (note - no unit change; just a label change). Likewise, all results would have "kg" in the --STRESU instead of the "kilograms" label used in reporting.
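The body weight example can be sketched as below, assuming the sponsor chose kg as the standardized unit. The label and divisor tables are illustrative, and formatting of the standardized character result is simplified.

```python
# Illustrative lookups: sponsor unit label -> CT label, and CT label -> divisor to kg.
UNIT_LABELS = {"kilograms": "kg", "grams": "g"}
DIVISOR_TO_KG = {"kg": 1, "g": 1000}

def body_weight_row(value_text, collected_unit):
    """Build the result variables for one body weight, standardizing to kg."""
    orresu = UNIT_LABELS[collected_unit]  # label change only
    stresn = float(value_text) / DIVISOR_TO_KG[orresu]
    return {
        "BWORRES": value_text,   # as collected, never converted
        "BWORRESU": orresu,
        "BWSTRESC": str(stresn),
        "BWSTRESN": stresn,
        "BWSTRESU": "kg",
    }

male = body_weight_row("0.31", "kilograms")
female = body_weight_row("250", "grams")
```

Only the female record undergoes a conversion, and only in the standardized variables; both original results keep the collected value.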
Do I have to convert my results to different units for --ORRESU and --STRESU?
No. There are no stipulations that you must use particular units, only that you use controlled labels for the units that you did use. The Controlled Terminology that applies to --ORRESU and --STRESU only enforces the preferred label for the same base concept, not a preferred unit you should be using. You should use the units under which you collected your result (--ORRESU) and the units you reported (--STRESU), as mapped to Controlled Terminology preferred labels. For instance, if you collected with a unit of mg/mL, you would use the preferred label of "g/L" (scientifically equivalent to "mg/mL"), and your result (--ORRES) will not change. You should not perform any unit conversions when including your original result/unit, only re-labeling. Consequently, for the standardized results, you would only convert units if you converted units for reporting/submission.
What do I do when the precision differs between collection and reporting?
My original result was collected with X sig figs, but we reported with Y sig figs. What significant figures should I use in SEND?
--ORRES is meant to store the original result, and --STRESN/--STRESC represent the standardized, aka reported, value. In theory, the two should be a unit change away from each other. However, therein lies an issue in that it is possible for the original result to have been collected with a different precision than that used for the reported result.
What to do in this situation for --ORRES is currently not clear; however, --STRESC/--STRESN should definitely be populated with the reported result, character/numeric, respectively. For --ORRES, there are several approaches being used, but the predominant one follows the description of the --ORRES variable, which is to populate with the value as collected (in raw precision). To the objection that "they should be the same value, save a unit change", the response is that such rounding is usually only done to levels where greater precision is considered meaningless scientifically.
In the SEND implementation guide, in the CDISC notes for the --STRESN variable for some domains, it mentions "continuous or numeric results". What is a continuous result in this context?
The SENDIG discusses this under the ORIGINAL AND STANDARDIZED RESULTS section. In summary, though, --STRESN is meant to contain the numeric representation of what is in --STRESC, provided that what is in --STRESC is actually to be considered a number. If the --STRESC values do not actually represent a number but instead a code for something, such as is in the case with scores or graded scales, then it is likely inappropriate to populate --STRESN, as these values should not be considered numeric for the purposes of calculations.
How do I handle clinpath results which are above or below limits of quantification (BLQ), such as <1.0?
The SENDIG discusses this under the ORIGINAL AND STANDARDIZED RESULTS section. Briefly, this value is not actually a numeric value, so --STRESN should be left null. However, a value of 1 may actually be used in calculations; the guide directs how to populate a --CALCN variable (usually as a SUPP-- variable) to contain this information.
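A minimal sketch of the --STRESN rule from the last two answers: populate it only when --STRESC is a true number. The numeric_test parameter is a hypothetical switch for tests whose values are codes (scores, grades); populating --CALCN is left out because the rule for it lives in the SENDIG.

```python
def stresn_from_stresc(stresc, numeric_test=True):
    """Return the numeric form of --STRESC, or None when it should be left null."""
    if not numeric_test:  # scores and graded scales are codes, not numbers
        return None
    try:
        return float(stresc)
    except ValueError:    # e.g. "<1.0" is not a numeric value
        return None

quantified = stresn_from_stresc("12.5")
below_limit = stresn_from_stresc("<1.0")
score = stresn_from_stresc("2", numeric_test=False)
```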
(SENDIG 3.0 only) What should you do when LBORNRLO and LBORNRHI are text (since the variables are Num)?
You can set up those variables as text and then explain it in the study data review guide.
Note: this issue has been addressed in later versions of the SENDIG.
Categorical Data (Findings Data)
I have food consumption observations on my study such as "reduced food intake". How do I report these data in SEND?
Currently, the FW domain does not permit observational data. Report these observations in the CL domain. This has been logged as a requested change to the SEND standard.
What do I do with clinical observations modifiers that do not have a specific variable for them (like color)? The implementation guide says a supplemental qualifier --RESMOD isn't "expected to be used".
This information will generally be embedded in the CLORRES variable as part of the text string of the original finding. If there isn't a "bucket" for it already, then it is generally not considered important and thus optional (since this information is represented in CLORRES if desired). However, if it is useful for operational needs to include it as a separate variable, this is exactly the purpose for which SUPP-- was devised. Through SUPPCL, you can fabricate any additional variables you want (although be sure to describe them in your define.xml file).
If I have a comment for FW domain and my data are pooled, how do I set the USUBJID in SEND CO dataset?
Currently, the SEND CO domain is not able to use the POOLID. You can leave USUBJID empty (USUBJID is an Expected field, not Required). A future version of the SENDIG will address this.
Define File
What is a define file? Must one be sent with each SEND package?
Each define file is specific to a package for one study and is the roadmap to the overall content for that study (e.g., which datasets and columns are present). It allows you to explain individual anomalies in your data and connect them to a specific field in a specific domain or it can direct the reviewer to apply the concept to your entire dataset. User-defined controlled terminology also is contained here. The define file must be submitted with each SEND package.
Can the define file be submitted as PDF or does it need to be XML?
The strong preference (as mentioned in the SENDIG) is to submit the define file in XML format.
Must the define file contain the mapping from source field(s) to Controlled Terminology field(s)?
It is not required for a submission to submit the mapping between raw source values and their mapped CT equivalents. Some mappings can include multiple columns, such as a single tissue (e.g., "adrenals") being mapped to multiple terms (e.g., SPEC="GLAND, ADRENAL"; LAT="BILATERAL").
Must the define file contain definitions for sponsor-specific Controlled Terminology?
It is a good idea to list sponsor-specific terms in the define file. Specifically, if there are coded terms, such as 1=minor, 2=moderate, then the decoding should be provided.
Must the define file contain the formulae for calculated measurements?
No; there is no requirement to include the formulae.
SDRG
What is an SDRG? Must one be sent with each SEND package?
An SDRG is a document which is meant to aid the reader in understanding the SEND dataset in the context of the study report. The Technical Conformance Guide published by the FDA states that it "...is recommended as an integral part of a standards-compliant study data submission."
What is the proper format of an SDRG? Does it need to contain anything specific?
While the content and format of an SDRG are not mandated, the FDA has provided some expectations in the Technical Conformance Guide. Templates have been created by PhUSE working groups for clinical and nonclinical studies. The nonclinical template is currently out for public review (which closes on 31 October 2015). The draft template, guide, and examples can be found here: http://www.phusewiki.org/wiki/index.php?title=Nonclinical_SDRG_Template_and_Guide
Domain Specifics
Is the Trial Design (TE, TA, TX, TS) meant to be planned or actual?
The Trial Design domains are meant to describe the planned design (i.e., that which the protocol and amendments prescribe). The actual progression of elements resides in other domains, such as Subject Elements (SE) and Exposure (EX).
Is it acceptable to use "Last day of element" as the endrule in TE?
The start and end rules should not self-reference the trial design (i.e., referring to epochs, arms, elements, etc.); instead, base them on study concepts or events. If you desire a more generic end rule, then consider anchoring to the end of dosing, such as "Last day of dosing with X", so that it is based on something tangible and readable.
Do I have to submit each dose?
The SENDIG allows you to choose. You can submit 1 record for each dose or you can submit 1 record for each period of consistent exposure for an animal (e.g., same lot, test material, route of administration, dose frequency, etc.). For example, if all animals received 1 dose a day with the same lot for the entire study, then you could submit 1 row per animal to describe the exposure details.
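The second option can be sketched as collapsing consecutive doses that share the same attributes into one record per period. The grouping key below is an assumption about which attributes define "consistent exposure" for a given study, and the dose records are invented.

```python
from itertools import groupby

KEY_VARS = ("EXTRT", "EXLOT", "EXROUTE", "EXDOSFRQ", "EXDOSE")

def collapse_doses(doses):
    """Collapse one animal's day-ordered doses into periods of consistent exposure."""
    periods = []
    for attrs, group in groupby(doses, key=lambda d: tuple(d[v] for v in KEY_VARS)):
        group = list(group)
        period = dict(zip(KEY_VARS, attrs))
        period["EXSTDY"] = group[0]["EXSTDY"]   # first day of the period
        period["EXENDY"] = group[-1]["EXSTDY"]  # last day of the period
        periods.append(period)
    return periods

# Invented doses: the lot changes on day 4.
doses = [
    {"EXTRT": "Drug X", "EXLOT": "A" if day < 4 else "B", "EXROUTE": "ORAL GAVAGE",
     "EXDOSFRQ": "QD", "EXDOSE": 10, "EXSTDY": day}
    for day in range(1, 6)
]

periods = collapse_doses(doses)
```

Here five daily doses reduce to two records, one per lot.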
Clinical Observations (CL)
If Clinical Observations are marked as not taken, what should be the CLTEST and CLTESTCD?
Consider using a generic term here, such as "Clinical Observation" for CLTEST and "CLINOB" for CLTESTCD. This case was not originally considered because a planned clinical observation session that was not executed was not expected to occur; most organizations did not capture that information and reported by exception. Using a generic term like this makes it clear that you did "Not take" a "Clinical Observation", which is what is being represented.
PK Domains (PC and PP)
Can PK data be put into SEND format? Is there a specific template or type of file to upload for PK data?
The PC (concentrations) and PP (parameters) domains have been constructed to handle PK/TK data.
How do you populate PCTESTCD/PCTEST vs. PPCAT?
Below are some principles from the IG to keep in mind:
- PCTESTCD and PCTEST have a 1:1 relationship and define the analyte/specimen (in many cases, these are effectively the analyte code and name, respectively).
- PCTESTCD is limited to 8 characters, can have no special characters, and cannot begin with a number. Often, preparers will populate this with a truncated version of the analyte's identifier if it is unlikely that two analytes with the same final digits will be included on one study, e.g., if compound ABC-1234567 is the test article, this is likely to also be one of the analytes and might be shortened to "ABC12345".
- PCTEST is limited to 40 characters. This is often just the analyte's name or identifier. In some longer treatment names, a raw truncation would not be unique enough, such as cases where the treatment names are verbose. In these cases, the preparer might try a more educated abbreviation or an internal identifier for the analyte. If the analyte's identity wouldn't be clearly understood from these fields, an explanation in the study data reviewer's guide may be necessary.
- PPCAT must equal a PCTEST value - these being equal drives the natural link between PC and PP for all but the most complicated PK cases.
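These principles lend themselves to simple pre-submission checks. The sketch below is one possible reading: the pattern allows underscores, which is an assumption beyond the wording above, and the helper names and sample rows are hypothetical.

```python
import re

# Up to 8 characters, letters/digits/underscore only, not starting with a digit.
TESTCD_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")

def is_valid_testcd(code):
    return TESTCD_PATTERN.fullmatch(code) is not None

def make_testcd(identifier):
    """Strip special characters and truncate to 8 characters (raw truncation)."""
    return re.sub(r"[^A-Za-z0-9]", "", identifier).upper()[:8]

def unlinked_ppcat(pc_rows, pp_rows):
    """Return PPCAT values that have no matching PCTEST value."""
    pctests = {row["PCTEST"] for row in pc_rows}
    return sorted({row["PPCAT"] for row in pp_rows} - pctests)

pc_rows = [{"PCTESTCD": make_testcd("ABC-1234567"), "PCTEST": "ABC-1234567"}]
pp_rows = [{"PPCAT": "ABC-1234567"}, {"PPCAT": "XYZ-999"}]

orphans = unlinked_ppcat(pc_rows, pp_rows)
```

A raw truncation like this still needs a manual check that no two analytes on the study collapse to the same code.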
How Do I Handle PK Data or Analysis with CROs/External Labs?
A number of considerations arise when initiating a conversation with another organization around SEND file production. The SEND between Organizations page has some tips and an extensive "Points to Consider" question list to help smooth this process.
In addition, the following page has some files that may be of use: Template to Facilitate Creating Pharmacokinetic SEND Datasets
Use of SEND Packages by the FDA/Industry
What data mining opportunities will SEND enable?
Data standardization is the first step in the chain to realizing cross-study querying and data mining. SEND is expected to open the door for such data mining, although the benefits will not be realized until either sufficient time has passed to build a significant database of historical data or studies are converted and loaded into repository systems to facilitate such queries. One of WG6's foci has been to identify these key areas and to facilitate and drive progress.
What software is used by reviewers to visualize and review SEND data?
The FDA uses the Nonclinical Information Management Solution (NIMS), including ToxVision, provided by PointCross, as well as SAS JMP; however, they do not endorse any particular vendor or tool. There are SEND solutions available from PointCross and most nonclinical data acquisition software vendors (e.g., Instem, Xybion, PDS, etc.) which provide the ability to produce SEND datasets, with varying levels of analysis/review options available.
What exactly needs to be included in a complete SEND package that is ready to submit to the FDA?
See the FDA Test Submission site for more information on getting started.
A typical package includes:
- The basic minimums specified in the SENDIG (usually TS, TX, TA, TE, SE, DM, EX)
- Whatever endpoints you reported which have a domain in SEND
- The define file (usually define.xml)
- An SDRG (Study Data Reviewer's Guide)
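A quick completeness check against the usual minimum can be sketched as follows. The lowercase .xpt file naming and the define.xml name are assumptions for the example, and the domain list simply mirrors the "usually" list above rather than any binding requirement.

```python
USUAL_MINIMUM_DOMAINS = ("TS", "TX", "TA", "TE", "SE", "DM", "EX")

def missing_items(filenames):
    """Return expected package files that are absent from the given file names."""
    expected = {domain.lower() + ".xpt" for domain in USUAL_MINIMUM_DOMAINS}
    expected.add("define.xml")
    present = {name.lower() for name in filenames}
    return sorted(expected - present)

# Hypothetical package listing that is missing most of the trial design domains.
gaps = missing_items(["ts.xpt", "dm.xpt", "bw.xpt"])
```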
In addition, for the submission, the following are generally needed:
- A cover letter (e.g., summarizing what's in the submission, reiterating some of the information provided when initiating the submission process, etc.; see FDA site / contact FDA for more information)
See this link for the Nonclinical Standardization Roadmap group's roadmap, a general (but not binding) indication on relative priority of domains:
SEND Future
Is Safety Pharm included in SEND?
No. Safety Pharm is currently being developed and piloted. Certain data types have been modeled (such as ECG, blood pressure, pulmonary, respiratory), but others are still in review (such as CNS: FOB and neuro/neurotox endpoints).
Is DART / Repro Tox included in SEND?
Not yet. Repro is currently being developed. A draft guide has been posted for review for Embryo-fetal developmental (EFD) studies (see http://www.cdisc.org/send), with other study types to be included in later releases.
Will SEND be developed to cover non-GLP studies?
Assuming that the study type has endpoints covered under SEND, then SEND can certainly be used to model the data for a study. Other endpoints may be added at a later time. However, for endpoints with small volumes of data and/or high variability between studies, standardization into SDTM/SEND may not be an undertaking that would add value to the overall process.
Will SEND be developed to cover neurotox studies?
See the "Is Safety Pharm included in SEND?" question above.
Will SEND be developed to cover phototox studies?
Some parts of many phototoxicity studies can be represented in SEND 3.0; however, they are not within the scope of studies for the current SENDIG version.
Will SEND be developed to cover genetic tox studies?
In 2015, the SEND team formed a subteam to begin to model certain types of genetic tox studies; however, draft domains are not yet available. See the "Are other endpoints being developed?" question.
Are Anti-drug Antibody (ADA) and Flow Cytometry Expected?
If ADA and Flow Cytometry data exist as part of a single-dose, repeat-dose, or carcinogenicity study, must they be included in a SEND submission?
ADA and Flow Cytometry data can generally be modeled in LB, possibly supplemented with some additional SUPPQUAL variables. However, they are not considered officially in scope at present.
Are other endpoints being developed?
For the time being, the SEND team is focused on Repro and Safety Pharm. In 2015, the SEND team formed a subteam to assess the modeling of dermal and ocular data as well as some genetic tox data. RE and CV domains (for safety pharm) will be published in version 3.1 of the SEND IG; an NV domain for CNS studies is still in development. An Implementation Guide covering Repro EFD studies underwent public review in the summer of 2015. See the Prioritization of Nonclinical Data page for information on the prioritization of endpoints captured from the industry and compiled by the Nonclinical Standards Roadmap group.
As with any currently unmodeled endpoints, the FDA encourages the submission of data, using a SEND domain (if it makes sense) or creating a custom domain.
When will the XPT files be replaced with XML?
The FDA is currently working on this, discussing and determining the pros and cons of some of the available solutions (HL7-based implementation, ODM, etc.).
Will SEND be developed to cover veterinary records (treatment recommendations and observations from the exams)?
Vet records, among other endpoints, are on the roadmap; however, it is not determined when they will be modeled. See the WG6 Nonclinical - Standardization Roadmap team for more details.
Open Questions
The following are open questions, without an answer at present. Most involve the Change Control Tracker (CCT) for an answer from the larger SEND team or others.
My study data does not have all of the matching elements for SEND; what do I do?
Example: not all of the scheduling information that SEND wants (e.g., ELTM) is available.
Example: legacy processes combined scheduled and unscheduled clinical observations; how to distinguish which should have VISITDY populated?
If it was previously distinguished for reporting or other means through some form of algorithm, then you will need to replicate that logic in the population of the SEND variables. For the clinical observation example, if the detection of which observations were unscheduled was through text-matching on the clinical observation type="Unscheduled", then that same logic would need to feed the processes creating the SEND data.
However, if this information is simply not available or parseable from the data, then the approach is still to be determined through the CCT.
What should be done when the only variable distinguishing records is the unit (e.g., --ORRESU)?
For some lab tests such as protein, there can be multiple results for the same TEST and METHOD but with different calculations, resulting in multiple results with different units.
CCT item submitted (#83).
Don't See Your Question?
SEND Implementation Forum - Ask it here!
Last revision by Troy.smyrnios, 2017-04-3