Difference between revisions of "WG5 P02 Programming Guidelines"

From PHUSE Wiki
m (Conventions for macro parameter names)
m (Conventions for macro parameter names)
 
(46 intermediate revisions by the same user not shown)
Line 7: Line 7:
 
'''Starting Point:'''<br/>
 
'''Starting Point:'''<br/>
 
* [[Good Programming Practice Guidance]]
 
* [[Good Programming Practice Guidance]]
* [http://www.phusewiki.org/wiki/index.php?title=Good_Programming_Practice GPP, PhUSE publication]
+
* also published as [[File:GPP_Guidance_Document_v1.1.docx]]
 +
** "The guidance aims to show how to produce well-structured and well-documented programs so that they are '''easy to read and maintain''' over time. It is meant to be applicable to all programs, and hence all programmers regardless of experience." (emphasis added)
 
<br/>
 
<br/>
  
 
== OVERVIEW ==
 
== OVERVIEW ==
  
The PhUSE/CSS library contains 4 types of programs
+
In general, the PhUSE CS scripts should adhere to 3 general principles:
* Template programs ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/WPCT WPCT folder])
+
* Create '''learning''' opportunities for our community. This drives how we write and organize documents, specifications, test data, scripts, etc.
:* Produce a specified data display.
+
:* highly accessible and readable components allow us all to improve our expertise, technical capabilities and understanding of standards
:* Clearly present core statistical steps relevant to specified analyses.
+
* This includes ensuring that these documents '''reference''' relevant language specifications and industry standards
:* Explicitly assert assumptions about the data and environment, via %ASSERT* macros.
+
:* e.g., to R or SAS online documentation pages for sophisticated or subtle techniques
:* Hide generic discovery and processing irrelevant to specified analyses, via %UTIL* macros.
+
:* e.g., to CDISC SDTM or ADaM guidelines to explain test data structures
 +
* And keep all components as '''simple''' as possible, but not simpler :-)
 +
:* e.g., provide basic functionality as required or suggested by white papers
 +
:* e.g., provide basic but not full user configuration
 +
:* e.g., keep all code relevant to statistical analysis and presentation visible, accessible for additional user customization
 +
:*... but OK to move basic data-driven discovery into functions or macros. For example, SAS record-based processing makes it somewhat cumbersome to identify unique values in a variable. It's OK to move this into a utility macro, to replicate R-style data discovery.
 +
<br/>
 +
 
 +
The PhUSE CS repository contains 4 types of programs
 +
 
 +
* Standard Scripts ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/WPCT WPCT folder])
 +
** Produce a specified data display.
 +
** Clearly present core statistical steps relevant to specified analyses.
 +
** Explicitly assert assumptions about the data and environment, via %ASSERT* macros.
 +
** Hide generic discovery and processing irrelevant to specified analyses, via %UTIL* macros.
 +
** '''naming convention:''' <white-paper-id>_<display-id>.<lang>
 +
*** '''Example:''' [https://github.com/phuse-org/phuse-scripts/blob/master/whitepapers/WPCT/WPCT-F.07.01.sas WPCT-F.07.01.sas]
 +
 
 +
* Test scripts ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/qualification qualification folder, named '''test_<program-name>'''])
 +
** Establish expected results for intended functionality in standard, assertion and utility scripts
 +
** '''naming convention:''' test_<program-name>.<lang>
 +
*** '''Example:''' [https://github.com/phuse-org/phuse-scripts/blob/master/whitepapers/qualification/test_assert_dset_exist.sas test_assert_dset_exist.sas]
 +
 
 
* Assertion macros ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/utilities utilities folder, prefix '''assert_'''])
 
* Assertion macros ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/utilities utilities folder, prefix '''assert_'''])
 +
** [ SAS focus ]
 
** Test conditions in the data and environment, and  
 
** Test conditions in the data and environment, and  
** inform the end user in case of unexpected results
+
** inform the end user in case of unexpected conditions, invalid states
 +
** '''assertion naming convention:''' assert_<assertion-description>.<lang>
 +
*** '''Example:''' [https://github.com/phuse-org/phuse-scripts/blob/master/whitepapers/utilities/assert_dset_exist.sas assert_dset_exist.sas]
 +
 
 
* Utility macros ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/utilities utilities folder, prefix '''util_'''])
 
* Utility macros ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/utilities utilities folder, prefix '''util_'''])
 +
** [ SAS focus ]
 
** Accomplish discovery and processing tasks that are needed,
 
** Accomplish discovery and processing tasks that are needed,
 
** but that are not particularly relevant to the analyses.
 
** but that are not particularly relevant to the analyses.
 
** Implementation of these tasks has no impact on the interpretation of results
 
** Implementation of these tasks has no impact on the interpretation of results
* Test programs ([https://github.com/phuse-org/phuse-scripts/tree/master/whitepapers/qualification qualification folder, named '''test_<program-name>'''])
+
** '''utility naming convention:''' util_<utility-description>.<lang>
** Establish expected results for intended functionality in template, assertion and utility programs
+
*** '''Example:''' [https://github.com/phuse-org/phuse-scripts/blob/master/whitepapers/utilities/util_access_test_data.sas util_access_test_data.sas]
 +
<br/>
 +
 
 +
== TO DO for these PhUSE CS project guidelines ==
 +
 
 +
* Create standard program header, as recommended in [[Good Programming Practice Guidance|GPP Guidance]], above
 +
* Better clarity concerning language. These guidelines are currently SAS-focused, but we want to deliver R scripts, as well.
 +
 
 
<br/>
 
<br/>
  
Line 52: Line 87:
  
 
* '''CSS_''' prefix for all WORK data sets  
 
* '''CSS_''' prefix for all WORK data sets  
** DO NOT overwrite data sets that could help the user debug their data & changes
+
** DO NOT overwrite data sets that could help the user debug their data & changes ([[Good Programming Practice Guidance|GPP Guidance]])
 
** DO delete other WORK data sets as soon as they are obsolete
 
** DO delete other WORK data sets as soon as they are obsolete
  
 
* headers contain a '''TO DO''' list, to facilitate contribution
 
* headers contain a '''TO DO''' list, to facilitate contribution
** '''TO DO''' placeholders within the script can also help contributors properly incorporate new code
+
** '''TO DO''' placeholders within the script can also help contributors incorporate new code
 
* Header: see notes on "Comments", below
 
* Header: see notes on "Comments", below
 
* Spacing and alignment
 
* Spacing and alignment
** align code with space characters, never tabs. '''set your editor to replace tabs with spaces.'''
+
** align code with space characters, never tabs. '''set your editor to replace tabs with spaces.''' ([[Good Programming Practice Guidance|GPP Guidance]])
 
** use consistent number of spaces to indent within a single program
 
** use consistent number of spaces to indent within a single program
 
** 2-space indents are preferred (not more). set your editor to 2-space indenting, replacing tabs with spaces.
 
** 2-space indents are preferred (not more). set your editor to 2-space indenting, replacing tabs with spaces.
Line 75: Line 110:
 
** use the full keyword to support clarity and readability
 
** use the full keyword to support clarity and readability
 
** create a good experience for end-users of all skill levels
 
** create a good experience for end-users of all skill levels
 +
**  (similar to [[Good Programming Practice Guidance|GPP Guidance]] to always use "data=''dataset''" option in SAS programs)
  
* explicit parentheses in algorithms for readability
+
* explicit parentheses in algorithms for readability ([[Good Programming Practice Guidance|GPP Guidance]])
 
** do not force reviewers to check order of operations, demonstrate that you are in control
 
** do not force reviewers to check order of operations, demonstrate that you are in control
 
** '''NO:'''  var + 1 / 10
 
** '''NO:'''  var + 1 / 10
Line 107: Line 143:
 
** this makes it easy to
 
** this makes it easy to
 
*** extract messages from logs
 
*** extract messages from logs
*** separate SAS and PhUSE/CSS messages
+
*** separate SAS and PhUSE CS messages
** for PhUSE/CSS ASSERT and UTILITY macros, see additional details, below
+
** for PhUSE CS ASSERT and UTILITY macros, see additional details, below
  
 
* macros use Quoting carefully and intentionally
 
* macros use Quoting carefully and intentionally
Line 144: Line 180:
 
** *---  ---*;  comment statements as single-line explanations
 
** *---  ---*;  comment statements as single-line explanations
  
* Comments declare what program expects from macro call, such as data sets, macro vars, etc. See also "TEMPLATE programs", below.
+
* Comments declare what program expects from macro call, such as data sets, macro vars, etc. See also "STANDARD scripts", below.
  
* Comments visually group blocks of related code, which are indented one additional step
+
* Comments visually group blocks of related code, which are indented one additional step ([[Good Programming Practice Guidance|GPP Guidance]] extended)
 
** Examples (consistent 2-space indentation)
 
** Examples (consistent 2-space indentation)
  
Line 161: Line 197:
 
    
 
    
 
   %*--- OK, now I am prepared to call my utility macro ---*;
 
   %*--- OK, now I am prepared to call my utility macro ---*;
     %get_the_job_done(ds=my_data)
+
     %util_generic_processing(ds=my_data)
 
<br/>
 
<br/>
  
== TEMPLATE programs ==  
+
== STANDARD scripts ==  
  
* Use PhUSE/CSS test data
+
* Use PhUSE CS test data
* Access PhUSE/CSS test data via %UTIL_ACCESS_TEST_DATA
+
* Access PhUSE CS test data via %UTIL_ACCESS_TEST_DATA
 
* Use global symbol &CONTINUE with values 0 (No, there's a problem) and 1 (Yes, continue) to monitor success of processing
 
* Use global symbol &CONTINUE with values 0 (No, there's a problem) and 1 (Yes, continue) to monitor success of processing
 +
:* see also ASSERT macros, below
 
* Use assertion macro %ASSERT_CONTINUE to interrupt processing if a problem occurs (force syntax-checking mode if error indicated)
 
* Use assertion macro %ASSERT_CONTINUE to interrupt processing if a problem occurs (force syntax-checking mode if error indicated)
 
* Declare the symbols that utility programs create. E.g., see these macro calls in template program WPCT-F.07.01.sas
 
* Declare the symbols that utility programs create. E.g., see these macro calls in template program WPCT-F.07.01.sas
Line 178: Line 215:
 
<br/>
 
<br/>
  
== TEST programs ==
+
== TEST scripts ==
  
 
* script naming convention: test_<program-name-without-extension>.sas
 
* script naming convention: test_<program-name-without-extension>.sas
 
* every test explicitly uses specific data
 
* every test explicitly uses specific data
 
** this can be test data created specifically within the test program for specific tests, or
 
** this can be test data created specifically within the test program for specific tests, or
** centralized PhUSE/CSS test data available for multiple tests. see: https://github.com/phuse-org/phuse-scripts/tree/master/scriptathon2014/data
+
** centralized PhUSE CS test data available for multiple tests. see: https://github.com/phuse-org/phuse-scripts/tree/master/scriptathon2014/data
* centralized PhUSE/CSS data sets must include a '''QLTSTID''' variable that identifies specific test data
+
* centralized PhUSE CS data sets must include a '''QLTSTID''' variable that identifies specific test data
** QLTSTID has label "CSS/PhUSE Qualification Test ID", and length sufficient for all current test IDs
+
** QLTSTID has label "PhUSE CS Qualification Test ID", and length sufficient for all current test IDs
 
** see: https://github.com/phuse-org/phuse-scripts/blob/master/scriptathon2014/data/advs.xpt
 
** see: https://github.com/phuse-org/phuse-scripts/blob/master/scriptathon2014/data/advs.xpt
 
** QLTSTID values should not change, once assigned.
 
** QLTSTID values should not change, once assigned.
Line 195: Line 232:
 
== ASSERT macros ==
 
== ASSERT macros ==
  
 +
* use %assert_depend to test conditions (e.g., valid data set and variable names, etc.), for consistency of messaging. This applies to UTIL macros, as well.
 
* return a 0/1 result in-line whenever possible: 0 = FAIL, 1 = PASS
 
* return a 0/1 result in-line whenever possible: 0 = FAIL, 1 = PASS
* use and return a %local OK symbol for in-line macros
+
* IN-LINE macros: use and return a %local OK symbol to return pass/fail result
 +
* Base SAS macros: use the global symbol &CONTINUE to return any failure that should stop processing
 +
:* see also TEMPLATE programs, above
 
* declare %local and %global symbols explicitly
 
* declare %local and %global symbols explicitly
 
* always return at least one message to the log, either
 
* always return at least one message to the log, either
Line 202: Line 242:
 
: or
 
: or
 
   ERROR: (MACRO-NAME-UPCASE) Result is FAIL. Clear explanation of failed assertion.
 
   ERROR: (MACRO-NAME-UPCASE) Result is FAIL. Clear explanation of failed assertion.
 +
 +
Depending on severity of the failed condition, a log WARNING may suffice rather than an ERROR.
 +
 
<br/>
 
<br/>
  
 
== UTIL macros ==
 
== UTIL macros ==
  
 +
* use %assert_depend to test conditions (e.g., valid data set and variable names, etc.), for consistency of messaging. This applies to ASSERT macros, as well.
 
* perform a specific task
 
* perform a specific task
 
* are never highjacked to perform a related task
 
* are never highjacked to perform a related task
Line 219: Line 263:
 
|DS || SAS data set, one or two levels || positional, when usage is obvious ||  
 
|DS || SAS data set, one or two levels || positional, when usage is obvious ||  
 
|-
 
|-
|VAR || SAS var, no special chars expected || positional, when usage is obvious ||  
+
|DSOUT || Resulting SAS data set to create, one or two levels || keyword, unless usage is obvious ||
 +
|-
 +
|VAR || Valid SAS var, no special chars expected || positional, when usage is obvious || assert_unique_keys
 +
|-
 +
|KEYS || Valid SAS vars that compose unique keys for a data set || positional, when usage is obvious ||
 +
|-
 +
|INCL || Valid SAS vars to include in an output data set || always keyword || assert_unique_keys
 
|-
 
|-
|ORD || name of an ORDER variable such as AVISITN || always named parameter ||  
+
|ORD || name of an ORDER variable such as AVISITN || always keyword ||  
 
|-
 
|-
|WHR || complete where statement, %str()-quoted || always named parameter, includes semi-colon (;) for the statement ||  
+
|WHR || where clause || always keyword ||  
 
|-
 
|-
|SQLWHR || complete SQL where clause, quoted as needed || always named parameter, does NOT include semi-colon ||  
+
|SQLWHR || complete SQL where clause, quoted as needed || always keyword, does NOT include semi-colon || util_count_unique_values
 
|-
 
|-
|FMT || SAS format name WITH punctuation (@$.), as nec || always named parameter ||  
+
|FMT || SAS format name WITH punctuation (@$.), as nec || always keyword ||  
 
|-
 
|-
 
|SYM || name of a symbol (macro variable) || positional, when usage is obvious ||  
 
|SYM || name of a symbol (macro variable) || positional, when usage is obvious ||  
 +
|-
 +
|CLEANUP || 0/1 boolean whether to cleanup intermediate dsets. 1 = YES, 0 = NO.|| always keyword||
 
|}
 
|}
  

Latest revision as of 05:46, 19 February 2016


WG5 Project 02 Programming Conventions and Guidelines

Programming project:
Central Tendency White Paper

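Starting Point:

  • Good Programming Practice Guidance
  • also published as GPP_Guidance_Document_v1.1.docx
    • "The guidance aims to show how to produce well-structured and well-documented programs so that they are easy to read and maintain over time. It is meant to be applicable to all programs, and hence all programmers regardless of experience."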


OVERVIEW

The PhUSE CS scripts should adhere to 3 general principles:

  • Create learning opportunities for our community. This drives how we write and organize documents, specifications, test data, scripts, etc.
    • highly accessible and readable components allow us all to improve our expertise, technical capabilities and understanding of standards
  • Ensure that these documents reference relevant language specifications and industry standards
    • e.g., R or SAS online documentation pages for sophisticated or subtle techniques
    • e.g., CDISC SDTM or ADaM guidelines to explain test data structures
  • Keep all components as simple as possible, but not simpler :-)
    • e.g., provide basic functionality as required or suggested by white papers
    • e.g., provide basic but not full user configuration
    • e.g., keep all code relevant to statistical analysis and presentation visible and accessible for additional user customization
    • ... but it is OK to move basic data-driven discovery into functions or macros. For example, SAS record-based processing makes it somewhat cumbersome to identify unique values in a variable. It is OK to move this into a utility macro, to replicate R-style data discovery.


The PhUSE CS repository contains 4 types of programs

  • Standard Scripts (WPCT folder)
    • Produce a specified data display.
    • Clearly present core statistical steps relevant to specified analyses.
    • Explicitly assert assumptions about the data and environment, via %ASSERT* macros.
    • Hide generic discovery and processing irrelevant to specified analyses, via %UTIL* macros.
    • naming convention: <white-paper-id>_<display-id>.<lang>
  • Test scripts (qualification folder, named test_<program-name>)
    • Establish expected results for intended functionality in standard, assertion and utility scripts
    • naming convention: test_<program-name>.<lang>
  • Assertion macros (utilities folder, prefix assert_)
    • [ SAS focus ]
    • Test conditions in the data and environment, and
    • inform the end user in case of unexpected conditions, invalid states
    • assertion naming convention: assert_<assertion-description>.<lang>
  • Utility macros (utilities folder, prefix util_)
    • [ SAS focus ]
    • Accomplish discovery and processing tasks that are needed,
    • but that are not particularly relevant to the analyses.
    • Implementation of these tasks has no impact on the interpretation of results
    • utility naming convention: util_<utility-description>.<lang>


TO DO for these PhUSE CS project guidelines

  • Create standard program header, as recommended in GPP Guidance, above
  • Better clarity concerning language. These guidelines are currently SAS-focused, but we want to deliver R scripts, as well.


GENERAL

  • Keep it simple. aggressively.
    • before you add in complexity: stop, assess whether this is really needed, and
    • justify the gain in functionality vs. the costs of complexity.
    • before you finish your code: stop, review and assess whether you can make it simpler without meaningful loss
  • But not too simple.
    • all variable names, symbol names, macro names must be meaningful
    • long, descriptive names are better for readability than short, cryptic names
    • looping:
      • never use one-letter variables to loop (e.g., i j k ...)
      • code often loops through values, or parses a delimited string and processes each piece
        • e.g., Process each parameter in a list of lab parameters, or each var in a list of variables.
      • Our programs should uniformly use -IDX and -NXT suffixes for such processing.
      • -IDX suffix for the indexing variable (or macro symbol)
        • e.g., See %assert_var_exist() for an example of looping through data sets and var names.
        • DIDX indexes data set names, and VIDX indexes variable names
      • -NXT suffix for the variable (or symbol) that holds the value to process next from a delimited list
        • e.g., See %assert_var_exist() for an example of looping through data sets and variable names.
        • DNXT holds the next data set name, and VNXT holds the next variable name
      • This makes the code easy to read!
  • CSS_ prefix for all WORK data sets
    • DO NOT overwrite data sets that could help the user debug their data & changes (GPP Guidance)
    • DO delete other WORK data sets as soon as they are obsolete
  • headers contain a TO DO list, to facilitate contribution
    • TO DO placeholders within the script can also help contributors incorporate new code
  • Header: see notes on "Comments", below
  • Spacing and alignment
    • align code with space characters, never tabs. set your editor to replace tabs with spaces. (GPP Guidance)
    • use consistent number of spaces to indent within a single program
    • 2-space indents are preferred (not more). set your editor to 2-space indenting, replacing tabs with spaces.
      • see Explanations (a.k.a. Comments), below.
      • indenting helps group related blocks of code, so 2-space indenting allows more indenting
    • maintain spacing in a program.
      • e.g., if you edit a program with 2-space alignment, stick with 2-space alignment
  • capitalization
    • SAS is not a case-sensitive language
    • prefer lower case, unless other casing is necessary (titles, labels) or helpful for clarity (comments)
    • use casing functions explicitly in algorithms: lowcase(), upcase(), %lowcase(), %upcase()
  • do not abbreviate SAS keywords anywhere
    • use the full keyword to support clarity and readability
    • create a good experience for end-users of all skill levels
    • (similar to GPP Guidance to always use "data=dataset" option in SAS programs)
  • explicit parentheses in algorithms for readability (GPP Guidance)
    • do not force reviewers to check order of operations, demonstrate that you are in control
    • NO: var + 1 / 10
    • YES: var + (1/10)
  • macro names should be meaningful, even if long
    • prefix indicates "type", e.g., assert_*, util_*, etc.
    • when reading the macro name in a calling script, the purpose should be clear
    • adhere to NAMING CONVENTIONS that SAS already establishes, whenever possible
    • NO:  %assert_dse()
    • NO:  %assert_dset_exists()
    • YES: %assert_dset_exist(), to match the grammar of SAS elements exist(), fexist(), symexist(), etc.
  • use temporary macro NULL to wrap macro logic in open code, such as an %IF block
    • Example:
 %macro null;
   %if not %symexist(init_sasautos) %then %let init_sasautos = %sysfunc(getoption(sasautos));
 %mend null;
 %null;
  • see "Conventions for macro parameter names", below
  • OK to assume that one-level data sets are in WORK
    • without checking for the USER libname & related system option
    • but keep this assumption in mind as a potential source of bugs
  • macro messages to the log follow this style and format:
    • NOTE: (MACRO-NAME-UPCASE) Clear informational message to user.
    • WARNING: (MACRO-NAME-UPCASE) Warning message to user, but processing continues.
    • ERROR: (MACRO-NAME-UPCASE) Error detected in current context. Processing should stop as soon as possible.
    • this makes it easy to
      • extract messages from logs
      • separate SAS and PhUSE CS messages
    • for PhUSE CS ASSERT and UTILITY macros, see additional details, below
  • macros use Quoting carefully and intentionally
    • use q- versions of macro functions whenever processing unknown text.
    • e.g., the following macro FAILS for some values of &vars, unless you use the %qscan() function
 %macro null(vars);
   %if %scan(&vars, 1) = STDDEV %then %put Note: Calculating Standard Deviation.;
   %else %put Note: Calculating something else.;
 %mend null;
 %null(OR);
  • macros clean up after themselves
    • delete temp data sets before exiting
    • reset any modifications before exiting
      • system options,
      • graphics options,
      • ODS destinations
      • etc.
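
The looping, quoting and log-message conventions above can be sketched together as follows. This is only an illustration under stated assumptions: %demo_count_pieces is a hypothetical macro that is not part of the PhUSE CS repository, and it assumes a space-delimited list. It uses the -IDX and -NXT suffixes, %qscan() so that a piece such as OR is never evaluated as an operator, and the NOTE: (MACRO-NAME-UPCASE) message style.

```sas
%*--- HYPOTHETICAL example macro, not part of the PhUSE CS repository ---*;
%macro demo_count_pieces(vars);
  %local vidx vnxt;

  %*--- VIDX indexes the list, VNXT holds the next piece to process ---*;
  %let vidx = 1;
  %let vnxt = %qscan(&vars, &vidx, %str( ));

  %do %while (%length(&vnxt) > 0);
    %put NOTE: (DEMO_COUNT_PIECES) Piece &vidx is %upcase(&vnxt).;

    %let vidx = %eval(&vidx + 1);
    %let vnxt = %qscan(&vars, &vidx, %str( ));
  %end;

  %*--- Return the number of pieces in-line ---*;
  %eval(&vidx - 1)
%mend demo_count_pieces;

%let css_piece_count = %demo_count_pieces(aval OR chg);
%put NOTE: (DEMO) Number of pieces found: &css_piece_count;
```

Because the macro generates only the count as text, a calling script can use it in-line in a %let or %if statement.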


Explanations (a.k.a. Comments)

  • Comments must be meaningful and easy to maintain
    • No extra characters to draw boxes around comments (see header note, below)
    • Explain what the code needs to achieve
    • Explain decisions in the code
      • why keep or drop certain vars?
      • why are the merge variables or by variables correct?
      • why is a particular algorithm correct? what do the elements represent?
  • Comment types must be used intentionally
    • Header block between starting line (/***) and ending line (***/)
    • /*** ***/ style comments for blocks of explanation, like with the header
    •  %*--- ---*; style comments to explain macro statements
    • *--- ---*; comment statements as single-line explanations
  • Comments declare what program expects from macro call, such as data sets, macro vars, etc. See also "STANDARD scripts", below.
  • Comments visually group blocks of related code, which are indented one additional step (GPP Guidance extended)
    • Examples (consistent 2-space indentation)
 *--- Single-line comment to explain the next, related steps ---*;
   all code that accomplishes this objective is indented to this level
 
 /*** Optional title for comment
   This next bit is more complicated, so requires a bit more explanation.
   But not too much.
 ***/
   all code to accomplish this complex task
 
   still working on it down here
 
 %*--- OK, now I am prepared to call my utility macro ---*;
   %util_generic_processing(ds=my_data)


STANDARD scripts

  • Use PhUSE CS test data
  • Access PhUSE CS test data via %UTIL_ACCESS_TEST_DATA
  • Use global symbol &CONTINUE with values 0 (No, there's a problem) and 1 (Yes, continue) to monitor success of processing
    • see also ASSERT macros, below
  • Use assertion macro %ASSERT_CONTINUE to interrupt processing if a problem occurs (force syntax-checking mode if error indicated)
  • Declare the symbols that utility programs create. E.g., see these macro calls in standard script WPCT-F.07.01.sas
 %*--- Return macro vars: Number of parameters (&PARAMCD_N), their Names (&PARAMCD_NAM1 ...) and Labels (&PARAMCD_LAB1 ...) ---*;
   %util_labels_from_var(css_anadata, paramcd, param)
 
 %*--- Return macro var: Number of planned treatments (&TRTN) ---*;
   %util_count_unique_values(css_anadata, trtp, trtn)
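
As a rough sketch of the &CONTINUE convention described above: the macro %demo_check_continue below is hypothetical and is not the repository's %ASSERT_CONTINUE, whose parameters and behaviour may differ. It assumes that setting OBS=0 and NOREPLACE is an acceptable way to emulate syntax-checking mode, and that SASHELP.CLASS is available.

```sas
%*--- HYPOTHETICAL sketch. The repository macro ASSERT_CONTINUE may work differently ---*;
%global continue;
%let continue = 1;

%macro demo_check_continue(msg);
  %if &continue = 0 %then %do;
    %put ERROR: (DEMO_CHECK_CONTINUE) Processing cannot continue. &msg;

    *--- Emulate syntax-checking mode for all remaining steps ---*;
    options obs=0 noreplace;
  %end;
  %else %do;
    %put NOTE: (DEMO_CHECK_CONTINUE) OK to continue. &msg;
  %end;
%mend demo_check_continue;

*--- A failed check sets CONTINUE to 0, otherwise the value is 1 ---*;
%let continue = %sysfunc(exist(sashelp.class));

%demo_check_continue(After checking that SASHELP.CLASS exists.)
```

Checking the symbol at a few well-chosen points keeps the main script readable, while still preventing later steps from processing data after a failure.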


TEST scripts

  • script naming convention: test_<program-name-without-extension>.sas
  • every test explicitly uses specific data
    • this can be test data created specifically within the test program for specific tests, or
    • centralized PhUSE CS test data available for multiple tests. see: https://github.com/phuse-org/phuse-scripts/tree/master/scriptathon2014/data
  • centralized PhUSE CS data sets must include a QLTSTID variable that identifies specific test data
    • QLTSTID has label "PhUSE CS Qualification Test ID", and length sufficient for all current test IDs
    • see: https://github.com/phuse-org/phuse-scripts/blob/master/scriptathon2014/data/advs.xpt
    • QLTSTID values should not change, once assigned.
    • e.g., if some test relies on records with QLTSTID = "TEST-01-01",
      • those obs should not change, individually or as a set, and
      • any new obs added to the same central data set must have a new value for QLTSTID
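
A minimal sketch of how a test might select its records by QLTSTID. The data set CSS_CENTRAL_DEMO and its values are invented here for illustration only, and stand in for a centralized data set such as advs.xpt.

```sas
*--- HYPOTHETICAL stand-in for a centralized test data set ---*;
data css_central_demo;
  length qltstid $10 paramcd $8 aval 8;
  label qltstid = "PhUSE CS Qualification Test ID";
  input qltstid $ paramcd $ aval;
  datalines;
TEST-01-01 DIABP 80
TEST-01-01 DIABP 85
TEST-01-02 DIABP 90
;
run;

*--- This test relies only on the records assigned to TEST-01-01 ---*;
data css_test_01_01;
  set css_central_demo;
  where qltstid = "TEST-01-01";
run;
```

Since the test subsets explicitly on its own QLTSTID value, records added later under a new value do not change its expected results.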


ASSERT macros

  • use %assert_depend to test conditions (e.g., valid data set and variable names, etc.), for consistency of messaging. This applies to UTIL macros, as well.
  • return a 0/1 result in-line whenever possible: 0 = FAIL, 1 = PASS
  • IN-LINE macros: use and return a %local OK symbol to return pass/fail result
  • Base SAS macros: use the global symbol &CONTINUE to return any failure that should stop processing
    • see also STANDARD scripts, above
  • declare %local and %global symbols explicitly
  • always return at least one message to the log, either
 NOTE: (MACRO-NAME-UPCASE) Result is PASS. Optional confirmation of the successful assertion.
or
 ERROR: (MACRO-NAME-UPCASE) Result is FAIL. Clear explanation of failed assertion.

Depending on severity of the failed condition, a log WARNING may suffice rather than an ERROR.
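
The in-line pattern above might look like the following sketch. The macro %demo_assert_exist is hypothetical and written only for illustration (the repository has its own %assert_dset_exist, which may be implemented differently). It assumes that SASHELP.CLASS is available for the usage example.

```sas
%*--- HYPOTHETICAL in-line assertion, for illustration only ---*;
%macro demo_assert_exist(ds);
  %local OK;

  %if %length(&ds) = 0 %then %let OK = 0;
  %else %let OK = %sysfunc(exist(&ds));

  %if &OK = 1 %then %put NOTE: (DEMO_ASSERT_EXIST) Result is PASS. Data set %upcase(&ds) exists.;
  %else %put ERROR: (DEMO_ASSERT_EXIST) Result is FAIL. Data set %upcase(&ds) does not exist.;

  %*--- Return the result in-line, where 0 is FAIL and 1 is PASS ---*;
  &OK
%mend demo_assert_exist;

%global continue;
%let continue = %demo_assert_exist(sashelp.class);
```

The macro contains only macro statements plus the final &OK, so it can be called anywhere a 0/1 value is valid, and it always writes exactly one message to the log.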


UTIL macros

  • use %assert_depend to test conditions (e.g., valid data set and variable names, etc.), for consistency of messaging. This applies to ASSERT macros, as well.
  • perform a specific task
  • are never hijacked to perform a related task
  • are never hijacked to create a convenient side-effect
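
A sketch of a utility macro that performs one task and leaves no side-effects behind. The macro %demo_util_unique_values and the data set names are hypothetical, chosen only to illustrate the CSS_ prefix, the DS, VAR, DSOUT and CLEANUP parameter conventions, and the practice of restoring any option the macro changes. It assumes that one-level data set names resolve to WORK.

```sas
%*--- HYPOTHETICAL utility macro, for illustration only ---*;
%macro demo_util_unique_values(ds, var, dsout=css_unique, cleanup=1);
  %local saved_notes;

  %*--- Save the one system option that this macro modifies ---*;
  %let saved_notes = %sysfunc(getoption(notes));
  options nonotes;

  *--- Intermediate WORK data set uses the CSS_ prefix ---*;
  proc sort data=&ds out=css_demo_sorted;
    by &var;
  run;

  data &dsout;
    set css_demo_sorted;
    by &var;
    if first.&var;
    keep &var;
  run;

  %*--- Delete the intermediate data set, then restore the saved option ---*;
  %if &cleanup = 1 %then %do;
    proc datasets library=work nolist;
      delete css_demo_sorted;
    quit;
  %end;

  options &saved_notes;

  %put NOTE: (DEMO_UTIL_UNIQUE_VALUES) Created %upcase(&dsout) with the unique values of %upcase(&var).;
%mend demo_util_unique_values;

%demo_util_unique_values(sashelp.class, sex, dsout=css_unique_sex)
```

Calling the macro with cleanup=0 keeps CSS_DEMO_SORTED in WORK, which can help a user debug their data and changes.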


Conventions for macro parameter names

Name | Description | Comments | Programs that use
DS | SAS data set, one or two levels | positional, when usage is obvious |
DSOUT | Resulting SAS data set to create, one or two levels | keyword, unless usage is obvious |
VAR | Valid SAS var, no special chars expected | positional, when usage is obvious | assert_unique_keys
KEYS | Valid SAS vars that compose unique keys for a data set | positional, when usage is obvious |
INCL | Valid SAS vars to include in an output data set | always keyword | assert_unique_keys
ORD | name of an ORDER variable such as AVISITN | always keyword |
WHR | where clause | always keyword |
SQLWHR | complete SQL where clause, quoted as needed | always keyword, does NOT include semi-colon | util_count_unique_values
FMT | SAS format name WITH punctuation (@$.), as necessary | always keyword |
SYM | name of a symbol (macro variable) | positional, when usage is obvious |
CLEANUP | 0/1 boolean whether to clean up intermediate data sets. 1 = YES, 0 = NO. | always keyword |

Other macro parameters

Other parameter | Program that uses | Comment
TABLE | util_freq2format.sas | a 2-var PROC FREQ table spec like var1*var2, can include extra spacing
FMTNAME | util_freq2format.sas | macro determines fmt type, so value does NOT include punctuation (@$.)
MACNAME | util_autocallpath.sas | a macro name, without any special chars
DSETS | assert_complete_refds.sas | list of data sets, where order has a specific meaning


Team Review and Comments