Talk:Good Programming Practice Guidance

From PhUSE Wiki
Jump to: navigation, search

General comments -- Mfoxwell 12:23, 11 May 2013 (CDT)

Add any general comments about the document here

Examples are needed --

There are statements in this document for which i would find an example useful, that suggests to me that there may be other statements that I understand but others may not, so definitely a need to look at this and provide examples where we can.

I am currently considering whether it is useful to encourage my team to avoid the use of complex SQL code, unless it is necessary (i.e. usually a many-to-many merge). I'm not sure whether this is something that would be agreeable to others though. --Kevin2kane (talk) 06:09, 17 October 2013 (CDT)

Re: Examples are needed -- Mfoxwell (talk) 03:08, 2 January 2014 (CST)

Implementing this rule would mean defining complex SQL code, so I am not sure that I agree. I would say that any programmer should expect anyone reading their code to have a reasonable level of competence in base sas, sql and macro. They should comment code of course. However any complex code or ‘clever’ code tricks that can be avoided by simpler coding are not part of GPP and should be discouraged. This doesn’t help define what complex is, so agree, examples are needed? --Mfoxwell (talk) 03:08, 2 January 2014 (CST)

1. Introduction --

Enter any comments on Introduction here

2. Getting Started

3. Language

4. Program Header

It would be useful to put a template and examples in the google code repository set up by the FDA/PhUSE Standard Scripts group. Agree, can look at that.. --Mfoxwell (talk) 03:10, 2 January 2014 (CST)

I'm not sure I would include the list of macros called by the program. Firstly, this could be quite a lot of work to maintain. Secondly, what about macros within macros? If you are using standard macros this could be a lot of macros and quite complex.

The date proposed for the header, whilst more useful, is less practical, as the header will be completed at the start of production of the program. Consider recording the date the program was started to be written.

Re: 4. Program Header -- Mfoxwell (talk) 03:11, 2 January 2014 (CST)

Not sure that I understand why maintaining first level macro calls would be a problem? As for nesting of macros, I would generally discourage this, but expect these to be in the header of the calling macro. I wouldn't expect nested macros to be mentioned in the header of the main program.


Suggestion: for "in-line comments" (i.e. those explaining each data/proc step), it is (in my opinion) useful to use the format of having an asterix as the first character, and a semi-colon to finish the comment. It helps to avoid using /* .... */ for these comments as it causes problems when developing code and commenting out blocks of code at a time.

Re: 5. Comments -- Mfoxwell (talk) 03:11, 2 January 2014 (CST)

I would be happy to add this. Note that /* */ style comments are sometimes preferred (in SQL code for example).

6. Naming Conventions

I think it would be universally accepted not to have space characters in program / output file names.

Re: 6. Naming Conventions -- Mfoxwell (talk) 03:14, 2 January 2014 (CST)

Agree, spaces can cause extra work and confusion.

7. Coding Conventions

I would propose to change "Do not overwrite existing datasets, use different meaningful names for each temporary dataset." to "Avoid overwriting existing temporary datasets in the SAS WORK library, and ideally use different meaningful names for each temporary dataset. Where existing datasets are overwritten, this should be done with care."

I would propose to change "Separate data steps and procedures with at least one blank line." to "It is normal to separate data steps and procedures with at least one blank line. Exceptions may be where there is a simple proc sort or proc print after a data step".

I would propose to change "End data steps and procedures with run or quit to provide a boundary and allow for independent execution." to "End data steps and procedures with run or quit to provide a boundary and allow for independent execution (except where a data step is followed by a simple proc sort or proc print)."

Re: 7. Coding Conventions -- Mfoxwell (talk) 03:20, 2 January 2014 (CST)

Disagree. Although this isn't a big issue, there are no downsides to leaving the step boundary in place as far as I can see.

It's not clear what this means: "Split data steps into logical parts." - I think I get the idea but it needs to be stated in a clearer way.

I would propose to change "Put each statement on a separate line." to "Put each statement on a separate line unless it would make the code easier to understand or more readable to have multiple statements on one line."

I would propose to have a standard 2 character indentation, but accept this may be difficult to agree on a multi-company basis.

It's not clear what this means: "Use logical groupings to separate code into blocks."

Re: Re: 7. Coding Conventions -- Mfoxwell (talk) 03:40, 26 March 2014 (CDT)

Agree that this is difficult to define , a rewrite might work or examples may be better. Will look at this further. --Mfoxwell (talk) 03:40, 26 March 2014 (CDT)

7.1 Log File Checking

End of para 1 - "Referenced" is included in the list twice.

Suggest adding "A company-specific naming convention for user defined checks can aid in this, so the specific string can be searched for within the log. Examples of such conventions include “ISSUE:”, “USER:”, and “ALERT:”. Avoid the use of user-generated errors and warnings labeled "NOTE:", "WARNING:" or "ERROR:", as these may make it difficult to find genuine problems when searching the log."

7.2 Portability

It seems risky to recommend rounding new variables. If the variable is a continuous variable (i.e. calculated lab values LBSTRESN), rounding may increase the variability of the actual data. Raw data should not be rounded (i.e. have their values truncated) - rounding should occur when presenting the summary statistics and results.

Re: 7.2 Portability -- Mfoxwell (talk) 03:24, 2 January 2014 (CST)

Agree with your point here. Cross platform compatibility is important, but maybe this needs further thought?

7.3 Hard Coding

It would be useful to give examples of what is meant by hard coding.

It may be useful to suggest that the programmer checks if the organization has an SOP to handle hard coding (as is often the case).

7.4 Defensive Coding

Defensive Coding -- Mfoxwell 12:17, 11 May 2013 (CDT)

We use defensive coding here, but have used Robust Programming in the past. Are these the same thing?

Replace this text with your reply