Discussion Club: Good Programming Practice London 2014

From PhUSE Wiki

How can we drive adoption of the PhUSE GPP guidance?

What does adoption mean?

One of the aims of the GPP steering board is to encourage adoption of the PhUSE GPP guidance. What exactly does this mean? The following methods of adoption are listed in order from high to low compliance with the guidance:

  1. Use the guidance document directly without change except the addition of company specific guidance to the template provided.
  2. Use as a checklist to ensure compliance
  3. Use as a minimum standard
  4. Just check that the guidance we already use is similar to the GPP guidance
  5. Use as a starting point: better than starting from scratch

In discussion: one company uses the guidance as a minimum standard, three have similar guidance, and one wanted to revise its guidance against the GPP guidance at its next update.

The consensus opinion from discussion was:

  • At a minimum, use PhUSE guidance as a checklist of best practice and minimum standard.

Although the PhUSE guidance was a structured document, there was no feeling that it had to be used verbatim. The structure helps order the document when used as a checklist.

  • Ask people to sign up to adoption
    • This helps keep adopters up to date with changes and comments
    • Encourages adopters to comment and give feedback on content

Publicizing PhUSE GPP

So far we have used conference presentations and posters, discussion clubs, PhUSE mailings and PhUSE newsletters. These have had some success, but we have had limited success in getting people to review or comment on the PhUSE GPP guidance.

Suggestions

  • Prepare a standard presentation for use at conferences, or internal company presentations
  • A video of the presentation
  • A poster, A0 and/or A4 size, to promote GPP. The A4 version could be a laminated checklist?
  • Provide education and/or training material (or delivery).
  • Keep focus on SDEs and US as well as annual conference.
  • Take this to groups other than stats programmers: ACDM, PSI, FDA, EMEA etc?
  • A single focussed email
  • Use of overland post

Ideas for implementation of GPP

or "How do you stop people rewriting and promote reuse of programs?"

Technical Solutions

  • Use of SAS abbreviations for
    • Standard header
    • Program template structure
      • Header, Set up, Read in data, process, Report, clean up, end of program + comment boxes
    • Comment boxes
  • An "open GPP validator"
    • Log checkers are already widespread
    • Can check for
      • Proportion and number of comments
      • Number of comments relative to new variables etc.
    • There was concern that this could be misused or applied blindly out of context, and could hide other issues.
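
As an illustration of the "proportion and number of comments" check, here is a minimal sketch in Python. It is not an existing PhUSE tool. The function name `comment_stats` and its line-based rules are assumptions made for this example: it only recognises lines that start with `/*` or `*`, so trailing comments on code lines and multi-line `* ... ;` comments are not counted properly. As noted above, a number like this is easy to misuse and says nothing about whether the comments are useful.

```python
def comment_stats(sas_source):
    """Count comment lines versus code lines in SAS source text (heuristic)."""
    comment_lines = 0
    code_lines = 0
    in_block = False  # True while inside an unclosed /* ... */ comment

    for raw in sas_source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither comment nor code
        if in_block:
            comment_lines += 1
            if "*/" in line:
                in_block = False
            continue
        if line.startswith("/*"):
            comment_lines += 1
            # Stay in block mode if the comment is not closed on this line
            if "*/" not in line[2:]:
                in_block = True
            continue
        if line.startswith("*"):
            # Statement-style comment: * text ;
            comment_lines += 1
            continue
        code_lines += 1

    total = comment_lines + code_lines
    proportion = comment_lines / total if total else 0.0
    return {
        "comment_lines": comment_lines,
        "code_lines": code_lines,
        "proportion": proportion,
    }


if __name__ == "__main__":
    example = "/* Header\n   more */\ndata a;\n  set b;\n* note;\nrun;\n"
    print(comment_stats(example))
```

A check like this would sit alongside a log checker and a code review, not replace them.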

Getting the message over

  • Small companies, other teams and non-SAS programmers
    • Modellers
    • Exploratory
    • PK
    • PhUSE CSS Script repository

Code Reviews

  • By management or project leads
  • A side-by-side walk-through works well: a conversation and discussion, not just highlighting what's wrong. (Can be physically side by side or by phone, of course.)
    • This also allows discussion of personal style and subjective areas of GPP.
    • Can use a checklist to aid this

Other ways to implement

  • Reminders and updates to team meetings

Good and bad code

Flexibility

    • Moving code from 1 study to another - This already exists in the introduction.
    • Commenting - Commenting would be good for examples area. Both good and bad
    • Intelligent commenting - Already covered in doc
    • Why is this different (eg ‘else do’) - This could be added in the commenting section
    • What and why - Already covered in doc

Indenting

    • Not prescriptive of how many - Already covered in doc
    • Consistent within program - Already covered in doc
    • No tabs! - Already covered in doc
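
The "no tabs" rule is simple to check automatically. The sketch below is only an illustration: the helper name `lines_with_tabs` is hypothetical, and it reports the 1-based line numbers of any lines in a program's text that contain a tab character.

```python
def lines_with_tabs(source):
    """Return the 1-based numbers of lines that contain a tab character."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if "\t" in line
    ]


if __name__ == "__main__":
    program = "data a;\n\tset b;\nrun;\n"
    print(lines_with_tabs(program))
```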

Naming variables and datasets

    • Meaningful – Touches on it, but could be expanded/emphasized.

Interoperability - Already covered in doc

Shared examples – For examples area

    • If <cond1……………………> then < >; else
    • If <cond2 ……………………> then < >; else
    • Etc

Commenting

  • If it takes more than 30 seconds to understand a block, then it should be simplified, given more comments, etc. – Maybe something to this effect could be added to the preamble of conventions?
  • Redundant code, commented out code should not be left in programs – I think this should be added to required conventions

Program Structure

  • Keep programs short and precise (keep the number of steps reasonable)
  • Make important code obvious, and in an obvious location – Only structuring mentioned is external data to be read at the top, do processing, then outputs. This should be a required convention I think.
    • Shared example: a hardcoded libname routing the output file was hidden in the middle of the code – Perhaps an example for program structuring
  • Be specific, don’t leave SAS to handle/decide – The attrib statement is mentioned in recommended conventions. The others below would be useful additions, again to recommended.
    • on char lengths
    • Input(put(…. Avoid implicit conversions that generate log notes (i.e. avoid – NOTE: character has been conv..)
    • Missing data, actions performed on missing data
    • Numeric precision
  • Use of input(x, ?? format).
    • Some people thought it useful, others were against it – Because opinion was divided, leave it out.

File naming conventions

    • No underscores (eCTD) (though not many liked the idea of not using underscores!)
    • No uppercase in filenames
    • No spaces! - Already covered in doc
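
The three file naming rules discussed above can be written as a small check. This is a sketch under stated assumptions, not part of the GPP guidance: the function name `filename_issues` and the `allow_underscores` switch are hypothetical. The switch is there because the underscore rule was not popular in discussion.

```python
def filename_issues(name, allow_underscores=False):
    """List the naming rules from this discussion that a file name breaks."""
    issues = []
    if " " in name:
        issues.append("contains a space")
    if any(ch.isupper() for ch in name):
        issues.append("contains uppercase letters")
    if not allow_underscores and "_" in name:
        issues.append("contains an underscore")
    return issues


if __name__ == "__main__":
    for candidate in ["adsl.sas", "My File_1.sas"]:
        print(candidate, filename_issues(candidate))
```

An empty list means the name passes all three checks.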

Future helpful items

    • How to structure SQL statements: some examples and guidance would be helpful - Something for examples, I feel. Perhaps this is almost a separate class of examples – more ‘templates’?