Analysing the role of a lead programmer

From PHUSE Wiki

Professional Development & Training 2015

--Ester Schoeman


This paper will focus on finding out what our peers believe makes the difference when it comes to lead programmers.

I conducted a survey consisting of two parts: (a) Part 1 contained multiple choice questions on the importance of skillsets, ability levels and other soft skills associated with the role of a lead programmer; (b) Part 2 consisted of open-ended questions where more detailed information could be given. The findings from this survey are summarised, comparing answers by programmers with those of statisticians.

Finally, the paper looks at how the role of a lead programmer is evolving in this ever-changing industry, and discusses the trends that have made this change essential in order to meet the demands of the industry.


Being a lead programmer myself, I know that it is very easy to become complacent in the way you carry out your day-to-day tasks, and to be trapped in a mindset of ‘my way is not just the best way, but should be the only way!’.

Instead of writing a paper based solely on my own thoughts and experience, I deemed it much more valuable to present the viewpoints of fellow programmers – some lead programmers themselves and others working with lead programmers – and statistician colleagues.

The information gathered has been summarised, comparing expectations of programmers to those of statisticians and revealing important tasks and abilities required to perform the role of a lead programmer.

Next, the paper summarises the responses to the open-ended questions in the second part of the survey, discussing the skills and qualities needed in more detail. It also elaborates on how these affect the way we work and the interactions we have with our colleagues and clients on a daily basis, and highlights key areas of focus and pitfalls to avoid.

Finally, it comments on how the role of a lead programmer has evolved in recent years. The paper looks at where this role is headed, as well as the implications for those of us currently in the role and for those aspiring to become a lead programmer.


Before showing the findings, let’s first look at who participated in the survey and establish credibility of the information given in this paper.


At the beginning of 2015, the survey was sent out to fellow programmer and statistician colleagues who had indicated an interest in participating. As expected, not all who indicated interest actually participated. Of the 24 programmers and 21 statisticians who were willing to participate, only 16 programmers and 17 statisticians completed the survey.


Before rushing into the results, I want to establish the credibility of the participants by showing the in-depth awareness they have of the role of a lead programmer. Some are lead programmers themselves and others work closely with lead programmers on a daily basis.

Years of experience of programmer participants –

  • 5 - <10 years : 2 participants
  • >=10 years : 14 participants

Years of experience of statistician participants –

  • 1 - <2 years : 2 participants
  • 2 - <5 years : 2 participants
  • 5 - <10 years : 5 participants
  • >=10 years : 8 participants

Participants acquired their experience at both biotech/pharmaceutical companies and CROs. This gives them a wider range of experience and also shows that the responses are not company specific.

I was immensely impressed at the years of experience and knowledge held by my colleagues and I feel very privileged to work among them.


The participants were presented with a list of tasks/abilities generally associated with the role of a lead programmer and asked to rate each of these as not important, somewhat important, very important or essential.

To analyse the results, I assigned a weight value to the answers given, and used the Cochran-Mantel-Haenszel (CMH) statistical test in SAS to compare the responses from programmers with those from statisticians.


Variables explained:

  • QSTEST = list of tasks/abilities
  • PROG_VS_STATS = group variable indicating who gave the response, i.e. either programmer or statistician
  • QSSTRESN = numeric result, i.e. 1=not important, 2=somewhat important, 3=very important and 4=essential
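As an illustration of the kind of comparison described here, the following is a minimal sketch of a stratified mean score test, written in Python rather than SAS. It is not the code used for this paper. It assumes that each task (QSTEST) forms one stratum, that PROG_VS_STATS takes exactly two values, and that the statistic of interest is the mean score form of the CMH test; which CMH statistic the reported p-value comes from is not stated. The function name and the example records are hypothetical and made up purely for illustration.

```python
import math
from collections import defaultdict


def cmh_mean_score_test(records, reference_group):
    """Stratified mean score test for two groups (chi-square with 1 df).

    records: list of (stratum, group, score) tuples, e.g.
             (QSTEST, PROG_VS_STATS, QSSTRESN).
    reference_group: the group whose score totals are compared with
                     their expectation under no group difference.
    Returns (statistic, p_value).
    """
    strata = defaultdict(list)
    for stratum, group, score in records:
        strata[stratum].append((group, score))

    observed = expected = variance = 0.0
    for rows in strata.values():
        n = len(rows)
        ref_scores = [s for g, s in rows if g == reference_group]
        n_ref = len(ref_scores)
        n_other = n - n_ref
        if n < 2 or n_ref == 0 or n_other == 0:
            continue  # this stratum says nothing about a group difference
        mean = sum(s for _, s in rows) / n
        # Variance of the scores within the stratum (divisor n).
        score_var = sum((s - mean) ** 2 for _, s in rows) / n
        observed += sum(ref_scores)
        expected += n_ref * mean
        # Variance of the reference total when group labels are permuted.
        variance += n_ref * n_other * score_var / (n - 1)

    if variance == 0:
        raise ValueError("no variation in scores; statistic is undefined")
    statistic = (observed - expected) ** 2 / variance
    # Upper tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up records for two hypothetical tasks; NOT the survey data.
toy_records = [
    ("Task A", "PROG", 4), ("Task A", "PROG", 3),
    ("Task A", "STAT", 4), ("Task A", "STAT", 3),
    ("Task B", "PROG", 3), ("Task B", "PROG", 2),
    ("Task B", "STAT", 4), ("Task B", "STAT", 2),
]
toy_statistic, toy_p_value = cmh_mean_score_test(toy_records, "PROG")
```

A large p-value from such a test means there is no evidence that the two groups score the tasks differently; it does not prove that the groups agree.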

The analysis compared the overall responses from programmers with those from statisticians. It showed that programmers and statisticians generally have the same expectations of the level of importance of tasks associated with the role of a lead programmer: the p-value of 0.811 gives no evidence of a significant difference between the responses of the two groups.

The average score is used to classify the tasks into importance levels. Tasks with an average score above 3.5 are essential according to the majority of participants. Average scores between 3 and 3.5 are classed as very important, between 1.5 and 3 as somewhat important, and below 1.5 as not important.
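The classification rule can be written down as a short sketch (in Python, not the SAS used for the analysis). The weights follow the QSSTRESN coding. How the boundary values 3 and 1.5 are treated is an assumption, since the ranges as stated meet at those points; here a boundary value is placed in the higher class. The function names are hypothetical.

```python
# Weights as coded in QSSTRESN.
WEIGHTS = {
    "not important": 1,
    "somewhat important": 2,
    "very important": 3,
    "essential": 4,
}


def average_score(ratings):
    """Mean weight of a list of ratings given for one task."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(WEIGHTS[rating] for rating in ratings) / len(ratings)


def importance_level(score):
    """Classify an average score into an importance level.

    Assumption: scores of exactly 3 and exactly 1.5 fall into the
    higher of the two neighbouring classes.
    """
    if not 1 <= score <= 4:
        raise ValueError("average score must lie between 1 and 4")
    if score > 3.5:
        return "essential"
    if score >= 3:
        return "very important"
    if score >= 1.5:
        return "somewhat important"
    return "not important"
```

An average above 3.5 can only be reached when more than half of the ratings are 'essential', which is why such tasks are essential according to the majority of participants.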

The following are all essential tasks: willingness and attitude to perform a task, communication skills, creating ADaM specifications and programs, review of the CRF, managing study progress and resources, and review of the SAP. The following tasks are classed as very important: keeping up to date with the latest developed tools, creating SDTM specifications, interpreting and understanding data, programming of TLG outputs, quick decision making, annotating TLG mocks, programming of SDTM datasets, additional study specific edit checks, and creating a PDF file of TLG outputs. Review of edit checks and review of the protocol are classed as somewhat important tasks.

It is clear that all of the tasks listed are needed by lead programmers, as most of the average scores are greater than or equal to 3. Note that the highest average score is for the willingness and attitude to perform a task, meaning that most participants scored this as essential. As programmers tend to be very technical in their approach to work, it is very important to remember that we are providing a service, be it to internal statistician clients or external vendor clients, and that the importance of attitude can easily be underestimated!


The final question of the first section of the survey followed on from the previous one. Participants were asked to rank the following task groups in order of importance, from most important first to least important last. The same CMH statistical test as above was used to summarise the results. The responses from each participant were given a ranking position of 1 to 7, where 1 is most important and 7 least important. This time the ranking position was used as the weight for the CMH test.

An overall score was calculated from the ranking position of each task group for this question. Here are the results, in order of importance, with most important task first to least important task last:

  1. Manage study team, progress and resource management
  2. Willingness and attitude
  3. Understanding and interpreting the data
  4. Review abilities
  5. Quick decision making
  6. Keep up to date with new tools
  7. Additional tasks

These results are consistent with the results from the previous survey question.
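One simple way to turn ranking positions into an overall order is sketched below in Python. The exact calculation of the overall score is not given here, so the sketch assumes it is the mean ranking position per task group (a lower mean is more important) and that every participant ranked every group. The function name and the three example rankings are hypothetical and are not the survey responses.

```python
from collections import defaultdict


def overall_ranking(rankings):
    """Order task groups by mean ranking position (1 = most important).

    rankings: list of lists; each inner list is one participant's
              task groups, ordered from most to least important.
    Returns a list of (task_group, mean_position), most important first.
    Ties are broken alphabetically so the result is reproducible.
    """
    if not rankings:
        raise ValueError("at least one ranking is required")
    totals = defaultdict(int)
    for ranking in rankings:
        for position, task_group in enumerate(ranking, start=1):
            totals[task_group] += position
    mean_position = {
        task_group: total / len(rankings)
        for task_group, total in totals.items()
    }
    return sorted(mean_position.items(), key=lambda item: (item[1], item[0]))


# Made-up rankings from three hypothetical participants.
example_rankings = [
    ["Manage study team", "Willingness and attitude", "Keep up to date"],
    ["Willingness and attitude", "Manage study team", "Keep up to date"],
    ["Manage study team", "Keep up to date", "Willingness and attitude"],
]
example_order = [group for group, _ in overall_ranking(example_rankings)]
```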

Even though additional tasks are ranked as least important in this list, that does not mean they are not important at all. All of the task groups listed are the responsibility of a lead programmer. The order of the task groups should be used as a guide to help a lead programmer prioritise tasks during the course of a study. Managing the study team, communicating study progress to the team and the study statistician, and managing available resource should be the first priority for a lead programmer, coupled with a positive attitude and willingness to comply.

Be mindful not to spend more time on less important tasks, such as keeping up to date with new tools or doing additional tasks, while neglecting tasks with higher priority. Ensure that each task is given time and attention in accordance with the importance of its task group.


The second part of the survey contained open-ended questions where the participants could add additional information as free text. This allowed for more honest responses and also gave participants the opportunity to add to and expand on answers given in the first part of the survey.

The questions were:
1. Are there any other qualities or skill sets not mentioned that could be beneficial for a lead programmer?
2. What do you find frustrating when working with a lead programmer?
3. Thinking about our CRO partnership, how would a GOOD lead programmer deal with both cultural differences across national borders, as well as corporate cultures?

  Specifically thinking about:
  • Different approach to how things are done, leading to conflict
  • Practices acceptable in one company, but not in the other
  • Working across different time zones
  • Trust
  • Measuring quality
  • Handling conflict

4. Any other comments?

From the responses it was apparent that a good lead programmer needs more than just programming skills. I summarised the responses and saw that they fell into three main categories, namely communication, leadership qualities, and programming and study specific knowledge. Each of these is discussed below, highlighting areas to focus on and pitfalls to avoid.


Good communication was mentioned by almost every participant as the foundation of success. So it is only fitting that the discussion starts here. Developing good communication skills will greatly benefit the role of a lead programmer.

Clear, concise and consistent communication is needed between programming team members and with the study statistician at regular intervals. The ability to interpret the language of each individual team member is a valuable asset, as not everybody is a clear communicator. When delegating work, provide team members with all the relevant information and details, along with the reason for the request, so that there is a clear understanding of what is required. Also state when the deliverable is expected to be completed, its priority level and the urgency of the request, to avoid having to rush.

Being able to communicate in non-programming terms by simplifying a problem is a desirable skill for any lead programmer, especially when communicating with colleagues in other departments. This will help to establish a relationship and trust with the wider study team which includes amongst others the statistician and medical directors, making it possible to push back and negotiate with them regarding deliverable timelines and other essential parts of the analysis.

Take a pro-active approach by keeping the statistician informed of study progress and potential problems or issues without having to be prompted. Keeping the team informed will reduce tension, increase productivity and motivate the team.

A few responses also mentioned the frustration caused by a lack of communication from the lead programmer. Be mindful of situations where resource availability requires the lead programmer to be hands-on with the programming. It is vital to keep on top of providing essential information to the team and answering study related questions they might have in a timely manner, to avoid causing a hold up for other programmers working towards the same timelines. Without the proper guidance, consistency will be lost and unnecessary rework might be required.


The list of leadership qualities that were highlighted is quite extensive. However, it is not necessary to master all of these to be considered a good lead programmer. Some of these are closely related and can sometimes be confused for being the same thing, although they are in actual fact separate qualities. Moreover, not all qualities are needed for all studies, and the ones needed will differ from study to study. Learning to identify which qualities are needed for each individual study is a quality worth acquiring.

The following lists the main qualities mentioned in the survey responses, along with how this applies to the role of a lead programmer:

Be assertive and know when to push back on unacceptable timelines, unnecessary ad hoc work, unacceptable practices, or similar. This does not mean rejecting every request that comes through or being unwilling to comply, but using study specific knowledge to justify the response.

Time management is the ability to plan effectively upfront and to set intermediate goals along the way, to ensure the study stays on track and to prevent crises where possible. This is coupled with prioritising tasks and delegating work effectively: matching available resource levels to task difficulty can maximise the efficiency of the time spent completing each task.

Pragmatism means setting realistic expectations. This sounds simple enough, but its importance can easily be misjudged. Time management can only be done effectively when the expectations set are achievable. Unrealistic timelines can de-motivate the team and set it up for failure.

Being agile means having the ability to adapt quickly to new situations, aided by the ability to make quick decisions. In the same way, build flexibility into timelines where possible, to accommodate issues that are out of our control.

There is also a need to be a good teacher to the programmers on the team. This includes having patience and being able to identify weaknesses in the project, as well as weaknesses of individual team members. Show the willingness to work as part of a team and lead by example.

Have the ability to motivate the team members. This can be done by, among other things, setting realistic expectations upfront, putting clear communication in place, and giving and allowing for honest, constructive feedback. Also, showing compassion and empathy towards team members, in order to make better decisions and minimise work where possible, will lead to a motivated team and build up trust over time.

Be innovative and think outside of the box. When discussing the requirements for a specific output with a requestor, this means being creative enough to produce a result without having a mock or specifications, especially when there are time constraints.


The last category we look at in this section focuses on industry and study specific knowledge. Interestingly, it was mainly programmers who commented on these aspects. It indicates a shift in the industry which will directly influence the role of lead programmers and what will be expected of them in years to come. This will be discussed in more detail in the next section of the paper.

Based on the survey responses, the following specific areas of knowledge can be beneficial for a lead programmer:

Understanding the data is a vital part of clinical trials and the pharmaceutical industry as a whole. Without a comprehensive understanding of the data, certain aspects can be misinterpreted or even missed completely. Some information only becomes apparent with an in-depth investigation of the data, and by knowing what to look out for. Often the task is to identify data that have been omitted by mistake, rather than to find something that is visibly wrong in the data.

Good technical programming skills are required in order to help other programmers, or to jump in if required due to resource availability or time constraints. This requires the lead programmer to have the skills to pick up existing programs that may need updating, or to start a program from scratch.

It is also necessary to understand standards and processes, and know how these can impact on meeting timelines. In line with this, the lead programmer also needs to stay abreast of the study specific details to ensure consistency within the study and across the drug program as well.

Other related elements worth mentioning include attention to detail; having the ability and know-how to perform elite tasks such as unblinding of patient data and the like; being proficient at creating analysis dataset or ADaM specifications; and making use of supplemental resources such as the internet, text books, etc. in order to improve or correct provided specifications. The ability to think ahead, as well as to take a step back and see the big picture, is also valuable.

A little statistical knowledge will also give credibility and allow the lead programmer to challenge the statistician's requests, if there is a plausible reason to do so. It will also require a willingness to speak up and represent the programming facets at the study team level and across department lines.

Knowledge of the therapeutic area, study design, how the trial was run, specific study issues, expected trends within a drug program and familiarity with the SAP are all additional elements that will set apart the great from the good.

Once again, the shift in expectation for lead programmers has become apparent. Not only will they need to have in-depth programming, study and industry knowledge, but they will also be required to take on more of a project manager role to manage the knowledge they possess. The next section will look at how we see this role changing in the near future and how this will affect the way we work.


The industry is ever changing, and so is the role of a lead programmer. Looking at some of the skills mentioned by the participants of the survey, it is clear that there is a need for more than just programming expertise. Programming expertise remains necessary, but it is not the only requirement.

It has become apparent that there is also a wider need in the industry for this role to evolve into something more. Traditionally the role of a lead programmer was to support our statistician colleagues, and to be a vital part of the analysis within the boundaries of biostatistics. Activities such as attending study management team meetings were optional, and lead programmers hardly had the opportunity to speak up across department lines.

Our company is currently undergoing a structural reorganisation to keep up with the ever-changing needs and trends of the industry. The programming and data management departments have fused to form a new department, Clinical Data Sciences (CDS). Within CDS four areas of focus have been defined, namely Data Planning, Data Operations, Data and Analysis Infrastructure, and Quality and Process Management.

There have been more and more requests for existing data assets to be shared beyond company limits. In order to comply, we had to take a step back to ensure we meet these new needs, but also remain true to keeping the confidentiality of the subjects that participated in our clinical trials. Data Planning will include developing operational data strategies to facilitate the sharing of data more easily going forward, but also sharing and governance of our current data assets. This function will also include de-identifying subject data and managing who gets access to the data.

The lead programmer role is being transformed into that of a data scientist, and will function under the combined Data Operations umbrella.

What does this mean? The role is moving away from providing specialised programming expertise as its main function, and towards data science. This shift requires the holder of the role to have a broader, more generalised skillset, while maintaining strong programming capabilities. It also means that we will be an equal, complementary partner to our statistician colleagues, instead of filling the supporting role we have provided up to now. In addition, the role is expanding its reach across data management functions, taking on an oversight role but also having the authority to dip in and out where needed, to ensure consistent quality of data throughout the lifecycle of the study and beyond the finalisation of the Study Report.

What does the ideal Data Scientist look like? Taking a step back, a data scientist will be required to understand the entire data flow process and how the data is an asset that can be used by the business. They will need to be more involved in different areas concerning data. Therefore, a good understanding is required of the data in the steps needed prior to analysis of the data, i.e. collection of the data, setup and/or build of the databases, ensuring the data is sufficient for the analysis and potential re-use of the data.

Data scientists will have the knowledge of what is needed for the analysis, and will use this knowledge at the beginning of the process to sculpt the data to be analysis ready from the outset. Then, using their in-depth knowledge and understanding of the data, data scientists will be involved during the data collection period, mostly in an overseeing capacity but zooming in and out as the need arises. This will allow for better, analysis-ready data sooner.

We will not only be in partnership with our statistician colleagues and other in-house departments, but also have external partners. Partnerships between pharmaceutical/biotech companies and CROs are becoming more regular occurrences. The data scientist will play a crucial part in these partnerships. They will need the skills to oversee work being done by partners, and to use their knowledge to make this process as smooth as possible.

Continuous involvement will be necessary, in an overseeing capacity. This will require exceptional communication and project management skills from data scientists, a need identified in the previous section.

Important elements a data scientist will need to consider when functioning within a partnership are:

1. Clear, concise and consistent communication –

  • Handling issues and situations as they arise, and anticipating issues without placing blame.
  • LISTEN!! Listening is a very important part of communication that is usually the first to be forgotten.
  • Remember to take culture and language differences into account when communicating, especially when using impersonal communication methods such as email or instant messaging.
  • Have empathy and put yourself in the other’s shoes, and show a willingness to adapt to the situation.

2. Setting expectations upfront, from both companies, and managing those expectations at regular intervals during the course of the study and/or partnership. Have realistic expectations.

3. Building flexibility into processes and timelines to accommodate circumstances out of our control.

4. Harmonise procedures, practices and SOPs across companies.

5. Use metrics to measure internal quality as a benchmark of how well outsourcing is doing.

6. Effective planning and management of work across time zones.

7. Mutual respect and common decency will lead to a strong relationship that will grow into trust over time.

The list above is a guideline of positive reinforcement when functioning within an oversight relationship. Honesty and respect should go both ways; trust will follow, but will take time to build up. Setting expectations upfront is also important, and these should be discussed to ensure they are realistic and achievable. Make sure that the procedures and SOPs to follow have been clearly defined, and use metrics to measure and evaluate performance as well as the overall wellbeing of the partnership. Remember that people, cultures and company principles differ, and keep this in mind when communicating with your partner.


A survey was conducted at the beginning of the year, asking fellow programmer and statistician colleagues a range of questions on what tasks, skills and qualities are needed for the role of a lead programmer. It also collected some information on those who participated in the survey, to establish the credibility of the information collected. The number of years’ experience working as either a programmer or statistician, as well as where that experience was gained, i.e. pharmaceutical/biotech and CRO companies, made it clear that the participants have an in-depth awareness of the role. I came to the conclusion that I am very fortunate to work with a very talented pool of programmers and statisticians.

The first part of the survey contained multiple choice questions, where participants were asked to class tasks and skills associated with the role of a lead programmer into levels of importance – essential, very, somewhat and not important, followed by having to rank a set of task groups in order of importance. The responses were given a weight value based on the level of importance assigned and were analysed using the CMH statistical procedure, comparing the responses from programmers to those of statisticians.

The responses from the survey indicated that both programmer and statistician groups within our company have similar expectations regarding the tasks and abilities associated with the role of a lead programmer. Essential tasks were identified as the willingness and attitude to perform a task, communication skills, creating ADaM specifications, review of the CRF, managing study progress and resources, programming of ADaM datasets, and review of the SAP. The remainder of the tasks were classed as very important, with the exception of review of edit checks and review of the protocol, which were classed as somewhat important.

Participants were then tasked with ranking task groups in order of importance, placing the most important task group first and the least important last. Similar to the previous survey question, a weight was added to each task group according to the position at which it was ranked. Once again, the responses from the programmer and statistician groups did not differ significantly. Managing study progress and resources was identified as the most important task group overall for a lead programmer, followed by willingness and attitude to perform a task and the ability to understand and interpret the data.

This underlined that certain soft skills, such as willingness, attitude, communication and the ability to understand and interpret data, are as important to the role of a lead programmer as technical programming expertise and study knowledge. It also made the point that being aware of the order of importance of tasks can help to allocate time and resource correctly to individual tasks according to priority. By taking this approach a lead programmer will be able to plan the run of a study better, and possibly reduce the number of critical situations where time runs out before timelines are met. Failing to plan ahead when leading a study is planning to fail.

The second part of the survey allowed for free text responses, where participants could add comments on beneficial qualities and skillsets and raise issues they find frustrating when working with a lead programmer. The responses were categorised into three sections, namely communication, leadership qualities, and programming and study specific knowledge. Each section was discussed in more detail, and it became more apparent that there has been a shift in the qualities and abilities needed, or rather expected, from a lead programmer.

Finally, this paper discussed how the role of a lead programmer is being transformed into that of a data scientist. It discussed a need from the wider community to share our data assets beyond company and even industry borders, and the impact that has on the role. Clinical Data Sciences is a new department encompassing both data management and programming functions, and as data scientists within this new department, lead programmers will need additional skills and qualities in order to meet the additional needs as they arise.

Changing from lead programmer to data scientist will require the holder of the role to take a step back and take on a more overseeing role, but will also require an in-depth understanding and knowledge of the data to identify areas where our technical expertise is needed. The role will include data management functions and we will be responsible for ensuring consistently better quality data throughout the course of the study. We will also be required to apply, at study start-up, our knowledge of the analysis performed at the end of the study, to allow for analysis-ready data sooner. Instead of having a supporting role, data scientists will be an equal, complementary partner to our statistician colleagues.

As part of the transformation to data scientists, the holder of the role will also be required to work in partnerships with internal as well as external partners. This will require continuous involvement in an overseeing capacity, as well as technical expertise to be applied as needed. This will demand exceptional project management and communication skills on top of technical know-how and study specific knowledge.

Within these partnership relationships, communication will be a key element of success. It is imperative to set realistic and achievable expectations upfront, allow discussions and build flexibility into timelines. Make sure that the procedures and SOPs to follow have been clearly defined. Measure and evaluate performance and the overall wellbeing of the partnership at regular intervals. Keep in mind that people, cultures and company principles differ, and that this needs to be factored in when decisions are being made. Mutual respect, honesty and common decency will pave the way to a successful partnership.