Tag Archives: SAS

The Only Three (3) [Programming] Languages You Should Learn Right Now (eClinical Speaking)

In an article I wrote in 2012, I mentioned four programming languages that you should learn when it comes to the development of clinical trials. Why is this important, you may ask? A clinical trial is a method to determine whether a new drug or treatment works against a disease and whether it benefits patients. If you have never written a line of code in your life, you are in the right place. If you have some programming experience but are interested in learning clinical programming, this information can be helpful.

But shouldn’t I be Learning ________?

Here are the latest eClinical programming languages you should learn:

1. SAS®: Data analysis and result reporting are the two major tasks for SAS® programmers. SAS currently offers a certification as a Clinical Trials Programmer. Some of the skills you should learn are:

  • clinical trials process
  • accessing, managing, and transforming clinical trials data
  • statistical procedures and macro programming
  • reporting clinical trials results
  • validating clinical trial data reporting

2. ODM/XML: The CDISC Operational Data Model (ODM) uses XML to define a standard data exchange model that supports the acquisition, exchange and archiving of operational data.

3. CDISC Language: Yes. This is not just any code. This is the standard language in clinical trials and you should be learning it right now. The future is here now. The EDC code as we know it will eventually go away as more and more vendors adapt their systems and technologies to meet rules and regulations. Some of the skills you should learn:

  • Annotation of variables and variable values – SDTM aCRF
  • Define XML – CDISC SDTM datasets
  • ADaM datasets – CDISC ADaM datasets

CDISC has established data standards to speed up data review, and the FDA is now suggesting that these will soon become the norm. Pharmaceutical companies, biotechnology companies and many other sponsors within clinical research are now better equipped to improve their CDISC implementation.

Everyone should learn to code

SAS® and XML now work together. The XML engine in SAS® v9.0 is built so that one can import a wide variety of XML documents. SAS® does what it does best, statistics, and XML does what it does best: creating report-quality tables by taking advantage of the full feature set of the publishing software. This combination can produce report-quality tables in an automated, hands-off, lights-out process.

Standards are more than just CDISC

If you are looking for your next career move in Clinical Data Management, then SAS and CDISC SDTM should put you on the right path for career development and job security.

Conclusion: Learn the basic and advanced SAS clinical programming concepts, such as reading and manipulating clinical data. Using the clinical features and basic SAS programming concepts of clinical trials, you will be able to import ADaM, CDISC or other standards for domain structure and contents into the metadata, build clinical domain target table metadata from those standards, create jobs to load clinical domains, validate the structure and content of the clinical domains against the standards, and generate CDISC standard define.xml files that describe the domain tables for clinical submissions.

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica – Open Source and Oracle Clinical.

Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.




SAS: Problem Solving 1

Today we want to provide you with some problem-resolution options for a simple situation.


We have 3 variables that we will call Var1, Var2 and Var3. Their values range from 1 to 9, and we would like to create new flag variables that mark a response based on the value found in any of the 3 variables.

For example, if response 7 has a value of 1 on Var1, then VarFlag1 should be equal to 1.

If the same response 7 has a value of 3 on Var3, then VarFlag3 should be equal to 1.

Solution 1: Data step

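data mydata;
input Var1 Var2 Var3;
array vars{3} Var1 Var2 Var3;
array flags{9} flag1-flag9;
do j = 1 to 9;
  flags{j} = 0;                        /* start with every flag off */
end;
do i = 1 to 3;                         /* explicit index replaces the older DO OVER */
  /* only an integer from 1 to 9 can name a flag */
  if 1 <= vars{i} <= 9 and vars{i} = int(vars{i}) then flags{vars{i}} = 1;
end;
drop i j;
datalines;
2.5 7 9
;
run;

proc print; run;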

Solution 2: array solution

array flag{*} flag1-flag9;
do j=1 to 9;
  flag{j} = (Var1=j or Var2=j or Var3=j); /* 1 if any variable equals j, else 0 */
end;

Solution 3: Macro solution

%macro Set_Flags(Flag, n);
  %do i = 1 %to &n;
    &Flag.&i = (Var1=&i or Var2=&i or Var3=&i);
  %end;
%mend Set_Flags;

data Mydata;
  set Mydata;
  %Set_Flags(flag, 9)
run;

4 Programming Languages You Should Learn Right Now (eClinical Speaking)

I am a strong believer that learning a new language makes you better at the others, but I am not a “learn to code” advocate, since a foreign language (I know 3 languages, am currently learning my 4th, and have a “to learn” list that includes Italian and Arabic, if I ever find some free time) or even music (I love to play drums) is equally beneficial. But if you want to obtain a job in the pharmaceutical industry, here is the list of programming languages you should learn:

  1. C#:

What it is: A general-purpose, compiled, object-oriented programming language developed by Microsoft as part of its .NET initiative.

Why you should learn it: If you are looking to become a Medidata Custom Function programmer or Oracle InForm EDC Developer then you should.

2. Python:

What it is: An interpreted, dynamically typed, object-oriented, open-source programming language that uses automatic memory management.

Why you should learn it: If you are like me, always looking to learn new technology, love Google platforms and perhaps want to become a Timaeus Trial Builder, you should learn it. It is used in a lot of open-source technologies.


3. PL/SQL or SQL:

What it is: PL/SQL stands for Procedural Language/SQL.

Why you should learn it: If you are like me, addicted to databases, then Oracle should be your choice. If you want to become an Oracle Clinical programmer or database administrator, you should learn Oracle PL/SQL.

4. SAS:

What it is: SAS stands for “Statistical Analysis System” (software). It is the most powerful and comprehensive statistics software available.

Why you should learn it: SAS skills are in high demand nowadays. If you are able to obtain the SAS Certification and a few years of experience in the pharmaceutical industry, you will be in good shape. If you are new and looking for training, there are several options available, from SAS Institute to private vendors such as Clinovo to learning on your own. I must warn you that it will be difficult to obtain a job without experience. Nevertheless, once you are in, it can only get better.

Remember that your job is not just to code but to solve real problems. Your ability to code covers a wide range of skills: critical thinking, problem analysis and solving, logic, and more.

So which one are you going to give a try?

Let me know your preference. Happy Programming!

The best thing about a boolean is even if you are wrong, you are only off by a bit. (Anonymous)


Did You Know?


PROC CDISC is a SAS procedure that allows SAS programmers to import and export XML files that are compliant with the CDISC ODM version 1.2 schema.

It is available as a hotfix for SAS version 8.2 and by default in SAS 9.1.3 and later versions.

Source: SAS programming in the Pharmaceutical Industry text book

SAS Cheat Sheet – Part 4


Code          Description
<w.d>         Reads standard numeric data
<datew.>      Reads date values (ddmmmyy)
<$w.>         Reads standard character data
<$VARYINGw.>  Reads character data of varying length

SAS represents a date internally as the number of days
between January 1, 1960 and the specified date. Dates
before 1960 are represented with negative numbers. A
SAS date value is a numeric variable.

SAS Date Value   Calendar Date
0                January 1, 1960
30               January 31, 1960
366              January 1, 1961
-365             January 1, 1959
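
The table above can be checked with a short DATA step. This is a minimal sketch; the variable names d0 to d3 are made up for illustration.

```sas
data _null_;
  /* date constants are stored as days since January 1, 1960 */
  d0 = '01JAN1960'd;
  d1 = '31JAN1960'd;
  d2 = '01JAN1961'd;
  d3 = '01JAN1959'd;
  put d0= d1= d2= d3=;   /* log shows d0=0 d1=30 d2=366 d3=-365 */
  put d2= date9.;        /* same value written with a date format: d2=01JAN1961 */
run;
```

Because the stored value is just a number, a format such as DATE9. is needed whenever the date should be readable in output.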

How do you use an InFormat to create a SAS date?

If the raw fields are aligned in columns, use formatted input and specify the informat.
e.g. input @1 visitdt yymmdd8.;

With list input, use either a separate INFORMAT statement or the colon (:) modifier.
informat visitdt yymmdd8.;
input visitdt;
or
input visitdt : yymmdd8.;
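
As a rough illustration of the list-input form, the step below reads two dates with the colon modifier. The data set name visits and the two sample values are assumptions made up for this sketch.

```sas
data visits;
  /* the colon modifier applies the informat to list input */
  input visitdt : yymmdd8.;
  /* without a format the value would print as a number of days */
  format visitdt date9.;
  datalines;
20120320
20120501
;
run;

proc print data=visits;
run;
```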

How do you refer to a particular date in SAS?

To create a SAS date constant, write the date enclosed in single or double quotes, followed by a D.
e.g. age = '20-MAR-2012'd - incdt;
where visitdt lt "1may12"D;

How do you work with SAS date values?

When a variable is a SAS date value, you can easily apply operations such as addition and subtraction. To find the number of days between two dates, simply subtract the two SAS date variables.
e.g. daysbtwn = visitdt1 - visitdt2;

Comparison operators can also be used.
if visitdt1 < visitdt2 then do;
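
Subtraction and comparison can be put together in one small step. This is only a sketch: the two dates are invented, and the variable names follow the snippets above.

```sas
data _null_;
  visitdt1 = '20MAR2012'd;
  visitdt2 = '01MAR2012'd;
  /* subtracting two SAS dates gives the number of days between them */
  daysbtwn = visitdt1 - visitdt2;
  put daysbtwn=;                     /* log shows daysbtwn=19 */
  /* dates compare like any other numbers */
  if visitdt2 < visitdt1 then put 'visitdt2 is the earlier visit';
run;
```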

SAS Cheat Sheet – Part 6

The macro facility is a tool you can use in your programming.

Macro Language

%DO macro-var=start_value %TO end_value; /*Executes a section of a macro repetitively based on the value of an index variable*/

%DO %WHILE (expression); /*Executes a section of a macro repetitively while a condition is true*/

%DO %UNTIL (expression); /*Executes a section of a macro repetitively until a condition is true*/

Macro variables can be stored in either the global symbol table or in a local symbol table

%GLOBAL macro-variable(s); /*Creates macro variables that are available during the execution of an entire SAS session*/

%IF expression %THEN action; <%ELSE action;> /*Conditionally processes a portion of a macro*/

%LENGTH (character string | text expression) /*Returns the length of a string*/

The name assigned to a macro variable must be a valid SAS name

%LET macro-variable =<value>; /*Creates a macro variable and assigns it a value*/

%MACRO macro-name <(pp1<, ..., ppn><, kp1=value<, ..., kpn=value>>)>; /*Begins a macro definition*/

%MEND <macro-name>; /*Ends a macro definition*/

%SCAN(argument,n<,delimiters>)/*Search for a word that is specified by its position in a string*/

%SUBSTR (argument,position<,length>)/*Produce a substring of a character string*/

%UPCASE (character string | text expression) /*Convert values to uppercase*/

Macro variable values are text values
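
Several of the statements and functions above can be seen working together in one small macro. This is a minimal sketch; the macro name list_sites, the variable sites and its three values are hypothetical.

```sas
%let sites = boston chicago denver;   /* a macro variable holds plain text */

%macro list_sites;
  %local i site;                      /* keep the loop variables local to the macro */
  %do i = 1 %to 3;
    %let site = %scan(&sites, &i);    /* pick the i-th word of the list */
    %put Site &i: %upcase(&site) has %length(&site) letters;
  %end;
%mend list_sites;

%list_sites
```

The first line written to the log should read "Site 1: BOSTON has 6 letters", with one line per word in the list.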

Macro Quoting

%QUOTE | %NRQUOTE and %BQUOTE | %NRBQUOTE /*Mask special characters and mnemonic operators in a resolved value at macro execution */

%STR | %NRSTR /*Mask special characters and mnemonic operators in constant text at macro compilation */

%SUPERQ /*Masks special characters/mnemonic operators at macro execution but prevents further resolution of the value*/
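
A short sketch of the two compile-time quoting functions may help here. The macro variable names stmt and label and their values are made up for illustration.

```sas
/* %STR masks the semicolons, so the whole statement is stored as one value */
%let stmt = %str(proc print data=work.demog; run;);
%put &stmt;

/* %NRSTR also masks & and %, so &B is not resolved as a macro variable */
%let label = %nrstr(A&B);
%put &label;
```

Without %STR the first %LET would end at the first semicolon, and without %NRSTR the macro processor would try to resolve &B.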


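/*----------------------------------------------------------------*/
/* This macro will produce summary statistics                     */
/*----------------------------------------------------------------*/

%macro safety1;
  %odscmd(start, portnum=11, rptname=AEActivityReport); /* site-specific macro, not defined here */
  data _null_;
    thedate = today(); /* assumption: the original does not show where thedate is set */
    fstdate = mdy(month(thedate), 1, year(thedate));   /* first day of the month */
    lstdate = intnx('month', fstdate, 1) - 1;          /* last day of the month */
    call symput('strtdate', put(fstdate, date9.));
    call symput('stpdate', put(thedate, date9.));
  run;

  %put strtdate: &strtdate;
  %put stpdate: &stpdate;
  * more code goes here;
%mend safety1;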


SAS Cheat Sheet – Part 3


Code          Description
<w.d>         Writes standard numeric data
<COMMAw.d>    Writes numeric values with commas and decimal points
<Zw.d>        Writes numeric values with leading zeros
<$w.>         Writes standard character data
<$CHARw.>     Writes character data, preserving leading blanks
<$VARYINGw.>  Writes character data of varying length

A format changes how the values of variables are displayed in your SAS output. A more formal term for this process is “altering the external representation of the values of variables in a SAS data set.”

So how do you add leading zeros? e.g., 999 -> 0999

In order to create variable LEADZEROS_ID, you can use the PUT function and the format Zw. which adds leading zeros to the specified length. Here’s an example:


data leadzeros;
input id name $ sex $ x1 x2; /* column names are assumed; the original INPUT statement is missing */
datalines;
1 Mark M 1 2.5
2 Ace M 3 44
3 Fuzzy M 2 18
4 Champ M 4 55
5 Tom M 6 63.5 62.5
6 Ivon M 7 83
7 Balboa M 12 64.5
;
run;

data leadzeros;
set leadzeros;
leadzeros_id = put(id,z4.); *z4. pads to a width of 4, so 3 leading zeros are added when the ID has 1 digit;
run;

Another common format is the COMMAw.d: this format writes numeric values with commas that separate every three digits and a period that separates the decimal fraction.

w = specifies the width of the output field
d = optionally specifies the number of digits to the right of the decimal point in the numeric value

e.g. put @10 profit comma10.2;
so this number 45678.45 will be formatted to 45,678.45
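
Both formats discussed above can be tried in a single step. This is a minimal sketch that reuses the two values from the text; the variable names profit and id are illustrative.

```sas
data _null_;
  profit = 45678.45;
  id = 999;
  put profit comma10.2;   /* writes 45,678.45, right-aligned in 10 columns */
  put id z4.;             /* writes 0999 */
run;
```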

From Non-SAS Programmer to SAS Programmer

SAS programmers come from many different educational backgrounds. Many started their careers as a Data Manager in a CRO environment and grew to become SAS programmers. Others went to college and pursued degrees in math, statistics or computer science.

Do you have SAS skills? First, you need to find out more about the skills desired in statistical programming and start to slowly learn what SAS programmers and statisticians do in the pharmaceutical industry. It is also important to understand the drug development and regulatory process so that you have a better understanding of the industry as a whole, as well as the drug approval process.

In addition, I have personally attended several workshops on statistics for non-statisticians provided by several of my past employers/clients (GSK, Sanofi-Aventis, etc.) so I could have a greater understanding of the role statistics plays. I am personally more inclined to EDC development than to becoming a biostatistician, but these are just a few of the steps you could take to grow your career as a SAS programmer.

Practice, Practice, Practice!

To begin learning how to actually program in SAS, it would be a good idea to enroll in a SAS course provided by the SAS Institute near you or via eLearning. I have taken the course SAS Programming 1: Essentials, and I would recommend it. You could also join SUGI conferences and other user groups near your city/country. Seek every opportunity to gain further understanding of how to program efficiently in the pharmaceutical industry. It could well land you a junior SAS programming position.

Transitioning to a SAS Programming role: Now that you have gotten your first SAS programming job, you will need to continue your professional development and attend additional training, workshops, seminars and study workgroup meetings. The SAS Institute provides second-level, more advanced courses: Programming II: Manipulating Data with the Data Step, SAS Macro Language and SAS Macro Programming Advanced Topics. There are also SAS certification courses available to help you prepare to become a SAS certified programmer.

There is a light at the end of the tunnel: Advance!

Your ongoing development will be very exciting and challenging. Continue attending SAS classes as needed and attend industry-related conferences such as PharmaSUG to gain additional knowledge and insight on how to perform your job more effectively and efficiently.

As you can see, it is possible to ‘grow’ a SAS programmer from a non-programming background into an experienced programmer. All of the classes, training, and projects you work on are crucial in expanding your SAS knowledge and will open up a very exciting career ahead of you.


Five (5) Types of Errors When Writing SAS Programs

Fortunately, the SAS system fixes some mistakes made by SAS programmers. For example, SAS has gotten so smart over the years that it is now almost impossible to get an error by misspelling a keyword. If you misspell a keyword in a SAS program, SAS will almost always figure out what you meant to say and run the statement correctly in spite of your poor spelling skills. But SAS cannot fix all programming errors for you, so here are some of the most common errors and how to debug them.

Syntax = compile-time errors
For example: missing semicolon [proc means data=work.demog run;]

Semantic = compile-time error when the language element is correct, but the element is not valid for that particular usage
For example: a function called with the wrong number of arguments. (By contrast, a DATA step that produces wrong results but no error message contains a logic error.)

Execution-time = when SAS attempts to execute a program and execution fails

Data = execution-time error when data values are invalid
For example: missing values were generated, numeric to character conversion, invalid data or a character field is truncated

Macro-related = when you use the macro facility incorrectly
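
The numeric to character conversion listed under data errors can be reproduced with a few lines. This is a sketch only; the variable names x, c1 and c2 are made up.

```sas
data _null_;
  length c1 c2 $12;
  x = 5;
  c1 = x;                   /* implicit conversion: SAS writes a conversion NOTE to the log */
  c2 = strip(put(x, 8.));   /* explicit conversion with PUT: no conversion NOTE */
  put c1= c2=;
run;
```

Converting explicitly with PUT keeps the log clean and makes the intended format visible in the code.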

The most important rule in debugging SAS programs is to always check the SAS log. It is important to review the log messages each time you submit your program. To review the log, start at the top and look for messages marked ERROR, WARNING or NOTE.

WARNING: The data set WORK.DEMOG may be incomplete. When this step was stopped there were 0 observations and 5 variables.

This message tells you that SAS did run the DATA step and was able to perform the action, but for some reason there are zero observations. This could be a non-issue, but generally speaking, when you go to the trouble of creating a data set, you want some data in it.

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DEMOG has 2 observations and 2 variables.

Notes are simply there to inform you of the status of your program. If you were expecting 27 observations, one for each input record, then the note above would tell you that something went wrong. Notes can also be useful for streamlining code by writing more efficient programs. For example, if a report takes too long to run, the run time (total process time) reported in the notes is another way to check your code.

A missing semicolon is notorious for producing misleading error messages. The compiler depends on a sequence of keywords to identify the type of statement. If you leave out a semicolon, then you hide the keyword of the next statement. The compiler is likely to find something wrong, but it is usually not the real mistake, which is the missing semicolon. Hence the errors and warnings are just hints about what the compiler is seeing instead of the underlying problem.

One final note: you can insert PUTLOG statements to help identify errors:

data demog;
set edc.demog_summary;
by patient_id;
if first.patient_id=1 then race='white';
putlog race=;
run;

proc print data=demog;
run;

The DATA step debugger offers SAS programmers another way to investigate logic errors. SAS runs a program in two phases: it first compiles the program and then executes it. To invoke the debugger, add / DEBUG to the end of your DATA statement and then run your DATA step.

If we modify the previous DATA step:

data demog / DEBUG;
set edc.demog_summary;
by patient_id;
if first.patient_id=1 then race='white';
run;

After you submit the above code, two windows appear: the DEBUGGER LOG window and the DEBUGGER SOURCE window. As you may have imagined, the DEBUGGER LOG window contains messages from the debugger and command line. The SOURCE window contains your DATA step statements with current line highlighted. SAS executes each line of your program for the first observation, then returns to the top of the DATA step for the second observation and so on.

As you can see, there are many ways to check your SAS programs for errors, even when the output looks fine. Notes are just as important as warnings and error messages. I strongly recommend that you learn how to use the debugger, as it can save lots of time when debugging your program!
