Tag Archives: Programming

Data Entry | Programmer | Graphic Design | SEO Services for Your Business

Are you struggling to work with freelance developers and designers, and looking for a reliable yet cost-effective solution? We let you hire contract virtual employees for any IT-related service, such as software, web or mobile app development, graphic design or data entry, on a full-time, part-time or per-project basis.

Current resource availability

1- Graphic Designer - 2-5 years' experience - ($749-$1499) [availability: 2]

2- HTML Designer - 3 years' experience - $1349 [availability: 2]

3- PHP or .NET Developer - 1-3 years' experience - ($999-$1999) [availability: 4]

4- Mobile App Developer - $1500 [availability: 2]

5- Content Writer - $999 [availability: 3]

6- SEO / Google pay-per-click expert - $999 [availability: 4]

7- Data Entry (data search, list management, CMS entries) - $799

Also available: more experienced developers/programmers (MS SQL or Oracle, .NET, PL/SQL) and project managers for various technologies. The main features are: • month-to-month contract • scale up or down quickly and easily

Reply to this email with your requirements or questions and I will be happy to share our company profile, or click here!


Society of the Sojourner

Dear Readers,

Please accept our apologies for having "dropped off the planet Earth" during the last 6 months, without producing any new posts on this blog.

Due to circumstances beyond our control, we here, at EDC Developer, do hope that you have not given up on reading us…

Actually, in some ways…we did go ”off road”…

More details about this will be coming very soon …

In the meantime, please accept a small "heads-up":

Somewhere within, and underneath, one of Switzerland's mountains:

Fair Use Notice: This video contains some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law.) If you wish to use this copyrighted material for purposes that go beyond fair use, you must obtain permission from the copyright owner. Fair Use notwithstanding we will immediately comply with any copyright owner who wants their material removed or modified, wants us to link to their website or wants us to add their photo.

Disclaimer: The EDC Developer blog is “one man’s opinion”. Anything that is said on the report is either opinion, criticism, information or commentary. If making any type of investment or legal decision it would be wise to contact or consult a professional before making that decision.

Disclaimer: The content of these columns does not necessarily reflect the opinion of {EDC Developer}.

SAS Cheat Sheet – Part 4


Code          Description
<w.d>         Reads standard numeric data
<DATEw.>      Reads date values (ddmmmyy)
<$w.>         Reads standard character data
<$VARYINGw.>  Reads character data of varying length

SAS represents a date internally as the number of days
between January 1, 1960 and the specified date. Dates
before 1960 are represented with negative numbers. A
SAS date value is a numeric variable.

SAS Date Value   Calendar Date
0                January 1, 1960
30               January 31, 1960
366              January 1, 1961
-365             January 1, 1959
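The values in the table can be checked directly. The short step below is only a sketch: it assigns four date constants to hypothetical variable names (d0, d30, d366, dneg) and writes their internal numeric values to the log.

```sas
data _null_;
  /* Date constants are stored as the number of days since January 1, 1960 */
  d0   = '01JAN1960'd;
  d30  = '31JAN1960'd;
  d366 = '01JAN1961'd;   /* 1960 was a leap year, hence 366 */
  dneg = '01JAN1959'd;   /* dates before 1960 are negative  */

  /* Without a format, the raw numeric values are written to the log */
  put d0= d30= d366= dneg=;

  /* With a format, the same value is displayed as a calendar date */
  put d366= date9.;
run;
```

The first PUT statement should show 0, 30, 366 and -365, matching the table.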

How do you use an InFormat to create a SAS date?

If the raw fields are aligned in columns, use formatted input and specify the informat.
e.g. input @1 visitdt yymmdd8.;

With list input, use a separate INFORMAT statement or the colon (:) format modifier.
informat visitdt yymmdd8.;
input visitdt;
or
input visitdt : yymmdd8.;
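As a minimal sketch of both approaches, the two steps below read the same dates from in-stream data. The dataset names (visits_fmt, visits_list), the patno variable and the sample values are made up for illustration.

```sas
/* Formatted input: the date always starts in column 1 */
data visits_fmt;
  input @1 visitdt yymmdd8.;
  format visitdt date9.;   /* display the stored number as a date */
  datalines;
20120320
20120501
;
run;

/* List input: fields are separated by blanks, so the colon modifier
   tells SAS which informat to use for the date field */
data visits_list;
  input patno visitdt : yymmdd8.;
  format visitdt date9.;
  datalines;
101 20120320
102 20120501
;
run;
```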

How do you refer to a particular date in SAS?

To create a SAS date constant, write the date enclosed in single or double quotes, followed by a D.
e.g. age = '20MAR2012'd - incdt;
where visitdt lt "1may12"D;

How do you work with SAS date values?

When a variable is a SAS date value, you can easily apply operations such as addition and subtraction. To find the number of days between two dates, simply subtract one SAS date variable from the other.
e.g. daysbtwn = visitdt1 - visitdt2;

Comparison operators can also be used.
if visitdt1 < visitdt2 then do;
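Putting the last two questions together, here is a small sketch with two made-up visit dates. It subtracts one date from the other and then compares them.

```sas
data _null_;
  visitdt1 = '20MAR2012'd;
  visitdt2 = '01MAY2012'd;

  /* Subtracting two SAS dates gives the number of days between them */
  daysbtwn = visitdt2 - visitdt1;
  put daysbtwn=;            /* 42 days from March 20 to May 1, 2012 */

  /* Dates are numbers, so ordinary comparison operators work */
  if visitdt1 < visitdt2 then put 'visitdt1 is the earlier visit';
run;
```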

SAS Cheat Sheet – Part 6

The macro facility is a tool you can use in your programming.

Macro Language

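%DO macro-var=start_value %TO end_value <%BY increment>; /*Executes a section of a macro repetitively, based on the value of an index variable*/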

%DO %WHILE (expression); /*Executes a section of a macro repetitively while a condition is true*/

%DO %UNTIL (expression); /*Executes a section of a macro repetitively until a condition is true*/

Macro variables can be stored in either the global symbol table or in a local symbol table

%GLOBAL macro-variable(s); /*Creates macro variables that are available during the execution of an entire SAS session*/

%IF expression %THEN action; <%ELSE action;> /*Conditionally process a portion of a macro*/

%LENGTH (character string | text expression) /*Returns the length of a string*/

The name assigned to a macro variable must be a valid SAS name

%LET macro-variable =<value>; /*Creates a macro variable and assigns it a value*/

%MACRO macro-name <(pp1<, ..., ppn><, kp1=value<, ..., kpn=value>>)>; /*Begins a macro definition; pp = positional parameter, kp = keyword parameter*/

%MEND <macro-name>; /*Ends a macro definition*/

%SCAN(argument,n<,delimiters>)/*Search for a word that is specified by its position in a string*/

%SUBSTR (argument,position<,length>)/*Produce a substring of a character string*/

%UPCASE (character string | text expression) /*Convert values to uppercase*/

Macro variable values are text values
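To see several of these statements working together, here is a minimal sketch. The macro name list_words and its parameter are hypothetical. It walks through a blank-separated list with %SCAN inside a %DO %WHILE loop and prints each word in uppercase.

```sas
%macro list_words(text);
  %local i word;                 /* keep the working variables local to the macro */
  %let i = 1;
  %let word = %scan(&text, &i, %str( ));

  /* Loop until %SCAN returns an empty string (no more words) */
  %do %while (%length(&word) > 0);
    %put Word &i: %upcase(&word);
    %let i = %eval(&i + 1);      /* macro values are text, so %EVAL does the arithmetic */
    %let word = %scan(&text, &i, %str( ));
  %end;
%mend list_words;

%list_words(safety efficacy labs)
```

The call above should write three lines to the log, one per word.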

Macro Quoting

%QUOTE | %NRQUOTE and %BQUOTE | %NRBQUOTE /*Mask special characters and mnemonic operators in a resolved value at macro execution */

%STR | %NRSTR /*Mask special characters and mnemonic operators in constant text at macro compilation */

%SUPERQ /*Masks special characters/mnemonic operators at macro execution but prevents further resolution of the value*/
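A short sketch of the two compile-time quoting functions follows. The macro variable names and values are made up for illustration.

```sas
/* %STR masks the semicolons, so the whole step is stored as one value */
%let step = %str(proc contents data=sashelp.class; run;);

/* %NRSTR also masks & and %, so &T is not treated as a macro variable reference */
%let company = %nrstr(AT&T);

%put step value: &step;
%put company value: &company;
```

Without %STR, the first %LET would end at the first semicolon. Without %NRSTR, SAS would try to resolve &T and write a warning to the log.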


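/*----------------------------------------------------------------*/
/* This macro will produce summary statistics                     */
/*----------------------------------------------------------------*/
%macro safety1;
  /* %odscmd is a site-specific macro that is not shown here */
  %odscmd(start, portnum=11, rptname=AEActivityReport);

  data _null_;
    /* Assumption: the original does not show where thedate is set */
    thedate = today();
    fstdate = mdy(month(thedate), 1, year(thedate)); /* first day of the month */
    lstdate = intnx('month', fstdate, 1) - 1;        /* last day of the month  */
    call symput('strtdate', put(fstdate, date9.));
    call symput('stpdate', put(thedate, date9.));
  run;

  %put strtdate: &strtdate;
  %put stpdate: &stpdate;
  /* more code goes here */
%mend safety1;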


SAS Cheat Sheet – Part 3


Code          Description
<w.d>         Writes standard numeric data
<COMMAw.d>    Writes numeric values with commas and decimal points
<Zw.d>        Writes numeric values with leading zeros
<$w.>         Writes standard character data
<$CHARw.>     Writes standard character data
<$VARYINGw.>  Writes character data of varying length

A format is used to change how the values of variables are displayed in your SAS output. A more formal term for this process is “altering the external representation of the values of variables in a SAS data set”.

So how do you add leading zeros, e.g. 999 –> 0999?

To create the variable LEADZEROS_ID, you can use the PUT function and the Zw. format, which adds leading zeros up to the specified width. Here’s an example:


data leadzeros;
  /* The original omits the INPUT statement; variable names other than id are assumed */
  input id name $ sex $ num1 num2;
  datalines;
1 Mark M 1 2.5
2 Ace M 3 44
3 Fuzzy M 2 18
4 Champ M 4 55
5 Tom M 6 63.5 62.5
6 Ivon M 7 83
7 Balboa M 12 64.5
;
run;

data leadzeros;
  set leadzeros;
  /* z4. pads to a width of 4, so a one-digit ID gets 3 leading zeros */
  leadzeros_id = put(id, z4.);
run;

Another common format is the COMMAw.d: this format writes numeric values with commas that separate every three digits and a period that separates the decimal fraction.

w = specifies the width of the output field
d = optionally specifies the number of digits to the right of the decimal point in the numeric value

e.g. put @10 profit comma10.2;
so this number 45678.45 will be formatted to 45,678.45
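Both formats can be tried in a single step. This is only a sketch with made-up values; it writes the formatted results to the log instead of to a report.

```sas
data _null_;
  profit = 45678.45;
  id     = 999;

  /* COMMA10.2: total width 10, 2 decimal places, commas every three digits */
  put profit= comma10.2;

  /* Z4.: total width 4, padded on the left with zeros */
  put id= z4.;
run;
```

The log should show the profit as 45,678.45 and the id as 0999.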

iReview in Clinical Data Management

JReview® is the web-enabled version of Integrated Review™ (iReview). It allows users to view, create, print, and interact with their Integrated Review™ objects locally on an Intranet or securely over the Internet. JReview® can be run in two different modes of operation (authoring and non-authoring) in addition to two modes of communication (clear-text and SSL).

iReview Common Development Practice:

  • iReview allows you to save a library of objects to be deployed at the “Global” level in the production environment.
  • Create separate categories (folders for DEV/QC/UAT) before approval (deployment into production)
    – “Development”
    – “QC”
  • Create study specific folders under those categories (e.g. DEV/QC/UAT)
  • Configure UserGroups to manage privileges appropriately at the category level
    – “Developers can access Development”
    – “QR/QT can access QC”

  • You can query iReview metadata
  • Business rule verification by checking
    – “Panel names, item names”
    – “Object location e.g. Public, private or usergroup”
  • Use of SQL to query iReview objects metadata
  • The information in CONTENTBLOCK is parsed to get additional metadata information for a particular iReview object
  • Define a detailed QC checklist for each object in the Global Library
  • Maintain a lessons learned document (knowledge base) to improve the development process

  • Continuously improve processes by collecting Metrics

    – Development time

    – QC time

    – Rework time

Advanced Functionality

  • Deploy reports with dynamic Filter values
  • Filter values are not static and change during trial conduct
  • Deployment for non-technical end-users
  • Provide easy access to report
  • Create Lookup table(s) in the backend
  • Populate Lookup table(s) with study specific Filter values

  • Using “Filter Output” in IR, add appropriate nested queries to the WHERE clause

  • Using Import SQL allows more complex dynamic filtering, so there is no need to hardcode values in the front end

  • Saves development time by avoiding the creation of study specific filters and increases re-usability

  • Flexibility to activate/inactivate filter values via backend

Import SQL

  • Modifying an Import SQL panel by adding more items will not impact existing reports that already use this Import SQL

  • Import SQL is limited to a maximum of 2000 characters (exceeding the limit results in an error)

A workaround is to create a stored procedure or a view

Patient Selection Criteria

  • Modifying a PSC has no impact on already saved existing reports using this PSC

Object Specifications window

  • Removing Objects (missing folders)
    – When all the objects are removed from a folder in the Object
    Specifications window, the folder with no objects will be hidden but
    not removed
    e.g. Drug Safety ..> All AEs ..> SAE Reports ..>SAE reconciliation
  • Removing all objects under “SAE Reports” folder will result in the “SAE Reports” folder being hidden
  • The workaround would be to use the Category section of Object Management tool to remove these hidden folders

Navigating iReview Windows

  • If you have hundreds of saved objects, typing the first few letters (similar to Windows Explorer) will help with easy scrolling and navigation in the Object Specifications window

Reference: Integrated Clinical Systems, Inc.

From Non-SAS Programmer to SAS Programmer

SAS programmers come from many different educational backgrounds. Many started their careers as Data Managers in a CRO environment and grew to become SAS programmers. Others went to college and pursued degrees in math, statistics or computer science.

Do you have SAS skills? First, you need to find out more about the skills desired in statistical programming and slowly start to learn what SAS programmers and statisticians do in the pharmaceutical industry. It is also important to understand the drug development and regulatory process, so that you have a better understanding of the industry as a whole as well as of the drug approval process.

In addition, I have personally attended several workshops on Statistics for Non-statisticians provided by several of my past employers/clients (GSK, Sanofi-Aventis, etc.) so I could gain a greater understanding of the role of statistics. I am personally more inclined towards EDC development than towards becoming a biostatistician, but these are just a few of the steps you could take to grow your career as a SAS programmer.

Practice, Practice, Practice!

To begin learning how to actually program in SAS, it would be a good idea to enroll in a SAS course provided by the SAS Institute near you or via eLearning. I have taken the course SAS Programming 1: Essentials, and I would recommend it. You could also join SUGI conferences and other user groups in your city or country. Seek every opportunity to gain further understanding of how to program efficiently in the pharmaceutical industry. It could well land you a junior SAS programming position.

Transitioning to a SAS Programming role: Now that you have gotten your first SAS programming job, you will need to continue your professional development and attend additional training, workshops, seminars and study workgroup meetings. The SAS Institute provides second-level, more advanced courses: Programming II: Manipulating Data with the Data Step, SAS Macro Language, and SAS Macro Programming Advanced Topics. There are also SAS certification courses available to help you prepare to become a SAS certified programmer.

There is a light at the end of the tunnel: Advance!

Your ongoing development will be very exciting and challenging. Continue attending SAS classes as needed, and attend industry-related conferences such as PharmaSUG to gain additional knowledge and insight on how to perform your job more effectively and efficiently.

As you can see, it is possible to ‘grow’ a SAS programmer from a non-programming background into an experienced programmer. All of the classes, training and projects you work on are crucial in expanding your SAS knowledge and will give you a very exciting career ahead.

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.

Good Programming Practice for Clinical Trials by Sunil Gupta

The following are draft recommendations for a Good Programming Practice for analysis, reporting and data manipulation in Clinical Trials and the Healthcare industries.

The purpose is to encourage contributions from across companies, non-profit organizations and regulators in an attempt to create a consensus recommendation. The ambition is that this page becomes recognized by the Pharmaceutical Industry, Clinical Research and Health Care Organizations as well as Regulatory Authorities.

The hope is that the Practice can be reviewed and endorsed by the relevant management teams of several Pharmaceutical companies and major Contract Research Organizations and promoted through relevant professional organizations such as PharmaSUG, PhUSE, PSI, CDISC.


The Good Programming Practices are defined in order to:

  • Ensure the clarity of the code and facilitate code review;
  • Save time in case of maintenance, and ease the transfer of code among programmers or companies;
  • Minimize the need for code maintenance by robust programming;
  • Minimize the development effort by development and re-use of standard code and by use of dynamic (easily adaptable) code;
  • Minimize the resources needed at execution time (improve the efficiency of the code);
  • Reduce the risk of logical errors.
  • Meet regulatory requirements regarding validation and 21 CFR Part 11 compliance.

Note: As often, the various guidelines provided hereafter may conflict with one another if applied in too rigorous a way. Clarity, efficiency, re-usability, adaptability and robustness of the code are all important, and must be balanced in the programming practice.

Regulatory Requirements


21CFRPart11 compliance

Readability and Maintainability


English is an international language, and for practical reasons (regulatory authorities, in-licensing, out-licensing, partnerships, mergers) study protocols and study reports are mostly written in English. It is therefore recommended to write the SAS code and comments in English.

Header and Revision History

  • Include a header for every program (template below).
**********************************************************;
* Program name      :
*
* Author            :
*
* Date created      :
*
* Study             : (Study number)
*                     (Study title)
*
* Purpose           :
*
* Template          :
*
* Inputs            :
*
* Outputs           :
*
* Program completed : Yes/No
*
* Updated by        : (Name) – (Date):
*                            (Modification and Reason)
**********************************************************;
  • In addition to your name or initials, use your login ID to identify yourself in the header, so that there is no ambiguity about the identity of each programmer.
  • Update the revision history at each code modification made after the finalization of the first version of a program.

Note: When you copy a program from another study, you become the author of this program, and you should clear the revision history. You can specify the origin of the program under the “Template” section of the header.

Below is an example with comments of an alternative comment block that I think is more useful for Open Source programming. PaulOldenKamp 16:54, 5 April 2009 (UTC)

/**
 ----------------------------------------------------------------------------------------
 $Id: os3a_autoexec.sas 152 2008-11-17 01:48:40Z Paul_ok01 $
                        <== Id info automatically inserted with each commit to Subversion version control
 Application: OS3A - Common Programs
 Description: OS3A session initialization program.
 Previous Program: None
 Saved as: c:\os3a\trunk\os3a_autoexec.sas
           http://os3a.svn.sourceforge.net/viewvc/os3a/trunk/os3a_autoexec.sas
                        <== locations, local and web, where the pgm can be found
 Change History:
   Date        Prog  Ref  Description
   04/26/2008  PMO   [1]  Initial programming for os3a
                        <== date, programmer initials, ref number [1] to link to specific location of change
 Copyright: Copyright (c) 2008 OS3A Program. All rights reserved.
 Copyright Contact: paul_ok01@users.sourceforge.net
                        <== Always tell who owns the program so one can ask permission from the copyright holder
 License: Eclipse Public License v1.0
   This program and the accompanying materials are made available under
   the terms of the Eclipse Public License v1.0 which accompanies this
   distribution, and is available at www.eclipse.org/legal/epl-v10.html
                        <== Tell folks how they are licenced to use the program.
 Contributors:
   Paul OldenKamp, POK_Programming@OldenKamp.org - Develop initial pgm.
                        <== Identify significant contributors

                        <== tag identifies start of info used by the Codedoc Perl script to produce external HTML documentation
 @purpose OS3A system initialization program.
          Set up initial options and global macro variables.
 @param SYSPARM - input provided from program initiation call; default: main_FORE for production.
 @symbol sysRoot - system root location; Windows - C:/, UNIX - /
 @symbol remove_cmd - system command to remove file; Windows - erase, UNIX - rm
 @symbol os3aRoot - directory location of os3a root
 @symbol futsRoot - directory location of FUTS top level macros.
 @symbol Root - directory location of sub-project identified with Four Letter Acronym
 ----------------------------------------------------------------------------------------- */

The results from encoding the header and comments in a SAS program can be seen on the CodeDoc web page. See http://www.thotwave.com/products/codedoc.jsp.
CodeDoc Download Page


  • Include a comment before each major DATA/PROC step, especially when you are doing something complex or non-standard. Comments should be comprehensive, and should describe the rationale and not simply the action. For example, do not comment “Access demography data”; instead explain which data elements and why they are needed.
  • Organize the comments into a hierarchy.
  • Do not number the comments.

Reason: It avoids heavy update when removing or inserting sections.

Naming Conventions

  • Use explicit and meaningful names for variables and datasets, with a maximum length of 8.
  • For permanent datasets, use a meaningful dataset label and variable labels.
  • Whenever possible, do not use the same name for more than one dataset in a program.

Note: However, keep in mind that large intermediate files take a lot of SAS Workspace.

  • Name IN= variables using “in” plus a meaningful reference to the dataset.


data aelst;
   merge aesaes (in=inae) patpat (in=inpat);
   by patno;
   if inae and inpat;
run;
  • Labels must have a maximum length of 40 characters.

Code Structure

  • It is mandatory to include libnames, options and formats in a separate setup program unless these are temporary formats or temporary options that are reset after being used.

Reason: It will guarantee that changes of the environment are taken into account in all programs run afterwards.

  • Use standard company macros to read in libnames and settings, to write out datasets, and for standard calculating and reporting.
  • One statement per line, but several are allowed if small and repeated or related. Long statements should be split across multiple lines.
  • Control system settings to show all executed code in the log as the default, in as clear manner as possible. The log should not be so lengthy that the programmer cannot easily navigate (if so, use highly visible comments with sufficient white space). System settings should be able to be easily changed in order for a user to debug a section of code or a macro, in order to temporarily display the %included code, resolved macro names, and logic.
  • Use a standard sequence for placing statements and group like statements together.
  1. Within a program:
    1. %LET statements and macro definitions
    2. Input steps
    3. Calculations
    4. Save final (permanent) datasets and created outputs
  2. Within a DATA step:
    1. All non-executable statements first (e.g. ATTRIB, LENGTH, KEEP…)
    2. All executable statements next

Reason: It increases the readability of the program.

  • Left-justify DATA, PROC, OPTIONS statements, indent all statements within.


proc means data=osevit;
   var prmres;
   by prmcod treat;
run;
  • End every DATA/PROC step with a left-aligned RUN statement.

Reason: It explicitly defines the step boundary.

  • Insert at least one blank line after each RUN statement in DATA/PROC steps.
  • Indent statements within a DO loop, align END with DO.
  • Avoid having too many nested DO loops and IF-ELSE statements.
  • In case of nested DO loops, add a comment at the start (DO) and end (END) of each loop.


data test01;
  do patno=1 to 40; * cycle thru patients;
    do visit=1 to 3; * cycle thru visits;
      output;
    end; * cycle thru visits;
  end; * cycle thru patients;
run;
  • Insert parentheses in meaningful places in order to clarify the sequence in which mathematical or logical operations are performed.


data test02;
  set test01;
  if (visit=0  and vdate lt adate1)
   or (visit=99 and vdate gt adate2) then delete;
run;

Style conventions

Draft section : this may not be specific to clinical programming, but may be of use when considering a general standard for sharing programs.

Use of analysis datasets

For discussion of why programming output directly from raw data is generally avoided.


  • When you input or output a SAS dataset, use the KEEP= dataset option (preferred to DROP=) to keep only the needed variables.

Reason: The SAS system loads only the specified variables into the Program Data Vector, eliminating all other variables from being loaded.

  • When subsetting a SAS dataset, use a WHERE statement rather than IF, if possible.

Reason: WHERE subsets the data before entering it into the Program Data Vector, whereas IF subsets the data after inputting the entire dataset.
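The two recommendations above can be combined on one SET statement. The sketch below uses SASHELP.CLASS, a sample dataset shipped with SAS; the output dataset names are made up. The second step shows the less efficient form for comparison.

```sas
/* Preferred: only three variables and only the matching rows are read */
data teens;
  set sashelp.class (keep=name age height
                     where=(age >= 13));
run;

/* For comparison: every variable of every row is read first,
   and the subsetting IF discards rows afterwards */
data teens_slow;
  set sashelp.class;
  if age >= 13;
  keep name age height;
run;
```

Both steps should produce the same dataset; the difference is in how much data is read to get there.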

  • When using IF condition, use IF/ELSE for mutually exclusive conditions, and check the most likely condition first.

Reason: The ELSE/IF will check only those observations that fail the first IF condition. With the IF/IF, all observations will be checked twice. Also, consider the use of a SELECT statement instead of IF/ELSE, as it may be more readable.
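As a sketch of both styles, the step below derives two hypothetical variables (agegrp and sexlbl) from SASHELP.CLASS, once with an IF/ELSE chain and once with SELECT. The cut-off values are arbitrary.

```sas
data classified;
  set sashelp.class;
  length agegrp $8 sexlbl $7;   /* set lengths up front to avoid truncation */

  /* IF/ELSE: later conditions are evaluated only when earlier ones fail */
  if age < 13 then agegrp = 'Child';
  else if age < 16 then agegrp = 'Teen';
  else agegrp = 'Older';

  /* SELECT: the same idea, often easier to read with many branches */
  select (sex);
    when ('M') sexlbl = 'Male';
    when ('F') sexlbl = 'Female';
    otherwise  sexlbl = 'Unknown';
  end;
run;
```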

  • Avoid unnecessary sorting. The CLASS statement can be used in some procedures to perform by-group processing without sorting the data.


proc means data=osevit;
  var prmres;
  class treat;
run;
  • If possible (i.e. not a sorting variable), use character values for categorical variables or flags instead of numeric values.

Reason: It saves space. A character “1” uses one byte (if length is set to one), whereas a numeric 1 uses eight bytes.

  • Use the LENGTH statement to reduce variable size.

Reason: Storage space can be reduced significantly.
Note: Keep in mind that a too limited variable length could reduce the robustness of the code (lead to truncation with different sets of data).

  • Use simple macros for repeating code.


  • Use the MSGLEVEL=I option in order to have all informational, note, warning, and error messages sent to the LOG.
  • In the final code, there should be no dead code that does not work or that is not used. This must be removed from the program.
  • Code to allow checking of the program or of the data (on all data or on a subset of patients such as clean patients, discontinued patients, patients with SAE or patients with odd data) is encouraged and should be built throughout the program. This code can be easily activated during the development phase or commented out during a production run using the piece of code detailed in Section 6.
  • It is not acceptable to have avoidable notes or warnings in the log (mandatory).

Reason: They can often lead to ambiguities, confusion, or actual error (e.g. erroneous merging, uninitialized variables, automatic numeric/character conversions, automatic formatting, operation on missing data…).
Note: If such a warning message is unavoidable, an explanation has to be given in the program (mandatory).

  • Always use DATA= in a PROC statement (mandatory).

Reason: It ensures correct dataset referencing, makes program easy to follow, and provides internal documentation.

  • Be careful when merging datasets. Erroneous merging may occur when:
  1. No BY statement is specified (set system option MERGENOBY=WARN or ERROR).
  2. Some variables, other than BY variables, exist in both datasets. With system option MSGLEVEL=I, SAS writes a message to the log whenever a MERGE statement would cause variables to be overwritten, in which case the values from the last dataset on the MERGE statement are kept.
  3. More than one dataset contains repeats of BY values. Only a NOTE, not an ERROR, is written to the LOG. If you really need such a many-to-many merge, PROC SQL is the usual way to perform it.

Reason: The SAS log has to be checked carefully and routinely, because the situations above produce NOTE, INFO or WARNING messages rather than ERROR messages, yet the resulting dataset is rarely correct.
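A minimal sketch of a guarded merge follows. The two small datasets (ae and pat) and their contents are made up; the system options are the ones named above.

```sas
/* Fail on a MERGE without BY, and report overwritten variables in the log */
options mergenoby=error msglevel=i;

data ae;
  input patno aeterm $;
  datalines;
1 Headache
2 Nausea
;
run;

data pat;
  input patno sex $;
  datalines;
1 M
2 F
3 M
;
run;

/* Both inputs are already in patno order; keep only patients found in both */
data aelst;
  merge ae (in=inae) pat (in=inpat);
  by patno;
  if inae and inpat;
run;
```

Patient 3 has no adverse event record and is therefore dropped by the subsetting IF.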

  • When coding IF-THEN-ELSE constructs use a final ELSE statement to trap any observations that do not meet the conditions in the IF-THEN clauses.

Reason: You can only be sure that all possible combinations of data are covered if there is a final ELSE statement.

  • When coding a user-defined FORMAT, include the keyword ‘other’ on the left side of the equals sign so that all possible values have an entry in the format.

Reason: A missing entry in a user-defined FORMAT can be difficult to detect. The simplest way to identify this potential problem is to ensure that all values are assigned a format.
Note: This does not apply to INFORMATs. It could be more helpful to get a WARNING message when trying to INPUT data of unexpected format.
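Here is a short sketch of a user-defined format with a catch-all entry. The format name $sexf and its labels are hypothetical; SASHELP.CLASS is used only so that the step can be run as is.

```sas
proc format;
  value $sexf
    'M'   = 'Male'
    'F'   = 'Female'
    other = 'Unexpected value';   /* catches anything not listed above */
run;

proc freq data=sashelp.class;
  tables sex;
  format sex $sexf.;
run;
```

Any value other than M or F would then show up in the output under its own clearly labelled row instead of passing through unformatted.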

  • Try to produce code that will operate correctly with unusual data in unexpected situations (e.g. missing data).

Code for Data Checks

Build checks so that their purpose is clear, so that they can be toggled on or off, and remove them once they are no longer needed.

Activate/Deactivate Pieces of Code

In the beginning of the program, define a macro variable that you set to blank during the development phase or that you set equal to * for the production run:

%let c=;  or  %let c=*;

For the pieces of code that check the data/program, start each line with the macro variable defined above:

&c title "Check the visits for each patient";
&c proc freq data=patvis01;
&c    table patno*visit;
&c run;

This code will be executed if &c is blank (development), but will be commented out when &c=* (production).

Perform Checks on a Subset of Patients

In a separate code that you store under the study MACRO folder, list the subset of patients (clean patients, discontinued patients, patients with SAE or patients with odd data) that you want to look at:

%macro select;
2076 2162 2271 2449
%mend;

In the beginning of the program, define a second macro variable that you set equal to * when you want to perform checks on all data or to blank when you are interested in a subset of patients:

%let s=*;  or  %let s=;

For each checking code, add a piece of code that allows subsetting the data, and start each line of this piece of code with the 2 macro variables defined above:

&c title "Check the visits for each patient";
&c proc freq data=patvis01;
&c    table patno*visit;
&c &s where patno in (%select);
&c run;

The check will be performed only if &c is blank, and it will be applied to all patients if &s=* or on the subset of patients if &s is blank.
Better still: input the list of check case IDs as a dataset.

Floating Point Error

Consider the real number system that we are familiar with. It is infinite: it extends without bound in both directions, and there are infinitely many values between any two numbers. Most computers use floating point representation, in which a finite set of numbers is used to represent the infinite real number system. Some values therefore cannot be stored exactly, and small errors appear from time to time. This is generally termed Floating Point Error and occurs in computers due to hardware limitations.

The following paper goes some way towards explaining why and how this happens, and suggests possible ways to approach the issue.

Paper reference:- http://www.lexjansen.com/phuse/2008/cs/cs08.pdf
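The effect is easy to reproduce. The step below is only a sketch, and the rounding unit of 1e-9 is an arbitrary choice for illustration. On most platforms the exact comparison fails, while the rounded comparison succeeds.

```sas
data _null_;
  x = 0.1 + 0.2;

  /* Neither 0.1 nor 0.2 nor 0.3 can be stored exactly in binary */
  if x = 0.3 then put 'Exact comparison: equal';
  else put 'Exact comparison: not equal';

  /* Rounding both sides to a sensible precision before comparing */
  if round(x, 1e-9) = round(0.3, 1e-9) then put 'Rounded comparison: equal';
  else put 'Rounded comparison: not equal';

  /* The size of the representation error */
  diff = x - 0.3;
  put diff=;
run;
```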

Data Imputation versus Hardcoding

Please contribute

  • Definitions
    • Data Imputation
    • Hardcoding
  • Issues
  • Recommendation

Integrity of a data transfer

At a minimum, all data transfers should be validated by checking the observation counts for each SAS dataset, or the record counts in other formats, against counts provided by the sender. It is also helpful if the sender can provide a checksum for each file transferred, since this also ensures that all content made its way to you without transmission errors. There are many freeware programs available to calculate checksums for any file at websites such as http://sourceforge.net/ .
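The observation count check could be sketched as follows. SASHELP.CLASS stands in for the received dataset, and the expected count of 19 is a made-up value standing in for the figure supplied by the sender.

```sas
%let expected_nobs = 19;   /* hypothetical count provided by the sender */

/* Look up the actual number of observations in the dictionary tables */
proc sql noprint;
  select nobs into :actual_nobs trimmed
  from dictionary.tables
  where libname = 'SASHELP' and memname = 'CLASS';
quit;

/* Report the comparison in the log */
data _null_;
  if &actual_nobs = &expected_nobs then
    put 'NOTE: Observation count matches the count from the sender.';
  else
    put 'WARNING: Observation count mismatch: ' "expected &expected_nobs, received &actual_nobs";
run;
```

In practice the same check would be repeated for every dataset in the transfer, with the expected counts read from a file supplied by the sender.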


Draft section : recommendations particularly relevant for the development and use of macros and macro libraries.

Macros are particularly useful under the following circumstances:

  • Program code is used repeatedly
  • A number of steps must be taken conditionally, and the logic for these is clearly fixed (no need to think of all the steps that should be included in a program under a specific situation: the macro will deduce them for you and generate the appropriate data step or proc step code)
  • There is no trivial solution via “ordinary” SAS code
  • Applying the macro must be easier than programming the code itself!
  • Using the macro helps users avoid errors and omissions.

If used appropriately the following benefits can be achieved:

  • Increase in quality by avoiding programming bugs and errors
  • Savings in time and resources
  • Enforcement of standards, e.g. standard methods and standard outputs
  • Work can be more enjoyable as programmers can focus on the non-routine work

Ideally, macro development should follow a few rules:

  • Macro headers should clearly state all changes to environment and data that result from execution. Changes should be limited to those necessary for the focused purpose of the macro:
    • strictly controlled changes to input data and creation of output data
    • clear temporary data set clutter
    • no unexpected changes to system settings (options, titles, footnotes, etc)
    • no unexpected changes to external symbol tables
  • Scope of macro variables should be explicitly controlled using %global and %local statements.
  • Method of macro variable creation should demonstrate awareness of default scope:
  • The log matters:
    • Use Base SAS techniques whenever possible to avoid excessive code generation (log bloat). For example, a macro definition should use DATA step array and DO loop processing rather than macro %DO looping.
    • But use pure Macro Language for routine utility macros (see details below).
    • Use an appropriate comment style in macro definitions to properly annotate the SAS log when MPRINT is on. For example, use %* style commenting to explain macro logic, but /* style commenting to explain the resulting code. (Or * style or PUT statement commenting as appropriate.)
    • Allow the users to control the appearance of the log via MPRINT, SYMBOLGEN, and MLOGIC.
  • Code within a macro definition should be germane, limited to the specific purpose of the macro. The use of a central repository for macros (“Macro library”) is suggested.
  • Macro Library: Code for routine tasks (e.g., parameter checking, system and environment checking, messaging, etc.) should be handled by dedicated utility macros. Code for such routine tasks should not overwhelm the current macro definition, since that obscures its purpose, creates unnecessary maintenance overhead, and leads to a lack of consistency within a library.
  • Macro Library: Parameter naming conventions should be used for common parameters such as input/output libnames and data sets. Explicit and transparent control of macro variable scope again becomes crucial to avoid accidental changes to external symbol tables.
  • Macro Library: Use pure Macro Language definitions whenever possible to improve program flow and avoid producing unnecessary Base SAS code. Returning a list of data set variables, checking for macro variable existence, and returning a data set observation count can all be achieved without Base SAS code. Such macros can be called “inline” without unnecessary overhead or interruption of program flow.
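
The rule on macro variable scope in the list above can be sketched as follows. The macro name %list_vars and its working variables are hypothetical; only the DSIN parameter name follows the convention used in this section.

```sas
%macro list_vars(dsin=);
  /* Declare every working variable %local. Without this statement,
     the %do loop below would overwrite a macro variable named i
     that already exists in the caller's symbol table. */
  %local i dsid rc;

  %let dsid = %sysfunc(open(&dsin));
  %if &dsid %then %do;
    %do i = 1 %to %sysfunc(attrn(&dsid, nvars));
      %put Variable &i is %sysfunc(varname(&dsid, &i));
    %end;
    %let rc = %sysfunc(close(&dsid));
  %end;
  %else %put ERROR: could not open &dsin;
%mend list_vars;

/* The caller's own macro variable i survives the call unchanged */
%let i = 99;
%list_vars(dsin=sashelp.class)
%put Caller value of i is still &i;
```

Because the macro consists only of macro statements and %sysfunc calls, it generates no DATA step or PROC code and leaves no temporary data sets behind.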

For example, instead of a %count_ds_obs definition that uses DATA step code and interrupts program flow, like

%let n_obs = %count_ds_obs(DSIN=myData);
%if &n_obs > 0 %then %do;
  ... more statements ...
%end;

an inline, pure Macro Language implementation allows streamlined code:

%if %inline_ds_obs(DSIN=myData) > 0 %then %do;
  ... more statements ...
%end;
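
The source does not show how %inline_ds_obs is defined. One possible pure Macro Language definition, offered only as a sketch, uses %sysfunc with the OPEN, ATTRN and CLOSE functions. The return value of -1 for a data set that cannot be opened is an assumption made for this illustration.

```sas
%macro inline_ds_obs(dsin=);
  %local dsid n_obs rc;

  /* Assumed convention: -1 means the count could not be determined */
  %let n_obs = -1;

  %let dsid = %sysfunc(open(&dsin));
  %if &dsid %then %do;
    /* NLOBS = number of logical observations (deleted rows excluded) */
    %let n_obs = %sysfunc(attrn(&dsid, nlobs));
    %let rc = %sysfunc(close(&dsid));
  %end;

  /* The only text the macro generates is the count itself,
     so the call can sit inside another statement */
&n_obs
%mend inline_ds_obs;

%put sashelp.class has %inline_ds_obs(dsin=sashelp.class) observations;
```

Note that ATTRN can itself return -1 for NLOBS when the count is not available, for example for some views, so a caller should treat any negative value as "unknown".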

Source: Sunil Gupta, Senior SAS Consultant, Gupta Programming http://www.sascommunity.org/wiki/Good_Programming_Practice_for_Clinical_Trials

Five (5) Types of Errors When Writing SAS Programs

Fortunately, the SAS system fixes some mistakes made by SAS programmers. For example, SAS has gotten so smart over the years that it is now hard to get an error by misspelling a keyword. If you misspell a keyword in a SAS program, SAS will often figure out what you meant to say, write a warning to the log, and run the statement correctly in spite of your poor spelling skills. But SAS cannot fix all programming errors for you, so here are some of the most common errors and how to debug them.

Syntax = compile-time errors that occur when a statement does not follow the rules of the SAS language
For example: missing semicolon [proc means data=work.demog run;]

Semantic = compile-time errors that occur when the language element is correct, but the element is not valid for a particular usage
For example: using a numeric variable where only a character variable is valid

Execution-time = errors that occur when SAS attempts to execute a program and execution fails

Data = execution-time errors that occur when data values are invalid
For example: missing values were generated, a numeric to character conversion took place, the data is invalid, or a character field is truncated

Macro-related = errors that occur when you use the macro facility incorrectly
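
To make the "Data" category above concrete, here is a small sketch with made-up records. The program is syntactically valid and runs to completion, but the second record holds text where a number is expected.

```sas
data work.demog;
  input patient_id age;   /* both variables are read as numeric */
  datalines;
101 34
102 n/a
103 51
;
run;

proc print data=work.demog;
run;
```

SAS still creates the data set, but it sets age to missing for patient 102 and writes a note about invalid data for age to the log, together with a dump of the offending record. The only sign of the problem is in the log, which is why the log has to be reviewed after every run.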

The most important rule in debugging SAS programs is to always check the SAS log. It is important to review the log messages each time you submit your program. To review the log, start at the top and look for messages flagged ERROR, WARNING or NOTE.

WARNING: The data set WORK.DEMOG may be incomplete. When this step was stopped there were 0 observations and 5 variables.

This message tells you that SAS did run the DATA step and was able to perform the action, but for some reason there are zero observations. This could be a non-issue, but generally speaking, when you go to the trouble of creating a data set, you want some data in it.

NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DEMOG has 2 observations and 2 variables.

Notes are simply there to inform you of the status of your program. If you were expecting 27 observations, one for each input record, then the notes above would tell you that something went wrong. Notes can also help you streamline your code by pointing you toward more efficient programs. For example, if a report takes too long to run, the run times (real time and CPU time) reported in the notes show which steps to examine.
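
The "SAS went to a new line" note shown above typically appears when a record has fewer values than the INPUT statement asks for. The following sketch, using made-up records, reproduces the situation and shows one common remedy, the TRUNCOVER option on the INFILE statement.

```sas
/* Record 2 has no value for race. By default SAS moves to the
   next line to find one, so record 3 is consumed as the race
   value of patient 102. */
data work.demog_flowover;
  input patient_id race $;
  datalines;
101 white
102
103 asian
;
run;

/* With TRUNCOVER, SAS stays on the current record and sets the
   variable to missing when the record is too short. */
data work.demog_truncover;
  infile datalines truncover;
  input patient_id race $;
  datalines;
101 white
102
103 asian
;
run;
```

The first step ends with two observations and the note in the log; the second step keeps all three records, with a missing race for patient 102.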

A missing semicolon is notorious for producing misleading error messages. The compiler depends on a sequence of keywords to identify the type of statement. If you leave out a semicolon, you hide the keyword of the next statement. The compiler is likely to find something wrong, but it is usually not the real mistake: the missing semicolon. Hence the errors and warnings are just hints about what the compiler is seeing instead of the underlying problem.

One final note: you can insert PUTLOG statements to help identify errors:

data demog;
set edc.demog_summary;
by patient_id;
if first.patient_id=1 then race='white';
putlog race=;
run;

proc print data=demog;
run;
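
A PUTLOG statement becomes more useful when it fires only for the records of interest and identifies them. The sketch below assumes a small made-up input data set in place of edc.demog_summary; the "CHECK:" prefix is an arbitrary label chosen for this illustration.

```sas
/* Hypothetical stand-in for edc.demog_summary, sorted by patient_id */
data work.demog_summary;
  input patient_id race $;
  datalines;
101 white
101 white
102 asian
103 .
;
run;

data work.demog;
  set work.demog_summary;
  by patient_id;

  /* _N_ is the iteration counter of the DATA step */
  if first.patient_id then putlog "CHECK: first record " _n_= patient_id= race=;

  /* Write a message only when something looks wrong */
  if missing(race) then putlog "CHECK: race is missing " patient_id=;
run;
```

Writing name=value pairs (patient_id=, race=) keeps the messages self-describing, and PUTLOG _ALL_ can be used instead when every variable in the program data vector is of interest.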

The DATA step debugger offers SAS programmers another way to investigate logic errors. SAS runs a DATA step in two phases: it first compiles the step and then executes it. The debugger lets you follow the execution phase one statement at a time. To invoke the debugger, add / DEBUG to the end of your DATA statement and then run your DATA step.

If we modify the previous DATA step:

data demog / DEBUG;
set edc.demog_summary;
by patient_id;
if first.patient_id=1 then race='white';
run;

After you submit the above code, two windows appear: the DEBUGGER LOG window and the DEBUGGER SOURCE window. As you may have imagined, the DEBUGGER LOG window contains messages from the debugger and a command line. The SOURCE window contains your DATA step statements with the current line highlighted. SAS executes each line of your program for the first observation, then returns to the top of the DATA step for the second observation, and so on.
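
A typical debugger session might use commands like the ones in the comment below. This is a sketch that assumes an interactive SAS windowing session; the input data set and the FLAG variable are made up for illustration.

```sas
/* Hypothetical stand-in for edc.demog_summary */
data work.demog_summary;
  input patient_id race $;
  datalines;
101 white
102 asian
;
run;

data work.demog / debug;
  set work.demog_summary;
  by patient_id;
  if first.patient_id then flag = 1;
run;

/* Commands typed on the DEBUGGER LOG command line, one at a time:
     step              execute the next statement
     examine _all_     show the current value of every variable
     watch flag        suspend execution whenever FLAG changes
     break 4           set a breakpoint at line 4, as numbered
                       in the DEBUGGER SOURCE window
     go                resume until the next breakpoint or the
                       end of the step
     quit              leave the debugger
*/
```

Remove the / DEBUG option once the problem is found, so that the step runs normally again in batch or production use.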

As you can see, there are many ways to check your SAS programs for errors, even when the output looks fine. Notes are just as important as warnings and error messages. I strongly recommend that you learn how to use the debugger, as it can save lots of time when debugging your program!

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.