Category Archives: Advanced SAS

How to Use a SAS Macro Video

Although all SAS users are familiar with procedures (or procs), many users may not be familiar with macros. This four-minute video demonstrates how to run a macro. The new %LCA_Distal macro is used as an example, but the steps are generally applicable to any macro, whether or not it was created by The Methodology Center.

Source: Penn State Methodology Center

“Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.”


A neat new trick to trim your macro variables in 9.3

SAS macro variables are a great way to store a calculated value, so you can use it later in your code. They are not just limited to the data step — you can also use macro variables in title statements, axis statements, etc.

By default, the macro variable will be padded with blanks (per the width of the format). Here’s a simple example. Notice how the ‘avg’ (62.3) has extra spaces padded on the left in the title and in the label for the reference line:

goptions xpixels=600 ypixels=500 gunit=pct htitle=5 htext=3;

proc sql;
select avg(height) format=comma12.1 into :avg from sashelp.class;
quit; run;

axis1 label=('Inches') reflabel=(c=red "&avg");
axis2 label=none offset=(3,6);

pattern1 v=solid c=pink;
pattern2 v=solid c=cx67C8FF;

title "The average class height is &avg inches";
proc gchart data=sashelp.class;
hbar name / type=sum sumvar=height descending
subgroup=sex nostats nolegend coutline=gray
ref=&avg cref=red raxis=axis1 maxis=axis2 noframe;
run; quit;

One way to have PROC SQL trim the blanks when creating the macro variable is to use the ‘separated by’ option and tell it the values are separated by blanks. This option was intended more for the scenario where you’re outputting multiple values into a single macro variable… but it is also a clever way to trim the blanks when creating a macro variable from a single value. See how much nicer the title and reference line label look with the blanks trimmed!

proc sql;
select avg(height) format=comma12.1 into :avg separated by ' ' from sashelp.class;
quit; run;

And in SAS 9.3, we’ve added an even more elegant solution – the ‘trimmed’ option!

proc sql;
select avg(height) format=comma12.1 into :avg trimmed from sashelp.class;
quit; run;
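
A quick way to see the difference in the log is to wrap each value in brackets with %PUT. This is a minimal sketch: the macro variable names avg_padded and avg_trimmed are made up for illustration, and the TRIMMED option assumes SAS 9.3 or later.

```sas
proc sql noprint;
  /* Default behavior: the value keeps the blanks from the comma12.1 width */
  select avg(height) format=comma12.1 into :avg_padded
    from sashelp.class;

  /* TRIMMED strips the leading and trailing blanks */
  select avg(height) format=comma12.1 into :avg_trimmed trimmed
    from sashelp.class;
quit;

/* The brackets make any padding visible in the log */
%put padded=[&avg_padded] trimmed=[&avg_trimmed];
```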

You can learn lots of ‘tricks’ like this, that will make your graphs look better (and make your life simpler) in the SAS/GRAPH training course!

Source: Robert Ellison

SAS Proc SQL 2

Source: SAS Techies



SAS Proc SQL 1


Source: SAS Techies



4 Programming Languages You Should Learn Right Now (eClinical Speaking)

I am a strong believer that learning a new language makes you better at the others, but I am not a “learn to code” advocate: a foreign language (I know three languages, am currently learning my fourth, and have a “to learn” list that includes Italian and Arabic, if I ever find some free time) or even music (I love to play drums) is equally beneficial. But if you want to obtain a job in the pharmaceutical industry, here is the list of programming languages you should learn:

  1. C#:

What it is: A general-purpose, compiled, object-oriented programming language developed by Microsoft as part of its .NET initiative.

Why you should learn it: If you are looking to become a Medidata Custom Function programmer or Oracle InForm EDC Developer then you should.

2. Python:

What it is: An interpreted, dynamically typed, object-oriented, open-source programming language that uses automatic memory management.

Why you should learn it: If, like me, you are always looking to learn new technology, love Google platforms and perhaps want to become a Timaeus Trial Builder, you should learn it. It is used in a lot of open-source technologies.


3. PL/SQL or SQL:

What it is: PL/SQL stands for Procedural Language/SQL. It is Oracle’s procedural extension to SQL.

Why you should learn it: If, like me, you are addicted to databases, then Oracle should be your choice. If you want to become an Oracle Clinical programmer or database administrator, you should learn Oracle PL/SQL.

4. SAS:

What it is: SAS stands for “Statistical Analysis System” (software). It is the most powerful and comprehensive statistics software available.

Why you should learn it: SAS skills are in high demand nowadays. If you are able to obtain the SAS Certification and a few years of experience in the pharmaceutical industry, you will be in good shape. If you are new and looking for training, there are several options available, from SAS Institute to private vendors such as Clinovo to learning on your own. I must warn you that it will be difficult to obtain a job without experience. Nevertheless, once you are in, it can only get better.

Remember that your job is not just to code but to solve real problems. Your ability to code draws on a wide range of skills: critical thinking, problem analysis and solving, logic, and so on.

So which one are you going to give a try?

Let me know your preference. Happy Programming!

The best thing about a boolean is even if you are wrong, you are only off by a bit. (Anonymous)

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.

Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.


Did You Know?


PROC CDISC is a SAS procedure that allows SAS programmers to import and export XML files that are compliant with the CDISC ODM version 1.2 schema.

It is available as a hotfix for SAS 8.2 and is included by default in SAS 9.1.3 and later versions.

Source: SAS programming in the Pharmaceutical Industry text book

Kirk Lafler Shares 25 Coding Techniques!

25 Best Practice Coding Techniques for SAS Users By Kirk Paul Lafler, Software Intelligence Corporation

As SAS software becomes increasingly popular, best practice coding techniques and guidelines become ever more critical. SAS software provides users with a powerful programming language for accessing, analyzing, manipulating, and presenting data. This tip addresses useful coding techniques for all operating system platforms.

  • After running a SAS program, immediately review the SAS log for notes, warnings, and error messages. Avoid turning off SAS System options that turn off SAS log notes, messages, and warnings.
  • Turn on the SOURCE2 SAS System option to display included source code on the log. Best practice coding techniques should mandate inclusion and display of any and all information that is available during a SAS session.
  • Consider procedures like PROC SQL and PROC REPORT for code simplification. Because multiple processes can frequently be accomplished in a single procedure step, I/O may be reduced.
  • When a DATA step or PROC can do the same job, consider using procedures whenever possible. Procedures are tried and proven throughout the world’s SAS installations, so testing requirements are considerably lower.
  • Create user-defined format libraries to store formatted values in one place. User-defined format libraries have the added advantage of making programs easier to maintain since formatted data values are not hard coded.
  • Include RUN statements at the end of each DATA or PROC step (to separate step boundaries) to print benchmark statistics on the SAS log immediately following each step.
  • Document programs and routines with comments. In addition to the value associated with explaining program logic, comments should provide important information about complex code and logic conditions in a program. This helps to document important program processes as well as minimizes the learning curve associated with program maintenance and enhancement for other users.
  • Assign descriptive and meaningful variable names. Besides improving the readability of program code, they serve as an important form of documentation.
  • Construct program header information to serve as program documentation for all programs. A header that records items such as the program name, author, creation date, purpose, inputs, outputs, and revision history gives others a useful documented history.
  • Simplify complex code and operations into smaller, more manageable parts. By splitting complex code into two or more programming statements, a program becomes easier to read as well as more maintainable.
  • Specify SAS data set names when invoking procedures to help improve documentation efforts as well as preventing an incorrect data set from being processed.
  • Utilize macros for redundant code and enable autocall processing by specifying the MAUTOSOURCE system option.
  • Create macro libraries to store common macro routines in one place.
  • Create permanent libraries containing information from daily, weekly, monthly, quarterly, and annual runs. These libraries can hold scripts, SAS programs, SAS logs, output listings, and documentation of instructions for others to follow.
  • Create views based on user input to simplify and streamline redundant, complex and/or burdensome tasks. Consider creating views in a central view library to support maintenance and documentation requirements.
  • Code for unknown data values. This will prevent unassigned or null data values from falling through logic conditions.
  • Store informats, formats, and labels with the SAS data sets that use them. Informats, formats, and labels should be stored with important SAS data sets to minimize processing time. An important reason for using this technique is that many popular procedures use stored formats and labels as they produce output, eliminating the need to assign them in each individual step. This provides added incentives and value for programmers and users, especially since reporting requirements are usually time critical.
  • Construct conditions that would render data unusable and abort (or end) the program. This prevents unwanted or harmful data from being processed or written to a data set.
  • Test program code using “complete” test data particularly if the data set is small or represents a random sample of a large data set.
  • Set OBS=0 to test syntax and compile time errors without the risk of executing any observations through a DATA or PROC step.
  • Use the PROC SQL VALIDATE statement to test syntax and compile-time errors in PROC SQL code.
  • Specify the NOREPLACE system option to prevent permanent SAS data sets from accidentally being overwritten while writing or testing a program.
  • Take advantage of procedures that summarize large amounts of data by saving and using the results in order to avoid reading a large data set again.
  • Add options that are frequently used into the SAS configuration file. This eliminates the time and keystrokes necessary to enter them during a SAS session.
  • Add statements that are frequently used into the SAS autoexec file. This eliminates the time and keystrokes necessary to enter them during a SAS session.
Source: 25 Best Practice Coding Techniques for SAS Users, by Kirk Paul Lafler, Software Intelligence Corporation
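
Several of the testing tips above (SOURCE2, OBS=0, NOREPLACE, and the PROC SQL VALIDATE statement) can be combined in a short session. The following is only a sketch of how they might be used together, not code from the original paper; the data set name work.bmi_check and the bmi calculation are made up for illustration.

```sas
/* Echo %INCLUDEd source lines in the log */
options source2;

/* Syntax-check the next step without processing any observations;
   NOREPLACE protects permanent data sets while testing */
options obs=0 noreplace;

data work.bmi_check;                 /* hypothetical scratch data set */
  set sashelp.class;
  bmi = (weight / height**2) * 703;
run;

/* Restore normal processing before the real run */
options obs=max replace;

/* VALIDATE checks the query syntax without executing it */
proc sql;
  validate
    select name, height
      from sashelp.class
      where sex = 'F';
quit;
```

Remember to reset OBS= afterwards; otherwise every later step in the session also runs with zero observations.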

    SAS Cheat Sheet – Part 6

    The macro facility is a tool you can use in your programming.

    Macro Language

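    %DO macro-var=start_value %TO end_value <%BY increment>; /*Executes a section of a macro repetitively based on the value of an index variable*/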

    %DO %WHILE (expression); /*Executes a section of a macro repetitively while a condition is true*/

    %DO %UNTIL (expression); /*Executes a section of a macro repetitively until a condition is true*/

    Macro variables can be stored in either the global symbol table or in a local symbol table

    %GLOBAL macro-variable(s); /*Creates macro variables that are available during the execution of an entire SAS session*/

    %IF expression %THEN action; <%ELSE action;> /*Conditionally process a portion of a macro*/

    %LENGTH (character string | text expression) /*Returns the length of a string*/

    The name assigned to a macro variable must be a valid SAS name

    %LET macro-variable =<value>; /*Creates a macro variable and assigns it a value*/

    %MACRO mname <(pp1<,…,ppn><,kp1=value<,…,kpn=value>>)>; /*Begins a macro definition*/

    %MEND <macro-name>; /*Ends a macro definition*/

    %SCAN(argument,n<,delimiters>)/*Search for a word that is specified by its position in a string*/

    %SUBSTR (argument,position<,length>)/*Produce a substring of a character string*/

    %UPCASE (character string | text expression) /*Convert values to uppercase*/

    Macro variable values are text values
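
To show how several of these statements and functions fit together, here is a small sketch of a macro that walks through the words of a string. The macro name list_words and its parameters text= and sep= are hypothetical and chosen only for this example.

```sas
%macro list_words(text=, sep=%str( ));
  %local i word;                       /* keep working variables local */
  %let i = 1;
  %let word = %scan(&text, &i, &sep);  /* first word */
  %do %while (%length(&word) > 0);     /* stop when %SCAN returns nothing */
    %put NOTE: word &i is %upcase(&word);
    %let i = %eval(&i + 1);
    %let word = %scan(&text, &i, &sep);
  %end;
%mend list_words;

%list_words(text=height weight age)
```

Each pass writes one line to the log, so the loop is easy to follow with the log open.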

    Macro Quoting

    %QUOTE | %NRQUOTE and %BQUOTE | %NRBQUOTE /*Mask special characters and mnemonic operators in a resolved value at macro execution */

    %STR | %NRSTR /*Mask special characters and mnemonic operators in constant text at macro compilation */

    %SUPERQ /*Masks special characters/mnemonic operators at macro execution but prevents further resolution of the value*/
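
A brief sketch of two of these quoting functions follows. The macro variable names heading and raw_value, and the text stored in them, are invented for illustration.

```sas
/* %NRSTR masks & and ; in constant text, so nothing is resolved
   and the semicolon does not end the %LET statement */
%let heading = %nrstr(Profit&Loss; R&D);
%put heading=&heading;

/* Store a raw value from a DATA step (single quotes prevent resolution),
   then retrieve it with %SUPERQ so & and % are not acted on */
data _null_;
  call symputx('raw_value', 'A&B %not_a_macro');
run;

%put raw_value=%superq(raw_value);
```

Note that %SUPERQ takes the macro variable name without the ampersand.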


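    /*----------------------------------------------------------------*/
    /* This macro will produce summary statistics                     */
    /*----------------------------------------------------------------*/

    %macro safety1;
      /* odscmd is a site-specific macro whose definition is not shown here */
      %odscmd (start,portnum=11,rptname=AEActivityReport);
      data _null_;
        thedate = today(); /* assumption: the original does not show where thedate is set */
        fstdate = mdy(month(thedate), 1, year(thedate)); /* first day of the month */
        lstdate = intnx('month', fstdate, 1)-1;          /* last day of the month */
        call symput ('strtdate', put(fstdate, date9.));
        call symput ('stpdate', put(thedate, date9.));
      run;

      %put strtdate: &strtdate;
      %put stpdate: &stpdate;
      * more code goes here;
    %mend safety1;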


    Good Programming Practice for Clinical Trials by Sunil Gupta

    The following are draft recommendations for Good Programming Practice for analysis, reporting and data manipulation in the Clinical Trials and Healthcare industries.

    The purpose is to encourage contributions from across companies, non-profit organizations and regulators in an attempt to create a consensus recommendation. The ambition is that this page becomes recognized by the Pharmaceutical Industry, Clinical Research and Health Care Organizations as well as Regulatory Authorities.

    The hope is that the Practice can be reviewed and endorsed by the relevant management teams of several Pharmaceutical companies and major Contract Research Organizations and promoted through relevant professional organizations such as PharmaSUG, PhUSE, PSI, and CDISC.


    The Good Programming Practices are defined in order to:

    • Ensure the clarity of the code and facilitate code review;
    • Save time in case of maintenance, and ease the transfer of code among programmers or companies;
    • Minimize the need for code maintenance by robust programming;
    • Minimize the development effort by development and re-use of standard code and by use of dynamic (easily adaptable) code;
    • Minimize the resources needed at execution time (improve the efficiency of the code);
    • Reduce the risk of logical errors;
    • Meet regulatory requirements regarding validation and 21 CFR Part 11 compliance.

    Note: As often, the various guidelines provided hereafter may conflict with one another if applied in too rigorous a way. Clarity, efficiency, re-usability, adaptability and robustness of the code are all important, and must be balanced in the programming practice.

    Regulatory Requirements


    21 CFR Part 11 compliance

    Readability and Maintainability


    English is an international language, and for practical reasons (regulatory authorities, in-licensing, out-licensing, partnerships, mergers) study protocols and study reports are mostly written in English. It is therefore recommended to write SAS code and comments in English.

    Header and Revision History

    • Include a header for every program (template below).

    **********************************************************;
    * Program name      :
    *
    * Author            :
    *
    * Date created      :
    *
    * Study             : (Study number)
    *                     (Study title)
    *
    * Purpose           :
    *
    * Template          :
    *
    * Inputs            :
    *
    * Outputs           :
    *
    * Program completed : Yes/No
    *
    * Updated by        : (Name) – (Date):
    *                            (Modification and Reason)
    **********************************************************;

    • In addition to your name or initials, use your login ID to identify yourself in the header. This is so there is no ambiguity about the identity of each programmer.
    • Update the revision history at each code modification made after the finalization of the first version of a program.

    Note: When you copy a program from another study, you become the author of this program, and you should clear the revision history. You can specify the origin of the program under the “Template” section of the header.

    Below is an example with comments of an alternative comment block that I think is more useful for Open Source programming. PaulOldenKamp 16:54, 5 April 2009 (UTC)

    /** ----------------------------------------------------------------------------------------
      $Id: 152 2008-11-17 01:48:40Z Paul_ok01 $    <== Id info automatically inserted with each commit to Subversion version control

      Application:       OS3A - Common Programs
      Description:       OS3A session initialization program.
      Previous Program:  None
      Saved as:          c:\os3a\trunk\    <== locations, local and web, where the pgm can be found

      Change History:
        Date        Prog  Ref  Description
        04/26/2008  PMO   [1]  Initial programming for os3a    <== date, programmer initials, ref number [1] to link to specific location of change

      Copyright:         Copyright (c) 2008 OS3A Program. All rights reserved.    <== Always tell who owns the program so one can ask permission from the copyright holder
      Copyright Contact:
      License:           Eclipse Public License v1.0    <== Tell folks how they are licenced to use the program.
        This program and the accompanying materials are made available under
        the terms of the Eclipse Public License v1.0 which accompanies this
        distribution, and is available at

      Contributors:
        Paul OldenKamp, - Develop initial pgm.    <== Identify significant contributors

      @purpose OS3A system initialization program.    <== tag identifies start of info used by the Codedoc Perl script to produce external HTML documentation
               Set up initial options and global macro variables.
      @param  SYSPARM - input provided from program initiation call; default: main_FORE for production.
      @symbol sysRoot - system root location; Windows - C:/, UNIX - /
      @symbol remove_cmd - system command to remove file; Windows - erase, UNIX - rm
      @symbol os3aRoot - directory location of os3a root
      @symbol futsRoot - directory location of FUTS top level macros.
      @symbol Root - directory location of sub-project identified with Four Letter Acronym
    ----------------------------------------------------------------------------------------- */

    The results from encoding the header and comments in a SAS program can be seen on the CodeDoc web page. See
    CodeDoc Download Page


    • Include a comment before each major DATA/PROC step, especially when you are doing something complex or non-standard. Comments should be comprehensive, and should describe the rationale and not simply the action. For example, do not comment “Access demography data”; instead explain which data elements and why they are needed.
    • Organize the comments into a hierarchy.
    • Do not number the comments.

    Reason: It avoids heavy update when removing or inserting sections.

    Naming Conventions

    • Use explicit and meaningful names for variables and datasets, with a maximum length of 8.
    • For permanent datasets, use a meaningful dataset label and variable labels.
    • Where possible, avoid reusing a dataset name within a program.

    Note: Keep in mind, however, that large intermediate files take up a lot of SAS work space.

    • Name IN= variables using “in” plus a meaningful reference to the dataset.


    data aelst;
      merge aesaes (in=inae) patpat (in=inpat);
      by patno;
      if inae and inpat;
    run;

    • Labels must have a maximum length of 40 characters.

    Code Structure

    • It is mandatory to include libnames, options and formats in a separate setup program unless these are temporary formats or temporary options that are reset after being used.

    Reason: It will guarantee that changes of the environment are taken into account in all programs run afterwards.

    • Use standard company macros to read in libnames and settings, to write out datasets, and for standard calculating and reporting.
    • One statement per line, but several are allowed if small and repeated or related. Long statements should be split across multiple lines.
    • Control system settings to show all executed code in the log as the default, in as clear a manner as possible. The log should not be so lengthy that the programmer cannot easily navigate it (if it is, use highly visible comments with sufficient white space). System settings should be easy to change so that a user can debug a section of code or a macro by temporarily displaying the %included code, resolved macro names, and logic.
    • Use a standard sequence for placing statements and group like statements together.
    1. Within a program:
      1. %LET statements and macro definitions
      2. Input steps
      3. Calculations
      4. Save final (permanent) datasets and created outputs
    2. Within a DATA step:
      1. All non-executable statements first (e.g. ATTRIB, LENGTH, KEEP…)
      2. All executable statements next

    Reason: It increases the readability of the program.

    • Left-justify DATA, PROC, OPTIONS statements, indent all statements within.


    proc means data=osevit;
      var prmres;
      by prmcod treat;
    run;

    • End every DATA/PROC step with a left-aligned RUN statement.

    Reason: It explicitly defines the step boundary.

    • Insert at least one blank line after each RUN statement in DATA/PROC steps.
    • Indent statements within a DO loop, align END with DO.
    • Avoid having too many nested DO loops and IF-ELSE statements.
    • In the case of nested DO loops, add a comment at the start (DO) and end (END) of each loop.


    data test01;
      do patno=1 to 40; * cycle thru patients;
        do visit=1 to 3; * cycle thru visits;
          output;
        end; * cycle thru visits;
      end; * cycle thru patients;
    run;

    • Insert parentheses in meaningful places in order to clarify the sequence in which mathematical or logical operations are performed.


    data test02;
      set test01;
      if (visit=0  and vdate lt adate1)
       or (visit=99 and vdate gt adate2) then delete;
    run;

    Style conventions

    Draft section : this may not be specific to clinical programming, but may be of use when considering a general standard for sharing programs.

    Use of analysis datasets

    This section is for discussion of why programming output directly from raw data is generally avoided.


    • When you input or output a SAS dataset, use the KEEP= dataset option (preferred to DROP=) to keep only the needed variables.

    Reason: The SAS system loads only the specified variables into the Program Data Vector, eliminating all other variables from being loaded.

    • When subsetting a SAS dataset, use a WHERE statement rather than IF, if possible.

    Reason: WHERE subsets the data before entering it into the Program Data Vector, whereas IF subsets the data after inputting the entire dataset.
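
The two recommendations above can be sketched in a single step as follows. The output name work.teens is hypothetical, and sashelp.class stands in for a study dataset.

```sas
data work.teens;                              /* hypothetical output name */
  set sashelp.class (keep=name age height);   /* only these variables are read */
  where age >= 13;                            /* applied before observations are loaded */
run;
```

Any variable used in the WHERE statement must be among the variables kept on input.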

    • When using IF condition, use IF/ELSE for mutually exclusive conditions, and check the most likely condition first.

    Reason: The ELSE/IF will check only those observations that fail the first IF condition. With the IF/IF, all observations will be checked twice. Also, consider the use of a SELECT statement instead of IF/ELSE, as it may be more readable.

    • Avoid unnecessary sorting. The CLASS statement can be used in some procedures to perform by-group processing without sorting the data.


    proc means data=osevit;
      var prmres;
      class treat;
    run;

    • If possible (i.e. not a sorting variable), use character values for categorical variables or flags instead of numeric values.

    Reason: It saves space. A character “1” uses one byte (if length is set to one), whereas a numeric 1 uses eight bytes.

    • Use the LENGTH statement to reduce variable size.

    Reason: Storage space can be reduced significantly.
    Note: Keep in mind that a too limited variable length could reduce the robustness of the code (lead to truncation with different sets of data).

    • Use simple macros for repeating code.


    • Use the MSGLEVEL=I option in order to have all informational, note, warning, and error messages sent to the LOG.
    • In the final code, there should be no dead code that does not work or that is not used. This must be removed from the program.
    • Code to allow checking of the program or of the data (on all data or on a subset of patients such as clean patients, discontinued patients, patients with SAE or patients with odd data) is encouraged and should be built throughout the program. This code can be easily activated during the development phase or commented out during a production run using the piece of code detailed in the “Activate/Deactivate Pieces of Code” section.
    • It is not acceptable to have avoidable notes or warnings in the log (mandatory).

    Reason: They can often lead to ambiguities, confusion, or actual error (e.g. erroneous merging, uninitialized variables, automatic numeric/character conversions, automatic formatting, operation on missing data…).
    Note: If such a warning message is unavoidable, an explanation has to be given in the program (mandatory).

    • Always use DATA= in a PROC statement (mandatory).

    Reason: It ensures correct dataset referencing, makes program easy to follow, and provides internal documentation.

    • Be careful when merging datasets. Erroneous merging may occur when:
    1. No BY statement is specified (set system option MERGENOBY=WARN or ERROR).
    2. Some variables, other than BY variables, exist in both datasets. Set system option MSGLEVEL=I so that SAS writes a message to the SAS log whenever a MERGE statement would cause variables to be overwritten; in that case, the values from the last dataset on the MERGE statement are kept.
    3. More than one dataset contains repeats of BY values. A NOTE, not an ERROR, is written to the log. If you really need such many-to-many merges, use PROC SQL to perform them.

    Reason: Check the SAS log carefully and routinely, because these situations produce NOTE, INFO, or WARNING messages rather than ERROR messages, yet the resulting dataset is rarely correct.
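
As a sketch of these safeguards, the options below can be set before a merge. The datasets work.demog and work.vitals and their variables are invented so that the example is self-contained.

```sas
/* Stop on a MERGE without BY, and report overwritten variables */
options mergenoby=error msglevel=i;

data work.demog;                  /* hypothetical demography data */
  input patno sex $;
  datalines;
1 F
2 M
;
run;

data work.vitals;                 /* hypothetical vital signs data */
  input patno visit sysbp;
  datalines;
1 1 120
1 2 118
2 1 130
;
run;

/* One-to-many merge: only work.vitals repeats the BY values */
data work.both;
  merge work.demog  (in=indemog)
        work.vitals (in=invitals);
  by patno;
  if indemog and invitals;
run;
```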

    • When coding IF-THEN-ELSE constructs use a final ELSE statement to trap any observations that do not meet the conditions in the IF-THEN clauses.

    Reason: You can only be sure that all possible combinations of data are covered if there is a final ELSE statement.
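
A minimal sketch of a final ELSE that traps unexpected values is shown below. The variable sexlbl and the output dataset name are made up for this example.

```sas
data work.class_lbl;              /* hypothetical output name */
  set sashelp.class;
  length sexlbl $7;
  if sex = 'F' then sexlbl = 'Female';
  else if sex = 'M' then sexlbl = 'Male';
  else do;
    /* Anything not covered above is flagged in the log */
    sexlbl = 'Unknown';
    put 'WARNING: Unexpected value ' sex= name=;
  end;
run;
```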

    • When coding a user-defined FORMAT, include the keyword ‘other’ on the left side of the equals sign so that all possible values have an entry in the format.

    Reason: A missing entry in a user-defined FORMAT can be difficult to detect. The simplest way to identify this potential problem is to ensure that all values are assigned a format.
    Note: This does not apply to INFORMATs. It could be more helpful to get a WARNING message when trying to INPUT data of unexpected format.
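
For example, a user-defined format with an OTHER entry might look like the sketch below. The format name $sexfmt and its labels are hypothetical.

```sas
proc format;
  value $sexfmt
    'F'   = 'Female'
    'M'   = 'Male'
    other = 'Unexpected';   /* catches every value not listed above */
run;

proc freq data=sashelp.class;
  tables sex;
  format sex $sexfmt.;
run;
```

Any value that shows up as “Unexpected” in the output points to data the format did not anticipate.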

    • Try to produce code that will operate correctly with unusual data in unexpected situations (e.g. missing data).

    Code for Data Checks

    Build checks so that their purpose is clear, so that they can be toggled on or off, and remove them once they are no longer needed.

    Activate/Deactivate Pieces of Code

    In the beginning of the program, define a macro variable that you set to blank during the development phase or that you set equal to * for the production run:

    %let c=;  or  %let c=*;

    For the pieces of code that check the data/program, start each line with the macro variable defined above:

    &c title "Check the visits for each patient";
    &c proc freq data=patvis01;
    &c    table patno*visit;
    &c run;

    This code will be executed if &c is blank (development), but will be commented out when &c=* (production).

    Perform Checks on a Subset of Patients

    In a separate code that you store under the study MACRO folder, list the subset of patients (clean patients, discontinued patients, patients with SAE or patients with odd data) that you want to look at:

    %macro select;
      2076 2162 2271 2449
    %mend;

    In the beginning of the program, define a second macro variable that you set equal to * when you want to perform checks on all data or to blank when you are interested in a subset of patients:

    %let s=*;  or  %let s=;

    For each checking code, add a piece of code that allows subsetting the data, and start each line of this piece of code with the 2 macro variables defined above:

    &c title "Check the visits for each patient";
    &c proc freq data=patvis01;
    &c    table patno*visit;
    &c &s where patno in (%select);
    &c run;

    The check will be performed only if &c is blank, and it will be applied to all patients if &s=* or on the subset of patients if &s is blank.
    Better still: input the list of check case IDs as a dataset.
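
One way the dataset-based approach could look is sketched below. The dataset work.checkids is an assumption, and work.patvis01 is a small invented stand-in for the visit data so that the example runs on its own.

```sas
/* Hypothetical visit data, standing in for patvis01 */
data work.patvis01;
  input patno visit;
  datalines;
2076 1
2076 2
2100 1
2162 1
;
run;

/* Check cases stored as data instead of in a macro */
data work.checkids;
  input patno;
  datalines;
2076
2162
2271
2449
;
run;

title "Check the visits for selected patients";
proc sql;
  select v.patno, v.visit
    from work.patvis01 as v
    where v.patno in (select patno from work.checkids);
quit;
title;
```

Keeping the IDs in a dataset means the list can be maintained without editing program code.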

    Floating Point Error

    Consider the real number system that we are familiar with. This decimal system (0 → ± ∞) is obviously infinite. Most computers use floating point representation, in which a finite set of numbers is used to represent the infinite real number system. Some error is therefore bound to appear from time to time. This is generally termed floating point error, and it occurs because of hardware limitations.
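
The sketch below shows one common way this surfaces in a DATA step and one possible defence, rounding both sides before comparing. The rounding unit 1e-9 is an arbitrary choice for illustration, and the exact log output can depend on the platform.

```sas
data _null_;
  x = 0.1 + 0.2;

  /* An exact comparison may fail, because 0.1 and 0.2 have no
     exact binary floating point representation */
  if x = 0.3 then put 'Exact comparison: equal';
  else put 'Exact comparison: NOT equal';

  /* Rounding both sides to a chosen precision makes the test robust */
  if round(x, 1e-9) = round(0.3, 1e-9) then put 'Rounded comparison: equal';
  else put 'Rounded comparison: NOT equal';

  /* HEX16. shows the stored bit pattern */
  put x= hex16.;
run;
```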

    The following paper goes some way toward explaining why and how this happens, and suggests possible ways to approach the issue.

    Paper reference:-

    Data Imputation versus Hardcoding

    Please contribute

    • Definitions
      • Data Imputation
      • Hardcoding
    • Issues
    • Recommendation

    Integrity of a data transfer

    At a minimum, all data transfers should be validated by checking the observation counts for each SAS dataset, or the record counts in other formats, against counts provided by the sender. It is also helpful if the sender can provide a checksum for each file transferred, since this also ensures all content made its way to you without transmission errors. There are many freeware programs available to calculate checksums for any file.
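
A minimal sketch of the observation-count check follows. The expected count of 19 and the use of sashelp.class as the received dataset are assumptions made only so that the example runs; the TRIMMED option assumes SAS 9.3 or later.

```sas
%let expected_n = 19;   /* count supplied by the sender (assumed value) */

proc sql noprint;
  select count(*) into :actual_n trimmed
    from sashelp.class;   /* stand-in for the received dataset */
quit;

data _null_;
  if &actual_n = &expected_n then
    put "NOTE: Observation count matches: &actual_n";
  else
    put "WARNING: Expected &expected_n observations, received &actual_n";
run;
```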


    Draft section : recommendations particularly relevant for the development and use of macros and macro libraries.

    Macros are particularly useful under the following circumstances:

    • Program code is used repeatedly
    • A number of steps must be taken conditionally, and the logic for these is clearly fixed (no need to think of all the steps that should be included in a program under a specific situation: the macro will deduce them for you and generate the appropriate data step or proc step code)
    • There is no trivial solution via “ordinary” SAS code
    • Applying the macro must be easier than writing the code itself!
    • The usage helps users avoid errors and omissions.

    If used appropriately the following benefits can be achieved:

    • Increase in quality by avoiding programming bugs and errors
    • Savings in time and resources
    • Enforcement of standards, e.g. standard methods and standard outputs
    • Work can be more enjoyable as programmers can focus on the non-routine work

    Ideally Macro development should follow a few rules:

    • Macro headers should clearly state all changes to environment and data that result from execution. Changes should be limited to those necessary for the focused purpose of the macro:
      • strictly controlled changes to input data and creation of output data
      • clear temporary data set clutter
      • no unexpected changes to system settings (options, titles, footnotes, etc)
      • no unexpected changes to external symbol tables
    • Scope of macro variables should be explicitly controlled using %global and %local statements.
    • Method of macro variable creation should demonstrate awareness of default scope:
    • The log matters:
      • Use Base SAS techniques whenever possible to avoid excessive code generation (log bloat). For example, macro definition should use DATA step array and DO loop processing rather than Macro %DO looping.
      • But use pure Macro Language for routine utility macros (see details, below).
      • Use appropriate comment style in macro definitions to properly annotate the SAS log when MPRINT is on. For example, use %* style commenting to explain macro logic, but /* style commenting to explain resulting code. (Or * style or PUT statement commenting as appropriate.)
      • Allow the users to control the appearance of the log via MPRINT, SYMBOLGEN, and MLOGIC.
    • Code within a macro definition should be germane, limited to the specific purpose of the macro. The use of a central repository for macros (“Macro library”) is suggested.
    • Macro Library: Code for routine tasks (eg, parameter checking, system and environment checking, messaging, etc.) should be handled by dedicated utility macros. Code for such routine tasks should not overwhelm the current macro definition, obscuring the purpose, and creating unnecessary maintenance overhead and lack of consistency within a library.
    • Macro Library: Parameter naming conventions should be used for common parameters such as input/output libnames and data sets. Explicit and transparent control of macro variable scope again becomes crucial to avoid accidental change of external symbol tables
    • Macro Library: Use pure Macro Language definitions whenever possible to improve program flow and avoid producing unnecessary Base SAS code. Returning a list of data set variables, checking for macro variable existence, and returning a data set observation count can all be achieved without Base SAS code. Such macros can be called “inline” without unnecessary overhead or interruption of program flow.

    For example, instead of a %count_ds_obs definition that uses DATA step code and interrupts program flow, like

    %let n_obs = %count_ds_obs(DSIN=myData);
    %if &n_obs > 0 %then %do;
      ... more statements ...
    %end;

    an inline, pure Macro Language implementation allows streamlined code:

    %if %inline_ds_obs(DSIN=myData) > 0 %then %do;
      ... more statements ...
    %end;
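
One way such an inline macro could be written is with %SYSFUNC and the data set functions OPEN, ATTRN, and CLOSE. This is only a sketch: the source names %inline_ds_obs but does not show its definition, so the body below, the %demo wrapper, and the choice of the NLOBS attribute are assumptions. The %LOCAL statement also illustrates the explicit scope control recommended above.

```sas
%macro inline_ds_obs(dsin=);
  %local dsid nobs rc;                  /* nothing leaks into outer scopes */
  %let nobs = -1;                       /* returned if the dataset cannot be opened */
  %let dsid = %sysfunc(open(&dsin));
  %if &dsid > 0 %then %do;
    %let nobs = %sysfunc(attrn(&dsid, nlobs));  /* logical observation count */
    %let rc = %sysfunc(close(&dsid));
  %end;
  &nobs                                 /* no semicolon: the value is returned inline */
%mend inline_ds_obs;

%macro demo;                            /* hypothetical caller */
  %if %inline_ds_obs(dsin=sashelp.class) > 0 %then %do;
    %put NOTE: sashelp.class has observations.;
  %end;
%mend demo;

%demo
```

Because the macro generates no DATA step or PROC code, it can be used inside a %IF condition or on the right-hand side of a %LET.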

    Source: Sunil Gupta, Senior SAS Consultant, Gupta Programming