Tag Archives: SAS tips

Where in the World is Ana?

First, I want to say thank you for reading my blog, connecting with me on LinkedIn, or following me wherever I go [thank you NSA].

Many of you, for months, have asked me if I was going to write more articles related to clinical trials. For some time now, I have been taking time off from this EDC blog to concentrate on other projects of equal importance. I will share some new insights and information as I get myself back on track.

So what is a girl who has a master’s degree in project management and computer networks doing as a programmer? It’s not that I didn’t like project management, per se. And entering the IT networking business years ago was quite difficult for girls like me in a world dominated by men. It’s basically that I never found the same passion for project management or computer networks that I have for programming and technology in general.

Because I am so interested in technology and programming, I tend to spend a lot more time than my peers learning new technologies and enhancing my existing skills. Many of my co-workers and former colleagues have commented on how high my skill level is and on how quickly I am able to learn new technologies. But beyond just learning new technologies and APIs, I’m passionate about becoming a better overall programmer. That is why, in the last few months, I have spent time learning mobile development (iPhone and Android apps). I am actually working on an app to ‘hack’ into my own car. 🙂 Well, not exactly. I want to be able to open my car and run some other basic commands (like opening the garage door) using an app.

Given my degree in project management, it should be clear that I have useful skills beyond the programming world. In fact, having a project management background has helped me interface with various groups in various organizations in which I’ve worked.

I have installed, maintained, and designed numerous relational databases and small networks. As a freelancer, I have worked on projects doing data analysis, project support and computer training.

Now you know a little about me personally. If you think I might be the type of developer you’re looking for, feel free to browse my resume and contact me.

Anayansi Gamboa
Resume / CV .

Comments? Join us at {EDC Developer}

P.S. I will be releasing some training videos and training material for several EDC tools in the near future, including tips and best practices. Pricing has not been set yet. All training will be web-based and password protected. If you wish to consult with me about face-to-face or on-site training, please contact me to discuss further.

Fair Use Notice: This video contains some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law). If you wish to use this copyrighted material for purposes that go beyond fair use, you must obtain permission from the copyright owner. Fair use notwithstanding, we will immediately comply with any copyright owner who wants their material removed or modified, wants us to link to their website, or wants us to add their photo.

Disclaimer: The EDC Developer blog is “one man’s opinion”. Anything that is said on the report is either opinion, criticism, information or commentary. If you are making any type of investment or legal decision, it would be wise to consult a professional before making that decision.

Disclaimer: The content of these columns does not necessarily reflect the opinion of {EDC Developer}.

Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.


SAS: Problem Solving 1

Today we want to provide you with some problem-resolution options for a simple situation.

Problem:

We have 3 variables that we will call Var1, Var2, Var3. Their values range from 1 to 9, and we would like to create new flag variables (VarFlag1 to VarFlag9) that indicate whether a response has a given value in any of the 3 variables.

Sample:

If response 7 has a value of 1 in Var1, then VarFlag1 should be equal to 1.

If the same response 7 has a value of 3 in Var3, then VarFlag3 should be equal to 1.

Solution 1: Data step

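data mydata;
input Var1 Var2 Var3;
array vars{3} Var1 Var2 Var3;
array flags{9} flag1-flag9;
/* Set flagN to 1 when any of the three variables equals N (N = 1 to 9). */
do i = 1 to dim(vars);
/* Skip missing, out-of-range, and non-integer values such as 2.5. */
if 1 <= vars{i} <= 9 and vars{i} = int(vars{i}) then
flags{vars{i}} = 1;
end;
drop i;
cards;
1 2 3
9 8 7
2.5 7 9
;
run;

proc print data=mydata; run;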

Solution 2: array solution

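/* Place these statements inside a DATA step that has Var1-Var3 available. */
array flag{9} flag1-flag9;
do j = 1 to 9;
/* A comparison evaluates to 1 (true) or 0 (false). */
flag{j} = (Var1 = j or Var2 = j or Var3 = j);
end;
drop j;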

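Solution 3: Macro solution

%macro Set_Flags(Flag, num);
/* Generate one assignment statement per flag: Flag1, Flag2, ... Flag&num. */
%do i = 1 %to &num;
&Flag.&i = (Var1 = &i or Var2 = &i or Var3 = &i);
%end;
%mend Set_Flags;

data Mydata;
set Mydata; /* read the existing data so Var1-Var3 are available */
%Set_Flags(Flag, 9)
run;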

How to Use a SAS Macro Video

Although all SAS users are familiar with procedures (or procs), many users may not be familiar with macros. This four-minute video demonstrates how to run a macro. The new %LCA_Distal macro is used as an example, but the steps are generally applicable to any macro, whether or not it was created by The Methodology Center.

Source: Penn State Methodology Center
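
For readers who have never called a macro, here is a minimal sketch of the general pattern: define the macro once, then invoke it by name with a leading percent sign. The macro name print_first and its parameters ds and n are made up for illustration; they have nothing to do with %LCA_Distal or the video.

```sas
/* Hypothetical macro for illustration only; not taken from the video. */
%macro print_first(ds, n);
proc print data=&ds (obs=&n);
run;
%mend print_first;

/* Small sample data set to run the macro against. */
data main;
input x y;
cards;
1 2
3 4
5 6
;
run;

/* Call the macro: prints the first 2 observations of MAIN. */
%print_first(main, 2)
```

A macro stored in a separate file would first need to be made available to the session, for example with a %INCLUDE statement pointing at that file, before it can be called this way.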

FAIR USE-
“Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.”

Did You Know?

Do you know the difference between IS NULL and IS MISSING?

There is really no difference. SAS does not have a concept of null like databases do, so in a WHERE expression you can use either operator to specify the same thing.

For example: where race is null;
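
Here is a small sketch that shows the two operators side by side. The data set patients and its values are invented for illustration; both PROC PRINT steps should list the same observation (the one where race is missing).

```sas
/* Hypothetical sample data; a period in list input is read as a missing value. */
data patients;
input id race $;
cards;
1 White
2 .
3 Asian
;
run;

/* Both WHERE expressions select the observation with a missing race. */
proc print data=patients;
where race is null;
run;

proc print data=patients;
where race is missing;
run;
```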

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.

SAS Programmers Tools

Are you new to SAS and wondering how to write SAS programs?

Most SAS programmers use the built-in SAS enhanced editor for their daily work. Sometimes, this editor is replaced by the code editor of SAS Enterprise Guide, which provides other features like the Log tab, Output data tab and Results tab. However, some SAS users like their text editor to be very customizable and full of features which may or may not be in the enhanced editor.

If you find that your current editor is insufficient for your work, you are not alone. Below are some of the text editors I have come across that you can use instead of the built-in SAS editor:

TextPad is a full-featured text editor from Helios Software Solutions, offering a spelling checker, macros, and powerful formatting and file-storage options.

This is a great program – it’s a powerful text editing tool that’s really comfortable to use. Textpad has a very clean, simple interface that deals only with text – that is, it doesn’t let you change font halfway down the page, or make text underlined or italic; it’s built purely to deal with the content, and does that job EXTREMELY well.

These features are excellent for SAS macro programming and SCL programming. Besides these, Textpad has a built-in compiler for Java which allows for rapid switching to Java coding that is occasionally required.


TextPad has many macro features that allow repetitive actions to be recorded and reused easily.

Crimson Editor is a professional open-source source editor for Windows from Ingyu Kang, and one of the most popular editors available to programmers.

This editor also allows programmers to install schematics (define tools) that highlight sections of SAS programs.


In summary, there are many options to help a SAS programmer increase efficiency, write cleaner code, or make SAS life easier. There are other popular editors such as Emacs, but I don’t have much experience with it, so I cannot comment on it properly. Your style of programming will influence the type of editor you use.


How to Use SAS – Lesson 6 – SAS Arithmetic and Variable Creation

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 6 introduces the concept of SAS arithmetic in the DATA STEP. I discuss how one can add, subtract, divide, multiply, or create their own formulas for variables in the data. I also discuss using SAS arithmetic to create new variables based on mathematical transformations of old variables, which may sometimes aid in meeting the assumptions of statistical tests. Finally, I provide basic examples of each of these methods.

Helpful Notes:

1. SAS uses many of the same arithmetic operators to add, subtract, divide and multiply as other programming languages and basic algebra.

2. Arithmetic operations on a variable apply to every observation in the data set, so be careful when overwriting existing variables; create new variables instead if you can afford to.

3. The VARNUM option on the PROC CONTENTS statement lets you see the variables listed in the order they were created.

Today’s Code:

data main;
input x y;
cards;
1 2
3 4
5 6
7 8
;
run;

proc print data=main;
run;

data new_main; set main;
a = x + y;
b = x - y;
c = x * y;
d = x / y;
e = x ** y;
f = ((x + y) * (x - y));
run;

proc contents data=new_main varnum;
run;

proc print data=new_main;
run;
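
The lesson description also mentions creating new variables from mathematical transformations of old ones. A minimal sketch of that idea, using the same sample data, might look like the following. The data set name transformed and the new variable names are my own; LOG (natural logarithm) and SQRT are standard SAS functions.

```sas
data main;
input x y;
cards;
1 2
3 4
5 6
7 8
;
run;

/* Keep the original variables and add transformed copies. */
data transformed; set main;
log_x = log(x);    /* natural log of x */
sqrt_y = sqrt(y);  /* square root of y */
x_sq = x ** 2;     /* x squared */
run;

proc print data=transformed;
run;
```

Writing the results to new variables, rather than overwriting x and y, keeps the original values available for comparison.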


How to Use SAS – Lesson 5 – Data Reduction and Data Cleaning

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 5 introduces the concept of data reduction (also known as subsetting data sets). I discuss how one can subset a data set (i.e. reduce a data set’s number of observations) based on some criteria using the IF statement in the DATA STEP, or using the WHERE statement in a PROC STEP. I also discuss using the KEEP, DROP, and RENAME statements for reducing data to only a handful of the original variables (i.e. reduce a data set’s number of variables). Furthermore, I show how one can label variables so that descriptive information can be presented in output and value formats so that specific values are easy to understand. Finally, I provide basic examples of each of these for three hypothetical data sets.

Helpful Notes:

1. There are two places you can reduce the data you analyze: in the DATA STEP, and in the PROC STEP.

2. To subset data in the DATA STEP, use the IF statement.

3. To subset data in the PROC STEP, use the WHERE statement.

4. Another way to reduce data is to eliminate variables using a KEEP or DROP statement. This method is useful if you are creating a second data set or analytic version of your main dataset.

5. The RENAME statement simply changes a variable’s name.
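
As a rough sketch of how the notes above fit together, a subsetting IF, a KEEP, and a RENAME can all sit in one DATA step. The data set name analytic and the cutoff x > 1 are assumptions made for this illustration. Note that KEEP refers to the original variable names, because the rename is applied when the output data set is written.

```sas
data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

/* One DATA step that reduces observations and variables, then renames. */
data analytic; set main;
if x > 1;        /* subsetting IF: keep only observations with x > 1 */
keep x y;        /* KEEP uses the original names */
rename x = ID;   /* the output data set contains ID and y */
run;

proc print data=analytic; run;

/* The same row filter applied at analysis time, without creating a new data set. */
proc print data=main;
where x > 1;
run;
```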

Today’s Code:

data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

proc contents data=main; run;
proc print data=main; run;

/* 1. Reduce data in the DATA STEP using a simple IF statement */
data reduced_main; set main;
if x = 1;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 2. Reduce data in the PROC STEP using a simple WHERE statement */
proc print data=main;
where x = 1;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 3. Reduce data in the DATA STEP by KEEPing only the variables you do want */
data reduced_main; set main;
KEEP x y;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 4. Reduce data in the DATA STEP by DROPing the variables you don’t want */
data reduced_main; set main;
DROP y;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 5. Clean up variables using the RENAME statement within a DATA STEP */
data clean_main; set main;
rename x = ID y = month z = day;
run;

proc contents data=main; run;
proc contents data=clean_main; run;

/* 6. Clean up variables using a LABEL statement within a DATA STEP */
data clean_main; set clean_main;
label ID = "Identification Number" month = "Month of the Year" day = "Day of the Year";
run;

proc contents data=main; run;
proc contents data=clean_main; run;

/* 7. FORMAT value labels using the PROC FORMAT and FORMAT statements */
PROC FORMAT;
value months 1="January" 2="February" 3="March" 4="April" 5="May" 6="June" 7="July" 8="August" 9="September" 10="October" 11="November" 12="December";
run;

data clean_main; set clean_main;
format month months.;
run;

proc freq data=clean_main;
table month;
run;


How to Use SAS – Lesson 4 – Merging Data Sets

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 4 introduces the concept of merging SAS data sets using a variety of methods. I discuss how one can merge two or more data sets in the DATA STEP using the SET statement. I also describe how one can use the MERGE statement to bring two or more datasets together that may have a common index variable. Furthermore, I describe the SORT procedure (PROC SORT) that must be used with the MERGE statement. Finally, I provide basic methods of merging data sets using PROC SQL.

Helpful Notes:
1. Use one SET statement when you have the same variables, but different observations.

2. Use two SET statements when you have different variables, but the same observations.

3. Use the MERGE statement when you have a common index variable, and any new variables or observations.

4. The MERGE statement first requires that you use the SORT procedure (PROC SORT) to sort on the index variable before merging.

5. Make sure that you add the BY statement after the MERGE statement in your DATA step or you will have a new dataset that is merged incorrectly.

6. PROC SQL is an advanced method of merging data that can be very powerful for large datasets. It uses different kinds of “JOINS” that I will provide more information on in a later video.

Today’s Code:
data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

/* 1. Use one SET statement when you have the same variables, but different observations */
data more_people;
input x y z;
cards;
4 5 6
3 6 9
;
run;

data final;
set main more_people;
run;

proc print data=final; run;

/* 2. Use two SET statements when you have different variables, but the same observations */
data more_vars;
input a b c;
cards;
20 40 60
10 20 30
;
run;
data new_final;
set main;
set more_vars;
run;

proc print data=new_final; run;

/* 3. Use the MERGE statement when you have a common index variable, and any new variables or observations */
data more_vars_and_people;
input x a b c;
cards;
1 20 40 60
7 10 20 30
2 11 12 13
3 14 15 16
;
run;

* The MERGE statement requires that you use an index variable to merge on (e.g. an ID variable).;
* Thus, you must SORT your data BY that index variable.;
proc sort data=main;
by x;
run;
proc sort data=more_vars_and_people;
by x;
run;
data merged_final;
merge main more_vars_and_people;
by x;
run;

proc print data=merged_final; run;

/* 4. SQL is an advanced programming language for databases. Here, I provide a basic example to merge the two datasets using a LEFT JOIN. I will include more information about JOIN types in a follow up video. For now, think of a LEFT JOIN as one that only includes the data from the second dataset (more_vars_and_people) that corresponds to data from the original dataset (main).
*/
proc sql;
create table sql_final as
select L.*, R.*
from main as L
LEFT JOIN more_vars_and_people as R
on L.x = R.x;
quit;

proc print data=sql_final; run;
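
The MERGE in step 3 keeps every value of x found in either data set. If you only want the rows that appear in both, one common approach (not covered in the video, so treat this as a sketch) is the IN= data set option, which creates a temporary 0/1 variable for each input. The names in_main, in_more, and matched_only are my own.

```sas
data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

data more_vars_and_people;
input x a b c;
cards;
1 20 40 60
7 10 20 30
2 11 12 13
3 14 15 16
;
run;

/* Both inputs must be sorted by the index variable before merging. */
proc sort data=main; by x; run;
proc sort data=more_vars_and_people; by x; run;

data matched_only;
merge main (in=in_main) more_vars_and_people (in=in_more);
by x;
if in_main and in_more;  /* keep only x values present in both data sets */
run;

proc print data=matched_only; run;
```

With this sample data, only the observations with x = 1 and x = 7 are kept, since those are the only x values that appear in both data sets.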


How to Use SAS – Lesson 2 – Creating Datasets on the Fly

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 2 introduces some basic data step programming to define variables and specify their values for data sets containing one or more observations.

I also introduce two procedures: the PRINT procedure (PROC PRINT) to display the data contents in the OUTPUT window, and the CONTENTS procedure (PROC CONTENTS) to summarize the data set. Finally, I introduce the concept of libraries to show another method of inspecting the data set by physically opening it from the temporary WORK library.

Helpful Notes:

1. PROC PRINT – displays the entire data set by observation in the OUTPUT window
2. PROC CONTENTS – summarizes the properties of a data set, including an alphabetic listing of the variables and a count of the number of observations.
3. The assignment operator (“=”) directly specifies the value of a variable in the data step.
4. The INPUT statement defines one or more variables of our data set.
5. The CARDS statement specifies the values for each of the INPUT variables (in order).
6. It is a good rule of thumb to always pair the INPUT and CARDS statements together.
7. DON’T FORGET SEMICOLONS! They end statements and without them, you will most certainly have errors arise.
8. If you have any errors, always, ALWAYS, ALWAYS check the LOG first!
9. Creating datasets “on-the-fly” just means you’re making a new dataset without bringing in the data from any other source.
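
Note 3 can be illustrated with a minimal sketch that uses only assignment statements, with no INPUT or CARDS at all. The data set name one_obs is made up for this example; the step produces a data set with a single observation.

```sas
/* Each assignment statement sets the value of a variable directly. */
data one_obs;
x = 1;
y = 2;
z = x + y;  /* a variable can also be computed from other variables */
run;

proc print data=one_obs;
run;
```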

Today’s Code:

data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

proc print data=main;
run;

proc contents data=main;
run;
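
To connect this with the WORK library mentioned in the description: a one-level name such as main normally refers to the temporary WORK library, so the same data set can be addressed with the two-level name work.main. The sketch below assumes a default session where WORK is the library used for one-level names.

```sas
data work.main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

/* Same data set as "main", addressed with its two-level name. */
proc print data=work.main;
run;

/* List the members of the WORK library without the per-data-set details. */
proc contents data=work._all_ nods;
run;
```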
