Using PROC PRINT to Validate Clinical Data

When your data isn’t clean, you need to locate the errors and correct them. SAS procedures can help us determine whether the data is clean. Today, we will cover PROC PRINT.

  • The first step is to identify the errors in a raw data file. Usually, the DVP/DVS section of our DMP defines what is considered ‘clean’ and what counts as a data error.
    • Study your data
  • Then validate the data using PROC PRINT.
  • Finally, clean the data using DATA steps with assignment statements and IF-THEN-ELSE statements.

When you validate your data, you are looking for:

  • Missing values
  • Invalid values
  • Out-of-range values
  • Duplicate values

In the example below, our lab data ranges table contains missing values. We would also like to convert the lab test names to upper case.

[Screenshots: clinical raw data; PROC PRINT data validation code; PROC PRINT output (data validation)]

From the screenshot above, our PROC PRINT program identified all missing / invalid values as per our specifications. We need to clean up 6 observations.
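
The validation program appears in this post only as a screenshot. A minimal sketch of this kind of check might look like the following. The dataset name work.labranges and the sample rows are made up for illustration; only the variable names labTest, unit and EffectiveEnddate come from this post.

```sas
/* Hypothetical sample data: rows and dataset name are illustrative only */
data work.labranges;
  infile datalines dsd truncover;
  length labTest $20 unit $10;
  input labTest $ unit $ EffectiveEnddate :date9.;
  format EffectiveEnddate date9.;
  datalines;
Hemoglobin,g/dL,31DEC2025
wbc,,31DEC2025
PLATELETS,k/cumm,
GLUCOSE,mg/dL,31DEC2025
;
run;

/* Print only the observations that fail a check:
   a missing value in any field, or a lab test name not in upper case */
title 'Lab ranges: observations needing review';
proc print data=work.labranges;
  where labTest is missing
     or unit is missing
     or EffectiveEnddate is missing
     or labTest ne upcase(labTest);
run;
title;
```

In this sample, the GLUCOSE row passes every check, so it is left out of the listing. The WHERE conditions would need to be adapted to whatever your own validation specifications define as an error.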

Cleaning Data Using Assignment Statements and If-Then-Else in SAS

We can use the DATA step to update the datasets/tables/domains when there is invalid or missing data as per protocol requirements.

In our example, we have a lab data ranges table for a study that has started, but certain information is missing or invalid.

To convert our lab test to upper case, we will use an assignment statement. For the rest of the data cleaning, we will use IF statements.
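
The cleaning program appears in this post only as a screenshot, so here is a hedged sketch of the approach described. The dataset names, the sample rows and the exact IF conditions are assumptions; the replacement values k/cumm and 31DEC2025 are the ones used in this example.

```sas
/* Hypothetical sample data: rows and dataset name are illustrative only */
data work.labranges;
  infile datalines dsd truncover;
  length labTest $20 unit $10;
  input labTest $ unit $ EffectiveEnddate :date9.;
  format EffectiveEnddate date9.;
  datalines;
wbc,,31DEC2025
PLATELETS,k/cumm,
;
run;

data work.labranges_clean;
  set work.labranges;

  /* Assignment statement: convert every lab test name to upper case */
  labTest = upcase(labTest);

  /* IF-THEN statements: fill in values that are missing.
     The replacement values should come from the study specifications. */
  if missing(unit) then unit = 'k/cumm';
  if missing(EffectiveEnddate) then EffectiveEnddate = '31DEC2025'd;
run;

/* Re-print the cleaned data to confirm the fixes */
proc print data=work.labranges_clean;
run;
```

Writing the result to a new dataset (work.labranges_clean here) keeps the raw data untouched, which makes it easier to compare before and after.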

[Screenshot: PROC PRINT data cleaning code]

[Screenshot: data validation and data cleaning final dataset]

From our final dataset, we can verify that there are no missing values. We converted labTest to upper case and updated unit and EffectiveEnddate to k/cumm and 31DEC2025, respectively.

PROC PRINT cannot detect values that are not unique. We will cover that in our next post, ‘Using PROC FREQ to Validate Clinical Data’. To find or remove duplicates, check out my previous post, Finding Duplicate data, or use PROC SORT with the NODUPKEY option:

proc sort data=<dataset> out=sorted nodupkey equals; by ID; run;
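
As a small illustration of the PROC SORT approach, the sketch below removes duplicate keys and also keeps the removed rows for review. The dataset names and sample rows are hypothetical; DUPOUT= is an optional addition not used in the one-liner above.

```sas
/* Hypothetical sample data: ID 102 appears twice */
data work.labs;
  input ID labTest $;
  datalines;
101 WBC
102 RBC
102 RBC
103 HGB
;
run;

/* NODUPKEY keeps the first observation for each BY value;
   EQUALS preserves the original order within a BY group;
   DUPOUT= saves the removed observations so they can be reviewed */
proc sort data=work.labs out=work.sorted dupout=work.dups nodupkey equals;
  by ID;
run;

/* List the observations that were dropped as duplicates */
proc print data=work.dups;
run;
```

Reviewing work.dups before discarding anything is a useful habit in clinical data, where a repeated key may signal a data entry problem and not a true duplicate.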


From Non-SAS Programmer to SAS Programmer Part II

Previously, we wrote about how you can become a SAS Programmer with little or no programming background.

Today, I want to share a new link where you can download SAS Studio for free and practice. I want to thank Andrew from statskom for the tip. Visit his blog for more SAS tips.

Here are the steps you need to follow in order to use the free SAS University Edition provided by SAS:

1- Create a SAS profile and select the environment based on your operating system in order to download the SAS® University Edition. I chose Oracle VirtualBox. The available options are Oracle VirtualBox in Windows, Macintosh, and Linux operating environments.

2- You will receive an email with a link to download the SAS edition for the environment you selected in step 1. Click the link. It could take up to an hour for the entire program to download.

3- Go to https://www.virtualbox.org/wiki/Downloads to install Oracle VirtualBox.

4- Add the SAS University Edition vApp downloaded in step 2 to the VirtualBox installed in step 3.

5- Create a folder for your data and results.

6- Start the SAS University Edition vApp.

7- Open the SAS University Edition by opening your web browser and typing http://localhost:10080. From the SAS University Edition: Information Center, click Start SAS Studio.

There you have it! You now have access to SAS and can start practicing your new programming language.
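
As a quick first program to try in SAS Studio, something like the sketch below could be used to confirm that everything runs. It assumes the SASHELP.CLASS sample dataset that ships with SAS is available in your installation.

```sas
/* Print the first five observations of a sample dataset shipped with SAS */
proc print data=sashelp.class (obs=5);
run;
```

If the Results tab shows a small table of student records, the installation is working.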

For more information about the SAS University Edition, see the FAQs and videos at http://support.sas.com/software/products/university-edition/index.html.

For Data Management and EDC training, please contact RA eClinical Solutions.

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica, Open Source and Oracle Clinical.

Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.

Did You Know?

PROC CDISC?
It is a SAS procedure that is available as a hotfix for SAS 8.2 and by default in SAS 9.1.3 and later versions.

PROC CDISC is a procedure that allows SAS programmers to import and export XML files that are compliant with the CDISC ODM version 1.2 schema.

Source: SAS Programming in the Pharmaceutical Industry (textbook)