Tag Archives: issues with cleaning clinical data

SAS Date format

What is the date format that will display a SAS date as YYYYMMDD?

e.g. 20130815

Use the SAS format YYMMDDN8. (the YYMMDDNw. format with a width of 8; the N means no separator).
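To see what the format is doing, here is a minimal Python sketch (not SAS code). It relies on the fact that a SAS date value is a count of days since 1 January 1960. The function names to_sas_date and yymmddn8 are made up for this illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date value 0


def to_sas_date(d: date) -> int:
    """Days from 1 January 1960 to d, i.e. the number SAS stores for that date."""
    return (d - SAS_EPOCH).days


def yymmddn8(sas_date: int) -> str:
    """Render a SAS date value as YYYYMMDD with no separators (hypothetical helper)."""
    return (SAS_EPOCH + timedelta(days=sas_date)).strftime("%Y%m%d")


value = to_sas_date(date(2013, 8, 15))
print(value, yymmddn8(value))
```

The stored number never changes; the format only controls how it is displayed.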


Central Designer – Troubleshooting Tips

If an edit check or function fails to behave as expected, it is time to use your ‘troubleshooting’ skills. The following tips may help you when you are troubleshooting rules in InForm:

Rules:

  • check if rules are running
  • check the rule logic
  • check Rule Dependencies: a rule on a form has access to items on that form, but not other forms or other visits
  • check InForm machine’s Application Event Log

Though some vendors will correct major problems with their products by releasing entirely new versions, other vendors may fix minor bugs by issuing patches, small software updates that address problems detected by users or developers.

Check the release notes for Central Designer for known problems. The release notes provide descriptions and workaround solutions for known problems.
Remember that you can run the “Data Entry Rule Actions Report”. This report outputs all data entry rules in CSV format, which can be formatted into edit check specification documentation for QA testing.
A rule can be written in more than one way, which makes it difficult to impose any restrictions:
Scenario: The Route item has 3 choices: OP, SC and IV. The query should fire if the user chooses neither OP nor SC. This rule could be written in many ways:

–Value = route.Value

If (value == 3)

–Value = route.Value == 3

If (value == true)

–Value = !(route.Value == 3)

If (value == false)

–Value = (route.Value == 1 || route.Value == 2)

If (value == false)

–Value = route.Value !=1 && route.Value != 2

If (value == true)

Keep it consistent across the trial. Do not overuse conditional statements when a simple range check should be programmed.
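The five variants can be compared side by side with a small Python sketch. This is not Central Designer rule code. It assumes the codes OP = 1, SC = 2 and IV = 3 (inferred from the expressions above), and it uses None to stand in for an unanswered item, which is an assumption of this sketch; InForm may treat empty values differently.

```python
# Each function mirrors one way of writing the rule; it returns True when the query fires.
def variant_1(route):
    value = route
    return value == 3


def variant_2(route):
    value = (route == 3)
    return value == True


def variant_3(route):
    value = not (route == 3)
    return value == False


def variant_4(route):
    value = (route == 1 or route == 2)
    return value == False


def variant_5(route):
    value = (route != 1 and route != 2)
    return value == True


VARIANTS = [variant_1, variant_2, variant_3, variant_4, variant_5]


def fires(route):
    """Result of every variant for one route value."""
    return [variant(route) for variant in VARIANTS]


for route in (1, 2, 3, None):
    print(route, fires(route))
```

For the three valid codes all five variants agree. For the empty value in this sketch, the last two variants fire and the first three do not, which is one reason to pick a single style for the trial and to include empty values in the test cases.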

Note: Be aware that if you want to reuse a rule that uses data from a logical schema in another study, the other study must also contain the logical schema.

If you have explored most of the obvious possibilities and still cannot get your rule / edit check to work, ask someone in your team to peer review the build.

 

  • unit test your code
  • context available for defining test cases: site name, date/time, locale; form associations; empty values; unknown dates; repeating objects
  • test case results: Pass or Fail based on expected results
  • perform formal QA / QT

Remember to check the Event log via Control Panel -> Administrative Tools -> Event Viewer

Reference Document : Central Designer – Rule Troubleshooting.pdf

Your comments and questions are valued and encouraged.
Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica, Open Source and Oracle Clinical.

Data Management Plan in Clinical Trials

 

The preparation of the data management plan (DMP) is a simple, straightforward approach designed to promote and ensure comprehensive project planning.

The data management plan typically contains the following items:

  1. Introduction/Purpose of the document
  2. Scope of application/Definitions
  3. Abbreviations
  4. Who/what/where/when
  5. Project Schedule/Major Project Milestones
  6. Updates of the DMP
  7. Appendix

The objective of these guidelines is to define the general content of the Data Management Plan (DMP) and the procedures for developing and maintaining this document.

The abbreviation section could include all acronyms used within a particular study for further clarification.

e.g. CRF = Case Report Form
TA = Therapeutic Area

The Who/What/Where/When section should describe the objective of the study-specific data management plan for the ABC study. This section provides detailed information about the indications, the number of subjects planned for the study, the countries participating in the clinical trial, the monitoring guidelines (SDV or partial SDV), whether any CROs or third parties are involved in the study (e.g. IVRS, central labs), and which database will be used to collect study information (e.g. Clintrial, Oracle Clinical, Medidata Rave or InForm EDC).

The Appendix provides a place to put supporting information, allowing the body of the DMP to be kept concise and at a summary level. For example, you could document the database access of team members, the self-evident correction plan, or the data entry plan if using double data entry or paper-based clinical trial systems.

Remember, this is a living document and must be updated throughout the course of the clinical trial.

If problems arise during the life of a project, our first hunch would be that the project was not properly planned.

Reference: Role of Project Management in Clinical Trials


Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.

CDISC Clinical Research “A” Terminology

acronym: A word formed from the beginning letters (e.g., ANSI) or a combination of syllables and letters (e.g., MedDRA) of a name or phrase.

admission criteria: Basis for selecting target population for a clinical trial. Subjects must be screened to ensure that their characteristics match a list of admission criteria and that none of their characteristics match any single one of the exclusion criteria set up for the study.

algorithm: Step-by-step procedure for solving a mathematical problem; also used to describe step-by-step procedures for making a series of choices among alternative decisions to reach a calculated result or decision.

amendment: A written description of a change(s) to, or formal clarification of, a protocol.

analysis dataset: An organized collection of data or information with a common theme arranged in rows and columns and represented as a single file; comparable to a database table.

analysis variables: Variables used to test the statistical hypotheses identified in the protocol and analysis plan; variables to be analyzed.

approvable letter: An official communication from FDA to an NDA/BLA sponsor that lists issues to be resolved before an approval can be issued. [Modified from 21 CFR 314.3; Guidance to Industry and FDA Staff]

arm: A planned sequence of elements, typically equivalent to a treatment group.

attribute (n): In data modeling, refers to specific items of data that can be collected for a class.

audit: A systematic and independent examination of trial-related activities and documents to determine whether the evaluated trial-related activities were conducted and the data were recorded, analyzed, and accurately reported according to the protocol, sponsor’s standard operating procedures (SOPs), good clinical practice (GCP), and the applicable regulatory requirement(s). [ICH E6 Glossary]

audit report: A written evaluation by the auditor of the results of the audit. [Modified from ICH E6 Glossary]

audit trail: A process that captures details such as additions, deletions, or alterations of information in an electronic record without obliterating the original record. An audit trail facilitates the reconstruction of the history of such actions relating to the electronic record.

Source: Applied Clinical Trials


Acme Pharma Develops A Drug: Part I

Learn more about how the pharmaceutical industry has traditionally developed and brought drugs to market. Watch part II of this series to learn how Network Fortress can improve the drug development process and save pharma and biotech companies time and money.

-FAIR USE-
“Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.”

How to Use SAS – Lesson 6 – SAS Arithmetic and Variable Creation

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 6 introduces the concept of SAS arithmetic in the DATA STEP. I discuss how one can add, subtract, divide, multiply, or create their own formulas for variables in the data. I also discuss using SAS arithmetic to create new variables based on mathematical transformations of old variables, which may sometimes aid in meeting the assumptions of statistical tests. Finally, I provide basic examples of each of these methods.

Helpful Notes:

1. SAS uses many of the same arithmetic operators to add, subtract, divide and multiply as other programming languages and basic algebra.

2. Arithmetic operations on variables affect the entire list of observations. So be careful in operating with existing variables and make new variables if you can afford to.

3. The VARNUM option on the PROC CONTENTS statement lets you see the variables listed in the order they were created.

Today’s Code:

data main;
input x y;
cards;
1 2
3 4
5 6
7 8
;
run;

proc print data=main;
run;

data new_main; set main;
a = x + y;
b = x - y;
c = x * y;
d = x / y;
e = x ** y;
f = ((x + y) * (x - y));
run;

proc contents data=new_main varnum;
run;

proc print data=new_main;
run;
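If you do not have SAS at hand, the following Python sketch (not SAS code) mirrors the assignments above so you can check what PROC PRINT should show for new_main. The helper name derive is invented for this illustration. As in the DATA step, the formulas are applied to every observation.

```python
# The four observations entered after the CARDS statement, as (x, y) pairs.
main = [(1, 2), (3, 4), (5, 6), (7, 8)]


def derive(x, y):
    """Compute the new variables a to f for one observation."""
    return {
        "x": x,
        "y": y,
        "a": x + y,
        "b": x - y,
        "c": x * y,
        "d": x / y,
        "e": x ** y,
        "f": (x + y) * (x - y),
    }


# One derived record per observation, like the rows of new_main.
new_main = [derive(x, y) for x, y in main]

for obs in new_main:
    print(obs)
```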


How to Use SAS – Lesson 5 – Data Reduction and Data Cleaning

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 5 introduces the concept of data reduction (also known as subsetting data sets). I discuss how one can subset a data set (i.e. reduce a data set’s number of observations) based on some criteria using the IF statement in the DATA STEP, or using the WHERE statement in a PROC STEP. I also discuss using the KEEP, DROP, and RENAME statements for reducing data to only a handful of the original variables (i.e. reduce a data set’s number of variables). Furthermore, I show how one can label variables so that descriptive information can be presented in output and value formats so that specific values are easy to understand. Finally, I provide basic examples of each of these for three hypothetical data sets.

Helpful Notes:

1. There are two places you can reduce the data you analyze: in the DATA STEP and in the PROC STEP.

2. To subset data in the DATA STEP, use the IF statement.

3. To subset data in the PROC STEP, use the WHERE statement.

4. Another way to reduce data is to eliminate variables using a KEEP or DROP statement. This method is useful if you are creating a second data set or analytic version of your main dataset.

5. The RENAME statement simply changes a variable’s name.

Today’s Code:

data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

proc contents data=main; run;
proc print data=main; run;

/* 1. Reduce data in the DATA STEP using a simple IF statement */
data reduced_main; set main;
if x = 1;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 2. Reduce data in the PROC STEP using a simple WHERE statement */
proc print data=main;
where x = 1;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 3. Reduce data in the DATA STEP by KEEPing only the variables you do want */
data reduced_main; set main;
KEEP x y;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 4. Reduce data in the DATA STEP by DROPping the variables you don't want */
data reduced_main; set main;
DROP y;
run;

proc print data=main; run;
proc print data=reduced_main; run;

/* 5. Clean up variables using the RENAME statement within a DATA STEP */
data clean_main; set main;
rename x = ID y = month z = day;
run;

proc contents data=main; run;
proc contents data=clean_main; run;

/* 6. Clean up variables using a LABEL statement within a DATA STEP */
data clean_main; set clean_main;
label ID = "Identification Number" month = "Month of the Year" day = "Day of the Year";
run;

proc contents data=main; run;
proc contents data=clean_main; run;

/* 7. FORMAT value labels using the PROC FORMAT and FORMAT statements */
PROC FORMAT;
value months 1="January" 2="February" 3="March" 4="April" 5="May" 6="June" 7="July" 8="August" 9="September" 10="October" 11="November" 12="December";
run;

data clean_main; set clean_main;
format month months.;
run;

proc freq data=clean_main;
table month;
run;
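The difference between reducing observations and reducing variables can also be sketched outside SAS. The Python below is only an illustration of what each statement does to the two-row main data set; the variable names on the left are invented for the sketch.

```python
# Rows of the lesson's main data set, one dict per observation.
main = [{"x": 1, "y": 2, "z": 3}, {"x": 7, "y": 8, "z": 9}]

# IF x = 1; (DATA step) or WHERE x = 1; (PROC step): fewer observations.
reduced_rows = [obs for obs in main if obs["x"] == 1]

# KEEP x y; : fewer variables, same observations.
kept = [{name: obs[name] for name in ("x", "y")} for obs in main]

# DROP y; : the complement of KEEP.
dropped = [{name: value for name, value in obs.items() if name != "y"} for obs in main]

# RENAME x = ID y = month z = day;
new_names = {"x": "ID", "y": "month", "z": "day"}
clean_main = [{new_names[name]: value for name, value in obs.items()} for obs in main]

# FORMAT month months.; changes how a value is displayed, not what is stored.
month_labels = ["January", "February", "March", "April", "May", "June", "July",
                "August", "September", "October", "November", "December"]
months = {number: label for number, label in enumerate(month_labels, start=1)}
displayed = [months[obs["month"]] for obs in clean_main]

print(reduced_rows, kept, dropped, clean_main, displayed, sep="\n")
```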


How to Use SAS – Lesson 4 – Merging Data Sets

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 4 introduces the concept of merging SAS data sets using a variety of methods. I discuss how one can merge two or more data sets in the DATA STEP using the SET statement. I also describe how one can use the MERGE statement to bring two or more datasets together that may have a common index variable. Furthermore, I describe the SORT procedure (PROC SORT) that must be used with the MERGE statement. Finally, I provide basic methods of merging data sets using PROC SQL.

Helpful Notes:
1. Use one SET statement when you have the same variables, but different observations.

2. Use two SET statements when you have different variables, but the same observations.

3. Use the MERGE statement when you have a common index variable, and any new variables or observations.

4. The MERGE statement first requires that you use the SORT procedure (PROC SORT) to sort on the index variable before merging.

5. Make sure that you add the BY statement after the MERGE statement in your DATA step or you will have a new dataset that is merged incorrectly.

6. PROC SQL is an advanced method of merging data that can be very powerful for large datasets. It uses different kinds of “JOINS” that I will provide more information on in a later video.

Today’s Code:
data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

/* 1. Use one SET statement when you have the same variables, but different observations */
data more_people;
input x y z;
cards;
4 5 6
3 6 9
;
run;

data final;
set main more_people;
run;

proc print data=final; run;

/* 2. Use two SET statements when you have different variables, but the same observations */
data more_vars;
input a b c;
cards;
20 40 60
10 20 30
;
run;
data new_final;
set main;
set more_vars;
run;

proc print data=new_final; run;

/* 3. Use the MERGE statement when you have a common index variable, and any new variables or observations */
data more_vars_and_people;
input x a b c;
cards;
1 20 40 60
7 10 20 30
2 11 12 13
3 14 15 16
;
run;

* The MERGE statement requires that you use an index variable to merge on (e.g. an ID variable).;
* Thus, you must SORT your data BY that index variable.;
proc sort data=main;
by x;
run;
proc sort data=more_vars_and_people;
by x;
run;
data merged_final;
merge main more_vars_and_people;
by x;
run;

proc print data=merged_final; run;

/* 4. SQL is an advanced programming language for databases. Here, I provide a basic example to merge the two datasets using a LEFT JOIN. I will include more information about JOIN types in a follow up video. For now, think of a LEFT JOIN as one that only includes the data from the second dataset (more_vars_and_people) that corresponds to data from the original dataset (main).
*/
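proc sql;
create table sql_final as
/* List R's columns by name: selecting R.* as well would select x twice and log a warning */
select L.*, R.a, R.b, R.c
from main as L
LEFT JOIN more_vars_and_people as R
on L.x = R.x;
quit;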

proc print data=sql_final; run;
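The MERGE with BY x and the LEFT JOIN do not return the same rows for these data sets: the match-merge keeps every value of x found in either data set, while the LEFT JOIN keeps only the values of x found in main. The Python sketch below (not SAS code) illustrates this under the assumption that x is unique within each data set, as it is here. The helper names index_by and combine are invented, and None stands in for a SAS missing value.

```python
main = [{"x": 1, "y": 2, "z": 3}, {"x": 7, "y": 8, "z": 9}]
more_vars_and_people = [
    {"x": 1, "a": 20, "b": 40, "c": 60},
    {"x": 7, "a": 10, "b": 20, "c": 30},
    {"x": 2, "a": 11, "b": 12, "c": 13},
    {"x": 3, "a": 14, "b": 15, "c": 16},
]
COLUMNS = ["x", "y", "z", "a", "b", "c"]


def index_by(rows, key):
    """Look-up table from key value to row (assumes the key is unique)."""
    return {row[key]: row for row in rows}


def combine(keys, left, right, key="x"):
    """Build one output row per key, filling absent variables with None."""
    left_by, right_by = index_by(left, key), index_by(right, key)
    result = []
    for k in keys:
        row = dict.fromkeys(COLUMNS)       # start with every variable missing
        row.update(left_by.get(k, {}))     # values from the first data set
        row.update(right_by.get(k, {}))    # values from the second data set
        row[key] = k
        result.append(row)
    return result


# MERGE main more_vars_and_people; BY x; keeps keys from either data set.
all_keys = sorted({r["x"] for r in main} | {r["x"] for r in more_vars_and_people})
merged_final = combine(all_keys, main, more_vars_and_people)

# LEFT JOIN keeps only the keys present in main.
sql_final = combine([r["x"] for r in main], main, more_vars_and_people)

print([r["x"] for r in merged_final])
print([r["x"] for r in sql_final])
```

In merged_final the observations for x = 2 and x = 3 have missing y and z, because main has no matching rows for them.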


How to Use SAS – Lesson 3 – Importing External Data

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 3 introduces the concept of permanent or external data sets and how to import them into SAS. I provide basic methods of importing permanent data sets using the INFILE statement and the IMPORT procedure (PROC IMPORT) for non-SAS based data files. I also discuss libraries and the LIBNAME statement to import SAS data directly using the SET statement. Finally, I show how one can save a SAS data set from the data step using LIBNAMEs in the DATA step.

Helpful Notes:

1. The LIBNAME statement is used to point SAS towards a specific folder on your computer.
2. The INFILE statement “reads” data into SAS if it is of a certain format (usually comma, space, or tab delimited).
3. PROC IMPORT – imports data of any of several different file formats into SAS.
4. The SET statement imports data from a library into SAS at the DATA STEP.
5. Prefixing the data set name in the DATA statement with a library name (e.g. home.sas_format) “writes” the data from SAS into that library’s folder using SAS’s own file format.

Today’s Code:
data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

proc contents data=main;
run;

proc print data=main;
run;

/* TEMPLATED CODE: .txt file type, with or without delimiters */
data [appropriate data set name here];
infile "[your file location here, including .txt extension]" LRECL=[a logical record length long enough to encompass an ENTIRE line of data] DLM=',';
input
[variable names here]
;
run;

data infile_main;
infile "C:\My SAS Files\main.txt";
input x y z;
run;

proc print data=infile_main;
run;

/* TEMPLATED CODE: Microsoft Excel (.xls) file type */
proc import out=[your data set name here]
datafile='[your file location here, including .xls extension]'
dbms=excel replace;
*Optional statements are below;
sheet='[specify sheet to obtain]';
getnames=[yes/no - first row = variable names];
mixed=[yes/no - refers to data types, if num AND char variables, use yes];
usedate=[yes/no - read date formatted data as date formatted SAS data];
scantime=[yes/no - read in time formatted data as long as variable is not date format];
run;

proc import out=imported_excel
datafile='C:\My SAS Files\main.xls'
dbms=excel replace;
*Optional statements are below;
sheet='Sheet1';
getnames=yes;
run;

proc print data=imported_excel;
run;

libname home "C:\My SAS Files\";

data sas_format; set home.main;
run;

data home.sas_format; set infile_main;
run;


How to Use SAS – Lesson 2 – Creating Datasets on the Fly

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 2 introduces some basic data step programming to define variables and specify their values for data sets containing one or more observations.

I also introduce two procedures: the PRINT procedure (PROC PRINT) to display the data contents in the OUTPUT window, and the CONTENTS procedure (PROC CONTENTS) to summarize the data set. Finally, I introduce the concept of libraries to show another method of inspecting the data set by physically opening it from the temporary WORK library.

Helpful Notes:

1. PROC PRINT – displays the entire data set by observation in the OUTPUT window
2. PROC CONTENTS – summarizes the properties of a data set, including an alphabetic listing of the variables and a count of the number of observations.
3. The assignment operator (“=”) directly specifies the value of a variable in the data step.
4. The INPUT statement defines one or more variables of our data set.
5. The CARDS statement specifies the values for each of the INPUT variables (in order).
6. It is a good rule of thumb to always pair the INPUT and CARDS statements together.
7. DON’T FORGET SEMICOLONS! They end statements, and without them you will almost certainly get errors.
8. If you have any errors, always, ALWAYS, ALWAYS check the LOG first!
9. Creating datasets “on-the-fly” just means you’re making a new dataset without bringing in the data from any other source.

Today’s Code:

data main;
input x y z;
cards;
1 2 3
7 8 9
;
run;

proc print data=main;
run;

proc contents data=main;
run;
