Tag Archives: issues with cleaning clinical data

How to Use SAS – Lesson 2 – Creating Datasets on the Fly

This video series is intended to help you learn how to program using SAS for your statistical needs. Lesson 2 introduces some basic data step programming to define variables and specify their values for data sets containing one or more observations.

I also introduce two procedures: the PRINT procedure (PROC PRINT) to display the data contents in the OUTPUT window, and the CONTENTS procedure (PROC CONTENTS) to summarize the data set. Finally, I introduce the concept of libraries to show another method of inspecting the data set by physically opening it from the temporary WORK library.

Helpful Notes:

1. PROC PRINT – displays the entire data set by observation in the OUTPUT window
2. PROC CONTENTS – summarizes the properties of a data set, including an alphabetic listing of the variables and a count of the number of observations.
3. The assignment operator (“=”) directly specifies the value of a variable in the data step.
4. The INPUT statement defines one or more variables of our data set.
5. The CARDS statement specifies the values for each of the INPUT variables (in order).
6. It is a good rule of thumb to always pair the INPUT and CARDS statements together.
7. DON’T FORGET SEMI;COLONS! They end statements and without them, you will most certainly have errors arise.
8. If you have any errors, always, ALWAYS, ALWAYS check the LOG first!
9. Creating datasets “on-the-fly” just means you’re making a new dataset without bringing in the data from any other source.

Today’s Code:

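/* Create a data set named MAIN in the temporary WORK library. */
data main;
  /* INPUT defines three numeric variables; CARDS supplies their values in order. */
  input x y z;
  cards;
1 2 3
7 8 9
;
run;

/* Display every observation in the OUTPUT window. */
proc print data=main;
run;

/* Summarize the data set: variable listing and number of observations. */
proc contents data=main;
run;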

-FAIR USE-
“Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.”

Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.


Clinical Trials Terminology for SAS Programmers

Entry Level SAS Programmers

Statistical Programmer: programs in the SAS language to analyze clinical data and produce reports for the FDA.
Bioanalyst, Clinical Data Analyst, Statistical Programmer Analyst and SAS Programmer: same as Statistical Programmer.
Biotechnology: a general term for the technique of using living organisms within biological systems to develop micro-organisms for a particular purpose.
Protocol: the document that outlines all the procedures and contains the detailed plans of the study.
Controlled experiment: a clinical trial in which patients are divided into groups, such as a placebo-controlled group that receives no active drug, so that comparisons can be made between the groups.
CFR Part 11: the part of the Code of Federal Regulations (Title 21, under which the FDA regulates the food, drug, biologics and device industries) that deals specifically with the creation and maintenance of electronic records.
Case Report Form (CRF): a form used to collect information such as demographics and adverse events.
Source data: the information collected; source documents are important because they contain the core information required to reconstruct the study.
Sponsor: the company responsible for the management, financing and conduct of the entire trial.
Randomized: subjects are randomly assigned to groups so that each subject has an equal chance of being assigned to the placebo control.
Baseline: the point at which subjects are assigned to their drug.
Change from baseline: analyses that measure differences between the baseline and the current visit.
Placebo (sugar pill): an inactive substance designed to look like the drug being tested.
Blinded: subjects do not know whether the drug they are taking contains the active ingredient.
Open-label study: everything is out in the open, including the drug each subject is assigned to.
Pharmacokinetics (PK): the analysis of how the body absorbs, distributes, metabolizes and excretes a drug; a PK analysis can show, for example, that a given dosing level leads to high levels of toxicity in the subject.
Informed consent: the process and document that describe all the potential benefits and risks involved.
TLGs: Tables, Listings and Graphs.
Trade name: the drug name that is collected from the patient and recorded in the source data. For example: Tylenol.
Generic name: refers to the drug's chemical compound. For example: Acetaminophen.
WHO-DRUG: a dictionary, managed by the World Health Organization, that lists drug names and how they match to generic drug names.
MedDRA: Medical Dictionary for Regulatory Activities.
SAP: Statistical Analysis Plan.
ANOVA: analysis of variance.
Confidence interval: an estimated range of values calculated from the sample of patient data that is currently in the study.
Null hypothesis: the assumption that there is no difference between the groups in a report.
Pilot study: perform the same analysis upon an older.
DIA: Drug Information Association.
CBER: Center for Biologics Evaluation and Research (biologics).
CDER: Center for Drug Evaluation and Research (drug).

Source: CDER Acronym List



Adverse Event Monitoring for CRAs

During monitoring visits, one of the most important and impactful activities that a CRA performs is the source document verification of Adverse Events. The CRA is the eyes of the research sponsor when it comes to proper collection and documentation of subject safety information. Incorrect and inadequate monitoring of adverse events can lead to inaccurate labeling for clinical trials and impact market application inspectional reviews, as well as post-marketing labeling. The safety regulatory and ICH definitions will be reviewed and applied to the monitoring process. This includes Causality, Expectedness/Unanticipated, and other important concepts. Case scenarios will be used to apply the information for better learning.


Introduction to Clinical Trials

Video introducing cancer clinical trials and their use in clinical practice guidelines


Source: Cancer Guidelines – Canada

Data Management: Queries in Clinical Trials

When an item or variable has an error or a query raised against it, it is said to have a “discrepancy” or “query”.

All EDC systems have a discrepancy management tool, also referred to as an “edit check” or “validation check” tool, whose checks are programmed in a known programming language (e.g. PL/SQL, C#, SQL, Python, etc.).

So what is a ‘query’? A query is an error generated when a validation check detects a problem with the data. Validation checks are run automatically whenever a page is saved (“submitted”) and can identify problems with a single variable, between two or more variables on the same eCRF page, or between variables on different pages. A variable can have multiple validation checks associated with it.
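To make the idea concrete, here is a minimal sketch in Python of how validation checks might raise queries when a page is saved. It is only an illustration: the `Query` class, the check functions and the field names used here (SEX, BRTHDTC, VISDAT) are assumptions for this example and do not come from any particular EDC system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Query:
    """A discrepancy raised against one data item (hypothetical structure)."""
    item: str
    message: str


def check_sex(page):
    # Single-variable check: the value must come from the allowed code list.
    if page.get("SEX") not in {"M", "F", "U", "UN"}:
        return Query("SEX", "Sex must be one of M, F, U or UN")
    return None


def check_birth_before_visit(page):
    # Cross-variable check on the same page: birth date must precede the visit.
    birth, visit = page.get("BRTHDTC"), page.get("VISDAT")
    if birth and visit and birth >= visit:
        return Query("BRTHDTC", "Date of birth must be before the visit date")
    return None


# A variable can have several checks; here every check sees the whole page.
CHECKS = [check_sex, check_birth_before_visit]


def run_checks(page):
    """Run every validation check, as would happen when a page is saved."""
    results = (check(page) for check in CHECKS)
    return [query for query in results if query is not None]


page = {"SEX": "X", "BRTHDTC": date(2020, 5, 1), "VISDAT": date(2019, 1, 15)}
for query in run_checks(page):
    print(f"{query.item}: {query.message}")
```

Each check returns either a query or nothing, so new checks can be added to the list without touching the code that runs them.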

Errors can be resolved in several ways:

  • by correcting the error – for example, by entering a new value so that the datapoint is updated
  • by marking the variable as correct – some EDC systems require an additional response, and you can raise a further query if you are not satisfied with the response

Dealing with queries
Queries can be issued and/or answered by a number of people involved in the trial. Some of the common setups are: CDM, CRA or monitors, Site or coordinators.

Types of Queries

  • Auto-Queries or Systems checks
  • Manual Queries
  • Coding Queries
  • SDV related Queries generated during a Monitor visit
  • External Queries – for externally loaded data in SAS format

EDC Systems and Discrepancy Output Examples

InForm

Note: All queries are associated to a single data item relevant to that query.

RAVE

Note: Users are only able to see / perform an action on a query based on their role and the permissions via Core Config.

Timaeus

Note: Queries are highlighted by a red outline and a Warning icon.

OpenClinica

Note: Extensive interfaces for data query.

Query Metrics – It is important to measure the performance of your clinical trials.
Metrics are the same for all clinical studies but not all EDC systems are the same. Standardized metrics encourage performance improvement, effectiveness, and efficiency. Some common metrics are:

  • Outstanding Query
  • Query Answer Time
  • Average Time to Query Resolution
  • Number of closed discrepancies on all ongoing studies
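As a rough illustration, metrics like these can be computed from a simple log of queries. The sketch below assumes a hypothetical log of (query id, date opened, date closed) tuples; the data and function names are made up for the example and are not the export format of any EDC system.

```python
from datetime import date

# Hypothetical query log: (query id, date opened, date closed or None if still open).
QUERIES = [
    ("Q1", date(2024, 1, 2), date(2024, 1, 5)),
    ("Q2", date(2024, 1, 3), None),
    ("Q3", date(2024, 1, 4), date(2024, 1, 11)),
]


def outstanding(queries):
    """Ids of the queries that have not been closed yet."""
    return [qid for qid, _opened, closed in queries if closed is None]


def closed_count(queries):
    """Number of closed queries."""
    return sum(1 for _qid, _opened, closed in queries if closed is not None)


def average_days_to_resolution(queries):
    """Mean days between opening and closing, over closed queries only."""
    durations = [(closed - opened).days
                 for _qid, opened, closed in queries if closed is not None]
    return sum(durations) / len(durations) if durations else None


print("Outstanding queries:", outstanding(QUERIES))
print("Closed queries:", closed_count(QUERIES))
print("Average days to resolution:", average_days_to_resolution(QUERIES))
```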

Data management’s experience with data queries in clinical trials


Trademarks: InForm is a trademark or registered trademark of Oracle Corporation. Rave is a trademark or registered trademark of Medidata. Timaeus is a trademark or registered trademark of Cmed Clinical Research.



Standard Naming Conventions for InForm Trials

This document is intended to provide a common set of rules to apply to the naming of clinical trials built using the InForm EDC system.

Why use naming conventions?

Naming objects consistently, logically and in a predictable way will distinguish similar records from one another at a glance, and by doing so will facilitate the storage and retrieval of records, which will enable users to browse clinical objects more effectively and efficiently. Naming records according to agreed conventions should also make object naming easier for colleagues because they will not have to ‘re-think’ the process each time.

It has been said that InForm follows the “Hungarian” notation because it is one of Microsoft’s “Best Practices” for .Net standards when defining objects (the code that supports those objects uses it).

Component                                      Prefix
Form (e.g., frmDemo…)                          frm
Section                                        sct
Itemset                                        its
Radio Control                                  rdc
Item                                           itm
Pulldown Control                               pdc
Text box                                       txt
Date and time                                  dtm
Group Control                                  grp
Checkbox                                       chk
Calculated Control                             cal
Simples                                        smp
Study Element                                  elm
Codelist                                       cl
Study Event                                    evt
Codelist Item                                  citm
Workflow Rule                                  wr
Global Conditions                              gc
Data Entry Rules (e.g., rulDMConsDTCompare)    rul

DataType                                       Prefix
Boolean                                        bln
Byte                                           byt
Character                                      chr
Date                                           dtm
Decimal                                        dec
Double Precision                               dbl
Integer                                        int
Long Integer                                   lng
Object                                         obj
Short Integer                                  sht
Single Precision                               sng
String                                         str
User-defined Type                              udt

Object                                         Prefix
Button                                         btn
CheckBox                                       chk
ComboBox                                       cbo
Control                                        ctr
DataSet                                        ds
DataTable                                      dt
Form                                           frm
GroupBox                                       grp
Label                                          lbl
ListBox                                        lst
PictureBox                                     pic
RadioButton                                    rdb
String                                         str
TextBox                                        txt

Remember: keep it consistent. This means that you stick to one particular pattern throughout your clinical project. This also includes the words you use for namespaces, classes, methods, interfaces, properties and variables. A prerequisite is that they should be meaningful, significant, descriptive and easily understood with respect to purpose and functionality by anyone who reads the source code.
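One way to keep names consistent is to check them automatically. The following Python sketch uses a subset of the component prefixes listed above and assumes that a valid name is a lowercase prefix followed by a descriptive part starting with a capital letter, as in frmDemo. That pattern and the `component_type` helper are assumptions made for this example, not an InForm feature.

```python
import re

# Component prefixes copied from the prefix list above (a subset, for brevity).
COMPONENT_PREFIXES = {
    "frm": "Form",
    "sct": "Section",
    "its": "Itemset",
    "itm": "Item",
    "txt": "Text box",
    "chk": "Checkbox",
    "rul": "Data Entry Rules",
}

# Assumed pattern: lowercase prefix, then a name that starts with a capital letter.
NAME_PATTERN = re.compile(r"^([a-z]+)([A-Z][A-Za-z0-9]*)$")


def component_type(name):
    """Return the component type implied by the prefix, or None if the name is invalid."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return COMPONENT_PREFIXES.get(match.group(1))


for name in ["frmDemo", "rulDMConsDTCompare", "Demographics", "xyzDemo"]:
    kind = component_type(name)
    print(f"{name}: {kind if kind else 'does not follow the convention'}")
```

A check like this could be run over an exported list of object names before a study build is reviewed.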

Happy Programming!



The Next Best Thing – Timaeus Trial Builder?

First of all, let me clarify by saying that I am not an expert when it comes to Timaeus. I recently came across this EDC tool while working on a project. We were testing out different EDC applications as part of their new infrastructure solution.

At first, I was hesitant to learn about it. All I knew was that you needed to know Python, the main programming language for its edit checks/validations and back-end structure. After my first encounter with the tool, I changed my mind: this is one of the easiest tools on the market today for building and deploying a clinical study.

With that being said, what is Timaeus? It is another EDC tool, a trial builder application provided by Cmed Technology (www.cmedresearch.com), which helps build eCRFs (data entry screens), edit checks/validations, external data loading and other config files.

In order to grasp this new tool, you will need to familiarize yourself with other technologies such as HTML, XML, Emacs, SVN, Python and the like and understand the TMPL element concept.

TMPL stands for “Timaeus Markup Language”. Its code looks similar to what you see in HTML or XML files.

Even though the system lacks some of the front-end features we are used to in similar EDC solutions, this tool gets my thumbs up for ease of use, cost-effectiveness, change control capabilities and one of the most robust security systems for capturing electronic records as per 21 CFR Part 11 regulations.

A New Way to Collect Data – CDASH

There is a general consensus that the old paper-based data management tools and processes were inefficient and should be optimized. Electronic Data Capture (EDC) has transformed clinical trial data collection from a paper-based Case Report Form (CRF) process to an electronic CRF (eCRF) process.

In an attempt to optimize the process of collecting and cleaning clinical data, the Clinical Data Interchange Standards Consortium (CDISC) has developed standards that span the research spectrum from preclinical through postmarketing studies, including regulatory submission. These standards primarily focus on definitions of electronic data, the mechanisms for transmitting them, and, to a limited degree, related documents, such as the protocol.

Clinical Data Acquisition Standards Harmonization (CDASH)

The newest CDISC standard, and the one that will have the most visible impact on investigative sites and data managers, is Clinical Data Acquisition Standards Harmonization (CDASH).

As its name suggests, CDASH defines the data in paper and electronic CRFs.

Although it is compatible with CDISC’s standard for regulatory submission (SDTM), CDASH is optimized for data captured from subject visits, so some mapping between the standards is required. In addition to standardizing questions, CDASH also references CDISC’s Controlled Terminology standard, a compilation of code lists that allows answers to be standardized as well.

Example: Demographics (DM)

Description/definition    Variable name    Format
Date of Birth*            BRTHDTC          dd MMM yyyy
Sex**                     SEX              $2
Race                      RACE             2
Country                   COUNTRY          $3

*CDASH recommends collecting the complete date of birth, but recognizes that in some cases only BRTHYR and BRTHMO are feasible.

**This document lists four options for the collection of Sex: Male, Female, Unknown and Undifferentiated (M|F|U|UN). CDASH allows a subset of this codelist to be used, and it is typical to offer only the options Male and Female.

The common variables are STUDYID, SITEID or SITENO, SUBJID, USUBJID, and INVID. All of them are SDTM variables with the exception of SITENO, which can be used to collect a Site ID for a particular study and is then mapped to SITEID for SDTM.

Common timing variables are VISIT, VISITNUM, VISDAT and VISTIM, where VISDAT and VISTIM are mapped to the SDTM --DTC variable.
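As a rough illustration of this kind of mapping, the Python sketch below combines a collected visit date and time into a single ISO 8601 value, the format SDTM uses for its --DTC date/time variables, and carries a collected SITENO over to SITEID. The "dd MMM yyyy" date layout follows the example table above, while the "HH:MM" time layout, the VS domain prefix, the function names and the sample record are assumptions made for this example.

```python
# Month abbreviations for the "dd MMM yyyy" collection format.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def to_iso8601(visdat, vistim=None):
    """Combine a 'dd MMM yyyy' date and an optional 'HH:MM' time into ISO 8601."""
    day, month, year = visdat.split()
    iso_date = f"{int(year):04d}-{MONTHS[month.upper()]:02d}-{int(day):02d}"
    return f"{iso_date}T{vistim}" if vistim else iso_date


def map_to_sdtm(cdash, domain="VS"):
    """Map a few collected fields to SDTM names (the domain prefix is an assumption)."""
    return {
        "STUDYID": cdash["STUDYID"],
        "SITEID": cdash["SITENO"],  # collected as SITENO, submitted as SITEID
        "SUBJID": cdash["SUBJID"],
        "VISIT": cdash["VISIT"],
        "VISITNUM": cdash["VISITNUM"],
        f"{domain}DTC": to_iso8601(cdash["VISDAT"], cdash.get("VISTIM")),
    }


# Made-up record, for illustration only.
collected = {
    "STUDYID": "ABC-001", "SITENO": "101", "SUBJID": "0007",
    "VISIT": "SCREENING", "VISITNUM": 1,
    "VISDAT": "05 Mar 2024", "VISTIM": "14:30",
}
print(map_to_sdtm(collected))
```

A real mapping would be driven by the study's mapping specification rather than hard-coded like this.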

Note: Certain variables are populated using the Controlled Terminology approach. For example, COUNTRY is typically not collected; it is populated with ISO 3166 country codes from the country code list.

Each variable is defined as:

  • Highly Recommended: A data collection field that should be on the CRF (e.g., a regulatory requirement).
  • Recommended/Conditional: A data collection field that should be collected on the CRF for specific cases or to address TA requirements (may be recorded elsewhere in the CRF or from other data collection sources).
  • Optional: A data collection field that is available for use if needed

The CDASH and CDISC specifications are available on the CDISC website free of charge. There are several tools available to help you during the mapping process from CDASH to SDTM. For example, you could use Base SAS, SDTM-ETL or CDISC Express to map clinical data to SDTM.

In general you need to know CDISC standards and have a good knowledge of data collection, processing and analysis.

With the shift in focus of data entry, getting everyone comfortable with using a particular EDC system is a critical task for study sponsors looking to reduce the inefficiencies of the clinical trial data collection process. The tools are certainly available to help clinical trial personnel adapt to new processes and enjoy better productivity.



Central Designer aCRF Generation

aCRF casebook – PDF file

1. Project Explorer – click on the 2nd level study name
2. File -> Annotated Study Book Options
3. Uncheck Time and Events table
4. Uncheck Study Object Description Tables
5. Change the date variable format to match the eCRF map; this setting is not noted on the aCRF output
6. File -> View Annotated Study Book
7. Pop up screen, click Print in lower right
8. Select Printer – Adobe PDF (single click)
9. Click Preferences
10. Click on Layout tab
11. Change to Landscape
12. Click on Advanced tab in lower right
13. Change Scaling as needed, check PDF output as needed
14. Select Print, wait for file name box, aCRF is done.

Time and Events Table – CSV file

1. File -> Annotated Study Book Options
2. Check Time and Events Table
3. File -> View Annotated Study Book
4. Click on Save Time & Events as button lower left
5. Give it a file name

Note: Steps 1-5 have to be repeated every time, as the aCRF defaults back to base settings after you close out the study.



Clinical Trials Acronyms

ADaM – Analysis Data Model (a CDISC standard)
ADR – Adverse Drug Reaction
AE – Adverse Event
CRA – Clinical Research Associate
ATC – Anatomical Therapeutic Chemical coding dictionary
CDASH – Clinical Data Acquisition Standards Harmonization (a CDISC initiative)
CDISC – Clinical Data Interchange Standards Consortium
CDM – Clinical Data Management
CDMS – Clinical Data Management System
CF – Consent Form
CSR – Clinical Study Report
CRB – Case Record Book
CT – Clinical Trial
CTA – Clinical Trial Agreement
CTD – Common Technical Document
CRB – Central Review Board
CRF – Case Report Form
CRO – Contract Research Organization
CNS – Central Nervous System
GMP – Good Manufacturing Practices
GRP – Good Review Practice
GXP – Good Pharmaceutical Practice
eCTD – Electronic Common Technical Document
EDC – Electronic Data Capture
EDI – Electronic Data Interchange
IB – Investigator’s Brochure
IC – Informed Consent
IND – Investigational New Drug Application (FDA)
IVRS – Interactive Voice Response System
MedDRA – Medical Dictionary for Regulatory Activities
OC – Oracle Clinical
SDV – Source Document (Data) Verification
QA – Quality Assurance
QC – Quality Control
QL/QOL – Quality of Life
R&D – Research and Development
SAE – Serious Adverse Event
SAS – Statistical Analysis System
WHO – World Health Organization

Reference: Part of this post was taken from the Applied Clinical Trials website at actmagazine

