Category Archives: External Data

Becoming a Data Scientist {EDC Developer + Statistical Expert + Data Manager}

At an early age, I was drawn to computers. I did well in math, I loved science, and I started enjoying programming when my stepfather gave me a small computer to program games on. This was my first real experience with programming. I think the programming language was BASIC. The computer had some built-in games and basic math problems, but you could also play around with BASIC code and create your own.

Then I went to a technical school and on to college, where I took basic classes in information systems and technology as well as courses in telecommunications management. Most of the courses covered IP, PBX, and network administration. As part of that curriculum, I took a basic programming course and VB.NET. I really liked VB.NET because it has a visual interface (you drag and drop to build the interface), and clicking a button raises an event that you write code for. I liked the design aspect of it (I am known to be very creative), so I started designing for people: websites and small databases. That was a lot better than working in telecommunications. I thought VB was a great first language to learn. Later I took a Microsoft Access database development class, where we learned relational database design, and I found out I was really good at it.

Before I graduated, I was already working for a well-known pharmaceutical company as a database analyst within their data management and biometrics team. They really liked what I did with their clinical operations data (investigator data, the kind we now need CTMS systems for). This confirmed that databases were my passion. I love designing, managing, and maintaining them.

During my early years in this industry, I spent a lot of time writing SQL code and SAS programs. We pulled messy data (back then we used the Clintrial Oracle back end), and the work was very problem-solving oriented. A business question would be asked, and we would go into this messy database with either SQL or SAS and figure out the answer. I really enjoyed that.

In recent years, I have taken data from an {EDC} system, written scripts to summarize it for reporting, and loaded it into a data warehouse. I then use a product called ‘IBM Cognos’, which points to the data warehouse, to build those reports. I have worked with users across different departments (a lot of different audiences for the data), with a lot of interesting data in there. I have also spent time using APIs to extract data via web services (usually in XML-ODM format) and generating useful reports in SAS or Excel XML.
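To give a feel for the XML-ODM step, here is a minimal Python sketch that flattens an ODM clinical data extract into rows and writes them out as CSV. It is only an illustration under stated assumptions: the XML snippet, the OIDs, and the helper names `flatten_odm` and `to_csv` are made up for this example, the namespace is the one used by ODM 1.3, and a real EDC web service would add authentication and paging that are not shown here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# CDISC ODM 1.3 namespace; adjust if your export uses another ODM version.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical, hand-written snippet standing in for a web-service response.
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="ST001" MetaDataVersionOID="v1">
    <SubjectData SubjectKey="1001">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="VS">
          <ItemGroupData ItemGroupOID="VS_IG">
            <ItemData ItemOID="SYSBP" Value="120"/>
            <ItemData ItemOID="DIABP" Value="80"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
    <SubjectData SubjectKey="1002">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="VS">
          <ItemGroupData ItemGroupOID="VS_IG">
            <ItemData ItemOID="SYSBP" Value="135"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""

FIELDS = ["subject", "event", "form", "item", "value"]


def flatten_odm(xml_text):
    """Return one dict per ItemData element, with its subject/event/form context."""
    root = ET.fromstring(xml_text)
    rows = []
    for subject in root.iterfind(".//odm:SubjectData", ODM_NS):
        for event in subject.iterfind("odm:StudyEventData", ODM_NS):
            for form in event.iterfind("odm:FormData", ODM_NS):
                for item in form.iterfind("odm:ItemGroupData/odm:ItemData", ODM_NS):
                    rows.append({
                        "subject": subject.get("SubjectKey"),
                        "event": event.get("StudyEventOID"),
                        "form": form.get("FormOID"),
                        "item": item.get("ItemOID"),
                        "value": item.get("Value"),
                    })
    return rows


def to_csv(rows):
    """Serialize the flattened rows as CSV text (easy to open in Excel or read into SAS)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(flatten_odm(SAMPLE_ODM)))
```

The flat one-row-per-item layout is a deliberate choice: it loads easily into a warehouse staging table, and reporting tools can pivot it as needed.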

People think that being a data analyst is just sitting in front of a computer screen and crunching data. In fact, a lot of it is design-oriented, people-oriented, and problem-solving work. When people ask a question, I get to dive into the data and figure out the answer.

Next step is to get into predictive analytics and do more data mining and data forecasting.

Are you still excited about becoming a data scientist?

You can start by reading my blog about programming languages you should learn here!

Other tools and programming languages you should learn: Anaconda, R, Python, business intelligence software like Tableau, and big data analytics with Hadoop. It also helps to learn HTML and CSS so you can create new representations of the data (for example, when you use APIs and XML to extract data from third-party sources).

Anayansi, MPM, an EDC Developer Consultant and clinical programmer for the Pharmaceutical, Biotech, and Medical Device industry with more than 18 years of experience.

Available for short-term contracts or ad-hoc requests.  See my contact page for more details or contact me.

Fair Use Notice: Images/logos/graphics on this page contain some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law).

Randomisation

Comments? Join us at {EDC Developer}

Anayansi Gamboa, MPM, an EDC Developer Consultant and clinical programmer for the Pharmaceutical and Biotech industry with more than 13 years of experience.

Available for short-term contracts or ad-hoc requests. See my specialties section (Oracle, SQL Server, EDC Inform, EDC Rave, OpenClinica, SAS and other CDM tools)

As the 3 C’s of life state (Choices, Chances, and Changes), you must make a choice to take a chance or your life will never change. I continually seek to implement means of improving processes to reduce cycle time and decrease work effort.

Subscribe to my blog’s RSS feed and email newsletter to get immediate updates on latest news, articles, and tips. I am available on LinkedIn. Connect with me there for technical discussions.

Fair Use Notice: This article/video contains some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law). If you wish to use this copyrighted material for purposes that go beyond fair use, you must obtain permission from the copyright owner. Fair Use notwithstanding, we will immediately comply with any copyright owner who wants their material removed or modified, wants us to link to their website, or wants us to add their photo.

CDISC/CDASH Standards at your Fingertips

A standard database structure using CDISC (Clinical Data Interchange Standards Consortium) and CDASH (Clinical Data Acquisition Standards Harmonization) standards can facilitate the collection, exchange, reporting, and submission of clinical data to the FDA and EMEA. CDISC and CDASH standards provide reusability and scalability to EDC (electronic data capture) trials.

There are some challenges in implementing CDISC in an EDC CDMS:

1. Key personnel in companies must be committed to implementing the CDISC/CDASH standards.

2. There is an initial cost for deployment of new technology: SDTM Data Translation Software, Data Storage and Hosting, Data Distribution and Reporting Software.

3. It can be difficult to understand and interpret complex SDTM Metadata concepts and the different implementation guides.

4. Deciding at what point in a study to apply the standards can be challenging: in the study design process, during data collection within the CDMS [CDASH via EDC tools], in SAS prior to report generation [ADaM], or after study completion prior to submission [SDTM].

5. Data management staff [CDM, clinical programmers], biostatisticians, and clinical monitors may find it difficult to converge on a new standard when designing standard libraries and processes.

6. Implementing new standards involves reorganizing the organization's operations to improve efficiency [processes and SOPs].

7. Members of the data management team must be retrained on the use of new software and CDISC/CDASH standards.

8. There are technical obstacles related to implementation in several EDC systems, including the 8-character limit [SAS] on numerous variables, determining when to use supplemental qualifiers versus creating new domains, and creating a vertical data structure.
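Two of the technical obstacles in item 8 can be sketched in a few lines of Python: checking variable names against the 8-character limit, and turning a wide (one column per test) record into a vertical (one row per test) structure. This is only a sketch under assumptions: the naming rule below (a letter or underscore followed by up to seven letters, digits, or underscores) follows the SAS V5 transport convention, the helper names are hypothetical, and the `--TESTCD`/`--ORRES` columns simply imitate the SDTM findings layout rather than implement the full Implementation Guide.

```python
import re

# Assumed rule: SAS V5 transport names are at most 8 characters,
# start with a letter or underscore, and contain only letters, digits, underscores.
SAS_V5_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")


def invalid_variable_names(names):
    """Return the names that would break the 8-character naming rule."""
    return [name for name in names if not SAS_V5_NAME.fullmatch(name)]


def to_vertical(wide_row, id_keys, prefix):
    """Turn one wide record into one record per collected test.

    id_keys are carried onto every output row; every other non-empty column
    becomes a row with <prefix>TESTCD (the column name) and <prefix>ORRES (the value).
    """
    ids = {key: wide_row[key] for key in id_keys}
    records = []
    for name, value in wide_row.items():
        if name in id_keys or value is None:
            continue
        records.append({
            **ids,
            f"{prefix}TESTCD": name,
            f"{prefix}ORRES": str(value),
        })
    return records


if __name__ == "__main__":
    # Hypothetical variable names and a hypothetical vital signs record.
    print(invalid_variable_names(["USUBJID", "SYSTOLIC_BP", "VSORRES", "1STDOSE"]))
    wide = {"USUBJID": "ST001-1001", "VISIT": "SCREENING", "SYSBP": 120, "DIABP": 80}
    for record in to_vertical(wide, ("USUBJID", "VISIT"), "VS"):
        print(record)
```

Running the name check early, at CRF design time, is usually cheaper than renaming variables after the study database has gone live.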


Disclaimer: The EDC Developer blog is “one man’s opinion”. Anything that is said on the report is either opinion, criticism, information or commentary. If making any type of investment or legal decision it would be wise to contact or consult a professional before making that decision.

Disclaimer: The content of these columns does not necessarily reflect the opinion of {EDC Developer}.

Disclaimer: The legal entity on this blog is registered as Doing Business As (DBA) – Trade Name – Fictitious Name – Assumed Name as “GAMBOA”.

Professional Timeline – Clinical Programmer


CDASH News: More Thoughts from the CDISC Interchange in Sweden

FAIR USE-
“Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use.”

CDER Common Data Standards Issues Document

 Source: FDA (Version 1.1/December 2011)

 The Center for Drug Evaluation and Research (CDER) is strongly encouraging sponsors to submit data in standard form as a key part of its efforts to continue with advancement of review efficiency and quality. CDER has been collaborating with CDISC, a standards development organization (SDO), in the development of standards to represent study data submitted in support of regulatory applications. Study data standards are vendor-neutral, platform-independent, and freely available via the CDISC website (http://www.CDISC.org). CDISC study data standards include SDTM (Study Data Tabulation Model) for representation of clinical trial tabulations, ADaM (Analysis Data Model) for clinical trial analysis files, and SEND (Standard for Exchange of Non-clinical Data) for representation of nonclinical animal toxicology studies tabulations.

CDER has accepted SDTM datasets since 2004; however, due to differences in sponsor implementation of the standard, CDER has observed significant variability in submissions containing “standardized” electronic clinical trial data. CDER has received numerous “SDTM-like” applications over the past several years in which sponsors have not followed the SDTM Implementation Guide. Furthermore, aspects of particular sponsor implementations have actually resulted in increased review difficulty for CDER reviewers. In addition, some sponsors have wrongly believed that the submission of SDTM datasets obviates the need for the submission of analysis datasets, resulting in the delay in review due to the need to request these datasets. The goal of this document is to communicate general CDER preferences and experiences regarding the submission of standardized data in order to aid sponsors in the creation of standardized datasets for both tabulation datasets and analysis datasets.

This document is not intended to replace the need for sponsors to communicate with review divisions regarding data standards implementation approaches or issues, but instead, it is designed to complement and facilitate the interaction between sponsors and divisions. Because of specialized needs in different divisions, it is likely that divisions may have additional requests or preferences. When uncertainty exists regarding a particular data standards implementation or submission issue, the sponsor should contact the review division to discuss further.

The complete documentation on CDER data standards in .pdf version can be found at the following link: CDER

 


Anayansi Gamboa has an extensive background in clinical data management as well as experience with different EDC systems including Oracle InForm, InForm Architect, Central Designer, CIS, Clintrial, Medidata Rave, Central Coding, OpenClinica Open Source and Oracle Clinical.

Why use JReview for your Clinical Trials?

Issues with existing database query tools:
– Limited resources for current database query tools (Crystal Reports, SQL Server, etc.)
– Custom reports in SQL Server require validation
– Reports are not globally accessible

Why choose Integrated Review?

  • Offers flexibility to the users when viewing the data
  • Users can create their own reports without validation
  • Provides a way for Clinical Data Managers to have real-time access to query and browse clinical patient data in our databases
  • IReview/JReview can then reference the “nonnative” database object using the Foreign Panel and/or ImportSQL capabilities. The result is that the user can remain working in one environment and reference the data that is located in other environments.
  • Easy to set up – no programming, no data extraction or data manipulation required.
  • Generate PDFs directly – for all patients selected

“I-Review would be used solely for cleaning data – providing highest data integrity prior to stats analysis.”

Why use a Patient Profile?

  • When you want to review data from multiple tables for a single subject.
  • Very powerful when used with a Cross Tab report to provide the detail needed to investigate a finding.
  • Special case review such as SAEs or event adjudication.
  • Provides data to support the narrative writing team.
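Conceptually, a patient profile is a per-subject view assembled from several tables. The Python sketch below illustrates that idea only; it is not how JReview builds profiles, and the table names, column names, and helper functions are hypothetical.

```python
def build_patient_profile(subject_id, tables, key="SUBJID"):
    """Collect every record for one subject from each table, keyed by table name."""
    return {
        name: [row for row in rows if row.get(key) == subject_id]
        for name, rows in tables.items()
    }


def render_profile(subject_id, profile, key="SUBJID"):
    """Render the profile as plain text, one section per source table."""
    lines = [f"Patient profile: {subject_id}"]
    for name, rows in profile.items():
        lines.append(f"[{name}] {len(rows)} record(s)")
        for row in rows:
            fields = ", ".join(f"{k}={v}" for k, v in row.items() if k != key)
            lines.append(f"  {fields}")
    return "\n".join(lines)


# Hypothetical demographics (DM), adverse event (AE), and con med (CM) tables.
TABLES = {
    "DM": [
        {"SUBJID": "1001", "SEX": "F", "AGE": 54},
        {"SUBJID": "1002", "SEX": "M", "AGE": 61},
    ],
    "AE": [
        {"SUBJID": "1001", "AETERM": "Headache", "AESER": "N"},
        {"SUBJID": "1001", "AETERM": "Syncope", "AESER": "Y"},
        {"SUBJID": "1002", "AETERM": "Nausea", "AESER": "N"},
    ],
    "CM": [
        {"SUBJID": "1002", "CMTRT": "Ibuprofen"},
    ],
}

if __name__ == "__main__":
    profile = build_patient_profile("1001", TABLES)
    print(render_profile("1001", profile))
```

Keeping empty sections in the output (for example, a subject with no concomitant medications) is intentional: for SAE review or narrative writing, knowing that a table has no records for the subject is itself useful information.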

Advantages

  • Easy to build
  • Excel exportable file
  • Multiple subjects in a single report

Limitations

  • Poor readability
  • Output limited to 13 columns of data per row
  • Can’t edit column headers
  • Page header data is limited to 3 items
  • There is no option to use free text in the header or footer

Formatted Style
Advantages

  • High readability PDF style output which also prevents the manipulation of data
  • Free text entry for page header and footer can be used to add key notations to the report
  • Column header text can be edited to enable use of intuitive labels instead of database codes
  • Scheduling feature allows for running batches of patients and exporting outputs as a group
  • Bookmarks in output allow for quick navigation to data

Limitations

  • Creating a profile interactively in the tool can be slow (use the scheduler)
  • Can be very time consuming to develop (use a global template)

Object Storage

  • Private (accessible locally by the user only) vs Public (accessible by all users)
  • Usergroup (e.g. CDS, CDM, Clinical, Biostat, etc.)

Object Level

  • Study (at least one view (object) at the study level)
  • Project (a mixture of project and global level and available across the entire project)
  • StudyGroup
  • Global

Keys to Success

  • Think about your audience – Clinical or Data Managers
  • The goal is to provide a report which is easy to read
  • Develop a “template” using standard modules and data items
  • Establish standard formats for all parts of the report: font size, text alignment, and page margins
  • Use the same formatting for protocol specific elements

