Becoming a Data Scientist {EDC Developer + Statistical Expert + Data Manager}

I was drawn to computers at an early age. I did well in math, I loved science, and I started enjoying programming when my stepfather gave me a small computer for programming games. That was my first real experience with programming. I think the language was BASIC. The computer came with some built-in games and basic math problems, but you could also play around with BASIC code and create your own.

Then I went to a technical school and on to college, where I took introductory classes in information systems and technology along with courses in telecommunications management. Most of the courses covered IP, PBX, and network administration. As part of that curriculum, I took a basic programming course and VB.NET. I really liked VB.NET because it has a visual interface: you drag and drop controls to build the screen, and clicking a button fires an event that your code handles. I liked the design aspect of it (I am known to be very creative), so I started designing for people: website design and development, and small databases. That was a lot better than working in telecommunications. I still think VB is a great first language to learn. Later I took a Microsoft Access database development class where we learned relational database design, and I found out I was really good at it.

Before I graduated, I was already working for a well-known pharmaceutical company as a database analyst on their data management and biometrics team. They really liked what I did with their clinical operations data (investigator data, the kind we now manage in CTMS systems). That confirmed that databases were my passion. I love designing, managing, and maintaining them.

During my early years in this industry, I spent a lot of time writing SQL code and SAS programs. We pulled messy data (back then we used the Clintrial Oracle back-end system), and the work was very problem-solving oriented. Someone would ask a business question, and we would go into this messy database with either SQL or SAS and figure out the answer. I really enjoyed that.
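To give a feel for that kind of work, here is a minimal sketch in Python using the built-in sqlite3 module. Everything in it is made up for illustration: the investigators table, its columns, and the sample rows are hypothetical, and the real work ran against an Oracle back end with SQL or SAS rather than SQLite. The business question is "how many sites do we have per country?", and the messy part is that the country values were typed inconsistently.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, deliberately messy table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and columns, for illustration only.
    conn.execute("CREATE TABLE investigators (site_id TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO investigators VALUES (?, ?)",
        [
            ("101", "USA"),
            ("102", " usa"),      # leading space, lower case
            ("103", "Panama "),   # trailing space
            ("104", "panama"),
            ("105", "Canada"),
        ],
    )
    return conn


def sites_per_country(conn):
    """Count sites per country after trimming spaces and normalizing case."""
    query = """
        SELECT UPPER(TRIM(country)) AS country_clean, COUNT(*) AS n_sites
        FROM investigators
        GROUP BY UPPER(TRIM(country))
        ORDER BY country_clean
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for country, n_sites in sites_per_country(build_demo_db()):
        print(country, n_sites)
```

Without the TRIM and UPPER calls, the same query would report " usa" and "USA" as two different countries, which is exactly the kind of surprise a messy database hands you.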

In recent years, I have taken data from an EDC system, written scripts to summarize it for reporting, and loaded it into a data warehouse. I then use a product called IBM Cognos, which points to the data warehouse, to build those reports. I have worked with users across different departments, so there are many different audiences for the data and a lot of interesting data in there. I have also spent time using APIs to extract data via web services (usually in XML-ODM format) and generate useful reports in SAS or Excel XML.
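As a rough illustration of the extract-and-summarize step, here is a small Python sketch that flattens a CDISC ODM 1.3 document into one row per data item and counts items per form. The sample XML, the study and form names, and the function names are all assumptions made up for this example. A real extract would come from your EDC vendor's web service, which is not shown here, and I normally do the reporting side in SAS rather than Python.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by CDISC ODM 1.3 documents.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Tiny made-up extract standing in for a web service response.
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="DEMO_STUDY" MetaDataVersionOID="1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="VITALS">
          <ItemGroupData ItemGroupOID="VS">
            <ItemData ItemOID="SYSBP" Value="120"/>
            <ItemData ItemOID="DIABP" Value="80"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
    <SubjectData SubjectKey="002">
      <StudyEventData StudyEventOID="SCREENING">
        <FormData FormOID="VITALS">
          <ItemGroupData ItemGroupOID="VS">
            <ItemData ItemOID="SYSBP" Value="135"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def flatten_odm(xml_text):
    """Return one dict per ItemData element, carrying its parent keys."""
    root = ET.fromstring(xml_text)
    rows = []
    for subject in root.iterfind("odm:ClinicalData/odm:SubjectData", ODM_NS):
        for event in subject.iterfind("odm:StudyEventData", ODM_NS):
            for form in event.iterfind("odm:FormData", ODM_NS):
                for item in form.iterfind("odm:ItemGroupData/odm:ItemData", ODM_NS):
                    rows.append({
                        "subject": subject.get("SubjectKey"),
                        "event": event.get("StudyEventOID"),
                        "form": form.get("FormOID"),
                        "item": item.get("ItemOID"),
                        "value": item.get("Value"),
                    })
    return rows


def items_per_form(rows):
    """Summarize how many data items were collected on each form."""
    return Counter(row["form"] for row in rows)


if __name__ == "__main__":
    rows = flatten_odm(SAMPLE_ODM)
    for row in rows:
        print(row)
    print(items_per_form(rows))
```

Once the data is in flat rows like this, it can be written out to CSV or loaded into a warehouse table for a reporting tool to pick up.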

People think that being a data analyst is just sitting in front of a computer screen and crunching data. In fact, a lot of it is design-oriented, people-oriented, and problem-solving work. When people ask a question, I get to dive into the data and figure out the answer.

My next step is to get into predictive analytics and do more data mining and data forecasting.

Are you still excited about becoming a data scientist?

You can start by reading my blog about programming languages you should learn here!

Other tools and programming languages you should learn: Anaconda, R, Python, business intelligence software like Tableau, and big data analytics with Hadoop. It also helps to know HTML and CSS for creating new representations of the data (for example, when you use APIs and XML to extract data from third-party sources).

Anayansi, MPM, is an EDC Developer Consultant and clinical programmer for the pharmaceutical, biotech, and medical device industries, with more than 18 years of experience.

Available for short-term contracts or ad-hoc requests.  See my contact page for more details or contact me.

Fair Use Notice: Images, logos, and graphics on this page contain some copyrighted material whose use has not been authorized by the copyright owners. We believe that this not-for-profit, educational, and/or criticism or commentary use on the Web constitutes a fair use of the copyrighted material (as provided for in section 107 of the US Copyright Law).
