Research Methodology and Biostatistics Series III - Medical Research Protocol

Abhaya Indrayan1*

1Department of Clinical Research, Max Healthcare, New Delhi

Abstract: The validity and reliability of the results of empirical research depend on the precision with which a study is carried out. A medical research protocol answers the what, who, when, why, and how of the study, and provides a backbone on which the study rests throughout its execution. Thus, careful consideration should be given to the preparation of a protocol.

The frame of a protocol contains (i) a title that is brief but adequate to describe the focus of the study, (ii) the rationale of the study in view of the lacunae in knowledge, (iii) the objectives and the hypotheses in a measurable format, (iv) the design of the study, such as descriptive or analytical, and experimental (e.g., trial) or observational (prospective, retrospective, or cross-sectional), (v) the variables to include and exclude along with their definitions, (vi) the target population to which the results would apply, (vii) the sample size and the sampling strategy, (viii) the complete methodology of collecting data, such as laboratory, radiology, and clinical investigations, (ix) the methods of statistical analysis proposed to be followed, and (x) the anticipated results and their medical utility. It also contains the references and appendices, particularly the case record form. All these are described in the protocol along with complete justification so that any third party, such as the funding agency or the reviewers and examiners, is convinced that the study is worth pursuing.

Key words: Medical Research Protocol, Empirical Research, Study Design, Sample Size Calculation, Statistical Analysis Methods

Introduction

Besides methodological development, medical research can be categorized into three major types: accidental, theoretical, and empirical. Alexander Fleming1 accidentally noticed mould on a contaminated staphylococcus culture plate, which led to the development of penicillin. Serendipity plays an important role in this kind of research. Theoretical research draws on logic and intuition. This kind of research is carried out mostly at the intra-cellular level these days. For example, the discovery by Julius and Patapoutian2 of thermal and mechanical transducers, which convert heat, cold, and touch into nerve impulses, provided insight into the mechanics of pain. For the uninitiated, medical research comprises collecting data on a set of subjects and analysing them to extract significant messages. This is called empirical research as it is based on data. The data may come from previous studies (meta-analysis), from records, from direct interface with the subjects, or from a combination of these. The subjects can be animals, as in many laboratory experiments, or human beings. The human beings can be the general population, as in most public health studies, or a set of patients, as in most clinical research. Accidental and theoretical research may also need validation on actual data through empirical studies.

For reliable and valid results3, empirical research must be properly planned with regard to the credentials of the investigators, the rationale of the study, the objectives and hypotheses, the subjects to be included, and the methodology to be followed. A comprehensive statement of all these is called a protocol, a document that expresses commitment and cannot be breached unless sufficient reasons are stated. Since the protocol is expected to be the backbone of a study and is to be followed throughout the study, considerable thought is given to it at the time of its development. This requires anticipating all that could go wrong and compromise the results, as well as listing the steps to counter such occurrences. The protocol tends to establish the quality of the research and the utility of the anticipated results so that the examiners and agencies can assess whether to support the research. The Scientific Committee and the Ethics Committee of the institute examine the protocol, suggest modifications, and approve it if found adequate. The study is carried out after such approval. An underappreciated role of a protocol is that it is an asset for writing the introduction and methodology sections of the paper, report, or thesis at the end of the study.

Frame of a research protocol

The frame of a research protocol can be described as follows. This frame is slightly different from the ones adopted by some research agencies, including the National Board of Examinations (NBE) for Post-graduate (PG) theses, but is required to comprehensively answer the what, why, who, when, and how of the study.

  1. Title and the investigators
  2. Rationale of the study
  3. Objectives and the hypothesis
  4. Target population
  5. Study design
  6. Sample size and the sampling strategy
  7. Specification of the variables under study
  8. Methodology of medical investigations and ethics
  9. Methods of statistical analysis
  10. Anticipated results and their utility

A protocol ends with a list of references and appendices.

Some of these components will be discussed in detail in the subsequent articles in this series. A brief explanation is as follows.

1. Title and the investigators

Refer to the first article4 in this series regarding the selection of a research question. That will give the title. The title is brief but states all the important content of the study. For example, it could be on the prevalence of frailty in kidney transplant cases, or on an association such as the relation of frailty to 2-year mortality in kidney transplant cases. The title also generally specifies the design, such as a prospective study, a retrospective study, or a clinical trial for the efficacy and side-effects of a regimen. The title could be in the usual narrative format, such as “Performance of Bae model for prediction of stroke in cases approaching a tertiary care hospital”, or in a question format, such as “Do blood glucometers agree with the laboratory values in paediatric cases?” Some researchers can come up with an innovative title that arouses the curiosity of the readers and creates interest.

In the case of a PG thesis, the investigator is the student, and the guides are listed. For full research, there may be a Principal Investigator (PI) and some co-investigators or collaborators. Their qualifications and affiliations with department and institution are stated. Some agencies evaluating the protocol may like to have more details of the investigators regarding their expertise and experience in the topic of research. These details establish the investigators’ credentials for successfully completing the study.

There is an increasing trend around the world to set up a consortium of investigators or institutions to pool their resources and to achieve a wide spectrum of expertise and a larger sample size representing a cross-section of the target population. This also helps in enhancing the credentials of the study and tends to develop a perception that the study results are believable.

2. Rationale of the study

Often called ‘Background’ and sometimes ‘Introduction’, the rationale at the beginning of the protocol establishes the need for the study. It includes citations, focusing on the lacunae in the literature, the gap in knowledge regarding the topic of research, or the conflicting reports. A detailed review of literature can be included separately when the lacunae or the conflict cannot be clearly established through a summary in the rationale section. A few sentences on how the results of the study could benefit the science of medicine are also included in this section.

3. Objectives and the hypothesis

The objectives are generally divided into primary and secondary, although this division is not necessary. As the name implies, primary objectives describe the focus, and the secondary objectives are those that could also be achieved with the data available on the primary objectives, possibly with marginal additions. The primary objectives emanate from the title, or one may say that the title emanates from the primary objectives. All the objectives are best described in SMART format (Specific, Measurable, Achievable, Relevant, and Time-based). For example, ‘outcome’ of a treatment is a non-specific term and not measurable, whereas duration of hospital stay and mortality are specific and measurable. An outcome such as 6-month mortality after discharge may be very different from 2-year mortality. Thus, specification of time is necessary for some outcomes. Consider whether the protocol has sufficient provisions of expertise, facilities, and time to achieve those objectives. The entire study depends on the objectives, so these should be formulated after careful consideration of all aspects of the research and must align with the title.

Hypotheses are statements of the anticipated results, which are tested for their truthfulness through the study. For example, the hypothesis may be that a test regimen has 10% better efficacy than the existing regimen. Efficacy can be specified in terms of, say, cure rate (relief or absence of signs and symptoms) within a month. Or the hypothesis could be that an artificial intelligence-based system has at least 5% more accuracy in diagnosing a condition compared with the clinical and laboratory investigations.
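One possible way to write the first hypothesis above in a testable form is sketched below. The symbols are assumptions introduced only for illustration: \pi_T and \pi_E denote the cure rates within a month under the test and the existing regimen, and the "10% better" claim is read here as a margin of 10 percentage points, which the protocol itself would need to confirm.

```latex
% Assumed notation (not from the article):
%   \pi_T = cure rate within a month, test regimen
%   \pi_E = cure rate within a month, existing regimen
%   \delta = claimed margin, read as 10 percentage points
\[
H_0:\ \pi_T - \pi_E \le \delta
\qquad \text{versus} \qquad
H_1:\ \pi_T - \pi_E > \delta,
\qquad \delta = 0.10
\]
```

Under the same reading, the second example would replace the cure rates by diagnostic accuracies and set \delta = 0.05. If "10%" is meant as a relative improvement instead, the margin would be expressed as a ratio of the two rates, so the protocol should state which reading is intended.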

4. Target population

The title and the objectives largely determine the target population to which the results would apply. For a study on risk factors for leakage in kyphoplasty, the target population could be patients with a T-score less than -1, restricted to those undergoing kyphoplasty for osteoporotic or pathological fractures. Patients who do not undergo post-operative Computed Tomography (CT) scans or who do not consent to participate in the study can be excluded. This defines the target population. A big issue in all such studies is that the patients are restricted to those admitted to a particular hospital, or even to those admitted under the care of a particular surgeon. The generalisability suffers due to such restrictions, yet the study may provide some useful leads on the risk factors for leakage. Another limitation comes from the consent. Only those who have confidence in themselves and in the hospital may provide informed consent, and in them the outcome may be different. All these limitations are considered while drawing conclusions at the end of the study.

5. Study design

The study designs will be discussed in detail in a future article in this series. The basic forms of design are descriptive and analytical. An analytical design can be experimental, where the investigators carry out an intervention to study the change in the course of the disease, or observational, where naturally occurring events are recorded. Laboratory studies on animals and clinical trials are experiments. A study of the effect of maternal anaemia on birth weight is observational when no specific intervention is planned. This study could be prospective, where mothers with and without anaemia are followed up for the birth weight of their child; retrospective, where mothers of low and normal birth weight babies are retrospectively studied for anaemia during pregnancy; or cross-sectional, where a sample of newborns is studied for maternal anaemia and birth weight together.

A later article in this series will have full details of clinical trials, including randomized double-blind controlled trials, which are considered the gold standard for assessing the efficacy and side effects of a regimen. For harmful interventions such as smoking, an observational study is the only choice. The prospective format of observational studies tends to give more believable results. The choice of the design depends on the objectives of the study, the target population, and the resources available, particularly the time frame of the study.

6. Sample size and the sampling strategy

It is hardly ever possible in medical research to study the entire target population: first, because of limited resources, and second, more importantly, because future cases can never be studied. The irony is that the findings of the study are mostly applied to future cases. Thus, a judicious sample is chosen that is unbiased and can represent the entire target population, if not the future cases. In clinical research, cases admitted to a hospital during a specified period are considered adequate provided all consecutive cases meeting the inclusion and exclusion criteria are included. When a large number of cases is available, a random sample is chosen. Various methods of sampling, both random and non-random, will be presented in a future article in this series. Note for the time being that convenience samples are rarely representative, and the findings cannot be directly applied to the target population. Thus, a convenience sample is not appropriate for research.
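As a minimal sketch of drawing a random sample when many eligible cases are available, the Python snippet below selects a simple random sample from a list of case identifiers. The function name, the identifier format, the sample size of 50, and the seed are all hypothetical choices made for illustration; recording the seed in the protocol is one way to make the selection reproducible and auditable.

```python
import random


def draw_simple_random_sample(eligible_ids: list[str], sample_size: int, seed: int) -> list[str]:
    """Draw a simple random sample, without replacement, from eligible cases.

    The seed is fixed so that the same list of eligible cases always
    yields the same sample (an illustrative design choice).
    """
    if sample_size > len(eligible_ids):
        raise ValueError("sample size exceeds the number of eligible cases")
    rng = random.Random(seed)  # independent generator; does not touch global state
    return sorted(rng.sample(eligible_ids, sample_size))


# Hypothetical example: 500 consecutive eligible cases, 50 to be sampled.
eligible_ids = [f"P{n:04d}" for n in range(1, 501)]
sample = draw_simple_random_sample(eligible_ids, sample_size=50, seed=2024)
print(len(sample), "cases selected")
```

Every eligible case has the same chance of selection here, which is the property that a convenience sample lacks.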

Sample size has become a major issue because of the greater emphasis on quality. It determines the reliability and the power of the study to produce useful results. The sample size is generally calculated for the most important outcome parameter and depends on the inter-individual variability, the confidence level, the tolerable Type-I error, and the power. All these will also be discussed in a subsequent article.
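To illustrate how these quantities enter the calculation, the sketch below uses one common textbook approximation (two-sided test, normal approximation, equal group sizes) for comparing two independent proportions. It is not necessarily the method the series will present later, and the function name and the cure rates of 70% and 80% are assumptions chosen only for the example.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1: float, p2: float,
                                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of subjects needed per group to detect p1 versus p2.

    alpha is the tolerable Type-I error (two-sided) and power is 1 - Type-II error.
    The variability enters through p(1 - p) for each group.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must be distinct proportions strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the Type-I error
    z_beta = NormalDist().inv_cdf(power)            # value corresponding to the power
    variability = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variability / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical example: existing regimen cures 70%, test regimen anticipated to cure 80%.
n_per_group = sample_size_two_proportions(0.70, 0.80)
print(n_per_group, "subjects per group")
```

Halving the anticipated difference roughly quadruples the required sample, which is why the anticipated effect has to be stated and justified in the protocol.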

7. Specification of the variables under study

Variables are the qualitative characteristics or quantitative measurements that vary from person to person and from time to time. Whether descriptive or analytical, all empirical research collects data on the specified variables in the sample subjects. This includes the specifics of the intervention in the case of trials. The primary and secondary objectives decide what variables to study, but they must be fully specified in the protocol. For example, if obesity is to be studied, specify whether it will be measured by body mass index, waist-hip ratio, body roundness index, or any other measure. Even for a variable as simple as age, decide whether it will be age in completed years (age last birthday) or in nearest years. In case a variable such as age is categorized, make the categories relevant to the outcome. Specify the scoring system, if any is proposed to be used, and specify how the score will be calculated. In the case of follow-up, specify the time points.

For analytical studies, clearly specify the antecedents and the outcomes. Both should be amenable to qualitative assessment or quantitative measurement. For a variable such as anaemia, specify whether the actual haemoglobin (Hb) level will be considered or a threshold will be used to define anaemia.
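The difference that such definitions make can be seen in a small Python sketch. All function names are hypothetical, the nearest-year rule uses a 365.25-day year as a simplifying assumption, and the anaemia threshold is deliberately left as a required argument because the protocol, not the code, must fix its value.

```python
from datetime import date


def age_completed_years(birth: date, on: date) -> int:
    """Age at last birthday (completed years)."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1  # birthday not yet reached in the reference year
    return years


def age_nearest_years(birth: date, on: date) -> int:
    """Age rounded to the nearest year, assuming a 365.25-day year."""
    return round((on - birth).days / 365.25)


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def is_anaemic(hb_g_dl: float, threshold_g_dl: float) -> bool:
    """Binary anaemia status for a threshold that the protocol must specify."""
    return hb_g_dl < threshold_g_dl


# Hypothetical subject: the two age definitions disagree by one year.
birth, visit = date(1980, 3, 15), date(2024, 1, 10)
print(age_completed_years(birth, visit), age_nearest_years(birth, visit))
print(round(body_mass_index(70, 1.75), 1))
print(is_anaemic(10.5, threshold_g_dl=11.0))  # 11.0 is an illustrative threshold only
```

Writing the definition as an explicit rule of this kind leaves no room for different observers to record the same subject differently.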

8. Methodology of medical investigations and ethics

The protocol specifies what data will be collected by observation, by interview, by examination, and by laboratory and radiological investigations. The training schedule, if needed, is also included. A good protocol also specifies the times at which the data will be collected by the different methods. Laboratory and radiological investigation methods are also specified, including details of the exact location in the body for these investigations. State how the validity and reliability of the tools and data will be ensured. Also include the method to assess the relief to the patient. A large study may require a Data Safety and Monitoring Board (DSMB) to ensure that the protocol is being followed and that there is no harm to the participants. Since medical research pertains to living beings, the entire investigation must conform to ethics. For example, if the research requires some extra expenditure on the part of the patient, its provision should be stated. State also how any adverse events will be taken care of.

9. Methods of statistical analysis

State that the initial analysis will be descriptive with regard to the demographic and clinical characteristics of the subjects in the study. This is followed by the analysis of associations, correlations, regressions, odds ratios, durations, etc., along with their confidence intervals where appropriate, and by tests of significance such as Student t-test, chi-square test, and F-test, with complete specification of the method for each variable. Specify the level of significance (such as 0.05) and the software to be used. In some studies, sensitivity, specificity, predictivities, and the Receiver Operating Characteristic (ROC) curve may be needed. Some studies may be on prediction models, which may require validation. Evaluation of risk factors may require multivariable logistic regression analysis. Justify each method of analysis proposed to be used. Other advanced methods such as calibration plots, decision curves, discriminant functions, and cluster analysis are rarely used but may be required in specific studies. A future article in this series will provide more details.
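Two of the measures named above can be sketched in a few lines of Python for a 2x2 table. The odds ratio interval uses the usual log-based (Woolf) approximation, which is one of several available methods and is assumed here only for illustration. The function names and all the counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, confidence: float = 0.95):
    """Odds ratio and its approximate confidence interval (log method).

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and predictive values of a test against a gold standard."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictivity": tp / (tp + fp),
        "negative_predictivity": tn / (tn + fn),
    }


# Hypothetical counts, for illustration only.
or_value, or_lower, or_upper = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f} (95% CI {or_lower:.2f} to {or_upper:.2f})")
print(diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80))
```

Naming the exact method for each interval and test in the protocol, as in the comments here, lets a reviewer check that the analysis matches the objectives.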

10. Anticipated results and their utility

A study is carried out in anticipation of a new finding, confirmation or refutation of an existing process, or just an assessment of the extent of a problem in a specified segment of the population. The protocol states the expected result of the study and how it may help medical science in improving the health of the people or a segment of them. This is stated for both positive and negative results. Limitations of the study due to the possibility of a truncated sample, incomplete investigations in some cases, errors in the data, etc., are also clearly mentioned. Such closing remarks in the protocol may be helpful in crystallizing the thoughts and in convincing the funding agencies and reviewers that the study is worth supporting.

References and Appendices

  1. Fleming A. On the antibacterial action of cultures of a penicillium, with special reference to their use in the isolation of B. influenzae. Br J Exp Pathol. 1929;10(3):226-36.
  2. The Nobel Prize. Press release. 2021 Nobel Prize in Physiology or Medicine jointly to David Julius and Ardem Patapoutian for discoveries of receptors for temperature and touch. Available at: https://www.nobelprize.org/prizes/medicine/2021/pressrelease/. Last accessed 25 July 2024.
  3. Indrayan A. Research Methodology and Biostatistics Series – II: Quality of medical research. Max Medical J. 2024;1(2):139-43.
  4. Indrayan A. Research Methodology and Biostatistics Series. Max Medical J. 2024;1(1):154-6.
  5. MEDLINE®/ PubMed® Journal Article Citation Format. https://www.nlm.nih.gov/bsd/policy/cit_format.html. Last accessed 29 July 2024.