
Gerstein Science Information Centre

Searching the Literature: A Guide to Comprehensive Searching in the Health Sciences

Students and researchers in the health sciences are often required to conduct comprehensive searches of the literature. Follow the steps in this guide to learn how this process works.

Is it Safe to use ChatGPT to Write your Search?

[Decision tree diagram: choices to consider when deciding whether to use ChatGPT to write your search]

Using ChatGPT to Create a Boolean Search in PubMed

In February 2023, Wang et al. demonstrated a method for using ChatGPT to write Boolean queries for systematic reviews using PubMed syntax. According to the authors, ChatGPT could generate queries with higher precision, but lower recall, than other state-of-the-art automatic search strategy generators.

ChatGPT-generated queries can find SOME relevant articles, but NOT ALL relevant articles.


The Wang et al. method requires: 

  • 4 prompts that provide step-by-step instructions for ChatGPT
  • a title and statement from a seed publication (ideally a previously published systematic review) that describe the searchable PICO elements in your search question (i.e., condition, intervention, study design)

Caveats they identified: 

  • ChatGPT inserts imaginary MeSH terms (i.e., it uses MeSH headings that do not exist as part of the search; see the sketch after this list)
  • ChatGPT does not reproduce the same search even when the exact same prompts are used
  • users of this method may not be able to understand the output well enough to assess its validity (e.g., recall, precision)
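
Because ChatGPT may insert MeSH headings that do not exist, it is worth checking every term it tags with [MeSH] against the real MeSH vocabulary before running the search. Below is a minimal Python sketch (not part of the Wang et al. method) that checks candidate headings against the MeSH database using NCBI's E-utilities ESearch endpoint; the headings listed are hypothetical examples, and you would substitute the [MeSH]-tagged terms from your own ChatGPT output.

import time
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Hypothetical examples of terms a ChatGPT-generated query tagged with [MeSH]
candidate_headings = [
    "Thyroid Neoplasms",              # a real MeSH heading
    "differentiated thyroid cancer",  # tagged [MeSH] by ChatGPT, but not an actual heading
]

for heading in candidate_headings:
    params = {"db": "mesh", "term": heading, "retmode": "json"}
    result = requests.get(ESEARCH, params=params, timeout=30).json()
    count = int(result["esearchresult"]["count"])
    # A count of zero strongly suggests the heading does not exist in MeSH;
    # a non-zero count can include partial matches, so confirm in the MeSH database itself.
    print(f"{heading}: {count} MeSH record(s) matched")
    time.sleep(0.4)  # stay under NCBI's rate limit (about 3 requests/second without an API key)

The MeSH database in PubMed offers the same check interactively.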

Things to note:

  • ChatGPT cannot currently produce search strategies for proprietary database platforms (e.g., Ovid MEDLINE, Embase, PsycINFO)
  • it cannot produce phrases with proximity operators, which can help improve precision while maintaining recall
  • users of this method must be able to understand PubMed syntax
IMPORTANT: IN ORDER TO ASSESS THE OUTPUT FROM CHATGPT, YOU MUST BE ABLE TO INTERPRET AND CRITIQUE A BOOLEAN SEARCH USING PUBMED SYNTAX. DO NOT TRUST. ALWAYS VERIFY. 
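
One concrete way to verify is to submit the generated string to PubMed's ESearch E-utility and inspect the result count, the query translation PubMed actually executes, and any phrases it could not match (a common sign of invented terms). The Python sketch below is illustrative only, and the query string is a placeholder for whatever ChatGPT produced.

import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Placeholder: paste the ChatGPT-generated Boolean query here
query = '("thyroid neoplasms"[MeSH Terms] OR "thyroid cancer"[Title/Abstract]) AND autopsy[Title/Abstract]'

params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
result = requests.get(ESEARCH, params=params, timeout=30).json()["esearchresult"]

print("Records retrieved:", result["count"])
# How PubMed actually interpreted the query; compare this with what you intended
print("Query translation:", result.get("querytranslation", ""))
# Phrases PubMed could not match are a red flag for imaginary terms
print("Phrases not found:", result.get("errorlist", {}).get("phrasesnotfound", []))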

 

Should you use ChatGPT to generate your systematic or scoping review search strategy? 

  • ChatGPT cannot create reproducible search strategies 
  • it can potentially generate queries with acceptable precision but unacceptable recall (see the worked example after this list)
  • the search strategies it produces are often messy, with very poor face validity
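
For reference: recall (sensitivity) is the proportion of all relevant records that a search retrieves, and precision is the proportion of retrieved records that are relevant. The worked example below sketches the arithmetic in Python, assuming you already know the PMIDs of the studies included in a seed systematic review; all PMIDs shown are made up.

# Hypothetical PMIDs of the studies included in the seed systematic review (the "gold" set)
relevant_pmids = {"11111111", "22222222", "33333333", "44444444"}

# Hypothetical PMIDs returned by the ChatGPT-generated query
retrieved_pmids = {"11111111", "22222222", "55555555", "66666666", "77777777"}

true_positives = relevant_pmids & retrieved_pmids

recall = len(true_positives) / len(relevant_pmids)      # 2 / 4 = 0.50, so half the known studies are missed
precision = len(true_positives) / len(retrieved_pmids)  # 2 / 5 = 0.40

print(f"Recall: {recall:.2f}  Precision: {precision:.2f}")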

Example Prompts

The following prompts are copied verbatim from Shuai Wang, Harrisen Scells, Bevan Koopman, and Guido Zuccon. 2023. Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23). Association for Computing Machinery, New York, NY, USA, 1426–1436. https://doi.org/10.1145/3539618.3591703 

Prompt 1: Asks ChatGPT to produce a list of 50 "relevant" terms. 

Follow my instructions precisely to develop a highly effective Boolean query for a medical systematic review literature search. Do not explain or elaborate. Only respond with exactly what I request. First, Given the following statement and text from a relevant study, please identify 50 terms or phrases that are relevant. The terms you identify should be used to retrieve more relevant studies, so be careful that the terms you choose are not too broad. You are not allowed to have duplicates in your list. statement: [insert title of seed article here] text: [insert sample text (eg. abstract) from seed article here]

Prompt 2: Asks ChatGPT to classify terms into three categories using PICOT.

For each item in the list you created in step 1, classify it into as of three categories: terms relating to health conditions (A), terms relating to a treatment (B), terms relating to types of study design (C). When an item does not fit one of these categories, mark it as (N/A). Each item needs to be categorised into (A), (B), (C), or (N/A)

Prompt 3: Asks ChatGPT to create a Boolean Query in PubMed Syntax. 

Using the categorised list you created in step 2, create a Boolean query that can be submitted to PubMed which groups together items from each category. For example: ((itemA1[Title/Abstract] OR itemA2[Title/Abstract] or itemA2[Title/Abstract]) AND (itemB1[Title/Abstract] OR itemB2[Title/Abstract] OR itemB3[Title/Abstract]) AND (itemC1[Title/Abstract] OR itemC2[Title/Abstract] OR itemC3[Title/Abstract]))

Prompt 4: Asks ChatGPT to refine search strategy and add relevant MeSH. 

Use your expert knowledge to refine the query, making it retrieve as many relevant documents as possible while minimising the total number of documents retrieved. Also add relevant MeSH terms into the query where necessary, e.g., MeSHTerm[MeSH]. Retain the general structure of the query, however, with each main clause of the query corresponding to a PICO element. The final query still needs to be executable on PubMed, so it should be a valid query.
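
These prompts are meant to be entered one at a time into the same ChatGPT conversation, so that each step can refer back to the previous answer. If you experiment programmatically instead, the same chaining can be reproduced by resending the full message history on every call, as in the rough Python sketch below using the OpenAI SDK; the model name and the abridged prompt strings are placeholders, not part of the published method.

from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

# Abridged placeholders for the four Wang et al. prompts, with your seed title/abstract substituted in
prompts = [
    "Follow my instructions precisely to develop a highly effective Boolean query ...",
    "For each item in the list you created in step 1, classify it ...",
    "Using the categorised list you created in step 2, create a Boolean query ...",
    "Use your expert knowledge to refine the query ...",
]

messages = []
for prompt in prompts:
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(
        model="gpt-4o",     # placeholder; use whichever model you are evaluating
        messages=messages,  # full history so step 2 can see the list from step 1, and so on
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})

print(reply)  # the final Boolean query from Prompt 4; verify it before trusting it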

Example of a ChatGPT-generated Boolean Query

Below is the search strategy generated from the final prompt in Wang et al. (2023), cited in full above.

The search is designed to retrieve prevalence studies of differentiated thyroid cancer identified at autopsy.

(((differentiated thyroid cancer[MeSH] OR "differentiated thyroid"[All Fields] OR "thyroid carcinoma"[All Fields] OR "papillary microcarcinoma"[All Fields]) AND (prevalence[All Fields] OR incidence[MeSH] OR "etiology of"[All Fields] OR "risk factors"[All Fields] OR gender[All Fields] OR hormonal[All Fields] OR "nodular goiter"[All Fields] OR "Hashimoto’s thyroiditis"[MeSH] OR malignancy[MeSH] OR "concomitant lesion"[All Fields] OR tumor[All Fields] OR infiltrate[All Fields] OR fibrosis[All Fields] OR "early stages of development"[All Fields] OR frequency[All Fields])) AND (autopsy[MeSH] OR surgical[All Fields] OR material[All Fields] OR series[All Fields] OR specimens[All Fields] OR cases[All Fields]))

Ask yourself the following questions: 

  1. Can you identify the errors in this search?
  2. Can you explain what the search is doing in PubMed?
  3. Would you trust the results of this query to find all the relevant studies for your question (i.e., recall/sensitivity) while limiting the number of irrelevant studies (precision)?
  4. Would you be able to translate this query into additional databases?
  5. Do you trust ChatGPT to do your data collection?

 

This work is openly licensed via CC BY-NC-SA 4.0 (unless otherwise noted). For information on this guide contact Erica Nekolaichuk, Faculty Liaison & Instruction Librarian at the Gerstein Science Information Centre.