Carnegie Mellon University Libraries

A guide to Sysrev for collaborative literature and document reviews

About Sysrev auto-labeling

Sysrev has a built-in generative AI auto-label feature that uses OpenAI's GPT-4o model to automate the labeling process. Sysrev generates an auto-label report that lets users compare auto-labeling results against human labeling, so that labels can be assessed, improved and optimized to maximize accuracy while keeping the assessment process transparent.

There are a few important things to know about the auto-label feature:

  • It costs money to run the auto-labeler: Running the auto-labeler requires funds in your Sysrev project. If you are a project lead, fill out this funds request form to request funds from the CMU Libraries to use the auto-label feature. Note that funds are approved on a case-by-case basis. At this time, individual users are not able to add their own funds to projects in the CMU Organizational account.
  • The cost of running the auto-labeler varies based on how many records you are labeling, how many labels are activated for auto-labeling, and how much content is being 'read' by the auto-labeler (e.g., citation only vs. full PDF), among other factors (a rough, illustrative cost sketch follows this list).
  • Only owners and admins on the project can run the auto-labeler: If your user status for a project is set at 'member', you will not be able to use the auto-labeler.
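
If it helps to size a funds request, you can sketch a rough cost estimate from the factors above. The example below is purely illustrative and is not Sysrev's billing formula: the token counts and per-1K-token prices are placeholder assumptions, so replace them with current OpenAI GPT-4o pricing and your own record lengths.

# Rough, illustrative estimate of one auto-label run.
# All numbers are placeholder assumptions -- check current GPT-4o pricing
# and your own record lengths; Sysrev's actual charges may differ.
def estimate_cost(n_records,
                  n_active_labels,
                  tokens_per_record=1000,      # assumption: citation-only records
                  output_tokens_per_label=50,  # assumption: short label answers
                  input_price_per_1k=0.0025,   # placeholder $ per 1K input tokens
                  output_price_per_1k=0.01):   # placeholder $ per 1K output tokens
    """Return a rough dollar estimate for one auto-label run."""
    input_tokens = n_records * tokens_per_record
    output_tokens = n_records * n_active_labels * output_tokens_per_label
    return (input_tokens / 1000) * input_price_per_1k \
        + (output_tokens / 1000) * output_price_per_1k

# Example: 500 citation-only records with 3 active labels
print(f"~${estimate_cost(500, 3):.2f}")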

 

There are some useful settings that you can apply to control how the auto-labeler runs, including:

  • You can choose which auto-labels to activate: Each label you create has a setting to indicate whether the label should be included or ignored in the auto-labeling process. Note that the default is to include the label in the auto-labeling process.
  • You can choose what information the auto-label should 'read': You can choose, at the label level, whether the auto-labeler should consider citation information (e.g., title and abstract) only or should also 'read' attached content like PDFs.
  • You can have the auto-labeler pre-populate label answers or not: The Prefill review answers setting (covered in the workflow below) controls whether the auto-label answers are filled in as review answers.
  • You can choose to auto-label only a subset of records: You can set an article filter and/or limit the auto-labeler to a certain number of records to control how many records are labeled in each run of the auto-labeler.

 

These features and settings are covered in more detail below.

Suggested workflow for testing and running auto-labels

In general, before running the auto-labeler across the entirety of your project documents, you should test, assess and optimize your auto-labels on a small random sample of records. Here is a recommended workflow for optimizing and then using the auto-labeler for a literature or document review project.

  1. Two human reviewers label records in a project. Go to Manage and then the Options section of Settings and select Full for Article Review Priority. This allows you to quickly accumulate a number of double-reviewed records. Consider skipping records without abstracts to avoid running the auto-labeler on incomplete metadata.
  2. Set Prefill review answers to No: This is found in the Options section of Manage -> Settings. We advise turning this off while refining your label prompts, and turning it on once label prompts are finalized and you are ready to auto-label all records.
  3. Set an article filter for only double-reviewed records: To achieve this, apply the following two filters:
    1. Match -> Filter Type: Consensus + Consensus Status: Determined + Inclusion: Any
    2. Exclude -> Filter Type: Consensus + Consensus Status: Single + Inclusion: Any
  4. Ensure that your labels' auto-label settings are correct: Review each active label to ensure that only the desired labels have the Auto Label Extraction box checked, and that Full Text is set to the desired content (citations only vs. attachments). Note: This is important for ensuring that you don't accidentally overspend your budget.
  5. Set auto-label Max Articles to desired number of records for initial assessment: This setting will work through the filtered articles in order up to the max number provided. We recommend auto-labeling at least 20 double-reviewed articles at a time while testing and refining auto-labeling prompts.
  6. Run the auto-labeler.
  7. Review the Auto Label Report: This will appear at the bottom of the Overview page. More information about reading and using the report is in the box below.
  8. Refine label question prompts: See below for tips on improving your label question prompts.
  9. Set the article filter for a new set of double-reviewed records: If you want to test your revised prompts on a new set of records (recommended, to avoid overfitting your prompts to one small set of articles), you can add an additional filter to the already filtered set of double-reviewed records: Exclude -> Filter Type: Auto label + Label: Any label. Alternatively, you can increase the Max Articles setting in the auto-labeler to label both previously auto-labeled records and a set of new records.
  10. Run the auto-labeler again, review the Auto Label Report, and repeat steps 8-10 until you have reached a comfortable accuracy score (a sketch for checking auto-label/human agreement from exported labels follows this list).
  11. Auto-label the remaining records in the project: At this point, you can set Prefill review answers to Yes (in the Options section of Manage -> Settings). Run the auto-labeler for all records by removing filters and increasing the Max Articles setting. Once complete, reviewers can work through the articles to check the pre-filled auto-label answers for accuracy.
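
Once a run finishes, a quick way to quantify agreement outside Sysrev is to export the project's labels and compare the auto-label answers to the human answers. The sketch below assumes you have such an export as a CSV; the file name and the column names (article_id, label, user_answer, auto_answer) are hypothetical and should be adjusted to match your actual export.

# Sketch: per-label agreement between auto-label answers and human answers.
# The CSV file name and column names below are assumptions -- adapt them
# to whatever your Sysrev export actually contains.
import pandas as pd

df = pd.read_csv("sysrev_label_export.csv")  # hypothetical export file

agreement = (
    df.assign(match=df["user_answer"] == df["auto_answer"])
      .groupby("label")["match"]
      .mean()
      .sort_values()
)
print(agreement)  # labels with the lowest agreement are the ones to refine first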

Understanding the auto-label report

[content coming soon]

Tips for optimizing your label prompts

There are a number of techniques you can use to improve the accuracy of your auto-labels. Here are a few things to try:

  • Break down more complex questions into parts. For example, if you want to know whether an article meets several criteria for inclusion and you want the auto-labeler to provide a Yes or No answer, ask it to work through a set of questions, one per criterion. In the following example, we want to include articles that are systematic reviews or meta-analyses about the impacts of wildfires on health, the environment or economic factors. We can break this down and give the auto-labeler the following prompt in the label's Question section (a sketch for trying a prompt like this outside Sysrev follows the example):

 

Answer true or false for the following questions about this article.

1. Is this a systematic review or meta-analysis?

2. Does this focus on the impacts of wildfires?

3. Does this include at least one health, environmental or economic impact of wildfires?

If all of the answers are true, include this article. If any of the answers are false, exclude the article.
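
If you want to trial a decomposed prompt like this on a handful of records before spending project funds, one option is to call GPT-4o directly through the OpenAI Python SDK. This is a minimal sketch under stated assumptions: it is not how Sysrev wraps your prompt internally, the title and abstract are placeholders you would paste in by hand, and it requires your own OpenAI API key.

# Minimal sketch for trying a label prompt on one sample record outside Sysrev,
# using the OpenAI Python SDK (pip install openai). Sysrev's internal prompt
# wrapping will differ from this plain call.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

label_prompt = """Answer true or false for the following questions about this article.
1. Is this a systematic review or meta-analysis?
2. Does this focus on the impacts of wildfires?
3. Does this include at least one health, environmental or economic impact of wildfires?
If all of the answers are true, include this article. If any of the answers are false, exclude the article."""

record = "Title: ...\nAbstract: ..."  # paste a real title and abstract here

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": f"{label_prompt}\n\n{record}"}],
)
print(response.choices[0].message.content)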

 

  • Use another genAI tool like ChatGPT, Microsoft Copilot or Google Gemini to improve your prompt. Provide the prompt along with some examples of correct and incorrect answers, and ask it to help you revise the prompt to achieve better results.

 

  • Filter out records without abstracts. When the auto-labeler runs on citation data only (not PDFs), it performs more accurately when it has abstracts rather than just titles to rely on. We recommend skipping records without abstracts during the initial human screening; that way, when you set your article filters for the auto-labeler, only articles with abstracts will be included (a sketch for finding records without abstracts follows below).
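
If your citation file is large, a small script can flag records with missing abstracts before you import or screen them. This sketch assumes a citation export saved as a CSV with title and abstract columns; the file and column names are assumptions to adapt to your own data.

# Sketch: flag records that have no abstract before screening or auto-labeling.
# "citations.csv" and its column names are assumptions about your own export.
import pandas as pd

citations = pd.read_csv("citations.csv")

missing_abstract = citations["abstract"].isna() | (citations["abstract"].str.strip() == "")
print(f"{missing_abstract.sum()} of {len(citations)} records have no abstract")

# Set these aside for manual review rather than sending them to the auto-labeler
citations[missing_abstract].to_csv("records_missing_abstracts.csv", index=False)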