MACg vs OpenEvidence: A Case-Based Comparison for Clinical Decision Support

This is a case-based evaluation of MACg for clinical decision support using the same primary care scenarios previously used to assess OpenEvidence. Across five cases, MACg and OpenEvidence generally reached the same high-level clinical conclusions, but MACg consistently provided more specific, evidence-grounded, and operationally actionable recommendations. Its responses more often included treatment prioritization, dosing detail, monitoring considerations, alternative strategies, and case-specific reasoning. In one instance, MACg also identified an internal inconsistency in the case narrative. Overall, the evaluation suggests that MACg may offer added value by translating published evidence and guidelines into clearer, more practice-ready clinical recommendations.

Published on Jun 14, 2026 | Ome Ogbru, PharmD


Artificial intelligence is increasingly used in clinical workflows. The real question, however, is not whether an AI tool can generate an answer, but whether it can generate a clinically useful, evidence-grounded, and operationally actionable one. This evaluation was designed to test that question for MACg, an evidence-first AI platform (built on GPT-5 models) centered on PubMed-grounded retrieval and traceable citation support, using the same case scenarios previously used to assess OpenEvidence in primary care settings.

The purpose of the evaluation was straightforward: take a set of published primary care cases, submit the same case narratives to MACg, instruct MACg to use guidelines and published literature, and then compare its recommendations with the responses reported for OpenEvidence. The goal was not simply to determine whether MACg could reach the “right” broad answer, but whether it could provide recommendations that were more specific, better contextualized, and easier for clinicians to translate into practice.

Why this evaluation matters

Many AI systems can summarize medical knowledge. Fewer can reliably support a clinical decision in a way that reflects guideline alignment, patient-specific reasoning, and practical management details. In everyday care, clinicians rarely need a vague overview. They need help selecting the next step, weighing alternatives, recognizing contraindications, and anticipating follow-up needs. This evaluation, therefore, focused on whether MACg could do more than retrieve information—whether it could function as a more usable layer of decision support.

How the comparison was structured

The evaluation used the same five cases as a published OpenEvidence assessment. These cases covered common, clinically meaningful primary care scenarios:

resistant hypertension,
depression with incomplete response to treatment,
type 2 diabetes complicated by obesity and heart failure,
obesity pharmacotherapy after a weight-loss plateau on semaglutide,
and statin initiation in a younger patient with a low but nonzero coronary artery calcium (CAC) score and a high percentile rank for age.

For this evaluation, the case narrative served as the prompt, with the added instruction that MACg should base its recommendation on clinical guidelines and published evidence. The resulting MACg outputs were then compared qualitatively with the corresponding OpenEvidence responses reported in the source material.
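The prompt construction described above can be sketched in a few lines of Python. This is a minimal illustration only: the article does not give the exact instruction wording or describe any scripted submission, so the instruction string, the `build_prompt` helper, and the placeholder case texts are all hypothetical.

```python
# Hypothetical wording: the article does not quote the exact instruction
# that was appended to each case narrative.
EVIDENCE_INSTRUCTION = (
    "Base your recommendation on clinical guidelines and published evidence."
)


def build_prompt(case_narrative: str, instruction: str = EVIDENCE_INSTRUCTION) -> str:
    """Return the case narrative followed by the evidence instruction."""
    narrative = case_narrative.strip()
    if not narrative:
        raise ValueError("case_narrative must not be empty")
    return f"{narrative}\n\n{instruction}"


# Placeholder texts only; the actual vignettes come from the published
# OpenEvidence assessment and are not reproduced here.
cases = {
    "resistant_hypertension": "<case narrative 1>",
    "depression_incomplete_response": "<case narrative 2>",
    "t2dm_obesity_heart_failure": "<case narrative 3>",
    "obesity_semaglutide_plateau": "<case narrative 4>",
    "statin_cac_percentile": "<case narrative 5>",
}

# One prompt per case, each with the same appended instruction.
prompts = {name: build_prompt(text) for name, text in cases.items()}
```

Keeping the instruction identical across all five cases means any differences in the outputs reflect the case content rather than variation in how the question was asked.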

What MACg recommended across the cases

Across the five cases, MACg consistently identified the management step that was also broadly supported in the published OpenEvidence comparison, but it generally delivered a more developed recommendation.

In the resistant hypertension case, MACg recommended adding spironolactone 12.5–25 mg daily as the preferred fourth-line agent after losartan and amlodipine, especially given prior chlorthalidone-associated hypokalemia. It also emphasized careful monitoring of potassium and renal function and named amiloride as a supported alternative if spironolactone could not be used.

In the depression case, MACg recommended augmentation rather than further dose escalation, identifying low-dose aripiprazole as a guideline- and evidence-supported next step for persistent symptoms despite sertraline, bupropion, and psychotherapy. It explicitly noted that bupropion 450 mg/day was already at the usual maximum dose and expanded the discussion to include adherence review, suicidality assessment, and adverse-effect monitoring.

In the diabetes plus heart failure case, MACg recommended starting an SGLT2 inhibitor with proven heart failure benefit, such as empagliflozin 10 mg daily or dapagliflozin 10 mg daily, while continuing metformin if renal function allowed and moving away from glipizide, which contributes to weight gain and hypoglycemia without improving heart failure outcomes.

In the obesity case, MACg concluded that switching from semaglutide 2.4 mg to tirzepatide would likely produce additional weight loss based on current reviews, comparative real-world evidence, and head-to-head trial data. It also went a step further by identifying a numerical inconsistency in the case's weight trajectory, flagging the need to confirm the true weight trend before making a prescribing decision.

In the cardiovascular prevention case, MACg recommended initiating a moderate-intensity statin because a CAC score of 17 combined with a 90th–100th percentile rank for age signaled premature subclinical atherosclerosis and supported starting therapy. It also stated what should not be done: routine aspirin use, ischemia testing based solely on CAC, and short-interval repeat CAC scanning.

How MACg compared with OpenEvidence

A notable finding was that MACg and OpenEvidence were concordant on the high-level management decision in all five cases. Both systems identified spironolactone for resistant hypertension, augmentation for persistent depression, SGLT2 inhibition for diabetes with heart failure, tirzepatide as a reasonable switch after a semaglutide plateau, and statin therapy for the patient with a high CAC percentile.

The difference was not usually in the overall answer, but in the depth and usability of the reasoning. MACg's responses were typically more specific, hierarchical, and implementation-focused. It more often named a preferred option, explained why alternatives were less suitable, included a starting dose, described laboratory or clinical monitoring, and incorporated patient-specific features such as prior intolerance, comorbidity burden, or treatment plateau.

By contrast, OpenEvidence's responses were often correct but more generalized. In some cases, OpenEvidence presented several reasonable paths without ranking them as clearly. In others, it provided the right direction but with less attention to practical prescribing details or contingency planning. The obesity case was particularly revealing, because MACg not only supported the switch to tirzepatide but also identified an inconsistency in the source narrative that OpenEvidence did not flag. That suggests MACg may contribute not just evidence retrieval, but also a basic layer of case plausibility checking.

Strengths highlighted by the evaluation

The strongest theme across the evaluation was actionability. MACg did not stop at the level of “consider this class of drug.” It generally translated evidence into a management plan with a clearer bedside orientation.

Other apparent strengths included:

Guideline concordance, with recommendations framed around accepted therapeutic sequencing or risk-based treatment logic.

Contextualization, meaning recommendations were adapted to the specific details of the case rather than offered as a generic summary.

Monitoring awareness, including laboratory follow-up, adverse-effect counseling, and treatment-response reassessment.

Comparative reasoning, especially when more than one plausible management path existed.

These are important characteristics for any clinician-facing AI system because the practical challenge is rarely retrieving an answer in the abstract; it is choosing and operationalizing the most appropriate next step.

Important limitations

This was not a formal accuracy trial with blinded adjudication, standardized scoring rubrics, inter-rater reliability, or statistical performance comparisons. It was a qualitative case-based evaluation using published vignettes. That means the findings should be interpreted as illustrative rather than definitive.

In addition, vignette performance does not guarantee equivalent performance in real-world workflows, where clinical data may be incomplete, contradictory, or longitudinally complex. The comparison therefore says more about how MACg performs on structured decision-support tasks than about how it will perform under all clinical conditions.

Clinicians should note that MACg uses information derived from PubMed and may not always have access to the full text of a cited publication. Accordingly, clinicians should consult the original article to verify the completeness and accuracy of the information provided. Iterative questioning of MACg and uploading the full publication may yield more robust recommendations.

Bottom line

The overall takeaway from this evaluation is that MACg and OpenEvidence reached the same broad clinical conclusion in all five cases. However, MACg appears to add value through greater specificity, stronger case contextualization, and more operationally useful recommendations. Rather than simply identifying a likely correct answer, MACg more often explains why that answer is preferred, how it should be implemented, what alternatives exist, and what monitoring should follow.

For clinicians, that distinction matters. A decision support tool becomes most useful not when it mirrors textbook knowledge, but when it helps turn evidence into a safe, practical next step. Based on this evaluation, MACg appears well-positioned to do that while still requiring the same careful clinical oversight that should apply to any AI-assisted workflow.

References

Hurt R, Stephenson C, Gilman E. The Use of an Artificial Intelligence Platform OpenEvidence to Augment Clinical Decision-Making for Primary Care Physicians. J Prim Care Community Health. 2025; doi:10.1177/21501319251332215
