Research

LLM Hallucination Mitigation

Mohamed Nejjar's research on making Large Language Models more reliable — tackling hallucination at the prompt level.

Why This Research Matters

Large Language Models (LLMs) are transforming enterprise workflows — from legal document analysis to medical triage to customer service automation. But they have a critical flaw: hallucination. LLMs confidently generate plausible-sounding outputs that are factually wrong, internally inconsistent, or entirely fabricated.

Most hallucination research focuses on the model side: better training data, improved architectures, constrained decoding strategies. Mohamed Nejjar's research takes a fundamentally different approach — a shift-left strategy that tackles hallucination risk at the prompt level, before generation even begins.

Bachelor's Thesis · 2025 · TUM

Echo: Mitigating Hallucination Potential in User Prompts Through AI-Guided Iterative Refinement

Technical University of Munich — School of Software Engineering and AI

The core insight: every LLM output has two actors — the model and the user. The user's prompt is a controllable input surface that significantly influences hallucination risk, yet this dimension remains vastly under-researched.

Key Research Contributions

Novel Hallucination Taxonomy

Distinguishes between Prompt Risk (token-level ambiguity, vagueness, presuppositions) and Meta Risk (structural issues like multi-hop complexity, scope overload, conflicting constraints).
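The two-level taxonomy can be encoded directly in code. A minimal sketch in Python, where the specific category members are illustrative examples drawn from the description above, not the thesis's exhaustive list:

```python
from enum import Enum

class PromptRisk(Enum):
    """Token-level risks localized to specific spans of the prompt."""
    AMBIGUITY = "token-level ambiguity"
    VAGUENESS = "vagueness"
    PRESUPPOSITION = "unsupported presupposition"

class MetaRisk(Enum):
    """Structural risks arising from the prompt's overall shape."""
    MULTI_HOP = "multi-hop complexity"
    SCOPE_OVERLOAD = "scope overload"
    CONFLICTING_CONSTRAINTS = "conflicting constraints"
```

Separating the two levels matters downstream: token-level risks can be fixed by rewording a span, while meta risks typically require restructuring or splitting the prompt.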

Prompt Risk Density (PRD)

A quantitative metric that measures hallucination potential before generation, combining weighted scores across risk categories and normalizing by prompt complexity.
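In spirit, PRD is a weighted count of flagged risks divided by a complexity normalizer. A hedged sketch, where the weight values and the use of token count as the normalizer are assumptions for illustration; the thesis defines the actual scoring scheme:

```python
# Assumed per-category weights; riskier structural issues weigh more.
RISK_WEIGHTS = {
    "ambiguity": 1.0,
    "vagueness": 0.8,
    "presupposition": 1.2,
    "multi_hop": 1.5,
    "scope_overload": 1.3,
    "conflicting_constraints": 1.5,
}

def prompt_risk_density(flags: list[str], token_count: int) -> float:
    """Return a per-token hallucination-risk score for a prompt.

    `flags` is the list of risk categories detected in the prompt;
    `token_count` approximates prompt complexity.
    """
    if token_count == 0:
        return 0.0
    weighted = sum(RISK_WEIGHTS.get(flag, 1.0) for flag in flags)
    return weighted / token_count

# Two flagged risks in a 20-token prompt: (1.0 + 1.5) / 20 = 0.125
score = prompt_risk_density(["ambiguity", "multi_hop"], 20)
```

Normalizing by length keeps the score comparable across prompts: a single vague word in a long, otherwise precise prompt scores lower than the same word in a one-line request.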

Multi-Agent Pipeline

Four specialized agents (Analyzer, Initiator, Conversation, Preparator) that collaborate to analyze, diagnose, and iteratively refine prompts with human-in-the-loop validation.
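The four-agent loop can be sketched as a simple orchestration. All function bodies below are stand-in stubs and the message format is an assumption; only the four agent roles and the human-in-the-loop step come from the description above:

```python
from typing import Callable

def analyzer(prompt: str) -> list[str]:
    """Flag risky spans in the prompt (stub: naive keyword check)."""
    vague_markers = ["something", "stuff", "various"]
    return [w for w in vague_markers if w in prompt.lower()]

def initiator(flags: list[str]) -> str:
    """Turn the diagnosis into an opening clarifying question."""
    return f"Your prompt contains vague terms {flags}; can you be more specific?"

def conversation(question: str, user_reply: str) -> str:
    """Fold the user's clarification into a refined intent (stub)."""
    return user_reply

def preparator(refined_intent: str) -> str:
    """Emit the final, refined prompt to send to the LLM."""
    return refined_intent.strip()

def refine(prompt: str, ask_user: Callable[[str], str]) -> str:
    """One refinement pass: analyze, and if risks are found, ask the user."""
    flags = analyzer(prompt)
    if not flags:                     # nothing risky: pass through unchanged
        return preparator(prompt)
    question = initiator(flags)
    reply = ask_user(question)        # human-in-the-loop validation step
    return preparator(conversation(question, reply))
```

For example, `refine("Summarize the stuff", lambda q: "Summarize Q3 revenue by region")` replaces the vague request with the user's clarified intent, while an already-precise prompt passes through untouched.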

High-Stakes Domain Impact

Demonstrated applicability in law, healthcare, and finance — where LLM hallucinations carry real consequences. Better prompts bridge the gap between expensive closed-source and smaller open-source models.

Peer-Reviewed Publication · 2024 · JSEP · 150+ Citations

LLMs for Science: Usage for Code Generation and Data Analysis

Journal of Software: Evolution and Process (JSEP) · 35+ citations on Scopus

Co-authored a peer-reviewed paper examining the use of Large Language Models for code generation and data analysis in scientific research. The work has been cited over 150 times, contributing to the academic discourse on responsible AI adoption in scientific workflows, and it established the empirical foundation for Mohamed Nejjar's subsequent work on LLM reliability and hallucination mitigation.

Research Philosophy

Mohamed Nejjar's research sits at the intersection of AI reliability and practical enterprise deployment. The goal is not to publish papers in isolation, but to produce research that directly improves how organizations deploy Large Language Models in production — safely, reliably, and at scale.

This philosophy is informed by hands-on experience building AI systems at BCG Platinion, Allianz SE, and Fraunhofer — environments where LLM reliability is not an academic curiosity but an operational requirement.