Why I Worked on This
Like many developers, I use AI coding assistants daily. While they speed up writing infrastructure-as-code, I’ve noticed they occasionally suggest Ansible modules or playbook structures that don’t actually exist. These hallucinations waste time and could introduce security risks if unchecked. I needed a reliable way to validate AI-generated Ansible code before deploying it to production.
My Real Setup
I run most of my infrastructure using Ansible with a few dozen playbooks managing VMs on Proxmox. My workflow involves:
- Writing new playbooks in VS Code with Copilot assistance
- Using `ansible-lint` for basic syntax checks
- Running Molecule tests for validation before deployment
I wanted to extend this pipeline to catch AI hallucinations early.
What Worked (and Why)
1. Molecule as a Hallucination Detector
Molecule is already part of my workflow, so I enhanced it to:
- Generate a playbook with an AI assistant
- Run `molecule init scenario --driver docker` to create test cases
- Add specific verification steps in `verify.yml`:
- name: Verify no unknown Ansible modules used
  # The command module can't run pipelines, so use shell here.
  # awk strips ansible-doc's description column so the exact-match
  # grep can work; a failed grep exits non-zero and fails the task.
  ansible.builtin.shell: "ansible-doc --list | awk '{print $1}' | grep -Fxq '{{ item }}'"
  # There is no ansible_facts.modules; supply the list of module
  # names used in the playbook yourself (e.g. extracted in a prior step).
  loop: "{{ playbook_modules }}"
  register: module_check
  changed_when: false
This checks that every module used in the playbook actually exists in Ansible’s documentation: if grep finds no match for a name, the task fails.
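The loop above needs a concrete list of the module names a playbook uses. Here is a minimal Python sketch of one way to build that list, assuming the playbook’s tasks have already been parsed into plain dicts (e.g. with PyYAML) and that the playbook is flat, with no blocks, roles, or includes. `TASK_KEYWORDS` is a hand-picked subset of Ansible’s task keywords, not the authoritative list:

```python
# Pull module names out of parsed playbook tasks so the Molecule verify
# step has a concrete list to check. Any task key that is not a known
# task keyword is treated as a module name.
TASK_KEYWORDS = {
    "name", "register", "loop", "with_items", "when", "vars", "become",
    "ignore_errors", "failed_when", "changed_when", "tags", "notify",
    "delegate_to", "environment", "until", "retries", "delay",
}

def extract_modules(tasks):
    """Return the sorted module names used across a list of task dicts."""
    modules = set()
    for task in tasks:
        for key in task:
            if key not in TASK_KEYWORDS:
                modules.add(key)
    return sorted(modules)

# Example: one real module and one hallucinated one.
tasks = [
    {"name": "Install nginx", "ansible.builtin.apt": {"name": "nginx"}},
    {"name": "Fake task", "some_nonexistent_module": {"foo": "bar"},
     "register": "result"},
]
print(extract_modules(tasks))
# → ['ansible.builtin.apt', 'some_nonexistent_module']
```

The result can be dropped into the verify task’s loop as `playbook_modules`, where the hallucinated entry will then fail the `ansible-doc` lookup.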
2. Dependency Validation
For external roles, I added a pre-test hook:
#!/bin/bash
if ! ansible-galaxy install -r requirements.yml; then
    echo "Error: Required roles not found" >&2
    exit 1
fi
This ensures AI-suggested roles in requirements.yml are valid.
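Before even reaching `ansible-galaxy`, obviously malformed entries can be caught with a quick sanity check. A Python sketch, assuming the `roles` entries of `requirements.yml` have already been parsed into dicts; the `namespace.role` pattern is a simplification of Galaxy’s real naming rules, and the example entries are illustrative:

```python
import re

# Simplified check for Galaxy-style "namespace.role" names; the real
# naming rules are more permissive than this pattern.
ROLE_NAME_RE = re.compile(r"^[a-z0-9_]+\.[a-z0-9_]+$")

def lint_requirements(entries):
    """Return a list of warnings for suspicious requirements.yml role entries."""
    warnings = []
    for entry in entries:
        name = entry.get("name") or entry.get("src") or ""
        if not ROLE_NAME_RE.match(name):
            warnings.append(f"suspicious role name: {name!r}")
        if "version" not in entry:
            warnings.append(f"no version pin for {name!r}")
    return warnings

# Illustrative entries (version numbers are made up for the example).
entries = [
    {"name": "geerlingguy.nginx", "version": "1.0.0"},
    {"name": "TotallyMadeUp Role"},  # bad name, no version pin
]
print(lint_requirements(entries))
```

An AI-hallucinated role name with spaces, capitals, or a missing namespace gets flagged immediately, before any network round trip to Galaxy.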
3. AI Self-Checking
Before finalizing a playbook, I prompt the AI to:
- List all Ansible modules and roles used
- Verify their existence
- Suggest alternatives if any are invalid
Example prompt: “List all Ansible modules and roles in this playbook. For each one, confirm it exists in official documentation or Galaxy. If any don’t exist, suggest working alternatives.”
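The AI’s self-reported module list can itself be cross-checked mechanically instead of being taken on trust. A minimal Python sketch, assuming you capture the output of `ansible-doc --list` yourself (e.g. via `subprocess`); the sample output below is fabricated for illustration:

```python
def known_modules(ansible_doc_output):
    """Parse `ansible-doc --list` output: the first whitespace-separated
    token on each line is the module name."""
    return {line.split()[0] for line in ansible_doc_output.splitlines()
            if line.strip()}

def find_hallucinated(claimed, ansible_doc_output):
    """Return claimed module names that do not appear in the doc list."""
    return sorted(set(claimed) - known_modules(ansible_doc_output))

# Fabricated two-line sample of ansible-doc output; in practice capture
# it with subprocess.run(["ansible-doc", "--list"], capture_output=True,
# text=True) and pass result.stdout here.
sample = """\
ansible.builtin.apt     Manages apt packages
ansible.builtin.pip     Manages Python library dependencies
"""
claimed = ["ansible.builtin.apt", "community.general.pip_install"]
print(find_hallucinated(claimed, sample))
# → ['community.general.pip_install']
```

This turns the prompt above into a verifiable step: the AI lists what it used, and the script, not the AI, decides what actually exists.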
What Didn’t Work
1. Over-Reliance on Linting
I initially tried using just `ansible-lint`, but it only catches syntax errors, not hallucinated modules. For example, it didn’t flag `some_nonexistent_module` as invalid.
2. Manual Registry Checks
I considered manually checking each module and role by hand, but this was too time-consuming. With Molecule’s automated checks in place, manual verification became unnecessary.
3. AI Self-Checking Limitations
While helpful, AI can still miss its own hallucinations. For example, it once suggested `community.general.pip_install` (which doesn’t exist; the real module is `ansible.builtin.pip`) and claimed it was valid. This is why automated testing is crucial.
Key Takeaways
- Automate validation – Integrate Molecule into your CI/CD pipeline to catch hallucinations before deployment.
- Combine tools – Use Molecule for testing, `ansible-doc` for module verification, and AI self-checking as a first pass.
- Update dependencies – Regularly refresh your Ansible modules and roles to avoid outdated hallucinations.
- Stay skeptical – Always verify AI suggestions, even if they seem plausible.
This approach has significantly reduced the time I spend debugging AI-generated code while improving the reliability of my infrastructure.