DevForge
DevForge Agentic AI engineering agent · Jira → reviewed PR

From a Jira ticket
to a reviewed pull request.

DevForge picks up a labeled Jira ticket, reads the whole repo, writes code and tests, validates them, self-corrects on failure, and enforces engineering policy as code, then opens a draft PR for a human to approve. It never merges on its own.

≤ 3
Self-correcting attempts
2-layer
Policy enforcement
0
Auto-merges · human decides
🛠️
DevForge Agent
Running · attempt 1 / 3
JIRA → GITHUB
KAN-12 Add card count badge to each column header
✓ Cloned sandbox repo · 6 files in LLM context
✓ Policy gate passed · 5 rules checked, 0 violations
✓ Generated app.js + tests/test_render.js
✓ node --check · node --test → 4 passing
⎇ Draft PR #42 opened → Jira moved to Review
Pipeline
Jira Automation
Power Automate
Python · FastAPI agent · Azure Container Apps
Generate → validate → retry
Azure AI Foundry · LLM
Policy gate
GitHub App · draft PR
What it does

An engineer's workflow, automated, with the guardrails.

Read the ticket, understand the codebase, write tested code, check it against the rules, and hand a clean draft PR to a human. Six disciplines, one loop.

🧠

Context-aware generation

The full sandbox repository is cloned per run and injected into the prompt, so generated code matches the project's real files, conventions, and APIs.

full-repo context · code + tests
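One way the full-repo context could be assembled is sketched below. This is an illustration, not DevForge's actual prompt format: the function names, the `### FILE:` delimiter, and the skipped directories are all assumptions.

```python
from pathlib import Path

# Assumption: directories that add noise rather than useful context.
SKIP_DIRS = {".git", "node_modules"}


def build_repo_context(repo_root: str) -> str:
    """Concatenate every readable text file under repo_root, labeled by path."""
    root = Path(repo_root)
    sections = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if any(part in SKIP_DIRS for part in rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files
        sections.append(f"### FILE: {rel.as_posix()}\n{text}")
    return "\n\n".join(sections)


def build_prompt(ticket_summary: str, repo_root: str) -> str:
    """Hypothetical prompt layout: the ticket first, then the whole repo."""
    return (
        f"TICKET:\n{ticket_summary}\n\n"
        f"REPOSITORY:\n{build_repo_context(repo_root)}"
    )
```

For a small sandbox repo this keeps every file in view. A larger codebase would likely need a size budget or file selection, which this sketch does not attempt.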
🧪

Test-driven validation gate

Before anything is committed, generated code runs through syntax checks and the project's own test suite: node --check, then node --test.

syntax check · unit tests
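A minimal sketch of such a validation gate follows, assuming the agent shells out to Node from Python. The `node --check` and `node --test` commands come from the page; `ValidationResult`, `run_command`, and `validate` are hypothetical names, and the injectable `run` parameter exists only so the gate can be exercised without Node installed.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ValidationResult:
    passed: bool
    errors: str  # output of the first failing step, empty on success


def run_command(cmd: Sequence[str], cwd: str) -> Tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(list(cmd), cwd=cwd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def validate(
    repo_dir: str,
    source_files: Sequence[str],
    run: Callable[[Sequence[str], str], Tuple[int, str]] = run_command,
) -> ValidationResult:
    # Step 1: syntax-check every generated source file.
    for source in source_files:
        code, output = run(["node", "--check", source], repo_dir)
        if code != 0:
            return ValidationResult(False, output)
    # Step 2: run the project's own test suite.
    code, output = run(["node", "--test"], repo_dir)
    if code != 0:
        return ValidationResult(False, output)
    return ValidationResult(True, "")
```

Stopping at the first failing step keeps the error text short, which matters if that text is later fed back to the model.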
♻️

Self-healing retry loop

When validation fails, the exact errors are fed back to the LLM so it can fix its own output, iterating up to three attempts. This is what makes it agentic, not one-shot.

failure feedback · ≤ 3 attempts
🛡️

Dual-layer policy enforcement

The LLM is asked to refuse policy-violating tickets (semantic), and a deterministic Python gate scans every generated file against pattern rules. Both must pass.

LLM refusal · deterministic scan
📜

Policy-as-Code

Engineering rules live in versioned YAML, auditable in git. Add, remove, or tighten a constraint by editing a file: no code change, no redeploy.

rules.yaml · git-versioned
👀

Human-in-the-loop by design

Output is always a draft PR with the agent's reasoning and validation summary. A reviewer approves every merge; the agent never ships on its own.

draft-PR only · never auto-merge
The agentic loop

It writes, checks its own work, and fixes it.

Every run is a closed loop: generate → enforce policy → validate against real tests. A failure isn't the end: the validator's errors become the next prompt, and the agent tries again, up to three times.

♻️

Each attempt runs in a clean clone of the repo. No state leaks between tries, so every retry is reproducible and honest.

01
✍️

Generate

LLM writes source + tests from the ticket and full repo context.
02
🛡️

Policy gate

LLM refusal + deterministic Python scan. A violation blocks the run.
03
🧪

Validate

Syntax check + the project's test suite on the generated code.
↻ On failure, errors are fed back to the LLM: retry up to 3×
✓ Success: Draft PR → Review
⛔ Blocked: Policy violation
✗ Failed: 3 attempts exhausted
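The three steps and three outcomes above can be sketched as one small control loop. This is a hedged illustration rather than DevForge's source: `run_ticket`, `RunResult`, and the three callables are hypothetical, and each callable is assumed to do its work in a fresh clone of the repo.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

MAX_ATTEMPTS = 3  # from the page: retry up to three times


@dataclass
class RunResult:
    outcome: str  # "success", "blocked", or "failed"
    attempts: int
    detail: str = ""


def run_ticket(
    ticket: str,
    generate: Callable[[str, Optional[str]], Dict[str, str]],
    policy_violation: Callable[[Dict[str, str]], Optional[str]],
    validate: Callable[[Dict[str, str]], Optional[str]],
) -> RunResult:
    feedback: Optional[str] = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # 01 Generate: previous validator errors (if any) join the prompt.
        files = generate(ticket, feedback)
        # 02 Policy gate: a violation blocks the run immediately, no retry.
        rule = policy_violation(files)
        if rule is not None:
            return RunResult("blocked", attempt, rule)
        # 03 Validate: None means the syntax check and tests passed.
        errors = validate(files)
        if errors is None:
            return RunResult("success", attempt)
        feedback = errors
    return RunResult("failed", MAX_ATTEMPTS, feedback or "")
```

Note the asymmetry in this sketch: a validation failure triggers a retry, while a policy violation ends the run at once, which matches the page's statement that a violation blocks the run.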
LAYER 1

LLM refusal β€” semantic

The policy text is in the prompt. If a ticket asks for something forbidden, the model refuses to generate code and returns the rule it would have broken.

LAYER 2

Deterministic Python gate

Every generated file is pattern-scanned against the YAML rules, catching anything the model might have let slip through. Both layers must pass before a commit.

NO_SECRETS_IN_CODE NO_DISABLE_SECURITY NO_EVAL_OR_EXEC NO_DELETE_EXISTING_TESTS NO_PRODUCTION_CONFIG_CHANGES
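A deterministic gate of this kind might look like the sketch below. The rule IDs are taken from the list above, but the regular expressions are illustrative assumptions, not DevForge's real rules, and they are inlined as a Python dict here rather than loaded from the YAML file.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Violation:
    rule_id: str
    path: str
    line: int


# Rule IDs come from the page; the patterns are assumptions for illustration.
RULES = {
    "NO_SECRETS_IN_CODE": re.compile(
        r"(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
    "NO_EVAL_OR_EXEC": re.compile(r"\b(eval|exec)\s*\("),
}


def scan(files: Dict[str, str]) -> List[Violation]:
    """Check every line of every generated file against every rule."""
    found = []
    for path, content in files.items():
        for lineno, line in enumerate(content.splitlines(), start=1):
            for rule_id, pattern in RULES.items():
                if pattern.search(line):
                    found.append(Violation(rule_id, path, lineno))
    return found


def gate_passes(files: Dict[str, str]) -> bool:
    """The gate passes only when the scan finds no violations."""
    return not scan(files)
```

Because the scan is plain pattern matching, the same input always gives the same verdict, which is what lets it back up the model's own refusal.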
Policy as code

Some tickets should never ship.

Engineering rules are declared in versioned YAML and enforced twice. A ticket asking to hardcode an API key, disable a security check, or touch production config is refused before any code is written, and the ticket moves to Blocked with a full audit trail.

🔒

Three deterministic outcomes (success · failed · blocked) drive the downstream Jira transition every time. No ambiguity for the reviewer.
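The outcome-to-transition mapping could be as small as a lookup table, sketched here under assumptions. The Review and Blocked statuses appear on this page; the status used for a failed run is not stated, so `FAILED_STATUS` is a placeholder, and `build_writeback` is a hypothetical helper.

```python
from enum import Enum
from typing import Dict


class Outcome(Enum):
    SUCCESS = "success"
    FAILED = "failed"
    BLOCKED = "blocked"


# Assumption: the page does not name the Jira status for a failed run.
FAILED_STATUS = "To Do"

JIRA_STATUS = {
    Outcome.SUCCESS: "Review",   # draft PR opened
    Outcome.BLOCKED: "Blocked",  # policy violation, rule explained on Jira
    Outcome.FAILED: FAILED_STATUS,
}


def build_writeback(outcome: Outcome, detail: str) -> Dict[str, str]:
    """Build the payload the orchestration layer would write back to Jira."""
    return {
        "outcome": outcome.value,
        "status": JIRA_STATUS[outcome],
        "comment": detail,
    }
```

Keeping the mapping in one table means an unmapped outcome raises a `KeyError` instead of silently leaving the ticket in an undefined state.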

User journey

From ticket to reviewed code

A developer adds one label. About thirty seconds later, the work is done, validated, and waiting for review, or safely blocked.

Developer creates ticket · adds the ai-augmented label
~30s · automated
Policy check passes?
Blocked
No: policy violation
No code written ⛔ · status: Blocked · rule explained on Jira
Build
Yes: generate code + tests
Tests pass? · retry up to 3×
Draft PR opened ✓ · + Jira comment · status: Review
Human reviewer decides · approve & merge, or request changes
Architecture approach

Hybrid: low-code + pro-code

Low-code (Jira Automation + Power Automate) handles the event-driven orchestration and Jira write-backs; a pro-code Python / FastAPI service runs the agentic reasoning, with deterministic validation and policy enforcement.

Trust & governance, built in.

A clean draft PR for legitimate work, and a blocked ticket with a full audit trail for anything risky. Exactly what enterprises need to adopt AI in the engineering loop.