Back to Case Studies
LegalTechLegalTech

LegalOps Hub — Malaysian Legal AI Agent System

Automated legal document processing with 15 specialized AI agents for Malaysian law firms.

Nov 2025
20 min read
Live Demo
LegalOps Hub — Malaysian Legal AI Agent System

Project Overview

The LegalOps Hub orchestrates 15 specialized AI agents across 4 distinct workflows: Intake (5 agents), Drafting (5 agents), Research (2 agents), and Evidence (3 agents). Each agent is purpose-built for a specific task in the Malaysian legal context, handling challenges like mixed Malay-English documentation, complex party name extraction, and court-specific template compliance. The system uses Google Gemini 2.0 Flash for high-speed bilingual reasoning and LangGraph for sophisticated state management across the agent swarm.\n\nThe tech stack includes: Frontend (Next.js 14 App Router, React 18, TailwindCSS, TypeScript, Zustand, Lucide React, Framer Motion), Backend (FastAPI, Python 3.11+, LangGraph, Google Gemini 2.0 Flash, ChromaDB, PostgreSQL/SQLite, SQLAlchemy, Alembic, Pytesseract, PDF2Image, LangDetect, PyPDF2), and Infrastructure (Docker, GCP, Vercel, Gunicorn).

15 Specialized
Total Agents
100%
Intake Success
100%
Drafting Success
100%
Research Success
100%
Evidence Success
86%
OCR Accuracy
87%
Draft Alignment
~90%
Time Savings

System Architecture

The system is built on a modular, multi-agent architecture orchestrated by LangGraph. Each workflow (Intake, Drafting, Research, Evidence) operates as an independent graph that can be triggered via API. State is managed through 'Matter Snapshots'—structured JSON payloads that allow agents to communicate without passing massive document contexts.

System Architecture
Figure 1: System Architecture Diagram

DocumentCollectorAgent

Validates and ingests files from various connectors (upload, email, drive). Handles file type validation, generates document , and creates initial matter record.

OCRLanguageAgent

Extracts text from PDFs and images with language detection. Uses hybrid approach: PyPDF2 text extraction first for speed, falls back to Pytesseract for scanned documents. Implements per-sentence language detection using `langdetect` to handle mixed Malay/English documents. Segments text with high granularity (page/sentence level) for precise citations.

TranslationAgent

Transfers legal text between Malay and English. Optimized execution flow often skips massive batch translation at intake to preserve original context, instead passing 'parallel texts' to case structuring. Supports bi-directional translation using Google Translate API or LLM fallback.

CaseStructuringAgent

Parses unstructured text into a structured JSON matter snapshot. Extracts Parties (Plaintiff, Defendant), dates, amounts, and metadata. Structuring logic handles complex names and addresses typical in legal filings.

RiskScoringAgent

Calculates a composite 1-5 complexity score. Evaluates 4 dimensions: Jurisdictional (25%), Language (30%), Volume (20%), and Time Pressure (25%). Flags matters for human review if score >= 4.0.

IssuePlannerAgent

Identifies legal causes of action and required prayers. Analyzes matter snapshot to propose primary and alternative legal theories (e.g., Breach of Contract s.40, Negligence). Suggests specific prayers for relief mapped to verified templates. Retrieves relevant precedents to support each issue.

TemplateComplianceAgent

Selects and enforces court-specific formatting. Retrieves correct template ID (e.g., 'TPL-HighCourt-MS-v2') based on jurisdiction (Peninsular vs East Malaysia) and court level. Ensures correct headers, intitulation, and defined terms.

MalayDraftingAgent

Generates the primary pleading in formal Bahasa Malaysia. Uses Gemini 2.0 with strict prompting to adhere to Malaysian legal register ('Bahasa Istana/Mahkamah'). Auto-formats defined terms (PLAINTIF, DEFENDAN) and paragraph numbering (1.1, 1.2). Generates standard sections: Introduction, Facts, Breach, Relief, Prayers.

EnglishCompanionAgent

Creates a mirror English version for reference. Generates an English 'Companion Draft' that aligns paragraph-by-paragraph with the Malay original. Does not just translate, but drafts in proper legal English to ensure conceptual equivalence.

ConsistencyQAAgent

Validates consistency between Malay and English versions. Checks for numeral mismatches, missing dates, and proper noun spelling consistency. Returns a QA report highlighting potential discrepancies for human review.

ResearchAgent

Searches case law databases. Integrates with CommonLII (or mock data) to find binding and persuasive authorities. Filters by court hierarchy (Federal Court > Court of Appeal > High Court).

ArgumentBuilderAgent

Synthesizes research into a legal argument memo. Maps found cases to specific legal issues identified by the IssuePlanner. Drafts a structured legal argument (IRAC format: Issue, Rule, Analysis, Conclusion) for use in written submissions.

TranslationCertificationAgent

Certifies documents for court submission. Generates 'Certificate of Translation' headers for non-native language documents, suitable for statutory declaration requirements.

EvidenceBuilderAgent

Compiles the Evidence Packet. Indexes all uploaded documents, pleadings, and affidavits. Organizes them into logical sequences for the Bundle of Documents.

HearingPrepAgent

Prepares the final Hearing Bundle and Scripts. Generates a comprehensive 4-tab Bundle (Pleadings, Submissions, Authorities, Translations). Produces bilingual 'Oral Submission Scripts' ('Skrip Hujahan Lisan') with cues for the lawyer. Includes 'If Judge Asks' section with AI-generated FAQ preparation based on case weaknesses.

Implementation Details

Code Example

python
# Data Schema (SQLAlchemy)\nclass Matter(Base):\n    __tablename__ = 'matters'\n    id = Column(String, primary_key=True)  # MAT-YYYYMMDD-XXXX\n    title = Column(String)\n    status = Column(Enum('intake', 'drafting', 'research', 'ready'))\n    parties = Column(JSON)  # List of Plaintiff/Defendant objects\n    risk_scores = Column(JSON)  # Composite score details\n    human_review_required = Column(Boolean)\n    documents = relationship('Document', back_populates='matter')\n    pleadings = relationship('Pleading', back_populates='matter')

Agent Memory

Instead of passing raw document text between all 15 agents, we pass only structured 'Matter Snapshots' (JSON summaries). If an agent needs the full text, it fetches it by database ID. This prevents context window bloat and allows the system to scale to hundreds of pages without LLM token limits becoming a bottleneck.

Workflow

1

Stage 1 - Ingestion: User uploads PDF/Images via Dashboard. The `OrchestrationController` initializes the 'intake' graph. `DocumentCollector` validates inputs. `OCRLanguageAgent` processes files: PDFs are parsed via PyPDF2 or rendered to images for Tesseract OCR. Text is segmented and language-tagged.\n\nStage 2 - Analysis: `CaseStructuringAgent` extracts parties ('Ali bin Abu', 'Syarikat XYZ Sdn Bhd'), key dates, and contract values. `RiskScoringAgent` computes a risk matrix. If complexity > 4.0, 'Human Review' flag is raised.\n\nStage 3 - Strategy: User initiates Drafting flow. `IssuePlannerAgent` proposes causes of action (e.g., 'Breach of Contract'). User confirms selections. `TemplateComplianceAgent` locks in the correct High Court template.\n\nStage 4 - Execution: `MalayDraftingAgent` writes the Statement of Claim (Pernyataan Tuntutan). `EnglishCompanionAgent` drafts the parallel English version. `ConsistencyQAAgent` runs final validation checks.\n\nStage 5 - Preparation: `EvidenceBuilder` compiles the bundle. `HearingPrepAgent` generates oral scripts ('Yang Arif, saya hadir untuk Plaintif...') and anticipates judge's questions.

Workflow Diagram
Figure 2: Workflow Diagram

Results & Impact

"Reduces time-to-first-draft by approximately 90%. Transforms the manual process of cross-referencing documents and translating legal terms into a unified, instant workflow. Enables junior lawyers to handle complex cases with AI guardrails."

Functional Agents

12 of 15 agents fully operational (80% overall success rate).

Intake Workflow

100% success rate for document ingestion and OCR.

Drafting Workflow

100% success rate for bilingual pleading generation.

Research Workflow

100% success rate for case law search and argument synthesis.

Evidence Workflow

100% success rate (TypeError in bundling logic pending fix).

Bilingual Alignment

87% average alignment between Malay and English drafts.

OCR Confidence

86% accuracy on scanned PDF documents.

Risk Score Baseline

Average complexity: 1.25/5.0 (low baseline in testing).

LegalTechLangGraphMulti-Agent SystemGoogle GeminiBilingual AIOCRFastAPINext.jsChromaDBSQLAlchemy

About the Author

Rahul Patil, AI Context Engineer

Rahul Patil

AI Context Engineer

20+
Projects Delivered
1.5+
Industry Experience

Rahul Patil

AI Context Engineer

Apex Neural

Rahul is an AI Context Engineer experienced in architecting agentic AI systems, scalable backend services, and full-stack SaaS platforms. His work includes LLM integrations, automation systems, OCR and document processing, web scraping, and fine-tuned AI models. He focuses on delivering production-ready AI solutions that solve real business problems.

Ready to Build Your AI Solution?

Get a free consultation and see how we can help transform your business.