LegalMind AI — Dockerized Legal Document Analyzer with Google Gemini and Docker Hub Deployment
October 30, 2025
Introduction
This project demonstrates the complete development, containerization, and deployment lifecycle of LegalMind AI, an intelligent platform designed to automate the analysis of legal documents. Manual legal review is a time-consuming and expensive process. This application solves this by using a FastAPI backend to extract clauses from PDF documents, which are then classified for risk (High, Medium, Low) and explained in simple terms using Google's Gemini AI. The system is presented to the user through a modern Nginx-powered web interface.
The complete solution was built as a multi-container application, orchestrated with Docker Compose, and published to Docker Hub. This documentation covers the project in three logical parts and describes the architecture, the procedures (with original command placeholders), the container modifications, the outcomes, and the references.
Objectives of Part 1: Backend Container
Build the application image for the FastAPI backend (service name legalmind_ai-backend) using a python:3.11-slim base.
Package all Python dependencies (requirements.txt), including fastapi, uvicorn, pdfplumber, and google-generativeai.
Implement the core API endpoints for health checks (/health) and document analysis (/extract_clauses/, /classify_clause/).
Ensure the container runs locally, exposes port 8000, successfully processes PDF files, and communicates with the Gemini AI API.
Objectives of Part 2: Frontend Container
Build the application image for the Nginx frontend (service name legalmind_ai-frontend) using an nginx:alpine base.
Package all static web assets (HTML, CSS, vanilla JavaScript).
Implement a custom Nginx configuration (nginx.conf) to serve the static files and act as a reverse proxy, forwarding all /api/ requests to the legalmind_ai-backend container.
Use Docker Compose to build and run the full multi-container application (legalmind_ai-backend and legalmind_ai-frontend) with a single command.
Verify end-to-end connectivity from the browser (port 80) to the backend service.
Objectives of Part 3: Pushing the Image into Docker Hub
Tag the final locally-built images (legalmind_ai-backend and legalmind_ai-frontend) with a personal Docker Hub username and the latest tag.
Log in to the Docker Hub registry via the command-line interface.
Push both the backend (legalmind_ai-backend) and frontend (legalmind_ai-frontend) application images to a public Docker Hub repository.
Provide the final, reproducible Docker Hub links and docker pull commands for both images.
Name of the containers involved and the download links
Application Images (My Project)
1. Backend Image (legalmind_ai-backend)
My Docker Hub Repo: [your-username]/legalmind-ai-backend:latest
Pull Command: docker pull [your-username]/legalmind-ai-backend:latest
Docker Hub Link: https://hub.docker.com/r/[your-username]/legalmind-ai-backend
2. Frontend Image (legalmind_ai-frontend)
My Docker Hub Repo: [your-username]/legalmind-ai-frontend:latest
Pull Command: docker pull [your-username]/legalmind-ai-frontend:latest
Docker Hub Link: https://hub.docker.com/r/[your-username]/legalmind-ai-frontend
Base Images (Official)
1. Python
Image: python:3.11-slim
Purpose: Base image for the FastAPI backend (legalmind_ai-backend).
Docker Hub Link: https://hub.docker.com/_/python
2. Nginx
Image: nginx:alpine
Purpose: Base image for the frontend web server (legalmind_ai-frontend) and reverse proxy.
Docker Hub Link: https://hub.docker.com/_/nginx
Name of the other software involved along with the purpose
FastAPI: Modern Python web framework used to build the high-performance backend REST API.
Google Generative AI (Gemini): The AI engine used for all Natural Language Processing (NLP) tasks, including clause classification, risk assessment, and explanation generation.
Uvicorn: An ASGI server used to run the FastAPI application inside the backend container.
pdfplumber: A Python library used to extract text and data from uploaded PDF documents.
Vanilla JavaScript (ES6+): Used for all frontend logic, including API calls (fetch) and dynamically rendering the analysis results on the page.
Nginx: A high-performance web server used in the frontend container to serve static files and act as a reverse proxy.
Docker & Docker Compose: The core containerization and orchestration tools used to package, build, and run the entire application.
Overall Architecture for the Project
The system follows a three-tier microservices architecture, as illustrated in Figure 1 below.
Architecture Summary:
User (Browser): Accesses the static website served from the Frontend Container (legalmind_ai-frontend) at http://localhost.
Frontend Container (Nginx):
Serves the index.html, index.css, and script.js files.
Acts as a reverse proxy: any request to http://localhost/api/... is forwarded to the legalmind_ai-backend container's port 8000, avoiding CORS issues.
Backend Container (legalmind_ai-backend):
Listens on port 8000.
Receives API requests from the Nginx proxy.
Uses pdfplumber to extract text from the PDF.
Communicates with the external Google Gemini AI API to get the analysis (classification, risk, explanation).
Returns the JSON analysis back to the frontend.
Overall Flow:
User (Browser): Uploads a PDF document.
Frontend Container (Nginx): Receives the file and proxies the request to /api/.
Backend Container (FastAPI): Receives the file and uses pdfplumber to extract the text.
Google Gemini AI (External API): The backend sends the extracted text for analysis.
Backend Container (FastAPI): Receives the JSON analysis (risk, category, etc.) from Gemini.
Frontend Container (Nginx): Receives the final JSON response from the backend.
User (Browser): The frontend's JavaScript renders the JSON data as an interactive report.
Architecture Image
Description About the Architecture
As shown in the architecture diagram, LegalMind AI follows a microservices architecture with a clear separation of concerns between the frontend presentation layer and the backend processing layer. The system is fully containerized using Docker, which packages the application and its dependencies, ensuring consistent deployment and execution regardless of the host environment. The frontend is a lightweight, static single-page application (SPA) built with vanilla JavaScript, HTML, and CSS. It is served by a high-performance Nginx container (the legalmind_ai-frontend service), which also acts as a reverse proxy, directing all API calls (e.g., /api/*) to the legalmind_ai-backend (backend) service. This approach decouples the client from the server and simplifies configuration.
The backend (the legalmind_ai-backend service) is a robust Python application built with the FastAPI framework and run by a Uvicorn ASGI server inside its own Docker container. This container houses all the core business logic. When a user uploads a PDF, the backend uses the pdfplumber library for clause extraction. It then communicates externally with the Google Gemini AI API to perform complex NLP tasks: clause classification, risk assessment, and explanation generation. To optimize performance and manage API costs, the backend implements an LRU cache for AI responses and includes a fallback system of pre-written explanations. The entire multi-container setup is orchestrated by Docker Compose, which manages the application's services, networks, and health checks, allowing the full stack to be launched with a single docker-compose up command.
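The caching-and-fallback behaviour described above can be sketched in Python. This is a minimal illustration under assumptions, not the project's actual code: the function name classify_clause, the FALLBACK_EXPLANATIONS table, and the stubbed Gemini call are all hypothetical.

```python
from functools import lru_cache

# Hypothetical canned fallbacks used when the AI call fails (names assumed).
FALLBACK_EXPLANATIONS = {
    "indemnification": "One party agrees to cover the other's losses.",
    "termination": "Describes how and when the agreement can be ended.",
}

def _call_gemini(clause_text: str) -> str:
    # Placeholder for the real google-generativeai call.
    raise ConnectionError("AI service unavailable")

@lru_cache(maxsize=256)
def classify_clause(clause_text: str) -> str:
    """Return an explanation, caching results so repeated clauses skip the API."""
    try:
        return _call_gemini(clause_text)
    except Exception:
        # Fall back to a pre-written explanation via a simple keyword match.
        for keyword, explanation in FALLBACK_EXPLANATIONS.items():
            if keyword in clause_text.lower():
                return explanation
        return "Explanation unavailable; please review this clause manually."
```

Because lru_cache keys on the clause text, re-uploading a document with identical clauses never re-invokes the AI, which is the cost-saving behaviour the backend relies on.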
The data flow is a key part of this design. When a user visits http://localhost, they are greeted by the legalmind_ai-frontend (frontend) container, which serves the static index.html page. When that user uploads a PDF, the JavaScript in their browser does not send the file directly to the backend. Instead, it sends the request to its own server (the Nginx container at localhost:80). Nginx then inspects the request path. Seeing that it starts with /api/, it "proxies" this request over the internal Docker network to the legalmind_ai-backend container at http://legalmind_ai-backend:8000. This is a critical security and design pattern that hides the backend from the public internet and completely avoids any browser-side CORS (Cross-Origin Resource Sharing) errors.
This decoupled architecture is also highly scalable. While this project runs both containers on a single machine, in a production environment, we could run multiple legalmind_ai-backend containers for heavy processing load, with the legalmind_ai-frontend Nginx container acting as a load balancer to distribute requests between them. Because the entire application state is managed per-request (stateless) and no persistent database is used in this version, it is horizontally scalable. The final images, pushed to Docker Hub, encapsulate this entire logic, allowing anyone to pull and deploy this full-stack application in minutes using a simple Docker Compose file.
Procedure - Part 1: Backend Container
Step 1: Project Folder Structure
Organized the project into backend and frontend directories to hold the code for each service.
Step 2: Backend Dockerfile Preparation
Created a Dockerfile in the backend directory, using python:3.11-slim as the base, copying requirements.txt, installing dependencies, and setting the CMD to run Uvicorn.
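A representative backend Dockerfile matching this description might look like the sketch below; it assumes the FastAPI instance is named app inside app.py, and the actual file in the repository may differ.

```dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install Python dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```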
Step 3: FastAPI Application (app.py)
Developed the app.py with FastAPI, including the /health endpoint and the analysis endpoints (/extract_clauses/, /classify_clause/).
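The JSON contracts of these endpoints can be illustrated without the FastAPI wiring. The /health payload matches the output verified in Step 6 of this part; the /classify_clause/ field names below are an assumption based on the risk levels described in the introduction.

```python
def health() -> dict:
    # Matches the /health output verified later in this procedure.
    return {"status": "healthy", "service": "LegalMind AI"}

def classify_clause_response(clause: str, risk: str, explanation: str) -> dict:
    # Hypothetical response shape for /classify_clause/ (field names assumed).
    if risk not in {"High", "Medium", "Low"}:
        raise ValueError(f"unexpected risk level: {risk}")
    return {"clause": clause, "risk": risk, "explanation": explanation}
```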
Step 4: Build the Backend Image
Navigated to the backend directory and built the image.
Command- cd backend
Command- docker build -t legalmind_ai-backend .
Step 5: Run and Verify the Backend Container
Ran the container, mapping port 8000.
Command- docker run -p 8000:8000 legalmind_ai-backend
Step 6: Test Health Check
In a new terminal, tested the running container's health endpoint.
Command- curl http://localhost:8000/health
Output: {"status": "healthy", "service": "LegalMind AI"}
Procedure - Part 2: Frontend Container
Step 1: Frontend Dockerfile Preparation
Created a Dockerfile in the frontend directory using nginx:alpine as the base and copying in the static files and the custom Nginx configuration.
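A representative frontend Dockerfile for this step might look like the following sketch; the file names come from the architecture section, and the destination paths are standard nginx:alpine defaults rather than values confirmed by the repository.

```dockerfile
FROM nginx:alpine
# Replace the default server config with the custom reverse-proxy configuration
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Copy the static site into Nginx's web root
COPY index.html index.css script.js /usr/share/nginx/html/
EXPOSE 80
```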
Step 2: Nginx Proxy Configuration (nginx.conf)
Wrote a custom nginx.conf file. The key section is the location /api/ block, which proxies requests to the backend service, named legalmind_ai-backend by Docker Compose.
proxy_pass http://legalmind_ai-backend:8000/;
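In context, a minimal nginx.conf server block built around that proxy_pass line might look like the sketch below; the project's real configuration may add further headers, timeouts, and upload-size limits.

```nginx
server {
    listen 80;

    # Serve the static single-page application
    location / {
        root /usr/share/nginx/html;
        index index.html;
    }

    # Forward API calls to the backend over the internal Docker network.
    # The trailing slash on proxy_pass strips the /api/ prefix, so
    # /api/health reaches the backend as /health.
    location /api/ {
        proxy_pass http://legalmind_ai-backend:8000/;
        proxy_set_header Host $host;
    }
}
```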
Step 3: Docker Compose Configuration (docker-compose.yml)
Created the docker-compose.yml file in the root directory. This file defines two services: legalmind_ai-backend (building from ./backend) and legalmind_ai-frontend (building from ./frontend), setting up port mapping 80:80 for the frontend and managing the internal network.
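A development docker-compose.yml matching this description could look like the sketch below. The GOOGLE_API_KEY environment variable is an assumption — the Gemini client needs credentials from somewhere, but the project's actual variable name may differ.

```yaml
version: '3.8'
services:
  legalmind_ai-backend:
    build: ./backend
    environment:
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}   # assumed variable name
  legalmind_ai-frontend:
    build: ./frontend
    ports:
      - "80:80"
    depends_on:
      - legalmind_ai-backend
```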
Step 4: Start the Full Application Stack
From the root directory, ran Docker Compose.
Command- docker-compose up --build
Step 5: Verify Running Containers
In a new terminal, checked the status of the services.
Command- docker-compose ps
Step 6: Access and Test the Full Application
Opened a web browser and navigated to http://localhost. Uploaded a sample PDF and verified that the frontend successfully displayed the AI-generated analysis received from the backend.
Procedure - Part 3: Pushing the Image into Docker Hub
Step 1: Tag the Local Images
Assuming the images built by Docker Compose are named legalmind_ai-backend:latest and legalmind_ai-frontend:latest (the Compose service names), tagged them to match the hyphenated Docker Hub repository names used in the push step.
Command- docker tag legalmind_ai-backend palaniyappan11/legalmind-ai-backend:latest
Command- docker tag legalmind_ai-frontend palaniyappan11/legalmind-ai-frontend:latest
Step 2: Login to Docker Hub
Authenticated with the Docker Hub registry using the CLI.
Command- docker login
Step 3: Push Images to Docker Hub
Pushed both tagged images to my Docker Hub repository.
Command- docker push palaniyappan11/legalmind-ai-backend:latest
Command- docker push palaniyappan11/legalmind-ai-frontend:latest
[Screenshot of the terminal output showing the successful push of both images]
Step 4: Verify on Docker Hub
Logged into the Docker Hub website (hub.docker.com) and navigated to my repositories to confirm that both legalmind-ai-backend and legalmind-ai-frontend were published with the latest tag.
How to Run the Project (Deployment Guide)
The LegalMind AI application is fully containerized and published to Docker Hub. This makes it simple for anyone to deploy and run locally.
Because LegalMind AI is a multi-container application (a backend and a frontend), the easiest way to run it is with Docker Compose.
Method 1: Build from Source (For Developers)
If you have cloned the complete repository from GitHub, you can build and run the application from the source code.
Step 1: Navigate to Project Directory
Command- cd legalmind_ai (Or your project's root folder name)
Step 2: Run with Docker Compose
This single command will build both the legalmind_ai-backend and legalmind_ai-frontend images and start the containers.
Command- docker-compose up --build
Step 3: Access the Application
Once the containers are running, open your browser and go to:
http://localhost
Method 2: Run from Docker Hub Images (For Deployment)
This is the recommended method for running the pre-built application on any machine with Docker.
Step 1: Create a docker-compose.yml file
Create a new, empty directory (e.g., legalmind-deploy) and inside it, create a file named docker-compose.yml. Paste the following content into the file. This file tells Docker to pull the pre-built images from Docker Hub.
version: '3.8'
services:
  legalmind_ai-backend:
    image: [your-username]/legalmind-ai-backend:latest
    restart: always
  legalmind_ai-frontend:
    image: [your-username]/legalmind-ai-frontend:latest
    restart: always
    ports:
      - "80:80"
    depends_on:
      - legalmind_ai-backend
(Remember to replace [your-username] with your actual Docker Hub ID.)
Step 2: Pull the Docker Images
Open your terminal in the same directory as your new docker-compose.yml file. Run the following commands to pull the pre-built images from Docker Hub. (This step is optional, as docker-compose up will do it automatically, but it's good practice).
Command- docker pull [your-username]/legalmind-ai-backend:latest
Command- docker pull [your-username]/legalmind-ai-frontend:latest
This downloads the backend and frontend images containing all the application code and dependencies.
Step 3: Run the Application
Now, run the application using Docker Compose.
Command- docker-compose up
This command will read your docker-compose.yml file, start both the legalmind_ai-backend and legalmind_ai-frontend containers, and connect them on an internal Docker network.
Step 4: Access the Application
Once the containers start successfully, open your browser and go to:
http://localhost
You will see the LegalMind AI web interface. You can now upload a PDF document to get the full AI analysis.
Step 5: Stop and Remove Containers
To stop the application, press CTRL+C in the terminal where Compose is running. To clean up and remove the containers, network, and volumes, run:
Command- docker-compose down -v
What modification is done in the containers after downloading
The base images (python:3.11-slim and nginx:alpine) were not modified after downloading. Instead, new application images were built on top of them using a Dockerfile. The key modifications included:
Installing Dependencies: For the backend, this involved using pip to install all Python libraries from requirements.txt (like FastAPI, Uvicorn, and pdfplumber) and apt-get for any system-level build tools.
Adding Application Code: The custom Python source code for the FastAPI application was copied into the backend image's /app directory.
Copying Static Content: The index.html, index.css, and script.js files for the user interface were copied into the Nginx frontend image's public HTML directory.
Implementing Custom Configuration: The default Nginx configuration was replaced with a custom nginx.conf file to serve the static files and, most importantly, to act as a reverse proxy, forwarding all /api/ requests to the backend container.
Setting Start Commands: A CMD was set in the backend Dockerfile to automatically run the Uvicorn server on port 8000; the Nginx container uses the new configuration by default upon starting.
Github link / dockerhub link of your modified containers
Project GitHub Repository:
Docker Hub Repository (LegalMind AI Backend):
Docker Hub Link: https://hub.docker.com/repositories/palaniyappan11
Pull Command: docker pull palaniyappan11/legalmind-ai-backend:latest
Docker Hub Repository (LegalMind AI Frontend):
Docker Hub Link: https://hub.docker.com/repositories/palaniyappan11
Pull Command: docker pull palaniyappan11/legalmind-ai-frontend:latest
What are the outcomes of your DA?
A reproducible Docker image for the FastAPI backend (legalmind_ai-backend), containing all AI logic and PDF processing capabilities.
A reproducible Docker image for the Nginx frontend (legalmind_ai-frontend), containing the UI and proxy configuration.
A multi-service Docker Compose file that orchestrates the entire application, managing networking and dependencies with a single command.
Successful upload of both the backend and frontend images to a public Docker Hub repository, making the application distributable.
A verified end-to-end analysis pipeline, taking a raw PDF upload and returning structured, AI-driven insights on a web UI.
A clear, scalable decoupled microservices architecture where the frontend and backend are independent and communicate only via an API.
Successful integration of a third-party AI service (Google Gemini) for complex NLP tasks within a containerized environment.
A functional Nginx reverse proxy setup that routes API traffic and serves static content from a single port (80), simplifying user access and avoiding CORS issues.
Implementation of performance optimization via an LRU cache, reducing redundant API calls to the AI, saving costs, and improving response time.
Conclusion
This Digital Assignment successfully demonstrates the complete lifecycle of developing, containerizing, and deploying a modern, AI-powered web application. The LegalMind AI platform effectively solves the problem of manual legal document review by leveraging a microservices architecture, Docker containerization, and the power of Large Language Models like Google Gemini.
The project achieved all objectives across the three DA parts: building the backend (Part 1), building the frontend (Part 2), and publishing the final images (Part 3). The final result is a scalable, maintainable, and production-ready system that can be launched on any machine with Docker using either Docker Compose or the docker pull commands.
References
Python Official Image: Python Software Foundation. (Link: https://hub.docker.com/_/python)
Nginx Official Image: Nginx, Inc. (F5). (Link: https://hub.docker.com/_/nginx)
FastAPI Documentation: (Link: https://fastapi.tiangolo.com/)
Google AI for Developers: (Link: https://ai.google.dev/)
Docker Official Documentation: (Link: https://docs.docker.com/)
IITB Docker Tutorial: For its clear and comprehensive tutorials on Docker and containerization principles. (Link: [Please add the IITB Tutorial Link here])
Acknowledgement
I wish to express my sincere gratitude to the faculty and staff of the School of Computer Science and Engineering (SCOPE) at VIT for providing the academic framework and opportunity to complete this Digital Assignment for the Cloud Computing - BCSE408L course during the Fall 2025-26 semester. I extend special thanks to my professor, Subbulakshmi T, for their invaluable guidance, instruction, and support throughout this project. I would also like to thank my family and friends for their continuous support and encouragement.
Appendix: Useful Commands
Compose (build, start, stop)
# Build and start all services
docker-compose up --build
# Stop and remove all services, networks, and volumes
docker-compose down -v
# List running services
docker-compose ps
# View logs for a specific service (e.g., backend)
docker-compose logs -f legalmind_ai-backend
# View logs for the frontend
docker-compose logs -f legalmind_ai-frontend
Test Endpoints
# Test the frontend (should return HTML)
curl http://localhost
# Test the backend health check (via the Nginx proxy)
curl http://localhost/api/health
# Test the backend health check directly (if port 8000 is mapped)
curl http://localhost:8000/health
Tag & Push to Docker Hub
# Tag the images (replace [your-username])
docker tag legalmind_ai-backend [your-username]/legalmind-ai-backend:latest
docker tag legalmind_ai-frontend [your-username]/legalmind-ai-frontend:latest
# Login to Docker Hub
docker login
# Push the images (replace [your-username])
docker push [your-username]/legalmind-ai-backend:latest
docker push [your-username]/legalmind-ai-frontend:latest