LegalMind AI — Dockerized Legal Document Analyzer with Google Gemini and Docker Hub Deployment

October 30, 2025

Introduction

This project demonstrates the complete development, containerization, and deployment lifecycle of LegalMind AI, an intelligent platform designed to automate the analysis of legal documents. Manual legal review is time-consuming and expensive; LegalMind AI addresses this with a FastAPI backend that extracts clauses from PDF documents, classifies each clause for risk (High, Medium, Low), and explains it in simple terms using Google's Gemini AI. The system is presented to the user through a modern Nginx-powered web interface.

The complete solution was built as a multi-container application, orchestrated with Docker Compose, and published to Docker Hub. This documentation covers the project in three logical parts, along with the architecture, the procedures and commands used, the container modifications, the outcomes, and references.

Objectives of Part 1: Backend Container

  • Build the application image for the FastAPI backend (service name legalmind_ai-backend) using a python:3.11-slim base.

  • Package all Python dependencies (requirements.txt), including fastapi, uvicorn, pdfplumber, and google-generativeai.

  • Implement the core API endpoints for health checks (/health) and document analysis (/extract_clauses/, /classify_clause/).

  • Ensure the container runs locally, exposes port 8000, and successfully processes PDF files and communicates with the Gemini AI API.

Objectives of Part 2: Frontend Container

  • Build the application image for the Nginx frontend (service name legalmind_ai-frontend) using an nginx:alpine base.

  • Package all static web assets (HTML, CSS, vanilla JavaScript).

  • Implement a custom Nginx configuration (nginx.conf) to serve the static files and act as a reverse proxy, forwarding all /api/ requests to the legalmind_ai-backend container.

  • Use Docker Compose to build and run the full multi-container application (legalmind_ai-backend and legalmind_ai-frontend) with a single command.

  • Verify end-to-end connectivity from the browser (port 80) to the backend service.

Objectives of Part 3: Pushing the Image into Docker Hub

  • Tag the final locally-built images (e.g., legalmind_ai-backend and legalmind_ai-frontend) with a personal Docker Hub username and the latest tag.

  • Log in to the Docker Hub registry via the command-line interface.

  • Push both the backend (legalmind_ai-backend) and frontend (legalmind_ai-frontend) application images to a public Docker Hub repository.

  • Provide the final, reproducible Docker Hub links and docker pull commands for both images.

Name of the containers involved and the download links

Application Images (My Project)

1. Backend Image (legalmind_ai-backend)

  • My Docker Hub Repo: [your-username]/legalmind-ai-backend:latest

  • Pull Command: docker pull [your-username]/legalmind-ai-backend:latest

  • Docker Hub Link: https://hub.docker.com/r/[your-username]/legalmind-ai-backend

2. Frontend Image (legalmind_ai-frontend)

  • My Docker Hub Repo: [your-username]/legalmind-ai-frontend:latest

  • Pull Command: docker pull [your-username]/legalmind-ai-frontend:latest

  • Docker Hub Link: https://hub.docker.com/r/[your-username]/legalmind-ai-frontend

Base Images (Official)

1. Python

  • Image: python:3.11-slim

  • Purpose: Base image for the FastAPI backend (legalmind_ai-backend).

  • Docker Hub Link: https://hub.docker.com/_/python

2. Nginx

  • Image: nginx:alpine

  • Purpose: Base image for the frontend web server (legalmind_ai-frontend) and reverse proxy.

  • Docker Hub Link: https://hub.docker.com/_/nginx

Name of the other software involved along with the purpose

  • FastAPI: Modern Python web framework used to build the high-performance backend REST API.

  • Google Generative AI (Gemini): The AI engine used for all Natural Language Processing (NLP) tasks, including clause classification, risk assessment, and explanation generation.

  • Uvicorn: An ASGI server used to run the FastAPI application inside the backend container.

  • pdfplumber: A Python library used to extract text and data from uploaded PDF documents.

  • Vanilla JavaScript (ES6+): Used for all frontend logic, including API calls (fetch) and dynamically rendering the analysis results on the page.

  • Nginx: A high-performance web server used in the frontend container to serve static files and act as a reverse proxy.

  • Docker & Docker Compose: The core containerization and orchestration tools used to package, build, and run the entire application.
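Based on the dependency list above, the backend's requirements.txt looks roughly like the following sketch. The exact pins are not given in this report, so the entries are unversioned; python-multipart is included because FastAPI needs it to handle file uploads.

Text
fastapi
uvicorn[standard]
pdfplumber
google-generativeai
python-multipart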

Overall Architecture for the Project

The system follows a three-tier microservices architecture, as illustrated in Figure 1 below.

Architecture Summary:

  • User (Browser): Accesses the static website served from the Frontend Container (legalmind_ai-frontend) on http://localhost.

  • Frontend Container (Nginx):

    • Serves the index.html, index.css, and script.js files.

    • Acts as a reverse proxy: any request to http://localhost/api/... is forwarded to the legalmind_ai-backend container's port 8000, avoiding CORS issues.

  • Backend Container (legalmind_ai-backend):

    • Listens on port 8000.

    • Receives API requests from the Nginx proxy.

    • Uses pdfplumber to extract text from the PDF.

    • Communicates with the external Google Gemini AI API to get analysis (classification, risk, explanation).

    • Returns the JSON analysis back to the frontend.

Overall Flow:

  1. User (Browser): Uploads a PDF document.

  2. Frontend Container (Nginx): Receives the file and proxies the request to /api/.

  3. Backend Container (FastAPI): Receives the file, uses pdfplumber to extract text.

  4. Google Gemini AI (External API): The backend sends the extracted text for analysis.

  5. Backend Container (FastAPI): Receives the JSON analysis (risk, category, etc.) from Gemini.

  6. Frontend Container (Nginx): Receives the final JSON response from the backend.

  7. User (Browser): The frontend's JavaScript renders the JSON data as an interactive report.
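For illustration, a single analyzed clause inside the JSON returned at step 5 might look like this. The exact field names are assumptions, since the response schema is not reproduced in this report:

JSON
{
  "clause": "Either party may terminate this agreement with 30 days' notice.",
  "category": "Termination",
  "risk": "Medium",
  "explanation": "Either side can end the contract after giving one month's warning."
}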

Architecture Image

[Figure 1: Architecture diagram — user's browser → legalmind_ai-frontend (Nginx: static files + /api/ reverse proxy, port 80) → legalmind_ai-backend (FastAPI/Uvicorn, port 8000) → external Google Gemini AI API]

Description About the Architecture

As shown in the architecture diagram, LegalMind AI follows a microservices architecture with a clear separation of concerns between the frontend presentation layer and the backend processing layer. The system is fully containerized using Docker, which packages the application and its dependencies, ensuring consistent deployment and execution regardless of the host environment. The frontend is a lightweight, static single-page application (SPA) built with vanilla JavaScript, HTML, and CSS. It is served by a high-performance Nginx container (the legalmind_ai-frontend service), which also acts as a reverse proxy, directing all API calls (e.g., /api/*) to the legalmind_ai-backend service. This approach decouples the client from the server and simplifies configuration.

The backend (the legalmind_ai-backend service) is a robust Python application built with the FastAPI framework and run by a Uvicorn ASGI server inside its own Docker container. This container houses all the core business logic. When a user uploads a PDF, the backend uses the pdfplumber library for clause extraction. It then communicates externally with the Google Gemini AI API to perform complex NLP tasks: clause classification, risk assessment, and explanation generation. To optimize performance and manage API costs, the backend implements an LRU cache for AI responses and includes a fallback system of pre-written explanations. The entire multi-container setup is orchestrated by Docker Compose, which manages the application's services, networks, and health checks, allowing the full stack to be launched with a single docker-compose up command.
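A minimal sketch of that cache-plus-fallback pattern is shown below. The GEMINI_API_KEY environment variable name and the gemini-pro model id are assumptions, as the actual app.py is not reproduced in this report:

Python
import os
from functools import lru_cache

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed env var name
model = genai.GenerativeModel("gemini-pro")  # assumed model id

FALLBACK = "Automatic analysis unavailable; please review this clause manually."

@lru_cache(maxsize=256)
def analyze_clause(clause_text: str) -> str:
    # Identical clauses are served from the cache instead of re-calling the API
    prompt = ("Classify this legal clause, rate its risk (High/Medium/Low), "
              f"and explain it in simple terms:\n{clause_text}")
    return model.generate_content(prompt).text

def analyze_clause_safe(clause_text: str) -> str:
    try:
        return analyze_clause(clause_text)
    except Exception:
        return FALLBACK  # pre-written fallback when the Gemini API is unreachable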

The data flow is a key part of this design. When a user visits http://localhost, they are greeted by the legalmind_ai-frontend (frontend) container, which serves the static index.html page. When that user uploads a PDF, the JavaScript in their browser does not send the file directly to the backend. Instead, it sends the request to its own server (the Nginx container at localhost:80). Nginx then inspects the request path. Seeing that it starts with /api/, it "proxies" this request over the internal Docker network to the legalmind_ai-backend container at http://legalmind_ai-backend:8000. This is a critical security and design pattern that hides the backend from the public internet and completely avoids any browser-side CORS (Cross-Origin Resource Sharing) errors.

This decoupled architecture is also highly scalable. While this project runs both containers on a single machine, in a production environment, we could run multiple legalmind_ai-backend containers for heavy processing load, with the legalmind_ai-frontend Nginx container acting as a load balancer to distribute requests between them. Because the entire application state is managed per-request (stateless) and no persistent database is used in this version, it is horizontally scalable. The final images, pushed to Docker Hub, encapsulate this entire logic, allowing anyone to pull and deploy this full-stack application in minutes using a simple Docker Compose file.

Procedure - Part 1: Backend Container

Step 1: Project Folder Structure

Organized the project into backend and frontend directories to hold the code for each service.

Step 2: Backend Dockerfile Preparation

Created a Dockerfile in the backend directory, using python:3.11-slim as the base, copying requirements.txt, installing dependencies, and setting the CMD to run Uvicorn.
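A minimal sketch of that Dockerfile, assuming app.py and requirements.txt sit at the root of the backend directory:

Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]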



Step 3: FastAPI Application (app.py)

Developed the app.py with FastAPI, including the /health endpoint and the analysis endpoints (/extract_clauses/, /classify_clause/).
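A skeleton of those endpoints is sketched below. The real app.py adds the Gemini prompts, LRU cache, and fallbacks described in the architecture section; the blank-line clause-splitting heuristic here is an illustrative assumption:

Python
import io

import pdfplumber
from fastapi import FastAPI, File, UploadFile

app = FastAPI(title="LegalMind AI")

@app.get("/health")
def health():
    return {"status": "healthy", "service": "LegalMind AI"}

@app.post("/extract_clauses/")
async def extract_clauses(file: UploadFile = File(...)):
    # Read the uploaded PDF into memory and extract the text of every page
    data = await file.read()
    with pdfplumber.open(io.BytesIO(data)) as pdf:
        text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    # Naive split on blank lines; the real service uses smarter heuristics
    clauses = [c.strip() for c in text.split("\n\n") if c.strip()]
    return {"clauses": clauses}

@app.post("/classify_clause/")
async def classify_clause(payload: dict):
    # In the real app this calls the Gemini helper; a placeholder is returned here
    return {"category": "General", "risk": "Medium", "explanation": "..."}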

Step 4: Build the Backend Image

Navigated to the backend directory and built the image.

Command- cd backend

Command- docker build -t legalmind_ai-backend .

Step 5: Run and Verify the Backend Container

Ran the container, mapping port 8000.

Command- docker run -p 8000:8000 legalmind_ai-backend

Step 6: Test Health Check

In a new terminal, tested the running container's health endpoint.

Command- curl http://localhost:8000/health

Output: {"status": "healthy", "service": "LegalMind AI"}


Procedure - Part 2: Frontend Container

Step 1: Frontend Dockerfile Preparation

Created a Dockerfile in the frontend directory using nginx:alpine, copying in the static files and the custom Nginx configuration.
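A sketch of that Dockerfile, assuming the static assets and nginx.conf sit at the root of the frontend directory:

Dockerfile
FROM nginx:alpine

# Replace the default site configuration with the custom reverse-proxy config
COPY nginx.conf /etc/nginx/conf.d/default.conf

# Copy the static assets into Nginx's web root
COPY index.html index.css script.js /usr/share/nginx/html/

EXPOSE 80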


Step 2: Nginx Proxy Configuration (nginx.conf)

Wrote a custom nginx.conf file. The key section is the location /api/ block, which proxies requests to the backend service, named legalmind_ai-backend by Docker Compose.

proxy_pass http://legalmind_ai-backend:8000/;
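In context, a minimal version of that configuration might look like the following. The client_max_body_size value is an assumed upload limit, not taken from the source:

Nginx
server {
    listen 80;

    root /usr/share/nginx/html;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;
    }

    location /api/ {
        # The trailing slash strips the /api/ prefix before forwarding
        proxy_pass http://legalmind_ai-backend:8000/;
        proxy_set_header Host $host;
        client_max_body_size 20m;  # allow reasonably large PDF uploads
    }
}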


Step 3: Docker Compose Configuration (docker-compose.yml)

Created the docker-compose.yml file in the root directory. This file defines two services: legalmind_ai-backend (building from ./backend) and legalmind_ai-frontend (building from ./frontend), setting up port mapping 80:80 for the frontend and managing the internal network.
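A sketch of that file is below. Passing the Gemini key via a GEMINI_API_KEY environment variable is an assumption, and the health check shells out to Python because the slim base image does not ship curl:

YAML
version: '3.8'

services:
  legalmind_ai-backend:
    build: ./backend
    environment:
      - GEMINI_API_KEY=${GEMINI_API_KEY}
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000/health')\""]
      interval: 30s
      retries: 3

  legalmind_ai-frontend:
    build: ./frontend
    ports:
      - "80:80"
    depends_on:
      - legalmind_ai-backend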



Step 4: Start the Full Application Stack

From the root directory, ran Docker Compose.

Command- docker-compose up --build


Step 5: Verify Running Containers

In a new terminal, checked the status of the services.

Command- docker-compose ps


Step 6: Access and Test the Full Application

Opened a web browser and navigated to http://localhost. Uploaded a sample PDF and verified that the frontend successfully displayed the AI-generated analysis received from the backend.
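The same end-to-end path can also be exercised from the terminal, assuming a sample contract.pdf in the current directory and that the endpoint accepts a multipart field named file:

Command- curl -F "file=@contract.pdf" http://localhost/api/extract_clauses/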


Procedure - Part 3: Pushing the Image into Docker Hub

Step 1: Tag the Local Images

Assuming the images built by Docker Compose are named legalmind_ai-backend:latest and legalmind_ai-frontend:latest (set via the image: key in docker-compose.yml, or derived from the service names).

Command- docker tag legalmind_ai-backend palaniyappan11/legalmind-ai-backend:latest

Command- docker tag legalmind_ai-frontend palaniyappan11/legalmind-ai-frontend:latest


Step 2: Login to Docker Hub

Authenticated with the Docker Hub registry using the CLI.

Command- docker login


Step 3: Push Images to Docker Hub

Pushed both tagged images to my Docker Hub repository.

Command- docker push palaniyappan11/legalmind-ai-backend:latest

Command- docker push palaniyappan11/legalmind-ai-frontend:latest

[Screenshot of the terminal output showing the successful push of both images]

Step 4: Verify on Docker Hub

Logged into the Docker Hub website (hub.docker.com) and navigated to my repositories to confirm that both legalmind-ai-backend and legalmind-ai-frontend were published with the latest tag.




How to Run the Project (Deployment Guide)

The LegalMind AI application is fully containerized and published to Docker Hub. This makes it simple for anyone to deploy and run locally.

Because LegalMind AI is a multi-container application (a backend and a frontend), the easiest way to run it is with Docker Compose.

Method 1: Build from Source (For Developers)

If you have cloned the complete repository from GitHub, you can build and run the application from the source code.

Step 1: Navigate to Project Directory

Command- cd legalmind_ai (or your project's root folder name)

Step 2: Run with Docker Compose

This single command will build both the legalmind_ai-backend and legalmind_ai-frontend images and start the containers.

Command- docker-compose up --build

Step 3: Access the Application

Once the containers are running, open your browser and go to:

http://localhost

Method 2: Run from Docker Hub Images (For Deployment)

This is the recommended method for running the pre-built application on any machine with Docker.

Step 1: Create a docker-compose.yml file

Create a new, empty directory (e.g., legalmind-deploy) and inside it, create a file named docker-compose.yml. Paste the following content into the file. This file tells Docker to pull the pre-built images from Docker Hub.

YAML
version: '3.8'

services:
  legalmind_ai-backend:
    image: [your-username]/legalmind-ai-backend:latest
    restart: always
    
  legalmind_ai-frontend:
    image: [your-username]/legalmind-ai-frontend:latest
    restart: always
    ports:
      - "80:80"
    depends_on:
      - legalmind_ai-backend

(Remember to replace [your-username] with your actual Docker Hub ID)

Step 2: Pull the Docker Images

Open your terminal in the same directory as your new docker-compose.yml file. Run the following commands to pull the pre-built images from Docker Hub. (This step is optional, as docker-compose up will do it automatically, but it's good practice).

Command- docker pull [your-username]/legalmind-ai-backend:latest

Command- docker pull [your-username]/legalmind-ai-frontend:latest

This downloads the backend and frontend images containing all the application code and dependencies.

Step 3: Run the Application

Now, run the application using Docker Compose.

Command- docker-compose up

This command will read your docker-compose.yml file, start both the legalmind_ai-backend and legalmind_ai-frontend containers, and connect them on an internal Docker network.

Step 4: Access the Application

Once the containers start successfully, open your browser and go to:

http://localhost

You will see the LegalMind AI web interface. You can now upload a PDF document to get the full AI analysis.

Step 5: Stop and Remove Containers

To stop the application, press CTRL+C in the terminal where Compose is running. To clean up and remove the containers, the network, and any volumes, run:

Command- docker-compose down -v

What modification is done in the containers after downloading

The base images (python:3.11-slim and nginx:alpine) were not modified after downloading. Instead, new application images were built on top of them using a Dockerfile. The key modifications included:

  1. Installing Dependencies: For the backend, this involved using pip to install all Python libraries from requirements.txt (like FastAPI, Uvicorn, and pdfplumber) and apt-get for any system-level build tools.

  2. Adding Application Code: The custom Python source code for the FastAPI application was copied into the backend image's /app directory.

  3. Copying Static Content: The index.html, index.css, and script.js files for the user interface were copied into the Nginx frontend image's public HTML directory.

  4. Implementing Custom Configuration: The default Nginx configuration was replaced with a custom nginx.conf file to serve the static files and, most importantly, to act as a reverse proxy, forwarding all /api/ requests to the backend container.

  5. Setting Start Commands: A CMD was set in the backend Dockerfile to automatically run the Uvicorn server on port 8000, and the Nginx container uses the new configuration by default upon starting.

GitHub link / Docker Hub link of the modified containers

The Docker Hub repositories and pull commands for both modified images are listed above in the section "Name of the containers involved and the download links".

What are the outcomes of your DA?

  • A reproducible Docker image for the FastAPI backend (legalmind_ai-backend), containing all AI logic and PDF processing capabilities.

  • A reproducible Docker image for the Nginx frontend (legalmind_ai-frontend), containing the UI and proxy configuration.

  • A multi-service Docker Compose file that orchestrates the entire application, managing networking and dependencies with a single command.

  • Successful upload of both the backend and frontend images to a public Docker Hub repository, making the application distributable.

  • A verified end-to-end analysis pipeline, taking a raw PDF upload and returning structured, AI-driven insights on a web UI.

  • A clear, scalable decoupled microservices architecture where the frontend and backend are independent and communicate only via an API.

  • Successful integration of a third-party AI service (Google Gemini) for complex NLP tasks within a containerized environment.

  • A functional Nginx reverse proxy setup that routes API traffic and serves static content from a single port (80), simplifying user access and avoiding CORS issues.

  • Implementation of performance optimization via an LRU cache, reducing redundant API calls to the AI, saving costs, and improving response time.

Conclusion

This Digital Assignment successfully demonstrates the complete lifecycle of developing, containerizing, and deploying a modern, AI-powered web application. The LegalMind AI platform effectively solves the problem of manual legal document review by leveraging a microservices architecture, Docker containerization, and the power of Large Language Models like Google Gemini.

The project achieved all objectives across the three DA parts: building the backend (Part 1), building the frontend (Part 2), and publishing the final images (Part 3). The final result is a scalable, maintainable, and production-ready system that can be launched on any machine with Docker using either Docker Compose or the docker pull commands.

References

  • Python Official Image: Python Software Foundation. (Link: https://hub.docker.com/_/python)

  • Nginx Official Image: Nginx, Inc. (F5). (Link: https://hub.docker.com/_/nginx)

  • FastAPI Documentation: (Link: https://fastapi.tiangolo.com/)

  • Google AI for Developers: (Link: https://ai.google.dev/)

  • Docker Official Documentation: (Link: https://docs.docker.com/)

  • IITB Docker tutorial: For its clear and comprehensive tutorials on Docker and containerization principles. (Link: [Please add the IITB Tutorial Link here])

Acknowledgement

I wish to express my sincere gratitude to the faculty and staff of the School of Computer Science and Engineering (SCOPE) at VIT for providing the academic framework and opportunity to complete this Digital Assignment for the Cloud Computing - BCSE408L course during the Fall 2025-26 semester. I extend special thanks to my professor, Subbulakshmi T, for their invaluable guidance, instruction, and support throughout this project. I would also like to thank my family and friends for their continuous support and encouragement.

Appendix: Useful Commands

Compose (build, start, stop)

# Build and start all services

docker-compose up --build

# Stop and remove all services, networks, and volumes

docker-compose down -v

# List running services

docker-compose ps

# View logs for a specific service (e.g., backend)

docker-compose logs -f legalmind_ai-backend

# View logs for the frontend

docker-compose logs -f legalmind_ai-frontend

Test Endpoints

# Test the frontend (should return HTML)

curl http://localhost

# Test the backend health check (via the Nginx proxy)

curl http://localhost/api/health

# Test the backend health check directly (if port 8000 is mapped)

curl http://localhost:8000/health

Tag & Push to Docker Hub

# Tag the images (replace [your-username])

docker tag legalmind_ai-backend [your-username]/legalmind-ai-backend:latest

docker tag legalmind_ai-frontend [your-username]/legalmind-ai-frontend:latest

# Login to Docker Hub

docker login

# Push the images (replace [your-username])

docker push [your-username]/legalmind-ai-backend:latest

docker push [your-username]/legalmind-ai-frontend:latest


Author: Palaniyappan S
