Projects
Project · Document Automation · 2023 to present
Automated Reporting Platform
Distributed reporting platform that automates PDF and DOCX generation through template-based rendering, asynchronous RabbitMQ processing, MongoDB ticket lifecycle management, ETA prediction, and Amazon S3 cloud storage. Built to handle high-volume reporting workloads without blocking user requests.
Tech Stack
- Python
- FastAPI
- RabbitMQ
- MongoDB
- Amazon S3
Key Features
- Dynamic PDF and DOCX generation using reusable template-based rendering
- MongoDB ticket lifecycle: every request gets a ticket with status, progress, history, and ETA
- RabbitMQ-powered async job queue — API responds immediately, workers process in background
- ETA prediction engine based on queue length, active workers, and historical processing time
- Scheduled recurring report generation (daily, weekly, monthly, batch)
- Amazon S3 cloud storage with secure download URL generation for completed reports
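As a rough illustration of the ETA feature, here is a minimal sketch of how queue length, active workers, and historical processing time could be combined. The function name, the wave-based formula, and the 30-second default are assumptions made for this example, not the platform's actual algorithm.

```python
import math
from statistics import mean


def estimate_eta_seconds(queue_length, active_workers, recent_durations,
                         default_duration=30.0):
    """Estimate seconds until a newly queued job finishes (illustrative only)."""
    # Use the historical average, or a hypothetical default when no history exists.
    avg = mean(recent_durations) if recent_durations else default_duration
    # Guard against a zero worker count so the estimate stays finite.
    workers = max(active_workers, 1)
    # Jobs are drained in "waves" of `workers` jobs; the new job sits in the
    # wave that contains position queue_length + 1.
    waves = math.ceil((queue_length + 1) / workers)
    return waves * avg
```

With 8 jobs already waiting, 4 workers, and a 15-second historical average, the new job falls in the third wave, so the sketch estimates 45 seconds.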
Architecture
FastAPI → RabbitMQ → Python Workers → S3
- 1. Frontend submits report request to FastAPI service
- 2. Service creates MongoDB ticket and dispatches job to RabbitMQ queue
- 3. Python workers consume queue messages and generate PDF/DOCX via templates
- 4. Generated files uploaded to Amazon S3 with metadata linked to the ticket
- 5. Users track progress via ticket status and retrieve files via secure download URLs
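The steps above can be sketched with in-memory stand-ins, to show why the API call returns before any rendering happens. Everything here is hypothetical: a dict replaces the MongoDB collection, `queue.Queue` replaces RabbitMQ, another dict replaces the S3 bucket, and `str.format` replaces the real PDF/DOCX renderer.

```python
import queue
import uuid

tickets = {}          # stand-in for the MongoDB ticket collection
jobs = queue.Queue()  # stand-in for the RabbitMQ queue
storage = {}          # stand-in for the S3 bucket


def submit_report(template, data):
    """API side: create a ticket, enqueue the job, and return at once."""
    ticket_id = uuid.uuid4().hex
    tickets[ticket_id] = {"status": "queued", "file_key": None}
    jobs.put({"ticket_id": ticket_id, "template": template, "data": data})
    return ticket_id


def run_worker_once():
    """Worker side: consume one job, render it, store it, update the ticket."""
    job = jobs.get()
    ticket = tickets[job["ticket_id"]]
    ticket["status"] = "processing"
    # Placeholder for the real template-based PDF/DOCX rendering.
    rendered = job["template"].format(**job["data"])
    key = f"reports/{job['ticket_id']}.txt"
    storage[key] = rendered.encode("utf-8")
    ticket.update(status="completed", file_key=key)
    jobs.task_done()
```

Calling `submit_report("Total: {total}", {"total": 42})` returns a ticket ID while the ticket is still `queued`. The ticket only becomes `completed` after a worker calls `run_worker_once()`.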
Data / Processing Flow
- 01 User submits report generation request via application
- 02 Ticket created in MongoDB with request info, status, and initial ETA
- 03 Job dispatched to RabbitMQ; API returns immediately with ticket ID
- 04 Python worker consumes job and renders PDF or DOCX from template
- 05 Generated file uploaded to Amazon S3
- 06 Ticket updated with completion status and S3 download URL
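One way to picture the ticket that moves through this flow is as a document with a status, progress, ETA, and an append-only history. The field names, status names, and allowed transitions below are assumptions for illustration; the real schema is not shown on this page.

```python
from datetime import datetime, timezone

# Hypothetical status names and the transitions allowed between them.
ALLOWED = {
    "queued": {"processing", "failed"},
    "processing": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def new_ticket(ticket_id, request_info, eta_seconds):
    """Build the initial ticket document (shape is an assumption)."""
    now = datetime.now(timezone.utc)
    return {
        "_id": ticket_id,
        "request": request_info,
        "status": "queued",
        "progress": 0,
        "eta_seconds": eta_seconds,
        "download_url": None,
        "history": [{"status": "queued", "at": now}],
    }


def transition(ticket, new_status, **fields):
    """Move the ticket to a new status and record the change in its history."""
    if new_status not in ALLOWED[ticket["status"]]:
        raise ValueError(f"cannot go from {ticket['status']} to {new_status}")
    ticket["status"] = new_status
    ticket.update(fields)
    ticket["history"].append(
        {"status": new_status, "at": datetime.now(timezone.utc)}
    )
    return ticket
```

Because every transition appends to `history`, the ticket carries its own record from submission to completion, and an invalid jump such as `completed` back to `processing` raises an error.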
Highlights & Metrics
- Non-blocking API with async queue dispatch for all report requests
- ETA prediction based on queue length and historical processing time
- Supports PDF and DOCX output via dynamic template rendering
Use Cases
- Operational reporting from structured datasets
- Executive report generation in presentation-ready PDF or DOCX format
- Compliance documentation with standardized reusable templates
- Scheduled recurring reports on configurable intervals
- High-volume batch report generation with distributed workers
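For the recurring-report use case, a scheduler needs to work out when each report is next due. The sketch below assumes three interval names (`daily`, `weekly`, `monthly`) and a hypothetical `next_run` helper; how the platform actually schedules jobs is not described here.

```python
import calendar
from datetime import datetime, timedelta


def next_run(last_run, interval):
    """Return the next scheduled time after `last_run` (illustrative only)."""
    if interval == "daily":
        return last_run + timedelta(days=1)
    if interval == "weekly":
        return last_run + timedelta(weeks=1)
    if interval == "monthly":
        # Roll over to January of the next year after December.
        year = last_run.year + last_run.month // 12
        month = last_run.month % 12 + 1
        # Clamp the day so that, for example, Jan 31 maps to the last day of Feb.
        day = min(last_run.day, calendar.monthrange(year, month)[1])
        return last_run.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown interval: {interval}")
```

Clamping the day is a design choice for this example: a report scheduled on the 31st runs on the last day of shorter months instead of being skipped.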
My Contributions
- Designed the distributed reporting architecture.
- Developed REST APIs using FastAPI.
- Built async job processing with RabbitMQ and Python worker pool.
- Implemented MongoDB-based ticket lifecycle with full processing history.
- Developed ETA prediction based on queue workload and historical processing time.
- Built dynamic PDF and DOCX template rendering engines.
- Integrated Amazon S3 for document storage and secure download URL generation.
- Designed scalable background processing workflows with horizontal worker scalability.
Technical Highlights
- Distributed worker architecture scales horizontally by adding more RabbitMQ consumers
- ETA prediction considers queue length, active worker count, and historical processing averages
- MongoDB ticket lifecycle provides full audit trail from submission to completion
- Template-based rendering supports both PDF and DOCX output from the same data source
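The last point, one data source feeding several output formats, can be illustrated with a small registry of templates keyed by format. This is only a sketch: `string.Template` and plain-text output stand in for the real PDF and DOCX engines, and the template contents and `render` helper are invented for the example.

```python
from string import Template

# Hypothetical per-format templates; real ones would drive PDF/DOCX engines.
TEMPLATES = {
    "pdf": Template("[PDF] $title\nPeriod: $period\nTotal: $total"),
    "docx": Template("[DOCX] $title ($period): total $total"),
}


def render(fmt, data):
    """Render the same data dict through the template registered for `fmt`."""
    try:
        template = TEMPLATES[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    # substitute() raises KeyError if the data lacks a field the template needs.
    return template.substitute(data)
```

Passing the same dict, for instance `{"title": "Sales", "period": "2024-Q1", "total": 1200}`, to `render("pdf", ...)` and `render("docx", ...)` yields two differently laid-out outputs without touching the data.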
https://yusufrifqi.work/projects/automated-reporting