MACHINE LEARNING • DIGITAL TRANSFORMATION • DEVELOPMENT • DEVOPS
Challenge
Build a platform that gathers data on adverse medication effects from several sources and produces informative summaries for labs.
Reduction in data entry errors by 85%
Project
We built a serverless platform on AWS using Python and MLOps practices. It included:
Automated ML Pipelines
Leveraging AWS services, we built automated pipelines for training and deploying Machine Learning models for efficient document processing, focusing specifically on PDF data extraction using OCR.
Infrastructure as Code (IaC)
We managed infrastructure with AWS SAM, enabling consistent and version-controlled deployments using CI/CD tools.

Achievements
This solution streamlines approval processes during the research and development phase of the product life cycle, which is particularly beneficial for the pharmaceutical industry. The platform handles both XML files and PDFs, using a different method to extract and process the information in each:
XML files are converted directly to JSON.

PDFs require OCR + Machine Learning model orchestration to transform the information into a usable format.
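
The two paths above can be pictured with a minimal Python sketch. It is an illustration only, not the platform's actual code: the function names (element_to_dict, xml_to_json, route_document), the route labels, and the sample report are all hypothetical, and the XML conversion uses only the Python standard library rather than any particular AWS service.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def element_to_dict(element):
    """Recursively convert an XML element into dicts, lists, and strings.

    Simplification for this sketch: XML attributes are ignored.
    """
    children = list(element)
    if not children:
        # Leaf node: keep the stripped text, or None if it is empty.
        text = (element.text or "").strip()
        return text or None
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated sibling tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Convert an XML string to a JSON string (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


def route_document(filename):
    """Pick a processing path from the file extension (hypothetical labels)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".xml":
        return "xml_to_json"
    if suffix == ".pdf":
        # In the platform described above, this path involves OCR plus
        # machine learning model orchestration; it is not sketched here.
        return "ocr_ml_pipeline"
    raise ValueError(f"Unsupported file type: {filename}")


if __name__ == "__main__":
    # Made-up sample report, for illustration only.
    sample = (
        "<report><drug>ExampleDrug</drug>"
        "<effect>nausea</effect><effect>rash</effect></report>"
    )
    print(route_document("case_001.xml"))
    print(xml_to_json(sample))
```

Collecting repeated tags into a list keeps every entry when a report lists several effects. A production converter would also need to decide how to represent attributes and namespaces.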