2026: An Intelligent Natural Language Processing Pipeline for Public Procurement Data: Enabling Predictive Policy Analysis (EDF)
This project addresses a critical bottleneck in evidence-based food policy: public procurement data are severely fragmented and inconsistent. We propose to develop and validate an open-source data-cleaning pipeline that uses AI-based natural language processing to automate the transformation of raw, unstructured bid data into a unified, analysis-ready resource. To demonstrate its utility, we will conduct a proof-of-concept case study that applies an econometric model to the cleaned data to derive initial insights into bidding dynamics. This foundational infrastructure will enable timely, rigorous analysis of values-based procurement policies, helping municipalities design strategies that support local economies, disadvantaged vendors, and environmental goals.
Cornell: Houtian (Frank) Ge (Cornell SC Johnson College of Business / Dyson School)
EDF: Daniel Kaiser (Director of Agriculture Innovation, Climate-Smart Agriculture)