Dataset of government-developed open-source software

Ilma Jaganjac, Angelos-Ermis Mangos, Marvin Blommestijn, Pravesha Ramsundersingh.

Group 9.

Paper. Website. Source code.

This project builds a dataset and an analysis framework for evaluating the sustainability of government-developed open-source software. It collects repositories from five countries (the United States, the Netherlands, Greece, Germany, and France) and assesses them across five dimensions: technical, environmental, economic, social, and individual. The technical evaluation covers code quality, documentation, testing practices, and modularity, using automated AI analysis. Environmental sustainability is estimated from energy efficiency and carbon footprint, based on each project's runnability and the efficiency of its programming language. The economic dimension identifies redundant projects through clustering and textual similarity analysis, which could help reduce development costs. Social sustainability is examined through commit message sentiment and community engagement, while the individual dimension looks at contributor diversity, retention, and project openness. An interactive dashboard built with Streamlit visualizes these metrics with radar charts, bar graphs, and pie charts, showing each country's performance and areas for improvement. The project aims to support sustainable software engineering practices in the public sector by addressing challenges such as limited documentation, low testing standards, and language barriers.
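The redundancy detection mentioned above can be illustrated with a small sketch. The summary does not say which similarity measure or clustering algorithm the project uses, so the following is only one plausible approach: TF-IDF weighting with cosine similarity over repository descriptions, written with the Python standard library. The repository names, descriptions, the `redundant_pairs` helper, and the 0.5 threshold are all hypothetical and chosen for illustration; the clustering step is not shown.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency, so no weight is ever zero.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def redundant_pairs(repos, threshold=0.5):
    """Return (name_a, name_b, score) for pairs at or above the threshold.

    `repos` maps a repository name to its description. The threshold is an
    assumed value for illustration, not one taken from the project.
    """
    names = list(repos)
    vectors = tfidf_vectors([repos[name] for name in names])
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            pairs.append((names[i], names[j], score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical repositories, invented for this example.
repos = {
    "agency-a/address-validator":
        "Validate and normalize postal addresses for citizen forms",
    "agency-b/postal-check":
        "Validate postal addresses and normalize them for citizen forms",
    "agency-c/energy-dashboard":
        "Dashboard showing energy consumption of public buildings",
}

for name_a, name_b, score in redundant_pairs(repos):
    print(f"{name_a} <-> {name_b}: similarity {score:.2f}")
```

With this toy input, only the two address-related repositories are flagged, because the third description shares no terms with them. On a real dataset the threshold would need tuning, and descriptions in different languages would need translating or multilingual embeddings before comparison, since word overlap alone cannot match text across languages.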