DATA SCIENCE EDUCATION THROUGH COMMUNITY-CENTERED OPEN SCIENCE TRAINING
1 Montgomery College (UNITED STATES)
2 BioData Sage (UNITED STATES)
About this paper:
Appears in: INTED2026 Proceedings
Publication year: 2026
Article: 1138
ISBN: 978-84-09-82385-7
ISSN: 2340-1079
doi: 10.21125/inted.2026.1138
Conference name: 20th International Technology, Education and Development Conference
Dates: 2-4 March, 2026
Location: Valencia, Spain
Abstract:
This presentation shares outcomes and insights from FAIR Forward 2025, a pioneering initiative addressing the underrepresentation of BIPOC communities in data science and open research. Supported by the Open Research Community Accelerator Program, this virtual program engaged BIPOC students and professionals in real-world, hands-on data science training centered on open research practices and the FAIR (Findable, Accessible, Interoperable, Reusable) principles.

The initiative addresses critical gaps in STEM education accessibility and the research divide within underrepresented communities. Traditional data science education often assumes prior programming knowledge and access to resources, creating barriers for students from institutions with limited research infrastructure. FAIR Forward 2025 dismantled these barriers through a comprehensive virtual learning journey combining foundational workshops that used free and open-source tools with a platform for applying the lessons.

The program structure included a multi-week workshop series covering open-source data, R programming fundamentals, version control with GitHub, and data visualization techniques. All materials were designed for beginners, requiring no prior coding experience. The program culminated in a two-day virtual hackathon where participants analyzed real-world human health and behavioral datasets to address community-relevant challenges, competing for prizes while earning participation certificates. This applied experience gave students the opportunity to experiment with the tools they learned and to apply FAIR principles in real time using real-world datasets.

The initiative achieved notable demographic outcomes. All participating students were from the BIPOC community, and women and non-binary individuals made up 70% of the initiative, exceeding the typical representation in data science programs. Preliminary feedback indicates significant knowledge gains among participants with no prior programming experience, high engagement rates across the virtual format, and successful completion of hackathon projects addressing public health disparities. The emphasis on community-relevant applications and collaboration proved particularly effective in attracting and retaining participants from underrepresented backgrounds.

This initiative offers a replicable model for institutions seeking to expand access to STEM education while advancing open science practices. By centering project ideas on students’ interests, the program kept participants personally invested in their work. It demonstrates that targeted recruitment, community-focused applications, and learning spaces that welcome learners at all data science skill levels can effectively build foundational programming knowledge and broaden engagement in data science. The presentation will share programmatic details, participant demographics and outcomes, lessons learned, and recommendations for educators and institutions developing similar initiatives. We will also discuss challenges, including virtual engagement strategies, varying levels of technology access among participants, and approaches to sustaining momentum over an extended period.
Keywords:
Open science, open source, equity in education, data science, hands-on learning.