This is the repo of the OpenEdu B Team submitted as part of the deploy(impact) 2022 edition sponsored by women++.
The initial challenge was defined as:
create a structure to describe the metadata of content found on OpenEdu.ch and research/implement ways to discover content that has been published (crawlers, moderation)
Talking to the challenge-setter, we further specified key requirements:
- Limited to content under the CC BY-SA license
- Focus on content already hosted in the Wiki Universe
- Use links to refer to existing content instead of uploading material
- Avoiding duplication is not key
- Finding relevant content is important (even Wiki staff are often unaware of what exists)
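The requirements above point toward a link-based metadata record. The following is a minimal sketch of that idea, not the project's ontology: the class name `ContentRecord`, its fields, and the `ALLOWED_LICENSE` constant are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Assumption: a single accepted license string; the real project may encode licenses differently.
ALLOWED_LICENSE = "CC BY-SA"


@dataclass
class ContentRecord:
    """Hypothetical metadata record that links to content instead of storing it."""
    title: str
    url: str
    license: str = ALLOWED_LICENSE
    keywords: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the record is acceptable."""
        problems = []
        parsed = urlparse(self.url)
        # The record must reference existing content through an absolute web link.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("url must be an absolute http(s) link")
        if self.license != ALLOWED_LICENSE:
            problems.append(f"license must be {ALLOWED_LICENSE}")
        if not self.title.strip():
            problems.append("title must not be empty")
        return problems
```

A record passes when `validate()` returns an empty list; a moderator-facing form could show the returned messages directly.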
We identified the main users as content creators, moderators, content searchers, and admins.
Based on these insights, we built our solution with the following key elements:
- Ontology which structures the existing data in classes and relationships
- Better user experience by simplifying forms and flows, and suggesting features like Wiki Login, moderation interface, and dashboards.
- Automation (based on NLP) to reduce the work of the content uploader and moderator, and identify records with related content
- Crawlers and scrapers to identify relevant content on Wikimedia & other Wiki* sites
- A robust, scalable architecture using Azure, Docker containers, and microservices.
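To illustrate what "identify records with related content" could look like, here is a deliberately simple stand-in. It uses Jaccard similarity over word sets, which is not necessarily what the team's NLP code does; the function names, the `records` mapping, and the default threshold are assumptions made for this sketch.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_records(
    query: str, records: dict[str, str], threshold: float = 0.2
) -> list[tuple[str, float]]:
    """Return (record_id, score) pairs at or above the threshold, best first."""
    query_tokens = tokenize(query)
    scored = [(rid, jaccard(query_tokens, tokenize(text))) for rid, text in records.items()]
    # Keep only sufficiently similar records and sort by descending score.
    return sorted((p for p in scored if p[1] >= threshold), key=lambda p: p[1], reverse=True)
```

A production pipeline would more likely rely on embeddings or TF-IDF weighting, but the interface (text in, ranked related records out) would stay similar.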
Figma: demonstrates the user experience. The video and documentation are stored under docs/ux-ui.
Demo website: to demonstrate the working integration of backend, crawlers/scrapers, data, and NLP results (for related content). Detailed information about the website backend and frontend can be found here
The repository is organized into three top-level folders:
- src: Location of all code, along with any parsable files
- docs: Location of all documentation regarding the project
- test: Any test files that we have produced
These three folders are then sub-divided into different modules, which include (depending on the presence of the relevant files):
- backend: source code for containers and microservices
- data-science: ontology, Natural Language Processing (NLP) code
- devops: database, container architecture
- frontend: demo website (linked above)
- presentation: final presentation of 19 Nov 2022, including video
- ux-ui: user journeys, mockups, Figma website
- Andrina Beuggert: Product Owner (email, main contact person)
- Michal Burgunder: Data Scientist (tech lead, ontology) (email)
- Maria Giakoumelou: Team Satellite (email)
- Elena Kameneva: UX/UI Designer (email)
- Noëline Lepais: Data Scientist (NLP) (email)
- Sofia Strukova: Data Scientist (ontology) (email)
- Ceylan Thompson: Scrum Master (email)
- Mascha Tikhomirova: Backend (email)