8 September 2021
In May, we kicked off another Project Friday. This time, seven Cmotions colleagues with different backgrounds and levels of experience came together. As always, the aim was to make a brilliant deliverable, to learn a lot, and (most of all) to have a lot of fun.
The first challenge was to come up with a topic that would lead to a cool product everyone was excited about. After some brainstorming, we decided to enter the CommonLit Readability Prize (a Kaggle competition), which focuses on the following question:
“To what extent can machine learning identify the appropriate reading level of a passage of text, and help inspire learning?”
This question arose because CommonLit, Inc. and Georgia State University wanted to offer 3rd to 12th grade students texts at the right level of challenge, in order to stimulate the natural development of their reading skills. Until now, the readability of (English) texts has been based mainly on expert assessments or on well-known formulas, such as the Flesch-Kincaid Grade Level, which often lack construct and theoretical validity. Other, often commercial, solutions failed to meet the proof and transparency requirements.
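To give an idea of what such a formula looks like, here is a minimal Python sketch of the Flesch-Kincaid Grade Level. The coefficients are those of the published formula, but the helper names (`count_syllables`, `flesch_kincaid_grade`) and the vowel-group syllable heuristic are our own assumptions for illustration. This is not the code we used in the competition.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (a rough heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("made" -> 1), except in "-le" endings ("table" -> 2).
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: 0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    # Assumption: sentences end at '.', '!' or '?'; abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    print(round(flesch_kincaid_grade("The cat sat on the mat. It was happy there."), 2))
```

The formula only looks at sentence length and word length, so it knows nothing about vocabulary, structure or meaning. That limitation is part of why the competition asks whether machine learning can do better.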
From May to August ’21, we worked on developing a Python notebook that predicts readability scores better than the current standards. We worked with a training dataset of about 3,000 text excerpts that were rated by 27 professionals. In these months, we investigated which features affect readability the most, which models and architectures work best, and how to tweak them. The name of our team? Textfacts!
Of course, you want to know whether we succeeded in predicting text readability better than existing solutions, right? Stay tuned and follow our blogs for our stories! Do you have any questions, or do you want to know more about this cool project right now? Please visit the competition webpage or contact us via firstname.lastname@example.org.
This article was previously published on TheAnalyticsLab.nl.
This article is the result of an innovative project in our Analytics Lab. We are always curious about developments in the field of Data Science and we love to explore them ourselves. That is why we founded The Analytics Lab as part of Cmotions. Here we explore the limits and possibilities of Data Science, and we are always happy to share our experience, knowledge and vision with you.
Do you want to know more about this subject? Please contact Mike te Beest using the details below.