Abstract

Evaluating the readability of web documents has gained attention for several reasons, including improving the effectiveness of writing and reaching a wider audience. Current practice relies on statistical measures to evaluate a document's readability. In this paper, we propose a machine learning-based model to compute the readability of web pages. The model also computes the minimum educational standard (grade level) required to understand the contents of a web page. It classifies web pages as highly readable, readable or less readable using a specified feature set comprising sentence count, word count, syllable count, type-token ratio and lexical ambiguity. To increase the usability of the proposed model, we have developed an accessible browser extension that assesses every web page loaded into the browser.
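As an illustration of the surface features named above, the following minimal sketch computes sentence count, word count, syllable count and type-token ratio from plain text. It is not the paper's implementation: the function names `count_syllables` and `extract_features`, the regular-expression tokenization and the vowel-group syllable heuristic are all assumptions made for this example. Lexical ambiguity is omitted because it would require a sense inventory that the abstract does not specify.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (a heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("page"), except in "-le" endings ("readable").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def extract_features(text: str) -> dict:
    """Return the surface features of `text` as a dictionary (hypothetical feature names)."""
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters, optionally with one internal apostrophe.
    tokens = [w.lower() for w in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)]
    return {
        "sentence_count": len(sentences),
        "word_count": len(tokens),
        "syllable_count": sum(count_syllables(w) for w in tokens),
        # Type-token ratio: distinct words divided by total words.
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


if __name__ == "__main__":
    print(extract_features("The cat sat. The cat ran!"))
```

A feature dictionary of this kind could be passed to any standard classifier to predict one of the three readability classes; the choice of classifier is left open here because the abstract does not name one.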

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)