Abstract
Although Wikipedia is one of the most popular sources of information in the world, it is often criticized for the poor quality of its content. In this online encyclopaedia, articles on the same topic can be created and edited independently in different languages. Some of these language versions can provide valuable information on specific topics. Wikipedia articles may include an infobox, which is used to collect and present a subset of important information about the article's subject. This study presents a method for assessing the quality of Wikipedia articles and of the information contained in their infoboxes. Choosing the best language version of a particular article makes it possible to enrich the information in less developed language editions of that article.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lewoniewski, W. (2017). Enrichment of Information in Multilingual Wikipedia Based on Quality Analysis. In: Abramowicz, W. (eds) Business Information Systems Workshops. BIS 2017. Lecture Notes in Business Information Processing, vol 303. Springer, Cham. https://doi.org/10.1007/978-3-319-69023-0_19
DOI: https://doi.org/10.1007/978-3-319-69023-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69022-3
Online ISBN: 978-3-319-69023-0
eBook Packages: Business and Management (R0)