Sentiment Analysis of 2019 Presidential Candidates on Kompas.com through Topic-driven Crawlers

Muhammad Nasar

Abstract


Indonesian society is moving into the digital age. Web technology is now widely used as a medium of information in many fields, including social affairs, business, education, and government. This encourages various parties to make use of the data available on the web. One such use is sentiment analysis through web mining: stakeholders want to know whether opinion about their organization is positive, neutral, or negative, and sentiment analysis can automatically measure the impressions or attitudes expressed in written language. Unfortunately, the web is designed for human readability, not for computer analysis. Web documents have no standard structure, and their text is mixed with non-content elements (HTML tags, JavaScript, images, ads, etc.) that must be removed before data mining. This study constructs a basic topic-driven web crawler as a model for harvesting formal news from the web. The Google search engine is used for topic search. The crawler's output, unstructured data in the form of HTML documents, is filtered to isolate and clean the targeted text. The cleaned text is then processed with the Naïve Bayes algorithm to determine the polarity of its sentiment. The case study is news about the University of Muhammadiyah Malang on kompas.com, one of Indonesia's online news media. The experiment showed that pages on the requested topic could be retrieved from kompas.com and that the sentiment of the extracted text could be analyzed.
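The filtering and classification steps described in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the class and function names (`TextExtractor`, `clean_html`, `NaiveBayes`) are hypothetical, the classifier is a plain multinomial Naïve Bayes with add-one smoothing, and the tiny English training set is invented purely for illustration (the actual corpus would be Indonesian news text). Crawling and the Google topic search are left out.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def clean_html(html):
    """Return the text content of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, for illustration only.
train_docs = [
    "good excellent achievement",
    "award proud success",
    "scandal bad failure",
    "protest problem bad",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_docs, train_labels)
page = "<html><script>var x='bad';</script><p>An excellent <b>success</b></p></html>"
text = clean_html(page)
label = model.predict(text)
```

Here the script content is discarded before classification, so the word "bad" inside the `<script>` element does not influence the predicted polarity. A real pipeline would also need to strip navigation, ads, and other page furniture, which tag filtering alone does not handle.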

Keywords


web mining, big data analytics, sentiment analysis, web crawler



DOI: https://doi.org/10.22219/sentra.v0i4.2421

Secretariat

Faculty of Engineering

Universitas Muhammadiyah Malang, Campus III

Jl. Raya Tlogomas 246 Malang, 65144