Nonalcoholic fatty liver disease on Twitter: a sentiment analysis

Mantovani et al. Metab Target Organ Damage 2021;1:6 https://dx.doi.org/10.20517/mtod.2021.09


Abstract
Sentiment analysis is a technique for exploring a piece of text with the aim of investigating the sentiments hidden within it. The use of sentiment analysis in health care could assist in understanding how individuals discuss and feel about a specific topic. Currently, data on the use of sentiment analysis in relation to nonalcoholic fatty liver disease (NAFLD), which is the most common chronic liver disease worldwide and is associated with hepatic and extra-hepatic complications, are scarce. Hence, the aim of this report was to assess the sentiments towards NAFLD expressed in messages posted on Twitter, one of the most popular social media platforms worldwide. We chose the hashtags #FattyLiver, #NAFLD, #NASH, and #MAFLD as terms to identify the messages related to NAFLD on Twitter. Messages containing at least one of these hashtags were collected using the standard Application Programming Interface provided by Twitter. The sentiment analysis revealed that the sentiments hidden within messages related to NAFLD were substantially neutral and that "breastcancer" and "cancer" were two of the most common words used, suggesting that a large part of the messages focused on the relationship between NAFLD and extra-hepatic cancers. Conversely, the association between NAFLD and cardiovascular disease seems to be less relevant for the Twitter community. These observations might be useful for developing better public health strategies and for promoting a constructive attitude among subjects who read about and discuss NAFLD (and its complications) on social media.
Keywords: Nonalcoholic fatty liver disease, NAFLD, Twitter, sentiment analysis

Dear Editor,

Sentiment analysis is a methodology for analyzing a piece of text in order to investigate the sentiments hidden within it (in terms of positive [+1], negative [-1], or neutral [0]) [1]. In health care, the use of sentiment analysis can help in better understanding how individuals talk and feel regarding a specific health topic [2].
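To make the positive [+1], negative [-1], and neutral [0] scoring concrete, the following is a minimal sketch of lexicon-based polarity scoring. It is not the method used in this report: the word lists `POSITIVE` and `NEGATIVE` and the functions `polarity` and `label` are hypothetical and chosen only for illustration, whereas real tools rely on much larger lexicons.

```python
import re

# Hypothetical toy lexicon, for illustration only; real sentiment
# libraries ship far larger word lists with graded scores.
POSITIVE = {"good", "great", "improve", "improved", "hope", "effective"}
NEGATIVE = {"bad", "worse", "risk", "fear", "failure", "deadly"}


def polarity(text: str) -> float:
    """Return a score in [-1, 1]: (positive hits - negative hits) / all hits."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    hits = pos + neg
    # Text with no sentiment-bearing words is treated as neutral.
    return 0.0 if hits == 0 else (pos - neg) / hits


def label(score: float) -> int:
    """Map a polarity score to +1 (positive), -1 (negative), or 0 (neutral)."""
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0
```

Under this sketch, a purely informational message with no words from either list receives a polarity of 0 and is labelled neutral.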
In the social media era, sentiment analysis is particularly useful because social media platforms are natural environments where individuals share and seek information, including information on health [2]. To date, there are scarce data [3] regarding the use of sentiment analysis related to nonalcoholic fatty liver disease (NAFLD), which is the most common chronic liver disease worldwide, affecting nearly 30% of adults in the general population and up to 70% of individuals with type 2 diabetes [4]. In addition, over the last decades, it has become increasingly clear that NAFLD is a systemic disease, which is associated not only with relevant hepatic complications (e.g., hepatic failure and hepatocellular carcinoma) but also with serious extra-hepatic complications [4] such as cardiovascular disease [5], cancer [6], and type 2 diabetes [7]. For these reasons, we believe that knowing the sentiment expressed by social media users towards NAFLD is important for understanding the impact that such information may have on individuals with NAFLD. Hence, the aim of this report was to assess the sentiments towards NAFLD expressed in messages posted on Twitter, one of the most popular social media platforms worldwide.
We chose the hashtags #FattyLiver, #NAFLD (i.e., NAFLD), #NASH (i.e., nonalcoholic steatohepatitis), and #MAFLD (i.e., metabolic associated fatty liver disease, which is the new definition proposed by several experts in 2020 [8]) as terms to identify the messages related to NAFLD on Twitter, as they are the most popular. Messages (commonly referred to as "tweets") containing at least one of these hashtags were collected using the standard Application Programming Interface (API) provided by Twitter. Briefly, an API is a software intermediary that allows two applications to talk to each other. The standard API of Twitter provides a subset of the current tweets and their metadata [2]. However, it has several limitations [2]. For instance, the data obtained for every tweet are quite limited, the tweets obtained are a sample of the total tweets, and the number of requests per minute is limited as well [2]. For each tweet, we retrieved the text of the tweet, including any emojis. For each hashtag, we decided to extract the first 500 tweets (when permitted by the API) posted in English between 2019-01-01 and 2021-08-20. All analyses were performed in Python 3.9.5 using the following libraries: "BeautifulSoup", "Tweepy", "NLTK", and "TextBlob".
The sentiment analysis essentially revealed a neutral sentiment (polarity equal to 0). For tweets containing the hashtag #FattyLiver the most common words used were: "fattyliver", "liver", "tests", "breastcancer", "screening", and "cancer" (panel B). For tweets containing the hashtag #NAFLD the most common words used were: "nafld", "liver", "disease", "fatty", "breastcancer", and "nonalcoholic" (panel D). For tweets containing the hashtag #NASH the most common words used were: "nash", "cancer", "fattyliver", "breastcancer", and "tests" (panel F). For tweets containing the hashtag #MAFLD the most common words used were: "nafld", "liver", "disease", "fatty", "breastcancer", and "nonalcoholic" (panel D).
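The hashtag filtering and word-frequency steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in this report: it relies only on the Python standard library rather than Tweepy or NLTK, and the names `STOP_WORDS`, `has_hashtag`, and `most_common_words`, as well as the short stop-word list, are hypothetical.

```python
import re
from collections import Counter

# Hashtags named in the report, lower-cased for case-insensitive matching.
HASHTAGS = {"#fattyliver", "#nafld", "#nash", "#mafld"}

# Assumed, deliberately short stop-word list; a fuller English list
# would normally be used in practice.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "is", "to", "with", "on"}


def has_hashtag(tweet: str) -> bool:
    """True if the tweet contains at least one of the chosen hashtags."""
    tags = {tag.lower() for tag in re.findall(r"#\w+", tweet)}
    return bool(tags & HASHTAGS)


def most_common_words(tweets, n=5):
    """Count words across matching tweets and return the n most frequent."""
    counts = Counter()
    for tweet in tweets:
        if not has_hashtag(tweet):
            continue
        # Drop URLs, then keep alphabetic tokens; "#NAFLD" becomes "nafld".
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        words = re.findall(r"[a-z]+", text)
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)
```

Because the "#" symbol is stripped during tokenization, the hashtags themselves (e.g., "nafld" or "fattyliver") are counted as ordinary words in this sketch, which is consistent with their appearance among the most common words reported above.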
From our analysis, it emerged that the sentiments hidden within tweets related to NAFLD were mostly neutral. Interestingly, it also appeared that "breastcancer" and "cancer" were two of the most common words used in the tweets related to NAFLD. In this context, in a 2021 meta-analysis of 10 cohort studies with a total of 182,202 middle-aged individuals (25% with NAFLD) and 8485 incident cases of extra-hepatic cancers at different sites over a median follow-up of 5.8 years, Mantovani et al. [6] showed that NAFLD was associated with a moderately increased risk of developing extra-hepatic cancers (especially gastrointestinal cancers, breast cancer, and gynaecological cancers). The Twitter community seemed to post several messages related to NAFLD containing information about the association between NAFLD and extra-hepatic cancers. The reason why the focus seemed greater for breast cancer is unknown. Because we do not know the sex of the individuals who posted the messages, our ability to analyze this aspect is limited. In addition, it is plausible that a portion of the tweets was posted by the communication offices of academic journals, which might be aiming to emphasize some aspects of the natural history of NAFLD. Conversely, it seems that the association between NAFLD and cardiovascular disease (which is, to date, the leading cause of death in NAFLD patients [4]) is not adequately recognized by the Twitter community. However, it is important to note that awareness of the strong association between NAFLD and cardiovascular disease is also low among some health practitioners. For instance, using a national digestive disease specialists survey on cardiovascular risk management in Spanish hospitals, Iruzubieta et al. [9] documented that approximately one-fifth of respondents performed an elementary physical examination to address the cardiovascular risk in NAFLD patients.
Our report has some limitations. Although we extracted many tweets focusing on NAFLD (in English), this is a random sample and might not be completely representative. Moreover, not all tweets were indexed or made available by the standard API search interface. Finally, our analysis could be biased by the limits of the standard API of Twitter. Hence, additional research needs to consider other keywords, other languages, and other social media channels.
In conclusion, the use of sentiment analysis techniques revealed that tweets on NAFLD were substantially neutral and mainly focused on the relationship between NAFLD and extra-hepatic cancers (especially breast cancer). The association between NAFLD and cardiovascular disease, instead, appeared to be less relevant for the Twitter community. These observations may be useful for developing better public health strategies and for promoting a constructive attitude among subjects who read about and discuss NAFLD (and its complications) on social media.

Authors' contributions
Made substantial contributions to conception and design of the study: Mantovani A, Dalbeni A
Performed data analysis and interpretation: Mantovani A
Approved the version to be published and agreed to be accountable for all aspects of the work: Mantovani A, Beatrice G, Zusi C, Dalbeni A
Performed the draft: Mantovani A
Critical revision of the intellectual content: Beatrice G, Zusi C, Dalbeni A

Availability of data and materials
Not applicable.

Financial support and sponsorship
None.