
Data Science Against Disinformation: How Artificial Intelligence and Machine Learning Can Fact-Check Claims of Digital Election Campaigns

Published by Jessica Weisman-Pitts

Posted on September 1, 2021

Last updated: January 21, 2026

By Professor Dr Mohammad Mahdavi, Programme Leader Data Science, AI, and Digital Business at GISMA Business School

Going digital allows election campaigns to promote their candidates more effectively and economically. Campaigners can collect and analyse users’ public data on online social networks to target potential voters, approach them with personalised messages, and persuade them to vote for particular candidates.

While this approach is generally sound, what usually happens in election campaigns brings to mind the famous quote attributed to Sergey Nechayev: “The end justifies the means.” Some candidates do not hesitate to resort to any means, ethical or unethical, to increase their election chances or harm their rivals’ reputations. One of these unethical means is to propagate disinformation, which is defined as false information that is spread deliberately to deceive. As an example, Donald Trump once tweeted, “I WON THIS ELECTION, BY A LOT!”

Traditionally, journalists have had to spend hours collecting and analysing archival data to fact-check such a claim. Although the tweet above was easy to refute using the 2020 US election results, fact-checking is not always that easy. Consider a hypothetical candidate who claims that, during his last four-year presidency, “the country became the first economic power of Europe.” Here, our journalist faces a harder task. They first need to pick this particular sentence out of the candidate’s long speech, because it contains a claim. Next, they need to settle on definitions for “being the first economic power” and “Europe”. Then, they need to collect data on the economic indexes of different European countries over a specific period. Finally, they must analyse the data to see whether the numbers support the candidate’s claim. This process, which is usually tedious, time-consuming, and error-prone, has to be repeated for every candidate and every claim.

This is where data science, artificial intelligence, and machine learning can help our journalist. Data science approaches can (semi-)automate each of the above tasks. First, we can train a claim detection classifier that processes every sentence of a candidate’s speech and automatically keeps those that contain a claim. Second, a keyword extraction approach can automatically extract the most important phrases of the filtered sentence, such as “the country”, “the first economic power”, and “Europe”. Third, an information retrieval system can automatically search for and retrieve the archival datasets, structured or unstructured, related to these keywords. Such a dataset could be an unstructured economic report containing the specified keywords, or a structured table of economic indexes of different countries that is annotated with similar keywords. Finally, a trained model can take the collected datasets and the candidate’s original claim and estimate a truth score for the claim based on the collected data.
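The four steps above can be sketched as a small pipeline. This is only an illustration under loud assumptions: a hand-written cue-word rule stands in for the trained claim detection classifier, stopword filtering stands in for keyword extraction, keyword overlap stands in for the information retrieval system, and a rank-based formula stands in for the trained truth-score model. All names (`CLAIM_CUES`, `Dataset`, `truth_score`) and all numbers are hypothetical and do not come from any real system or dataset.

```python
import re
from dataclasses import dataclass

# Hypothetical cue words standing in for a trained claim detection classifier.
CLAIM_CUES = {"became", "first", "highest", "lowest", "increased", "decreased", "won"}
# Tiny illustrative stopword list, not a complete one.
STOPWORDS = {"the", "of", "a", "an", "our", "and", "to", "in", "for"}

@dataclass
class Dataset:
    name: str
    tags: set             # keywords annotating the dataset
    gdp_by_country: dict  # made-up illustrative figures

def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())

def detect_claims(sentences):
    """Step 1: keep sentences that contain at least one cue word."""
    return [s for s in sentences if CLAIM_CUES & set(tokenize(s))]

def extract_keywords(sentence):
    """Step 2: keep the non-stopword tokens as keywords."""
    return {t for t in tokenize(sentence) if t not in STOPWORDS}

def retrieve(keywords, archive):
    """Step 3: rank datasets by keyword overlap and drop those with none."""
    scored = [(len(keywords & d.tags), d) for d in archive]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [d for overlap, d in ranked if overlap > 0]

def truth_score(country, dataset):
    """Step 4: toy score, 1.0 if the country ranks first, lower otherwise."""
    ranking = sorted(dataset.gdp_by_country,
                     key=dataset.gdp_by_country.get, reverse=True)
    return 1.0 / (ranking.index(country) + 1)

speech = [
    "Thank you all for coming.",
    "Our country became the first economic power of Europe.",
]
archive = [
    Dataset("europe_gdp", {"europe", "economic", "power", "gdp"},
            {"A": 4.0, "B": 3.0, "C": 2.5}),
    Dataset("rainfall", {"weather", "rain"}, {}),
]

claims = detect_claims(speech)
keywords = extract_keywords(claims[0])
hits = retrieve(keywords, archive)
# Assume the candidate speaks for the hypothetical country "B".
score = truth_score("B", hits[0])
```

In this toy run, only the second sentence is kept as a claim, only the `europe_gdp` dataset is retrieved, and country "B" ranks second in the made-up figures, so the claim receives a score of 0.5. A real system would replace each stand-in with a trained component.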

Seems like magic? Wait a second! We are not yet at the point where this whole process can be automated as described above. Although we can technically build all the described systems, in practice their performance might not be that impressive. The main reason is that training an effective model for each of these steps usually requires a large set of training examples. For example, to train the claim detection classifier, we need to provide thousands (or even millions) of example sentences that are labelled as containing or not containing a claim.
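To make the role of labelled examples concrete, here is a minimal sketch of a claim detection classifier trained from such sentences. It assumes a multinomial naive Bayes model with add-one smoothing, which is one common choice for text classification and not necessarily what a production fact-checking system would use. The six training sentences are made up for illustration, and a set this small is far below the thousands of examples mentioned above.

```python
import math
import re
from collections import Counter

# Made-up labelled sentences: True = contains a claim, False = does not.
TRAINING = [
    ("unemployment fell by ten percent", True),
    ("we built the largest economy in europe", True),
    ("crime dropped to the lowest level", True),
    ("thank you all for coming", False),
    ("i love this city", False),
    ("let us work together", False),
]

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Count word frequencies per label for multinomial naive Bayes."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, label in examples:
        word_counts[label].update(tokenize(text))
        label_counts[label] += 1
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab

def is_claim(text, model):
    """Return True if the 'claim' label has the higher log-probability."""
    word_counts, label_counts, vocab = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_examples)
        for token in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]

model = train(TRAINING)
claim_prediction = is_claim("The economy grew by five percent", model)
greeting_prediction = is_claim("Thank you for your support", model)
```

With so few examples the model only recognises wording it has already seen, which is exactly why large labelled sets matter in practice.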

Where training data are insufficient, these data science approaches might not always generate a correct and complete result set. The claim detection classifier might miss some claims or wrongly flag ordinary sentences as claims. Similarly, the keyword extraction approach might miss some keywords or extract phrases that are not keywords. The same is true for the information retrieval system, which might miss some relevant datasets or retrieve some irrelevant ones. The final task, estimating the truth score of the claim from the collected data, is perhaps the most challenging step, because this estimate can be subjective. If we human beings cannot agree on the truth score of a claim based on the current facts and data, how can we expect machines to do this task for us accurately?
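The two kinds of error described here, missed items and wrongly flagged items, are commonly measured with recall and precision. The sketch below computes both for a claim detector; the sentence identifiers and both sets are made-up values for illustration, not results from any real system.

```python
def precision_recall(predicted, actual):
    """Compare flagged sentence ids against the ids that truly contain claims.

    precision: share of flagged sentences that really are claims.
    recall:    share of real claims that were flagged.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Made-up example: sentences 1, 4, 7 and 9 of a speech contain claims.
actual_claims = {1, 4, 7, 9}
# The hypothetical classifier flags 1 and 4 correctly, flags 5 wrongly,
# and misses 7 and 9.
flagged = {1, 4, 5}

precision, recall = precision_recall(flagged, actual_claims)
```

Here two of the three flagged sentences are real claims (precision of two thirds) and two of the four real claims were found (recall of one half). The same two measures apply to the keyword extraction and information retrieval steps.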

Despite the inherent challenges of this fact-checking process and the imperfection of the data science approaches, they can still support humans in this process. That is why such systems are usually called “decision support systems”: they are not going to take over a human’s role completely, at least not yet! In our example, they support the journalist’s decision-making. The journalist can save hours of groundwork, since they already have some initial data and results with minimal effort, and can take these data as an initial seed for further investigation.
