cds490

COVID Disinformation Model

Social media greatly accelerates communication across networks for all types of content, including disinformation. The novel coronavirus has been a target of disinformation campaigns, with content circulating on topics such as vaccine development and COVID conspiracy theories targeting Bill Gates. If instances of disinformation can be detected automatically, moderators will be able to identify emerging content before it goes viral and launch counter-messaging.

Pre-labeled data was used to train a Long Short-Term Memory (LSTM) neural network, written in TensorFlow, to recognize instances of COVID disinformation on Twitter. The repositories for the datasets can be found below:

After the data is scraped (n = 10,678), the tweets are pre-processed with spaCy (stopword removal, lemmatization, etc.).
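As a rough illustration of this kind of pre-processing, here is a minimal sketch. It is not the exact code from the notebooks: the function name `preprocess` and the choice of filters are assumptions made for the example. It tries to load the `en_core_web_sm` pipeline so that lemmas are available, and falls back to a blank English pipeline (tokenizer and stopword list only) if that model is not installed.

```python
import spacy

# Assumption: en_core_web_sm may or may not be installed. A blank English
# pipeline still tokenizes and flags stopwords, but it produces no lemmas.
try:
    nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])
except OSError:
    nlp = spacy.blank("en")


def preprocess(text: str) -> list[str]:
    """Return lowercased lemmas (or plain tokens) with noise removed."""
    doc = nlp(text)
    tokens = []
    for tok in doc:
        # Drop stopwords, punctuation, whitespace, and URLs.
        if tok.is_stop or tok.is_punct or tok.is_space or tok.like_url:
            continue
        # lemma_ is an empty string when no lemmatizer is in the pipeline,
        # so fall back to the lowercased token text in that case.
        tokens.append(tok.lemma_.lower() or tok.lower_)
    return tokens


print(preprocess("The vaccines are being developed!"))
```

The resulting token lists would still need to be mapped to integer IDs and padded to a fixed length before they can be fed to a neural network.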

The tokenized text is then used to train the LSTM model, which achieves an accuracy of 0.95 on the validation set.
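For readers unfamiliar with this setup, the sketch below shows what an LSTM text classifier in TensorFlow/Keras can look like. It is an illustration under stated assumptions, not the trained model from this repository: the vocabulary size, sequence length, layer sizes, and binary labels are all assumed, and the data is random placeholder input, so the accuracy it reports is meaningless.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 10_000  # assumed vocabulary size
MAX_LEN = 50         # assumed padded tweet length, in tokens


def build_model(vocab_size: int = VOCAB_SIZE,
                embed_dim: int = 64,
                lstm_units: int = 64) -> tf.keras.Model:
    """Embedding -> LSTM -> sigmoid output for a binary label (assumed)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
        tf.keras.layers.LSTM(lstm_units),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Random placeholder data standing in for integer-encoded tweets and labels.
rng = np.random.default_rng(0)
X = rng.integers(1, VOCAB_SIZE, size=(32, MAX_LEN))
y = rng.integers(0, 2, size=(32,))

model = build_model()
# Hold out 25% of the samples as a validation set.
history = model.fit(X, y, validation_split=0.25, epochs=1, verbose=0)
print(history.history["val_accuracy"])
```

With real data, `X` would hold the padded token IDs produced from the pre-processed tweets and `y` the disinformation labels.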

To confirm the importance of disinformation analysis, I created a graph of the mentions network. It clearly shows that there are "superspreaders" of disinformation. If we can detect disinformation and build these networks, we can identify the key influencers (nodes) and target counter-messaging more effectively.
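A mentions network of this kind can be sketched with NetworkX as follows. The helper names (`build_mentions_graph`, `top_influencers`), the toy tweets, and the use of weighted in-degree as the influence measure are assumptions for illustration; the actual analysis may rank nodes differently.

```python
import re

import networkx as nx

MENTION_RE = re.compile(r"@(\w+)")


def build_mentions_graph(tweets):
    """Build a directed graph from (author, text) pairs.

    An edge author -> mentioned_user is added for every @mention, and its
    "weight" attribute counts how often that mention occurs.
    """
    graph = nx.DiGraph()
    for author, text in tweets:
        for mentioned in MENTION_RE.findall(text):
            if graph.has_edge(author, mentioned):
                graph[author][mentioned]["weight"] += 1
            else:
                graph.add_edge(author, mentioned, weight=1)
    return graph


def top_influencers(graph, k=3):
    """Rank nodes by weighted in-degree (how often they are mentioned)."""
    ranked = sorted(graph.in_degree(weight="weight"),
                    key=lambda pair: pair[1],
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data, not taken from the dataset.
toy_tweets = [
    ("user_a", "@user_c shared this claim"),
    ("user_b", "@user_c @user_d have you seen this?"),
    ("user_a", "@user_c posted it again"),
]
toy_graph = build_mentions_graph(toy_tweets)
print(top_influencers(toy_graph, k=1))
```

Other centrality measures, such as betweenness or PageRank, could be swapped in to identify key nodes, depending on what "influence" should mean for counter-messaging.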

