Guardians of Truth: The spread of misinformation through deepfakes and fake news threatens trust online. Develop AI-powered solutions to combat these issues.
In the era of rampant deepfakes and misinformation, Project Bloodline confronts these rising threats. Our platform safeguards users from deceptive content while providing reliable news updates, empowering them to stay informed and resist manipulation.
In today's digital landscape, the proliferation of deepfake technology and fake news poses significant threats to society. Deepfakes, which are highly realistic manipulated media, have the potential to deceive individuals and manipulate public opinion. Similarly, the spread of fake news undermines trust in information sources and leads to widespread misinformation. Recognizing the urgent need to address these challenges, our team has developed the Bloodline project, which offers a suite of tools designed to safeguard individuals from the perils of manipulated media and false information.
Key Features
- Guardian (Deepfake Detection): Advanced AI model that analyzes images to expose deepfakes, protecting users from manipulated media.
- TruthGuard (Fake News Detection): Combats the spread of misinformation by pinpointing fake news articles, promoting truth and transparency.
- News Update (News-Bot): Delivers trustworthy news updates from reputable sources, enabling users to stay informed about global events.
Demo video: `vid_rec.1.1.mov`
- HTML
- CSS
- JavaScript
- Bootstrap
- Express.js
- Node.js
- PostgreSQL
- FastAPI
- Vision Transformers (CLIP ViT-14) — Repo Link
- Fully Connected Neural Network (FCNN)
- LSTM
- Gemini
- fastai
- OpenCV (cv2)
- Pillow
- TensorFlow
- Source: Collected from various Generative AI (Gen-AI) models, including Generative Adversarial Networks (GANs), diffusion models, and DALL-E.
- Description: This dataset comprises a diverse collection of manipulated media generated by cutting-edge AI models. These deepfake videos and images serve as the foundation for training our AI models to accurately detect and classify manipulated content. By leveraging data sourced from state-of-the-art Generative AI models, our deepfake detection system is equipped to identify sophisticated manipulations and ensure the integrity of visual media.
Dataset links:
Diffusion Models: Link
GANs: Link
- Source: Sourced from Kaggle datasets containing a mixture of authentic and fabricated news articles.
- Description: The fake news detection dataset consists of a comprehensive collection of news articles spanning various topics and genres. This dataset enables our AI models to distinguish between genuine and fabricated news content, facilitating the identification and mitigation of misinformation. By training on a diverse range of news articles, our fake news detection system is equipped to combat the spread of false information and promote the dissemination of accurate and reliable news sources.
Dataset links:
We implemented deepfake image classification using the ProGAN dataset. The pipeline reads and preprocesses images from the dataset, extracts features with the CLIP (Contrastive Language-Image Pre-training) model, visualizes the feature embeddings, and classifies images using both nearest-neighbor and linear approaches.
- Reading data from ProGAN (real and fake images) for the training phase
- Iterates over each file and reads the image using cv2.imread()
- Checks that the image is in RGB format and keeps it only if it satisfies this criterion
- Resizes the image to (224, 224, 3)
- JPEG compression: converts the image to .jpg at 80% quality
- Gaussian blur: applies a Gaussian blur with kernel size (5, 5) and standard deviation 0 to half of the images, selected with np.random.rand() < 0.5
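The preprocessing steps above can be sketched as a single function. This is an illustrative reimplementation using Pillow (which is in the project's stack) rather than cv2, and the blur radius is only an approximation of the (5, 5) kernel described:

```python
import io

import numpy as np
from PIL import Image, ImageFilter

def preprocess(img: Image.Image, rng: np.random.Generator) -> np.ndarray:
    """Sketch of the training-time preprocessing described above."""
    # Keep only RGB images, as in the original pipeline.
    if img.mode != "RGB":
        raise ValueError("expected an RGB image")
    # Resize to the 224x224x3 input size used for CLIP.
    img = img.resize((224, 224))
    # JPEG compression at 80% quality (round-trip through an in-memory buffer).
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=80)
    buf.seek(0)
    img = Image.open(buf).convert("RGB")
    # Gaussian blur applied to roughly half the images at random.
    if rng.random() < 0.5:
        img = img.filter(ImageFilter.GaussianBlur(radius=2))
    return np.asarray(img)
```

The JPEG round-trip and random blur act as augmentations that make the detector robust to the recompression and smoothing that images typically undergo when shared online.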
Features from the images are extracted using the CLIP (Contrastive Language-Image Pre-training) model. The features are then saved for further analysis and classification tasks.
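Given per-image feature vectors from CLIP, the nearest-neighbor classification mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's code; it assumes cosine similarity over L2-normalised features:

```python
import numpy as np

def nearest_neighbor_predict(train_feats, train_labels, query_feats):
    """Label each query with the label of its closest training feature
    (cosine similarity, computed on L2-normalised vectors)."""
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    query = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sims = query @ train.T  # pairwise cosine-similarity matrix
    return train_labels[np.argmax(sims, axis=1)]
```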
For linear classification, we train a fully connected neural network (FCNN) using TensorFlow's Keras API:

- Data preparation: `features_real` and `features_fake` are concatenated into `X_train`, the corresponding labels `Y_train` are defined, and both are converted into NumPy arrays.
- Hyperparameters:
  - Batch size: 30 (as instructed in the research paper)
  - Epochs: 7 (the model reaches a training accuracy of 1.0, so it is trained for only a few epochs)
- Model structure: two fully connected layers:
  - The first layer has 768 units and uses the ReLU activation function.
  - The output layer has 1 unit with a sigmoid activation function, suitable for binary classification.
- Model compilation: the model is compiled with the Adam optimizer and the binary cross-entropy loss function.
- Training set: ProGAN images.
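The FCNN described above can be sketched with Keras as follows. The 768-dimensional input is an assumption matching the CLIP feature size (the text only gives the hidden-layer width):

```python
import numpy as np
import tensorflow as tf

def build_classifier(feature_dim: int = 768) -> tf.keras.Model:
    """Two-layer FCNN as described: Dense(768, relu) -> Dense(1, sigmoid)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(feature_dim,)),
        tf.keras.layers.Dense(768, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Adam optimizer + binary cross-entropy loss, as stated above.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then be `model.fit(X_train, Y_train, batch_size=30, epochs=7)`, matching the hyperparameters listed above.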
| Dataset | Accuracy | SOTA |
|---|---|---|
| CycleGAN | 97.3% | 98.5% |
| DALL-E and LAION | 92.9% | 86.78% |
| GLIDE-100 and LAION | 86.9% | 79% |
In 2021, viral deepfakes of Tom Cruise stunned viewers worldwide. Our model can accurately distinguish real images from fake ones, providing crucial insight during such incidents.
- News input by the user in the form of articles
- Prediction as real or fake based on our model
- Providing corrected news by integrating the Gemini model, based on the article given by the user
- Merged data from multiple sources containing factual and news articles, both real and fake, to create a comprehensive dataset.
- Modified input text by converting it to lowercase, removing digits, new lines, HTTP links, HTML tags, and special characters to ensure uniformity and improve model performance.
- Utilizes the TextDataLoaders class from the fastai library to load the prepared dataset.
- Reads data from CSV files, including news articles labeled as real or fake.
- Implements a text classifier using the AWD_LSTM architecture.
- Utilizes a bidirectional LSTM (Long Short-Term Memory) neural network for text classification.
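The text-cleaning step above (lowercasing, then stripping digits, newlines, HTTP links, HTML tags, and special characters) can be sketched as one function. The exact regexes here are an assumption, not the project's code:

```python
import re

def clean_text(text: str) -> str:
    """Normalise an article before feeding it to the classifier (sketch)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop HTTP(S) links
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"\d+", " ", text)           # drop digits
    text = re.sub(r"[^a-z\s]", " ", text)      # drop special characters
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace/newlines
```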
- Creating a DataLoader object `dls` containing the training and validation data (validation split: 20%)
- Initializing a text classifier model (`learn`) using the AWD-LSTM architecture, specifying the DataLoader (`dls`) and the evaluation metric (accuracy)
- Training the text classifier model (`learn`) using the one-cycle policy
  - Epochs: 10
  - Learning rate: 0.01
- Creating an interpretation object (`interp`) from the trained model (`learn`) for classification analysis
- Generating a confusion matrix plot from the interpretation object (`interp`), showing the model's performance on each class
The LSTM-based model obtained 93% accuracy. Below is the confusion matrix on the training data.
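A confusion matrix like the one reported above can be computed directly from true and predicted labels; a minimal NumPy sketch (fastai's `ClassificationInterpretation` does this internally):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

The diagonal counts the correct predictions; off-diagonal cells show which class is being confused for which.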
We gave the model a piece of fake news about a fake Twitter account impersonating Republic TV, which stirred controversy with biased polls and was mistaken for the official channel. Despite being called out as parody, the account continued to tweet divisive polls, prompting criticism from users. Our model correctly predicted it as fake.
We gave another prompt with current news about the placement scenario at the IITs, and the model correctly identified the situation.
Guardian is an advanced AI model specifically designed to analyze images and detect deepfakes. Leveraging state-of-the-art technology, Guardian employs sophisticated algorithms to scrutinize images and identify signs of manipulation.
By detecting deepfakes, Guardian helps individuals safeguard their online presence and protect themselves from potential harm. Furthermore, Guardian also verifies the authenticity of images, ensuring that users can trust the content they encounter online.
TruthGuard is a powerful tool developed to detect fake news articles and combat the spread of misinformation. Utilizing cutting-edge machine learning techniques, TruthGuard analyzes news content to distinguish between genuine and fabricated information.
By training on a diverse dataset comprising both real and fake news articles, TruthGuard has been fine-tuned to accurately classify news content and provide users with reliable information. Additionally, TruthGuard cross-references its results with a Gemini model to identify relevant news articles, ensuring that users receive timely and pertinent updates.
To ensure the security of user data, the Bloodline project implements a robust sign-in process with secure storage mechanisms. Users can securely access the platform by providing their credentials through a POST request system.
User data is stored in a PostgreSQL database, utilizing industry-standard encryption techniques to safeguard sensitive information. By employing bcrypt hashing for password storage, the Bloodline project prioritizes the protection of user privacy and security.
The Bloodline project incorporates advanced image verification capabilities to identify and combat deepfakes. Leveraging a comprehensive dataset sourced from state-of-the-art AI models such as GAN and DALL-E, the project's AI models have been trained to recognize patterns indicative of image manipulation.
Using the CLIP ViT-14 model for feature extraction, the project's AI classifies images as either genuine or manipulated, giving users assurance of authenticity.
In today's rapidly evolving world, access to accurate and reliable news is essential for making informed decisions and understanding global events. To facilitate this, the Bloodline project integrates a news-bot feature, providing users with the latest news updates from reputable sources. By aggregating news content from trusted sources, the news-bot ensures that users have access to timely and credible information, enabling them to stay informed and engaged with current affairs.
Follow these steps to set up the project locally on your machine.
Make sure you have the following installed on your machine:
- Git
- Node.js
- npm (Node Package Manager)
- PostgreSQL
```bash
git clone https://github.com/toshan07/Guardians-of-Truth
cd Guardians-of-Truth
```
**Installation**

Install the project dependencies using npm:

```bash
npm install
```

Install the model packages from `requirements.txt`:

```bash
pip install -r requirements.txt
```

**Running the Project**

Open the `ejs_server` directory and run the `index.js` file:

```bash
npm i
nodemon index.js
```

Open the `Test_Ml_Models` directory, then run the `model_deepfake` and `model_fake_news` files to start the FastAPI server used to serve the models.
Team: The BLOODLINE 🔥🔥
The project was developed by the following contributors:
- Toshan Gupta
- Naman Singhania
- Mayank Goel
- Vishnu