Guardians of Truth: The spread of misinformation through deepfakes and fake news threatens trust online. Develop AI-powered solutions to combat these issues.
In the era of rampant deepfakes and misinformation, Project Bloodline confronts these rising threats. Our platform safeguards users from deceptive content while providing reliable news updates, empowering them to stay informed and resist manipulation.
In today's digital landscape, the proliferation of deepfake technology and fake news poses significant threats to society. Deepfakes, which are highly realistic manipulated media, have the potential to deceive individuals and manipulate public opinion. Similarly, the spread of fake news undermines trust in information sources and leads to widespread misinformation. Recognizing the urgent need to address these challenges, our team has developed the Bloodline project, which offers a suite of tools designed to safeguard individuals from the perils of manipulated media and false information.
Key Features
- Guardian (Deepfake Detection): Advanced AI model that analyzes images to expose deepfakes, protecting users from manipulated media.
- TruthGuard (Fake News Detection): Combats the spread of misinformation by pinpointing fake news articles, promoting truth and transparency.
- News Update (News-Bot): Delivers trustworthy news updates from reputable sources, enabling users to stay informed about global events.
Demo video: `vid_rec.1.1.mov`
- HTML
- CSS
- JavaScript
- Bootstrap
- Express.js
- Node.js
- PostgreSQL
- FastAPI
- Vision Transformers (CLIP ViT-14) — Repo Link
- Fully Connected Neural Network (FCNN)
- LSTM
- Gemini
- fastai
- OpenCV (cv2)
- Pillow
- TensorFlow
- Source: Collected from various Generative AI (Gen-AI) models, including Generative Adversarial Networks (GANs), diffusion models, and DALL-E.
- Description: This dataset comprises a diverse collection of manipulated media generated by cutting-edge AI models. These deepfake videos and images serve as the foundation for training our AI models to accurately detect and classify manipulated content. By leveraging data sourced from state-of-the-art Generative AI models, our deepfake detection system is equipped to identify sophisticated manipulations and ensure the integrity of visual media.
Dataset links:
Diffusion Models: Link
GANs: Link
- Source: Sourced from Kaggle datasets containing a mixture of authentic and fabricated news articles.
- Description: The fake news detection dataset consists of a comprehensive collection of news articles spanning various topics and genres. This dataset enables our AI models to distinguish between genuine and fabricated news content, facilitating the identification and mitigation of misinformation. By training on a diverse range of news articles, our fake news detection system is equipped to combat the spread of false information and promote the dissemination of accurate and reliable news sources.
Dataset links:
We implemented deepfake image classification using the ProGAN dataset. The pipeline reads and preprocesses images from the dataset, extracts features with the CLIP (Contrastive Language-Image Pre-training) model, visualizes the feature embeddings, and classifies images using both nearest-neighbor and linear approaches.
- Reading data from ProGAN (real and fake images) for the training phase
- Iterates over each file and reads the image using cv2.imread()
- Checks that the image is in RGB format and keeps it only if it satisfies this criterion
- Resizes the image to (224, 224, 3)
- JPEG compression: converts the image to .jpg at 80% quality
- Gaussian blur: applies a Gaussian blur with kernel size (5, 5) and standard deviation 0 to half of the images, selected with np.random.rand() < 0.5
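The preprocessing steps above can be sketched as a single function. This is an illustrative reimplementation using Pillow (which is in the project's stack) rather than cv2, and the blur radius is only an approximation of the (5, 5) kernel described:

```python
import io

import numpy as np
from PIL import Image, ImageFilter

def preprocess(img: Image.Image, rng: np.random.Generator) -> np.ndarray:
    """Sketch of the training-time preprocessing described above."""
    # Keep only RGB images, as in the original pipeline.
    if img.mode != "RGB":
        raise ValueError("expected an RGB image")
    # Resize to the 224x224x3 input size used for CLIP.
    img = img.resize((224, 224))
    # JPEG compression at 80% quality (round-trip through an in-memory buffer).
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=80)
    buf.seek(0)
    img = Image.open(buf).convert("RGB")
    # Gaussian blur applied to roughly half the images at random.
    if rng.random() < 0.5:
        img = img.filter(ImageFilter.GaussianBlur(radius=2))
    return np.asarray(img)
```

The JPEG round-trip and random blur act as augmentations that make the detector robust to the recompression and smoothing that images typically undergo when shared online.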
Features from the images are extracted using the CLIP (Contrastive Language-Image Pre-training) model. The features are then saved for further analysis and classification tasks.
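Given per-image feature vectors from CLIP, the nearest-neighbor classification mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's code; it assumes cosine similarity over L2-normalised features:

```python
import numpy as np

def nearest_neighbor_predict(train_feats, train_labels, query_feats):
    """Label each query with the label of its closest training feature
    (cosine similarity, computed on L2-normalised vectors)."""
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    query = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sims = query @ train.T  # pairwise cosine-similarity matrix
    return train_labels[np.argmax(sims, axis=1)]
```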
For linear classification, we train a fully connected neural network (FCNN) using TensorFlow's Keras API:

- Data preparation: `features_real` and `features_fake` are concatenated into `X_train`, the corresponding labels `Y_train` are defined, and both are converted into NumPy arrays.
- Hyperparameters:
  - Batch size: 30 (as instructed in the research paper)
  - Epochs: 7 (the model reaches a training accuracy of 1.0, so it is trained for only a few epochs)
- Model structure: two fully connected layers:
  - The first layer has 768 units and uses the ReLU activation function.
  - The output layer has 1 unit with a sigmoid activation function, suitable for binary classification.
- Model compilation: the model is compiled with the Adam optimizer and the binary cross-entropy loss function.
- Training set: ProGAN images.
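The FCNN described above can be sketched with Keras as follows. The 768-dimensional input is an assumption matching the CLIP feature size (the text only gives the hidden-layer width):

```python
import numpy as np
import tensorflow as tf

def build_classifier(feature_dim: int = 768) -> tf.keras.Model:
    """Two-layer FCNN as described: Dense(768, relu) -> Dense(1, sigmoid)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(feature_dim,)),
        tf.keras.layers.Dense(768, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Adam optimizer + binary cross-entropy loss, as stated above.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then be `model.fit(X_train, Y_train, batch_size=30, epochs=7)`, matching the hyperparameters listed above.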
| Dataset | Accuracy | SOTA |
|---|---|---|
| CycleGAN | 97.3% | 98.5% |
| DALL-E and LAION | 92.9% | 86.78% |
| GLIDE-100 and LAION | 86.9% | 79% |
In 2021, viral deepfakes of Tom Cruise stunned viewers worldwide. Our model can accurately distinguish real images from fake ones, providing crucial insight during such incidents.
- News input by the user in the form of articles
- Prediction as real or fake based on our model
- Providing corrected news by integrating the Gemini model, based on the article given by the user
- Merged data from multiple sources containing factual and news articles, both real and fake, to create a comprehensive dataset.
- Modified input text by converting it to lowercase, removing digits, new lines, HTTP links, HTML tags, and special characters to ensure uniformity and improve model performance.
- Utilizes the TextDataLoaders class from the fastai library to load the prepared dataset.
- Reads data from CSV files, including news articles labeled as real or fake.
- Implements a text classifier using the AWD_LSTM architecture.
- Utilizes a bidirectional LSTM (Long Short-Term Memory) neural network for text classification.
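The text-cleaning step above (lowercasing, then stripping digits, newlines, HTTP links, HTML tags, and special characters) can be sketched as one function. The exact regexes here are an assumption, not the project's code:

```python
import re

def clean_text(text: str) -> str:
    """Normalise an article before feeding it to the classifier (sketch)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop HTTP(S) links
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"\d+", " ", text)           # drop digits
    text = re.sub(r"[^a-z\s]", " ", text)      # drop special characters
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace/newlines
```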
- Creating a DataLoader object `dls` containing the training and validation data (validation split: 20%)
- Initializing a text classifier model (`learn`) using the AWD-LSTM architecture, specifying the DataLoader (`dls`) and the evaluation metric (accuracy)
- Training the text classifier model (`learn`) using the one-cycle policy
  - Epochs: 10
  - Learning rate: 0.01
- Creating an interpretation object (`interp`) from the trained model (`learn`) for classification analysis
- Generating a confusion matrix plot from the interpretation object (`interp`), showing the model's performance on each class
The LSTM-based model obtained 93% accuracy. Below is the confusion matrix on the training data.
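A confusion matrix like the one reported above can be computed directly from true and predicted labels; a minimal NumPy sketch (fastai's `ClassificationInterpretation` does this internally):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

The diagonal counts the correct predictions; off-diagonal cells show which class is being confused for which.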
We gave the model a piece of fake news about a fake Twitter account impersonating Republic TV, which stirred controversy with biased polls and was mistaken for the official channel. Despite being called out as parody, the account continued to tweet divisive polls, prompting criticism from users. Our model correctly predicted it as fake.
We gave another prompt with current news about the placement scenario at the IITs, and the model correctly identified the situation.
Guardian is an advanced AI model specifically designed to analyze images and detect deepfakes. Leveraging state-of-the-art technology, Guardian employs sophisticated algorithms to scrutinize images and identify signs of manipulation.
By detecting deepfakes, Guardian helps individuals safeguard their online presence and protect themselves from potential harm. Furthermore, Guardian also verifies the authenticity of images, ensuring that users can trust the content they encounter online.
TruthGuard is a powerful tool developed to detect fake news articles and combat the spread of misinformation. Utilizing cutting-edge machine learning techniques, TruthGuard analyzes news content to distinguish between genuine and fabricated information.
By training on a diverse dataset comprising both real and fake news articles, TruthGuard has been fine-tuned to accurately classify news content and provide users with reliable information. Additionally, TruthGuard cross-references its results with a Gemini model to identify relevant news articles, ensuring that users receive timely and pertinent updates.
To ensure the security of user data, the Bloodline project implements a robust sign-in process with secure storage mechanisms. Users can securely access the platform by providing their credentials through a POST request system.
User data is stored in a PostgreSQL database, utilizing industry-standard encryption techniques to safeguard sensitive information. By employing bcrypt hashing for password storage, the Bloodline project prioritizes the protection of user privacy and security.
The Bloodline project incorporates advanced image verification capabilities to identify and combat deepfakes. Leveraging a comprehensive dataset sourced from state-of-the-art AI models such as GAN and DALL-E, the project's AI models have been trained to recognize patterns indicative of image manipulation.
Using the CLIP ViT-14 model for feature extraction, the project's AI classifies images as either genuine or manipulated, giving users assurance of authenticity.
In today's rapidly evolving world, access to accurate and reliable news is essential for making informed decisions and understanding global events. To facilitate this, the Bloodline project integrates a news-bot feature, providing users with the latest news updates from reputable sources. By aggregating news content from trusted sources, the news-bot ensures that users have access to timely and credible information, enabling them to stay informed and engaged with current affairs.
Follow these steps to set up the project locally on your machine.
Make sure you have the following installed on your machine:
- Git
- Node.js
- npm (Node Package Manager)
- PostgreSQL
```bash
git clone https://github.com/toshan07/Guardians-of-Truth
cd Guardians-of-Truth
```
**Installation**

Install the project dependencies using npm:

```bash
npm install
```

Install the model packages from `requirements.txt`:

```bash
pip install -r requirements.txt
```

**Running the Project**

Open the `ejs_server` directory and run the `index.js` file:

```bash
npm i
nodemon index.js
```

Open the `Test_Ml_Models` directory, then run the `model_deepfake` and `model_fake_news` files to start the FastAPI server used to serve the models.
Team: The BLOODLINE 🔥🔥
The project was developed by the following contributors:
- Toshan Gupta
- Naman Singhania
- Mayank Goel
- Vishnu