
On-The-Edge Processing Component for Acquisition and Actuation (OPR4AA)


OPR4AA stands for On-the-edge PRocessing for Acquisition and Actuation: it is a platform based on open-source FIWARE/Apache components that aims to cover the entire industrial data value chain. Built on top of the DIDA (Digital Industries Data Analytics) platform, it can be deployed for edge processing: collecting data, processing it, and communicating results to other external cloud modules.

The component lets you ingest input data or fetch it by calling external REST APIs, process it with AI algorithms (e.g. Python TensorFlow + Keras), persist the results in the internal persistence layer, or export them to external modules through REST APIs.

The OPR4AA platform is composed of:

  • FIWARE Draco (deprecated): based on Apache NiFi. NiFi is a data-flow system built on the concepts of flow-based programming, designed to automate the flow of data between systems, and supports powerful, scalable directed graphs of data routing and transformation.
  • Apache Airflow: an open-source platform that allows users to programmatically author, schedule, and monitor workflows. It is designed to be highly scalable, extensible, and modular, making it ideal for creating complex data processing pipelines.
  • Apache Hadoop Distributed File System (HDFS): designed to reliably store very large files across the machines of a large cluster.
  • Apache Spark: an open-source parallel processing framework for running large-scale data analytics applications across clustered computers. It can handle both batch and real-time analytics and data processing workloads.
  • Apache Livy: a service that enables easy interaction with a Spark cluster over a REST interface. It lets you submit Spark jobs or snippets of Spark code, retrieve results synchronously or asynchronously, and manage Spark contexts, all through a simple REST interface or an RPC client library. Livy also simplifies the interaction between Spark and application servers (see the Python sketch after this list).
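
As a minimal sketch of this REST interaction (not taken from this repository; the Livy host and the HDFS script path are assumptions), submitting a PySpark batch and polling its state from Python looks roughly like this:

    import time
    import requests

    LIVY_URL = "http://localhost:8998"  # Livy endpoint exposed by the platform

    # Submit a PySpark script as a Livy batch; the HDFS path is a placeholder.
    resp = requests.post(f"{LIVY_URL}/batches", json={"file": "hdfs:///algorithms/example.py"})
    resp.raise_for_status()
    batch_id = resp.json()["id"]

    # Poll the batch state until the job reaches a terminal state.
    state = "starting"
    while state not in ("success", "dead", "killed"):
        time.sleep(5)
        state = requests.get(f"{LIVY_URL}/batches/{batch_id}/state").json()["state"]
    print(f"Batch {batch_id} finished with state: {state}")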

Requirements

  • Docker Engine
  • Minimum 8GB RAM
  • Docker Compose >= 1.29

How to run

Build & Run containers:

Before starting, choose whether to run Airflow or NiFi as the ingestion module, and comment/uncomment the corresponding services in docker-compose.yml.

docker network create -d bridge network-bridge

docker-compose up --build -d

docker-compose -f airflow.yml up --build -d


Access the UIs

  1. Airflow at http://localhost:8087/
  2. NiFi at https://localhost:8443/nifi
  3. HDFS at http://localhost:9870/explorer.html
  4. Spark Master at http://localhost:8080/
  5. Spark Worker at http://localhost:8081/
  6. Livy UI at http://localhost:8998/

Configuration

Using Airflow

In Airflow you can configure your own data flow (see the Airflow documentation) in a more lightweight way than with Draco. The solution provides processing on Spark.
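
For orientation, the sketch below shows what a minimal DAG with the id test-pipeline (the id used in the trigger example further down) could look like; it is not the pipeline shipped with the platform, and everything beyond the conf field names taken from the trigger payload is an assumption:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def fetch_and_process(**context):
        # "conf" carries the JSON body passed when the DAG run is triggered.
        conf = context["dag_run"].conf or {}
        source = conf.get("source_entity", {})
        print(f"Fetching attribute '{source.get('attribute')}' of entity "
              f"'{source.get('entity_id')}' from {conf.get('host')}")
        # ...call the external REST API here, then hand the data to Spark...

    with DAG(
        dag_id="test-pipeline",
        start_date=datetime(2023, 1, 1),
        schedule_interval=None,  # run only when triggered explicitly
        catchup=False,
    ) as dag:
        PythonOperator(task_id="fetch_and_process", python_callable=fetch_and_process)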

Airflow credentials

User     Password
airflow  airflow

How to run a DAG programmatically

curl --location --request POST 'http://localhost:8087/api/v1/dags/test-pipeline/dagRuns' \
    --header 'Authorization: Basic YWlyZmxvdzphaXJmbG93' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "conf": {
            "host": "api.host.cloud",
            "username": "*****",
            "password": "*****",
            "source_entity": {
                "entity_id": "OPR4AA-Execution-Test",
                "entity_type": "Entity-Type-Test",
                "attribute":"image"
            },
            "sink_entity": {
                "entity_id": "OPR4AA-Execution-Test",
                "entity_type": "Entity-Type-Test",
                "attribute":"evaluation"
            }
        }
    }'

The Basic Authorization header is obtained by base64-encoding the string "airflow:airflow".
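
The same trigger from Python, as a sketch (requests derives the identical Basic header from the credentials; the payload is truncated here for brevity):

    import requests

    resp = requests.post(
        "http://localhost:8087/api/v1/dags/test-pipeline/dagRuns",
        auth=("airflow", "airflow"),  # encoded to the same Basic header as above
        json={"conf": {"host": "api.host.cloud"}},
    )
    resp.raise_for_status()
    print(resp.json()["dag_run_id"])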

A Postman collection for DAG Run is provided.


Using Draco

In Draco you can configure your own data flow (see the NiFi documentation). The solution provides processing on Draco or Spark. Algorithm and data ingestion can be performed by calling the provided APIs. Within the template pre-loaded on Draco you can activate the flows you prefer and configure each NiFi processor following the notes in the UI.

Draco/NiFi credentials

User   Password
admin  ctsBtRBKHRAx69EqUghvvgEvjnaLjFEB

Draco will start with a pre-uploaded template.


API (available only using Draco)

HTTP Method  Port  Service            Description
POST         8085  /ingest-algorithm  Algorithm ingestion route. Accepts a .zip file containing the algorithm files.
POST         8086  /ingest-data       Input data ingestion route. Accepts a file.

A Postman collection for algorithms & data ingestion is provided.
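
Calling these routes from Python might look like the following sketch (the local file names and the multipart field name "file" are assumptions; the Postman collection documents the exact format):

    import requests

    # Upload an algorithm bundle (.zip containing the algorithm files).
    with open("algorithm.zip", "rb") as f:
        r = requests.post("http://localhost:8085/ingest-algorithm", files={"file": f})
        r.raise_for_status()

    # Upload an input data file.
    with open("input.jpg", "rb") as f:
        r = requests.post("http://localhost:8086/ingest-data", files={"file": f})
        r.raise_for_status()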


Test AI classifier algorithm

The platform ships with a runnable example: a Python AI image classifier, including a pre-trained, ready-to-use neural network model based on TensorFlow and Keras.
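
For reference, loading a pre-trained Keras model and classifying a single image looks roughly like the sketch below; the model file, image size, and input file are placeholders, not necessarily those shipped with the example:

    import numpy as np
    from tensorflow import keras

    # Load the pre-trained model (placeholder path).
    model = keras.models.load_model("classifier_model.h5")

    # Prepare one image as a normalized batch of size 1.
    img = keras.preprocessing.image.load_img("sample.jpg", target_size=(224, 224))
    x = np.expand_dims(keras.preprocessing.image.img_to_array(img) / 255.0, axis=0)

    # Predict and report the most likely class index.
    probs = model.predict(x)[0]
    print("Predicted class:", int(np.argmax(probs)), "confidence:", float(np.max(probs)))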
