
Initial release of Falco integration #9619

Merged

Conversation

@brewcore (Contributor) commented Apr 16, 2024

What does this PR do?

This is an initial release of a new integration for Falco. It captures events (called Alerts in Falco) that are created by Falco's Rules. It includes:

  • A data stream for events from Falco rules.
  • Data collection logic for the events data stream.
  • An ingest pipeline for the events data stream.
  • Fields mapped according to the ECS schema, with field metadata added in the appropriate YAML files.
  • A dashboard and visualizations of events.
  • Pipeline tests for the events data stream.
  • System test cases for the events data stream.
  • Documentation for users on how to configure Falco for this integration (see the configuration sketch after this list).
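As a sketch of what that configuration covers (key names taken from the upstream falco.yaml defaults; the output path is hypothetical), Falco needs JSON output enabled, and the log-file variant needs alerts written to a file the Agent can tail:

# falco.yaml (sketch, not the authoritative settings for this integration)
json_output: true                       # emit alerts as JSON so the ingest pipeline can parse them
json_include_output_property: true      # keep the human-readable output string in the JSON
file_output:
  enabled: true
  keep_alive: false
  filename: /var/log/falco/events.json  # hypothetical path for the logfile input to tail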

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

What's Ready for Review:

  • Data Stream & Mappings
  • Ingest Pipelines
  • Pipeline Tests
  • System (Integration) Tests
  • Visualizations
  • Documentation

How to test this PR locally
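A typical local workflow with elastic-package (a sketch, assuming the tool and Docker are installed):

cd packages/falco
elastic-package build
elastic-package stack up -d
elastic-package test -v
elastic-package stack down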

Related issues

Automated Tests

--- Test results for package: falco - START ---
╭─────────┬──────────────┬───────────┬────────────────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM  │ TEST TYPE │ TEST NAME                                                      │ RESULT │ TIME ELAPSED │
├─────────┼──────────────┼───────────┼────────────────────────────────────────────────────────────────┼────────┼──────────────┤
│ falco   │              │ asset     │ dashboard falco-a17f684e-6905-4bb0-861a-eb10cd4c518f is loaded │ PASS   │        792ns │
│ falco   │              │ asset     │ search falco-f0d52d00-50cb-4b23-8755-c6966a30462b is loaded    │ PASS   │        125ns │
│ falco   │ falco_alerts │ asset     │ index_template logs-falco.falco_alerts is loaded               │ PASS   │         42ns │
│ falco   │ falco_alerts │ asset     │ ingest_pipeline logs-falco.falco_alerts-0.1.0 is loaded        │ PASS   │         83ns │
╰─────────┴──────────────┴───────────┴────────────────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: falco - END   ---

--- Test results for package: falco - START ---
╭─────────┬──────────────┬───────────┬────────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM  │ TEST TYPE │ TEST NAME                  │ RESULT │ TIME ELAPSED │
├─────────┼──────────────┼───────────┼────────────────────────────┼────────┼──────────────┤
│ falco   │ falco_alerts │ pipeline  │ test-falco.log             │ PASS   │  26.470208ms │
│ falco   │ falco_alerts │ pipeline  │ test-nopreserve.log        │ PASS   │    4.02925ms │
│ falco   │ falco_alerts │ pipeline  │ (ingest pipeline warnings) │ PASS   │ 174.821875ms │
╰─────────┴──────────────┴───────────┴────────────────────────────┴────────┴──────────────╯
--- Test results for package: falco - END   ---

--- Test results for package: falco - START ---
╭─────────┬──────────────┬───────────┬──────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM  │ TEST TYPE │ TEST NAME                │ RESULT │ TIME ELAPSED │
├─────────┼──────────────┼───────────┼──────────────────────────┼────────┼──────────────┤
│ falco   │ falco_alerts │ static    │ Verify sample_event.json │ PASS   │  79.931708ms │
╰─────────┴──────────────┴───────────┴──────────────────────────┴────────┴──────────────╯
--- Test results for package: falco - END   ---

--- Test results for package: falco - START ---
╭─────────┬──────────────┬───────────┬───────────┬────────┬───────────────╮
│ PACKAGE │ DATA STREAM  │ TEST TYPE │ TEST NAME │ RESULT │  TIME ELAPSED │
├─────────┼──────────────┼───────────┼───────────┼────────┼───────────────┤
│ falco   │ falco_alerts │ system    │ logfile   │ PASS   │   34.7337035s │
│ falco   │ falco_alerts │ system    │ default   │ PASS   │ 37.119049875s │
╰─────────┴──────────────┴───────────┴───────────┴────────┴───────────────╯
--- Test results for package: falco - END   ---

Screenshot

[Dashboard screenshot: falco-integration-pane]

cla-checker-service bot commented Apr 16, 2024

💚 CLA has been signed

@narph added the Team:Security-Service Integrations label Apr 25, 2024
@chemamartinez (Contributor) left a comment

Added an initial review of the ingest pipeline, mappings, and input.

Review comments (resolved):

  • packages/falco/changelog.yml
  • packages/falco/data_stream/falco_alerts/sample_event.json
  • packages/falco/data_stream/falco_alerts/manifest.yml
  • packages/falco/data_stream/falco_alerts/fields/ecs.yml (2 comments)
  • packages/falco/data_stream/falco_alerts/fields/fields.yml
cole-labar and others added 5 commits May 2, 2024 11:52
…edback:

  - Updated field definitions and mapping of fields within pipeline
  - Added agent.yml field definitions
  - Updated test suites and removed unnecessary test
- Adding in deployment updates from k8s to Docker
- Integration cleanup and asset management
- Updated documentation
@brewcore

This comment was marked as outdated.

brewcore added 2 commits May 9, 2024 15:31
Tests are still not passing due to undocumented fields coming from Falco.
@narph added the New Integration label May 27, 2024
@chemamartinez (Contributor) left a comment

Not the final review but another iteration of completed items.

Review comments (resolved):

  • packages/falco/docs/README.md
  • packages/falco/manifest.yml (3 comments)
  • packages/falco/data_stream/falco_alerts/manifest.yml
  • packages/falco/data_stream/falco_alerts/fields/fields.yml (3 comments)
cole-labar and others added 8 commits June 28, 2024 08:48
  - Added validation.yml for skipping dashboard references preventing build
  - Added screencap of Falco dashboard
  - Updated sample logs for testing
  - Updated pipeline and fieldspec to correctly manage nested vs value fields
  - Updated documentation to match changes
@chemamartinez (Contributor) commented

/test

@chemamartinez (Contributor) commented

/test

@chemamartinez (Contributor) commented

/test

@aleksmaus (Member) commented

👋
What do you think about adding this change to the pipeline?

diff --git a/packages/falco/data_stream/alerts/elasticsearch/ingest_pipeline/default.yml b/packages/falco/data_stream/alerts/elasticsearch/ingest_pipeline/default.yml
index b94eaf89c4..867a96491f 100644
--- a/packages/falco/data_stream/alerts/elasticsearch/ingest_pipeline/default.yml
+++ b/packages/falco/data_stream/alerts/elasticsearch/ingest_pipeline/default.yml
@@ -7,6 +7,31 @@ processors:
 - json:
     field: message
     target_field: falco
+    if: ctx?.message != null
+- script:
+    tag: move_falco_fields
+    lang: painless
+    if: ctx?.output_fields != null
+    params:
+        fields:
+            - uuid
+            - output
+            - priority
+            - rule
+            - time
+            - output_fields
+            - source
+            - tags
+            - hostname
+    source: >-
+        def m = new HashMap();
+        for (def v : params.fields) {
+            if (ctx.containsKey(v)) {
+                m[v] = ctx[v];
+                ctx.remove(v);
+            }
+        }
+        ctx['falco'] = m
 - dot_expander:
     field: 'container.image'
     path: falco.output_fields

This would allow us to reuse this integration for the Falco Sidekick output that we are currently POCing.

Basically, with this change:

  1. The json processor is only used when ctx.message is present.
  2. The additional script processor moves the top-level Falco fields under a falco child key, which makes the schema compatible with the rest of the pipeline processing.

Here is an example of a typical Falco Sidekick payload:

{
    "uuid": "599eed35-0dfb-4309-800c-a2d17852611b",
    "output": "11:36:23.388683755: Warning Sensitive file opened for reading by non-trusted program (file=/etc/pam.d/cron gparent=sudo ggparent=bash gggparent=konsole evt_type=openat user=root user_uid=0 user_loginuid=1000 process=cat proc_exepath=/usr/bin/cat parent=sudo command=cat /etc/pam.d/cron terminal=34820 container_id=host container_name=host)",
    "priority": "Warning",
    "rule": "Read sensitive file untrusted",
    "time": "2024-08-06T15:36:23.388683755Z",
    "output_fields": {
        "container.id": "host",
        "container.name": "host",
        "evt.time": 1722958583388683755,
        "evt.type": "openat",
        "fd.name": "/etc/pam.d/cron",
        "proc.aname[2]": "sudo",
        "proc.aname[3]": "bash",
        "proc.aname[4]": "konsole",
        "proc.cmdline": "cat /etc/pam.d/cron",
        "proc.exepath": "/usr/bin/cat",
        "proc.name": "cat",
        "proc.pname": "sudo",
        "proc.tty": 34820,
        "user.loginuid": 1000,
        "user.name": "root",
        "user.uid": 0
    },
    "source": "syscall",
    "tags": [
        "T1555",
        "container",
        "filesystem",
        "host",
        "maturity_stable",
        "mitre_credential_access"
    ],
    "hostname": "lebuntu"
}

Here is an example of the resulting Elasticsearch document:

      {
        "_index": ".ds-logs-falco.alerts-default-2024.08.08-000001",
        "_id": "yK2-M5EBzEEAojd2_mW4",
        "_score": 1,
        "_source": {
          "container": {
            "name": "host",
            "id": "host"
          },
          "event.severity": 3,
          "process": {
            "parent": {
              "name": "sudo"
            },
            "name": "cat",
            "user": {
              "name": "root",
              "id": "0"
            },
            "executable": "/usr/bin/cat",
            "command_line": "cat /etc/pam.d/cron"
          },
          "event.category": [
            "process"
          ],
          "rule": {
            "name": "Read sensitive file untrusted"
          },
          "threat.technique.id": [
            "T1555"
          ],
          "observer": {
            "hostname": "lebuntu",
            "product": "falco",
            "vendor": "sysdig",
            "type": "sensor"
          },
          "falco.container.mounts": null,
          "@timestamp": "2024-08-06T15:36:23.388Z",
          "related": {
            "hosts": [
              "lebuntu"
            ]
          },
          "falco": {
            "output": "11:36:23.388683755: Warning Sensitive file opened for reading by non-trusted program (file=/etc/pam.d/cron gparent=sudo ggparent=bash gggparent=konsole evt_type=openat user=root user_uid=0 user_loginuid=1000 process=cat proc_exepath=/usr/bin/cat parent=sudo command=cat /etc/pam.d/cron terminal=34820 container_id=host container_name=host)",
            "output_fields": {
              "container": {},
              "evt": {
                "type": "openat"
              },
              "proc": {
                "tty": 34820
              },
              "event": {
                "time": 1722958583388683800
              },
              "user": {
                "loginuid": 1000
              },
              "fd": {
                "name": "/etc/pam.d/cron"
              }
            },
            "hostname": "lebuntu",
            "time": "2024-08-06T15:36:23.388683755Z",
            "source": "syscall",
            "priority": "Warning",
            "uuid": "764e0f2d-37e8-4201-abdb-58d562f1a964",
            "tags": [
              "T1555",
              "container",
              "filesystem",
              "host",
              "maturity_stable",
              "mitre_credential_access"
            ]
          },
          "event.type": [
            "access"
          ],
          "event": {
            "agent_id_status": "missing",
            "ingested": "2024-08-08T20:47:20Z",
            "provider": "syscall",
            "kind": "alert"
          }
        }
      }

@chemamartinez (Contributor) commented

/test

@chemamartinez (Contributor) commented

Hi @aleksmaus,

As far as I know, the integration is already compatible with the Falco Sidekick output.

Currently, the TCP input includes a syslog processor that decodes incoming syslog messages such as the following:

<5>2024-08-07T13:49:16Z a72f9a747cf8 Falco[1]: {"uuid":"23716645-4d9d-4254-9429-2a287a9af199","output":"2024-08-07T13:49:16.479964318+0000: Notice Shell spawned by untrusted binary (parent_exe=/tmp/falco-event-generator3282684109/httpd parent_exepath=/bin/event-generator pcmdline=httpd --loglevel info run ^helper.RunShell$ gparent=event-generator ggparent=containerd-shim aname[4]=containerd-shim aname[5]=init aname[6]=\u003cNA\u003e aname[7]=\u003cNA\u003e evt_type=execve user=root user_uid=0 user_loginuid=-1 process=bash proc_exepath=/bin/bash parent=httpd command=bash -c ls \u003e /dev/null terminal=0 exe_flags=EXE_WRITABLE container_id=2ae6a7f15b6e container_name=elastic-package-service-10413-falco-event-generator-1)","priority":"Notice","rule":"Run shell untrusted","time":"2024-08-07T13:49:16.479964318Z","output_fields":{"container.id":"2ae6a7f15b6e","container.name":"elastic-package-service-10413-falco-event-generator-1","evt.arg.flags":"EXE_WRITABLE","evt.time.iso8601":1723038556479964318,"evt.type":"execve","proc.aname[2]":"event-generator","proc.aname[3]":"containerd-shim","proc.aname[4]":"containerd-shim","proc.aname[5]":"init","proc.aname[6]":null,"proc.aname[7]":null,"proc.cmdline":"bash -c ls \u003e /dev/null","proc.exepath":"/bin/bash","proc.name":"bash","proc.pcmdline":"httpd --loglevel info run ^helper.RunShell$","proc.pexe":"/tmp/falco-event-generator3282684109/httpd","proc.pexepath":"/bin/event-generator","proc.pname":"httpd","proc.tty":0,"user.loginuid":-1,"user.name":"root","user.uid":0},"source":"syscall","tags":["T1059.004","container","host","maturity_stable","mitre_execution","process","shell"],"hostname":"e822ea6618ae"}

into a JSON object where syslog metadata is added and the Falco data is placed into the message field. So the first json processor in the ingest pipeline has the same effect you are looking for, moving all the top-level fields under falco:

- json:
    field: message
    target_field: falco
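For illustration, the decoded event that reaches this processor looks roughly like the following (the syslog metadata field names are indicative of what the processor adds, not an exact document, and the message string is abbreviated):

{
  "log": {
    "syslog": {
      "priority": 5,
      "appname": "Falco",
      "hostname": "a72f9a747cf8"
    }
  },
  "message": "{\"uuid\":\"23716645-4d9d-4254-9429-2a287a9af199\",\"rule\":\"Run shell untrusted\",\"priority\":\"Notice\",\"source\":\"syscall\"}"
}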

Therefore, I think adding that change won't have any effect on the pipeline. We'd appreciate it if you could verify that this integration is reusable for your use case and let us know if you find anything you may need.

Thanks!

@aleksmaus (Member) commented

> Therefore, I think adding that change won't have any effect on the pipeline. We'd appreciate it if you could verify that this integration is reusable for your use case and let us know if you find anything you may need.
>
> Thanks!

In our case, the users were interested in using Falco with Sidekick without any Agent/Beats.
So basically Falco Sidekick sends JSON directly to Elasticsearch, equivalent to something like:

POST logs-falco.alerts-default/_doc
{
  "uuid": "80f5e61c-09b0-4473-a532-d8d0b8728f35",
  "output": "11:36:23.388683755: Warning Sensitive file opened for reading by non-trusted program (file=/etc/pam.d/cron gparent=sudo ggparent=bash gggparent=konsole evt_type=openat user=root user_uid=0 user_loginuid=1000 process=cat proc_exepath=/usr/bin/cat parent=sudo command=cat /etc/pam.d/cron terminal=34820 container_id=host container_name=host)",
  "priority": "Warning",
  "rule": "Read sensitive file untrusted",
  "time": "2024-08-06T15:36:23.388683755Z",
  "output_fields": {
    "container.id": "host",
    "container.name": "host",
    "evt.time": 1722958583388683800,
    "evt.type": "openat",
    "fd.name": "/etc/pam.d/cron",
    "proc.aname[2]": "sudo",
    "proc.aname[3]": "bash",
    "proc.aname[4]": "konsole",
    "proc.cmdline": "cat /etc/pam.d/cron",
    "proc.exepath": "/usr/bin/cat",
    "proc.name": "cat",
    "proc.pname": "sudo",
    "proc.tty": 34820,
    "user.loginuid": 1000,
    "user.name": "root",
    "user.uid": 0
  },
  "source": "syscall",
  "tags": [
    "T1555",
    "container",
    "filesystem",
    "host",
    "maturity_stable",
    "mitre_credential_access"
  ],
  "hostname": "lebuntu",
  "@timestamp": "2024-08-06T15:36:23.388683755Z"
}

@chemamartinez (Contributor) commented

/test

@chemamartinez (Contributor) commented

> In our case the users were interested in using Falco with Sidekick without any Agent/Beats. So basically Falco Sidekick directly sends JSON to Elasticsearch, equivalent to something like [the POST payload above].

Thanks for expanding on the use case. I still think that, given that input, the json processor at the beginning of the ingest pipeline would move all the root fields into a falco object.

You can take a look at the pipeline tests that process events with the same format; see below for how to run them locally.
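For reference, assuming elastic-package is installed and a local stack is running, those tests can be executed from the package directory:

cd packages/falco
elastic-package test pipeline -v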

Maybe I am missing something; in that case, please let me know.

@elasticmachine commented

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport.

@aleksmaus (Member) commented

> Maybe I am missing something; in that case, please let me know.

What data would it move if there is no message property in the payload?

I just did a quick test with the POST payload above against the pipeline without my additions, and I'm getting this pipeline error in the document:

          "error": {
            "message": [
              "Processor \"json\" with tag \"\" in pipeline \"logs-falco.alerts-0.1.12\" failed with message \"field [message] not present as part of path [message]\""
            ]
          }
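For reference, the same failure reproduces without indexing anything by simulating the pipeline directly (the pipeline name is taken from the error above; the payload is an abbreviated version of the POST body):

POST _ingest/pipeline/logs-falco.alerts-0.1.12/_simulate
{
  "docs": [
    {
      "_source": {
        "uuid": "80f5e61c-09b0-4473-a532-d8d0b8728f35",
        "rule": "Read sensitive file untrusted",
        "priority": "Warning",
        "output_fields": { "evt.type": "openat" },
        "source": "syscall",
        "hostname": "lebuntu"
      }
    }
  ]
}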

Could you try that POST command above and let me know if I'm missing something?

@chemamartinez (Contributor) commented

> What data would it move if there is no message property in the payload?

My bad, I thought you wanted to reuse the whole integration instead of just the ingest pipeline. It makes sense to me now.

In that case, @cole-labar, do you mind adding the changes that @aleksmaus suggests here?

@chemamartinez (Contributor) commented

/test

@elasticmachine commented

💚 Build Succeeded

@aleksmaus (Member) commented

> My bad, I thought you wanted to reuse the whole integration instead of just the ingest pipeline. It makes sense to me now.

Yeah, this is kind of an unusual use case, because most of the time we have the data shipped via Beats.
Having the same data from two different sources allows Falco users to reuse everything this integration provides (mappings, dashboards, and whatever else is developed for it) without needing to switch to the Agent/Beats. At the same time, this opens the door to showing how Agent/Beats observability could help users discover more context and data around Falco alerts if needed.

Thank you!

@chemamartinez (Contributor) left a comment

LGTM!

@chemamartinez merged commit 4523d07 into elastic:main Aug 14, 2024
5 checks passed
@elasticmachine commented

Package falco - 0.1.0 containing this change is available at https://epr.elastic.co/search?package=falco

jvalente-salemstate pushed a commit to jvalente-salemstate/integrations that referenced this pull request Aug 21, 2024
This is an initial release of a new integration for Falco. It captures events (called Alerts in Falco) that are created by Falco's Rules.
Labels: Integration:falco, New Integration, Team:Security-Service Integrations

Successfully merging this pull request may close these issues:

  • [New Integration] Sysdig Falco