Cijail: How to protect your CI/CD pipelines from supply chain attacks?

2024-06-02

Supply chain attacks are especially popular nowadays, and there is a good reason for that. Many build tools such as Cargo, Pip and NPM were not designed to protect from them (NPM example, Cargo-related discussion). At the same time, maintainers' tools such as Nix, Guix, and the RPM and DEB build systems successfully mitigate such attacks. These tools precisely control which files are downloaded over the network before the build starts and prohibit any network access during the build phase itself. In this article we introduce a tool called Cijail that allows you to adopt similar rules for developers' build systems such as Cargo, Pip and NPM. The tool is based on Linux Seccomp, can be run inside CI/CD pipelines, and does not require superuser privileges. It protects from data exfiltration over DNS via deep packet inspection, effectively limiting the damage supply chain attacks can cause. The tool is open source and written in Rust.

«Capybara inside a futuristic tunnel, oil painting» by Midjourney.

Why protect from supply chain attacks?
What is a supply chain attack?
How the data is exfiltrated over DNS?
How we can protect ourselves from supply chain attacks?
Example: Cargo + Github (Cijail itself)
Example: NPM + Gitlab (static web site)
Caveat: cargo-deny via HTTPS proxy
Caveat: NPM via HTTPS proxy
Conclusion

Why protect from supply chain attacks?

Supply chain attacks have become popular with the introduction of developers' tools that manage a project's dependencies. In contrast to maintainers' tools, they do not block network access during the build phase, and hackers use this seemingly minor gap to exfiltrate secrets by bundling malicious scripts with a dependency and executing these scripts during the build phase. It takes only one popular dependency to be compromised to run these scripts on a multitude of developers' computers and CI/CD pipelines and steal private keys. This is an unlikely event, but the damage it may cause is catastrophic: private keys might give access to a cryptowallet (on a developer's machine), to a server via SSH, to a static website via a cloud upload endpoint, etc. From our perspective, protecting from such attacks by default is like wearing a seat belt: no one expects a car crash when buckling up, but everyone expects the belt to save their life in an unlikely catastrophic situation.

What is a supply chain attack?

A supply chain attack starts with a hacker getting access to the repository of a popular software package. The hacker can use social engineering, zero-day vulnerabilities in operating systems or breaches in the repository management system itself. Usually two-factor authentication can protect from the attack at this phase.

If the hacker gets access to the repository, he or she proceeds by making a malicious commit or (more likely) publishing a new release archive that contains malicious code. Usually signed commits and signed releases/packages/archives protect from the attack at this phase.

Then the attacker waits until dependent software packages download the new release of the breached dependency and execute the malicious code in their CI/CD pipelines or on the developers' computers. In order to exfiltrate the secrets, the hacker would disguise the traffic, for example as DNS, and set up a DNS server to collect the secrets.

How the data is exfiltrated over DNS?

Data exfiltration over DNS works as follows. A malicious actor sets up a DNS server for his or her domain. The malicious code then encodes secrets as subdomains of this domain and tries to resolve them; eventually each DNS lookup request reaches the hacker's DNS server via other perfectly secure and legitimate public DNS servers. This exfiltration uses DNS as a side channel. It is one of many side channels that hackers might use (another popular one being the ICMP protocol).
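To make the encoding step concrete, here is a minimal sketch in Python of how a secret could be turned into lookup names and reassembled on the receiving side. It performs no network access. The function names, the Base32 encoding, the sequence-number label and the domain attacker.example are all assumptions chosen for illustration; real attacks vary in the details.

```python
import base64

MAX_LABEL = 63   # DNS limit for a single label (RFC 1035)
MAX_NAME = 253   # practical limit for a full domain name in text form


def exfiltration_names(secret: bytes, domain: str, chunk: int = 32) -> list[str]:
    """Return the DNS names a malicious script would try to resolve."""
    # Base32 uses only letters and digits, so it survives case-insensitive DNS.
    encoded = base64.b32encode(secret).decode("ascii").rstrip("=").lower()
    names = []
    for seq, start in enumerate(range(0, len(encoded), chunk)):
        label = encoded[start:start + chunk]
        # The sequence number lets the receiving server reorder the chunks.
        name = f"{seq}.{label}.{domain}"
        assert len(label) <= MAX_LABEL and len(name) <= MAX_NAME
        names.append(name)
    return names


def recover(names: list[str], domain: str) -> bytes:
    """What the hacker's DNS server can do with the names it receives."""
    chunks = {}
    for name in names:
        # Strip the trailing ".<domain>" and split the remaining two labels.
        seq, label = name[: -len(domain) - 1].split(".")
        chunks[int(seq)] = label
    encoded = "".join(chunks[i] for i in sorted(chunks)).upper()
    encoded += "=" * (-len(encoded) % 8)  # restore the Base32 padding
    return base64.b32decode(encoded)
```

Every name ends in the hacker's domain, so a resolver that knows nothing about the secret still forwards the query to the hacker's server. This is why filtering by the queried name, rather than by the DNS server's address, is what stops the leak.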

Conveniently for the attacker, DNS traffic is rarely blocked because other software depends on it. To protect from this attack, one can either allow only certain domains to be resolved (enforced via deep packet inspection) or block Internet access altogether. Maintainers' tools use the latter approach, while Cijail adopts the former because developers' tools were not designed to work without network access during the build phase.
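The first option can be sketched as follows: read the queried name out of the raw DNS packet and compare it with an allowlist. This is an illustrative Python sketch, not Cijail's actual code (Cijail is written in Rust). The helper names are hypothetical, and exact-match comparison is an assumption. The packet layout follows RFC 1035.

```python
import struct


def query_name(packet: bytes) -> str:
    """Extract the first question name from a raw DNS query (RFC 1035)."""
    if len(packet) < 12:
        raise ValueError("truncated DNS header")
    qdcount = struct.unpack("!H", packet[4:6])[0]
    if qdcount < 1:
        raise ValueError("no question section")
    labels, pos = [], 12  # the question section starts right after the header
    while True:
        if pos >= len(packet):
            raise ValueError("truncated question")
        length = packet[pos]
        if length == 0:  # a zero-length label terminates the name
            break
        if length > 63:
            # Compression pointers are not expected inside a query name.
            raise ValueError("unexpected label type")
        label = packet[pos + 1:pos + 1 + length]
        if len(label) != length:
            raise ValueError("truncated label")
        labels.append(label.decode("ascii").lower())
        pos += 1 + length
    return ".".join(labels)


def is_allowed(packet: bytes, allowed_names: set[str]) -> bool:
    """Allow the packet only if it parses and asks for an allowed name."""
    try:
        return query_name(packet) in allowed_names
    except ValueError:  # also covers UnicodeDecodeError
        return False    # deny anything that cannot be parsed


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal A-record query, used here only to exercise the parser."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in name.split("."))
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)
```

With the allowlist `{"staex.io"}`, `is_allowed(build_query("staex.io"), ...)` passes, while a query for any subdomain of an unknown domain is denied regardless of which DNS server it is sent to.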

How we can protect ourselves from supply chain attacks?

Cijail protects from supply chain attacks by whitelisting the domain names, IP addresses and ports, as well as the URLs, that a script is allowed to access. This is implemented using Seccomp and a MITM HTTPS proxy server. Cijail launches the supplied command in a child process inside a Seccomp jail whose filter returns the SECCOMP_RET_USER_NOTIF action. Simultaneously, a control process is launched that receives notifications from the jailed process, decides whether the resource can be accessed, and sends its verdict back via the SECCOMP_IOCTL_NOTIF_SEND ioctl. Finally, a MITM HTTPS proxy is launched as the third process. This process decrypts all HTTPS requests to check that the corresponding URL is allowed. For the MITM HTTPS proxy to work, its CA SSL certificate is automatically installed in the operating system as trusted.

# no traffic is allowed
🌊 cijail dig staex.io @1.1.1.1
[Sun Apr 04 17:28:22 2024] cijail: deny connect 1.1.1.1:53

# DNS request (connection to DNS server is allowed whereas name resolution is not)
🌊 env CIJAIL_ENDPOINTS='1.1.1.1:53' cijail dig staex.io @1.1.1.1
[Sun Apr 04 17:28:22 2024] cijail: allow connect 1.1.1.1:53
[Sun Apr 04 17:28:22 2024] cijail: deny sendmmsg staex.io

# DNS request and name resolution is allowed
🌊 env CIJAIL_ENDPOINTS='1.1.1.1:53 staex.io' cijail dig staex.io @1.1.1.1
[Sun Apr 04 17:28:22 2024] cijail: allow connect 1.1.1.1:53
[Sun Apr 04 17:28:22 2024] cijail: allow sendmmsg staex.io
... dig output ...
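The three kinds of entries seen in CIJAIL_ENDPOINTS (address and port, bare domain name, URL prefix) suggest a simple decision model. The Python sketch below illustrates that model only and is not Cijail's actual implementation. The class and function names are hypothetical. Treating "#" as a comment marker, allowing name resolution for hosts that appear in URLs or host:port entries, and plain prefix matching for URLs are all assumptions.

```python
import ipaddress
from dataclasses import dataclass, field
from urllib.parse import urlsplit


def _is_ip(host: str) -> bool:
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


@dataclass
class Allowlist:
    sockets: set[tuple[str, int]] = field(default_factory=set)  # ("1.1.1.1", 53)
    names: set[str] = field(default_factory=set)                # "staex.io"
    url_prefixes: list[str] = field(default_factory=list)       # "https://pypi.org/simple/"

    def allow_connect(self, host: str, port: int) -> bool:
        return (host, port) in self.sockets

    def allow_name(self, name: str) -> bool:
        return name.lower().rstrip(".") in self.names

    def allow_url(self, url: str) -> bool:
        # A real checker would normalize the URL first (for example "/../").
        return any(url.startswith(prefix) for prefix in self.url_prefixes)


def parse_endpoints(value: str) -> Allowlist:
    allow = Allowlist()
    for line in value.splitlines():
        # Assumption: "#" starts a comment, so URL fragments are not supported.
        for entry in line.split("#", 1)[0].split():
            if entry.startswith(("http://", "https://")):
                allow.url_prefixes.append(entry)
                host = urlsplit(entry).hostname
                if host:
                    allow.names.add(host)  # assumed: the host may be resolved
            elif ":" in entry:
                # "host:port"; IPv6 is only handled in the "[addr]:port" form.
                host, _, port = entry.rpartition(":")
                host = host.strip("[]").lower()
                allow.sockets.add((host, int(port)))
                if not _is_ip(host):
                    allow.names.add(host)
            else:
                allow.names.add(entry.lower())
    return allow
```

For example, `parse_endpoints("1.1.1.1:53 staex.io")` yields an allowlist that permits connecting to 1.1.1.1 on port 53 and resolving staex.io, and denies everything else, which mirrors the third dig invocation above.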

Example: Cargo + Github (Cijail itself)

We tried using Cijail to build Cijail itself. To use Cijail in your Github Actions you need to add the following line to your Dockerfile.

COPY --from=ghcr.io/staex-io/cijail:0.6.8 / /usr/local

Then you have to prepend cijail to every command in every step because Github Actions does not respect Docker's ENTRYPOINT. After that, all you need to do is add the CIJAIL_ENDPOINTS environment variable with the list of allowed URLs and other endpoints. The resulting workflow specification for Cijail looks like the following.

variables:
  CIJAIL_ENDPOINTS: |
    https://github.com/lyz-code/yamlfix/                          # git
    https://pypi.org/simple/                                      # pip
    https://files.pythonhosted.org/packages/                      # pip
    https://static.crates.io/crates/                              # cargo
    https://index.crates.io/                                      # cargo
    https://uploads.github.com/repos/staex-io/cijail/releases/    # github
    https://api.github.com/repos/staex-io/cijail/releases         # github
steps:
  - name: Lint
    run: cijail ./ci/build.sh

Example: NPM + Gitlab (static web site)

For Gitlab the approach is similar. This time you might consider adding ENTRYPOINT ["/usr/local/bin/cijail"] to your Dockerfile to avoid prepending cijail to every command in your pipeline. The resulting workflow specification for a static web site looks like the following.

CIJAIL_ENDPOINTS: |
  https://registry.npmjs.org/                  # npm
  https://github.com/lyz-code/yamlfix/         # git
  https://pypi.org/simple/                     # pip
  https://files.pythonhosted.org/packages/     # pip
  9.9.9.9:53                                   # rsync
  staex.io:22                                  # rsync

Caveat: `cargo-deny` via HTTPS proxy

One particular problem that we encountered is that some programs bundle trusted root CA certificates in their binaries. This is the case for cargo-deny. This tool uses the webpki-roots crate, which bundles root CA certificates as byte arrays directly in the cargo-deny binary. It is impossible to add Cijail's root certificate to such a program. The current workaround is to either run cargo-deny without Cijail or build cargo-deny with the following flags (as suggested by u/repilur): --no-default-features --features native-certs.

# our MITM proxy failed to trick cargo-deny :-(
🌊 cijail cargo deny check
[ERROR] error trying to connect: invalid peer certificate: UnknownIssuer

# a workaround
🌊 cijail cargo deny check --disable-fetch || true    # a warm-up (download dependencies)
🌊 cargo deny check                                   # run without cijail 😮

Caveat: NPM via HTTPS proxy

Another problem comes from the fact that using NPM behind an HTTPS proxy is not as reliable as using it without one. In some cases it creates thousands of connections to download a few dependencies. The workaround that we found is to set maxsockets to 1 in NPM's configuration.

# 1000+ connections for 340 dependencies?
🌊 cijail npm install
[Fri May 24 07:02:13 2024] cijail: allow connect 127.0.0.1:39317
[Fri May 24 07:02:13 2024] cijail: allow connect 127.0.0.1:39317
[Fri May 24 07:02:13 2024] cijail: allow connect 127.0.0.1:39317
... the message repeats 1000+ times
npm ERR! code ECONNREFUSED

# a workaround
🌊 npm config set maxsockets 1

Conclusion

To summarize, most CI/CD pipelines are vulnerable to data exfiltration via DNS because developers' tools like Cargo, NPM and Pip do not block network access during the build phase, in contrast to maintainers' tools like Nix, Guix, and the RPM and DEB build systems, which do.

The best way to protect from any data exfiltration is to split building the package into a download phase and a build phase. During the download phase the dependencies are downloaded, but no scripts are executed and no packages are built. During the build phase the scripts are executed and the packages are built, but network access is disabled. This simple technique protects from any type of data exfiltration without the need for deep packet inspection.

The major problem with implementing such a split in developers' tools is that it might break some packages. Another problem is that blocking network access in a Docker container might require additional privileges that are not present by default. Below is an example of how to do this manually for NPM and Cargo.

# cargo example
🌊 cargo fetch    # only download dependencies
🌊 unshare -rn cargo build    # build packages and run scripts without network access (will not work in a Docker container)

# npm example
🌊 npm clean-install --ignore-scripts    # only download dependencies
🌊 unshare -rn npm rebuild    # build packages and run scripts without network access (will not work in a Docker container)

Cijail slides presented at the «Rust & Tell: It is not June yet» Berlin meetup on 2024-05-30.


Staex is a secure public network for IoT devices that can not run a VPN such as smart meters, IP cameras, and EV chargers. Staex encrypts legacy protocols, reduces mobile data usage, and simplifies building networks with complex topologies through its unique multi-hop architecture. Staex is fully zero-trust meaning that no traffic is allowed unless specified by the device owner which makes it more secure than even some private networks. With this, Staex creates an additional separation layer to provide more security for IoT devices on the Internet, also protecting other Internet services from DDoS attacks that are usually executed on millions of IoT machines.
