The surge in automation and interconnected services has placed Non-Human Identities (NHIs) at the forefront of modern technology ecosystems. These identities power critical functions across applications and infrastructure but can become serious vulnerabilities if overlooked.
Awareness of NHI-related risks is growing, and organizations are investing in solutions to enhance visibility and mitigation strategies. Yet, it's not just about knowing NHIs exist; it's crucial to track their usage meticulously. Without comprehensive monitoring, organizations are essentially flying blind, unable to assess risks accurately or respond effectively to incidents.
The Critical Role of NHIs in GitHub
GitHub stands as a cornerstone for engineering teams worldwide, from startups to Fortune 100 enterprises. Its ecosystem relies heavily on NHIs, including Deploy Keys, OAuth applications, and Personal Access Tokens (PATs). These NHIs facilitate automation and integration but can become vectors for unauthorized access if compromised.
The Evolution of Personal Access Tokens
Until late 2022, GitHub's classic Personal Access Tokens (PATs) offered only coarse-grained permissions. A token had broad access to every repository and organization its owning user could access, with minimal control or visibility for organization owners. Classic PATs also left companies with a significant blind spot: audit logs for these tokens were partial at best, and there was no way to obtain an inventory of them. Organizations were effectively flying blind with regard to the existence and usage of classic PATs.
We strongly advise against using classic PATs. Security teams should run campaigns with their developers to decommission any classic PATs still in use. Transitioning to Fine-Grained Personal Access Tokens (FGPATs) not only enhances security but also increases awareness of NHIs within the organization.
Recognizing these security implications, GitHub introduced Fine-Grained Personal Access Tokens in October 2022. This advancement allowed tokens to have specific permissions and apply only to necessary repositories. Importantly, it empowered organization administrators with visibility into token permissions and actual usage—a significant step forward in NHI security.
Unveiling a Gap in PAT Monitoring
To secure critical technologies like GitHub, companies must ensure full accounting and auditability of all actions performed. This involves two main data sources:
- Inventory: A list of all PATs created within the organization, accessible via the UI or GitHub API.
- Audit Logs: Records of actions performed using the PATs, retrievable via the GitHub user interface, the API, or forwarded to a SIEM through webhooks.
Theoretically, combining these sources should provide a complete picture of PAT usage. However, when integrating our product with GitHub, we encountered a disconnect. We couldn't correlate tokens from the inventory with entries in the audit logs—the IDs didn't match, and token names were inconsistently presented.
Diving Deep into the Data Discrepancy
Our investigation revealed that different data sources and logs provided varying pieces of information:
- Inventory Data:
  - Token Request ID: An ID representing the request to create the token.
  - Creator ID: The user who initiated the token creation.
  - Approval Time: When the token was approved.
- Audit Logs:
  - Request-Created Logs:
    - Actual Token ID: The unique identifier for the token.
    - Token Name: The name given to the token.
    - Creator ID: Matches the Inventory Data.
    - Creation Time: When the token request was made.
  - Access-Granted Logs:
    - Actual Token ID: Same as above.
    - Token Name: Same as above.
    - Approver ID: The user who approved the token, possibly different from the creator.
    - Approval Time: Matches the Inventory Data.
  - Token Usage Logs:
    - Actual Token ID: Same as above.
    - Token Name: Same as above.
    - Action Details: Information about the action performed using the token.
The crux of the issue was that the Inventory Data and Audit Logs used different IDs for the same token, making direct correlation impossible.
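To make the mismatch concrete, here is a minimal sketch of what a naive join on ID looks like. The field names (`token_request_id`, `token_id`, and so on) and all values are hypothetical stand-ins for the fields listed above, not GitHub's actual response schema.

```python
# Hypothetical inventory rows: keyed by the Token Request ID.
inventory = [
    {"token_request_id": 101, "creator_id": 7},
    {"token_request_id": 102, "creator_id": 8},
]

# Hypothetical token usage log entries: keyed by the actual Token ID.
usage_logs = [
    {"token_id": 9001, "token_name": "ci-deploy", "action": "git.clone"},
    {"token_id": 9002, "token_name": "backup", "action": "git.clone"},
]


def join_on_id(inventory, usage_logs):
    """Naive correlation: treat the inventory ID and the log ID as the same key."""
    by_id = {row["token_request_id"]: row for row in inventory}
    return [
        (by_id[entry["token_id"]], entry)
        for entry in usage_logs
        if entry["token_id"] in by_id
    ]


def unattributed(inventory, usage_logs):
    """Usage entries that cannot be tied back to any inventory row by ID."""
    known_ids = {row["token_request_id"] for row in inventory}
    return [entry for entry in usage_logs if entry["token_id"] not in known_ids]
```

Because the two sources draw their IDs from different namespaces, the join comes back empty and every usage entry ends up unattributed, even though each token appears in both sources.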
Crafting a Solution
To bridge this gap, we developed a method to correlate tokens across data sources:
- Aggregate Audit Logs: We collected all 'request-created' and 'access-granted' audit logs.
- Map Token IDs: By matching the creator ID and approval time between logs and inventory, we associated the actual Token ID with the Token Request ID from the inventory.
- Enrich Inventory Data: We updated the inventory records with the actual Token IDs and token names from the audit logs.
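The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: the record types, field names, event labels, and the two-second timestamp tolerance are all hypothetical. Since the creator ID lives in the request-created log and the approval time in the access-granted log, the sketch joins those two logs on the actual Token ID before matching against the inventory.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class InventoryRecord:            # hypothetical shape of one inventory row
    token_request_id: int
    creator_id: int
    approved_at: datetime
    token_id: Optional[int] = None     # filled in by enrichment
    token_name: Optional[str] = None   # filled in by enrichment


@dataclass(frozen=True)
class AuditEvent:                 # hypothetical shape of one audit log entry
    kind: str                     # "request_created" or "access_granted"
    token_id: int
    token_name: str
    actor_id: int                 # creator for request_created, approver for access_granted
    timestamp: datetime


def enrich_inventory(inventory, events, tolerance=timedelta(seconds=2)):
    """Attach actual token IDs and names to inventory records.

    The tolerance is an assumption: it allows for small timestamp
    differences between the two sources.
    """
    # Step 1: aggregate the two relevant log types.
    creator_by_token = {
        e.token_id: e.actor_id for e in events if e.kind == "request_created"
    }
    grants = [e for e in events if e.kind == "access_granted"]

    enriched = []
    for record in inventory:
        # Step 2: match on creator ID and approval time.
        candidates = [
            g for g in grants
            if creator_by_token.get(g.token_id) == record.creator_id
            and abs(g.timestamp - record.approved_at) <= tolerance
        ]
        # Step 3: enrich only when the match is unambiguous.
        if len(candidates) == 1:
            match = candidates[0]
            record = replace(
                record, token_id=match.token_id, token_name=match.token_name
            )
        enriched.append(record)
    return enriched
```

Records with zero or multiple candidate matches are returned unchanged rather than guessed at, so ambiguous tokens can be flagged for manual review.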
While this approach allowed us to create a unified view of each PAT—linking its creation, approval, and subsequent usage—we recognized that this process might not be straightforward for teams attempting to achieve this capability in-house. The complexity of correlating disparate data sources requires significant effort and expertise. Therefore, we decided to reach out to GitHub to shed light on this gap and advocate for a more seamless solution.
Collaboration with GitHub for a Better Future
After implementing our workaround, we reached out to GitHub to disclose our findings. We were pleased to see that GitHub acknowledged the issue and planned to address it. Their response was prompt and collaborative, reflecting their commitment to enhancing user security.
We are happy to report that GitHub has completed the necessary updates to align the IDs across inventory and audit logs. Although this fix hasn't been officially announced yet, teams can already begin to observe the changes in their environments. This improvement not only validates our efforts but also exemplifies how vendors can work proactively to enhance security and auditability for their users.
Conclusion
NHIs are integral to modern enterprise operations, but they also introduce complex security challenges. Visibility and tracking are not optional—they are essential components of a robust security strategy. Our experience with GitHub underscores the importance of vendor cooperation in addressing security gaps.
At Clutch, we are committed to providing solutions that automatically and continuously discover and correlate all NHIs across your ecosystem. Our collaboration with GitHub reflects our mutual commitment to empowering organizations with the tools they need to secure their environments effectively.
Some vendors are stepping up to enhance security and auditability, while others lag behind, lacking even basic logging capabilities for NHIs. It's crucial for organizations to partner with vendors who prioritize security and provide the necessary tools for comprehensive monitoring.
Timeline
- October 18, 2022: GitHub introduces Fine-Grained Personal Access Tokens.
- May 21, 2024: Clutch delivers GitHub integration support.
- August 28, 2024: Clutch enhances auditability support for Fine-Grained Personal Access Tokens.
- August 29, 2024: Clutch provides feedback to GitHub.
- September 12, 2024: GitHub acknowledges the issue.
- Between October 13 and November 7, 2024: GitHub releases the fix, aligning data across sources.