Enhancing Developer Experience with Mercari's Unified Platform Interface (SFD)

Discover Mercari's Single Front Door (SFD), a unified platform interface designed to streamline developer workflows, scale GitOps, and enhance security through modular architecture, IAM, and RBAC.

At Mercari, the Enablement Tools and Interfaces team, responsible for enhancing developer experience and CI/CD, is developing the Single Front Door (SFD) service. SFD provides a unified interface to our platform, enabling GitOps at scale across thousands of components. This is achieved by integrating a widely used internal command-line tool with various external services, including Google Cloud, to streamline the developer experience, enforce governance, maintain consistency, and manage large-scale GitOps efficiently.

Recently, a cloud-hosted Model Context Protocol (MCP) server was introduced as an additional interface. This allows developers to access all workflows and platform capabilities directly through AI-powered tools, including their Integrated Development Environments (IDEs).

Concept

Mercari Group's platform has evolved over the years to support hundreds of production services and more than 1,600 active repositories. Numerous tools were provided to aid development, such as infrastructure-as-code repositories, an abstraction framework for infrastructure configurations and application manifests, and in-house CI/CD systems. However, these tools lacked a centralized interface, compelling developers to understand and interact with each one separately. Deploying a new service to production often required changes across at least five repositories and about a dozen steps. Common platform interactions included:

  • Committing Infrastructure as Code for resource management
  • Committing Kubernetes Manifests for service configurations
  • Committing Protobuf definitions for inter-service communication
  • Setting up debug environments using external tools/services
  • Committing edits to delete or manage cloud resources

These disparate processes were identified as significant impediments to development productivity. To address this, SFD was developed as a new unified interface for Mercari Group's platform, eliminating the need for users to directly interact with multiple tools for common platform operations.

Given Mercari's strong reliance on GitOps, most platform operations are executed through predefined workflows that modify files using templates. Users trigger these workflows via SFD, either through the internal CLI tool or by interacting with an AI agent in their IDEs via the MCP server. The workflows then use the developer's credentials to create the necessary changesets in configuration repositories on the user's behalf.
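To make the idea of a template-driven workflow concrete, here is a minimal sketch in Python. It is not SFD's actual implementation: the template contents, the repository path layout, and the function name `render_changeset` are all hypothetical, chosen only to show how user input can be turned into a file change that a workflow would then commit.

```python
from string import Template

# Hypothetical template for a Kubernetes Namespace manifest.
# The real SFD templates are not described in the article.
NAMESPACE_TEMPLATE = Template(
    "apiVersion: v1\n"
    "kind: Namespace\n"
    "metadata:\n"
    "  name: ${service}-${environment}\n"
    "  labels:\n"
    "    team: ${team}\n"
)


def render_changeset(service: str, environment: str, team: str) -> dict:
    """Return a mapping of file path -> rendered content.

    The path layout is an assumption made for illustration.
    substitute() raises KeyError if a placeholder is left unfilled,
    so an incomplete request fails before anything is committed.
    """
    path = f"manifests/{service}/{environment}/namespace.yaml"
    content = NAMESPACE_TEMPLATE.substitute(
        service=service, environment=environment, team=team
    )
    return {path: content}


changeset = render_changeset("checkout", "dev", "payments")
for path, content in changeset.items():
    print(path)
    print(content)
```

A real workflow step would hand a mapping like this to whatever creates the pull request. Keeping rendering separate from committing makes the template logic easy to test on its own.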

System Design

A workflow triggered via SFD follows this lifecycle:

  1. Users authenticate to SFD through an OAuth flow and communicate their intent via CLI prompts or natural language input through AI chat in their IDEs.
  2. The user interface (CLI or IDE agent) sends a corresponding workflow request to the backend, including the OAuth token.
  3. The backend securely stores workflow metadata and then triggers a workflow using Argo Workflows.
  4. Argo Workflows initiates execution containers for each step of the workflow.
  5. These steps execute in dependency order, following the Directed Acyclic Graph (DAG) declared in the workflow definition.
  6. Kubernetes containers for each workflow step execute the business logic, creating changesets on GitHub or other relevant components.
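
The ordering in step 5 can be illustrated with a short sketch. In SFD, Argo Workflows does the scheduling. The code below only shows the underlying idea with Python's standard `graphlib` module, and the step names are hypothetical. A step becomes eligible once every step it depends on has finished, so independent steps can fall into the same "wave".

```python
from graphlib import TopologicalSorter

# Hypothetical workflow definition: step name -> steps it depends on.
workflow_dag = {
    "render-templates": set(),
    "open-infra-pr": {"render-templates"},
    "open-manifest-pr": {"render-templates"},
    "notify-user": {"open-infra-pr", "open-manifest-pr"},
}


def execution_waves(dag: dict) -> list:
    """Group steps into waves whose dependencies are already satisfied.

    Raises graphlib.CycleError if the definition is not acyclic.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(workflow_dag)
for index, wave in enumerate(waves, start=1):
    print(index, wave)
```

With the definition above, `render-templates` runs first, the two pull-request steps become eligible together, and `notify-user` runs last.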

Challenges

Safeguarding GitOps at Scale

A critical aspect of successful GitOps is managing the scope of access for committers. As GitOps considers the Git repository the definitive configuration and operational ledger, determining who (or what) is authorized to commit is as crucial as the content being committed.

Each commit can trigger real, automated changes, including deployments, rollouts, environment modifications, policy shifts, and, as in Mercari's case, full-scale infrastructure provisioning. Therefore, the credentials linked to each commit act as an execution token with a potentially broad operational blast radius.

To mitigate this risk, workflow changes are managed by:

  • Using the user's own credentials to ensure self-triggered workflows do not bypass human review.
  • Limiting automations to only the necessary action scopes.

The team's solution involves initiating an OAuth flow using Mercari's organizational GitHub App before triggering any workflow. This provides each user with a temporary access token for the app, ensuring that workflows can only interact with repositories and components the GitHub App is authorized to access, even when the user's personal permissions are broader. This significantly reduces the potential blast radius.
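The scoping effect can be sketched as a set intersection. GitHub documents that a user access token issued through a GitHub App is limited to what both the app and the user are permitted to do. The repository names and permission labels below are hypothetical, and the model is deliberately simplified to (repository, access) pairs.

```python
def effective_permissions(app_permissions: set, user_permissions: set) -> set:
    """Permissions usable by a token issued to a user through an app.

    Simplified model: the token can do only what BOTH the app and the
    user are allowed to do.
    """
    return app_permissions & user_permissions


# Hypothetical grants, for illustration only.
app_permissions = {
    ("infra-config", "write"),
    ("k8s-manifests", "write"),
}
user_permissions = {
    ("infra-config", "write"),
    ("k8s-manifests", "write"),
    ("billing-secrets", "admin"),  # broad personal access
}

token_scope = effective_permissions(app_permissions, user_permissions)
print(sorted(token_scope))
```

Here the user's personal admin access to `billing-secrets` never reaches the workflow, because the app was not granted it.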

Upon generation, the token is sent to the backend, encrypted, and stored in a centralized datastore. During execution, each workflow job (running in isolated container pods) retrieves this token, guaranteeing that every changeset is created using consistent credentials despite running in separate environments.

Configuring IAM + RBAC for Secure External Service Access

Another significant challenge was providing secure access to external systems for both core backend services and Argo Workflows job containers. This included GCP services (Secret Manager, Datastore, KMS, Pub/Sub), GitHub, GetDX, and Slack, all without embedding static credentials or granting overly broad permissions. This was paramount, as Mercari's architecture adheres to a zero-trust model, requiring each workload to verify its identity and receive only the minimum necessary access.

Argo Workflows facilitated a clean solution using GCP Identity and Access Management (IAM) and Kubernetes Role-Based Access Control (RBAC). Each workflow step operates under its own Kubernetes Service Account, which is mapped to a tightly scoped GCP Service Account via Workload Identity. This architecture grants each step least-privileged access to GCP APIs through narrowly scoped IAM roles (e.g., roles/secretmanager.secretAccessor, roles/pubsub.publisher, roles/datastore.viewer) without relying on shared credentials or long-lived tokens.
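The mapping between the two kinds of service account can be sketched as follows, assuming GKE Workload Identity configured through the standard annotation on the Kubernetes Service Account. The project, namespace, and account names are hypothetical, and the function only builds the two pieces of configuration as plain dictionaries. It does not call any GCP or Kubernetes API.

```python
def workload_identity_binding(
    project_id: str, namespace: str, ksa_name: str, gsa_name: str
) -> tuple:
    """Build the two halves of a GKE Workload Identity mapping.

    Returns (Kubernetes ServiceAccount manifest, IAM policy binding that
    lets that Kubernetes identity impersonate the GCP service account).
    """
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    ksa_manifest = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            # Annotation GKE uses to link the KSA to a GCP service account.
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }
    iam_binding = {
        "role": "roles/iam.workloadIdentityUser",
        "members": [
            f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
        ],
    }
    return ksa_manifest, iam_binding


# Hypothetical names for a single workflow step.
manifest, binding = workload_identity_binding(
    project_id="example-project",
    namespace="sfd-workflows",
    ksa_name="open-pr-step",
    gsa_name="sfd-open-pr",
)
print(manifest["metadata"]["annotations"])
print(binding["members"][0])
```

Giving each step its own pair like this means the roles attached to one GCP service account never leak to another step, which matches the least-privilege goal described above.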

Backend services follow a similar pattern: each service maintains its unique identity and limited permissions, with no secrets injected into pods. RBAC governs workload actions within the cluster, while IAM controls their access to GCP. This combined approach enforces robust isolation and inherently supports Mercari's zero-trust design.

Envisioned End State: The Golden Path

The ultimate goal for this service is to become a fully modular workflow engine. In this state, every stage of the application lifecycle will be constructed from reusable, well-defined building blocks. Each block will represent a specific platform capability—such as service configuration, infrastructure provisioning, service mesh enablement, observability, and CI/CD integration—allowing teams to assemble custom workflows that meet their requirements while adhering to platform standards.

Rather than developing custom automation for each service, developers will leverage these prebuilt blocks, which come with production-ready defaults, including Terraform modules, Kubernetes manifests, service templates, and logging/metrics pipelines.

This model is designed for extensibility. Platform teams can introduce new building blocks for their components whenever they wish to expose capabilities via SFD, inherently promoting innersource. As the platform expands, so too will the catalog of reusable steps, allowing workflows to evolve organically while remaining aligned with organizational best practices.
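
As a rough illustration of the building-block idea, the sketch below models a catalog of named blocks that can be assembled into a workflow. Everything here is hypothetical: the block names, the `register` decorator, the file paths, and the context dictionary are assumptions made for illustration and do not describe SFD's internal design.

```python
from typing import Callable, Dict, List

Block = Callable[[dict], dict]

# Catalog of reusable building blocks, keyed by name.
CATALOG: Dict[str, Block] = {}


def register(name: str) -> Callable[[Block], Block]:
    """Decorator a platform team could use to contribute a block."""
    def decorator(fn: Block) -> Block:
        CATALOG[name] = fn
        return fn
    return decorator


@register("service-config")
def service_config(ctx: dict) -> dict:
    files = ctx.get("files", []) + [f"services/{ctx['service']}/config.yaml"]
    return {**ctx, "files": files}


@register("observability")
def observability(ctx: dict) -> dict:
    files = ctx.get("files", []) + [f"services/{ctx['service']}/dashboards.yaml"]
    return {**ctx, "files": files}


def assemble(block_names: List[str]) -> Block:
    """Compose the named blocks into one workflow, run in the given order."""
    missing = [name for name in block_names if name not in CATALOG]
    if missing:
        raise KeyError(f"unknown building blocks: {missing}")

    def run(ctx: dict) -> dict:
        for name in block_names:
            ctx = CATALOG[name](ctx)
        return ctx

    return run


workflow = assemble(["service-config", "observability"])
result = workflow({"service": "checkout"})
print(result["files"])
```

Adding a capability then amounts to registering one more block. Existing workflows are unaffected until a team opts in by listing the new name.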