Enhance Recommendations: Add SKU And Tags To Filter

Alex Johnson

Making cloud cost optimization more precise requires the ability to drill down into specific resource types and configurations. When analyzing cloud spending, you often find yourself looking at particular instance types or resources with specific metadata. This is where the RecommendationFilter message plays a crucial role. However, as it currently stands in costsource.proto, this filter is missing key components that plugins need to provide truly resource-specific cost optimization recommendations. This limitation prevents us from offering actionable insights based on instance types (SKUs) and resource metadata like tags, which are essential for effective cost management.

The Current Limitation: What's Missing in RecommendationFilter

The RecommendationFilter message, found in costsource.proto (lines 646-657), is designed to help narrow down the scope of recommendations you receive. It currently supports filtering by several criteria:

  • provider: This lets you specify the cloud provider you're using, such as "aws", "azure", or "gcp".
  • region: You can filter recommendations for a specific geographical region, like "us-east-1".
  • resource_type: This allows you to focus on a particular type of resource, for instance, "ec2" instances or "ebs" volumes.
  • category: You can filter by the type of recommendation, using an enum to define categories.
  • action_type: Similarly, you can filter by the recommended action, also defined by an enum.

While these fields are useful for broad filtering, they fall short when you need to pinpoint recommendations for specific configurations. To generate meaningful and actionable recommendations, plugins require more granular information. Specifically, they need:

  • SKU/Instance Type: For EC2, this would be details like "t2.medium" or "m5.large". For EBS, it could be "gp2" or "gp3". This specifies the exact machine or storage type you're working with.
  • Tags/Metadata: Resources often have custom tags assigned to them for organization and identification, such as {"size": "100"} for an EBS volume or {"env": "prod"} for an application environment. These tags provide critical context.

Without these essential fields, the GetRecommendations RPC simply doesn't have enough context to empower plugins to perform advanced optimization tasks. This means we can't effectively:

  1. Generate recommendations for upgrading EC2 instance generations (e.g., suggesting a move from a "t2.medium" to a "t3.medium" for better performance and potentially better cost-efficiency).
  2. Propose migrations to more cost-effective architectures, like moving from Intel-based instances to Graviton/ARM instances (e.g., from "m5" to "m6g").
  3. Advise on upgrading EBS volume types (e.g., recommending a switch from "gp2" to "gp3" for improved performance at a lower cost).

This gap means that users are missing out on highly specific, tailored advice that could lead to significant cost savings and performance improvements.

The Proposed Solution: Enhancing RecommendationFilter

To address these limitations and unlock more granular recommendation capabilities, we propose extending the RecommendationFilter message with two new, optional fields. These additions will allow plugins to receive the necessary context to generate precise recommendations for specific resource configurations. The updated RecommendationFilter would look like this:

// RecommendationFilter specifies criteria for filtering recommendations.
message RecommendationFilter {
  // provider filters by cloud provider (e.g., "aws", "azure", "gcp", "kubernetes")
  string provider = 1;
  // region filters by deployment region
  string region = 2;
  // resource_type filters by resource type
  string resource_type = 3;
  // category filters by recommendation category
  RecommendationCategory category = 4;
  // action_type filters by recommended action type
  RecommendationActionType action_type = 5;
  
  // NEW: sku filters by SKU or instance type (e.g., "t2.medium", "gp2")
  // When provided, plugins generate recommendations for this specific SKU.
  optional string sku = 6;
  
  // NEW: tags provides additional resource metadata for recommendation generation
  // Example: {"size": "100"} for EBS volume size, {"env": "prod"} for filtering
  // Map fields cannot carry the `optional` label in proto3; an empty map means unset.
  map<string, string> tags = 7;
}

Let's break down what these new fields bring:

  • sku (Optional String): This field is designed to accept the specific Stock Keeping Unit (SKU) or instance type identifier. When a user provides a sku, such as "t2.medium" for EC2 or "gp2" for EBS, the plugin knows to focus its analysis and generate recommendations tailored precisely for that particular type of resource. This is incredibly powerful for scenarios where you have specific instance families or storage types you need to optimize.

  • tags (Map<string, string>): This field carries arbitrary key-value pairs of resource metadata and may be left empty. For instance, when analyzing EBS volumes, you might pass {"size": "100"} to indicate that you are interested in 100 GB volumes, or use tags like {"env": "prod"} or {"application": "billing"} to restrict recommendations to resources tagged with a specific environment or application. This gives a flexible way to filter according to how users organize their cloud infrastructure.

By adding these two fields, we are significantly enhancing the RecommendationFilter's ability to capture the nuances of cloud resource configurations. This will enable plugins to move beyond generic advice and deliver highly specific, context-aware optimization suggestions, leading to more effective cost management and performance tuning.
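
To make the intended semantics concrete, here is a minimal Go sketch of how a plugin might apply sku and tags when narrowing a set of candidate resources. The Filter and Resource structs and the Matches function are hypothetical, simplified stand-ins, not the generated protobuf types or an SDK API. The sketch assumes that an unset field means "no constraint" and that every tag in the filter must be present on the resource with the same value.

```go
package main

import "fmt"

// Filter is a hypothetical, simplified stand-in for RecommendationFilter.
type Filter struct {
	ResourceType string
	Sku          string            // empty means "any SKU" (assumption)
	Tags         map[string]string // empty means "any tags" (assumption)
}

// Resource is a hypothetical description of a resource known to a plugin.
type Resource struct {
	ResourceType string
	Sku          string
	Tags         map[string]string
}

// Matches reports whether r satisfies every criterion set on f.
// Unset criteria are skipped, so a zero-value Filter matches everything.
func Matches(f Filter, r Resource) bool {
	if f.ResourceType != "" && f.ResourceType != r.ResourceType {
		return false
	}
	if f.Sku != "" && f.Sku != r.Sku {
		return false
	}
	// Every tag named in the filter must exist on the resource with an equal value.
	for key, want := range f.Tags {
		if got, ok := r.Tags[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	volume := Resource{
		ResourceType: "ebs",
		Sku:          "gp2",
		Tags:         map[string]string{"size": "100", "env": "prod"},
	}
	filter := Filter{
		ResourceType: "ebs",
		Sku:          "gp2",
		Tags:         map[string]string{"size": "100"},
	}
	fmt.Println("volume matches filter:", Matches(filter, volume))
}
```

Treating an unset field as "no constraint" keeps the behavior of requests that never populate the new fields unchanged.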

Concrete Use Cases: How This Powers Smarter Recommendations

Let's illustrate how the new sku and tags fields in the RecommendationFilter can be used to request specific cost optimization recommendations. Each example shows a request and the kind of tailored advice a plugin could return for it.

UC1: Optimizing EC2 Instance Generations

Imagine you have a fleet of older "t2.medium" EC2 instances and you're curious if newer generations offer better performance or cost savings. With the enhanced RecommendationFilter, you can make a request like this:

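// sku is declared `optional string`, so the generated Go field is a *string;
// proto.String comes from google.golang.org/protobuf/proto.
req := &GetRecommendationsRequest{
  Filter: &RecommendationFilter{
    Provider:     "aws",
    Region:       "us-east-1",
    ResourceType: "ec2",
    Sku:          proto.String("t2.medium"),  // NEW: Specifying the exact instance type
  },
}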

In this scenario, the plugin, upon receiving this request, would specifically analyze "t2.medium" instances in the specified region. It could then return a recommendation such as: "Upgrade your t2.medium instances to t3.medium for improved CPU burst performance and potential cost savings." This is far more actionable than a generic recommendation about upgrading EC2 instances.

UC2: Enhancing EBS Volume Type Recommendations

For storage, optimizing costs often involves choosing the right volume type. If you have numerous 100GB "gp2" EBS volumes and want to see if a migration to "gp3" makes sense, you can use the sku and tags fields together:

// proto.String (google.golang.org/protobuf/proto) wraps the optional sku as *string.
req := &GetRecommendationsRequest{
  Filter: &RecommendationFilter{
    Provider:     "aws",
    Region:       "us-east-1",
    ResourceType: "ebs",
    Sku:          proto.String("gp2"),  // NEW: Targeting 'gp2' volumes
    Tags:         map[string]string{"size": "100"},  // NEW: Focusing on 100GB volumes
  },
}

Here, the plugin understands you're interested in optimizing "gp2" EBS volumes that are specifically 100GB in size. The output could be: "Consider upgrading your 100GB gp2 EBS volumes to gp3 for enhanced IOPS and throughput capabilities, potentially leading to 20% cost savings." This level of detail is invaluable for fine-tuning storage costs.

UC3: Facilitating Graviton/ARM Migrations

Moving to ARM-based Graviton processors can offer substantial cost and performance benefits for compatible workloads. If you're running "m5.large" instances and want to explore Graviton alternatives, the sku field is key:

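// proto.String (google.golang.org/protobuf/proto) wraps the optional sku as *string.
req := &GetRecommendationsRequest{
  Filter: &RecommendationFilter{
    Provider:     "aws",
    Region:       "us-east-1",
    ResourceType: "ec2",
    Sku:          proto.String("m5.large"),  // NEW: Identifying the current instance type
  },
}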

In response, a plugin could suggest: "Migrate your m5.large instances to m6g.large (AWS Graviton) to achieve approximately 20% in cost savings and improved performance for compatible workloads." This directly guides users toward adopting more modern, efficient architectures.

These use cases demonstrate the practical power of adding sku and tags to the RecommendationFilter. They enable a move from generic advice to highly specific, actionable insights that directly address the user's infrastructure configuration, leading to more impactful cost optimizations and performance tuning.
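
On the plugin side, the three use cases could be served by something as simple as a lookup from the incoming SKU to a suggested successor. The following Go sketch is only an illustration: the upgradeTargets table and the SuggestUpgrade function are hypothetical, and the hard-coded pairs are taken from the examples above, whereas a real plugin would presumably derive them from pricing and compatibility data.

```go
package main

import (
	"fmt"
	"strings"
)

// upgradeTargets maps a current SKU family to a suggested successor family.
// The pairs mirror the examples in this article and are illustrative only.
var upgradeTargets = map[string]string{
	"t2":  "t3",  // UC1: newer EC2 generation
	"gp2": "gp3", // UC2: newer EBS volume type
	"m5":  "m6g", // UC3: Graviton/ARM migration
}

// SuggestUpgrade returns the suggested SKU for sku, or false if the table
// has no entry. Instance types look like "family.size"; EBS volume types
// have no size part, so the family is the whole string.
func SuggestUpgrade(sku string) (string, bool) {
	family, size, hasSize := strings.Cut(sku, ".")
	target, ok := upgradeTargets[family]
	if !ok {
		return "", false
	}
	if hasSize {
		return target + "." + size, true
	}
	return target, true
}

func main() {
	for _, sku := range []string{"t2.medium", "m5.large", "gp2", "c5.xlarge"} {
		if target, ok := SuggestUpgrade(sku); ok {
			fmt.Printf("%s: consider %s\n", sku, target)
		} else {
			fmt.Printf("%s: no suggestion in this table\n", sku)
		}
	}
}
```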

Exploring Alternative Approaches (And Why They Weren't Chosen)

When designing features, it's always wise to consider different ways to achieve the goal. For adding SKU and tag filtering to RecommendationFilter, several alternatives were evaluated. Understanding why these were passed over helps solidify the chosen solution's strengths.

Alternative A: Incorporating ResourceDescriptor into the Request

One idea was to introduce a ResourceDescriptor field directly into the GetRecommendationsRequest. This descriptor could potentially hold information like SKU and tags. However, this approach was rejected for a few key reasons:

  • Conflicting Semantics: The ResourceDescriptor is primarily intended for direct resource identification and lookup, not for filtering a set of potential recommendations. Using it for filtering would blur the lines between finding a specific resource and defining criteria for a set of recommendations.
  • Type Mismatch: The sku field within ResourceDescriptor might have different semantics (e.g., being a required identifier) compared to its intended use as an optional filter criterion here.
  • Complexity: It would add an extra layer of complexity without offering a significant advantage over dedicated filter fields.

Alternative B: Overloading the resource_type Field

Another thought was to combine the resource type and SKU into a single string, perhaps in a format like "ec2:t2.medium". This approach was also rejected because:

  • Breaking Existing Logic: This would fundamentally alter how the resource_type field is interpreted, potentially breaking existing clients and internal logic that relies on its current, simpler format.
  • Inconsistency: It introduces an inconsistent pattern for defining resource details, making the API harder to understand and maintain.
  • Parsing Difficulties: It would complicate parsing and validation logic within the backend, requiring extra steps to extract the resource type and SKU separately.
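
The cost of this alternative is easy to see in a small Go sketch. Both functions below are hypothetical: legacyIsEC2 imitates an existing consumer that compares resource_type directly, and splitResourceType is the extra parsing step every consumer would need if the "type:sku" format were adopted.

```go
package main

import (
	"fmt"
	"strings"
)

// splitResourceType shows the parsing every consumer would need if the SKU
// were packed into resource_type as "type:sku" (the rejected format).
// A value without a colon yields an empty sku.
func splitResourceType(v string) (resourceType, sku string) {
	resourceType, sku, _ = strings.Cut(v, ":")
	return resourceType, sku
}

// legacyIsEC2 mimics an existing consumer that compares the field directly.
func legacyIsEC2(resourceType string) bool {
	return resourceType == "ec2"
}

func main() {
	overloaded := "ec2:t2.medium"

	// An unchanged consumer no longer recognizes the resource type.
	fmt.Println("legacy check on overloaded value:", legacyIsEC2(overloaded))

	// Only consumers updated to parse the new format recover both parts.
	resourceType, sku := splitResourceType(overloaded)
	fmt.Println("parsed:", resourceType, sku, legacyIsEC2(resourceType))
}
```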

Alternative C: Relying Solely on a Generic metadata Map

A simpler approach might have been to only add a generic map<string, string> metadata = 6 field, without a dedicated sku field. This was rejected because:

  • SKU's Prominence: The SKU or instance type is a very common and critical piece of information for generating resource-specific recommendations. Giving it a first-class, dedicated field makes it more discoverable, easier to use, and more semantically clear.
  • Clarity and Documentation: An explicit sku field is self-documenting. Users immediately understand its purpose without having to guess what keys might be used in a generic metadata map (e.g., should it be "instance_type", "sku", or something else?).
  • Type Safety: A dedicated string field for the SKU offers a clearer contract than a conventionally named key inside a broader metadata map, where a misspelled or differently named key would silently be ignored.

By considering and rejecting these alternatives, the proposed solution of adding distinct, optional sku and tags fields emerges as the most robust, clear, and backward-compatible approach for enhancing the RecommendationFilter.

Ensuring Smooth Adoption: Backward Compatibility and Implementation

Introducing new features, especially to core message structures like RecommendationFilter, requires careful consideration of how existing systems will interact with the changes. Fortunately, the proposed enhancements are designed to be fully backward compatible, ensuring a seamless transition for current users and systems.

  • Optional Fields: Both new fields can be omitted. The sku field is declared optional, and a map field such as tags is simply empty when unset. A request made without them carries their default empty values (an empty string for sku and an empty map for tags). Existing clients that are not aware of these new fields will continue to function exactly as before, sending requests without populating them.
  • No Change in RPC Semantics: The core functionality and signatures of existing RPCs, like GetRecommendations, remain unchanged. The addition of optional fields does not alter how existing parameters are processed.
  • Plugin Adaptability: Plugins that implement the recommendation logic can easily adapt. They can check if the sku field is non-empty before attempting to generate SKU-specific recommendations. Similarly, they can utilize the tags map for context-aware filtering when it's provided. This allows for a gradual rollout where older plugins continue to work as before, while newer versions can leverage the enhanced capabilities.
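
A minimal Go sketch of that check follows. The Filter struct here is a hand-written stand-in for the generated RecommendationFilter, and describeScope is a hypothetical helper. The sketch relies on the fact that protoc-gen-go represents a proto3 `optional string` as a *string and emits a nil-safe getter, which GetSku imitates.

```go
package main

import "fmt"

// Filter is a hand-written stand-in for the generated RecommendationFilter.
// Sku is a *string because that is how protoc-gen-go represents a proto3
// `optional string` field.
type Filter struct {
	ResourceType string
	Sku          *string
	Tags         map[string]string
}

// GetSku returns the SKU, or "" when the filter or the field is unset,
// imitating the nil-safe getters of generated protobuf code.
func (f *Filter) GetSku() string {
	if f == nil || f.Sku == nil {
		return ""
	}
	return *f.Sku
}

// describeScope is a hypothetical plugin helper that picks the code path.
func describeScope(f *Filter) string {
	if sku := f.GetSku(); sku != "" {
		return "sku-specific recommendations for " + sku
	}
	return "generic recommendations"
}

func main() {
	// A request from a client that predates the sku field.
	legacy := &Filter{ResourceType: "ec2"}

	// A request from an updated client.
	sku := "t2.medium"
	updated := &Filter{ResourceType: "ec2", Sku: &sku}

	fmt.Println(describeScope(legacy))
	fmt.Println(describeScope(updated))
	fmt.Println(describeScope(nil)) // a missing filter also falls back safely
}
```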

Implementation Notes for a Smooth Rollout

To ensure a clean and effective implementation, the following steps are recommended:

  1. Version Bumping: After the changes are merged, the version of the relevant package should be bumped. A suggestion is to move to v0.4.8 to indicate these specific enhancements.
  2. SDK Updates: If there are SDK helper functions (like ApplyRecommendationFilter) that abstract the creation or application of these filters, they should be updated to accommodate the new fields. This ensures that SDK users have an easy way to utilize the new functionality.
  3. Proto Documentation: Add clear and concise documentation directly within the proto comments for the new sku and tags fields. Explain their purpose, expected format, and provide examples.
  4. Validation Logic: Implement validation for the new fields. For instance, if a sku is provided, it might be beneficial to validate it against known patterns for the specified provider and resource_type to catch potential errors early. This could involve checking if "t2.medium" is a valid EC2 instance type for AWS.
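
The validation step could be sketched in Go as follows. Everything in it is an assumption made for illustration: ValidateSku is a hypothetical helper, the regular expression only approximates the "family.size" shape of EC2 instance type names and is not an official grammar, and the EBS list is a fixed set that a real plugin would maintain itself. Unknown providers and resource types are accepted rather than rejected so that validation never blocks plugins the SDK knows nothing about.

```go
package main

import (
	"fmt"
	"regexp"
)

// ec2SkuPattern approximates EC2 instance type names ("family.size",
// e.g. "t2.medium", "m6g.large"). Illustrative only, not an official grammar.
var ec2SkuPattern = regexp.MustCompile(`^[a-z][a-z0-9-]*\.[a-z0-9-]+$`)

// ebsSkus lists EBS volume types accepted by this sketch.
var ebsSkus = map[string]bool{
	"gp2": true, "gp3": true, "io1": true, "io2": true,
	"st1": true, "sc1": true, "standard": true,
}

// ValidateSku checks sku against the rules known for provider and
// resourceType. An empty sku is always valid because the field is optional,
// and combinations without rules are accepted unchanged.
func ValidateSku(provider, resourceType, sku string) error {
	if sku == "" || provider != "aws" {
		return nil
	}
	switch resourceType {
	case "ec2":
		if !ec2SkuPattern.MatchString(sku) {
			return fmt.Errorf("sku %q does not look like an EC2 instance type", sku)
		}
	case "ebs":
		if !ebsSkus[sku] {
			return fmt.Errorf("sku %q is not a known EBS volume type", sku)
		}
	}
	return nil
}

func main() {
	fmt.Println(ValidateSku("aws", "ec2", "t2.medium"))
	fmt.Println(ValidateSku("aws", "ec2", "gp2"))
	fmt.Println(ValidateSku("aws", "ebs", "gp2"))
}
```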

By adhering to these principles and implementation notes, the integration of sku and tags into RecommendationFilter can be achieved with minimal disruption, maximizing the benefits of more granular cost optimization recommendations.

Related Resources and Further Reading

To dive deeper into the context and development surrounding these recommendation filters, you might find the following resources helpful:

  • Pulumicost-plugin-aws-public Pull Request: The development for the AWS public plugin is directly related and was blocked pending this RecommendationFilter enhancement. You can track the progress and specific implementation details here: pulumicost-plugin-aws-public #012-recommendations branch.
  • Internal Research Document: For a comprehensive overview of the background, discussions, and research that led to identifying this gap, refer to the internal document: specs/012-recommendations/research.md.

Understanding these related areas can provide valuable insights into the motivation behind this change and its impact on cloud cost management tools.
