Design Overview
The AWS Rust SDK aims to provide an official, high-quality, and complete interface to AWS services. We plan to eventually use the CRT to provide signing & credential management. The Rust SDK will provide first-class support for the CRT as well as Tokio & Hyper. The Rust SDK empowers advanced customers to bring their own HTTP/IO implementations.
Our design choices are guided by our Tenets.
Acknowledgments
The design builds on the learnings, ideas, hard work, and GitHub issues of the 142 Rusoto contributors & thousands of users who built this first and learned the hard way.
External API Overview
The Rust SDK is "modular" meaning that each AWS service is its own crate. Each crate provides two layers to access the service:
- The "fluent" API. For most use cases, a high level API that ties together connection management and serialization will be the quickest path to success.
#[tokio::main]
async fn main() {
let client = dynamodb::Client::from_env();
let tables = client
.list_tables()
.limit(10)
.send()
.await.expect("failed to load tables");
}
- The "low-level" API: It is also possible for customers to assemble the pieces themselves. This offers more control over operation construction & dispatch semantics:
#[tokio::main]
async fn main() {
let conf = dynamodb::Config::builder().build();
let conn = aws_hyper::Client::https();
let operation = dynamodb::ListTables::builder()
.limit(10)
.build(&conf)
.expect("invalid operation");
let tables = conn.call(operation).await.expect("failed to list tables");
}
The Fluent API is implemented as a thin wrapper around the core API to improve ergonomics.
Internals
The Rust SDK is built on Tower Middleware, Tokio & Hyper. We're continuing to iterate on the internals to enable running the AWS SDK in other executors & HTTP stacks. As an example, you can see a demo of adding reqwest as a custom HTTP stack to gain access to its HTTP proxy support!
For more details about the SDK internals, see Operation Design.
Code Generation
The Rust SDK is code generated from Smithy models, using Smithy code generation utilities. The code generation is written in Kotlin. More details can be found in the Smithy section.
Rust SDK Design Tenets
Unless you know better ones! These are our tenets today, but we'd love your thoughts. Do you wish we had different priorities? Let us know by opening an issue or starting a discussion.
- Batteries included, but replaceable. The AWS SDK for Rust should provide a best-in-class experience for many use cases, but customers will use the SDK in unique and unexpected ways. Meet customers where they are; strive to be compatible with their tools. Provide mechanisms to allow customers to make different choices.
- Make common problems easy to solve. The AWS SDK for Rust should make common problems solvable. Guide customers to patterns that set them up for long-term success.
- Design for the Future. The AWS SDK for Rust should evolve with AWS without breaking existing customers. APIs will evolve in unpredictable directions, new protocols will gain adoption, and new services will be created that we never could have imagined. Don’t simplify or unify code today that prevents evolution tomorrow.
Details, Justifications, and Ramifications
Batteries included, but replaceable.
Some customers will use the Rust SDK as their first experience with async Rust, potentially their first experience with Rust at all. They may not be familiar with Tokio or the concept of an async executor. We are not afraid to have an opinion about the best solution for most customers.
Other customers will come to the SDK with specific requirements. Perhaps they're integrating the SDK into a much larger project that uses async_std. Maybe they need to set custom headers, modify the user agent, or audit every request. They should be able to use the Rust SDK without forking it to meet their needs.
Make common problems easy to solve
If solving a common problem isn’t obvious from the API, it should be obvious from the documentation. The SDK should guide users towards the best solutions for common tasks, first with well-named methods, second with documentation, and third with real-world usage examples. Provide misuse-resistant APIs. Async Rust has the potential to introduce subtle bugs; the Rust SDK should help customers avoid them.
Design for the Future
APIs evolve in unpredictable ways, and it's crucial that the SDK can evolve without breaking existing customers. This means designing the SDK so that fundamental changes to the internals can be made without altering the external interface we surface to customers:
- Keeping the shared core as small & opaque as possible.
- Don’t leak our internal dependencies to customers.
- With every design choice, consider, "Can I reverse this choice in the future?"
This may not result in DRY code, and that’s OK! Code that is auto-generated has different goals and tradeoffs than code that has been written by hand.
Design FAQ
What is Smithy?
Smithy is the interface design language used by AWS services. smithy-rs allows users to generate a Rust client for any Smithy-based service (pending protocol support), including those outside of AWS.
Why is there one crate per service?
- Compilation time: Although it's possible to use cargo features to conditionally compile individual services, we decided that this added significant complexity to the generated code. In Rust the "unit of compilation" is a crate, so by using smaller crates we can get better compilation parallelism. Furthermore, ecosystem services like docs.rs have an upper limit on the maximum amount of time required to build an individual crate—if we packaged the entire SDK as a single crate, we would quickly exceed this limit.
- Versioning: It is expected that over time we may major-version-bump individual services. New updates will be pushed for some AWS service nearly every day. Maintaining separate crates allows us to only increment versions for the relevant pieces that change. See Independent Crate Versioning for more info.
Why don't the SDK service crates implement serde::Serialize or serde::Deserialize for any types?
- Compilation time: serde makes heavy use of several crates (proc-macro2, quote, and syn) that are very expensive to compile. Several service crates are already quite large and adding a serde dependency would increase compile times beyond what we consider acceptable. When we last checked, adding serde derives made compilation 23% slower.
- Misleading results: We can't use serde for serializing requests to AWS or deserializing responses from AWS because both sides of that process would require too much customization. Adding serialize/deserialize impls for operations has the potential to confuse users when they find it doesn't actually capture all the necessary information (like headers and trailers) sent in a request or received in a response.
In the future, we may add serde support behind a feature gate. However, we would only support this for operation Input and Output structs with the aim of making SDK-related tests easier to set up and run.
I want to add new request building behavior. Should I add that functionality to the make_operation codegen or write a request-altering middleware?
The main question to ask yourself in this case is "is this new behavior relevant to all services or is it only relevant to some services?"
- If the behavior is relevant to all services: Behavior like this should be defined as a middleware. Behavior like this is often AWS-specific and may not be relevant to non-AWS smithy clients. Middlewares are defined outside of codegen. One example of behavior that should be defined as a middleware is request signing because all requests to AWS services must be signed.
- If the behavior is only relevant to some services/depends on service model specifics: Behavior like this should be defined within make_operation. Avoid defining AWS-specific behavior within make_operation. One example of behavior that should be defined in make_operation is checksum validation because only some AWS services have APIs that support checksum validation.
"Wait a second" I hear you say, "checksum validation is part of the AWS smithy spec, not the core smithy spec. Why is that behavior defined in make_operation
?" The answer is that that feature only applies to some operations and we don't want to codegen a middleware that only supports a subset of operations for a service.
Smithy
The Rust SDK uses Smithy models and code generation tooling to generate an SDK. Smithy is an open source IDL (interface design language) developed by Amazon. Although the Rust SDK uses Smithy models for AWS services, smithy-rs and Smithy models in general are not AWS specific.
Design documentation here covers both our implementation of Smithy primitives (e.g. simple shapes) as well as more complex Smithy traits like Endpoint.
Internals
Smithy introduces a few concepts that are defined here:
- Shape: The core Smithy primitive. A Smithy model is composed of nested shapes defining an API.
- Symbol: A representation of a type, including namespaces and any dependencies required to use a type. A shape can be converted into a symbol by a SymbolVisitor. A SymbolVisitor maps shapes to types in your programming language (e.g. Rust). In the Rust SDK, see SymbolVisitor.kt. Symbol visitors are composable—many specific behaviors are mixed in via small & focused symbol providers, e.g. support for the streaming trait is mixed in separately.
- Writer: Writers are code generation primitives that collect code prior to being written to a file. Writers enable language-specific helpers to be added to simplify codegen for a given language. For example, smithy-rs adds rustBlock to RustWriter to create a "Rust block" of code:

  writer.rustBlock("struct Model") {
      model.fields.forEach {
          write("${field.name}: #T", field.symbol)
      }
  }

  This would produce something like:

  struct Model {
      field1: u32,
      field2: String
  }

- Generators: A Generator, e.g. StructureGenerator or UnionGenerator, generates more complex Rust code from a Smithy model. Protocol generators pull these individual tools together to generate code for an entire service / protocol.
A developer's view of code generation can be found in this document.
Simple Shapes
Smithy Type (links to design discussions) | Rust Type (links to Rust documentation)
---|---
blob | Vec<u8>
boolean | bool
string | String
byte | i8
short | i16
integer | i32
long | i64
float | f32
double | f64
bigInteger | BigInteger (Not implemented yet)
bigDecimal | BigDecimal (Not implemented yet)
timestamp | DateTime
document | Document
Big Numbers
Rust currently has no big-number types in the standard library, nor a universally accepted large-number crate. Until one is stabilized, a string representation is a reasonable compromise:
pub struct BigInteger(String);

pub struct BigDecimal(String);
This will enable us to add helpers over time as requested. Users will also be able to define their own conversions into their preferred large-number libraries.
As of 5/23/2021 BigInteger / BigDecimal are not included in AWS models. Implementation is tracked here.
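As an illustration of the kind of user-defined conversion this representation enables, here is a hedged sketch: the bigdecimal crate is just one example of a preferred large-number library, and the BigDecimal(String) newtype mirrors the compromise type above.

pub struct BigDecimal(String);

pub fn to_bigdecimal(value: &BigDecimal) -> Result<bigdecimal::BigDecimal, String> {
    // The string representation parses via the target crate's FromStr implementation.
    value.0.parse::<bigdecimal::BigDecimal>().map_err(|e| e.to_string())
}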
Timestamps
chrono is the current de facto library for datetimes in Rust, but it is pre-1.0. DateTimes are instead represented by an SDK-defined structure, modeled on std::time::Duration from the Rust standard library.
/// DateTime in time.
///
/// DateTime in time represented as seconds and sub-second nanos since
/// the Unix epoch (January 1, 1970 at midnight UTC/GMT).
///
/// This type can be converted to/from the standard library's [`SystemTime`]:
/// ```rust
/// # fn doc_fn() -> Result<(), aws_smithy_types::date_time::ConversionError> {
/// # use aws_smithy_types::date_time::DateTime;
/// # use std::time::SystemTime;
/// use std::convert::TryFrom;
///
/// let the_millennium_as_system_time = SystemTime::try_from(DateTime::from_secs(946_713_600))?;
/// let now_as_date_time = DateTime::from(SystemTime::now());
/// # Ok(())
/// # }
/// ```
///
/// The [`aws-smithy-types-convert`](https://crates.io/crates/aws-smithy-types-convert) crate
/// can be used for conversions to/from other libraries, such as
/// [`time`](https://crates.io/crates/time) or [`chrono`](https://crates.io/crates/chrono).
#[derive(PartialEq, Eq, Hash, Clone, Copy)]
pub struct DateTime {
    pub(crate) seconds: i64,
    /// Subsecond nanos always advances the wallclock time, even for times where seconds is negative
    ///
    /// Bigger subsecond nanos => later time
    pub(crate) subsecond_nanos: u32,
}
Functions in the aws-smithy-types-convert crate provide conversions to other crates, such as time or chrono.
Strings
Rust has two different String representations:
- String, an owned, heap-allocated string.
- &str, a reference to a string, owned elsewhere.
In an ideal world, input shapes, where there is no reason for the strings to be owned, would use &'a str. Outputs would likely use String. However, Smithy does not provide a distinction between input and output shapes.
A third compromise could be storing Arc<String>, an atomically reference-counted pointer to a String. This may be ideal for certain advanced users, but is likely to confuse most users and produces worse ergonomics. This is an open design area where we will seek user feedback. Rusoto uses String and there has been one feature request to date to change that.

Current models represent strings as String.
Document Types
Smithy defines the concept of "Document Types":
[Documents represent] protocol-agnostic open content that is accessed like JSON data. Open content is useful for modeling unstructured data that has no schema, data that can't be modeled using rigid types, or data that has a schema that evolves outside of the purview of a model. The serialization format of a document is an implementation detail of a protocol and MUST NOT have any effect on the types exposed by tooling to represent a document value.
Individual protocols define their own document serialization behavior, with some protocols such as AWS and EC2 Query not supporting document types.
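To make this concrete, document values surface in Rust as a recursive enum. The following is a sketch mirroring the shape of aws_smithy_types::Document; the variant details may differ from the current implementation.

use std::collections::HashMap;

/// A protocol-agnostic document value, accessed like JSON data.
#[derive(Clone, Debug, PartialEq)]
pub enum Document {
    Object(HashMap<String, Document>),
    Array(Vec<Document>),
    Number(Number),
    String(String),
    Bool(bool),
    Null,
}

/// An untyped number that may be a positive or negative integer, or a float.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Number {
    PosInt(u64),
    NegInt(i64),
    Float(f64),
}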
Recursive Shapes
Note: Throughout this document, the word "box" always refers to a Rust Box<T>, a heap-allocated pointer to T, and not the Smithy concept of boxed vs. unboxed.
Recursive shapes pose a problem for Rust, because the following Rust code will not compile:
struct TopStructure {
    intermediate: IntermediateStructure
}

struct IntermediateStructure {
    top: Option<TopStructure>
}
|
3 | struct TopStructure {
| ^^^^^^^^^^^^^^^^^^^ recursive type has infinite size
4 | intermediate: IntermediateStructure
| ----------------------------------- recursive without indirection
|
= help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to make `main::TopStructure` representable
This occurs because Rust types must have a size known at compile time. The way around this, as the message suggests, is to Box the offending type; smithy-rs implements this design in RecursiveShapeBoxer.kt. There is a touch of trickiness—only one element in the cycle needs to be boxed, but we need to select it deterministically so that we always pick the same element across multiple codegen runs. To do this, the Rust SDK will:
1. Topologically sort the graph of shapes.
2. Identify cycles that do not pass through an existing Box, List, Set, or Map.
3. For each cycle, select the earliest shape alphabetically and mark it as Box in the Smithy model by attaching the custom RustBoxTrait to the member.
4. Go back to step 1.
This would produce valid Rust:
struct TopStructure {
    intermediate: IntermediateStructure
}

struct IntermediateStructure {
    top: Box<Option<TopStructure>>
}
Backwards Compatibility Note!

Whether a member is Box-ed can change as a model evolves, which is a backwards-compatibility hazard. This can happen in two ways:
- A recursive link is added to an existing structure. This causes a member that was not boxed before to become Box<T>. Workaround: Mark the new member as Box in a customization.
- A field is removed from a structure that removes the recursive dependency. The SDK would generate T instead of Box<T>. Workaround: Mark the member that used to be boxed as Box in a customization. The Box will be unnecessary, but we will keep it for backwards compatibility.
Aggregate Shapes
Smithy Type | Rust Type
---|---
List | Vec<Member>
Set | Vec<Member>
Map | HashMap<String, Value>
Structure | struct
Union | enum
Most generated types are controlled by SymbolVisitor.
List
List objects in Smithy are transformed into vectors in Rust. Based on the output of the NullableIndex, the generated list may be Vec<T> or Vec<Option<T>>.
Set
Because floats are not hashable in Rust, for simplicity smithy-rs translates all sets into Vec<T> instead of HashSet<T>. In the future, a breaking change may be made to introduce a library-provided wrapper type for sets.
Map
Because map keys MUST be strings in Smithy, we avoid the hashability issue encountered with Set. There are optimizations that could be considered (e.g. since these maps will probably never be modified); however, pending customer feedback, Smithy maps become HashMap<String, V> in Rust.
Structure
See StructureGenerator.kt for more details.

Smithy structure becomes a struct in Rust. Backwards compatibility & usability concerns lead to a few design choices:
- As specified by NullableIndex, fields are Option<T> when Smithy models them as nullable.
- All structs are marked #[non_exhaustive].
- All structs derive Debug & PartialEq. Structs do not derive Eq because a float member may be added in the future.
- Struct fields are public. Public struct fields allow for split borrows. When working with output objects this significantly improves ergonomics, especially with optional fields:

  let out = dynamo::ListTablesOutput::new();
  out.some_field.unwrap(); // <- partial move, impossible with an accessor

- Builders are generated for structs that provide ergonomic and backwards compatible constructors. A builder for a struct is always available via the convenience method SomeStruct::builder().
- Structures manually implement Debug: in order to support the sensitive trait, a Debug implementation for structures is manually generated.
Example Structure Output
Smithy Input:
@documentation("<p>Contains I/O usage metrics...")
structure IOUsage {
@documentation("... elided")
ReadIOs: ReadIOs,
@documentation("... elided")
WriteIOs: WriteIOs
}
long ReadIOs
long WriteIOs
Rust Output:
/// <p>Contains I/O usage metrics for a command that was invoked.</p>
#[non_exhaustive]
#[derive(std::clone::Clone, std::cmp::PartialEq)]
pub struct IoUsage {
/// <p>The number of read I/O requests that the command made.</p>
pub read_i_os: i64,
/// <p>The number of write I/O requests that the command made.</p>
pub write_i_os: i64,
}
impl std::fmt::Debug for IoUsage {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let mut formatter = f.debug_struct("IoUsage");
formatter.field("read_i_os", &self.read_i_os);
formatter.field("write_i_os", &self.write_i_os);
formatter.finish()
}
}
/// See [`IoUsage`](crate::model::IoUsage)
pub mod io_usage {
/// A builder for [`IoUsage`](crate::model::IoUsage)
#[non_exhaustive]
#[derive(Debug, Clone, Default)]
pub struct Builder {
read_i_os: std::option::Option<i64>,
write_i_os: std::option::Option<i64>,
}
impl Builder {
/// <p>The number of read I/O requests that the command made.</p>
pub fn read_i_os(mut self, inp: i64) -> Self {
self.read_i_os = Some(inp);
self
}
/// <p>The number of read I/O requests that the command made.</p>
pub fn set_read_i_os(mut self, inp: Option<i64>) -> Self {
self.read_i_os = inp;
self
}
/// <p>The number of write I/O requests that the command made.</p>
pub fn write_i_os(mut self, inp: i64) -> Self {
self.write_i_os = Some(inp);
self
}
/// <p>The number of write I/O requests that the command made.</p>
pub fn set_write_i_os(mut self, inp: Option<i64>) -> Self {
self.write_i_os = inp;
self
}
/// Consumes the builder and constructs a [`IoUsage`](crate::model::IoUsage)
pub fn build(self) -> crate::model::IoUsage {
crate::model::IoUsage {
read_i_os: self.read_i_os.unwrap_or_default(),
write_i_os: self.write_i_os.unwrap_or_default(),
}
}
}
}
impl IoUsage {
/// Creates a new builder-style object to manufacture [`IoUsage`](crate::model::IoUsage)
pub fn builder() -> crate::model::io_usage::Builder {
crate::model::io_usage::Builder::default()
}
}
Union
A Smithy Union is modeled as an enum in Rust.
- Generated enums must be marked #[non_exhaustive].
- Generated enums must provide an Unknown variant. If parsing receives an unknown input that doesn't match any of the given union variants, Unknown should be constructed. Tracking Issue.
- Union members (enum variants) are not nullable, because Smithy union members cannot contain null values.
- When union members contain references to other shapes, we generate a wrapping variant (see below).
- Union members do not require #[non_exhaustive], because changing the shape targeted by a union member is not backwards compatible.
- is_variant and as_variant helper functions are generated to improve ergonomics.
Generated Union Example
The union generated for a simplified dynamodb::AttributeValue:

Smithy:
namespace test
union AttributeValue {
@documentation("A string value")
string: String,
bool: Boolean,
bools: BoolList,
map: ValueMap
}
map ValueMap {
key: String,
value: AttributeValue
}
list BoolList {
member: Boolean
}
Rust:
#[non_exhaustive]
#[derive(std::clone::Clone, std::cmp::PartialEq, std::fmt::Debug)]
pub enum AttributeValue {
/// a string value
String(std::string::String),
Bool(bool),
Bools(std::vec::Vec<bool>),
Map(std::collections::HashMap<std::string::String, crate::model::AttributeValue>),
}
impl AttributeValue {
pub fn as_bool(&self) -> Result<&bool, &crate::model::AttributeValue> {
if let AttributeValue::Bool(val) = &self { Ok(&val) } else { Err(self) }
}
pub fn is_bool(&self) -> bool {
self.as_bool().is_some()
}
pub fn as_bools(&self) -> Result<&std::vec::Vec<bool>, &crate::model::AttributeValue> {
if let AttributeValue::Bools(val) = &self { Ok(&val) } else { Err(self) }
}
pub fn is_bools(&self) -> bool {
self.as_bools().is_some()
}
pub fn as_map(&self) -> Result<&std::collections::HashMap<std::string::String, crate::model::AttributeValue>, &crate::model::AttributeValue> {
if let AttributeValue::Map(val) = &self { Ok(&val) } else { Err(self) }
}
pub fn is_map(&self) -> bool {
self.as_map().is_some()
}
pub fn as_string(&self) -> Result<&std::string::String, &crate::model::AttributeValue> {
if let AttributeValue::String(val) = &self { Ok(&val) } else { Err(self) }
}
pub fn is_string(&self) -> bool {
self.as_string().is_some()
}
}
Backwards Compatibility
AWS SDKs require that clients can evolve in a backwards compatible way as new fields and operations are added. The types generated by smithy-rs are specifically designed to meet these requirements. Specifically, the following transformations must not break compilation when upgrading to a new version:
- New operation added
- New member added to structure
- New union variant added
- New error added (todo)
- New enum variant added (todo)
However, the following changes are not backwards compatible:
- Error removed from operation.
In general, the best tool in Rust to solve these issues is the #[non_exhaustive] attribute, which will be explored in detail below.
New Operation Added
Before
$version: "1"
namespace s3
service S3 {
operations: [GetObject]
}
After
$version: "1"
namespace s3
service S3 {
operations: [GetObject, PutObject]
}
Adding support for a new operation is backwards compatible because SDKs do not expose any sort of "service trait" that provides an interface over an entire service. This prevents clients from inheriting or implementing an interface that would be broken by the addition of a new operation.
New member added to structure
Summary
- Structures are marked #[non_exhaustive].
- Structures must be instantiated using builders.
- Structures must not derive Default in the event that required fields are added in the future.
In general, adding a new public member to a structure in Rust is not backwards compatible. However, by applying the #[non_exhaustive] attribute to the structures generated by the Rust SDK, the Rust compiler will prevent users from using our structs in ways that prevent new fields from being added in the future. Note: in this context, the optionality of the fields is irrelevant.
Specifically, #[non_exhaustive] prohibits the following patterns:

- Direct structure instantiation:

  fn foo() {
      let ip_addr = IpAddress { addr: "192.168.1.1" };
  }

  If a new member is_local: boolean was added to the IpAddress structure, this code would not compile. To enable users to still construct our structures while maintaining backwards compatibility, all structures expose a builder, accessible at SomeStruct::builder():

  fn foo() {
      let ip_addr = IpAddress::builder().addr("192.168.1.1").build();
  }
- Structure destructuring:

  fn foo() {
      let IpAddress { addr } = some_ip_addr();
  }

  This will also fail to compile if a new member is added. Because the struct is #[non_exhaustive], the .. multifield wildcard MUST be included to support new fields being added in the future:

  fn foo() {
      let IpAddress { addr, .. } = some_ip_addr();
  }
Validation & Required Members
Adding a required member to a structure is not considered backwards compatible. When a required member is added to a structure:
- The builder will change to become fallible, meaning that instead of returning T it will return Result<T, BuildError>.
- Previous builder invocations that did not set the new field will stop compiling if this was the first required field, since the builder's return type changes.
- Previous builder invocations will now return a BuildError because the required field is unset (see the sketch below).
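The following self-contained sketch (with illustrative names, not generated code) shows the effect: once a required subnet member exists, the builder must return a Result, and call sites that never set it get a BuildError.

#[derive(Debug)]
pub struct BuildError(&'static str);

#[non_exhaustive]
#[derive(Debug)]
pub struct IpAddress {
    pub addr: String,
    pub subnet: String, // newly added required member
}

#[derive(Default)]
pub struct Builder {
    addr: Option<String>,
    subnet: Option<String>,
}

impl Builder {
    pub fn addr(mut self, addr: impl Into<String>) -> Self {
        self.addr = Some(addr.into());
        self
    }
    pub fn subnet(mut self, subnet: impl Into<String>) -> Self {
        self.subnet = Some(subnet.into());
        self
    }
    // Previously this returned `IpAddress` directly; with a required member it
    // must return `Result<IpAddress, BuildError>` instead.
    pub fn build(self) -> Result<IpAddress, BuildError> {
        Ok(IpAddress {
            addr: self.addr.unwrap_or_default(),
            subnet: self.subnet.ok_or(BuildError("subnet is required"))?,
        })
    }
}

fn main() {
    // A pre-existing call site that never set `subnet` now surfaces a BuildError.
    let result = Builder::default().addr("192.168.1.1").build();
    assert!(result.is_err());
}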
New union variant added
Similar to structures, #[non_exhaustive] also applies to unions. In order to allow new union variants to be added in the future, all unions (enum in Rust) generated by the Rust SDK must be marked with #[non_exhaustive]. Note: because new fields cannot be added to union variants, the union variants themselves do not need to be #[non_exhaustive]. To support new variants from services, each union contains an Unknown variant. By marking Unknown as non_exhaustive, we prevent customers from instantiating it directly.
#[non_exhaustive]
#[derive(std::clone::Clone, std::cmp::PartialEq, std::fmt::Debug)]
pub enum AttributeValue {
B(aws_smithy_types::Blob),
Bool(bool),
Bs(std::vec::Vec<aws_smithy_types::Blob>),
L(std::vec::Vec<crate::model::AttributeValue>),
M(std::collections::HashMap<std::string::String, crate::model::AttributeValue>),
N(std::string::String),
Ns(std::vec::Vec<std::string::String>),
Null(bool),
S(std::string::String),
Ss(std::vec::Vec<std::string::String>),
// By marking `Unknown` as non_exhaustive, we prevent client code from instantiating it directly.
#[non_exhaustive]
Unknown,
}
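As a sketch of calling code for the enum above: because AttributeValue is #[non_exhaustive], a wildcard arm is required by the compiler, and it also covers the Unknown variant plus any variants added later.

fn describe(value: &AttributeValue) -> &'static str {
    match value {
        AttributeValue::S(_) => "string",
        AttributeValue::N(_) => "number",
        AttributeValue::Null(_) => "null",
        // Mandatory due to #[non_exhaustive]; also matches `Unknown`.
        _ => "other or unknown",
    }
}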
Smithy Client
smithy-rs provides the ability to generate a client whose operations are defined by a Smithy model. The documents referenced here explain aspects of the client in greater detail.
What is the orchestrator?
At a very high level, an orchestrator is a process for transforming requests into responses. Please enjoy this fancy chart:
flowchart TB
    A(Orchestrate)-->|Input|B(Request serialization)
    B-->|Transmit Request|C(Connection)
    C-->|Transmit Response|D(Response deserialization)
    D-->|Success|E("Ok(Output)")
    D-->|Unretryable Failure|F("Err(SdkError)")
    D-->|Retryable Failure|C
This process is also referred to as the "request/response lifecycle." In this example, the types of "transmit request" and "transmit response" are protocol-dependent. Typical operations use HTTP, but we plan to support other protocols like MQTT in the future.
In addition to the above steps, the orchestrator must also handle:
- Endpoint resolution: figuring out which URL to send a request to.
- Authentication, identity resolution, and request signing: Figuring out who is sending the request, their credentials, and how we should insert the credentials into a request.
- Interceptors: Running lifecycle hooks at each point in the request/response lifecycle.
- Runtime Plugins: Resolving configuration from config builders.
- Retries: Categorizing responses from services and deciding whether to retry and how long to wait before doing so.
- Trace Probes: A sink for events that occur during the request/response lifecycle.
How is an orchestrator configured?
While the structure of an orchestrator is fixed, the actions it takes during its lifecycle are highly configurable. Users have two ways to configure this process:
- Runtime Plugins:
  - When can these be set? Any time before calling orchestrate.
  - When are they called by the orchestrator? In two batches, at the very beginning of orchestrate.
  - What can they do?
    - They can set configuration to be used by the orchestrator or in interceptors.
    - They can set interceptors.
  - Are they user-definable? No. At present, only smithy-rs maintainers may define these.
- Interceptors:
  - When can these be set? Any time before calling orchestrate.
  - When are they called by the orchestrator? At each step in the request-response lifecycle.
  - What can they do?
    - They can set configuration to be used by the orchestrator or in interceptors.
    - They can log information.
    - Depending on when they're run, they can modify the input, transmit request, transmit response, and the output/error.
  - Are they user-definable? Yes.
Configuration for a request is constructed by runtime plugins just after calling orchestrate. Configuration is stored in a ConfigBag: a hash map that's keyed on a type's TypeId (an opaque object, managed by the Rust compiler, which references some type).
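To illustrate the idea of a TypeId-keyed bag (only the core concept; the real ConfigBag has a richer, layered API), a toy version using just the standard library might look like this:

use std::any::{Any, TypeId};
use std::collections::HashMap;

#[derive(Default)]
struct Bag {
    items: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl Bag {
    fn put<T: Any + Send + Sync>(&mut self, value: T) {
        // The value's type is the key, so there is at most one entry per type.
        self.items.insert(TypeId::of::<T>(), Box::new(value));
    }
    fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.items
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

fn main() {
    #[derive(Debug)]
    struct MaxAttempts(u32);

    let mut bag = Bag::default();
    bag.put(MaxAttempts(3));
    assert_eq!(bag.get::<MaxAttempts>().map(|m| m.0), Some(3));
}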
What does the orchestrator do?
The orchestrator's work is divided into four phases:
NOTE: If an interceptor fails, then the other interceptors for that lifecycle event are still run. All resulting errors are collected and emitted together.
- Building the ConfigBag and mounting interceptors
  - This phase is fallible.
  - An interceptor context is created. This will hold request and response objects, making them available to interceptors.
  - All runtime plugins set at the client-level are run. These plugins can set config and mount interceptors. Any "read before execution" interceptors that have been set get run.
  - All runtime plugins set at the operation-level are run. These plugins can also set config and mount interceptors. Any new "read before execution" interceptors that have been set get run.
- Request Construction
  - This phase is fallible.
  - The "read before serialization" and "modify before serialization" interceptors are called.
  - The input is serialized into a transmit request.
  - The "read after serialization" and "modify before retry loop" interceptors are called.
  - Before making an attempt, the retry handler is called to check if an attempt should be made. The retry handler makes this decision for an initial attempt as well as for the retry attempts. If an initial attempt should be made, then the orchestrator enters the Dispatch phase. Otherwise, a throttling error is returned.
- Request Dispatch
  - This phase is fallible. This phase's tasks are performed in a loop. Retryable request failures will be retried, and unretryable failures will end the loop.
  - The "read before attempt" interceptors are run.
  - An endpoint is resolved according to an endpoint resolver. The resolved endpoint is then applied to the transmit request.
  - The "read before signing" and "modify before signing" interceptors are run.
  - An identity and a signer are resolved according to an authentication resolver. The signer then signs the transmit request with the identity.
  - The "read after signing", "read before transmit", and "modify before transmit" interceptors are run.
  - The transmit request is passed into the connection, and a transmit response is received.
  - The "read after transmit", "read before deserialization", and "modify before deserialization" interceptors are run.
  - The transmit response is deserialized.
  - The "read after attempt" and "modify before attempt completion" interceptors are run.
  - The retry strategy is called to check if a retry is necessary. If a retry is required, the Dispatch phase restarts. Otherwise, the orchestrator enters the Response Handling phase.
- Response Handling
  - This phase is fallible.
  - The "read after deserialization" and "modify before completion" interceptors are run.
  - Events are dispatched to any trace probes that the user has set.
  - The "read after execution" interceptors are run.
At the end of all this, the response is returned. If an error occurred at any point, then the response will contain one or more errors, depending on what failed. Otherwise, the output will be returned.
How is the orchestrator implemented in Rust?
Avoiding generics at all costs
In designing the orchestrator, we sought to solve the problems we had with the original smithy client. The client made heavy use of generics, allowing for increased performance, but at the cost of increased maintenance burden and increased compile times. The Rust compiler, usually very helpful, isn't well-equipped to explain trait errors when bounds are this complex, and so the resulting client was difficult to extend. Trait aliases would have helped, but they're not (at the time of writing) available.
The type signatures for the old client and its call method:
impl<C, M, R> Client<C, M, R>
where
C: bounds::SmithyConnector,
M: bounds::SmithyMiddleware<C>,
R: retry::NewRequestPolicy,
{
pub async fn call<O, T, E, Retry>(&self, op: Operation<O, Retry>) -> Result<T, SdkError<E>>
where
O: Send + Sync,
E: std::error::Error + Send + Sync + 'static,
Retry: Send + Sync,
R::Policy: bounds::SmithyRetryPolicy<O, T, E, Retry>,
Retry: ClassifyRetry<SdkSuccess<T>, SdkError<E>>,
bounds::Parsed<<M as bounds::SmithyMiddleware<C>>::Service, O, Retry>:
Service<Operation<O, Retry>, Response=SdkSuccess<T>, Error=SdkError<E>> + Clone,
{
self.call_raw(op).await.map(|res| res.parsed)
}
pub async fn call_raw<O, T, E, Retry>(
&self,
op: Operation<O, Retry>,
) -> Result<SdkSuccess<T>, SdkError<E>>
where
O: Send + Sync,
E: std::error::Error + Send + Sync + 'static,
Retry: Send + Sync,
R::Policy: bounds::SmithyRetryPolicy<O, T, E, Retry>,
Retry: ClassifyRetry<SdkSuccess<T>, SdkError<E>>,
// This bound is not _technically_ inferred by all the previous bounds, but in practice it
// is because _we_ know that there is only implementation of Service for Parsed
// (ParsedResponseService), and it will apply as long as the bounds on C, M, and R hold,
// and will produce (as expected) Response = SdkSuccess<T>, Error = SdkError<E>. But Rust
// doesn't know that -- there _could_ theoretically be other implementations of Service for
// Parsed that don't return those same types. So, we must give the bound.
bounds::Parsed<<M as bounds::SmithyMiddleware<C>>::Service, O, Retry>:
Service<Operation<O, Retry>, Response=SdkSuccess<T>, Error=SdkError<E>> + Clone,
{
// The request/response lifecycle
}
}
The type signature for the new orchestrate method:
pub async fn orchestrate(
input: Input,
runtime_plugins: &RuntimePlugins,
// Currently, SdkError is HTTP-only. We currently use it for backwards-compatibility purposes.
// The `HttpResponse` generic will likely be removed in the future.
) -> Result<Output, SdkError<Error, HttpResponse>> {
// The request/response lifecycle
}
"Wait a second," I hear you ask, "I see an Input and an Output there, but you're not declaring any generic type arguments. What gives?"
I'm glad you asked. Generally, when you need traits but you aren't willing to use generic type arguments, then you must Box. Polymorphism is achieved through dynamic dispatch instead of static dispatch, and this comes with a small runtime cost.
So, what are Input and Output? They're our own special flavor of a boxed trait object.
pub type Input = TypeErasedBox;
pub type Output = TypeErasedBox;
pub type Error = TypeErasedBox;
/// A new-type around `Box<dyn Any + Send + Sync>`
#[derive(Debug)]
pub struct TypeErasedBox {
inner: Box<dyn Any + Send + Sync>,
}
The orchestrator itself doesn't know about any concrete types. Instead, it passes boxed data between the various components of the request/response lifecycle. Individual components access data in two ways:
- From the ConfigBag:
  - (with an accessor) let retry_strategy = cfg.retry_strategy();
  - (with the get method) let retry_strategy = cfg.get::<Box<dyn RetryStrategy>>()
- From the InterceptorContext:
  - (owned) let put_object_input: PutObjectInput = ctx.take_input().unwrap().downcast().unwrap()?;
  - (by reference) let put_object_input = ctx.input().unwrap().downcast_ref::<PutObjectInput>().unwrap();
Users can only call ConfigBag::get or downcast a TypeErasedBox to types they have access to, which allows maintainers to ensure encapsulation. For example: a plugin writer may declare a private type, place it in the config bag, and then later retrieve it. Because the type is private, only code in the same crate/module can ever insert or retrieve it. Therefore, there's less worry that someone will depend on a hidden, internal detail and no worry they'll accidentally overwrite a type in the bag.
NOTE: When inserting values into a config bag, using one of the set_<component> methods is always preferred, as this prevents mistakes related to inserting similar, but incorrect types.
The actual code
The current implementation of orchestrate is defined here, in the aws-smithy-runtime crate. Related code can be found in the aws-smithy-runtime-api crate.
Frequently asked questions
Why can't users create and use their own runtime plugins?
We chose to hide the runtime plugin API from users because we are concerned that exposing it will cause more problems than it solves. Instead, we encourage users to use interceptors. This is because, when setting a runtime plugin, any existing runtime plugin with the same type will be replaced. For example, there can only be one retry strategy or response deserializer. Errors resulting from unintentionally overriding a plugin would be difficult for users to diagnose, and would consume valuable development time.
Why does the orchestrator exist?
The orchestrator exists because there is an AWS-internal initiative to bring the architecture of all AWS SDKs closer to one another.
Why does this document exist when there's already an orchestrator RFC?
Because RFCs become outdated as designs evolve. It is our intention to keep this document up to date with our current implementation.
Identity and Auth in Clients
The Smithy specification establishes several auth related modeling traits that can be applied to operation and service shapes. To briefly summarize:
- The auth schemes that are supported by a service are declared on the service shape
- Operation shapes MAY specify the subset of service-defined auth schemes they support. If none are specified, then all service-defined auth schemes are supported.
A smithy code generator MUST support at least one auth scheme for every modeled operation, but it need not support ALL modeled auth schemes.
This design document establishes how smithy-rs implements this specification.
Terminology
- Auth: Either a shorthand that represents both of the authentication and authorization terms below, or an ambiguous representation of one of them. In this doc, this term will always refer to both.
- Authentication: The process of proving an entity is who they claim they are, sometimes referred to as AuthN.
- Authorization: The process of granting an authenticated entity the permission to do something, sometimes referred to as AuthZ.
- Identity: The information required for authentication.
- Signing: The process of attaching metadata to a request that allows a server to authenticate that request.
Overview of Smithy Client Auth
There are two stages to identity and auth:
- Configuration
- Execution
The configuration stage
First, let's establish the aspects of auth that can be configured from the model at codegen time.
- Data
  - AuthSchemeOptionResolverParams: parameters required to resolve auth scheme options. These parameters are allowed to come from both the client config and the operation input structs.
  - AuthSchemes: a list of auth schemes that can be used to sign HTTP requests. This information comes directly from the service model.
  - AuthSchemeProperties: configuration from the auth scheme for the signer.
  - IdentityResolvers: list of available identity resolvers.
- Implementations
  - IdentityResolver: resolves an identity for use in authentication. There can be multiple identity resolvers that need to be selected from.
  - Signer: a signing implementation that signs a HTTP request.
  - ResolveAuthSchemeOptions: resolves a list of auth scheme options for a given operation and its inputs.
As it is undocumented (at the time of writing), this document assumes that the code generator creates one service-level runtime plugin and one operation-level runtime plugin per operation, henceforth referred to as the service runtime plugin and the operation runtime plugin.
The code generator emits code to add identity resolvers and HTTP auth schemes to the config bag in the service runtime plugin. It then emits code to register an interceptor in the operation runtime plugin that reads the operation input to generate the auth scheme option resolver params (which also get added to the config bag).
The execution stage
At a high-level, the process of resolving an identity and signing a request looks as follows:
1. Retrieve the AuthSchemeOptionResolverParams from the config bag. The AuthSchemeOptionResolverParams allow client config and operation inputs to play a role in which auth scheme option is selected.
2. Retrieve the ResolveAuthSchemeOptions impl from the config bag, and use it to resolve the auth scheme options available with the AuthSchemeOptionResolverParams. The returned auth scheme options are in priority order.
3. Retrieve the IdentityResolvers list from the config bag.
4. For each auth scheme option (see the sketch below):
   - Attempt to find an HTTP auth scheme for that auth scheme option in the config bag (from the AuthSchemes list).
   - If an auth scheme is found:
     - Use the auth scheme to extract the correct identity resolver from the IdentityResolvers list.
     - Retrieve the Signer implementation from the auth scheme.
     - Use the IdentityResolver to resolve the identity needed for signing.
     - Sign the request with the identity, and break out of the loop from step 4.
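A simplified, self-contained sketch of the selection loop in step 4, with stub types standing in for the real runtime traits (the actual traits are shown in the next section):

use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct AuthSchemeId(&'static str);

struct Identity; // stand-in for resolved identity data
struct Request;  // stand-in for the transmit request

trait IdentityResolver {
    fn resolve_identity(&self) -> Identity;
}
trait Signer {
    fn sign(&self, request: &mut Request, identity: &Identity);
}

struct AuthScheme {
    scheme_id: AuthSchemeId,
    signer: Box<dyn Signer>,
}

fn sign_request(
    request: &mut Request,
    // Resolved auth scheme options, in priority order.
    options: &[AuthSchemeId],
    auth_schemes: &HashMap<AuthSchemeId, AuthScheme>,
    identity_resolvers: &HashMap<AuthSchemeId, Box<dyn IdentityResolver>>,
) -> Result<(), &'static str> {
    for option in options {
        // Attempt to find an auth scheme for this option; otherwise try the next one.
        let Some(scheme) = auth_schemes.get(option) else { continue };
        // Use the scheme to select the matching identity resolver.
        let Some(resolver) = identity_resolvers.get(&scheme.scheme_id) else { continue };
        // Resolve the identity, sign the request, and stop looking.
        let identity = resolver.resolve_identity();
        scheme.signer.sign(request, &identity);
        return Ok(());
    }
    Err("no satisfiable auth scheme option")
}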
In general, it is assumed that if an HTTP auth scheme exists for an auth scheme option, then an identity resolver also exists for that auth scheme option. Otherwise, the auth option was configured incorrectly during codegen.
How this looks in Rust
The client will use trait objects and dynamic dispatch for the IdentityResolver, Signer, and AuthSchemeOptionResolver implementations. Generics could potentially be used, but the number of generic arguments and trait bounds in the orchestrator would balloon to unmaintainable levels if each configurable implementation in it was made generic.
These traits look like this:
#[derive(Clone, Debug)]
pub struct AuthSchemeId {
scheme_id: &'static str,
}
pub trait ResolveAuthSchemeOptions: Send + Sync + Debug {
fn resolve_auth_scheme_options<'a>(
&'a self,
params: &AuthSchemeOptionResolverParams,
) -> Result<Cow<'a, [AuthSchemeId]>, BoxError>;
}
pub trait IdentityResolver: Send + Sync + Debug {
fn resolve_identity(&self, config: &ConfigBag) -> BoxFallibleFut<Identity>;
}
pub trait Signer: Send + Sync + Debug {
/// Return a signed version of the given request using the given identity.
///
/// If the provided identity is incompatible with this signer, an error must be returned.
fn sign_http_request(
&self,
request: &mut HttpRequest,
identity: &Identity,
auth_scheme_endpoint_config: AuthSchemeEndpointConfig<'_>,
runtime_components: &RuntimeComponents,
config_bag: &ConfigBag,
) -> Result<(), BoxError>;
}
IdentityResolver and Signer implementations are both given an Identity, but will need to understand what the concrete data type underlying that identity is. The Identity struct uses an Arc<dyn Any> to represent the actual identity data so that generics are not needed in the traits:
#[derive(Clone, Debug)]
pub struct Identity {
data: Arc<dyn Any + Send + Sync>,
expiration: Option<SystemTime>,
}
Identities can often be cached and reused across several requests, which is why the Identity uses Arc rather than Box. This also reduces the allocations required. The signer implementations will use downcasting to access the identity data types they understand. For example, with AWS SigV4, it might look like the following:
fn sign_http_request(
&self,
request: &mut HttpRequest,
identity: &Identity,
auth_scheme_endpoint_config: AuthSchemeEndpointConfig<'_>,
runtime_components: &RuntimeComponents,
config_bag: &ConfigBag,
) -> Result<(), BoxError> {
let aws_credentials = identity.data::<Credentials>()
.ok_or_else(|| "The SigV4 signer requires AWS credentials")?;
let access_key = &aws_credentials.secret_access_key;
// -- snip --
}
Also note that identity data structs are expected to censor their own sensitive fields, as Identity implements the automatically derived Debug trait.
Challenges with this Identity design
A keen observer would note that there is an expiration field on Identity, and may ask, "what about non-expiring identities?" This is the result of a limitation on Box<dyn Any>, where it can only be downcast to concrete types. There is no way to downcast to a dyn Trait since the information required to verify that the type implements that trait is lost at compile time (a std::any::TypeId only encodes information about the concrete type).
In an ideal world, it would be possible to extract the expiration like this:
pub trait ExpiringIdentity {
fn expiration(&self) -> SystemTime;
}
let identity: Identity = some_identity();
if let Some(expiration) = identity.data::<&dyn ExpiringIdentity>().map(ExpiringIdentity::expiration) {
// make a decision based on that expiration
}
Theoretically, you should be able to save off additional type information alongside the Box<dyn Any> and use unsafe code to transmute to known traits, but it is difficult to implement in practice, and it would add unsafe code in a security-critical piece of code that could otherwise be avoided.
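To make the limitation concrete, here is a small standalone illustration (unrelated to the real Identity type) of why downcasting only reaches concrete types:

use std::any::Any;

trait Expiring {
    fn expiration(&self) -> u64;
}

struct Token {
    expires_at: u64,
}

impl Expiring for Token {
    fn expiration(&self) -> u64 {
        self.expires_at
    }
}

fn main() {
    let data: Box<dyn Any> = Box::new(Token { expires_at: 10 });
    // Downcasting to the concrete type works...
    assert!(data.downcast_ref::<Token>().is_some());
    // ...but `data.downcast_ref::<dyn Expiring>()` does not compile: `downcast_ref`
    // only accepts sized, concrete types, so asking "does this value implement the
    // trait?" is impossible through `Any` alone.
}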
The expiration field is a special case that is allowed onto the Identity struct directly since identity cache implementations will always need to be aware of this piece of information, and having it as an Option still allows for non-expiring identities.
Ultimately, this design constrains Signer implementations to concrete types. There is no world where a Signer can operate across multiple unknown identity data types via a trait, and that should be OK since the signer implementation can always be wrapped with an implementation that is aware of the concrete type provided by the identity resolver, and can do any necessary conversions.
Detailed Error Explanations
This page collects detailed explanations for some errors. If you encounter an error and are interested in learning more about what it means and why it occurs, check here.
If you can't find the explanation on this page, please file an issue asking for it to be added.
"Connection encountered an issue and should not be re-used. Marking it for closure"
The SDK clients each maintain their own connection pool (except when they share an HttpClient). By the convention of some services, when a request fails due to a transient error, that connection should not be re-used for a retry. Instead, it should be dropped and a new connection created. This prevents clients from repeatedly sending requests over a failed connection. This feature is referred to as "connection poisoning" internally.
Transient Errors
When requests to a service time out, or when a service responds with a 500, 502, 503, or 504 error, it's considered a 'transient error'. Transient errors are often resolved by making another request.
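As a minimal sketch of that classification rule (the real SDK retry classifiers are more involved):

fn is_transient_error(status: Option<u16>, timed_out: bool) -> bool {
    // Timeouts and 500/502/503/504 responses are treated as transient.
    timed_out || matches!(status, Some(500 | 502 | 503 | 504))
}

fn main() {
    assert!(is_transient_error(Some(503), false));
    assert!(is_transient_error(None, true)); // timed out before any response arrived
    assert!(!is_transient_error(Some(400), false));
}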
When retrying transient errors, the SDKs may avoid re-using connections to overloaded or otherwise unavailable service endpoints, choosing instead to establish a new connection. This behavior is referred to internally as "connection poisoning" and is configurable.
To configure this behavior, set the reconnect_mode in an SDK client config's RetryConfig.
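A hedged sketch of what that configuration might look like; the module path, method, and variant names below are assumptions based on the description above rather than verified API, so check the current RetryConfig documentation before relying on them.

use aws_smithy_types::retry::{ReconnectMode, RetryConfig};

fn retry_config() -> RetryConfig {
    // Opt out of "connection poisoning" by always re-using connections (assumed names).
    RetryConfig::standard().with_reconnect_mode(ReconnectMode::ReuseAllConnections)
}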
Smithy Server
Smithy Rust provides the ability to generate a server whose operations are provided by the customer.
- Middleware
- Instrumentation
- Accessing Un-modelled Data
- The Anatomy of a Service
- Generating Common Service Code
Middleware
The following document provides a brief survey of the various positions middleware can be inserted in Smithy Rust.
We use the Pokémon service as a reference model throughout.
/// A Pokémon species forms the basis for at least one Pokémon.
@title("Pokémon Species")
resource PokemonSpecies {
identifiers: {
name: String
},
read: GetPokemonSpecies,
}
/// A user's current Pokémon storage.
resource Storage {
identifiers: {
user: String
},
read: GetStorage,
}
/// The Pokémon Service allows you to retrieve information about Pokémon species.
@title("Pokémon Service")
@restJson1
service PokemonService {
version: "2021-12-01",
resources: [PokemonSpecies, Storage],
operations: [
GetServerStatistics,
DoNothing,
CapturePokemon,
CheckHealth
],
}
Introduction to Tower
Smithy Rust is built on top of tower.
Tower is a library of modular and reusable components for building robust networking clients and servers.
The tower library is centered around two main interfaces, the Service trait and the Layer trait.
The Service trait can be thought of as an asynchronous function from a request to a response, async fn(Request) -> Result<Response, Error>, coupled with a mechanism to handle back pressure, while the Layer trait can be thought of as a way of decorating a Service, transforming either the request or response.
Middleware in tower typically conforms to the following pattern: a Service implementation of the form

pub struct NewService<S> {
    inner: S,
    /* auxiliary data */
}
and a complementary
extern crate tower;
pub struct NewService<S> { inner: S }
use tower::{Layer, Service};

pub struct NewLayer {
    /* auxiliary data */
}

impl<S> Layer<S> for NewLayer {
    type Service = NewService<S>;

    fn layer(&self, inner: S) -> Self::Service {
        NewService {
            inner,
            /* auxiliary fields */
        }
    }
}
The NewService modifies the behavior of the inner Service S, while the NewLayer takes auxiliary data and constructs NewService<S> from S.

Customers are then able to stack middleware by composing Layers using combinators such as ServiceBuilder::layer and Stack.
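As a small usage sketch, assuming the NewLayer/NewService definitions above and tower's util feature for service_fn: a layer is applied by wrapping an inner service, and stacking is just applying further layers in sequence (or composing them via ServiceBuilder::layer).

use std::convert::Infallible;
use tower::{service_fn, Layer};

fn main() {
    // An inner `Service` built from an async function.
    let inner = service_fn(|req: String| async move { Ok::<usize, Infallible>(req.len()) });
    // Wrap it with the layer defined above; apply additional layers the same way.
    let _svc = NewLayer {}.layer(inner);
}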
Applying Middleware
One of the primary goals is to provide configurability and extensibility through the application of middleware. The customer is able to apply Layers in a variety of key places during the request/response lifecycle. The following schematic labels each configurable middleware position from A to D:
stateDiagram-v2
    state in <<fork>>
    state "GetPokemonSpecies" as C1
    state "GetStorage" as C2
    state "DoNothing" as C3
    state "..." as C4
    direction LR
    [*] --> in : HTTP Request
    UpgradeLayer --> [*]: HTTP Response
    state A {
        state PokemonService {
            state RoutingService {
                in --> UpgradeLayer: HTTP Request
                in --> C2: HTTP Request
                in --> C3: HTTP Request
                in --> C4: HTTP Request
                state B {
                    state C1 {
                        state C {
                            state UpgradeLayer {
                                direction LR
                                [*] --> Handler: Model Input
                                Handler --> [*] : Model Output
                                state D {
                                    Handler
                                }
                            }
                        }
                    }
                    C2
                    C3
                    C4
                }
            }
        }
    }
    C2 --> [*]: HTTP Response
    C3 --> [*]: HTTP Response
    C4 --> [*]: HTTP Response
where UpgradeLayer is the Layer converting Smithy model structures to HTTP structures and the RoutingService is responsible for routing requests to the appropriate operation.
A. Outer Middleware
The output of the Smithy service builder provides the user with a Service<http::Request, Response = http::Response> implementation. A Layer can be applied around the entire Service.
extern crate aws_smithy_http_server;
extern crate pokemon_service_server_sdk;
extern crate tower;
use std::time::Duration;

struct TimeoutLayer;
impl TimeoutLayer {
    fn new(t: Duration) -> Self {
        Self
    }
}

impl<S> Layer<S> for TimeoutLayer {
    type Service = S;
    fn layer(&self, svc: S) -> Self::Service {
        svc
    }
}

use pokemon_service_server_sdk::{input::*, output::*, error::*};
let handler = |req: GetPokemonSpeciesInput| async {
    Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!())
};

use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter};
use aws_smithy_http_server::routing::{Route, RoutingService};
use pokemon_service_server_sdk::{PokemonServiceConfig, PokemonService};
use tower::Layer;

let config = PokemonServiceConfig::builder().build();

// This is a HTTP `Service`.
let app = PokemonService::builder(config)
    .get_pokemon_species(handler)
    /* ... */
    .build()
    .unwrap();
let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app;

// Construct `TimeoutLayer`.
let timeout_layer = TimeoutLayer::new(Duration::from_secs(3));

// Apply a 3 second timeout to all responses.
let app = timeout_layer.layer(app);
B. Route Middleware
A single layer can be applied to all routes inside the Router. This exists as a method on the PokemonServiceConfig builder object, which is passed into the service builder.
extern crate tower;
extern crate pokemon_service_server_sdk;
extern crate aws_smithy_http_server;
use tower::{util::service_fn, Layer};
use std::time::Duration;
use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter};
use aws_smithy_http_server::routing::{Route, RoutingService};
use pokemon_service_server_sdk::{input::*, output::*, error::*};

let handler = |req: GetPokemonSpeciesInput| async {
    Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!())
};

struct MetricsLayer;
impl MetricsLayer {
    pub fn new() -> Self {
        Self
    }
}

impl<S> Layer<S> for MetricsLayer {
    type Service = S;
    fn layer(&self, svc: S) -> Self::Service {
        svc
    }
}

use pokemon_service_server_sdk::{PokemonService, PokemonServiceConfig};

// Construct `MetricsLayer`.
let metrics_layer = MetricsLayer::new();

let config = PokemonServiceConfig::builder().layer(metrics_layer).build();
let app = PokemonService::builder(config)
    .get_pokemon_species(handler)
    /* ... */
    .build()
    .unwrap();
let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app;
Note that requests pass through this middleware immediately after routing succeeds and therefore will not be encountered if routing fails. This means that the MetricsLayer in the example above only observes requests for which routing has completed. This contrasts with middleware A, which all requests/responses pass through when entering/leaving the service.
C. Operation Specific HTTP Middleware
A "HTTP layer" can be applied to specific operations.
extern crate tower;
extern crate pokemon_service_server_sdk;
extern crate aws_smithy_http_server;
use tower::{util::service_fn, Layer};
use std::time::Duration;
use pokemon_service_server_sdk::{operation_shape::GetPokemonSpecies, input::*, output::*, error::*};
use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter};
use aws_smithy_http_server::routing::{Route, RoutingService};
use aws_smithy_http_server::{operation::OperationShapeExt, plugin::*, operation::*};

let handler = |req: GetPokemonSpeciesInput| async {
    Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!())
};

struct LoggingLayer;
impl LoggingLayer {
    pub fn new() -> Self {
        Self
    }
}

impl<S> Layer<S> for LoggingLayer {
    type Service = S;
    fn layer(&self, svc: S) -> Self::Service {
        svc
    }
}

use pokemon_service_server_sdk::{PokemonService, PokemonServiceConfig, scope};

scope! {
    /// Only log on `GetPokemonSpecies` and `GetStorage`
    struct LoggingScope {
        includes: [GetPokemonSpecies, GetStorage]
    }
}

// Construct `LoggingLayer`.
let logging_plugin = LayerPlugin(LoggingLayer::new());
let logging_plugin = Scoped::new::<LoggingScope>(logging_plugin);
let http_plugins = HttpPlugins::new().push(logging_plugin);

let config = PokemonServiceConfig::builder().http_plugin(http_plugins).build();
let app = PokemonService::builder(config)
    .get_pokemon_species(handler)
    /* ... */
    .build()
    .unwrap();
let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app;
This middleware transforms the operation's HTTP requests and responses.
D. Operation Specific Model Middleware
A "model layer" can be applied to specific operations.
#![allow(unused)] fn main() { extern crate tower; extern crate pokemon_service_server_sdk; extern crate aws_smithy_http_server; use tower::{util::service_fn, Layer}; use pokemon_service_server_sdk::{operation_shape::GetPokemonSpecies, input::*, output::*, error::*}; let handler = |req: GetPokemonSpeciesInput| async { Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!()) }; use aws_smithy_http_server::{operation::*, plugin::*}; use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter}; use aws_smithy_http_server::routing::{Route, RoutingService}; struct BufferLayer; impl BufferLayer { pub fn new(size: usize) -> Self { Self } } impl<S> Layer<S> for BufferLayer { type Service = S; fn layer(&self, svc: S) -> Self::Service { svc } } use pokemon_service_server_sdk::{PokemonService, PokemonServiceConfig, scope}; scope! { /// Only buffer on `GetPokemonSpecies` and `GetStorage` struct BufferScope { includes: [GetPokemonSpecies, GetStorage] } } // Construct `BufferLayer`. let buffer_plugin = LayerPlugin(BufferLayer::new(3)); let buffer_plugin = Scoped::new::<BufferScope>(buffer_plugin); let config = PokemonServiceConfig::builder().model_plugin(buffer_plugin).build(); let app = PokemonService::builder(config) .get_pokemon_species(handler) /* ... */ .build() .unwrap(); let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app; }
In contrast to position C, this middleware transforms the operation's modelled inputs and modelled outputs.
Plugin System
Suppose we want to apply a different Layer to every operation. In this case, position B (PokemonService::layer) will not suffice because it applies a single Layer to all routes, and while position C (Operation::layer) would work, it'd require the customer to construct the Layer by hand for every operation.
Consider the following middleware:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
extern crate tower;
use aws_smithy_http_server::shape_id::ShapeId;
use std::task::{Context, Poll};
use tower::Service;

/// A [`Service`] that adds a print log.
pub struct PrintService<S> {
    inner: S,
    operation_id: ShapeId,
    service_id: ShapeId,
}

impl<R, S> Service<R> for PrintService<S>
where
    S: Service<R>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: R) -> Self::Future {
        println!("Hi {} in {}", self.operation_id.name(), self.service_id.name());
        self.inner.call(req)
    }
}
}
The plugin system provides a way to construct and then apply Layers in positions C and D, using the protocol and operation shape as parameters.
An example of a PrintPlugin which prints the operation name:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::shape_id::ShapeId;
pub struct PrintService<S> { inner: S, operation_id: ShapeId, service_id: ShapeId }
use aws_smithy_http_server::{plugin::Plugin, operation::OperationShape, service::ServiceShape};

/// A [`Plugin`] for a service builder to add a [`PrintService`] over operations.
#[derive(Debug)]
pub struct PrintPlugin;

impl<Ser, Op, T> Plugin<Ser, Op, T> for PrintPlugin
where
    Ser: ServiceShape,
    Op: OperationShape,
{
    type Output = PrintService<T>;

    fn apply(&self, inner: T) -> Self::Output {
        PrintService {
            inner,
            operation_id: Op::ID,
            service_id: Ser::ID,
        }
    }
}
}
You can provide a custom method to add your plugin to a collection of HttpPlugins or ModelPlugins via an extension trait. For example, for HttpPlugins:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
pub struct PrintPlugin;
impl aws_smithy_http_server::plugin::HttpMarker for PrintPlugin { }
use aws_smithy_http_server::plugin::{HttpPlugins, PluginStack};

/// This provides a [`print`](PrintExt::print) method on [`HttpPlugins`].
pub trait PrintExt<ExistingPlugins> {
    /// Causes all operations to print the operation name when called.
    ///
    /// This works by applying the [`PrintPlugin`].
    fn print(self) -> HttpPlugins<PluginStack<PrintPlugin, ExistingPlugins>>;
}

impl<ExistingPlugins> PrintExt<ExistingPlugins> for HttpPlugins<ExistingPlugins> {
    fn print(self) -> HttpPlugins<PluginStack<PrintPlugin, ExistingPlugins>> {
        self.push(PrintPlugin)
    }
}
}
This allows for:
#![allow(unused)] fn main() { extern crate pokemon_service_server_sdk; extern crate aws_smithy_http_server; use aws_smithy_http_server::plugin::{PluginStack, Plugin}; struct PrintPlugin; impl<Ser, Op, T> Plugin<Ser, Op, T> for PrintPlugin { type Output = T; fn apply(&self, svc: T) -> Self::Output { svc }} impl aws_smithy_http_server::plugin::HttpMarker for PrintPlugin { } trait PrintExt<EP> { fn print(self) -> HttpPlugins<PluginStack<PrintPlugin, EP>>; } impl<EP> PrintExt<EP> for HttpPlugins<EP> { fn print(self) -> HttpPlugins<PluginStack<PrintPlugin, EP>> { self.push(PrintPlugin) }} use pokemon_service_server_sdk::{operation_shape::GetPokemonSpecies, input::*, output::*, error::*}; let handler = |req: GetPokemonSpeciesInput| async { Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!()) }; use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter}; use aws_smithy_http_server::routing::{Route, RoutingService}; use aws_smithy_http_server::plugin::{IdentityPlugin, HttpPlugins}; use pokemon_service_server_sdk::{PokemonService, PokemonServiceConfig}; let http_plugins = HttpPlugins::new() // [..other plugins..] // The custom method! .print(); let config = PokemonServiceConfig::builder().http_plugin(http_plugins).build(); let app /* : PokemonService<Route<B>> */ = PokemonService::builder(config) .get_pokemon_species(handler) /* ... */ .build() .unwrap(); let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app; }
The custom print method hides the details of the Plugin trait from the average consumer. They interact with the utility methods on HttpPlugins and enjoy the self-contained documentation.
Instrumentation
A Smithy Rust server uses the tracing crate to provide instrumentation. The customer is responsible for setting up a Subscriber in order to ingest and process events - Smithy Rust makes no prescription on the choice of Subscriber. Common choices might include:
- tracing_subscriber::fmt for printing to stdout.
- tracing-log to provide compatibility with the log crate.
Events are emitted and spans are opened by the aws-smithy-http-server, aws-smithy-http-server-python, and generated crates. The default target is always used (the tracing macros default to using the module path where the span or event originated as the target, but it may be overridden), and therefore spans and events can be filtered using the EnvFilter and/or Targets filters with crate and module paths.
For example,
RUST_LOG=aws_smithy_http_server=warn,aws_smithy_http_server_python=error
and
#![allow(unused)]
fn main() {
extern crate tracing_subscriber;
extern crate tracing;
use tracing_subscriber::filter;
use tracing::Level;

let filter = filter::Targets::new().with_target("aws_smithy_http_server", Level::DEBUG);
}
In general, Smithy Rust is conservative when using high-priority log levels:
- ERROR
  - Fatal errors, resulting in the termination of the service.
  - Requires immediate remediation.
- WARN
  - Non-fatal errors, resulting in incomplete operation.
  - Indicates service misconfiguration, transient errors, or future changes in behavior.
  - Requires inspection and remediation.
- INFO
  - Informative events, which occur inside normal operating limits.
  - Used for large state transitions, e.g. startup/shutdown.
- DEBUG
  - Informative and sparse events, which occur inside normal operating limits.
  - Used to debug coarse-grained progress of service.
- TRACE
  - Informative and frequent events, which occur inside normal operating limits.
  - Used to debug fine-grained progress of service.
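As an illustration of this guidance, a hypothetical handler (not part of the Pokémon service) might emit events at these levels using the tracing macros:

fn evaluate_database_health(healthy: bool) {
    // TRACE: fine-grained progress inside normal operating limits.
    tracing::trace!("polling database health");
    // DEBUG: coarse-grained progress inside normal operating limits.
    tracing::debug!(healthy, "database health evaluated");
    if !healthy {
        // WARN: non-fatal, but requires inspection and remediation.
        tracing::warn!("database unhealthy, serving degraded responses");
    }
}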
Spans over the Request/Response lifecycle
Smithy Rust is built on top of tower, which means that middleware can be used to encompass different periods of the lifecycle of the request and response and identify them with a span. An open-source example of such a middleware is TraceLayer, provided by the tower-http crate.
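For example, a sketch of applying such a layer in position A, assuming tower-http has been added as a dependency and app is the HTTP service returned by the service builder:

use tower::Layer;
use tower_http::trace::TraceLayer;

// `TraceLayer::new_for_http()` opens a span per request and emits events as responses complete.
let app = TraceLayer::new_for_http().layer(app);

Note that, unlike the built-in instrumentation described below, a general-purpose layer such as this is unaware of Smithy's sensitive trait, so apply it with the guidance in Interactions with Sensitivity in mind.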
Smithy provides an out-of-the-box middleware which:
- Opens a DEBUG level span, prior to request handling, including the operation name and request URI and headers.
- Emits a DEBUG level event, after request handling, including the response headers and status code.
This is enabled via the instrument method provided by the aws_smithy_http_server::instrumentation::InstrumentExt trait.
#![allow(unused)] fn main() { extern crate aws_smithy_http_server; extern crate pokemon_service_server_sdk; use pokemon_service_server_sdk::{operation_shape::GetPokemonSpecies, input::*, output::*, error::*}; let handler = |req: GetPokemonSpeciesInput| async { Result::<GetPokemonSpeciesOutput, GetPokemonSpeciesError>::Ok(todo!()) }; use aws_smithy_http_server::{ instrumentation::InstrumentExt, plugin::{IdentityPlugin, HttpPlugins} }; use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter}; use aws_smithy_http_server::routing::{Route, RoutingService}; use pokemon_service_server_sdk::{PokemonServiceConfig, PokemonService}; let http_plugins = HttpPlugins::new().instrument(); let config = PokemonServiceConfig::builder().http_plugin(http_plugins).build(); let app = PokemonService::builder(config) .get_pokemon_species(handler) /* ... */ .build() .unwrap(); let app: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = app; }
Example
The Pokémon service example, located at /examples/pokemon-service, sets up a tracing Subscriber as follows:
#![allow(unused)]
fn main() {
extern crate tracing_subscriber;
use tracing_subscriber::{prelude::*, EnvFilter};

/// Setup `tracing::subscriber` to read the log level from RUST_LOG environment variable.
pub fn setup_tracing() {
    let format = tracing_subscriber::fmt::layer().pretty();
    let filter = EnvFilter::try_from_default_env()
        .or_else(|_| EnvFilter::try_new("info"))
        .unwrap();
    tracing_subscriber::registry().with(format).with(filter).init();
}
}
Running the Pokémon service example using RUST_LOG=aws_smithy_http_server=debug,pokemon_service=debug cargo r, and then using cargo t to run integration tests against the server, yields the following logs:
2022-09-27T09:13:35.372517Z DEBUG aws_smithy_http_server::instrumentation::service: response, headers: {"content-type": "application/json", "content-length": "17"}, status_code: 200 OK
at /smithy-rs/rust-runtime/aws-smithy-http-server/src/logging/service.rs:47
in aws_smithy_http_server::instrumentation::service::request with operation: get_server_statistics, method: GET, uri: /stats, headers: {"host": "localhost:13734"}
2022-09-27T09:13:35.374104Z DEBUG pokemon_service: attempting to authenticate storage user
at pokemon-service/src/lib.rs:184
in aws_smithy_http_server::instrumentation::service::request with operation: get_storage, method: GET, uri: /pokedex/{redacted}, headers: {"passcode": "{redacted}", "host": "localhost:13734"}
2022-09-27T09:13:35.374152Z DEBUG pokemon_service: authentication failed
at pokemon-service/src/lib.rs:188
in aws_smithy_http_server::instrumentation::service::request with operation: get_storage, method: GET, uri: /pokedex/{redacted}, headers: {"passcode": "{redacted}", "host": "localhost:13734"}
2022-09-27T09:13:35.374230Z DEBUG aws_smithy_http_server::instrumentation::service: response, headers: {"content-type": "application/json", "x-amzn-errortype": "NotAuthorized", "content-length": "2"}, status_code: 401 Unauthorized
at /smithy-rs/rust-runtime/aws-smithy-http-server/src/logging/service.rs:47
in aws_smithy_http_server::instrumentation::service::request with operation: get_storage, method: GET, uri: /pokedex/{redacted}, headers: {"passcode": "{redacted}", "host": "localhost:13734"}
Interactions with Sensitivity
Instrumentation interacts with Smithy's sensitive trait.
Sensitive data MUST NOT be exposed in things like exception messages or log output. Application of this trait SHOULD NOT affect wire logging (i.e., logging of all data transmitted to and from servers or clients).
For this reason, the Smithy runtime will never use tracing to emit events or open spans that include any sensitive data. This means that the customer can ingest all logs from aws-smithy-http-server and aws-smithy-http-server-* without fear of violating the sensitive trait.
The Smithy runtime will not, and cannot, prevent the customer from violating the sensitive trait within the operation handlers and custom middleware. It is the responsibility of the customer not to violate the sensitive contract of their own model; care must be taken.
Smithy shapes can be sensitive while being coupled to the HTTP request/responses via the HTTP binding traits. This poses a risk when ingesting events which naively capture request/response information. The instrumentation middleware provided by Smithy Rust respects the sensitive trait and will replace sensitive data in its spans and events with {redacted}. This feature can be seen in the Example above. For debugging purposes these redactions can be prevented using the aws-smithy-http-server feature flag, unredacted-logging.
Some examples of inadvertently leaking sensitive information:
- Ingesting tracing events and spans from third-party crates which do not respect sensitivity.
  - A concrete example of this would be enabling events from hyper or tokio.
- Applying middleware which ingests events including HTTP payloads or any other part of the HTTP request/response which can be bound.
Accessing Un-modelled Data
For every Smithy Operation an input, output, and optional error are specified. This in turn constrains the function signature of the handler provided to the service builder - the input to the handler must be the input specified by the operation etc.
But what if we, the customer, want to access data in the handler which is not modelled by our Smithy model? Smithy Rust provides an escape hatch in the form of the FromParts trait. In axum these are referred to as "extractors".
/// Provides a protocol aware extraction from a [`Request`]. This borrows the
/// [`Parts`], in contrast to [`FromRequest`].
pub trait FromParts<Protocol>: Sized {
/// The type of the failures yielded extraction attempts.
type Rejection: IntoResponse<Protocol>;
/// Extracts `self` from a [`Parts`] synchronously.
fn from_parts(parts: &mut Parts) -> Result<Self, Self::Rejection>;
}
Here Parts is the struct containing all items in a http::Request except for the HTTP body. A prolific example of a FromParts implementation is Extension<T>:
/// Generic extension type stored in and extracted from [request extensions].
///
/// This is commonly used to share state across handlers.
///
/// If the extension is missing it will reject the request with a `500 Internal
/// Server Error` response.
///
/// [request extensions]: https://docs.rs/http/latest/http/struct.Extensions.html
#[derive(Debug, Clone)]
pub struct Extension<T>(pub T);
/// The extension has not been added to the [`Request`](http::Request) or has been previously removed.
#[derive(Debug, Error)]
#[error("the `Extension` is not present in the `http::Request`")]
pub struct MissingExtension;
impl<Protocol> IntoResponse<Protocol> for MissingExtension {
fn into_response(self) -> http::Response<BoxBody> {
let mut response = http::Response::new(empty());
*response.status_mut() = StatusCode::INTERNAL_SERVER_ERROR;
response
}
}
impl<Protocol, T> FromParts<Protocol> for Extension<T>
where
T: Send + Sync + 'static,
{
type Rejection = MissingExtension;
fn from_parts(parts: &mut http::request::Parts) -> Result<Self, Self::Rejection> {
parts.extensions.remove::<T>().map(Extension).ok_or(MissingExtension)
}
}
This allows the service builder to accept the following handler
async fn handler(input: ModelInput, extension: Extension<SomeStruct>) -> ModelOutput {
/* ... */
}
where ModelInput and ModelOutput are specified by the Smithy Operation and SomeStruct is a struct which has been inserted, by middleware, into the http::Request::extensions.
Up to 32 structures implementing FromParts can be provided to the handler with the constraint that they must be provided after the ModelInput:
async fn handler(input: ModelInput, ext1: Extension<SomeStruct1>, ext2: Extension<SomeStruct2>, other: Other /* : FromParts */, /* ... */) -> ModelOutput {
/* ... */
}
Note that the parts.extensions.remove::<T>() in Extension::from_parts will cause multiple Extension<SomeStruct> arguments in the handler to fail. The first extraction failure to occur is serialized via the IntoResponse trait (notice type Rejection: IntoResponse<Protocol>) and returned.
The FromParts trait is public, so customers have the ability to specify their own implementations:
struct CustomerDefined {
/* ... */
}
impl<P> FromParts<P> for CustomerDefined {
type Rejection = /* ... */;
fn from_parts(parts: &mut Parts) -> Result<Self, Self::Rejection> {
// Construct `CustomerDefined` using the request headers.
let header_value = parts.headers.get("header-name").ok_or(/* ... */)?;
Ok(CustomerDefined { /* ... */ })
}
}
async fn handler(input: ModelInput, arg: CustomerDefined) -> ModelOutput {
/* ... */
}
The Anatomy of a Service
What is Smithy? At a high level, it's a grammar for specifying services while leaving the business logic undefined. A Smithy Service specifies a collection of function signatures in the form of Operations, whose purpose is to encapsulate business logic. A Smithy implementation should, for each Smithy Service, provide a builder which accepts functions conforming to said signatures and returns a service subject to the semantics specified by the model.
This survey is not concerned with the actual Kotlin implementation of the code generator; instead, it focuses on the structure of the generated Rust code and how it relates to the Smithy model. The intended audience is new contributors and users interested in internal details.
During the survey we will use the pokemon.smithy model as a reference:
/// A Pokémon species forms the basis for at least one Pokémon.
@title("Pokémon Species")
resource PokemonSpecies {
identifiers: {
name: String
},
read: GetPokemonSpecies,
}
/// A users current Pokémon storage.
resource Storage {
identifiers: {
user: String
},
read: GetStorage,
}
/// The Pokémon Service allows you to retrieve information about Pokémon species.
@title("Pokémon Service")
@restJson1
service PokemonService {
version: "2021-12-01",
resources: [PokemonSpecies, Storage],
operations: [
GetServerStatistics,
DoNothing,
CapturePokemon,
CheckHealth
],
}
Smithy Rust will use this model to produce the following API:
#![allow(unused)] fn main() { extern crate pokemon_service_server_sdk; extern crate aws_smithy_http_server; use aws_smithy_http_server::protocol::rest_json_1::{RestJson1, router::RestRouter}; use aws_smithy_http_server::routing::{Route, RoutingService}; use pokemon_service_server_sdk::{input::*, output::*, error::*, operation_shape::*, PokemonServiceConfig, PokemonService}; // A handler for the `GetPokemonSpecies` operation (the `PokemonSpecies` resource). async fn get_pokemon_species(input: GetPokemonSpeciesInput) -> Result<GetPokemonSpeciesOutput, GetPokemonSpeciesError> { todo!() } let config = PokemonServiceConfig::builder().build(); // Use the service builder to create `PokemonService`. let pokemon_service = PokemonService::builder(config) // Pass the handler directly to the service builder... .get_pokemon_species(get_pokemon_species) /* other operation setters */ .build() .expect("failed to create an instance of the Pokémon service"); let pokemon_service: PokemonService<RoutingService<RestRouter<Route>, RestJson1>> = pokemon_service; }
Operations
A Smithy Operation specifies the input, output, and possible errors of an API operation. One might characterize a Smithy Operation as syntax for specifying a function type.
We represent this in Rust using the OperationShape trait:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::shape_id::ShapeId;

pub trait OperationShape {
    /// The name of the operation.
    const ID: ShapeId;

    /// The operation input.
    type Input;
    /// The operation output.
    type Output;
    /// The operation error. [`Infallible`](std::convert::Infallible) in the case where no error
    /// exists.
    type Error;
}

use aws_smithy_http_server::operation::OperationShape as OpS;
impl<T: OpS> OperationShape for T {
    const ID: ShapeId = <T as OpS>::ID;
    type Input = <T as OpS>::Input;
    type Output = <T as OpS>::Output;
    type Error = <T as OpS>::Error;
}
}
For each Smithy Operation shape,
/// Retrieve information about a Pokémon species.
@readonly
@http(uri: "/pokemon-species/{name}", method: "GET")
operation GetPokemonSpecies {
input: GetPokemonSpeciesInput,
output: GetPokemonSpeciesOutput,
errors: [ResourceNotFoundException],
}
the following implementation is generated
#![allow(unused)]
fn main() {
extern crate pokemon_service_server_sdk;
extern crate aws_smithy_http_server;
use aws_smithy_http_server::{operation::OperationShape, shape_id::ShapeId};
use pokemon_service_server_sdk::{input::*, output::*, error::*};

/// Retrieve information about a Pokémon species.
pub struct GetPokemonSpecies;

impl OperationShape for GetPokemonSpecies {
    const ID: ShapeId = ShapeId::new("com.aws.example#GetPokemonSpecies", "com.aws.example", "GetPokemonSpecies");

    type Input = GetPokemonSpeciesInput;
    type Output = GetPokemonSpeciesOutput;
    type Error = GetPokemonSpeciesError;
}
}
where GetPokemonSpeciesInput and GetPokemonSpeciesOutput are both generated from the Smithy structures and GetPokemonSpeciesError is an enum generated from the errors: [ResourceNotFoundException].
Note that the GetPokemonSpecies marker structure is a zero-sized type (ZST), and therefore does not exist at runtime - it is a way to attach operation-specific data on an entity within the type system.
The following nomenclature will aid us in our survey. We describe a tower::Service as a "model service" if its request and response are Smithy structures, as defined by the OperationShape trait - the GetPokemonSpeciesInput, GetPokemonSpeciesOutput, and GetPokemonSpeciesError described above. Similarly, we describe a tower::Service as a "HTTP service" if its request and response are http structures - http::Request and http::Response.
The constructors exist on the marker ZSTs as an extension trait to OperationShape, namely OperationShapeExt:
#![allow(unused)] fn main() { extern crate aws_smithy_http_server; use aws_smithy_http_server::operation::*; /// An extension trait over [`OperationShape`]. pub trait OperationShapeExt: OperationShape { /// Creates a new [`Service`] for well-formed [`Handler`]s. fn from_handler<H, Exts>(handler: H) -> IntoService<Self, H> where H: Handler<Self, Exts>, Self: Sized; /// Creates a new [`Service`] for well-formed [`Service`](tower::Service)s. fn from_service<S, Exts>(svc: S) -> Normalize<Self, S> where S: OperationService<Self, Exts>, Self: Sized; } use aws_smithy_http_server::operation::OperationShapeExt as OpS; impl<T: OpS> OperationShapeExt for T { fn from_handler<H, Exts>(handler: H) -> IntoService<Self, H> where H: Handler<Self, Exts>, Self: Sized { <T as OpS>::from_handler(handler) } fn from_service<S, Exts>(svc: S) -> Normalize<Self, S> where S: OperationService<Self, Exts>, Self: Sized { <T as OpS>::from_service(svc) } } }
Observe that there are two constructors provided: from_handler which takes a H: Handler and from_service which takes a S: OperationService. In both cases Self is passed as a parameter to the traits - this constrains handler: H and svc: S to the signature given by the implementation of OperationShape on Self.
The Handler and OperationService both serve a similar purpose - they provide a common interface for converting to a model service S.
- The Handler<GetPokemonSpecies> trait covers all async functions taking GetPokemonSpeciesInput and asynchronously returning a Result<GetPokemonSpeciesOutput, GetPokemonSpeciesError>.
- The OperationService<GetPokemonSpecies> trait covers all tower::Services with request GetPokemonSpeciesInput, response GetPokemonSpeciesOutput and error GetPokemonSpeciesError.
The from_handler constructor is used in the following way:
#![allow(unused)]
fn main() {
extern crate pokemon_service_server_sdk;
extern crate aws_smithy_http_server;
use pokemon_service_server_sdk::{
    input::GetPokemonSpeciesInput, output::GetPokemonSpeciesOutput, error::GetPokemonSpeciesError,
    operation_shape::GetPokemonSpecies
};
use aws_smithy_http_server::operation::OperationShapeExt;

async fn get_pokemon_service(input: GetPokemonSpeciesInput) -> Result<GetPokemonSpeciesOutput, GetPokemonSpeciesError> {
    todo!()
}

let operation = GetPokemonSpecies::from_handler(get_pokemon_service);
}
Alternatively, the from_service constructor:
#![allow(unused)]
fn main() {
extern crate pokemon_service_server_sdk;
extern crate aws_smithy_http_server;
extern crate tower;
use pokemon_service_server_sdk::{
    input::GetPokemonSpeciesInput, output::GetPokemonSpeciesOutput, error::GetPokemonSpeciesError,
    operation_shape::GetPokemonSpecies
};
use aws_smithy_http_server::operation::OperationShapeExt;
use std::task::{Context, Poll};
use tower::Service;

struct Svc {
    /* ... */
}

impl Service<GetPokemonSpeciesInput> for Svc {
    type Response = GetPokemonSpeciesOutput;
    type Error = GetPokemonSpeciesError;
    type Future = /* Future<Output = Result<Self::Response, Self::Error>> */ std::future::Ready<Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, ctx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        todo!()
    }

    fn call(&mut self, input: GetPokemonSpeciesInput) -> Self::Future {
        todo!()
    }
}

let svc: Svc = Svc { /* ... */ };
let operation = GetPokemonSpecies::from_service(svc);
}
To summarize, a model service can be constructed from a Handler or an OperationService, subject to the constraints of an OperationShape. More detailed information on these conversions is provided in the Handler and OperationService section of the Rust docs.
Serialization and Deserialization
A Smithy protocol specifies the serialization/deserialization scheme - how a HTTP request is transformed into a modelled input, and a modelled output into a HTTP response. This is formalized using the FromRequest and IntoResponse traits:
#![allow(unused)] fn main() { extern crate aws_smithy_http_server; extern crate http; use aws_smithy_http_server::body::BoxBody; use std::future::Future; /// Provides a protocol aware extraction from a [`Request`]. This consumes the /// [`Request`], in contrast to [`FromParts`]. pub trait FromRequest<Protocol, B>: Sized { type Rejection: IntoResponse<Protocol>; type Future: Future<Output = Result<Self, Self::Rejection>>; /// Extracts `self` from a [`Request`] asynchronously. fn from_request(request: http::Request<B>) -> Self::Future; } /// A protocol aware function taking `self` to [`http::Response`]. pub trait IntoResponse<Protocol> { /// Performs a conversion into a [`http::Response`]. fn into_response(self) -> http::Response<BoxBody>; } use aws_smithy_http_server::request::FromRequest as FR; impl<P, B, T: FR<P, B>> FromRequest<P, B> for T { type Rejection = <T as FR<P, B>>::Rejection; type Future = <T as FR<P, B>>::Future; fn from_request(request: http::Request<B>) -> Self::Future { <T as FR<P, B>>::from_request(request) } } use aws_smithy_http_server::response::IntoResponse as IR; impl<P, T: IR<P>> IntoResponse<P> for T { fn into_response(self) -> http::Response<BoxBody> { <T as IR<P>>::into_response(self) } } }
Note that both traits are parameterized by Protocol. These protocols exist as ZST marker structs:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::protocol::{
    aws_json_10::AwsJson1_0 as _, aws_json_11::AwsJson1_1 as _, rest_json_1::RestJson1 as _,
    rest_xml::RestXml as _,
};

/// [AWS REST JSON 1.0 Protocol](https://awslabs.github.io/smithy/2.0/aws/protocols/aws-restjson1-protocol.html).
pub struct RestJson1;

/// [AWS REST XML Protocol](https://awslabs.github.io/smithy/2.0/aws/protocols/aws-restxml-protocol.html).
pub struct RestXml;

/// [AWS JSON 1.0 Protocol](https://awslabs.github.io/smithy/2.0/aws/protocols/aws-json-1_0-protocol.html).
pub struct AwsJson1_0;

/// [AWS JSON 1.1 Protocol](https://awslabs.github.io/smithy/2.0/aws/protocols/aws-json-1_1-protocol.html).
pub struct AwsJson1_1;
}
Upgrading a Model Service
We can "upgrade" a model service to a HTTP service using FromRequest
and IntoResponse
described in the prior section:
stateDiagram-v2
    direction LR
    HttpService: HTTP Service
    [*] --> from_request: HTTP Request
    state HttpService {
        direction LR
        ModelService: Model Service
        from_request --> ModelService: Model Input
        ModelService --> into_response: Model Output
    }
    into_response --> [*]: HTTP Response
This is formalized by the Upgrade<Protocol, Op, S> HTTP service. The tower::Service implementation is approximately:
impl<P, Op, S, B> Service<http::Request<B>> for Upgrade<P, Op, S>
where
Op: OperationShape,
Op::Input: FromRequest<P, B>,
S: Service<Op::Input>,
S::Response: IntoResponse<P>,
S::Error: IntoResponse<P>,
{
async fn call(&mut self, request: http::Request) -> http::Response {
let model_request = match <Op::Input as FromRequest<P, B>>::from_request(request).await {
Ok(ok) => ok,
Err(err) => return err.into_response()
};
let model_response = self.model_service.call(model_request).await;
model_response.into_response()
}
}
When we use GetPokemonSpecies::from_handler or GetPokemonSpecies::from_service, the model service produced, S, will meet the constraints above.
There is an associated Plugin, UpgradePlugin, which constructs Upgrade from a service.
The upgrade procedure is finalized by the application of the Layer L, referenced in Operation<S, L>. In this way the entire upgrade procedure takes an Operation<S, L> and returns a HTTP service.
stateDiagram-v2
    direction LR
    [*] --> UpgradePlugin: HTTP Request
    state HttpPlugin {
        state UpgradePlugin {
            direction LR
            [*] --> S: Model Input
            S --> [*] : Model Output
            state ModelPlugin {
                S
            }
        }
    }
    UpgradePlugin --> [*]: HTTP Response
Note that the S is specified by logic written, in Rust, by the customer, whereas UpgradePlugin is specified entirely by the Smithy model via the protocol, HTTP bindings, etc.
Routers
Different protocols supported by Smithy enjoy different routing mechanisms, for example, AWS JSON 1.0 uses the X-Amz-Target header to select an operation, whereas AWS REST XML uses the HTTP label trait.
Despite their differences, all routing mechanisms satisfy a common interface. This is formalized using the Router trait:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
extern crate http;

/// An interface for retrieving an inner [`Service`] given a [`http::Request`].
pub trait Router<B> {
    type Service;
    type Error;

    /// Matches a [`http::Request`] to a target [`Service`].
    fn match_route(&self, request: &http::Request<B>) -> Result<Self::Service, Self::Error>;
}
}
which provides the ability to determine an inner HTTP service from a collection using a &http::Request.
Types which implement the Router trait are converted to a HTTP service via the RoutingService struct:
/// A [`Service`] using a [`Router`] `R` to redirect messages to specific routes.
///
/// The `Protocol` parameter is used to determine the serialization of errors.
pub struct RoutingService<R, Protocol> {
router: R,
_protocol: PhantomData<Protocol>,
}
impl<R, P, B> Service<http::Request<B>> for RoutingService<R, P>
where
R: Router<B>,
R::Service: Service<http::Request, Response = http::Response>,
R::Error: IntoResponse<P> + Error,
{
type Response = http::Response;
type Error = /* implementation detail */;
async fn call(&mut self, req: http::Request<B>) -> Result<Self::Response, Self::Error> {
match self.router.match_route(&req) {
// Successfully routed, use the routes `Service::call`.
Ok(ok) => ok.oneshot(req).await,
// Failed to route, use the `R::Error`s `IntoResponse<P>`.
Err(error) => {
debug!(%error, "failed to route");
Err(Box::new(error.into_response()))
}
}
}
}
The RoutingService is the final piece necessary to form a functioning composition - it is used to aggregate together the HTTP services, created via the upgrade procedure, into a single HTTP service which can be presented to the customer.
stateDiagram
    state in <<fork>>
    direction LR
    [*] --> in
    state RouterService {
        direction LR
        in --> ServiceA
        in --> ServiceB
        in --> ServiceC
    }
    ServiceA --> [*]
    ServiceB --> [*]
    ServiceC --> [*]
Plugins
A Plugin is a tower::Layer with two extra type parameters, Service and Operation, corresponding to the Smithy Service and Smithy Operation. This allows the middleware to be parameterized by them and to change behavior depending on the context in which it's applied.
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;

pub trait Plugin<Service, Operation, T> {
    type Output;

    fn apply(&self, input: T) -> Self::Output;
}

use aws_smithy_http_server::plugin::Plugin as Pl;
impl<Ser, Op, T, U: Pl<Ser, Op, T>> Plugin<Ser, Op, T> for U {
    type Output = <U as Pl<Ser, Op, T>>::Output;
    fn apply(&self, input: T) -> Self::Output {
        <U as Pl<Ser, Op, T>>::apply(self, input)
    }
}
}
An example Plugin implementation can be found in /examples/pokemon-service/src/plugin.rs.
Plugins can be applied in two places:
- HTTP plugins, which are applied pre-deserialization/post-serialization, acting on HTTP requests/responses.
- Model plugins, which are applied post-deserialization/pre-serialization, acting on model inputs/outputs/errors.
stateDiagram-v2
    direction LR
    [*] --> S: HTTP Request
    state HttpPlugin {
        state UpgradePlugin {
            state ModelPlugin {
                S
            }
        }
    }
    S --> [*]: HTTP Response
The service builder API requires plugins to be specified upfront - they must be registered in the config object, which is passed as an argument to builder. Plugins cannot be modified afterwards.
You might find yourself wanting to apply multiple plugins to your service. This can be accommodated via HttpPlugins and ModelPlugins.
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::plugin::HttpPlugins;
use aws_smithy_http_server::plugin::IdentityPlugin as LoggingPlugin;
use aws_smithy_http_server::plugin::IdentityPlugin as MetricsPlugin;

let http_plugins = HttpPlugins::new().push(LoggingPlugin).push(MetricsPlugin);
}
The plugins' runtime logic is executed in registration order. In the example above, LoggingPlugin would run first, while MetricsPlugin is executed last.
If you are vending a plugin, you can leverage HttpPlugins or ModelPlugins as an extension point: you can add custom methods to them using an extension trait. For example:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::plugin::{HttpPlugins, PluginStack};
use aws_smithy_http_server::plugin::IdentityPlugin as LoggingPlugin;
use aws_smithy_http_server::plugin::IdentityPlugin as AuthPlugin;

pub trait AuthPluginExt<CurrentPlugins> {
    fn with_auth(self) -> HttpPlugins<PluginStack<AuthPlugin, CurrentPlugins>>;
}

impl<CurrentPlugins> AuthPluginExt<CurrentPlugins> for HttpPlugins<CurrentPlugins> {
    fn with_auth(self) -> HttpPlugins<PluginStack<AuthPlugin, CurrentPlugins>> {
        self.push(AuthPlugin)
    }
}

let http_plugins = HttpPlugins::new()
    .push(LoggingPlugin)
    // Our custom method!
    .with_auth();
}
Builders
The service builder is the primary public API, generated for every Smithy Service. At a high level, the service builder takes as input a function for each Smithy Operation and returns a single HTTP service. The signature of each function, known as a handler, must match the constraints of the corresponding Smithy model.
You can create an instance of a service builder by calling builder on the corresponding service struct.
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::routing::Route;

/// The service builder for [`PokemonService`].
///
/// Constructed via [`PokemonService::builder`].
pub struct PokemonServiceBuilder<Body, HttpPl, ModelPl> {
    capture_pokemon_operation: Option<Route<Body>>,
    empty_operation: Option<Route<Body>>,
    get_pokemon_species: Option<Route<Body>>,
    get_server_statistics: Option<Route<Body>>,
    get_storage: Option<Route<Body>>,
    health_check_operation: Option<Route<Body>>,
    http_plugin: HttpPl,
    model_plugin: ModelPl,
}
}
The builder has two setter methods for each Smithy Operation in the Smithy Service:
pub fn get_pokemon_species<HandlerType, HandlerExtractors, UpgradeExtractors>(self, handler: HandlerType) -> Self
where
HandlerType: Handler<GetPokemonSpecies, HandlerExtractors>,
ModelPl: Plugin<
PokemonService,
GetPokemonSpecies,
IntoService<GetPokemonSpecies, HandlerType>
>,
UpgradePlugin::<UpgradeExtractors>: Plugin<
PokemonService,
GetPokemonSpecies,
ModelPlugin::Output
>,
HttpPl: Plugin<
PokemonService,
GetPokemonSpecies,
UpgradePlugin::<UpgradeExtractors>::Output
>,
{
let svc = GetPokemonSpecies::from_handler(handler);
let svc = self.model_plugin.apply(svc);
let svc = UpgradePlugin::<UpgradeExtractors>::new()
.apply(svc);
let svc = self.http_plugin.apply(svc);
self.get_pokemon_species_custom(svc)
}
pub fn get_pokemon_species_service<S, ServiceExtractors, UpgradeExtractors>(self, service: S) -> Self
where
S: OperationService<GetPokemonSpecies, ServiceExtractors>,
ModelPl: Plugin<
PokemonService,
GetPokemonSpecies,
Normalize<GetPokemonSpecies, S>
>,
UpgradePlugin::<UpgradeExtractors>: Plugin<
PokemonService,
GetPokemonSpecies,
ModelPlugin::Output
>,
HttpPl: Plugin<
PokemonService,
GetPokemonSpecies,
UpgradePlugin::<UpgradeExtractors>::Output
>,
{
let svc = GetPokemonSpecies::from_service(service);
let svc = self.model_plugin.apply(svc);
let svc = UpgradePlugin::<UpgradeExtractors>::new().apply(svc);
let svc = self.http_plugin.apply(svc);
self.get_pokemon_species_custom(svc)
}
pub fn get_pokemon_species_custom<S>(mut self, svc: S) -> Self
where
S: Service<Request<Body>, Response = Response<BoxBody>, Error = Infallible>,
{
self.get_pokemon_species = Some(Route::new(svc));
self
}
Handlers and operations are upgraded to a Route as soon as they are registered against the service builder. You can think of Route as a boxing layer in disguise.
You can transform a builder instance into a complete service (PokemonService) using one of the following methods:
- build. The transformation fails if one or more operations do not have a registered handler;
- build_unchecked. The transformation never fails, but we return 500s for all operations that do not have a registered handler.
Both builder methods take care of:
- Pairing each handler with the routing information for the corresponding operation;
- Collecting all (routing_info, handler) pairs into a Router;
- Transforming the Router implementation into a HTTP service via RoutingService;
- Wrapping the RoutingService in a newtype given by the service name, PokemonService.
The final outcome, an instance of PokemonService, looks roughly like this:
#![allow(unused)]
fn main() {
extern crate aws_smithy_http_server;
use aws_smithy_http_server::{routing::RoutingService, protocol::rest_json_1::{router::RestRouter, RestJson1}};

/// The Pokémon Service allows you to retrieve information about Pokémon species.
#[derive(Clone)]
pub struct PokemonService<S> {
    router: RoutingService<RestRouter<S>, RestJson1>,
}
}
The following schematic summarizes the composition:
stateDiagram-v2 state in <<fork>> state "GetPokemonSpecies" as C1 state "GetStorage" as C2 state "DoNothing" as C3 state "..." as C4 direction LR [*] --> in : HTTP Request UpgradePlugin --> [*]: HTTP Response state PokemonService { state RoutingService { in --> UpgradePlugin: HTTP Request in --> C2: HTTP Request in --> C3: HTTP Request in --> C4: HTTP Request state C1 { state HttpPlugin { state UpgradePlugin { direction LR [*] --> S: Model Input S --> [*] : Model Output state ModelPlugin { S } } } } C2 C3 C4 } } C2 --> [*]: HTTP Response C3 --> [*]: HTTP Response C4 --> [*]: HTTP Response
Accessing Unmodelled Data
An additional omitted detail is that we provide an "escape hatch" allowing Handlers and OperationServices to accept data that isn't modelled. In addition to accepting Op::Input they can accept additional arguments which implement the FromParts trait:
#![allow(unused)] fn main() { extern crate aws_smithy_http_server; extern crate http; use http::request::Parts; use aws_smithy_http_server::response::IntoResponse; /// Provides a protocol aware extraction from a [`Request`]. This borrows the /// [`Parts`], in contrast to [`FromRequest`]. pub trait FromParts<Protocol>: Sized { /// The type of the failures yielded extraction attempts. type Rejection: IntoResponse<Protocol>; /// Extracts `self` from a [`Parts`] synchronously. fn from_parts(parts: &mut Parts) -> Result<Self, Self::Rejection>; } use aws_smithy_http_server::request::FromParts as FP; impl<P, T: FP<P>> FromParts<P> for T { type Rejection = <T as FP<P>>::Rejection; fn from_parts(parts: &mut Parts) -> Result<Self, Self::Rejection> { <T as FP<P>>::from_parts(parts) } } }
This differs from the FromRequest trait, introduced in Serialization and Deserialization, as it's synchronous and has non-consuming access to Parts, rather than the entire Request.
pub struct Parts {
pub method: Method,
pub uri: Uri,
pub version: Version,
pub headers: HeaderMap<HeaderValue>,
pub extensions: Extensions,
/* private fields */
}
This is commonly used to access types stored within Extensions which have been inserted by a middleware. An Extension struct implements FromParts to support this use case:
#![allow(unused)] fn main() { extern crate aws_smithy_http_server; extern crate http; extern crate thiserror; use aws_smithy_http_server::{body::BoxBody, request::FromParts, response::IntoResponse}; use http::status::StatusCode; use thiserror::Error; fn empty() -> BoxBody { todo!() } /// Generic extension type stored in and extracted from [request extensions]. /// /// This is commonly used to share state across handlers. /// /// If the extension is missing it will reject the request with a `500 Internal /// Server Error` response. /// /// [request extensions]: https://docs.rs/http/latest/http/struct.Extensions.html #[derive(Debug, Clone)] pub struct Extension<T>(pub T); impl<Protocol, T> FromParts<Protocol> for Extension<T> where T: Clone + Send + Sync + 'static, { type Rejection = MissingExtension; fn from_parts(parts: &mut http::request::Parts) -> Result<Self, Self::Rejection> { parts.extensions.remove::<T>().map(Extension).ok_or(MissingExtension) } } /// The extension has not been added to the [`Request`](http::Request) or has been previously removed. #[derive(Debug, Error)] #[error("the `Extension` is not present in the `http::Request`")] pub struct MissingExtension; impl<Protocol> IntoResponse<Protocol> for MissingExtension { fn into_response(self) -> http::Response<BoxBody> { let mut response = http::Response::new(empty()); *response.status_mut() = StatusCode::INTERNAL_SERVER_ERROR; response } } }
Generating Common Service Code
This document introduces the project and how code is being generated. It is written for developers who want to start contributing to smithy-rs.
Folder structure
The project is divided into:
- /codegen-core: contains common code used for both client and server code generation
- /codegen-client: client code generation. Depends on codegen-core
- /codegen-server: server code generation. Depends on codegen-core
- /aws: the AWS Rust SDK; it deals with AWS services specifically. Its folder structure mirrors the project's, with its own rust-runtime and codegen
- /rust-runtime: the generated client and server crates may depend on crates in this folder. Crates here are not code generated. The only crate that is not published is inlineable, which contains common functions used by other crates, copied into the source crate
Crates in /rust-runtime (informally referred to as "runtime crates") are added as dependencies of a generated crate only when used. For example, if a model uses event streams, the generated crates will depend on aws-smithy-eventstream.
Generating code
smithy-rs's entry points are Smithy code-generation plugins; it is not a command-line tool. One entry point is RustCodegenPlugin::execute, which inherits from SmithyBuildPlugin in smithy-build. Code generation is written in Kotlin and shares common, non-Rust-specific code with the smithy Java repository. The plugins hook into the Smithy Gradle plugin.
The comment at the beginning of execute
describes what a Decorator
is and uses the following terms:
- Context: contains the model being generated, projection and settings for the build
- Decorator: (also referred to as customizations) customizes how code is being generated. AWS services are required to sign with the SigV4 protocol, and a decorator adds Rust code to sign requests and responses. Decorators are applied in reverse order of being added and have a priority order.
- Writer: creates files and adds content; it supports templating, using
#
for substitutions - Location: the file where a symbol will be written to
The only task of a RustCodegenPlugin
is to construct a CodegenVisitor
and call its execute() method.
CodegenVisitor::execute()
is given a Context
and decorators, and calls a CodegenVisitor.
CodegenVisitor, RustCodegenPlugin, and wherever there are different implementations between client and server, such as in generating error types, have corresponding server versions.
Objects used throughout code generation are:
- Symbol: a node in a graph, an abstraction that represents the qualified name of a type; symbols reference and depend on other symbols, and have some common properties among languages (such as a namespace or a definition file). For Rust, we add properties to include more metadata about a symbol, such as its type
- RustType:
Option<T>
,HashMap
, ... along with their namespaces of origin such asstd::collections
- RuntimeType: the information to locate a type, plus the crates it depends on
- ShapeId: an immutable object that identifies a
Shape
Useful conversions are:
SymbolProvider.toSymbol(shape)
where SymbolProvider
constructs symbols for shapes. Some symbols require creating other symbols and types;
event streams and other streaming shapes are an example.
Symbol providers are all applied in order; if a shape uses a reserved keyword in Rust, its name is converted to a new name by a symbol provider,
and all other providers will work with this new symbol.
Model.expectShape(shapeId)
Each model has a shapeId
to shape
map; this method returns the shape associated with this shapeId.
Some objects implement a transform
method that only changes the input model, so that code generation will work on that new model. This is used, for example, to add a trait to a shape.
CodegenVisitor
is a ShapeVisitor
. For all services in the input model, shapes are converted into Rust;
here is how a service is constructed,
here a structure and so on.
Code generation flows from writer to files and entities are (mostly) generated only on a need-by-need basis.
The complete result is a Rust crate,
in which all dependencies are written into their modules and lib.rs
is generated (here).
execute() ends by running cargo fmt, to avoid having to correctly format Rust in Writers and to be sure the generated code follows the styling rules.
RFCs
What is an RFC?: An RFC is a document that proposes a change to smithy-rs
or the AWS Rust SDK. Request for Comments means a request for discussion and oversight about the future of the project from maintainers, contributors and users.
When should I write an RFC?: The AWS Rust SDK team proactively decides to write RFCs for major features or complex changes that we feel require extra scrutiny. However, the process can be used to request feedback on any change. Even changes that seem obvious and simple at first glance can be improved once a group of interested and experienced people have a chance to weigh in.
Who can submit an RFC?: An RFC can be submitted by anyone. In most cases, RFCs are authored by SDK maintainers, but everyone is welcome to submit RFCs.
Where do I start?: If you're ready to write and submit an RFC, please start a GitHub discussion with a summary of what you're trying to accomplish first. That way, the AWS Rust SDK team can ensure they have the bandwidth to review and shepherd the RFC through the whole process before you've expended effort in writing it. Once you've gotten the go-ahead, start with the RFC template.
Previously Submitted RFCs
- RFC-0001: AWS Configuration
- RFC-0002: Supporting multiple HTTP versions for SDKs that use Event Stream
- RFC-0003: API for Presigned URLs
- RFC-0004: Retry Behavior
- RFC-0005: Service Generation
- RFC-0006: Service-specific middleware
- RFC-0007: Split Release Process
- RFC-0008: Paginators
- RFC-0009: Example Consolidation
- RFC-0010: Waiters
- RFC-0011: Publishing Alpha to Crates.io
- RFC-0012: Independent Crate Versioning
- RFC-0013: Body Callback APIs
- RFC-0014: Fine-grained timeout configuration
- RFC-0015: How Cargo "features" should be used in the SDK and runtime crates
- RFC-0016: Supporting Flexible Checksums
- RFC-0017: Customizable Client Operations
- RFC-0018: Logging in the Presence of Sensitive Data
- RFC-0019: Event Streams Errors
- RFC-0020: Service Builder Improvements
- RFC-0021: Dependency Versions
- RFC-0022: Error Context and Compatibility
- RFC-0023: Evolving the new service builder API
- RFC-0024: RequestID
- RFC-0025: Constraint traits
- RFC-0026: Client Crate Organization
- RFC-0027: Endpoints 2.0
- RFC-0028: SDK Credential Cache Type Safety
- RFC-0029: Finding New Home for Credential Types
- RFC-0030: Serialization And Deserialization
- RFC-0031: Providing Fallback Credentials on Timeout
- RFC-0032: Better Constraint Violations
- RFC-0033: Improving access to request IDs in SDK clients
- RFC-0034: The Orchestrator Architecture
- RFC-0035: Sensible Defaults for Collection Values
- RFC-0036: Enabling HTTP crate upgrades in the future
- RFC-0037: The HTTP wrapper type
- RFC-0038: Retry Classifier Customization
- RFC-0039: Forward Compatible Errors
- RFC-0040: Behavior Versions
- RFC-0041: Improve client error ergonomics
- RFC-0042: File-per-change changelog
- RFC-0043: Identity Cache Partitions
AWS Configuration RFC
Status: Implemented. For an ordered list of proposed changes see: Proposed changes.
An AWS SDK loads configuration from multiple locations. Some of these locations can be loaded synchronously. Some are async. Others may actually use AWS services such as STS or SSO.
This document proposes an overhaul to the configuration design to facilitate three things:
- Future-proof: It should be easy to add additional sources of region and credentials, sync and async, from many sources, including code-generated AWS services.
- Ergonomic: There should be one obvious way to create an AWS service client. Customers should be able to easily customize the client to make common changes. It should encourage sharing of things that are expensive to create.
- Shareable: A config object should be usable to configure multiple AWS services.
Usage Guide
The following is an imagined usage guide if this RFC were implemented.
Getting Started
Using the SDK requires two crates:
- aws-sdk-<someservice>: The service you want to use (e.g. dynamodb, s3, sesv2)
- aws-config: AWS metaconfiguration. This crate contains all of the logic to load configuration for the SDK (regions, credentials, retry configuration, etc.)
Add the following to your Cargo.toml:
[dependencies]
aws-sdk-dynamo = "0.1"
aws-config = "0.5"
tokio = { version = "1", features = ["full"] }
Let's write a small example project to list tables:
use aws_sdk_dynamodb as dynamodb;
#[tokio::main]
async fn main() -> Result<(), dynamodb::Error> {
let config = aws_config::load_from_env().await;
let dynamodb = dynamodb::Client::new(&config);
let resp = dynamodb.list_tables().send().await?;
println!("my tables: {:?}", resp.tables.unwrap_or_default());
Ok(())
}
Tip: Every AWS service exports a top level Error type (e.g. aws_sdk_dynamodb::Error). Individual operations return specific error types that contain only the error variants returned by the operation. Because all the individual errors implement Into<dynamodb::Error>, you can use dynamodb::Error as the return type along with ?.
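For instance, a hypothetical helper following the tip (field names follow the imagined example above):

use aws_sdk_dynamodb as dynamodb;

// `?` converts the operation-specific error into the service-wide `dynamodb::Error`.
async fn print_tables(client: &dynamodb::Client) -> Result<(), dynamodb::Error> {
    let resp = client.list_tables().send().await?;
    println!("my tables: {:?}", resp.tables.unwrap_or_default());
    Ok(())
}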
Next, we'll explore some other ways to configure the SDK. Perhaps you want to override the region loaded from the environment with your region. In this case, we'll want more control over how we load config, using aws_config::from_env() directly:
use aws_sdk_dynamodb as dynamodb;
#[tokio::main]
async fn main() -> Result<(), dynamodb::Error> {
let region_provider = RegionProviderChain::default_provider().or_else("us-west-2");
let config = aws_config::from_env().region(region_provider).load().await;
let dynamodb = dynamodb::Client::new(&config);
let resp = dynamodb.list_tables().send().await?;
println!("my tables: {:?}", resp.tables.unwrap_or_default());
Ok(())
}
Sharing configuration between multiple services
The Config produced by aws-config can be used with any AWS service. If we wanted to read our DynamoDB tables aloud with Polly, we could create a Polly client as well. First, we'll need to add Polly to our Cargo.toml:
[dependencies]
aws-sdk-dynamo = "0.1"
aws-sdk-polly = "0.1"
aws-config = "0.5"
tokio = { version = "1", features = ["full"] }
Then, we can use the shared configuration to build both service clients. The region override will apply to both clients:
use aws_sdk_dynamodb as dynamodb;
use aws_sdk_polly as polly;
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> { // error type changed to `Box<dyn Error>` because we now have dynamo and polly errors
let config = aws_config::env_loader().with_region(Region::new("us-west-2")).load().await;
let dynamodb = dynamodb::Client::new(&config);
let polly = polly::Client::new(&config);
let resp = dynamodb.list_tables().send().await?;
let tables = resp.tables.unwrap_or_default();
let table_sentence = format!("my dynamo DB tables are: {}", tables.join(", "));
let audio = polly.synthesize_speech()
.output_format(OutputFormat::Mp3)
.text(table_sentence)
.voice_id(VoiceId::Joanna)
.send()
.await?;
// Get MP3 data from the response and save it
let mut blob = audio
.audio_stream
.collect()
.await
.expect("failed to read data");
let mut file = tokio::fs::File::create("tables.mp3")
.await
.expect("failed to create file");
file.write_all_buf(&mut blob)
.await
.expect("failed to write to file");
Ok(())
}
Specifying a custom credential provider
If you have your own source of credentials, you may opt-out of the standard credential provider chain.
To do this, implement the ProvideCredentials
trait.
NOTE: aws_types::Credentials already implements ProvideCredentials. If you want to use the SDK with static credentials, you're already done!
use aws_types::credentials::{ProvideCredentials, provide_credentials::future, Result};
struct MyCustomProvider;
impl MyCustomProvider {
pub async fn load_credentials(&self) -> Result {
todo!() // A regular async function
}
}
impl ProvideCredentials for MyCustomProvider {
fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a>
where
Self: 'a,
{
future::ProvideCredentials::new(self.load_credentials())
}
}
Hint: If your credential provider is not asynchronous, you can use ProvideCredentials::ready instead to save an allocation.
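A sketch of a synchronous provider using this constructor (the credential values are placeholders and the exact Credentials::new signature is assumed):

use aws_types::credentials::{ProvideCredentials, provide_credentials::future};
use aws_types::Credentials;

struct StaticProvider;

impl ProvideCredentials for StaticProvider {
    fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a>
    where
        Self: 'a,
    {
        // Nothing to await, so `ready` avoids allocating a boxed future.
        future::ProvideCredentials::ready(Ok(Credentials::new(
            "AKID", "SECRET", None, None, "static-provider",
        )))
    }
}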
After writing your custom provider, you'll use it when constructing the configuration:
#[tokio::main]
async fn main() {
let config = aws_config::from_env().credentials_provider(MyCustomProvider).load().await;
let dynamodb = dynamodb::Client::new(&config);
}
Proposed Design
Achieving this design consists of three major changes:
- Add a Config struct to aws-types. This contains a config, but with no logic to construct it. This represents what configuration SDKs need, but not how to load the information from the environment.
- Create the aws-config crate. aws-config contains the logic to load configuration from the environment. No generated service clients will depend on aws-config. This is critical to avoid circular dependencies and to allow aws-config to depend on other AWS services. aws-config contains individual providers as well as a pre-assembled default provider chain for region and credentials. It will also contain crate features to automatically bring in HTTPS and async-sleep implementations.
- Remove all "business logic" from aws-types. aws-types should be an interface-only crate that is extremely stable. The ProvideCredentials trait should move into aws-types. The region provider trait, which only exists to support region-chaining, will move out of aws-types into aws-config.
Services will continue to generate their own Config structs. These will continue to be customizable as they are today; however, they won't have any default resolvers built in. Each AWS service config will implement From<&aws_types::SharedConfig>. A convenience method to construct (new()) a fluent client directly from a shared config will also be generated.
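For example, usage of that generated code might look like the following sketch (the loader name and conversion target are assumptions based on the design above, not a final API):
// Hypothetical usage of the proposed generated conversions
let shared_config = aws_config::env_loader().load().await;

// Generated `From<&aws_types::SharedConfig>` implementation on the service config
let dynamo_config = dynamodb::Config::from(&shared_config);

// Generated convenience constructor on the fluent client
let client = dynamodb::Client::new(&shared_config);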
Shared Config Implementation
This RFC proposes adding region and credentials provider support to the shared config. A future RFC will propose integration with HTTP settings, HTTPS connectors, and async sleep.
struct Config {
// private fields
...
}
impl Config {
pub fn region(&self) -> Option<&Region> {
self.region.as_ref()
}
pub fn credentials_provider(&self) -> Option<SharedCredentialsProvider> {
self.credentials_provider.clone()
}
pub fn builder() -> Builder {
Builder::default()
}
}
The Builder
for Config
allows customers to provide individual overrides and handles the insertion of the default
chain for regions and credentials.
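As an illustration (a minimal sketch assuming the Builder described above, not a final API), overriding just the region while relying on the default credentials chain might look like:
// Sketch: build a shared config by hand with a region override
let config = Config::builder()
    .region(Region::new("us-east-1"))
    .build();
// Per the design above, the builder inserts the default credentials chain
// because no credentials provider override was supplied.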
Sleep + Connectors
Sleep and Connector are both runtime-dependent features. aws-config will define rt-tokio, rustls, and native-tls optional features. This centralizes the Tokio/Hyper dependency, eventually removing the need for each service to maintain its own Tokio/Hyper features.
Although not proposed in this RFC, shared config will eventually gain support for creating an HTTPS client from HTTP settings.
The .build() method on <service>::Config
Currently, the .build() method on a service config will fill in defaults. As part of this change, .build() called on a service config with missing properties will fill in "empty" defaults instead. If no credentials provider is given, a NoCredentials provider will be set, and Region will remain None.
Stability and Versioning
The introduction of Config to aws-types is not without risks. If a customer depends on a version of aws-config that uses an incompatible version of Config, they will get confusing compiler errors.
An example of a problematic set of dependent versions:
┌─────────────────┐ ┌───────────────┐
│ aws-types = 0.1 │ │aws-types= 0.2 │
└─────────────────┘ └───────────────┘
▲ ▲
│ │
│ │
│ │
┌─────────┴─────────────┐ ┌────────┴───────┐
│aws-sdk-dynamodb = 0.5 │ │aws-config = 0.6│
└───────────┬───────────┘ └───────┬────────┘
│ │
│ │
│ │
│ │
│ │
├─────────────────────┬────────┘
│ my-lambda-function │
└─────────────────────┘
To mitigate this risk, we will need to make aws-types
essentially permanently stable. Changes to aws-types
need to
be made with extreme care. This will ensure that two versions of aws-types
never end up in a customer's dependency
tree.
We will dramatically reduce the surface area of aws-types
to contain only interfaces.
Several breaking changes will be made as part of this, notably, the profile file parsing will be moved out of aws-types.
Finally, to mitigate this risk even further, services will pub use
items from aws-types
directly which means that
even if a dependency mismatch exists, it is still possible for customers to work around it.
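For instance (the exact re-exports are illustrative), a generated service crate could include:
// In a generated service crate such as aws-sdk-dynamodb (illustrative)
pub use aws_types::Credentials;
pub use aws_types::region::Region;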
Changes Checklist
- ProvideRegion becomes async using a newtype'd future.
- AsyncProvideCredentials is removed. ProvideCredentials becomes async using a newtype'd future.
- ProvideCredentials moved into aws-types.
- Credentials moved into aws-types.
- Create aws-config.
- Profile-file parsing moved into aws-config; region chain & region environment loaders moved to aws-config.
- os_shim_internal moved to ??? aws-smithy-types?
- Add Config to aws-types. Ensure that it's set up to add new members while remaining backwards compatible.
- Code generate From<&SharedConfig> for <everyservice>::Config
- Code generate <everyservice>::Client::new(&shared_config)
- Remove <everyservice>::from_env
Open Issues
- Connector construction needs to be a function of HTTP settings
- An AsyncSleep should be added to aws-types::Config
RFC: Supporting multiple HTTP versions for SDKs that use Event Stream
Status: Accepted
For a summarized list of proposed changes, see the Changes Checklist section.
Most AWS SDK operations use HTTP/1.1, but bi-directional streaming operations that use the Event Stream message framing format need to use HTTP/2 (h2).
Smithy models can also customize which HTTP versions are used in each individual protocol trait.
For example,
@restJson1
has attributes http
and eventStreamHttp
to list out the versions that should be used in a priority order.
There are two problems in play that this doc attempts to solve:
- Connector Creation: Customers need to be able to create connectors with the HTTP settings they desire, and these custom connectors must align with what the Smithy model requires.
- Connector Selection: The generated code must be able to select the connector that best matches the requirements from the Smithy model.
Terminology
Today, there are three layers of Client that are easy to confuse, so the following terms will be used:
- Connector: An implementor of Tower's Service trait that converts a request into a response. This is typically a thin wrapper around a Hyper client.
- Smithy Client: An aws_smithy_client::Client<C, M, R> struct that is responsible for gluing together the connector, middleware, and retry policy. This isn't intended to be used directly.
- Fluent Client: A code-generated Client<C, M, R> that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
- AWS Client: A specialized Fluent Client that uses a DynConnector, DefaultMiddleware, and Standard retry policy.
All of these are just called Client
in code today. This is something that could be clarified in a separate refactor.
How Clients Work Today
Fluent clients currently keep a handle to a single Smithy client, which is a wrapper
around the underlying connector. When constructing operation builders, this handle is Arc
cloned and
given to the new builder instances so that their send()
calls can initiate a request.
The generated fluent client code ends up looking like this:
struct Handle<C, M, R> {
client: aws_smithy_client::Client<C, M, R>,
conf: crate::Config,
}
pub struct Client<C, M, R = Standard> {
handle: Arc<Handle<C, M, R>>,
}
Functions are generated per operation on the fluent client to gain access to the individual operation builders. For example:
pub fn assume_role(&self) -> fluent_builders::AssumeRole<C, M, R> {
fluent_builders::AssumeRole::new(self.handle.clone())
}
The fluent operation builders ultimately implement send()
, which chooses the one and only Smithy client out
of the handle to make the request with:
pub struct AssumeRole<C, M, R> {
handle: std::sync::Arc<super::Handle<C, M, R>>,
inner: crate::input::assume_role_input::Builder,
}
impl<C, M, R> AssumeRole<C, M, R> where ...{
pub async fn send(self) -> Result<AssumeRoleOutput, SdkError<AssumeRoleError>> where ... {
// Setup code omitted ...
// Make the actual request
self.handle.client.call(op).await
}
}
Smithy clients are constructed from a connector, as shown:
let connector = Builder::new()
.https()
.middleware(...)
.build();
let client = Client::with_config(connector, Config::builder().build());
The https()
method on the Builder constructs the actual Hyper client, and is driven off Cargo features to
select the correct TLS implementation. For example:
#![allow(unused)] fn main() { #[cfg(feature = "rustls")] pub fn https() -> Https { let https = hyper_rustls::HttpsConnector::with_native_roots(); let client = hyper::Client::builder().build::<_, SdkBody>(https); // HyperAdapter is a Tower `Service` request -> response connector that just calls the Hyper client crate::hyper_impls::HyperAdapter::from(client) } }
Solving the Connector Creation Problem
Customers need to be able to provide HTTP settings, such as timeouts, for all connectors that the clients use.
These should come out of the SharedConfig
when it is used. Connector creation also needs to be customizable
so that alternate HTTP implementations can be used, or so that a fake implementation can be used for tests.
To accomplish this, SharedConfig
will have a make_connector
member. A customer would configure
it as such:
let config = some_shared_config_loader()
.with_http_settings(my_http_settings)
.with_make_connector(|reqs: &MakeConnectorRequirements| {
Some(MyCustomConnector::new(reqs))
})
.await;
The passed in MakeConnectorRequirements
will hold the customer-provided HttpSettings
as well
as any Smithy-modeled requirements, which will just be HttpVersion
for now. The MakeConnectorRequirements
struct will be marked non_exhaustive
so that new requirements can be added to it as the SDK evolves.
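A sketch of what this struct could look like (field names are illustrative, not final):
#[non_exhaustive]
#[derive(Debug)]
pub struct MakeConnectorRequirements {
    /// Customer-provided HTTP settings such as timeouts
    pub http_settings: HttpSettings,
    /// HTTP version required (or preferred) by the Smithy model for this operation
    pub http_version: HttpVersion,
}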
A default make_connector
implementation would be provided that creates a Hyper connector based on the
Cargo feature flags. This might look something like this:
#![allow(unused)] fn main() { #[cfg(feature = "rustls")] pub fn default_connector(reqs: &HttpRequirements) -> HyperAdapter { let https = hyper_rustls::HttpsConnector::with_native_roots(); let mut builder = hyper::Client::builder(); builder = configure_settings(builder, &reqs.http_settings); if let Http2 = &reqs.http_version { builder = builder.http2_only(true); } HyperAdapter::from(builder.build::<_, SdkBody>(https)) } }
For any given service, make_connector
could be called multiple times to create connectors
for all required HTTP versions and settings.
Note: make_connector returns an Option since an HTTP version may not be required, but rather preferred, according to a Smithy model. For operations that list out ["h2", "HTTP/1.1"] as the desired versions, a customer could choose to provide only an HTTP/1.1 connector, and the operation should still succeed.
Solving the Connector Selection Problem
Each service operation needs to be able to select a connector that meets its requirements best from the customer provided connectors. Initially, the only selection criteria will be the HTTP version, but later when per-operation HTTP settings are implemented, the connector will also need to be keyed off of those settings. Since connector creation is not a cheap process, connectors will need to be cached after they are created.
This caching is currently handled by the Handle
in the fluent client, which holds on to the
Smithy client. This cache needs to be adjusted to:
- Support multiple connectors, keyed off of the customer provided
HttpSettings
, and also off of the Smithy modeled requirements. - Be lazy initialized. Services that have a mix of Event Stream and non-streaming operations shouldn't create an HTTP/2 client if the customer doesn't intend to use the Event Stream operations that require it.
To accomplish this, the Handle
will hold a cache that is optimized for many reads and few writes:
#[derive(Debug, Hash, Eq, PartialEq)]
struct ConnectorKey {
http_settings: HttpSettings,
http_version: HttpVersion,
}
struct Handle<C, M, R> {
clients: RwLock<HashMap<HttpRequirements<'static>, aws_smithy_client::Client<C, M, R>>>,
conf: crate::Config,
}
pub struct Client<C, M, R = Standard> {
handle: Arc<Handle<C, M, R>>,
}
With how the generics are organized, the connector type will have to be the same between HTTP implementations,
but this should be fine since it is generally a thin wrapper around a separate HTTP implementor.
For cases where it is not, the custom connector type can host its own dyn Trait
solution.
The HttpRequirements
struct will hold HttpSettings
as copy-on-write so that it can be used
for cache lookup without having to clone HttpSettings
:
struct HttpRequirements<'a> {
http_settings: Cow<'a, HttpSettings>,
http_version: HttpVersion,
}
impl<'a> HttpRequirements<'a> {
// Needed for converting a borrowed HttpRequirements into an owned cache key for cache population
pub fn into_owned(self) -> HttpRequirements<'static> {
Self {
http_settings: Cow::Owned(self.http_settings.into_owned()),
http_version: self.http_version,
}
}
}
With the cache established, each operation needs to be aware of its requirements. The code generator will be
updated to store a prioritized list of HttpVersion
in the property bag in an input's make_operation()
method.
This prioritized list will come from the Smithy protocol trait's http
or eventStreamHttp
attribute, depending
on the operation. The fluent client will then pull this list out of the property bag so that it can determine which
connector to use. This indirection is necessary so that an operation still holds all information
needed to make a service call from the Smithy client directly.
Note: This may be extended in the future to be more than just HttpVersion
, for example, when per-operation
HTTP setting overrides are implemented. This doc is not attempting to solve that problem.
In the fluent client, this will look as follows:
impl<C, M, R> AssumeRole<C, M, R> where ... {
pub async fn send(self) -> Result<AssumeRoleOutput, SdkError<AssumeRoleError>> where ... {
let input = self.create_input()?;
let op = input.make_operation(&self.handle.conf)?;
// Grab the `make_connector` implementation
let make_connector = self.handle.conf.make_connector();
// Acquire the prioritized HttpVersion list
let http_versions = op.properties().get::<HttpVersionList>();
// Make the actual request (using default HttpSettings until modifying those is implemented)
let client = self.handle
.get_or_create_client(make_connector, &default_http_settings(), &http_versions)
.await?;
client.call(op).await
}
}
If an operation requires a specific protocol version and the make_connector implementation can't provide it, then the get_or_create_client() function will return SdkError::ConstructionFailure indicating the error.
Changes Checklist
- Create HttpVersion in aws-smithy-http with Http1_1 and Http2
- Refactor existing https() connector creation functions to take HttpVersion
- Add make_connector to SharedConfig, and wire up the https() functions as a default
- Create HttpRequirements in aws-smithy-http
- Implement the connector cache on Handle
- Implement function to calculate a minimum required set of HTTP versions from a Smithy model in the code generator
- Update the make_operation code gen to put an HttpVersionList into the operation property bag
- Update the fluent client send() function code gen to grab the HTTP version list and acquire the correct connector with it
- Add required defaulting for models that don't set the optional http and eventStreamHttp protocol trait attributes
RFC: API for Presigned URLs
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
Several AWS services allow for presigned requests in URL form, which is described well by S3's documentation on authenticating requests using query parameters.
This doc establishes the customer-facing API for creating these presigned URLs and how they will be implemented in a generic fashion in the SDK codegen.
Terminology
To differentiate between the clients that are present in the generated SDK today, the following terms will be used throughout this doc:
- Smithy Client: An aws_smithy_client::Client<C, M, R> struct that is responsible for gluing together the connector, middleware, and retry policy. This is not generated and lives in the aws-smithy-client crate.
- Fluent Client: A code-generated Client<C, M, R> that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
Presigned URL config
Today, presigned URLs take an expiration time that's not part of the service API. The SDK will make this configurable as a separate struct so that there's no chance of name collisions, and so that additional fields can be added in the future. Fields added later will require defaulting for backwards compatibility.
Customers should also be able to set a start time on the presigned URL so that they can generate URLs that become active in the future. An optional start_time option will be available and will default to SystemTime::now().
Construction of PresigningConfig can be done with a builder, but a PresigningConfig::expires_in convenience function will be provided to bypass the builder for the most frequent use-case.
#[non_exhaustive]
#[derive(Debug, Clone)]
pub struct PresigningConfig {
start_time: SystemTime,
expires_in: Duration,
}
#[non_exhaustive]
#[derive(Debug)]
pub struct Builder {
start_time: Option<SystemTime>,
expires_in: Option<Duration>,
}
impl Builder {
pub fn start_time(self, start_time: SystemTime) -> Self { ... }
pub fn set_start_time(&mut self, start_time: Option<SystemTime>) { ... }
pub fn expires_in(self, expires_in: Duration) -> Self { ... }
pub fn set_expires_in(&mut self, expires_in: Option<Duration>) { ... }
// Validates `expires_in` is no greater than one week
pub fn build(self) -> Result<PresigningConfig, Error> { ... }
}
impl PresigningConfig {
pub fn expires_in(expires_in: Duration) -> PresigningConfig {
Self::builder().expires_in(expires_in).build().unwrap()
}
pub fn builder() -> Builder { ... }
}
Construction of PresigningConfig
will validate that expires_in
is no greater than one week, as this
is the longest supported expiration time for SigV4. This validation will result in a panic.
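For illustration, configuring a future start time through the builder might look like this sketch:
use std::time::{Duration, SystemTime};

// Sketch: a presigned URL that becomes active in one hour and is valid for 30 minutes
let presigning_config = PresigningConfig::builder()
    .start_time(SystemTime::now() + Duration::from_secs(3600))
    .expires_in(Duration::from_secs(1800))
    .build()
    .expect("valid presigning config");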
It's not inconceivable that PresigningConfig will need additional service-specific parameters as customizations, so it will be code generated with each service rather than living in a shared location.
Fluent Presigned URL API
The generated fluent builders for operations that support presigning will have a presigned()
method
in addition to send()
that will return a presigned URL rather than sending the request. For S3's GetObject,
the usage of this will look as follows:
let config = aws_config::load_config_from_environment().await;
let client = s3::Client::new(&config);
let presigning_config = PresigningConfig::expires_in(Duration::from_secs(86400));
let presigned: PresignedRequest = client.get_object()
.bucket("example-bucket")
.key("example-object")
.presigned(presigning_config)
.await?;
This API requires a client, and for use-cases where no actual service calls need to be made, customers should be able to create presigned URLs without the overhead of an HTTP client. Once the HTTP Versions RFC is implemented, the underlying HTTP client won't be created until the first service call, so there will be no HTTP client overhead to this approach.
In a step away from the general pattern of keeping fluent client capabilities in line with Smithy client capabilities, creating presigned URLs directly from the Smithy client will not be supported. This is for two reasons:
- The Smithy client is not code generated, so adding a method to do presigning would apply to all operations, but not all operations can be presigned.
- Presigned URLs are not currently a Smithy concept (although this may change soon).
The result of calling presigned()
is a PresignedRequest
, which is a wrapper with delegating functions
around http::Request<()>
so that the request method and additional signing headers are also made available.
This is necessary since there are some presignable POST operations that require the signature to be in the
headers rather than the query.
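A sketch of that wrapper, assuming it simply delegates to the inner http::Request<()>:
#[derive(Debug)]
pub struct PresignedRequest(http::Request<()>);

impl PresignedRequest {
    /// The presigned URL itself
    pub fn uri(&self) -> &http::Uri {
        self.0.uri()
    }

    /// The HTTP method to use with this request
    pub fn method(&self) -> &http::Method {
        self.0.method()
    }

    /// Any additional headers that need to be sent along with the request
    pub fn headers(&self) -> &http::HeaderMap {
        self.0.headers()
    }
}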
Note: Presigning needs to be async
because the underlying credentials provider used to sign the
request may need to make service calls to acquire the credentials.
Input Presigned URL API
Even though generating a presigned URL through the fluent client doesn't necessitate an HTTP client, it will be clearer that this is the case by allowing the creation of presigned URLs directly from an input. This would look as follows:
let config = aws_config::load_config_from_environment().await;
let presigning_config = PresigningConfig::expires_in(Duration::from_secs(86400));
let presigned: PresignedRequest = GetObjectInput::builder()
.bucket("example-bucket")
.key("example-bucket")
.presigned(&config, presigning_config)
.await?;
Creating the URL through the input will exercise the same code path as creating it through the client, but it will be more apparent that the overhead of a client isn't present.
Behind the scenes
From an SDK's perspective, the following are required to make a presigned URL:
- Valid request input
- Endpoint
- Credentials to sign with
- Signing implementation
The AWS middleware provides everything except the request, and the request is provided as part of the fluent builder API. The generated code needs to be able to run the middleware to fully populate a request property bag, but not actually dispatch it. The expires_in value from the presigning config needs to be piped all the way through to the signer. Additionally, the SigV4 signing needs to be adjusted to do query param signing, which is slightly different than its header signing.
Today, request dispatch looks as follows:
- The customer creates a new fluent builder by calling client.operation_name(), fills in inputs, and then calls send().
- send():
  - Builds the final input struct, and then calls its make_operation() method with the stored config to create a Smithy Operation.
  - Calls the underlying Smithy client with the operation.
- The Smithy client constructs a Tower Service with AWS middleware and a dispatcher at the bottom, and then executes it.
  - The middleware acquires and adds required signing parameters (region, credentials, endpoint, etc.) to the request property bag.
  - The SigV4 signing middleware signs the request by adding HTTP headers to it.
  - The dispatcher makes the actual HTTP request and returns the response all the way back up the Tower.
Presigning will take advantage of a lot of these same steps, but will cut out the Operation and replace the dispatcher with a presigned URL generator:
- The customer creates a new fluent builder by calling client.operation_name(), fills in inputs, and then calls presigned().
- presigned():
  - Builds the final input struct, calls the make_operation() method with the stored config, and then extracts the request from the operation (discarding the rest).
  - Mutates the OperationSigningConfig in the property bag to:
    - Change the signature_type to HttpRequestQueryParams so that the signer runs the correct signing logic.
    - Set expires_in to the value given by the customer in the presigning config.
  - Constructs a Tower Service with AwsMiddleware layered in, and a PresignedUrlGeneratorLayer at the bottom.
  - Calls the Tower Service and returns its result.
- The AwsMiddleware will sign the request.
- The PresignedUrlGeneratorLayer directly returns the request since all of the work is done by the middleware.
It should be noted that the presigned()
function above is on the generated input struct, so implementing this for
the input API is identical to implementing it for the fluent client.
All the code for the new make_request()
is already in the existing make_operation()
and will just need to be split out.
Modeling Presigning
AWS models don't currently have any information about which operations can be presigned.
To work around this, the Rust SDK will create a synthetic trait to model presigning with, and
apply this trait to known presigned operations via customization. The code generator will
look for this synthetic trait when creating the fluent builders and inputs to know if a
presigned()
method should be added.
Avoiding name collision
If a presignable operation input has a member named presigned
, then there will be a name collision with
the function to generate a presigned URL. To mitigate this, RustReservedWords
will be updated
to rename the presigned
member to presigned_value
similar to how send
is renamed.
Changes Checklist
- Update aws-sigv4 to support query param signing
- Create PresignedOperationSyntheticTrait
- Customize models for known presigned operations
- Create PresigningConfig and its builder
- Implement PresignedUrlGeneratorLayer
- Create new AWS codegen decorator to:
  - Add new presigned() method to input code generator
  - Add new presigned() method to fluent client generator
- Update RustReservedWords to reserve presigned()
- Add integration test to S3
- Add integration test to Polly
- Add examples for using presigning for:
  - S3 GetObject and PutObject
  - Polly SynthesizeSpeech
RFC: Retry Behavior
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
It is not currently possible for users of the SDK to configure a client's maximum number of retry attempts. This RFC establishes a method for users to set the number of retries to attempt when calling a service and would allow users to disable retries entirely. This RFC would introduce breaking changes to the retry
module of the aws-smithy-client
crate.
Terminology
- Smithy Client: An aws_smithy_client::Client<C, M, R> struct that is responsible for gluing together the connector, middleware, and retry policy. This is not generated and lives in the aws-smithy-client crate.
- Fluent Client: A code-generated Client<C, M, R> that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
- AWS Client: A specialized Fluent Client that defaults to using a DynConnector, AwsMiddleware, and Standard retry policy.
- Shared Config: An aws_types::Config struct that is responsible for storing shared configuration data that is used across all services. This is not generated and lives in the aws-types crate.
- Service-specific Config: A code-generated Config that has methods for setting service-specific configuration. Each Config is defined in the config module of its parent service. For example, the S3-specific config struct is usable from aws_sdk_s3::config::Config and re-exported as aws_sdk_s3::Config.
- Standard retry behavior: The standard set of retry rules across AWS SDKs. This mode includes a standard set of errors that are retried, and support for retry quotas. The default maximum number of attempts with this mode is three, unless max_attempts is explicitly configured.
- Adaptive retry behavior: Adaptive retry mode dynamically limits the rate of AWS requests to maximize success rate. This may be at the expense of request latency. Adaptive retry mode is not recommended when predictable latency is important.
  - Note: supporting the "adaptive" retry behavior is considered outside the scope of this RFC.
Configuring the maximum number of retries
This RFC will demonstrate (with examples) the following ways that users can set the maximum number of retry attempts:
- By calling the Config::retry_config(..) or Config::disable_retries() methods when building a service-specific config
- By calling the Config::retry_config(..) or Config::disable_retries() methods when building a shared config
- By setting the AWS_MAX_ATTEMPTS environment variable
The above list is in order of decreasing precedence, e.g., setting maximum retry attempts with the max_attempts builder method will override a value set by AWS_MAX_ATTEMPTS.
The default number of retries is 3 as specified in the AWS SDKs and Tools Reference Guide.
Setting an environment variable
Here's an example app that logs your AWS user's identity
use aws_sdk_sts as sts;
#[tokio::main]
async fn main() -> Result<(), sts::Error> {
let config = aws_config::load_from_env().await;
let sts = sts::Client::new(&config);
let resp = sts.get_caller_identity().send().await?;
println!("your user id: {}", resp.user_id.unwrap_or_default());
Ok(())
}
Then, in your terminal:
# Set the env var before running the example program
export AWS_MAX_ATTEMPTS=5
# Run the example program
cargo run
Calling a method on an AWS shared config
Here's an example app that creates a shared config with custom retry behavior and then logs your AWS user's identity
use aws_sdk_sts as sts;
use aws_types::retry_config::StandardRetryConfig;
#[tokio::main]
async fn main() -> Result<(), sts::Error> {
let retry_config = StandardRetryConfig::builder().max_attempts(5).build();
let config = aws_config::from_env().retry_config(retry_config).load().await;
let sts = sts::Client::new(&config);
let resp = sts.get_caller_identity().send().await?;
println!("your user id: {}", resp.user_id.unwrap_or_default());
Ok(())
}
Calling a method on service-specific config
Here's an example app that creates a service-specific config with custom retry behavior and then logs your AWS user's identity
use aws_sdk_sts as sts;
use aws_types::retry_config::StandardRetryConfig;
#[tokio::main]
async fn main() -> Result<(), sts::Error> {
let config = aws_config::load_from_env().await;
let retry_config = StandardRetryConfig::builder().max_attempts(5).build();
let sts_config = sts::config::Config::from(&config).retry_config(retry_config).build();
let sts = sts::Client::new(&sts_config);
let resp = sts.get_caller_identity().send().await?;
println!("your user id: {}", resp.user_id.unwrap_or_default());
Ok(())
}
Disabling retries
Here's an example app that creates a shared config that disables retries and then logs your AWS user's identity
use aws_sdk_sts as sts;
use aws_types::config::Config;
#[tokio::main]
async fn main() -> Result<(), sts::Error> {
let config = aws_config::from_env().disable_retries().load().await;
let sts_config = sts::config::Config::from(&config).build();
let sts = sts::Client::new(&sts_config);
let resp = sts.get_caller_identity().send().await?;
println!("your user id: {}", resp.user_id.unwrap_or_default());
Ok(())
}
Retries can also be disabled by explicitly passing the RetryConfig::NoRetries
enum variant to the retry_config
builder method:
use aws_sdk_sts as sts;
use aws_types::retry_config::RetryConfig;
#[tokio::main]
async fn main() -> Result<(), sts::Error> {
let config = aws_config::load_from_env().await;
let sts_config = sts::config::Config::from(&config).retry_config(RetryConfig::NoRetries).build();
let sts = sts::Client::new(&sts_config);
let resp = sts.get_caller_identity().send().await?;
println!("your user id: {}", resp.user_id.unwrap_or_default());
Ok(())
}
Behind the scenes
Currently, when users want to send a request, the following occurs:
- The user creates either a shared config or a service-specific config
- The user creates a fluent client for the service they want to interact with and passes in the config they created. Internally, this creates an AWS client with a default retry policy
- The user calls an operation builder method on the client which constructs a request
- The user sends the request by awaiting the send() method
- The Smithy client creates a new Service and attaches a copy of its retry policy
- The Service is called, sending out the request and retrying it according to the retry policy
After this change, the process will work like this:
- The user creates either a shared config or a service-specific config
  - If AWS_MAX_ATTEMPTS is set to zero, this is invalid and we will log it with tracing::warn. However, this will not error until a request is made
  - If AWS_MAX_ATTEMPTS is 1, retries will be disabled
  - If AWS_MAX_ATTEMPTS is greater than 1, retries will be attempted at most as many times as is specified
  - If the user creates the config with the disable_retries builder method, retries will be disabled
  - If the user creates the config with the retry_config builder method, retry behavior will be set according to the RetryConfig they passed
- The user creates a fluent client for the service they want to interact with and passes in the config they created
  - Provider precedence will determine what retry behavior is actually set, working like how Region is set
- The user calls an operation builder method on the client which constructs a request
- The user sends the request by awaiting the send() method
- The Smithy client creates a new Service and attaches a copy of its retry policy
- The Service is called, sending out the request and retrying it according to the retry policy
These changes will be made in such a way that they enable us to add the "adaptive" retry behavior at a later date without introducing a breaking change.
Changes checklist
- Create new Kotlin decorator RetryConfigDecorator
  - Based on RegionDecorator.kt
  - This decorator will live in the codegen project because it has relevance outside the SDK
- Breaking changes:
  - Rename aws_smithy_client::retry::Config to StandardRetryConfig
  - Rename the aws_smithy_client::retry::Config::with_max_retries method to with_max_attempts in order to follow AWS convention
  - Passing 0 to with_max_attempts will panic with a helpful, descriptive error message
- Create a non-exhaustive aws_types::retry_config::RetryConfig enum wrapping structs that represent specific retry behaviors (see the sketch after this list)
  - A NoRetry variant that disables retries. Doesn't wrap a struct since it doesn't need to contain any data
  - A Standard variant that enables the standard retry behavior. Wraps a StandardRetryConfig struct.
- Create aws_config::meta::retry_config::RetryConfigProviderChain
- Create aws_config::meta::retry_config::ProvideRetryConfig
- Create EnvironmentVariableMaxAttemptsProvider struct
  - Setting AWS_MAX_ATTEMPTS=0 and trying to load from env will panic with a helpful, descriptive error message
- Add retry_config method to aws_config::ConfigLoader
- Update AwsFluentClientDecorator to correctly configure the max retry attempts of its inner aws_hyper::Client based on the passed-in Config
- Add tests
  - Test that setting retry_config to 1 disables retries
  - Test that setting retry_config to n limits retries to n, where n is a non-zero integer
  - Test that correct precedence is respected when overriding retry behavior in a service-specific config
  - Test that correct precedence is respected when overriding retry behavior in a shared config
  - Test that creating a config from env if AWS_MAX_ATTEMPTS=0 will panic with a helpful, descriptive error message
  - Test that setting invalid max_attempts=0 with a StandardRetryConfig will panic with a helpful, descriptive error message
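As referenced in the checklist above, a sketch of the proposed RetryConfig enum and the struct it wraps (exact variant and field names are illustrative; the variant name follows the earlier RetryConfig::NoRetries usage example):
#[non_exhaustive]
#[derive(Debug, Clone)]
pub enum RetryConfig {
    /// Disables retries entirely; requests are attempted exactly once
    NoRetries,
    /// Enables the standard retry behavior described above
    Standard(StandardRetryConfig),
}

#[non_exhaustive]
#[derive(Debug, Clone)]
pub struct StandardRetryConfig {
    /// Maximum number of attempts (defaults to 3 when not explicitly configured)
    max_attempts: u32,
}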
RFC: Smithy Rust Service Framework
Status: RFC
The Rust Smithy Framework is a full-fledged service framework whose main responsibility is to handle request lifecycles from beginning to end. It takes care of input de-serialization, operation execution, output serialization, error handling, and provides facilities to fulfill the requirements below.
Requirements
Smithy model-driven code generation
Server side code is generated from Smithy models and implements operations, input and output structures, and errors defined in the service model.
Performance
This new framework is built with performance in mind. It refrains from allocating memory when not needed and tries to use a majority of borrowed types, handling their memory lifetimes so that a request body can be stored in memory only once and not cloned if possible.
The code is implemented on solid and widely used foundations. It uses Hyper to handle HTTP requests, the Tokio ecosystem for asynchronous (non-blocking) operations, and Tower to implement middleware such as timeouts, rate limiting, retries, and more. CPU-intensive operations are scheduled on a separate thread pool to avoid blocking the event loop.
It uses axum (from the Tokio ecosystem), an HTTP framework built on top of the technologies mentioned above which handles routing, request extraction, response building, and worker lifecycle. Axum is a relatively thin layer on top of Hyper that adds very little overhead, so its performance is comparable to Hyper's.
The framework should allow customers to use the built-in HTTP server or select other transport implementations that can be more performant or better suited than HTTP for their use case.
Extensibility
We want to deliver an extensible framework that can plug in components, both during code generation and at runtime, for specific scenarios that cannot be covered during generation. These components are developed against a standard interface provided by the framework itself.
Observability
Being able to report and trace the status of the service is vital for the success of any product. The framework is integrated with tracing and allows non-blocking I/O through the asynchronous tracing appender.
Metrics and logging are built with extensibility in mind, allowing customers to plug their own handlers following a well defined interface provided by the framework.
Client generation
Client generation is deferred to the various Smithy implementations.
Benchmarking
Benchmarking the framework is key: customers can't adopt anything that compromises their fundamental business objectives of latency and performance.
Model validation
The generated service code is responsible for validating the model constraints of input structures.
RFC: Service-specific middleware
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
Currently, all services use a centralized AwsMiddleware
that is defined in the (poorly named) aws-hyper
crate. This
poses a number of long term risks and limitations:
- When creating a Smithy Client directly for a given service, customers are forced to implicitly assume that the
service uses stock
AwsMiddleware
. This prevents us from ever changing the middleware stack for a service in the future. - It is impossible / impractical in the current situation to alter the middleware stack for a given service. For services like S3, we will almost certainly want to customize endpoint middleware in a way that is currently impossible.
In light of these limitations, this RFC proposes moving middleware into each generated service. aws-inlineable will be used to host and test the middleware stack. Each service will then define a public middleware module containing its middleware stack.
Terminology
- Middleware: A Tower layer that augments operation::Request -> operation::Response for things like signing and endpoint resolution.
- AWS Middleware: A specific middleware stack that meets the requirements for AWS services.
- Smithy Client: An aws_smithy_client::Client<C, M, R> struct that is responsible for gluing together the connector, middleware, and retry policy. This is not generated and lives in the aws-smithy-client crate.
- Fluent Client: A code-generated Client<C, M, R> that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
- AWS Client: A specialized Fluent Client that defaults to using a DynConnector, AwsMiddleware, and Standard retry policy.
- Shared Config: An aws_types::Config struct that is responsible for storing shared configuration data that is used across all services. This is not generated and lives in the aws-types crate.
- Service-specific Config: A code-generated Config that has methods for setting service-specific configuration. Each Config is defined in the config module of its parent service. For example, the S3-specific config struct is usable from aws_sdk_s3::config::Config and re-exported as aws_sdk_s3::Config.
Detailed Design
Currently, AwsMiddleware
is defined in aws-hyper
. As part of this change, an aws-inlineable
dependency will be
added containing code that is largely identical. This will be exposed in a public middleware
module in all generated
services. At some future point, we could even expose a baseline set of default middleware for whitelabel Smithy services
to make them easier to use out-of-the-box.
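Once each service exposes its own middleware module, constructing a Smithy client directly (without the fluent layer) might look like the following sketch; the DefaultMiddleware path and the conns::https() helper are assumptions, not confirmed APIs:
use aws_sdk_s3::middleware::DefaultMiddleware; // assumed location of the generated middleware stack
use aws_smithy_client::erase::DynConnector;

// Glue a connector together with the service-specific middleware
let connector = DynConnector::new(aws_smithy_client::conns::https());
let client = aws_smithy_client::Client::<DynConnector, DefaultMiddleware>::new(connector);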
The ClientGenerics
parameter of the AwsFluentClientGenerator
will be updated to become a RuntimeType
, enabling
loading the type directly. This has the advantage of making it fairly easy to do per-service middleware stacks since we
can easily configure AwsFluentClientGenerator
to insert different types based on the service id.
Changes Checklist
- Move aws-hyper into aws-inlineable. Update comments as needed, including with a usage example about how customers can augment it.
- Refactor ClientGenerics to contain a RuntimeType instead of a string, and update AwsFluentClientDecorator accordingly.
- Update all code and examples that use aws-hyper to use service-specific middleware.
- Push an updated README to aws-hyper deprecating the package and explaining what happened. Do not yank previous versions since those will be relied on by older SDK versions.
RFC: Split Release Process
Status: Implemented in smithy-rs#986 and aws-sdk-rust#351
At the time of writing, the aws-sdk-rust
repository is used exclusively
for the entire release process of both the Rust runtime crates from smithy-rs
as
well as the AWS runtime crates and the AWS SDK. This worked well when smithy-rs
was
only used for the AWS SDK, but now that it's also used for server codegen, there
are issues around publishing the server-specific runtime crates since they don't
belong to the SDK.
This RFC proposes a new split-release process so that the entire smithy-rs
runtime
can be published separately before the AWS SDK is published.
Terminology
- Smithy Runtime Crate: A crate that gets published to crates.io and supports the code generated by smithy-rs. These crates don't provide any SDK-only functionality. These crates can support client and/or server code, and clients or servers may use only a subset of them.
- AWS Runtime Crate: A crate of SDK-specific code that supports the code generated by the aws/codegen module in smithy-rs. These also get published to crates.io.
- Publish-ready Bundle: A build artifact that is ready to publish to crates.io without additional steps (such as running the publisher tool's fix-manifests subcommand). Publishing one group of crates before another is not considered an additional step for this definition.
- Releaser: A developer, automated process, or combination of the two that performs the actual release.
Requirements
At a high level, the requirements are: publish from both smithy-rs
and aws-sdk-rust
while preserving our current level of confidence in the quality of the release. This
can be enumerated as:
- All Smithy runtime crates must be published together from
smithy-rs
- AWS runtime crates and the SDK must be published together from
aws-sdk-rust
- CI on
smithy-rs
must give confidence that the Smithy runtime crates, AWS runtime crates, and SDK are all at the right quality bar for publish. - CI on the
aws-sdk-rust
repository must give confidence that the AWS SDK and its runtime crates are at the right quality bar for publish. To do this successfully, it must run against the exact versions of the Smithy runtime crates the code was generated against both before AND after they have been published to crates.io.
Background: How Publishing Worked Before
The publish process to crates.io relied on copying all the Smithy runtime crates into the final aws-sdk-rust repository. Overall, the process looked as follows:
- smithy-rs generates a complete aws-sdk-rust source bundle at CI time
- The releaser copies the generated bundle over to aws-sdk-rust
- The releaser runs the publisher fix-manifests subcommand to correct the Cargo.toml files generated by smithy-rs
- The aws-sdk-rust CI performs one last pass on the code to verify it's sound
- The releaser runs the publisher publish subcommand to push all the crates up to crates.io
Proposed Solution
CI in smithy-rs
will be revised to generate two separate build artifacts where it generates
just an SDK artifact previously. Now, it will have two build targets that get executed from CI
to generate these artifacts:
- rust-runtime:assemble - Generates a publish-ready bundle of Smithy runtime crates.
- aws:sdk:assemble - Generates a publish-ready bundle of AWS runtime crates, SDK crates, and just the Smithy runtime crates that are used by the SDK.
The aws-sdk-rust
repository will have a new next
branch that has its own set of CI workflows
and branch protection rules. The releaser will take the aws:sdk:assemble
artifact and apply it
directly to this next
branch as would have previously been done against the main
branch.
The main
branch will continue to have the same CI as next
.
When it's time to cut a release, the releaser will do the following:
- Tag
smithy-rs
with the desired version number - Wait for CI to build artifacts for the tagged release
- Pull-request the SDK artifacts over to
aws-sdk-rust/next
(this will be automated in the future) - Pull-request merge
aws-sdk-rust/next
into aws-sdk-rust/main
- Wait for successful CI in
main
- Tag release for
main
- Publish SDK with publisher tool
The server team can then download the rust-runtime:assemble
build artifact for the tagged release
in smithy-rs
, and publish the aws-smithy-http-server
crate from there.
Avoiding mistakes by disallowing creation of publish-ready bundles outside of CI
It should be difficult to accidentally publish a locally built set of crates. To add friction to this,
the smithy-rs
build process will look for the existence of the GITHUB_ACTIONS=true
environment variable.
If this environment variable is not set, then it will pass a flag to the Rust codegen plugin that tells it to
emit a publish = false
under [package]
in the generated Cargo.toml
.
This could be easily circumvented, but the goal is to reduce the chances of accidentally publishing crates rather than making it impossible.
Alternatives Considered
Publish Smithy runtime crates from smithy-rs
build artifacts
This approach is similar to the proposed solution, except that the SDK would not publish
the Smithy runtime crates. The aws-sdk-rust/main
branch would have a small tweak to its CI
so that the SDK is tested against the Smithy runtime crates that are published to crates.io
This CI process would look as follows:
- Shallow clone
aws-sdk-rust
with the revision being tested - Run a script to remove the
path
argument for the Smithy runtime crate dependencies for every crate inaws-sdk-rust
. For example,
aws-smithy-types = { version = "0.33.0", path = "../aws-smithy-types" }
Would become:
aws-smithy-types = { version = "0.33.0" }
- Run the tests as usual
When it's time to cut a release, the releaser will do the following:
- Tag
smithy-rs
with the desired version number - Wait for CI to build artifacts for the tagged release
- Pull-request the SDK artifacts over to
aws-sdk-rust/next
- Wait for successful CI in
aws-sdk-rust/next
- Download the Smithy runtime crates build artifact and publish it to crates.io
- Pull-request merge
aws-sdk-rust/next
into aws-sdk-rust/main
- Wait for successful CI in
main
(this time actually running against the crates.io Smithy runtime crates) - Tag release for
main
- Publish SDK with publisher tool
Keep Smithy runtime crates in smithy-rs
This approach is similar to the previous alternative, except that the aws-sdk-rust
repository
won't have a snapshot of the Smithy runtime crates, and an additional step needs to be performed
during CI for the next
branch so that it looks as follows:
- Make a shallow clone of
aws-sdk-rust/next
- Retrieve the
smithy-rs
commit hash that was used to generate the SDK from a file that was generated alongside the rest of the build artifacts fromsmithy-rs
and copied intoaws-sdk-rust
. - Make a shallow clone of
smithy-rs
at the correct commit hash - Use a script to add a
[patch]
section to all the AWS SDK crates to point to the Smithy runtime crates from the local clone ofsmithy-rs
. For example:
# The dependencies section is left alone, but is here for context
[dependencies]
# Some version of aws-smithy-types that isn't on crates.io yet, referred to as `<unreleased>` below
aws-smithy-types = "<unreleased>"
# This patch section gets added by the script
[patch.crates-io]
aws-smithy-types = { version = "<unreleased>", path = "path/to/local/smithy-rs/rust-runtime/aws-smithy-types"}
- Run CI as normal.
Note: smithy-rs
would need to do the same patching in CI as aws-sdk-rust/next
since the generated
SDK would not have path dependencies for the Smithy runtime crates (since they are a publish-ready bundle
intended for landing in aws-sdk-rust
). The script that does this patching could live in smithy-rs
and be
reused by aws-sdk-rust
.
The disadvantage of this approach is that a customer having an issue with the current release wouldn't be able
to get a fix sooner by patching their own project's crate manifest to use the aws-sdk-rust/next
branch before
a release is cut since their project wouldn't be able to find the unreleased Smithy runtime crates.
Changes Checklist
- In smithy-rs:
  - Move publisher tool from aws-sdk-rust into smithy-rs
  - Modify aws:sdk:assemble target to run the publisher fix-manifests subcommand
  - Add rust-runtime:assemble target that generates publish-ready Smithy runtime crates
  - Add CI step to create Smithy runtime bundle artifact
  - Add GITHUB_ACTIONS=true env var check for setting the publish flag in generated AND runtime manifests
  - Revise publisher tool to publish from an arbitrary directory
- In aws-sdk-rust:
  - Implement CI for the aws-sdk-rust/next branch
  - Remove the publisher tool
- Update release process documentation
Summary
Status: Implemented
Smithy models paginated responses. Customers of Smithy-generated code & the Rust SDK will have an improved user experience if code is generated to support this. Fundamentally, paginators are a way to automatically make a series of requests with the SDK, where subsequent requests automatically forward output from the previous responses. There is nothing a paginator does that a user could not do manually; they merely simplify the common task of interacting with paginated APIs. Specifically, a paginator will resend the original request but with inputToken updated to the value of the previous outputToken.
In this RFC, we propose modeling paginated data as a Stream of output shapes.
- When an output is paginated, a paginate() method will be added to the high-level builder.
- An <OperationName>Paginator struct will be generated into the paginator module.
- If items is modeled, paginate().items() will be added to produce the paginated items. <OperationName>PaginatorItems will be generated into the paginator module.
The Stream trait enables customers to use a number of abstractions including simple looping and collect()ing all data in a single call. A paginator will resend the original input, but with the field marked inputToken set to the value of outputToken in the previous output.
Usage example:
let paginator = client
.list_tables()
.paginate()
.items()
.page_size(10)
.send()
.await;
let tables: Result<Vec<_>, _> = paginator.collect().await;
Paginators are lazy and only retrieve pages when polled by a client.
Details
Paginators will be generated into the paginator
module of service crates. Currently, paginators are not feature gated, but this
could be considered in the future. A paginator
struct captures 2 pieces of data:
// dynamodb/src/paginator.rs
struct ListTablesPaginator<C, M, R> {
// holds the low-level client and configuration
handle: Arc<Handle<C, M, R>>,
// input builder to construct the actual input on demand
input: ListTablesInputBuilder
}
In addition to the basic usage example above, when pageSize
is modeled, customers can specify the page size during
pagination:
let mut tables = vec![];
let mut pages = client
.list_tables()
.paginate()
.page_size(20)
.send();
while let Some(next_page) = pages.try_next().await? {
// pages of 20 items requested from DynamoDb
tables.extend(next_page.table_names.unwrap_or_default().into_iter());
}
Paginators define a public method send(). This method returns impl Stream<Item = Result<OperationOutput, OperationError>>. This uses FnStream, defined in the aws-smithy-async crate, which enables demand-driven execution of a closure. A rendezvous channel is used, which will block on send until demand exists.
When modeled by Smithy, a page_size method (which automatically sets the appropriate page-size parameter) and an items() method (which returns an automatically flattened paginator) are also generated. Note: page_size directly sets the modeled parameter on the internal builder. This means that a value set for page size will override any previously set value for that field.
// Generated paginator for ListTables
impl<C, M, R> ListTablesPaginator<C, M, R>
{
/// Set the page size
pub fn page_size(mut self, limit: i32) -> Self {
self.builder.limit = Some(limit);
self
}
/// Create a flattened paginator
///
/// This paginator automatically flattens results using `table_names`. Queries to the underlying service
/// are dispatched lazily.
pub fn items(self) -> crate::paginator::ListTablesPaginatorItems<C, M, R> {
crate::paginator::ListTablesPaginatorItems(self)
}
/// Create the pagination stream
///
/// _Note:_ No requests will be dispatched until the stream is used (eg. with [`.next().await`](tokio_stream::StreamExt::next)).
pub async fn send(
self,
) -> impl tokio_stream::Stream<
Item = std::result::Result<
crate::output::ListTablesOutput,
aws_smithy_http::result::SdkError<crate::error::ListTablesError>,
>,
> + Unpin
{
// Move individual fields out of self for the borrow checker
let builder = self.builder;
let handle = self.handle;
fn_stream::FnStream::new(move |tx| {
Box::pin(async move {
// Build the input for the first time. If required fields are missing, this is where we'll produce an early error.
let mut input = match builder.build().map_err(|err| {
SdkError::ConstructionFailure(err.into())
}) {
Ok(input) => input,
Err(e) => {
let _ = tx.send(Err(e)).await;
return;
}
};
loop {
let op = match input.make_operation(&handle.conf).await.map_err(|err| {
SdkError::ConstructionFailure(err.into())
}) {
Ok(op) => op,
Err(e) => {
let _ = tx.send(Err(e)).await;
return;
}
};
let resp = handle.client.call(op).await;
// If the input member is None or it was an error
let done = match resp {
Ok(ref resp) => {
input.exclusive_start_table_name = crate::lens::reflens_structure_crate_output_list_tables_output_last_evaluated_table_name(resp).cloned();
input.exclusive_start_table_name.is_none()
}
Err(_) => true,
};
if let Err(_) = tx.send(resp).await {
// receiving end was dropped
return;
}
if done {
return;
}
}
})
})
}
}
On Box::pin: The stream returned by AsyncStream does not implement Unpin. Unfortunately, this makes iteration require an invocation of pin_mut! and generates several hundred lines of compiler errors. Box::pin seems a worthwhile trade-off to improve the user experience.
On the + Unpin
bound: Because auto-traits leak across impl Trait
boundaries, + Unpin
prevents accidental
regressions in the generated code which would break users.
On the crate::reflens::...: We use LensGenerator.kt
to generate potentially complex accessors to deeply nested fields.
Updates to ergonomic clients
The builders
generated by ergonomic clients will gain the following method, if they represent an operation that implements the Paginated
trait:
/// Create a paginator for this request
///
/// Paginators are used by calling [`send().await`](crate::paginator::ListTablesPaginator::send) which returns a [`Stream`](tokio_stream::Stream).
pub fn paginate(self) -> crate::paginator::ListTablesPaginator<C, M, R> {
crate::paginator::ListTablesPaginator::new(self.handle, self.inner)
}
Discussion Areas
On send().await
Calling send().await is not necessary from an API perspective; we could have the paginators implement Stream directly. However, it enables using impl Trait syntax and also makes the API consistent with other SDK APIs.
On tokio_stream::Stream
Currently, the core trait we use is tokio_stream::Stream
. This is a re-export from futures-util. There are a few other choices:
- Re-export
Stream
from tokio_stream. - Use
futures_util
directly
On Generics
Currently, the paginators forward the generics from the client (C, M, R) along with their fairly annoying bounds. However, if we wanted to, we could simplify this and erase all the generics when the paginator is created. Since everything is code generated, there isn't actually much duplicated code in the generator, just in the generated code.
Changes Checklist
- Create and test FnStream abstraction
- Generate page-level paginators
- Generate .items() paginators
- Generate doc hints pointing people to paginators
- Integration test using mocked HTTP traffic against a generated paginator for a real service
- Integration test using real traffic
RFC: Examples Consolidation
Status: Implemented
Currently, the AWS Rust SDK's examples are duplicated across
awslabs/aws-sdk-rust
,
smithy-lang/smithy-rs
,
and awsdocs/aws-doc-sdk-examples
.
The smithy-rs
repository was formerly the source of truth for examples,
with the examples being copied over to aws-sdk-rust
as part of the release
process, and examples were manually copied over to aws-doc-sdk-examples
so that
they could be included in the developer guide.
Now that the SDK is more stable with less frequent breaking changes,
the aws-doc-sdk-examples
repository can become the source of truth
so long as the examples are tested against smithy-rs
and continue to be
copied into aws-sdk-rust
.
Requirements
- Examples are authored and maintained in `aws-doc-sdk-examples`
- Examples are no longer present in `smithy-rs`
- CI in `smithy-rs` checks out examples from `aws-doc-sdk-examples` and builds them against the generated SDK. Success for this CI job is optional for merging since there can be a time lag between identifying that examples are broken and fixing them.
- Examples must be copied into `aws-sdk-rust` so that the examples for a specific version of the SDK can be easily referenced.
- Examples must be verified in `aws-sdk-rust` prior to merging into the `main` branch.
Example CI in smithy-rs
A CI job will be added to `smithy-rs` that:
- Depends on the CI job that generates the full AWS SDK
- Checks out the `aws-doc-sdk-examples` repository
- Modifies example Cargo.toml files to point to the newly generated AWS SDK crates
- Runs `cargo check` on each example

This job will not be required to pass for branch protection, but will let us know that examples need to be updated before the next release.
Auto-sync to aws-sdk-rust from smithy-rs changes
The auto-sync job that copies generated code from `smithy-rs` into the `aws-sdk-rust/next` branch will be updated to check out the `aws-doc-sdk-examples` repository and copy the examples into `aws-sdk-rust`. The example Cargo.toml files will also be updated to point to the local crate paths as part of this process.
The `aws-sdk-rust` CI already requires examples to compile, so merging `next` into `main`, the step required to perform a release, will be blocked until the examples are fixed.
In the event the examples don't work on the `next` branch, developers and example writers will need to be able to point the examples in `aws-doc-sdk-examples` to the generated SDK in `next` so that they can verify their fixes. This can be done by hand, or a tool can be written to automate it if a significant number of examples need to be fixed.
Process Risks
There are a couple of risks with this approach:
- Risk: Examples are broken and an urgent fix needs to be released.
  Possible mitigations:
  - Revert the change that broke the examples and then add the urgent fix
  - Create a patch branch in `aws-sdk-rust`, apply the fix to it (based off an older version of `smithy-rs` with the fix applied), and merge that into `main`.
- Risk: A larger project requires changes to examples prior to GA, but multiple releases need to occur before the project completion.
  Possible mitigations:
  - If the required changes compile against the older SDK, then just make the changes to the examples.
  - Feature gate any incremental new functionality in `smithy-rs`, and work on example changes on a branch in `aws-doc-sdk-examples`. When wrapping up the project, remove the feature gating and merge the examples into the `main` branch.
Alternatives
aws-sdk-rust as the source of truth
Alternatively, the examples could reside in `aws-sdk-rust`, be referenced from `smithy-rs` CI, and get copied into `aws-doc-sdk-examples` for inclusion in the user guide.
Pros:
- Prior to GA, fixing examples after making breaking changes to the SDK would be easier. Otherwise, Cargo.toml files have to be temporarily modified to point to the `aws-sdk-rust/next` branch in order to make fixes.
- If a customer discovers examples via the `aws-sdk-rust` repository rather than via the SDK user guide, then it would be more obvious how to make changes to examples. At time of writing, the examples in the user guide link to the `aws-doc-sdk-examples` repository, so if the examples are discovered that way, then updating them should already be clear.
Cons:
- Tooling would need to be built to sync examples from `aws-sdk-rust` into `aws-doc-sdk-examples` so that they could be incorporated into the user guide.
- Creates a circular dependency between the `aws-sdk-rust` and `smithy-rs` repositories. CI in `smithy-rs` needs to exercise examples, which would be in `aws-sdk-rust`, and `aws-sdk-rust` has its code generated by `smithy-rs`. This is workable, but may lead to problems later on.
The tooling to auto-sync from `aws-sdk-rust` into `aws-doc-sdk-examples` will likely cost more than tooling to temporarily update Cargo.toml files to make example fixes (if that tooling is even necessary).
Changes Checklist
- Add example CI job to `smithy-rs`
- Diff examples in `smithy-rs` and `aws-doc-sdk-examples` and move desired differences into `aws-doc-sdk-examples`
- Apply example fix PRs from `aws-sdk-rust` into `aws-doc-sdk-examples`
- Update `smithy-rs` CI to copy examples from `aws-doc-sdk-examples` rather than from `smithy-rs`
- Delete examples from `smithy-rs`
RFC: Waiters
Status: Accepted
Waiters are a convenient polling mechanism to wait for a resource to become available or to be deleted. For example, a waiter could be used to wait for an S3 bucket to be created after a call to the `CreateBucket` API, and this would only require a small amount of code rather than building out an entire polling mechanism manually.
At the highest level, a waiter is a simple polling loop (pseudo-Rust):
// Track state that contains the number of attempts made and the previous delay
let mut state = initial_state();
loop {
// Poll the service
let result = poll_service().await;
// Classify the action that needs to be taken based on the Smithy model
match classify(result) {
// If max attempts hasn't been exceeded, then retry after a delay. Otherwise, error.
Retry => if state.should_retry() {
let delay = state.next_retry();
sleep(delay).await;
} else {
return error_max_attempts();
}
// Otherwise, if the termination condition was met, return the output
Terminate(result) => return result,
}
}
In the AWS SDK for Rust, waiters can be added without making any backwards breaking changes to the current API. This doc outlines the approach to add them in this fashion, but does NOT examine code generating response classification from JMESPath expressions, which can be left to the implementer without concern for the overall API.
Terminology
Today, there are three layers of `Client` that are easy to confuse, so to make the following easier to follow, these terms will be used:
- Connector: An implementor of Tower's `Service` trait that converts a request into a response. This is typically a thin wrapper around a Hyper client.
- Smithy Client: An `aws_smithy_client::Client<C, M, R>` struct that is responsible for gluing together the connector, middleware, and retry policy. This isn't intended to be used directly.
- Fluent Client: A code-generated `Client<C, M, R>` that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
- AWS Client: A specialized Fluent Client that uses a `DynConnector`, `DefaultMiddleware`, and `Standard` retry policy.

All of these are just called `Client` in code today. This is something that could be clarified in a separate refactor.
Requirements
Waiters must adhere to the Smithy waiter specification. To summarize:
- Waiters are specified by the Smithy `@waitable` trait
- Retry during polling must be exponential backoff with jitter, with the min/max delay times and max attempts configured by the `@waitable` trait
- The SDK's built-in retry needs to be replaced by the waiter's retry since the Smithy model can specify retry conditions that are contrary to the defaults. For example, an error that would otherwise be retried by default might be the termination condition for the waiter.
- Classification of the response must be code generated based on the JMESPath expression in the model.
Waiter API
To invoke a waiter, customers will only need to invoke a single function on the AWS Client. For example, waiting for an S3 bucket to exist would look like the following:
// Request bucket creation
client.create_bucket()
.bucket_name("my-bucket")
.send()
.await?;
// Wait for it to be created
client.wait_until_bucket_exists()
.bucket_name("my-bucket")
.send()
.await?;
The call to `wait_until_bucket_exists()` will return a waiter-specific fluent builder with a `send()` function that will start the polling and return a future.
To avoid name conflicts with other API methods, the waiter functions can be added to the client via trait:
pub trait WaitUntilBucketExists {
fn wait_until_bucket_exists(&self) -> crate::waiter::bucket_exists::Builder;
}
This trait would be implemented for the service's fluent client (which will necessitate making the fluent client's `handle` field `pub(crate)`).
Waiter Implementation
A waiter trait implementation will merely return a fluent builder:
impl WaitUntilBucketExists for Client {
fn wait_until_bucket_exists(&self) -> crate::waiter::bucket_exists::Builder {
crate::waiter::bucket_exists::Builder::new()
}
}
This builder will have a short send()
function to kick off the actual waiter implementation:
impl Builder {
// ... existing fluent builder codegen can be reused to create all the setters and constructor
pub async fn send(self) -> Result<HeadBucketOutput, SdkError<HeadBucketError>> {
// Builds an input from this builder
let input = self.inner.build().map_err(|err| aws_smithy_http::result::SdkError::ConstructionFailure(err.into()))?;
// Passes in the client's handle, which contains a Smithy client and client config
crate::waiter::bucket_exists::wait(self.handle, input).await
}
}
This wait function needs to, in a loop similar to the pseudo-code in the beginning, convert the given input into an operation, replace the default response classifier on it with a no-retry classifier, and then determine what to do next based on that classification:
pub async fn wait(
handle: Arc<Handle<DynConnector, DynMiddleware<DynConnector>, retry::Standard>>,
input: HeadBucketInput,
) -> Result<HeadBucketOutput, SdkError<HeadBucketError>> {
loop {
let operation = input
.make_operation(&handle.conf)
.await
.map_err(|err| {
aws_smithy_http::result::SdkError::ConstructionFailure(err.into())
})?;
// Assume `ClassifyRetry` trait is implemented for `NeverRetry` to always return `RetryKind::Unnecessary`
let operation = operation.with_retry_classifier(NeverRetry::new());
let result = handle.client.call(operation).await;
match classify_result(&input, result) {
AcceptorState::Retry => {
// The sleep implementation is available here from `handle.conf.sleep_impl`
unimplemented!("Check if another attempt should be made and calculate delay time if so")
}
AcceptorState::Terminate(output) => return output,
}
}
}
fn classify_result(
input: &HeadBucketInput,
result: Result<HeadBucketOutput, SdkError<HeadBucketError>>,
) -> AcceptorState<HeadBucketOutput, SdkError<HeadBucketError>> {
unimplemented!(
"The Smithy model would dictate conditions to check here to produce an `AcceptorState`"
)
}
The retry delay time should be calculated by the same exponential backoff with jitter code that the
default RetryHandler
uses in aws-smithy-client
. This function will need to be split up and made
available to the waiter implementations so that just the delay can be calculated.
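As a rough illustration of the kind of shared helper this implies (the signature and the use of the `fastrand` crate below are illustrative only; the real implementation would reuse the existing `RetryHandler` logic):
use std::time::Duration;

/// Calculate the next waiter delay using exponential backoff with full jitter.
/// `min_delay` and `max_delay` come from the `@waitable` trait; `attempt` starts at 0.
fn waiter_delay(min_delay: Duration, max_delay: Duration, attempt: u32) -> Duration {
    // Exponential backoff, capped at `max_delay`
    let ceiling = min_delay
        .checked_mul(2u32.saturating_pow(attempt))
        .unwrap_or(max_delay)
        .min(max_delay);
    // Full jitter: pick a random point between zero and the backoff ceiling
    ceiling.mul_f64(fastrand::f64())
}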
Changes Checklist
- Codegen fluent builders for waiter input and their `send()` functions
- Codegen waiter invocation traits
- Commonize exponential backoff with jitter delay calculation
- Codegen `wait()` functions with delay and max attempts configuration from Smithy model
- Codegen `classify_result()` functions based on JMESPath expressions in Smithy model
RFC: Publishing the Alpha SDK to Crates.io
Status: Implemented
The AWS SDK for Rust and its supporting Smithy crates need to be published to crates.io so that customers can include them in their projects and also publish crates of their own that depend on them.
This doc proposes a short-term solution for publishing to crates.io. This approach is intended to be executed manually by a developer using scripts and an SOP no more than once per week, and should require less than a dev week to implement.
Terminology
- AWS SDK Crate: A crate that provides a client for calling a given AWS service, such as `aws-sdk-s3` for calling S3.
- AWS Runtime Crate: Any runtime crate that the AWS SDK generated code relies on, such as `aws-types`.
- Smithy Runtime Crate: Any runtime crate that the smithy-rs generated code relies on, such as `smithy-types`.
Requirements
Versioning
Cargo uses semver for versioning, with a `major.minor.patch-pre` format:
- `major`: Incompatible API changes
- `minor`: Added functionality in a backwards compatible manner
- `patch`: Backwards compatible bug fixes
- `pre`: Pre-release version tag (omitted for normal releases)
For now, AWS SDK crates (including aws-config
) will maintain a consistent major
and minor
version number
across all services. The latest version of aws-sdk-s3
will always have the same major.minor
version as the
latest aws-sdk-dynamodb
, for example. The patch
version is allowed to be different between service crates,
but it is unlikely that we will make use of patch
versions throughout alpha and dev preview.
Smithy runtime crates will have different version numbers from the AWS SDK crates, but will also maintain
a consistent major.minor
.
The pre
version tag will be alpha
during the Rust SDK alpha, and will be removed once the SDK is in
dev preview.
During alpha, the major
version will always be 0, and the minor
will be bumped for all published
crates for every release. A later RFC may change the process during dev preview.
Yanking
Mistakes will inevitably be made, and a mechanism is needed to yank packages while keeping the latest version of the SDK successfully consumable from crates.io. To keep this simple, the entire published batch of crates will be yanked if any crate in that batch needs to be yanked. For example, if 260 crates were published in a batch, and it turns out there's a problem that requires yanking one of them, then all 260 will be yanked. Attempting to do partial yanking will require a lot of effort and be difficult to get right. Yanking should be a last resort.
Concrete Scenarios
The following changes will be bundled together as a `minor` version bump during weekly releases:
- AWS model updates
- New features
- Bug fixes in runtime crates or codegen

In exceptional circumstances, a `patch` version will be issued if the fix doesn't require API breaking changes:
- CVE discovered in a runtime crate
- Buggy update to a runtime crate

In the event of a CVE being discovered in an external dependency, if the external dependency is internal to a crate, then a `patch` revision can be issued for that crate to correct it. Otherwise, if the CVE is in a dependency that is part of the public API, a `minor` revision will be issued with an expedited release. For a CVE in generated code, a `minor` revision will be issued with an expedited release.
Proposal
The short-term approach builds off our pre-crates.io weekly release process. That process was the following:
- Run script to update AWS models
- Manually update AWS SDK version in `aws/sdk/gradle.properties` in smithy-rs
- Tag smithy-rs
- Wait for GitHub actions to generate AWS SDK using newly released smithy-rs
- Check out aws-sdk-rust, delete existing SDK code, unzip generated SDK in place, and update readme
- Tag aws-sdk-rust

To keep things simple:
- The Smithy runtime crates will have the same smithy-rs version
- All AWS crates will have the same AWS SDK version
- `patch` revisions are exceptional and will be one-off manually published by a developer

All runtime crate version numbers in smithy-rs will be locked at `0.0.0-smithy-rs-head`. This is a fake version number that gets replaced when generating the SDK.
The SDK generator script in smithy-rs will be updated to:
- Replace Smithy runtime crate versions with the smithy-rs version from `aws/sdk/gradle.properties`
- Replace AWS runtime crate versions with the AWS SDK version from `aws/sdk/gradle.properties`
- Add correct version numbers to all path dependencies in all the final crates that end up in the build artifacts

This will result in all the crates having the correct version and manifests when imported into aws-sdk-rust. From there, a script needs to be written to determine crate dependency order and publish crates (preferably with throttling and retry) in the correct order. This script needs to be able to recover from an interruption part way through publishing all the crates, and it also needs to output a list of all crate versions published together. This crate list will be commented on the release issue so that yanking the batch can be done if necessary.
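The dependency-ordering step of that script could look something like this sketch (illustrative only; the real script would also need throttling, retry, and the ability to resume):
use std::collections::{HashMap, VecDeque};

/// Given each crate's dependencies on other crates in the same batch, produce an
/// order in which every crate appears after the crates it depends on.
/// Assumes the dependency graph is acyclic (which Cargo enforces).
fn publish_order(deps: &HashMap<String, Vec<String>>) -> Vec<String> {
    // Count how many in-batch dependencies each crate is still waiting on
    let mut remaining: HashMap<&str, usize> = deps
        .iter()
        .map(|(name, d)| (name.as_str(), d.iter().filter(|dep| deps.contains_key(*dep)).count()))
        .collect();
    // Crates with no unpublished in-batch dependencies can be published immediately
    let mut ready: VecDeque<&str> = remaining
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(name, _)| *name)
        .collect();
    let mut order = Vec::new();
    while let Some(next) = ready.pop_front() {
        order.push(next.to_string());
        // Publishing `next` unblocks every crate that depends on it
        for (name, d) in deps {
            if d.iter().any(|dep| dep.as_str() == next) {
                let count = remaining.get_mut(name.as_str()).expect("crate is tracked");
                *count -= 1;
                if *count == 0 {
                    ready.push_back(name.as_str());
                }
            }
        }
    }
    order
}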
The new release process would be:
- Run script to update AWS models
- Manually update both the AWS SDK version and the smithy-rs version in `aws/sdk/gradle.properties` in smithy-rs
- Tag smithy-rs
- Wait for automation to sync changes to `aws-sdk-rust/next`
- Cut a PR to merge `aws-sdk-rust/next` into `aws-sdk-rust/main`
- Tag aws-sdk-rust
- Run publish script
Short-term Changes Checklist
- Prepare runtime crate manifests for publication to crates.io (https://github.com/smithy-lang/smithy-rs/pull/755)
- Update SDK generator to set correct crate versions (https://github.com/smithy-lang/smithy-rs/pull/755)
- Write bulk publish script
- Write bulk yank script
- Write automation to sync smithy-rs to aws-sdk-rust
RFC: Independent Crate Versioning
Status: RFC
During its alpha and dev preview releases, the AWS SDK for Rust adopted a short-term solution for versioning and publishing to crates.io. This doc proposes a long-term versioning strategy that will carry the SDK from dev preview into general availability.
This strategy will be implemented in two phases:
- Dev Preview: The SDK will break with its current version strategy of maintaining consistent `major.minor` version numbers.
- Stability and 1.x: This phase begins when the SDK becomes generally available. The major version will be bumped to 1, and backwards breaking changes will no longer be allowed without a major version bump to all crates in the SDK.
Terminology
- AWS SDK Crate: A crate that provides a client for calling a given AWS service, such as `aws-sdk-s3` for calling S3.
- AWS Runtime Crate: Any runtime crate that the AWS SDK generated code relies on, such as `aws-types`.
- Smithy Runtime Crate: Any runtime crate that the smithy-rs generated code relies on, such as `smithy-types`.
Requirements
Versioning
Cargo uses semver for versioning, with a `major.minor.patch-pre` format:
- `major`: Incompatible API changes
- `minor`: Added functionality in a backwards compatible manner
- `patch`: Backwards compatible bug fixes
- `pre`: Pre-release version tag (omitted for normal releases)
In the new versioning strategy, the `minor` version number will no longer be coordinated across all SDK and Smithy runtime crates.
During the Dev Preview phase, the `major` version will always be 0, and the following scheme will be used:
- `minor`:
  - New features
  - Breaking changes
  - Dependency updates for dependencies that are part of the public API
  - Model updates with API changes
  - For code-generated crates: when a newer version of smithy-rs is used to generate the crate
- `patch`:
  - Bug fixes that do not break backwards compatibility
  - Model updates that only have documentation changes

During the Stability and 1.x phase:
- `major`: Breaking changes
- `minor`:
  - Changes that aren't breaking
  - Dependency updates for dependencies that are part of the public API
  - Model updates with API changes
  - For code-generated crates: when a newer version of smithy-rs is used to generate the crate
- `patch`:
  - Bug fixes that do not break backwards compatibility
  - Model updates that only have documentation changes

During the Stability and 1.x phase, bumps to the `major` version must be coordinated across all SDK and runtime crates.
Release Identification
Since there will no longer be one SDK "version", release tags will be dates in YYYY-MM-DD
format
rather than version numbers. Additionally, the SDK's user agent string will need to include a separate
service version number (this requirement has already been implemented).
Yanking
It must be possible to yank an entire release with a single action. The publisher tool must be updated to understand which crate versions were released with a given release tag, and be able to yank all the crates published from that tag.
Phase 1: Dev Preview
Phase 1 will address the following challenges introduced by uncoordinating the major.minor
versions:
- Tracking of versions associated with a release tag
- Creation of version bump process for code generated crates
- Enforcement of version bump process in runtime crates
- Yanking of versions associated with a release tag
Version Tracking
A new manifest file will be introduced in the root of aws-sdk-rust named versions.toml
that describes
all versioning information for any given commit in the repository. In the main branch, the versions.toml
in tagged commits will become the source of truth for which crate versions belong to that release, as well
as additional metadata that's required for maintaining version process in the future.
The special 0.0.0-smithy-rs-head
version that is used prior to Phase 1 for maintaining the runtime crate
versions will no longer be used (as detailed in Versioning for Runtime Crates).
This format will look as follows:
smithy_rs_version = "<release-tag|commit-hash>"
[aws-smithy-types]
version = "0.50.1"
[aws-config]
version = "0.40.0"
[aws-sdk-s3]
version = "0.89.0"
model_hash = "<hash>"
# ...
The auto-sync tool is responsible for maintaining this file. When it generates a new SDK, it will take the version numbers from runtime crates directly, and it will use the rules from the next section to determine the version numbers for the generated crates.
Versioning for Code Generated (SDK Service) Crates
Code generated crates will have their minor
version bumped when the version of smithy-rs used to generate
them changes, or when model updates with API changes are made. Three pieces of information are required to
handle this process: the previously released version number, the smithy-rs version used to generate the code,
and the level of model updates being applied. For this last one, if there are multiple model updates that
affect only documentation, but then one model update that affects an API, then as a whole they will be
considered as affecting an API and require a minor
version bump.
The previously released version number will be retrieved from crates.io using its API. The smithy-rs version
used during code generation will become a build artifact that is saved to versions.toml
in aws-sdk-rust.
During phase 1, the tooling required to know if a model is a documentation-only change will not be available,
so all model changes will result in a minor
version bump during this phase.
Overall, determining a generated crate's version number looks as follows:
flowchart TD
    start[Generate crate version] --> smithyrschanged{A. smithy-rs changed?}
    smithyrschanged -- Yes --> minor1[Minor version bump]
    smithyrschanged -- No --> modelchanged{B. model changed?}
    modelchanged -- Yes --> minor2[Minor version bump]
    modelchanged -- No --> keep[Keep current version]
- A: smithy-rs changed?: Compare the
smithy_rs_version
in the previousversions.toml
with the nextversions.toml
file, and if the values are different, consider smithy-rs to have changed. - B: model changed?: Similarly, compare the
model_hash
for the crate inversions.toml
.
Versioning for Runtime Crates
The old scheme of all runtime crates in smithy-rs having a fake 0.0.0-smithy-rs-head
version number with
a build step to replace those with a consistent major.minor
will be removed. These runtime crates will begin
having their actual next version number in the Cargo.toml file in smithy-rs.
This introduces a new problem where a developer can forget to bump a runtime crate version, so a method of
process enforcement needs to be introduced. This will be done through CI when merging into smithy-rs/main
and repeated when merging into aws-sdk-rust/main
.
The following checks need to be run for runtime crates:
flowchart TD
    A[Check runtime crate] --> B{A. Crate has changed?}
    B -- Yes --> C{B. Minor bumped?}
    B -- No --> H{C. Version changed?}
    C -- Yes --> K[Pass]
    C -- No --> E{D. Patch bumped?}
    E -- Yes --> F{E. Semverver passes?}
    E -- No --> L[Fail]
    F -- Yes --> D[Pass]
    F -- No --> G[Fail]
    H -- Yes --> I[Fail]
    H -- No --> J[Pass]
- A: Crate has changed? The crate's source files and manifest will be hashed for the previous version and the next version. If these hashes match, then the crate is considered unchanged.
- B: Minor bumped? The previous version is compared against the next version to see if the minor version number was bumped.
- C: Version changed? The previous version is compared against the next version to see if it changed.
- D: Patch bumped? The previous version is compared against the next version to see if the patch version number was bumped.
- E: Semverver passes? Runs rust-semverver against the old and new versions of the crate.
- If semverver fails to run (for example, if it needs to be updated to the latest nightly to succeed), then fail CI saying that either semverver needs maintenance, or that a minor version bump is required.
- If semverver results in errors, fail CI indicating a minor version bump is required.
- If semverver passes, then pass CI.
When running semverver, the path dependencies of the crate under examination should be updated to be crates.io references if there were no changes in those crates since the last publish to crates.io. Otherwise, the types referenced from those crates in the public API will always result in breaking changes since, as far as the Rust compiler is concerned, they are different types originating from separate path-dependency crates.
For CI, the aws-sdk-rust/main
branch's versions.toml
file is the source of truth for the previous release's
crate versions and source code.
Yanking
The publisher tool will be updated to read the versions.toml
to yank all versions published in a release.
This process will look as follows:
- Take a path to a local clone of the aws-sdk-rust repository
- Confirm the working tree is currently unmodified and on a release tag.
- Read
versions.toml
and print out summary of crates to yank - Confirm with user before proceeding
- Yank crates
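The final "yank crates" step could be a thin wrapper around `cargo yank`, along the lines of this sketch (it assumes `versions.toml` has already been parsed into crate/version pairs and that the user has confirmed):
use std::process::Command;

fn yank_release(crates: &[(String, String)]) -> std::io::Result<()> {
    for (name, version) in crates {
        // `cargo yank --vers <version> <crate>` yanks a single published version
        let status = Command::new("cargo")
            .args(["yank", "--vers", version.as_str(), name.as_str()])
            .status()?;
        if !status.success() {
            eprintln!("failed to yank {}@{}; rerun to retry the remaining crates", name, version);
        }
    }
    Ok(())
}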
Changes Checklist
-
Update
rust-semverver
to a newer nightly that can compileaws-smithy-client
-
Establish initial
versions.toml
inaws-sdk-rust/main
- Set version numbers in runtime crates in smithy-rs
-
Update the auto-sync tool to generate
versions.toml
-
Create CI tool to check runtime crate version
-
Integrate with
smithy-rs/main
CI -
Integrate with
aws-sdk-rust/main
CI
-
Integrate with
-
Update CI to verify no older runtime crates are used. For example, if
aws-smithy-client
is bumped to0.50.0
, then verify no crates (generated or runtime) depend on0.49.0
or lower.
Estimate: 2-4 dev weeks
Phase 2: Stability and 1.x
When stabilizing to 1.x, the version process will stay the same, but the minor version bumps caused by bumping runtime crate versions, updating models, or changing the code generator will be candidates for automatic upgrade per semver. At that point, no further API breaking changes can be made without a major version bump.
RFC: Callback APIs for ByteStream and SdkBody
Status: RFC
Adding a callback API to `ByteStream` and `SdkBody` will enable developers using the SDK to implement things like checksum validations and 'read progress' callbacks.
The Implementation
Note that comments starting with '//' are not necessarily going to be included in the actual implementation and are intended as clarifying comments for the purposes of this RFC.
// in aws_smithy_http::callbacks...
/// A callback that, when inserted into a request body, will be called for corresponding lifecycle events.
trait BodyCallback: Send {
/// This lifecycle function is called for each chunk **successfully** read. If an error occurs while reading a chunk,
/// this method will not be called. This method takes `&mut self` so that implementors may modify an implementing
/// struct/enum's internal state. Implementors may return an error.
fn update(&mut self, #[allow(unused_variables)] bytes: &[u8]) -> Result<(), BoxError> { Ok(()) }
/// This callback is called once all chunks have been read. If the callback encountered one or more errors
/// while running `update`s, this is how those errors are raised. Implementors may return a [`HeaderMap`][HeaderMap]
/// that will be appended to the HTTP body as a trailer. This is only useful to do for streaming requests.
fn trailers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> { Ok(None) }
/// Create a new `BodyCallback` from an existing one. This is called when a `BodyCallback` needs to be
/// re-initialized with default state. For example: when a request has a body that needs to be
/// rebuilt, all callbacks for that body need to be run again but with a fresh internal state.
fn make_new(&self) -> Box<dyn BodyCallback>;
}
impl BodyCallback for Box<dyn BodyCallback> {
    // Delegate to the boxed trait object rather than to this impl itself
    fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> { self.as_mut().update(bytes) }
    fn trailers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> { self.as_ref().trailers() }
    fn make_new(&self) -> Box<dyn BodyCallback> { self.as_ref().make_new() }
}
The changes we need to make to `ByteStream`:
(The current version of `ByteStream` and `Inner` can be seen here.)
// in `aws_smithy_http::byte_stream`...
// We add a new method to `ByteStream` for inserting callbacks
impl ByteStream {
// ...other impls omitted
// A "builder-style" method for setting callbacks
pub fn with_body_callback(&mut self, body_callback: Box<dyn BodyCallback>) -> &mut Self {
self.inner.with_body_callback(body_callback);
self
}
}
impl Inner<SdkBody> {
// `Inner` wraps an `SdkBody` which has a "builder-style" function for adding callbacks.
pub fn with_body_callback(&mut self, body_callback: Box<dyn BodyCallback>) -> &mut Self {
self.body.with_body_callback(body_callback);
self
}
}
The changes we need to make to `SdkBody`:
(The current version of `SdkBody` can be seen here.)
// In aws_smithy_http::body...
#[pin_project]
pub struct SdkBody {
#[pin]
inner: Inner,
rebuild: Option<Arc<dyn (Fn() -> Inner) + Send + Sync>>,
// We add a `Vec` to store the callbacks
#[pin]
callbacks: Vec<Box<dyn BodyCallback>>,
}
impl SdkBody {
// We update the various fns that create `SdkBody`s to create an empty `Vec` to store callbacks.
// Those updates are very simple so I've omitted them from this code example.
fn poll_inner(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Bytes, Error>>> {
let mut this = self.project();
// This block is old. I've included for context.
let polling_result = match this.inner.project() {
InnerProj::Once(ref mut opt) => {
let data = opt.take();
match data {
Some(bytes) if bytes.is_empty() => Poll::Ready(None),
Some(bytes) => Poll::Ready(Some(Ok(bytes))),
None => Poll::Ready(None),
}
}
InnerProj::Streaming(body) => body.poll_data(cx).map_err(|e| e.into()),
InnerProj::Dyn(box_body) => box_body.poll_data(cx),
InnerProj::Taken => {
Poll::Ready(Some(Err("A `Taken` body should never be polled".into())))
}
};
// This block is new.
match &polling_result {
// When we get some bytes back from polling, pass those bytes to each callback in turn
Poll::Ready(Some(Ok(bytes))) => {
for callback in this.callbacks.iter_mut() {
// Callbacks can run into errors when reading bytes. They'll be surfaced here
if let Err(e) = callback.update(bytes) {
    return Poll::Ready(Some(Err(e)));
}
}
}
// When we're done polling for bytes, run each callback's `trailers()` method. If any calls to
// `trailers()` return an error, propagate that error up. Otherwise, continue.
Poll::Ready(None) => {
for callback_result in this.callbacks.iter().map(BodyCallback::trailers) {
if let Err(e) = callback_result {
return Poll::Ready(Some(Err(e)));
}
}
}
_ => (),
}
// Now that we've inspected the polling result, all that's left to do is to return it.
polling_result
}
// This function now has the added responsibility of cloning callback functions (but with fresh state)
// in the case that the `SdkBody` needs to be rebuilt.
pub fn try_clone(&self) -> Option<Self> {
self.rebuild.as_ref().map(|rebuild| {
let next = rebuild();
let callbacks = self
.callbacks
.iter()
.map(BodyCallback::make_new)
.collect();
Self {
inner: next,
rebuild: self.rebuild.clone(),
callbacks,
}
})
}
pub fn with_body_callback(&mut self, callback: Box<dyn BodyCallback>) -> &mut Self {
self.callbacks.push(callback);
self
}
}
/// Given two [`HeaderMap`][HeaderMap]s, merge them together and return the merged `HeaderMap`. If the
/// two `HeaderMap`s share any keys, values from the right `HeaderMap` will be appended to the left `HeaderMap`.
///
/// # Example
///
/// ```rust
/// let header_name = HeaderName::from_static("some_key");
///
/// let mut left_hand_side_headers = HeaderMap::new();
/// left_hand_side_headers.insert(
/// header_name.clone(),
/// HeaderValue::from_str("lhs value").unwrap(),
/// );
///
/// let mut right_hand_side_headers = HeaderMap::new();
/// right_hand_side_headers.insert(
/// header_name.clone(),
/// HeaderValue::from_str("rhs value").unwrap(),
/// );
///
/// let merged_header_map =
/// append_merge_header_maps(left_hand_side_headers, right_hand_side_headers);
/// let merged_values: Vec<_> = merged_header_map
/// .get_all(header_name.clone())
/// .into_iter()
/// .collect();
///
/// // Will print 'some_key: ["lhs value", "rhs value"]'
/// println!("{}: {:?}", header_name.as_str(), merged_values);
/// ```
fn append_merge_header_maps(
mut lhs: HeaderMap<HeaderValue>,
rhs: HeaderMap<HeaderValue>,
) -> HeaderMap<HeaderValue> {
let mut last_header_name_seen = None;
for (header_name, header_value) in rhs.into_iter() {
// For each yielded item that has None provided for the `HeaderName`,
// then the associated header name is the same as that of the previously
// yielded item. The first yielded item will have `HeaderName` set.
// https://docs.rs/http/latest/http/header/struct.HeaderMap.html#method.into_iter-2
match (&mut last_header_name_seen, header_name) {
(_, Some(header_name)) => {
lhs.append(header_name.clone(), header_value);
last_header_name_seen = Some(header_name);
}
(Some(header_name), None) => {
lhs.append(header_name.clone(), header_value);
}
(None, None) => unreachable!(),
};
}
lhs
}
impl http_body::Body for SdkBody {
// The other methods have been omitted because they haven't changed
fn poll_trailers(
self: Pin<&mut Self>,
_cx: &mut Context<'_>,
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> {
let mut header_map: Option<HeaderMap<HeaderValue>> = None;
for callback in self.callbacks.iter() {
    match callback.trailers() {
        // Merge any `HeaderMap`s returned by the callbacks together, one by one.
        Ok(Some(trailers)) => {
            header_map = Some(match header_map.take() {
                Some(existing) => append_merge_header_maps(existing, trailers),
                None => trailers,
            });
        }
        Ok(None) => {}
        // early return if a callback encountered an error
        Err(e) => return Poll::Ready(Err(e)),
    }
}
Poll::Ready(Ok(header_map))
}
}
Implementing Checksums
What follows is a simplified example of how this API could be used to introduce checksum validation for outgoing request payloads. In this example, the checksum calculation is fallible and no validation takes place. All it does is calculate the checksum of some data and then return that checksum when `trailers` is called. This is fine because it's being used to calculate the checksum of a streaming body for a request.
#[derive(Default)]
struct Crc32cChecksumCallback {
state: Option<u32>,
}
impl BodyCallback for Crc32cChecksumCallback {
    fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
        self.state = match self.state {
            Some(crc) => Some(crc32c_append(crc, bytes)),
            None => Some(crc32c(bytes)),
        };
        Ok(())
    }
    fn trailers(&self) -> Result<Option<HeaderMap<HeaderValue>>, Box<dyn std::error::Error + Send + Sync>> {
        let mut header_map = HeaderMap::new();
        // This checksum name is an Amazon standard and would be a `const` in the real implementation
        let key = HeaderName::from_static("x-amz-checksum-crc32c");
        // If no data was provided to this callback and no CRC was ever calculated, we return zero as the checksum.
        let crc = self.state.unwrap_or_default();
        // Convert the CRC to a string, base 64 encode it, and then convert it into a `HeaderValue`.
        let value = HeaderValue::from_str(&base64::encode(crc.to_string()))
            .expect("base64 will always produce valid header values");
        header_map.insert(key, value);
        Ok(Some(header_map))
    }
    fn make_new(&self) -> Box<dyn BodyCallback> {
        Box::new(Crc32cChecksumCallback::default())
    }
}
NOTE: If Crc32cChecksumCallback
needed to validate a response, then we could modify it to check its internal state against a target checksum value and calling trailers
would produce an error if the values didn't match.
In order to use this in a request, we'd modify codegen for that request's service:
- We'd check if the user had requested validation and also check if they'd pre-calculated a checksum.
- If validation was requested but no pre-calculated checksum was given, we'd create a callback similar to the one above.
- Then, we'd create a new checksum callback and:
  - (if streaming) we'd set the checksum callback on the request body object, as sketched below
  - (if non-streaming) we'd immediately read the body and call `BodyCallback::update` manually. Once all data was read, we'd get the checksum by calling `trailers` and insert that data as a request header.
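For the streaming case, attaching the callback could look something like this sketch (hand-written rather than generated code; it assumes the `Crc32cChecksumCallback` above and the `ByteStream::with_body_callback` method proposed earlier in this RFC):
use aws_smithy_http::byte_stream::ByteStream;

async fn streaming_body_with_checksum(
    path: &std::path::Path,
) -> Result<ByteStream, Box<dyn std::error::Error + Send + Sync>> {
    let mut body = ByteStream::from_path(path).await?;
    // The checksum is updated as the body is polled and emitted as an
    // `x-amz-checksum-crc32c` trailer once the stream is exhausted
    body.with_body_callback(Box::new(Crc32cChecksumCallback::default()));
    Ok(body)
}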
RFC: Fine-grained timeout configuration
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
While it is currently possible for users to implement request timeouts by racing operation send futures against timeout futures, this RFC proposes a more ergonomic solution that would also enable users to set timeouts for things like TLS negotiation and "time to first byte".
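For reference, the racing workaround that is possible today looks roughly like this (a sketch that assumes an already-constructed DynamoDB client and uses `tokio::time::timeout`):
use std::time::Duration;
use tokio::time::timeout;

async fn list_tables_with_deadline(client: &aws_sdk_dynamodb::Client) {
    // Race the operation's send future against a 5-second timer
    match timeout(Duration::from_secs(5), client.list_tables().send()).await {
        Ok(Ok(output)) => println!("tables: {:?}", output.table_names),
        Ok(Err(sdk_err)) => eprintln!("request failed: {}", sdk_err),
        Err(_elapsed) => eprintln!("request timed out"),
    }
}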
Terminology
There's a lot of terminology to define, so I've broken it up into three sections.
General terms
- Smithy Client: An `aws_smithy_client::Client<C, M, R>` struct that is responsible for gluing together the connector, middleware, and retry policy. This is not generated and lives in the `aws-smithy-client` crate.
- Fluent Client: A code-generated `Client<C, M, R>` that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
- AWS Client: A specialized Fluent Client that defaults to using a `DynConnector`, `AwsMiddleware`, and `Standard` retry policy.
- Shared Config: An `aws_types::Config` struct that is responsible for storing shared configuration data that is used across all services. This is not generated and lives in the `aws-types` crate.
- Service-specific Config: A code-generated `Config` that has methods for setting service-specific configuration. Each `Config` is defined in the `config` module of its parent service. For example, the S3-specific config struct is usable from `aws_sdk_s3::config::Config` and re-exported as `aws_sdk_s3::Config`. In this case, "service" refers to an AWS offering like S3.
HTTP stack terms
- Service: A trait defined in the `tower-service` crate. The lowest level of abstraction we deal with when making HTTP requests. Services act directly on data to transform and modify that data. A Service is what eventually turns a request into a response.
- Layer: Layers are a higher-order abstraction over services that is used to compose multiple services together, creating a new service from that combination. Nothing prevents us from manually wrapping services within services, but Layers allow us to do it in a flexible and generic manner. Layers don't directly act on data but instead can wrap an existing service with additional functionality, creating a new service. Layers can be thought of as middleware. NOTE: The use of Layers can produce compiler errors that are difficult to interpret, and defining a layer requires a large amount of boilerplate code.
- Middleware: a term with several meanings:
  - Generically speaking, middleware are similar to Services and Layers in that they modify requests and responses.
  - In the SDK, "Middleware" refers to a layer that can be wrapped around a `DispatchService`. In practice, this means that the resulting `Service` (and the inner service) must meet the bound `T: Service<operation::Request, Response=operation::Response, Error=SendOperationError>`.
    - Note: This doesn't apply to the middlewares we use when generating presigned requests because those don't wrap a `DispatchService`.
  - The most notable example of a Middleware is the AwsMiddleware. Other notable examples include MapRequest, AsyncMapRequest, and ParseResponse.
- DispatchService: The innermost part of a group of nested services. The Service that actually makes an HTTP call on behalf of a request. Responsible for parsing success and error responses.
- Connector: a term with several meanings:
  - DynConnectors (a struct that implements DynConnect) are Services with their specific type erased so that we can do dynamic dispatch.
  - A term from `hyper` for any object that implements the Connect trait. Really just an alias for tower_service::Service. Sometimes referred to as a `Connection`.
- Stage: A form of middleware that's not related to `tower`. These currently function as a way of transforming requests and don't have the ability to transform responses.
- Stack: A higher-order abstraction over Layers defined in the tower crate, e.g. Layers wrap services in one another and Stacks wrap layers within one another.
Timeout terms
- Connect Timeout: A limit on the amount of time after making an initial connect attempt on a socket to complete the connect-handshake.
  - TODO: the runtime is based on Hyper, which reuses connections and doesn't currently have a way of guaranteeing that a fresh connection will be used for a given request.
- TLS Negotiation Timeout: A limit on the amount of time a TLS handshake takes from when the CLIENT HELLO message is sent to the time the client and server have fully negotiated ciphers and exchanged keys.
- Time to First Byte Timeout: Sometimes referred to as a "read timeout." A limit on the amount of time an application takes to attempt to read the first byte over an established, open connection after a write request.
- HTTP Request Timeout For A Single Attempt: A limit on the amount of time between when the first byte is sent over an established, open connection and when the last byte is received from the service.
- HTTP Request Timeout For Multiple Attempts: This timeout acts like the previous timeout but constrains the total time it takes to make a request plus any retries.
  - NOTE: In a way, this is already possible in that users are free to race requests against timer futures with the `futures::future::select` macro or to use `tokio::time::timeout`. See relevant discussion in hyper#1097.
Configuring timeouts
Just like with Retry Behavior Configuration, these settings can be configured in several places and have the same precedence rules (paraphrased here for clarity).
- Service-specific config builders
- Shared config builders
- Environment variables
- Profile config file (e.g.,
~/.aws/credentials
)
The above list is in order of decreasing precedence e.g. configuration set in an app will override values from environment variables.
Configuration options
The table below details the specific ways each timeout can be configured. In all cases, valid values are non-negative floats representing the number of seconds before a timeout is triggered.
Timeout | Environment Variable | AWS Config Variable | Builder Method |
---|---|---|---|
Connect | AWS_CONNECT_TIMEOUT | connect_timeout | connect_timeout |
TLS Negotiation | AWS_TLS_NEGOTIATION_TIMEOUT | tls_negotiation_timeout | tls_negotiation_timeout |
Time To First Byte | AWS_READ_TIMEOUT | read_timeout | read_timeout |
HTTP Request - single attempt | AWS_API_CALL_ATTEMPT_TIMEOUT | api_call_attempt_timeout | api_call_attempt_timeout |
HTTP Request - all attempts | AWS_API_CALL_TIMEOUT | api_call_timeout | api_call_timeout |
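Since the environment variables in the table hold non-negative floats measured in seconds, reading one of them could be as simple as this sketch (the parsing rules here are this RFC's proposal, not an existing API):
use std::time::Duration;

/// Read a timeout such as `AWS_API_CALL_TIMEOUT` from the environment.
/// Returns `None` if the variable is unset or not a valid non-negative float.
fn timeout_from_env(var: &str) -> Option<Duration> {
    let secs: f64 = std::env::var(var).ok()?.parse().ok()?;
    if secs.is_finite() && secs >= 0.0 {
        Some(Duration::from_secs_f64(secs))
    } else {
        None
    }
}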
SDK-specific defaults set by AWS service teams
QUESTION: How does the SDK currently handle these defaults?
Prior Art
- hjr3/hyper-timeout is a `Connector` for hyper that enables setting connect, read, and write timeouts
- sfackler/tokio-io-timeout provides timeouts for tokio IO operations. Used within `hyper-timeout`.
- `tokio::time::sleep_until` creates a `Future` that completes after some time has elapsed. Used within `tokio-io-timeout`.
Behind the scenes
Timeouts are achieved by racing a future against a tokio::time::Sleep
future. The question, then, is "how can I create a future that represents a condition I want to watch for?". For example, in the case of a ConnectTimeout
, how do we watch an ongoing request to see if it's completed the connect-handshake? Our current stack of Middleware acts on requests at different levels of granularity. The timeout Middlewares will be no different.
Middlewares for AWS Client requests
View AwsMiddleware in GitHub
#[derive(Debug, Default)]
#[non_exhaustive]
pub struct AwsMiddleware;
impl<S> tower::Layer<S> for AwsMiddleware {
type Service = <AwsMiddlewareStack as tower::Layer<S>>::Service;
fn layer(&self, inner: S) -> Self::Service {
let credential_provider = AsyncMapRequestLayer::for_mapper(CredentialsStage::new());
let signer = MapRequestLayer::for_mapper(SigV4SigningStage::new(SigV4Signer::new()));
let endpoint_resolver = MapRequestLayer::for_mapper(AwsAuthStage);
let user_agent = MapRequestLayer::for_mapper(UserAgentStage::new());
ServiceBuilder::new()
.layer(endpoint_resolver)
.layer(user_agent)
.layer(credential_provider)
.layer(signer)
.service(inner)
}
}
The above code is only included for context. This RFC doesn't define any timeouts specific to AWS so AwsMiddleware
won't require any changes.
Middlewares for Smithy Client requests
View aws_smithy_client::Client::call_raw in GitHub
impl<C, M, R> Client<C, M, R>
where
C: bounds::SmithyConnector,
M: bounds::SmithyMiddleware<C>,
R: retry::NewRequestPolicy,
{
// ...other methods omitted
pub async fn call_raw<O, T, E, Retry>(
&self,
input: Operation<O, Retry>,
) -> Result<SdkSuccess<T>, SdkError<E>>
where
R::Policy: bounds::SmithyRetryPolicy<O, T, E, Retry>,
bounds::Parsed<<M as bounds::SmithyMiddleware<C>>::Service, O, Retry>:
Service<Operation<O, Retry>, Response=SdkSuccess<T>, Error=SdkError<E>> + Clone,
{
let connector = self.connector.clone();
let mut svc = ServiceBuilder::new()
// Create a new request-scoped policy
.retry(self.retry_policy.new_request_policy())
.layer(ParseResponseLayer::<O, Retry>::new())
// These layers can be considered as occurring in order. That is, first invoke the
// customer-provided middleware, then dispatch over the wire.
.layer(&self.middleware)
.layer(DispatchLayer::new())
.service(connector);
svc.ready().await?.call(input).await
}
}
The Smithy Client creates a new `Stack` of services to handle each request it sends. Specifically:
- A method `retry` is used to set the retry handler. The configuration for this was set during creation of the `Client`.
- `ParseResponseLayer` inserts a service for transforming responses into operation-specific outputs or errors. The `O` generic parameter of `input` is what decides exactly how the transformation is implemented.
- A middleware stack that was included during `Client` creation is inserted into the stack. In the case of the AWS SDK, this would be `AwsMiddleware`.
- `DispatchLayer` inserts a service for transforming an `http::Request` into an `operation::Request`. It's also responsible for re-attaching the property bag from the Operation that triggered the request.
- The innermost `Service` is a `DynConnector` wrapping a `hyper` client (which one depends on which TLS implementation was enabled by cargo features).
The HTTP Request Timeout For A Single Attempt and HTTP Request Timeout For Multiple Attempts can be implemented at this level. The same `Layer` can be used to create both `TimeoutService`s. The `TimeoutLayer` would require two inputs:
- `sleep_fn`: A runtime-specific implementation of `sleep`. The SDK is currently `tokio`-based and would default to `tokio::time::sleep` (this default is set in the `aws_smithy_async::rt::sleep` module).
- The duration of the timeout as a `std::time::Duration`

The resulting code would look like this:
impl<C, M, R> Client<C, M, R>
where
C: bounds::SmithyConnector,
M: bounds::SmithyMiddleware<C>,
R: retry::NewRequestPolicy,
{
// ...other methods omitted
pub async fn call_raw<O, T, E, Retry>(
&self,
input: Operation<O, Retry>,
) -> Result<SdkSuccess<T>, SdkError<E>>
where
R::Policy: bounds::SmithyRetryPolicy<O, T, E, Retry>,
bounds::Parsed<<M as bounds::SmithyMiddleware<C>>::Service, O, Retry>:
Service<Operation<O, Retry>, Response=SdkSuccess<T>, Error=SdkError<E>> + Clone,
{
let connector = self.connector.clone();
let sleep_fn = aws_smithy_async::rt::sleep::default_async_sleep();
let mut svc = ServiceBuilder::new()
.layer(TimeoutLayer::new(
sleep_fn.clone(),
self.timeout_config.api_call_timeout(),
))
// Create a new request-scoped policy
.retry(self.retry_policy.new_request_policy())
.layer(TimeoutLayer::new(
sleep_fn,
self.timeout_config.api_call_attempt_timeout(),
))
.layer(ParseResponseLayer::<O, Retry>::new())
// These layers can be considered as occurring in order. That is, first invoke the
// customer-provided middleware, then dispatch over the wire.
.layer(&self.middleware)
.layer(DispatchLayer::new())
.service(connector);
svc.ready().await?.call(input).await
}
}
Note: Our HTTP client supports multiple TLS implementations. We'll likely have to implement this feature once per library.
Timeouts will be implemented in the following places:
- HTTP request timeout for multiple requests will be implemented as the outermost Layer in
Client::call_raw
. - HTTP request timeout for a single request will be implemented within
RetryHandler::retry
. - Time to first byte, TLS negotiation, and connect timeouts will be implemented within the central
hyper
connector.
Changes checklist
Changes are broken into two sections:
- HTTP requests (single or multiple) are implementable as layers within our current stack
- Other timeouts will require changes to our dependencies and may be slower to implement
Implementing HTTP request timeouts
- Add `TimeoutConfig` to `smithy-types`
- Add `TimeoutConfigProvider` to `aws-config`
  - Add provider that fetches config from environment variables
  - Add provider that fetches config from profile
- Add `timeout` method to `aws_types::Config` for setting timeout configuration
- Add `timeout` method to generated `Config`s too
- Create a generic `TimeoutService` and accompanying `Layer`
  - `TimeoutLayer` should accept a `sleep` function so that it doesn't have a hard dependency on `tokio`
- Insert a `TimeoutLayer` before the `RetryPolicy` to handle timeouts for multiple-attempt requests
- Insert a `TimeoutLayer` after the `RetryPolicy` to handle timeouts for single-attempt requests
- Add tests for timeout behavior
  - Test multi-request timeout triggers after 3 slow retries
  - Test single-request timeout triggers correctly
  - Test single-request timeout doesn't trigger if request completes in time
RFC: How Cargo "features" should be used in the SDK and runtime crates
Status: Accepted
Some background on features
What is a feature? Here's a definition from the Cargo Book section on features:
Cargo "features" provide a mechanism to express conditional compilation and optional dependencies. A package defines a set of named features in the
[features]
table ofCargo.toml
, and each feature can either be enabled or disabled. Features for the package being built can be enabled on the command-line with flags such as--features
. Features for dependencies can be enabled in the dependency declaration inCargo.toml
.
We use features in a majority of our runtime crates and in all of our SDK crates. For example, aws-sigv4 uses them to enable event streams. Another common use case is exhibited by aws-sdk-s3 which uses them to enable the tokio
runtime and the TLS implementation used when making requests.
Features should be additive
The Cargo book has this to say:
When a dependency is used by multiple packages, Cargo will use the union of all features enabled on that dependency when building it. This helps ensure that only a single copy of the dependency is used.
A consequence of this is that features should be additive. That is, enabling a feature should not disable functionality, and it should usually be safe to enable any combination of features. A feature should not introduce a SemVer-incompatible change.
What does this mean for the SDK?
Despite the constraints outlined above, we should use features in the SDKs because of the benefits they bring:
- Features enable users to avoid compiling code that they won't be using. Additionally, features allow both general and specific control of compiled code, serving the needs of both novice and expert users.
- A single feature in a crate can activate or deactivate multiple features exposed by that crate's dependencies, freeing the user from having to specifically activate or deactivate them.
- Features can help users understand what a crate is capable of in the same way that looking at a graph of a crate's modules can.
When using features, we should adhere to the guidelines outlined below.
Avoid writing code that relies on only activating one feature from a set of mutually exclusive features.
As noted earlier in an excerpt from the Cargo book:
enabling a feature should not disable functionality, and it should usually be safe to enable any combination of features. A feature should not introduce a SemVer-incompatible change.
#[cfg(feature = "rustls")]
impl<M, R> ClientBuilder<(), M, R> {
    /// Connect to the service over HTTPS using Rustls.
    pub fn tls_adapter(self) -> ClientBuilder<Adapter<crate::conns::Https>, M, R> {
        self.connector(Adapter::builder().build(crate::conns::https()))
    }
}

#[cfg(feature = "native-tls")]
impl<M, R> ClientBuilder<(), M, R> {
    /// Connect to the service over HTTPS using the native TLS library on your platform.
    pub fn tls_adapter(
        self,
    ) -> ClientBuilder<Adapter<hyper_tls::HttpsConnector<hyper::client::HttpConnector>>, M, R> {
        self.connector(Adapter::builder().build(crate::conns::native_tls()))
    }
}
When the example code above is compiled with both features enabled, compilation will fail with a "duplicate definitions with name tls_adapter
" error. Also, note that the return type of the function differs between the two versions. This is a SemVer-incompatible change.
Here's an updated version of the example that fixes these issues:
#[cfg(feature = "rustls")]
impl<M, R> ClientBuilder<(), M, R> {
    /// Connect to the service over HTTPS using Rustls.
    pub fn rustls(self) -> ClientBuilder<Adapter<crate::conns::Https>, M, R> {
        self.connector(Adapter::builder().build(crate::conns::https()))
    }
}

#[cfg(feature = "native-tls")]
impl<M, R> ClientBuilder<(), M, R> {
    /// Connect to the service over HTTPS using the native TLS library on your platform.
    pub fn native_tls(
        self,
    ) -> ClientBuilder<Adapter<hyper_tls::HttpsConnector<hyper::client::HttpConnector>>, M, R> {
        self.connector(Adapter::builder().build(crate::conns::native_tls()))
    }
}
Both features can now be enabled at once without creating a conflict. Since both methods have different names, it's now Ok for them to have different return types.
This is real code, see it in context
We should avoid using #[cfg(not(feature = "some-feature"))]
At the risk of seeming repetitive, the Cargo book says:
enabling a feature should not disable functionality, and it should usually be safe to enable any combination of features
Conditionally compiling code when a feature is not activated can make it hard for users and maintainers to reason about what will happen when they activate a feature. This is also a sign that a feature may not be "additive".
NOTE: It's ok to use #[cfg(not())]
to conditionally compile code based on a user's OS. It's also useful when controlling what code gets rendered when testing or when generating docs.
One case where using not
is acceptable is when providing a fallback when no features are set:
#[cfg(feature = "rt-tokio")]
pub fn default_async_sleep() -> Option<Arc<dyn AsyncSleep>> {
Some(sleep_tokio())
}
#[cfg(not(feature = "rt-tokio"))]
pub fn default_async_sleep() -> Option<Arc<dyn AsyncSleep>> {
None
}
Don't default to defining "default features"
Because Cargo will use the union of all features enabled on a dependency when building it, we should be wary of marking features as default. Once we do mark features as default, users that want to exclude code and dependencies brought in by those features will have a difficult time doing so. One need look no further than this issue submitted by a user that wanted to use Native TLS and struggled to make sure that Rustls was actually disabled (This issue was resolved in this PR which removed default features from our runtime crates.) This is not to say that we should never use them, as having defaults for the most common use cases means less work for those users.
When a default feature providing some functionality is disabled, active features must not automatically replace that functionality
As the SDK is currently designed, the TLS implementation in use can change depending on what features are pulled in. Currently, if a user disables default-features
(which include rustls
) and activates the native-tls
feature, then we automatically use native-tls
when making requests. For an example of what this looks like from the user's perspective, see this example.
This RFC proposes that we should have a single default for any configurable functionality, and that the functionality depends on a corresponding default feature being active. If `default-features` are disabled, then so is the corresponding default functionality. In its place would be functionality that fails fast with a message describing why it failed (a default was deactivated but the user didn't set a replacement) and what the user should do to fix it (with links to documentation and examples where necessary). We should use compile-time errors to communicate failures with users, or `panic`s for cases that can't be evaluated at compile time.
For example: say you have a crate with features `a`, `b`, and `c` that all provide some version of functionality `foo`, and feature `a` is part of `default-features`. When `no-default-features = true` but features `b` and `c` are active, don't automatically fall back to `b` or `c`. Instead, emit an error with a message like this:

"When default features are disabled, you must manually set `foo`. Features `b` and `c` are active; you can use one of those. See an example of setting a custom `foo` here: link-to-docs.amazon.com/setting-foo"
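As a rough illustration (hypothetical crate and feature names, not SDK code), the fail-fast stub left behind when the default feature is disabled might look like this:

// Hypothetical sketch: feature `a` is the default provider of `foo`; features `b` and `c`
// are alternatives that must be wired up explicitly rather than used as silent fallbacks.
#[cfg(feature = "a")]
pub fn default_foo() -> &'static str {
    "foo provided by default feature `a`"
}

// When the default feature is disabled, fail fast with an actionable message instead of
// silently falling back to `b` or `c`, even if those features happen to be enabled.
#[cfg(not(feature = "a"))]
pub fn default_foo() -> &'static str {
    panic!(
        "When default features are disabled, you must manually set `foo`. \
         See link-to-docs.amazon.com/setting-foo for an example."
    )
}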
Further reading
- How to tell what "features" are available per crate?
- How do I 'pass down' feature flags to subdependencies in Cargo?
- A small selection of feature-related GitHub issues submitted for popular crates
RFC: Supporting Flexible Checksums
Status: Implemented
We can't currently update the S3 SDK because we don't support the new "Flexible Checksums" feature. This RFC describes this new feature and details how we should implement it in `smithy-rs`.
What is the "Flexible Checksums" feature?
S3 has previously supported MD5 checksum validation of data. Now, it supports more checksum algorithms like CRC32, CRC32C, SHA-1, and SHA-256. This validation is available when putting objects to S3 and when getting them from S3. For more information, see this AWS News Blog post.
Implementing Checksums
Checksum callbacks were introduced as a result of the acceptance of RFC0013, and this RFC proposes a refactor to those callbacks, as well as several new wrappers for `SdkBody` that will provide new functionality.
Refactoring aws-smithy-checksums
TL;DR: This refactor of aws-smithy-checksums:
- Removes the "callback" terminology: As a word, "callback" doesn't carry any useful information and doesn't aid in understanding.
- Removes support for the `BodyCallback` API: Instead of adding checksum callbacks to a body, we're going to use "body wrapping" instead. "Body wrapping" is demonstrated in the `ChecksumBody`, `AwsChunkedBody`, and `ChecksumValidatedBody` sections. NOTE: This doesn't remove the `BodyCallback` trait. That will still exist; we just won't use it.
- Updates terminology to focus on "headers" instead of "trailers": Because the types we deal with in this module are named for HTTP headers, I chose to use that terminology instead. My hope is that this will be less strange to people reading this code.
- Adds `fn checksum_algorithm_to_checksum_header_name`: a function that's used in generated code to set a checksum request header.
- Adds `fn checksum_header_name_to_checksum_algorithm`: a function that's used in generated code when creating a checksum-validating response body.
- Adds new checksum-related "body wrapping" HTTP body types: These are defined in the `body` module and are shown later in this RFC.
// In aws-smithy-checksums/src/lib.rs
//! Checksum calculation and verification callbacks
use aws_smithy_types::base64;
use bytes::Bytes;
use http::header::{HeaderMap, HeaderName, HeaderValue};
use sha1::Digest;
use std::io::Write;
pub mod body;
// Valid checksum algorithm names
pub const CRC_32_NAME: &str = "crc32";
pub const CRC_32_C_NAME: &str = "crc32c";
pub const SHA_1_NAME: &str = "sha1";
pub const SHA_256_NAME: &str = "sha256";
pub const CRC_32_HEADER_NAME: HeaderName = HeaderName::from_static("x-amz-checksum-crc32");
pub const CRC_32_C_HEADER_NAME: HeaderName = HeaderName::from_static("x-amz-checksum-crc32c");
pub const SHA_1_HEADER_NAME: HeaderName = HeaderName::from_static("x-amz-checksum-sha1");
pub const SHA_256_HEADER_NAME: HeaderName = HeaderName::from_static("x-amz-checksum-sha256");
// Preserved for compatibility purposes. This should never be used by users, only within smithy-rs
const MD5_NAME: &str = "md5";
const MD5_HEADER_NAME: HeaderName = HeaderName::from_static("content-md5");
/// Given a `&str` representing a checksum algorithm, return the corresponding `HeaderName`
/// for that checksum algorithm.
pub fn checksum_algorithm_to_checksum_header_name(checksum_algorithm: &str) -> HeaderName {
if checksum_algorithm.eq_ignore_ascii_case(CRC_32_NAME) {
CRC_32_HEADER_NAME
} else if checksum_algorithm.eq_ignore_ascii_case(CRC_32_C_NAME) {
CRC_32_C_HEADER_NAME
} else if checksum_algorithm.eq_ignore_ascii_case(SHA_1_NAME) {
SHA_1_HEADER_NAME
} else if checksum_algorithm.eq_ignore_ascii_case(SHA_256_NAME) {
SHA_256_HEADER_NAME
} else if checksum_algorithm.eq_ignore_ascii_case(MD5_NAME) {
MD5_HEADER_NAME
} else {
// TODO what's the best way to handle this case?
HeaderName::from_static("x-amz-checksum-unknown")
}
}
/// Given a `HeaderName` representing a checksum algorithm, return the name of that algorithm
/// as a `&'static str`.
pub fn checksum_header_name_to_checksum_algorithm(
checksum_header_name: &HeaderName,
) -> &'static str {
if checksum_header_name == CRC_32_HEADER_NAME {
CRC_32_NAME
} else if checksum_header_name == CRC_32_C_HEADER_NAME {
CRC_32_C_NAME
} else if checksum_header_name == SHA_1_HEADER_NAME {
SHA_1_NAME
} else if checksum_header_name == SHA_256_HEADER_NAME {
SHA_256_NAME
} else if checksum_header_name == MD5_HEADER_NAME {
MD5_NAME
} else {
// TODO what's the best way to handle this case?
"unknown-checksum-algorithm"
}
}
/// When a response has to be checksum-verified, we have to check possible headers until we find the
/// header with the precalculated checksum. Because a service may send back multiple headers, we have
/// to check them in order based on how fast each checksum is to calculate.
pub const CHECKSUM_HEADERS_IN_PRIORITY_ORDER: [HeaderName; 4] = [
CRC_32_C_HEADER_NAME,
CRC_32_HEADER_NAME,
SHA_1_HEADER_NAME,
SHA_256_HEADER_NAME,
];
type BoxError = Box<dyn std::error::Error + Send + Sync>;
/// Checksum algorithms are used to validate the integrity of data. Structs that implement this trait
/// can be used as checksum calculators. This trait requires Send + Sync because these checksums are
/// often used in a threaded context.
pub trait Checksum: Send + Sync {
/// Given a slice of bytes, update this checksum's internal state.
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError>;
/// Either return this checksum as a `HeaderMap` containing one HTTP header, or return an error
/// describing why checksum calculation failed.
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError>;
/// Return the `HeaderName` used to represent this checksum algorithm
fn header_name(&self) -> HeaderName;
/// "Finalize" this checksum, returning the calculated value as `Bytes` or an error that
/// occurred during checksum calculation. To print this value in a human-readable hexadecimal
/// format, you can print it using Rust's builtin [formatter].
///
/// _**NOTE:** typically, "finalizing" a checksum in Rust will take ownership of the checksum
/// struct. In this method, we clone the checksum's state before finalizing because checksums
/// may be used in a situation where taking ownership is not possible._
///
/// [formatter]: https://doc.rust-lang.org/std/fmt/trait.UpperHex.html
fn finalize(&self) -> Result<Bytes, BoxError>;
/// Return the size of this checksum algorithm's resulting checksum, in bytes. For example, the
/// CRC32 checksum algorithm calculates a 32 bit checksum, so a CRC32 checksum struct
/// implementing this trait method would return 4.
fn size(&self) -> u64;
}
/// Create a new `Box<dyn Checksum>` from an algorithm name. Valid algorithm names are defined as
/// `const`s in this module.
pub fn new_checksum(checksum_algorithm: &str) -> Box<dyn Checksum> {
if checksum_algorithm.eq_ignore_ascii_case(CRC_32_NAME) {
Box::new(Crc32::default())
} else if checksum_algorithm.eq_ignore_ascii_case(CRC_32_C_NAME) {
Box::new(Crc32c::default())
} else if checksum_algorithm.eq_ignore_ascii_case(SHA_1_NAME) {
Box::new(Sha1::default())
} else if checksum_algorithm.eq_ignore_ascii_case(SHA_256_NAME) {
Box::new(Sha256::default())
} else if checksum_algorithm.eq_ignore_ascii_case(MD5_NAME) {
// It's possible to create an MD5 and we do this in some situations for compatibility.
// We deliberately hide this from users so that they don't go using it.
Box::new(Md5::default())
} else {
panic!("unsupported checksum algorithm '{}'", checksum_algorithm)
}
}
#[derive(Debug, Default)]
struct Crc32 {
hasher: crc32fast::Hasher,
}
impl Crc32 {
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
self.hasher.update(bytes);
Ok(())
}
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> {
let mut header_map = HeaderMap::new();
header_map.insert(Self::header_name(), self.header_value());
Ok(Some(header_map))
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Ok(Bytes::copy_from_slice(
&self.hasher.clone().finalize().to_be_bytes(),
))
}
// Size of the checksum in bytes
fn size() -> u64 {
4
}
fn header_name() -> HeaderName {
CRC_32_HEADER_NAME
}
fn header_value(&self) -> HeaderValue {
// We clone the hasher because `Hasher::finalize` consumes `self`
let hash = self.hasher.clone().finalize();
HeaderValue::from_str(&base64::encode(u32::to_be_bytes(hash)))
.expect("will always produce a valid header value from a CRC32 checksum")
}
}
impl Checksum for Crc32 {
fn update(
&mut self,
bytes: &[u8],
) -> Result<(), Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::update(self, bytes)
}
fn headers(
&self,
) -> Result<Option<HeaderMap>, Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::headers(self)
}
fn header_name(&self) -> HeaderName {
Self::header_name()
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Self::finalize(self)
}
fn size(&self) -> u64 {
Self::size()
}
}
#[derive(Debug, Default)]
struct Crc32c {
state: Option<u32>,
}
impl Crc32c {
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
self.state = match self.state {
Some(crc) => Some(crc32c::crc32c_append(crc, bytes)),
None => Some(crc32c::crc32c(bytes)),
};
Ok(())
}
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> {
let mut header_map = HeaderMap::new();
header_map.insert(Self::header_name(), self.header_value());
Ok(Some(header_map))
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Ok(Bytes::copy_from_slice(
&self.state.unwrap_or_default().to_be_bytes(),
))
}
// Size of the checksum in bytes
fn size() -> u64 {
4
}
fn header_name() -> HeaderName {
CRC_32_C_HEADER_NAME
}
fn header_value(&self) -> HeaderValue {
// If no data was provided to this callback and no CRC was ever calculated, return zero as the checksum.
let hash = self.state.unwrap_or_default();
HeaderValue::from_str(&base64::encode(u32::to_be_bytes(hash)))
.expect("will always produce a valid header value from a CRC32C checksum")
}
}
impl Checksum for Crc32c {
fn update(
&mut self,
bytes: &[u8],
) -> Result<(), Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::update(self, bytes)
}
fn headers(
&self,
) -> Result<Option<HeaderMap>, Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::headers(self)
}
fn header_name(&self) -> HeaderName {
Self::header_name()
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Self::finalize(self)
}
fn size(&self) -> u64 {
Self::size()
}
}
#[derive(Debug, Default)]
struct Sha1 {
hasher: sha1::Sha1,
}
impl Sha1 {
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
self.hasher.write_all(bytes)?;
Ok(())
}
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> {
let mut header_map = HeaderMap::new();
header_map.insert(Self::header_name(), self.header_value());
Ok(Some(header_map))
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Ok(Bytes::copy_from_slice(
self.hasher.clone().finalize().as_slice(),
))
}
// Size of the checksum in bytes
fn size() -> u64 {
20
}
fn header_name() -> HeaderName {
SHA_1_HEADER_NAME
}
fn header_value(&self) -> HeaderValue {
// We clone the hasher because `Hasher::finalize` consumes `self`
let hash = self.hasher.clone().finalize();
HeaderValue::from_str(&base64::encode(&hash[..]))
.expect("will always produce a valid header value from a SHA-1 checksum")
}
}
impl Checksum for Sha1 {
fn update(
&mut self,
bytes: &[u8],
) -> Result<(), Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::update(self, bytes)
}
fn headers(
&self,
) -> Result<Option<HeaderMap>, Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::headers(self)
}
fn header_name(&self) -> HeaderName {
Self::header_name()
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Self::finalize(self)
}
fn size(&self) -> u64 {
Self::size()
}
}
#[derive(Debug, Default)]
struct Sha256 {
hasher: sha2::Sha256,
}
impl Sha256 {
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
self.hasher.write_all(bytes)?;
Ok(())
}
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> {
let mut header_map = HeaderMap::new();
header_map.insert(Self::header_name(), self.header_value());
Ok(Some(header_map))
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Ok(Bytes::copy_from_slice(
self.hasher.clone().finalize().as_slice(),
))
}
// Size of the checksum in bytes
fn size() -> u64 {
32
}
fn header_name() -> HeaderName {
SHA_256_HEADER_NAME
}
fn header_value(&self) -> HeaderValue {
// We clone the hasher because `Hasher::finalize` consumes `self`
let hash = self.hasher.clone().finalize();
HeaderValue::from_str(&base64::encode(&hash[..]))
.expect("will always produce a valid header value from a SHA-256 checksum")
}
}
impl Checksum for Sha256 {
fn update(
&mut self,
bytes: &[u8],
) -> Result<(), Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::update(self, bytes)
}
fn headers(
&self,
) -> Result<Option<HeaderMap>, Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::headers(self)
}
fn header_name(&self) -> HeaderName {
Self::header_name()
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Self::finalize(self)
}
fn size(&self) -> u64 {
Self::size()
}
}
#[derive(Debug, Default)]
struct Md5 {
hasher: md5::Md5,
}
impl Md5 {
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError> {
self.hasher.write_all(bytes)?;
Ok(())
}
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError> {
let mut header_map = HeaderMap::new();
header_map.insert(Self::header_name(), self.header_value());
Ok(Some(header_map))
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Ok(Bytes::copy_from_slice(
self.hasher.clone().finalize().as_slice(),
))
}
// Size of the checksum in bytes
fn size() -> u64 {
16
}
fn header_name() -> HeaderName {
MD5_HEADER_NAME
}
fn header_value(&self) -> HeaderValue {
// We clone the hasher because `Hasher::finalize` consumes `self`
let hash = self.hasher.clone().finalize();
HeaderValue::from_str(&base64::encode(&hash[..]))
.expect("will always produce a valid header value from an MD5 checksum")
}
}
impl Checksum for Md5 {
fn update(
&mut self,
bytes: &[u8],
) -> Result<(), Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::update(self, bytes)
}
fn headers(
&self,
) -> Result<Option<HeaderMap>, Box<(dyn std::error::Error + Send + Sync + 'static)>> {
Self::headers(self)
}
fn header_name(&self) -> HeaderName {
Self::header_name()
}
fn finalize(&self) -> Result<Bytes, BoxError> {
Self::finalize(self)
}
fn size(&self) -> u64 {
Self::size()
}
}
// We have existing tests for the checksums; those don't require an update
ChecksumBody
When creating a checksum-validated request with an in-memory request body, we can read the body, calculate a checksum, and insert the checksum header, all before sending the request. When creating a checksum-validated request with a streaming request body, we don't have that luxury. Instead, we must calculate a checksum while sending the body, and append that checksum as a trailer.
We will accomplish this by wrapping the `SdkBody` that requires validation within a `ChecksumBody`. Afterward, we'll need to wrap the `ChecksumBody` in yet another layer, which we'll discuss in the AwsChunkedBody and AwsChunkedBodyOptions section.
// In aws-smithy-checksums/src/body.rs
use crate::{new_checksum, Checksum};
use aws_smithy_http::body::SdkBody;
use aws_smithy_http::header::append_merge_header_maps;
use aws_smithy_types::base64;
use bytes::{Buf, Bytes};
use http::header::HeaderName;
use http::{HeaderMap, HeaderValue};
use http_body::{Body, SizeHint};
use pin_project::pin_project;
use std::fmt::Display;
use std::pin::Pin;
use std::task::{Context, Poll};
/// A `ChecksumBody` will read and calculate a request body as it's being sent. Once the body has
/// been completely read, it'll append a trailer with the calculated checksum.
#[pin_project]
pub struct ChecksumBody<InnerBody> {
#[pin]
inner: InnerBody,
#[pin]
checksum: Box<dyn Checksum>,
}
impl ChecksumBody<SdkBody> {
/// Given an `SdkBody` and the name of a checksum algorithm as a `&str`, create a new
/// `ChecksumBody<SdkBody>`. Valid checksum algorithm names are defined in this crate's
/// [root module](super).
///
/// # Panics
///
/// This will panic if the given checksum algorithm is not supported.
pub fn new(body: SdkBody, checksum_algorithm: &str) -> Self {
Self {
checksum: new_checksum(checksum_algorithm),
inner: body,
}
}
/// Return the name of the trailer that will be emitted by this `ChecksumBody`
pub fn trailer_name(&self) -> HeaderName {
self.checksum.header_name()
}
/// Calculate and return the sum of the:
/// - checksum when base64 encoded
/// - trailer name
/// - trailer separator
///
/// This is necessary for calculating the true size of the request body for certain
/// content-encodings.
pub fn trailer_length(&self) -> u64 {
let trailer_name_size_in_bytes = self.checksum.header_name().as_str().len() as u64;
let base64_encoded_checksum_size_in_bytes = base64::encoded_length(self.checksum.size());
(trailer_name_size_in_bytes
// HTTP trailer names and values may be separated by either a single colon or a single
// colon and a whitespace. In the AWS Rust SDK, we use a single colon.
+ ":".len() as u64
+ base64_encoded_checksum_size_in_bytes)
}
fn poll_inner(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Bytes, aws_smithy_http::body::Error>>> {
let this = self.project();
let inner = this.inner;
let mut checksum = this.checksum;
match inner.poll_data(cx) {
Poll::Ready(Some(Ok(mut data))) => {
let len = data.chunk().len();
let bytes = data.copy_to_bytes(len);
if let Err(e) = checksum.update(&bytes) {
return Poll::Ready(Some(Err(e)));
}
Poll::Ready(Some(Ok(bytes)))
}
Poll::Ready(None) => Poll::Ready(None),
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
}
}
}
impl http_body::Body for ChecksumBody<SdkBody> {
type Data = Bytes;
type Error = aws_smithy_http::body::Error;
fn poll_data(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Self::Data, Self::Error>>> {
self.poll_inner(cx)
}
fn poll_trailers(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> {
let this = self.project();
match (
this.checksum.headers(),
http_body::Body::poll_trailers(this.inner, cx),
) {
// If everything is ready, return trailers, merging them if we have more than one map
(Ok(outer_trailers), Poll::Ready(Ok(inner_trailers))) => {
let trailers = match (outer_trailers, inner_trailers) {
// Values from the inner trailer map take precedent over values from the outer map
(Some(outer), Some(inner)) => Some(append_merge_header_maps(inner, outer)),
// If only one or neither produced trailers, just combine the `Option`s with `or`
(outer, inner) => outer.or(inner),
};
Poll::Ready(Ok(trailers))
}
// If the inner poll is Ok but the outer body's checksum callback encountered an error,
// return the error
(Err(e), Poll::Ready(Ok(_))) => Poll::Ready(Err(e)),
// Otherwise return the result of the inner poll.
// It may be pending or it may be ready with an error.
(_, inner_poll) => inner_poll,
}
}
fn is_end_stream(&self) -> bool {
self.inner.is_end_stream()
}
fn size_hint(&self) -> SizeHint {
let body_size_hint = self.inner.size_hint();
match body_size_hint.exact() {
Some(size) => {
let checksum_size_hint = self.checksum.size();
SizeHint::with_exact(size + checksum_size_hint)
}
// TODO is this the right behavior?
None => {
let checksum_size_hint = self.checksum.size();
let mut summed_size_hint = SizeHint::new();
summed_size_hint.set_lower(body_size_hint.lower() + checksum_size_hint);
if let Some(body_size_hint_upper) = body_size_hint.upper() {
summed_size_hint.set_upper(body_size_hint_upper + checksum_size_hint);
}
summed_size_hint
}
}
}
}
// The tests I have written are omitted from this RFC for brevity. The request body checksum calculation and trailer size calculations are all tested.
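To make the intended usage concrete, here is a small sketch (not part of the RFC's code) built on the API above; the algorithm name and the trailer-length arithmetic (21 + 1 + 44 = 66 bytes for SHA-256) are worked examples:

use aws_smithy_checksums::body::ChecksumBody;
use aws_smithy_http::body::SdkBody;

fn wrap_streaming_upload_body(body: SdkBody) -> ChecksumBody<SdkBody> {
    // "sha256" is one of the valid algorithm names defined in aws-smithy-checksums
    let body = ChecksumBody::new(body, "sha256");
    // The trailer name is the checksum header that will be appended as an HTTP trailer
    assert_eq!(body.trailer_name().as_str(), "x-amz-checksum-sha256");
    // Trailer length = len("x-amz-checksum-sha256") + len(":") + len(base64(32-byte digest))
    //                = 21 + 1 + 44 = 66
    assert_eq!(body.trailer_length(), 66);
    body
}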
ChecksumValidatedBody
Users may request checksum validation for response bodies. That capability is provided by `ChecksumValidatedBody`, which will calculate a checksum as the response body is being read. Once all data has been read, the calculated checksum is compared to a precalculated checksum set during body creation. If the checksums don't match, the body will emit an error.
// In aws-smithy-checksums/src/body.rs
/// A response body that will calculate a checksum as it is read. If all data is read and the
/// calculated checksum doesn't match a precalculated checksum, this body will emit an
/// [aws_smithy_http::body::Error].
#[pin_project]
pub struct ChecksumValidatedBody<InnerBody> {
#[pin]
inner: InnerBody,
#[pin]
checksum: Box<dyn Checksum>,
precalculated_checksum: Bytes,
}
impl ChecksumValidatedBody<SdkBody> {
/// Given an `SdkBody`, the name of a checksum algorithm as a `&str`, and a precalculated
/// checksum represented as `Bytes`, create a new `ChecksumValidatedBody<SdkBody>`.
/// Valid checksum algorithm names are defined in this crate's [root module](super).
///
/// # Panics
///
/// This will panic if the given checksum algorithm is not supported.
pub fn new(body: SdkBody, checksum_algorithm: &str, precalculated_checksum: Bytes) -> Self {
Self {
checksum: new_checksum(checksum_algorithm),
inner: body,
precalculated_checksum,
}
}
fn poll_inner(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Bytes, aws_smithy_http::body::Error>>> {
let this = self.project();
let inner = this.inner;
let mut checksum = this.checksum;
match inner.poll_data(cx) {
Poll::Ready(Some(Ok(mut data))) => {
let len = data.chunk().len();
let bytes = data.copy_to_bytes(len);
if let Err(e) = checksum.update(&bytes) {
return Poll::Ready(Some(Err(e)));
}
Poll::Ready(Some(Ok(bytes)))
}
// Once the inner body has stopped returning data, check the checksum
// and return an error if it doesn't match.
Poll::Ready(None) => {
let actual_checksum = {
match checksum.finalize() {
Ok(checksum) => checksum,
Err(err) => {
return Poll::Ready(Some(Err(err)));
}
}
};
if *this.precalculated_checksum == actual_checksum {
Poll::Ready(None)
} else {
// So many parens it's starting to look like LISP
Poll::Ready(Some(Err(Box::new(Error::checksum_mismatch(
this.precalculated_checksum.clone(),
actual_checksum,
)))))
}
}
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
}
}
}
/// Errors related to checksum calculation and validation
#[derive(Debug, Eq, PartialEq)]
#[non_exhaustive]
pub enum Error {
/// The actual checksum didn't match the expected checksum. The checksummed data has been
/// altered since the expected checksum was calculated.
ChecksumMismatch { expected: Bytes, actual: Bytes },
}
impl Error {
/// Given an expected checksum and an actual checksum in `Bytes` form, create a new
/// `Error::ChecksumMismatch`.
pub fn checksum_mismatch(expected: Bytes, actual: Bytes) -> Self {
Self::ChecksumMismatch { expected, actual }
}
}
impl Display for Error {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
match self {
Error::ChecksumMismatch { expected, actual } => write!(
f,
"body checksum mismatch. expected body checksum to be {:x} but it was {:x}",
expected, actual
),
}
}
}
impl std::error::Error for Error {}
impl http_body::Body for ChecksumValidatedBody<SdkBody> {
type Data = Bytes;
type Error = aws_smithy_http::body::Error;
fn poll_data(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Self::Data, Self::Error>>> {
self.poll_inner(cx)
}
fn poll_trailers(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> {
self.project().inner.poll_trailers(cx)
}
// Once the inner body returns true for is_end_stream, we still need to
// verify the checksum; Therefore, we always return false here.
fn is_end_stream(&self) -> bool {
false
}
fn size_hint(&self) -> SizeHint {
self.inner.size_hint()
}
}
// The tests I have written are omitted from this RFC for brevity. The response body checksum verification is tested.
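Similarly, a short usage sketch (not from the RFC) of wrapping a response body, assuming the precalculated checksum was pulled from a response header:

use aws_smithy_checksums::body::ChecksumValidatedBody;
use aws_smithy_http::body::SdkBody;
use bytes::Bytes;

fn wrap_validated_response_body(
    body: SdkBody,
    precalculated_checksum: Bytes,
) -> ChecksumValidatedBody<SdkBody> {
    // Polling the returned body yields the same data as `body`, but if the bytes read
    // don't hash to `precalculated_checksum`, an `Error::ChecksumMismatch` is emitted.
    ChecksumValidatedBody::new(body, "crc32", precalculated_checksum)
}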
AwsChunkedBody and AwsChunkedBodyOptions
In order to send a request with checksum trailers, we must use an AWS-specific content encoding called `aws-chunked`. This encoding requires that we:
- Divide the original body content into one or more chunks. For our purposes we only ever use one chunk.
- Append a hexadecimal chunk size header to each chunk.
- Suffix each chunk with a CRLF (carriage return line feed).
- Send a 0 and CRLF to close the original body content section.
- Send trailers as part of the request body, suffixing each with a CRLF.
- Send a final CRLF to close the request body.
As an example, sending a regular request body with a SHA-256 checksum would look similar to this:
PUT SOMEURL HTTP/1.1
x-amz-checksum-sha256: ZOyIygCyaOW6GjVnihtTFtIS9PNmskdyMlNKiuyjfzw=
Content-Length: 11
...
Hello world
and the `aws-chunked` version would look like this:
PUT SOMEURL HTTP/1.1
x-amz-trailer: x-amz-checksum-sha256
x-amz-decoded-content-length: 11
Content-Encoding: aws-chunked
Content-Length: 87
...
B\r\n
Hello world\r\n
0\r\n
x-amz-checksum-sha256:ZOyIygCyaOW6GjVnihtTFtIS9PNmskdyMlNKiuyjfzw=\r\n
\r\n
NOTES:
- In the second example, `B` is the hexadecimal representation of 11.
- Authorization and other headers are omitted from the examples above for brevity.
- When using `aws-chunked` content encoding, S3 requires that we send the `x-amz-decoded-content-length` header with the length of the original body content.
This encoding scheme is performed by `AwsChunkedBody` and configured with `AwsChunkedBodyOptions`.
// In aws-http/src/content_encoding.rs
use aws_smithy_checksums::body::ChecksumBody;
use aws_smithy_http::body::SdkBody;
use bytes::{Buf, Bytes, BytesMut};
use http::{HeaderMap, HeaderValue};
use http_body::{Body, SizeHint};
use pin_project::pin_project;
use std::pin::Pin;
use std::task::{Context, Poll};
const CRLF: &str = "\r\n";
const CHUNK_TERMINATOR: &str = "0\r\n";
/// Content encoding header value constants
pub mod header_value {
/// Header value denoting "aws-chunked" encoding
pub const AWS_CHUNKED: &str = "aws-chunked";
}
/// Options used when constructing an [`AwsChunkedBody`][AwsChunkedBody].
#[derive(Debug, Default)]
#[non_exhaustive]
pub struct AwsChunkedBodyOptions {
/// The total size of the stream. For unsigned encoding this implies that
/// there will only be a single chunk containing the underlying payload,
/// unless ChunkLength is also specified.
pub stream_length: Option<u64>,
/// The maximum size of each chunk to be sent.
///
/// If ChunkLength and stream_length are both specified, the stream will be
/// broken up into chunk_length chunks. The encoded length of the aws-chunked
/// encoding can still be determined as long as all trailers, if any, have a
/// fixed length.
pub chunk_length: Option<u64>,
/// The length of each trailer sent within an `AwsChunkedBody`. Necessary in
/// order to correctly calculate the total size of the body.
pub trailer_lens: Vec<u64>,
}
impl AwsChunkedBodyOptions {
/// Create a new [`AwsChunkedBodyOptions`][AwsChunkedBodyOptions]
pub fn new() -> Self {
Self::default()
}
/// Set stream length
pub fn with_stream_length(mut self, stream_length: u64) -> Self {
self.stream_length = Some(stream_length);
self
}
/// Set chunk length
pub fn with_chunk_length(mut self, chunk_length: u64) -> Self {
self.chunk_length = Some(chunk_length);
self
}
/// Set a trailer len
pub fn with_trailer_len(mut self, trailer_len: u64) -> Self {
self.trailer_lens.push(trailer_len);
self
}
}
#[derive(Debug, PartialEq, Eq)]
enum AwsChunkedBodyState {
WritingChunkSize,
WritingChunk,
WritingTrailers,
Closed,
}
/// A request body compatible with `Content-Encoding: aws-chunked`
///
/// Chunked-Body grammar is defined in [ABNF] as:
///
/// ```txt
/// Chunked-Body = *chunk
/// last-chunk
/// chunked-trailer
/// CRLF
///
/// chunk = chunk-size CRLF chunk-data CRLF
/// chunk-size = 1*HEXDIG
/// last-chunk = 1*("0") CRLF
/// chunked-trailer = *( entity-header CRLF )
/// entity-header = field-name ":" OWS field-value OWS
/// ```
/// For more info on what the abbreviations mean, see https://datatracker.ietf.org/doc/html/rfc7230#section-1.2
///
/// [ABNF]:https://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_form
#[derive(Debug)]
#[pin_project]
pub struct AwsChunkedBody<InnerBody> {
#[pin]
inner: InnerBody,
#[pin]
state: AwsChunkedBodyState,
options: AwsChunkedBodyOptions,
}
// Currently, we only use this in terms of a streaming request body with checksum trailers
type Inner = ChecksumBody<SdkBody>;
impl AwsChunkedBody<Inner> {
/// Wrap the given body in an outer body compatible with `Content-Encoding: aws-chunked`
pub fn new(body: Inner, options: AwsChunkedBodyOptions) -> Self {
Self {
inner: body,
state: AwsChunkedBodyState::WritingChunkSize,
options,
}
}
fn encoded_length(&self) -> Option<u64> {
if self.options.chunk_length.is_none() && self.options.stream_length.is_none() {
return None;
}
let mut length = 0;
let stream_length = self.options.stream_length.unwrap_or_default();
if stream_length != 0 {
if let Some(chunk_length) = self.options.chunk_length {
let num_chunks = stream_length / chunk_length;
length += num_chunks * get_unsigned_chunk_bytes_length(chunk_length);
let remainder = stream_length % chunk_length;
if remainder != 0 {
length += get_unsigned_chunk_bytes_length(remainder);
}
} else {
length += get_unsigned_chunk_bytes_length(stream_length);
}
}
// End chunk
length += CHUNK_TERMINATOR.len() as u64;
// Trailers
for len in self.options.trailer_lens.iter() {
length += len + CRLF.len() as u64;
}
// Encoding terminator
length += CRLF.len() as u64;
Some(length)
}
}
fn prefix_with_chunk_size(data: Bytes, chunk_size: u64) -> Bytes {
// Len is the size of the entire chunk as defined in `AwsChunkedBodyOptions`
let mut prefixed_data = BytesMut::from(format!("{:X?}\r\n", chunk_size).as_bytes());
prefixed_data.extend_from_slice(&data);
prefixed_data.into()
}
fn get_unsigned_chunk_bytes_length(payload_length: u64) -> u64 {
let hex_repr_len = int_log16(payload_length);
hex_repr_len + CRLF.len() as u64 + payload_length + CRLF.len() as u64
}
fn trailers_as_aws_chunked_bytes(
total_length_of_trailers_in_bytes: u64,
trailer_map: Option<HeaderMap>,
) -> Bytes {
use std::fmt::Write;
// On 32-bit operating systems, we might not be able to convert the u64 to a usize, so we just
// use `String::new` in that case.
let mut trailers = match usize::try_from(total_length_of_trailers_in_bytes) {
Ok(total_length_of_trailers_in_bytes) => {
String::with_capacity(total_length_of_trailers_in_bytes)
}
Err(_) => String::new(),
};
let mut already_wrote_first_trailer = false;
if let Some(trailer_map) = trailer_map {
for (header_name, header_value) in trailer_map.into_iter() {
match header_name {
// New name, new value
Some(header_name) => {
if already_wrote_first_trailer {
// First trailer shouldn't have a preceding CRLF, but every trailer after it should
trailers.write_str(CRLF).unwrap();
} else {
already_wrote_first_trailer = true;
}
trailers.write_str(header_name.as_str()).unwrap();
trailers.write_char(':').unwrap();
}
// Same name, new value
None => {
trailers.write_char(',').unwrap();
}
}
trailers.write_str(header_value.to_str().unwrap()).unwrap();
}
}
// Write CRLF to end the body
trailers.write_str(CRLF).unwrap();
// If we wrote at least one trailer, we need to write an extra CRLF
if total_length_of_trailers_in_bytes != 0 {
trailers.write_str(CRLF).unwrap();
}
trailers.into()
}
impl Body for AwsChunkedBody<Inner> {
type Data = Bytes;
type Error = aws_smithy_http::body::Error;
fn poll_data(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Self::Data, Self::Error>>> {
tracing::info!("polling AwsChunkedBody");
let mut this = self.project();
match *this.state {
AwsChunkedBodyState::WritingChunkSize => match this.inner.poll_data(cx) {
Poll::Ready(Some(Ok(data))) => {
// A chunk must be prefixed by chunk size in hexadecimal
tracing::info!("writing chunk size and start of chunk");
*this.state = AwsChunkedBodyState::WritingChunk;
let total_chunk_size = this
.options
.chunk_length
.or(this.options.stream_length)
.unwrap_or_default();
Poll::Ready(Some(Ok(prefix_with_chunk_size(data, total_chunk_size))))
}
Poll::Ready(None) => {
tracing::info!("chunk was empty, writing last-chunk");
*this.state = AwsChunkedBodyState::WritingTrailers;
Poll::Ready(Some(Ok(Bytes::from("0\r\n"))))
}
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
},
AwsChunkedBodyState::WritingChunk => match this.inner.poll_data(cx) {
Poll::Ready(Some(Ok(mut data))) => {
tracing::info!("writing rest of chunk data");
Poll::Ready(Some(Ok(data.copy_to_bytes(data.len()))))
}
Poll::Ready(None) => {
tracing::info!("no more chunk data, writing CRLF and last-chunk");
*this.state = AwsChunkedBodyState::WritingTrailers;
Poll::Ready(Some(Ok(Bytes::from("\r\n0\r\n"))))
}
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
},
AwsChunkedBodyState::WritingTrailers => {
return match this.inner.poll_trailers(cx) {
Poll::Ready(Ok(trailers)) => {
*this.state = AwsChunkedBodyState::Closed;
let total_length_of_trailers_in_bytes =
this.options.trailer_lens.iter().fold(0, |acc, n| acc + n);
Poll::Ready(Some(Ok(trailers_as_aws_chunked_bytes(
total_length_of_trailers_in_bytes,
trailers,
))))
}
Poll::Pending => Poll::Pending,
Poll::Ready(Err(e)) => Poll::Ready(Some(Err(e))),
};
}
AwsChunkedBodyState::Closed => {
return Poll::Ready(None);
}
}
}
fn poll_trailers(
self: Pin<&mut Self>,
_cx: &mut Context<'_>,
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> {
// Trailers were already appended to the body because of the content encoding scheme
Poll::Ready(Ok(None))
}
fn is_end_stream(&self) -> bool {
self.state == AwsChunkedBodyState::Closed
}
fn size_hint(&self) -> SizeHint {
SizeHint::with_exact(
self.encoded_length()
.expect("Requests made with aws-chunked encoding must have known size")
as u64,
)
}
}
// Used for finding how many hexadecimal digits it takes to represent a base 10 integer
fn int_log16<T>(mut i: T) -> u64
where
T: std::ops::DivAssign + PartialOrd + From<u8> + Copy,
{
let mut len = 0;
let zero = T::from(0);
let sixteen = T::from(16);
while i > zero {
i /= sixteen;
len += 1;
}
len
}
#[cfg(test)]
mod tests {
use super::AwsChunkedBody;
use crate::content_encoding::AwsChunkedBodyOptions;
use aws_smithy_checksums::body::ChecksumBody;
use aws_smithy_http::body::SdkBody;
use bytes::Buf;
use bytes_utils::SegmentedBuf;
use http_body::Body;
use std::io::Read;
#[tokio::test]
async fn test_aws_chunked_encoded_body() {
let input_text = "Hello world";
let sdk_body = SdkBody::from(input_text);
let checksum_body = ChecksumBody::new(sdk_body, "sha256");
let aws_chunked_body_options = AwsChunkedBodyOptions {
stream_length: Some(input_text.len() as u64),
chunk_length: None,
trailer_lens: vec![
"x-amz-checksum-sha256:ZOyIygCyaOW6GjVnihtTFtIS9PNmskdyMlNKiuyjfzw=".len() as u64,
],
};
let mut aws_chunked_body = AwsChunkedBody::new(checksum_body, aws_chunked_body_options);
let mut output = SegmentedBuf::new();
while let Some(buf) = aws_chunked_body.data().await {
output.push(buf.unwrap());
}
let mut actual_output = String::new();
output
.reader()
.read_to_string(&mut actual_output)
.expect("Doesn't cause IO errors");
let expected_output = "B\r\nHello world\r\n0\r\nx-amz-checksum-sha256:ZOyIygCyaOW6GjVnihtTFtIS9PNmskdyMlNKiuyjfzw=\r\n\r\n";
// Verify data is complete and correctly encoded
assert_eq!(expected_output, actual_output);
assert!(
aws_chunked_body
.trailers()
.await
.expect("checksum generation was without error")
.is_none(),
"aws-chunked encoded bodies don't have normal HTTP trailers"
);
}
#[tokio::test]
async fn test_empty_aws_chunked_encoded_body() {
let sdk_body = SdkBody::from("");
let checksum_body = ChecksumBody::new(sdk_body, "sha256");
let aws_chunked_body_options = AwsChunkedBodyOptions {
stream_length: Some(0),
chunk_length: None,
trailer_lens: vec![
"x-amz-checksum-sha256:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=".len() as u64,
],
};
let mut aws_chunked_body = AwsChunkedBody::new(checksum_body, aws_chunked_body_options);
let mut output = SegmentedBuf::new();
while let Some(buf) = aws_chunked_body.data().await {
output.push(buf.unwrap());
}
let mut actual_output = String::new();
output
.reader()
.read_to_string(&mut actual_output)
.expect("Doesn't cause IO errors");
let expected_output =
"0\r\nx-amz-checksum-sha256:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=\r\n\r\n";
// Verify data is complete and correctly encoded
assert_eq!(expected_output, actual_output);
assert!(
aws_chunked_body
.trailers()
.await
.expect("checksum generation was without error")
.is_none(),
"aws-chunked encoded bodies don't have normal HTTP trailers"
);
}
}
Sigv4 Update
When sending checksum-verified requests with a streaming body, we must update the usual signing process. Instead of signing the request based on the request body's checksum, we must sign it with a special header:
Authorization: <computed authorization header value using "STREAMING-UNSIGNED-PAYLOAD-TRAILER">
x-amz-content-sha256: STREAMING-UNSIGNED-PAYLOAD-TRAILER
Setting `STREAMING-UNSIGNED-PAYLOAD-TRAILER` tells the signer that we're sending an unsigned streaming body that will be followed by trailers.
We can achieve this by:
- Adding a new variant to `SignableBody`:

/// A signable HTTP request body
#[derive(Debug, Clone, Eq, PartialEq)]
#[non_exhaustive]
pub enum SignableBody<'a> {
    // existing variants have been omitted for brevity...

    /// An unsigned payload with trailers
    ///
    /// StreamingUnsignedPayloadTrailer is used for streaming requests where the contents of the
    /// body cannot be known prior to signing **AND** which include HTTP trailers.
    StreamingUnsignedPayloadTrailer,
}

- Updating the `CanonicalRequest::payload_hash` method to include the new `SignableBody` variant:

fn payload_hash<'b>(body: &'b SignableBody<'b>) -> Cow<'b, str> {
    // Payload hash computation
    //
    // Based on the input body, set the payload_hash of the canonical request:
    // Either:
    // - compute a hash
    // - use the precomputed hash
    // - use `UnsignedPayload`
    // - use `StreamingUnsignedPayloadTrailer`
    match body {
        SignableBody::Bytes(data) => Cow::Owned(sha256_hex_string(data)),
        SignableBody::Precomputed(digest) => Cow::Borrowed(digest.as_str()),
        SignableBody::UnsignedPayload => Cow::Borrowed(UNSIGNED_PAYLOAD),
        SignableBody::StreamingUnsignedPayloadTrailer => {
            Cow::Borrowed(STREAMING_UNSIGNED_PAYLOAD_TRAILER)
        }
    }
}

- (in generated code) Inserting the `SignableBody` into the request property bag when making a checksum-verified streaming request:

if self.checksum_algorithm.is_some() {
    request
        .properties_mut()
        .insert(aws_sig_auth::signer::SignableBody::StreamingUnsignedPayloadTrailer);
}
It's possible to send `aws-chunked` requests where each chunk is signed individually. Because this feature isn't strictly necessary for flexible checksums, I've avoided implementing it.
Inlineables
In order to avoid writing lots of Rust in Kotlin, I have implemented request and response building functions as inlineables:
- Building checksum-validated requests with in-memory request bodies:
// In aws/rust-runtime/aws-inlineable/src/streaming_body_with_checksum.rs

/// Given a `&mut http::request::Request`, and checksum algorithm name, calculate a checksum and
/// then modify the request to include the checksum as a header.
pub fn build_checksum_validated_request(
    request: &mut http::request::Request<aws_smithy_http::body::SdkBody>,
    checksum_algorithm: &str,
) -> Result<(), aws_smithy_http::operation::BuildError> {
    let data = request.body().bytes().unwrap_or_default();

    let mut checksum = aws_smithy_checksums::new_checksum(checksum_algorithm);
    checksum
        .update(data)
        .map_err(|err| aws_smithy_http::operation::BuildError::Other(err))?;
    let checksum = checksum
        .finalize()
        .map_err(|err| aws_smithy_http::operation::BuildError::Other(err))?;

    request.headers_mut().insert(
        aws_smithy_checksums::checksum_algorithm_to_checksum_header_name(checksum_algorithm),
        aws_smithy_types::base64::encode(&checksum[..])
            .parse()
            .expect("base64-encoded checksums are always valid header values"),
    );

    Ok(())
}
- Building checksum-validated requests with streaming request bodies:
/// Given an `http::request::Builder`, `SdkBody`, and a checksum algorithm name, return a
/// `Request<SdkBody>` with checksum trailers where the content is `aws-chunked` encoded.
pub fn build_checksum_validated_request_with_streaming_body(
    request_builder: http::request::Builder,
    body: aws_smithy_http::body::SdkBody,
    checksum_algorithm: &str,
) -> Result<http::Request<aws_smithy_http::body::SdkBody>, aws_smithy_http::operation::BuildError> {
    use http_body::Body;

    let original_body_size = body
        .size_hint()
        .exact()
        .expect("body must be sized if checksum is requested");
    let body = aws_smithy_checksums::body::ChecksumBody::new(body, checksum_algorithm);
    let checksum_trailer_name = body.trailer_name();
    let aws_chunked_body_options = aws_http::content_encoding::AwsChunkedBodyOptions::new()
        .with_stream_length(original_body_size as usize)
        .with_trailer_len(body.trailer_length() as usize);
    let body = aws_http::content_encoding::AwsChunkedBody::new(body, aws_chunked_body_options);
    let encoded_content_length = body
        .size_hint()
        .exact()
        .expect("encoded_length must return known size");

    let request_builder = request_builder
        .header(
            http::header::CONTENT_LENGTH,
            http::HeaderValue::from(encoded_content_length),
        )
        .header(
            http::header::HeaderName::from_static("x-amz-decoded-content-length"),
            http::HeaderValue::from(original_body_size),
        )
        .header(
            http::header::HeaderName::from_static("x-amz-trailer"),
            checksum_trailer_name,
        )
        .header(
            http::header::CONTENT_ENCODING,
            aws_http::content_encoding::header_value::AWS_CHUNKED.as_bytes(),
        );

    let body = aws_smithy_http::body::SdkBody::from_dyn(http_body::combinators::BoxBody::new(body));
    request_builder
        .body(body)
        .map_err(|err| aws_smithy_http::operation::BuildError::Other(Box::new(err)))
}
- Building checksum-validated responses:
/// Given a `Response<SdkBody>`, checksum algorithm name, and pre-calculated checksum, return a
/// `Response<SdkBody>` where the body will be processed with the checksum algorithm and checked
/// against the pre-calculated checksum.
pub fn build_checksum_validated_sdk_body(
    body: aws_smithy_http::body::SdkBody,
    checksum_algorithm: &str,
    precalculated_checksum: bytes::Bytes,
) -> aws_smithy_http::body::SdkBody {
    let body = aws_smithy_checksums::body::ChecksumValidatedBody::new(
        body,
        checksum_algorithm,
        precalculated_checksum.clone(),
    );
    aws_smithy_http::body::SdkBody::from_dyn(http_body::combinators::BoxBody::new(body))
}

/// Given the name of a checksum algorithm and a `HeaderMap`, extract the checksum value from the
/// corresponding header as `Some(Bytes)`. If the header is unset, return `None`.
pub fn check_headers_for_precalculated_checksum(
    headers: &http::HeaderMap<http::HeaderValue>,
) -> Option<(&'static str, bytes::Bytes)> {
    for header_name in aws_smithy_checksums::CHECKSUM_HEADERS_IN_PRIORITY_ORDER {
        if let Some(precalculated_checksum) = headers.get(&header_name) {
            let checksum_algorithm =
                aws_smithy_checksums::checksum_header_name_to_checksum_algorithm(&header_name);
            let precalculated_checksum =
                bytes::Bytes::copy_from_slice(precalculated_checksum.as_bytes());
            return Some((checksum_algorithm, precalculated_checksum));
        }
    }
    None
}
Codegen
Codegen will be updated to insert the appropriate inlineable functions for operations that are tagged with the `@httpChecksum` trait. Some operations will require an MD5 checksum fallback if the user hasn't set a checksum themselves.
Users also have the option of supplying a precalculated checksum of their own. This is already handled by our current header insertion logic and won't require updating the existing implementation. Because this checksum validation behavior is AWS-specific, it will be defined in SDK codegen.
Implementation Checklist
- Implement codegen for building checksum-validated requests:
  - In-memory request bodies
    - Support MD5 fallback behavior for services that enable it.
  - Streaming request bodies
- Implement codegen for building checksum-validated responses
RFC: Customizable Client Operations
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
SDK customers occasionally need to add extra HTTP headers to requests, and currently, the SDK has no easy way to accomplish this. At time of writing, the lower-level Smithy client has to be used to create an operation, and then the HTTP request must be augmented on that operation type. For example:
let input = SomeOperationInput::builder().some_value(5).build()?;
let operation = {
let op = input.make_operation(&service_config).await?;
let (request, response) = op.into_request_response();
let request = request.augment(|req, _props| {
req.headers_mut().insert(
HeaderName::from_static("x-some-header"),
HeaderValue::from_static("some-value")
);
Result::<_, Infallible>::Ok(req)
})?;
Operation::from_parts(request, response)
};
let response = smithy_client.call(operation).await?;
This approach is difficult both to discover and to implement, since it requires acquiring a Smithy client rather than the generated fluent client, and it's anything but ergonomic.
This RFC proposes an easier way to augment requests that is compatible with the fluent client.
Terminology
- Smithy Client: An `aws_smithy_client::Client<C, M, R>` struct that is responsible for gluing together the connector, middleware, and retry policy.
- Fluent Client: A code-generated `Client` that has methods for each service operation on it. A fluent builder is generated alongside it to make construction easier.
Proposal
The code generated fluent builders returned by the fluent client should have a method added to them, similar to `send`, but that returns a customizable request. The customer experience should look as follows:
let response = client.some_operation()
.some_value(5)
.customize()
.await?
.mutate_request(|mut req| {
req.headers_mut().insert(
HeaderName::from_static("x-some-header"),
HeaderValue::from_static("some-value")
);
})
.send()
.await?;
This new async `customize` method would return the following:
pub struct CustomizableOperation<O, R> {
handle: Arc<Handle>,
operation: Operation<O, R>,
}
impl<O, R> CustomizableOperation<O, R> {
// Allows for customizing the operation's request
fn map_request<E>(
mut self,
f: impl FnOnce(Request<SdkBody>) -> Result<Request<SdkBody>, E>,
) -> Result<Self, E> {
let (request, response) = self.operation.into_request_response();
let request = request.augment(|req, _props| f(req))?;
self.operation = Operation::from_parts(request, response);
Ok(self)
}
// Convenience for `map_request` where infallible direct mutation of the request is acceptable
fn mutate_request(
self,
f: impl FnOnce(&mut Request<SdkBody>),
) -> Self {
self.map_request(|mut req| {
f(&mut req);
Result::<_, Infallible>::Ok(req)
})
.expect("infallible")
}
// Allows for customizing the entire operation
fn map_operation<E>(
mut self,
f: impl FnOnce(Operation<O, R>) -> Result<Operation<O, R>, E>,
) -> Result<Self, E> {
self.operation = f(self.operation)?;
Ok(self)
}
// Direct access to read the request
fn request(&self) -> &Request<SdkBody> {
self.operation.request()
}
// Direct access to mutate the request
fn request_mut(&mut self) -> &mut Request<SdkBody> {
self.operation.request_mut()
}
// Sends the operation's request
async fn send<T, E>(self) -> Result<T, SdkError<E>>
where
O: ParseHttpResponse<Output = Result<T, E>> + Send + Sync + Clone + 'static,
E: std::error::Error,
R: ClassifyResponse<SdkSuccess<T>, SdkError<E>> + Send + Sync,
{
self.handle.client.call(self.operation).await
}
}
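As an additional usage sketch (not part of the RFC text), `map_operation` permits fallible adjustments to the entire operation before dispatch:

let response = client.some_operation()
    .some_value(5)
    .customize()
    .await?
    .map_operation(|operation| {
        // Inspect or modify the whole `Operation` here; returning `Err` aborts the call.
        Result::<_, Infallible>::Ok(operation)
    })?
    .send()
    .await?;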
Additionally, for those who want to avoid closures, the `Operation` type will have `request` and `request_mut` methods added to it to get direct access to its underlying HTTP request. The `CustomizableOperation` type will then mirror these functions so that the experience can look as follows:
let mut operation = client.some_operation()
.some_value(5)
.customize()
.await?;
operation.request_mut()
.headers_mut()
.insert(
HeaderName::from_static("x-some-header"),
HeaderValue::from_static("some-value")
);
let response = operation.send().await?;
Why not remove `async` from `customize` to make this more ergonomic?
In the proposal above, customers must `await` the result of `customize` in order to get the `CustomizableOperation`. This is a result of the underlying `map_operation` function that `customize` needs to call being async, which was made async during the implementation of customizations for Glacier (see #797, #801, and #1474). It is possible to move these Glacier customizations into middleware to make `map_operation` sync, but keeping it async is more future-proof: if a later customization or feature requires it to be async, that won't become a breaking change.
Why the name `customize`?
Alternatively, the name `build` could be used, but this increases the odds that customers won't realize that they can call `send` directly, and will then call a longer `build`/`send` chain when customization isn't needed:
client.some_operation()
.some_value()
.build() // Oops, didn't need to do this
.send()
.await?;
vs.
client.some_operation()
.some_value()
.send()
.await?;
Additionally, no AWS services at time of writing have a member named `customize` that would conflict with the new function, so adding it would not be a breaking change.
Changes Checklist
- Create `CustomizableOperation` as an inlineable, and code generate it into `client` so that it has access to `Handle`
- Code generate the `customize` method on fluent builders
- Update the `RustReservedWords` class to include `customize`
- Add ability to mutate the HTTP request on `Operation`
- Add examples for both approaches
RFC: Logging in the Presence of Sensitive Data
Status: Accepted
Smithy provides a sensitive trait, which exists syntactically as a `@sensitive` field annotation and has the following semantics:
Sensitive data MUST NOT be exposed in things like exception messages or log output. Application of this trait SHOULD NOT affect wire logging (i.e., logging of all data transmitted to and from servers or clients).
This RFC is concerned with solving the problem of honouring this specification in the context of logging.
Progress has been made towards this goal in the form of the Sensitive Trait PR, which uses code generation to remove sensitive fields from `Debug` implementations.
The problem remains open due to the existence of HTTP binding traits and a lack of clearly defined user guidelines which customers may follow to honour the specification.
This RFC proposes:
- A new logging middleware is generated and applied to each `OperationHandler` `Service`.
- A developer guideline is provided on how to avoid violating the specification.
Terminology
- Model: A Smithy Model, usually pertaining to the one in use by the customer.
- Runtime crate: A crate existing within the `rust-runtime` folder, used to implement shared functionalities that do not have to be code-generated.
- Service: The tower::Service trait. The lowest level of abstraction we deal with when making HTTP requests. Services act directly on data to transform and modify that data. A Service is what eventually turns a request into a response.
- Middleware: Broadly speaking, middleware modify requests and responses. Concretely, these exist as implementations of Layer, or as a `Service` wrapping an inner `Service`.
- Potentially sensitive: Data that could be bound to a sensitive field of a structure, for example via the HTTP Binding Traits.
Background
HTTP Binding Traits
Smithy provides various HTTP binding traits. These allow protocols to configure an HTTP request by way of binding fields to parts of the request. For this reason, sensitive data might be unintentionally leaked through logging of a bound request.
| Trait | Configurable |
|---|---|
| httpHeader | Headers |
| httpPrefixHeaders | Headers |
| httpLabel | URI |
| httpPayload | Payload |
| httpQuery | Query Parameters |
| httpResponseCode | Status Code |
Each of these configurable parts must therefore be logged cautiously.
Scope and Guidelines
It would be unfeasible to forbid the logging of sensitive data altogether using the type system. With the current API, the customer will always have an opportunity to log a request containing sensitive data before it enters the `Service<Request<B>>` that we provide to them.
// The API provides us with a `Service<Request<B>>`
let app: Router = OperationRegistryBuilder::default().build().expect("unable to build operation registry").into();
// We can use `ServiceExt::map_request` to log a request with potentially sensitive data
let app = app.map_request(|request| {
info!(?request);
request
});
A more subtle violation of the specification may occur when the customer enables verbose logging - a third-party dependency might simply log data marked as sensitive, for example `tokio` or `hyper`.
These two cases illustrate that `smithy-rs` can only prevent violation of the specification in a restricted scope - logs emitted from generated code and the runtime crates. A `smithy-rs`-specific guideline should be available to the customer which outlines how to avoid violating the specification in areas outside of our control.
Routing
The sensitivity and HTTP bindings are declared within specific structures/operations. For this reason, in the general case, it's unknowable whether or not any given part of a request is sensitive until we determine which operation is tasked with handling the request and hence which fields are bound. Implementation wise, this means that any middleware applied before routing has taken place cannot log anything potentially sensitive without performing routing logic itself.
Note that:
- We are not required to deserialize the entire request before we can make judgments on what data is sensitive or not - only which operation it has been routed to.
- We are permitted to emit logs prior to routing when:
- they contain no potentially sensitive data, or
- the request failed to route, in which case it's not subject to the constraints of an operation.
Runtime Crates
The crates existing in `rust-runtime` are not code generated - their source code is agnostic to the specific model in use. For this reason, if such a crate wanted to log potentially sensitive data, then there must be a way to conditionally toggle that log without manipulation of the source code. Any proposed solution must acknowledge this concern.
Proposal
This proposal serves to honor the sensitivity specification via code generation of a logging middleware which is aware of the sensitivity, together with a developer contract disallowing logging potentially sensitive data in the runtime crates. A developer guideline should be provided in addition to the middleware.
All data known to be sensitive should be replaced with `"{redacted}"` when logged. Implementation-wise, this means that tracing::Events and tracing::Spans of the form `debug!(field = "sensitive data")` and `span!(..., field = "sensitive data")` must become `debug!(field = "{redacted}")` and `span!(..., field = "{redacted}")`.
Debug Logging
Developers might want to observe sensitive data for debugging purposes. It should be possible to opt out of the redactions by enabling a feature flag, `unredacted-logging` (which is disabled by default).
To prevent excessive branches such as
if cfg!(feature = "unredacted-logging") {
debug!(%data, "logging here");
} else {
debug!(data = "{redacted}", "logging here");
}
the following wrapper should be provided from a runtime crate:
pub struct Sensitive<T>(T);
impl<T> Debug for Sensitive<T>
where
T: Debug
{
fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
if cfg!(feature = "unredacted-logging") {
self.0.fmt(f)
} else {
"{redacted}".fmt(f)
}
}
}
impl<T> Display for Sensitive<T>
where
T: Display
{
fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
if cfg!(feature = "unredacted-logging") {
self.0.fmt(f)
} else {
"{redacted}".fmt(f)
}
}
}
In which case the branch above becomes
debug!(sensitive_data = %Sensitive(data));
Code Generated Logging Middleware
Using the smithy model, for each operation, a logging middleware should be generated. Through the model, the code generation knows which fields are sensitive and which HTTP bindings exist, therefore the logging middleware can be carefully crafted to avoid leaking sensitive data.
As a request enters this middleware it should record the method, HTTP headers, and URI in a tracing::span. As a response leaves this middleware it should record the HTTP headers and status code in a tracing::debug.
The following model
@readonly
@http(uri: "/inventory/{name}", method: "GET")
operation Inventory {
input: Product,
output: Stocked
}
@input
structure Product {
@required
@sensitive
@httpLabel
name: String
}
@output
structure Stocked {
@sensitive
@httpResponseCode
code: String,
}
should generate the following
// NOTE: This code is intended to show behavior - it does not compile
pub struct InventoryLogging<S> {
inner: S,
operation_name: &'static str
}
impl<S> InventoryLogging<S> {
pub fn new(inner: S) -> Self {
Self {
inner
}
}
}
impl<B, S> Service<Request<B>> for InventoryLogging<S>
where
S: Service<Request<B>>
{
type Response = Response<BoxBody>;
type Error = S::Error;
type Future = /* Implementation detail */;
fn call(&mut self, request: Request<B>) -> Self::Future {
// Remove sensitive data from parts of the HTTP
let uri = /* redact {name} from URI */;
let headers = /* no redactions */;
let fut = async {
let response = self.inner.call(request).await;
let status_code = /* redact status code */;
let headers = /* no redactions */;
debug!(%status_code, ?headers, "response");
response
};
// Instrument the future with a span
let span = debug_span!("request", operation = %self.operation_name, method = %request.method(), %uri, ?headers);
fut.instrument(span)
}
}
HTTP Debug/Display Wrappers
The Service::call path, seen in Code Generated Logging Middleware, is latency-sensitive. Careful implementation is required to avoid excess allocations while redacting sensitive data. Wrapping Uri and HeaderMap and providing new Display/Debug implementations which skip over the sensitive data is preferable to allocating a new String/HeaderMap and then mutating it.
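A minimal sketch of what such a wrapper might look like, assuming a hypothetical SensitivePathSegment type (not the actual runtime API) which redacts a single path segment while formatting a borrowed http::Uri; host and query handling are omitted for brevity:

use std::fmt::{self, Display};
use http::Uri;

// Hypothetical wrapper: renders the URI path, replacing one segment with "{redacted}"
pub struct SensitivePathSegment<'a> {
    uri: &'a Uri,
    segment_index: usize,
}

impl Display for SensitivePathSegment<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (index, segment) in self.uri.path().split('/').skip(1).enumerate() {
            if index == self.segment_index {
                // The redaction happens while formatting - no new `String` is allocated
                write!(f, "/{{redacted}}")?;
            } else {
                write!(f, "/{}", segment)?;
            }
        }
        Ok(())
    }
}

With the Inventory example above, /inventory/pikachu would render as /inventory/{redacted}.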
These wrappers should be provided alongside the Sensitive
struct described in Debug Logging. If they are implemented on top of Sensitive
, they will inherit the same behavior - allowing redactions to be toggled using unredacted-logging
feature flag.
Middleware Position
This logging middleware should be applied outside of the OperationHandler after its construction in the (generated) operation_registry.rs
file. The middleware should preserve the associated types of the OperationHandler
(Response = Response<BoxBody>
, Error = Infallible
) to cause minimal disruption.
An easy position to apply the logging middleware is illustrated below in the form of Logging{Operation}::new
:
let empty_operation = LoggingEmptyOperation::new(operation(registry.empty_operation));
let get_pokemon_species = LoggingPokemonSpecies::new(operation(registry.get_pokemon_species));
let get_server_statistics = LoggingServerStatistics::new(operation(registry.get_server_statistics));
let routes = vec![
(BoxCloneService::new(empty_operation), empty_operation_request_spec),
(BoxCloneService::new(get_pokemon_species), get_pokemon_species_request_spec),
(BoxCloneService::new(get_server_statistics), get_server_statistics_request_spec),
];
let router = aws_smithy_http_server::routing::Router::new_rest_json_router(routes);
Although an acceptable first step, putting the logging middleware here is suboptimal - the Router allows a tower::Layer to be applied to the operations via the Router::layer method. Such middleware is applied outside of the logging middleware and, as a result, is not subject to its logging span. Therefore, the Router must be changed to allow middleware to be applied within the logging middleware rather than outside of it.
This is a general problem, not specific to this proposal. For example, Use Request Extensions must also solve this problem.
Fortunately, this problem is separable from the actual implementation of the logging middleware and we can get immediate benefit by application of it in the suboptimal position described above.
Logging within the Router
There is a need for logging within the Router implementation - this is a crucial area of business logic. As mentioned in the Routing section, we are permitted to log potentially sensitive data in cases where requests fail to be routed to an operation.
In the case of AWS JSON 1.0 and 1.1 protocols, the request URI is always /
, putting it outside of the reach of the @sensitive
trait. We therefore have the option to log it before routing occurs. We make a choice not to do this in order to remove the special case - relying on the logging layer to log URIs when appropriate.
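For illustration, a hypothetical helper (the name, signature, and String response body are not part of the proposal) showing what logging a routing failure might look like:

use http::{Request, Response, StatusCode};
use tracing::debug;

// The request never matched an operation, so no @sensitive bindings apply and the
// full method and URI may be recorded
fn log_routing_failure<B>(request: &Request<B>) -> Response<String> {
    debug!(method = %request.method(), uri = %request.uri(), "failed to route request");
    Response::builder()
        .status(StatusCode::NOT_FOUND)
        .body(String::new())
        .expect("valid response")
}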
Developer Guideline
A guideline should be made available, which includes:
- The HTTP bindings traits and why they are of concern in the presence of @sensitive.
- The Debug implementation on structures.
- How to use the Sensitive struct, HTTP wrappers, and the unredacted-logging feature flag described in Debug Logging and HTTP Debug/Display Wrappers.
- A warning against the two potential leaks described in Scope and Guidelines:
  - Sensitive data leaking from third-party dependencies.
  - Sensitive data leaking from middleware applied to the Router.
Alternative Proposals
All of the following proposals are compatible with, and benefit from, Debug Logging, HTTP Debug/Display Wrappers, and Developer Guideline portions of the main proposal.
The main proposal disallows the logging of potentially sensitive data in the runtime crates, instead opting for a dedicated code generated logging middleware. In contrast, the following proposals all seek ways to accommodate logging of potentially sensitive data in the runtime crates.
Use Request Extensions
Request extensions can be used to adjoin data to a Request as it passes through the middleware. Concretely, they exist as the type map http::Extensions accessed via http::extensions and http::extensions_mut.
These can be used to provide data to middleware interested in logging potentially sensitive data.
struct Sensitivity {
/* Data concerning which parts of the request are sensitive */
}
struct Middleware<S> {
inner: S
}
impl<B, S> Service<Request<B>> for Middleware<S> {
/* ... */
fn call(&mut self, request: Request<B>) -> Self::Future {
if let Some(sensitivity) = request.extensions().get::<Sensitivity>() {
if sensitivity.is_method_sensitive() {
debug!(method = %request.method());
}
}
/* ... */
self.inner.call(request)
}
}
A middleware layer must be code generated (much in the same way as the logging middleware) which is dedicated to inserting the Sensitivity
struct into the extensions of each incoming request.
impl<B, S> Service<Request<B>> for SensitivityInserter<S>
where
S: Service<Request<B>>
{
/* ... */
fn call(&mut self, request: Request<B>) -> Self::Future {
let sensitivity = Sensitivity {
/* .. */
};
request.extensions_mut().insert(sensitivity);
self.inner.call(request)
}
}
Advantages
- Applicable to all middleware which takes http::Request<B>.
- Does not pollute the API of the middleware - code internal to middleware simply inspects the request's extensions and performs logic based on its value.
Disadvantages
- The sensitivity and HTTP bindings are known at compile time whereas the insertion/retrieval of the extension data is done at runtime.
  - http::Extensions is approximately a HashMap<u64, Box<dyn Any>>, so lookup/insertion involves indirection/cache misses/heap allocation.
Accommodate the Sensitivity in Middleware API
It is possible that sensitivity is a parameter passed to middleware during construction. This is similar in nature to Use Request Extensions except that the Sensitivity
is passed to middleware during construction.
struct Middleware<S> {
inner: S,
sensitivity: Sensitivity
}
impl Middleware<S> {
pub fn new(inner: S) -> Self { /* ... */ }
pub fn new_with_sensitivity(inner: S, sensitivity: Sensitivity) -> Self { /* ... */ }
}
impl<B, S> Service<Request<B>> for Middleware<S> {
/* ... */
fn call(&mut self, request: Request<B>) -> Self::Future {
if self.sensitivity.is_method_sensitive() {
debug!(method = %Sensitive(request.method()));
}
/* ... */
self.inner.call(request)
}
}
The code generation would then be responsible for constructing a Sensitivity for each operation. Additionally, if any middleware is applied to an operation, the code generation would be responsible for passing that middleware the appropriate Sensitivity before applying it.
Advantages
- Applicable to all middleware.
- As the
Sensitivity
struct will be known statically, the compiler will remove branches, making it cheap.
Disadvantages
- Pollutes the API of middleware.
Redact values using a tracing Layer
Distinct from tower::Layer
, a tracing::Layer is a "composable handler for tracing
events". It would be possible to write an implementation which would filter out events which contain sensitive data.
Examples of filtering tracing::Layer
s already exist in the form of the EnvFilter and Targets. It is unlikely that we'll be able to leverage them for our use, but the underlying principle remains the same - the tracing::Layer
inspects tracing::Event
s/tracing::Span
s, filtering them based on some criteria.
Code generation would need to be used to produce the filtering criteria from the models. Internal developers would need to adhere to a common set of field names for them to be subject to the filtering. Spans would need to be opened after routing occurs so that the tracing::Layer knows which operation the Events are being produced within and hence which filtering rules to apply.
Advantages
- Applicable to all middleware.
- Good separation of concerns:
- Does not pollute the API of the middleware
- No specific logic required within middleware.
Disadvantages
- Complex implementation.
- Not necessarily fast: tracing::Layers seem to only support filtering entire Events, rather than more fine-grained removal of fields.
Changes Checklist
- Implement and integrate code generated logging middleware.
- Add logging to the Router implementation.
- Write developer guideline.
- Refactor Router to allow for the better positioning described in Middleware Position.
RFC: Errors for event streams
Status: Implemented
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines how client and server will use errors defined in @streaming
unions (event streams).
The user experience if this RFC is implemented
In the current version of smithy-rs, customers who want to use errors in event streams need to use them like so:
stream! {
yield Ok(EventStreamUnion::ErrorVariant ...)
}
Furthermore, there is no support for @error
s in event streams being terminal; that is, when an error is sent,
it does not signal termination and thus does not complete the stream.
This RFC proposes to make changes to:
- terminate the stream upon receiving a modeled error
- change the API so that customers will write their business logic in a more Rust-like experience:
stream! {
yield Err(EventStreamUnionError::ErrorKind ...)
}
Thus any Err(_)
from the stream is terminal, rather than any Ok(x)
with x
being matched against the set of modeled variant errors in the union.
How to actually implement this RFC
In order to implement this feature:
- Errors modeled in streaming unions are going to be treated like operation errors
- They are in the
error::
namespace - They have the same methods operation errors have (
name
on the server,metadata
on the client and so on) - They are not variants in the corresponding error structure
- They are in the
- Errors need to be marshalled and unmarshalled
Receiver
must treat any error coming from the other end as terminal
The code examples below have been generated using the following model:
@http(uri: "/capture-pokemon-event/{region}", method: "POST")
operation CapturePokemonOperation {
input: CapturePokemonOperationEventsInput,
output: CapturePokemonOperationEventsOutput,
errors: [UnsupportedRegionError, ThrottlingError]
}
@input
structure CapturePokemonOperationEventsInput {
@httpPayload
events: AttemptCapturingPokemonEvent,
@httpLabel
@required
region: String,
}
@output
structure CapturePokemonOperationEventsOutput {
@httpPayload
events: CapturePokemonEvents,
}
@streaming
union AttemptCapturingPokemonEvent {
event: CapturingEvent,
masterball_unsuccessful: MasterBallUnsuccessful,
}
structure CapturingEvent {
@eventPayload
payload: CapturingPayload,
}
structure CapturingPayload {
name: String,
pokeball: String,
}
@streaming
union CapturePokemonEvents {
event: CaptureEvent,
invalid_pokeball: InvalidPokeballError,
throttlingError: ThrottlingError,
}
structure CaptureEvent {
@eventHeader
name: String,
@eventHeader
captured: Boolean,
@eventHeader
shiny: Boolean,
@eventPayload
pokedex_update: Blob,
}
@error("server")
structure UnsupportedRegionError {
@required
region: String,
}
@error("client")
structure InvalidPokeballError {
@required
pokeball: String,
}
@error("server")
structure MasterBallUnsuccessful {
@required
message: String,
}
@error("client")
structure ThrottlingError {}
Wherever irrelevant, documentation and other lines are stripped out from the code examples below.
Errors in streaming unions
The error in AttemptCapturingPokemonEvent
is modeled as follows.
On the client,
pub struct AttemptCapturingPokemonEventError {
pub kind: AttemptCapturingPokemonEventErrorKind,
pub(crate) meta: aws_smithy_types::Error,
}
pub enum AttemptCapturingPokemonEventErrorKind {
MasterBallUnsuccessful(crate::error::MasterBallUnsuccessful),
Unhandled(Box<dyn std::error::Error + Send + Sync + 'static>),
}
On the server,
pub enum AttemptCapturingPokemonEventError {
MasterBallUnsuccessful(crate::error::MasterBallUnsuccessful),
}
Both are modeled as normal errors, where the name is Error prefixed with the union's name. In fact, both the client and the server generate operation errors and event stream errors the same way.
Event stream errors have their own marshaller.
To make it work for users to stream errors, EventStreamSender<>
, in addition to the union type T
, takes an error type E
; that is, the AttemptCapturingPokemonEventError
in the example.
This means that an error from the stream is marshalled and sent as a data structure similarly to the union's non-error members.
On the other side, the Receiver<>
needs to terminate the stream upon receiving any error.
A terminated stream has no more data, and using it is always a bug.
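As a simplified, synchronous sketch of these semantics (divorced from the generated Receiver API): the first Err ends consumption of the stream and is surfaced to the caller.

fn collect_until_error<T, E>(
    stream: impl IntoIterator<Item = Result<T, E>>,
) -> Result<Vec<T>, E> {
    let mut events = Vec::new();
    for item in stream {
        match item {
            Ok(event) => events.push(event),
            // A modeled error terminates the stream: stop reading and return it
            Err(error) => return Err(error),
        }
    }
    Ok(events)
}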
An example of how errors can be used on clients, extracted from this test:
yield Err(AttemptCapturingPokemonEventError::new(
AttemptCapturingPokemonEventErrorKind::MasterBallUnsuccessful(MasterBallUnsuccessful::builder().build()),
Default::default()
));
Because unions can be used in input or output of more than one operation, errors must be generated once as they are in the error::
namespace.
Changes checklist
- Errors are in the error:: namespace and created as operation errors.
- Errors can be sent to the stream.
- Errors terminate the stream.
- Customers' experience using errors mirrors the Rust way: Err(error::StreamingError ...)
RFC: Service Builder Improvements
Status: Accepted
One might characterize smithy-rs
as a tool for transforming a Smithy service into a tower::Service builder. A Smithy model only partially defines the behavior of the generated service - handlers must be passed to the builder before the tower::Service is fully specified. This builder structure is the primary API surface we provide to the customer; as a result, it is important that it meets their needs.
This RFC proposes a new builder, deprecating the existing one, which addresses API deficiencies and takes steps to improve performance.
Terminology
- Model: A Smithy Model, usually pertaining to the one in use by the customer.
- Smithy Service: The entry point of an API that aggregates resources and operations together within a Smithy model. Described in detail here.
- Service: The
tower::Service
trait is an interface for writing network applications in a modular and reusable way.Service
s act on requests to produce responses. - Service Builder: A
tower::Service
builder, generated from a Smithy service, bysmithy-rs
- Middleware: Broadly speaking, middleware modify requests and responses. Concretely, these exist as implementations of Layer or of a Service wrapping an inner Service.
- Handler: A closure defining the behavior of a particular request after routing. These are provided to the service builder to complete the description of the service.
Background
To provide context for the proposal we perform a survey of the current state of affairs.
The following is a reference model we will use throughout the RFC:
operation Operation0 {
input: Input0,
output: Output0
}
operation Operation1 {
input: Input1,
output: Output1
}
@restJson1
service Service0 {
operations: [
Operation0,
Operation1,
]
}
We have purposely omitted details from the model that are unimportant to describing the proposal. We also omit distracting details from the Rust snippets. Code generation is linear in the sense that code snippets can be assumed to extend to multiple operations in a predictable way. Where we want to speak generally about an operation and its associated types, we use {Operation}; for example, {Operation}Input is the input type of an unspecified operation.
Here is a quick example of what a customer might write when using the service builder:
async fn handler0(input: Operation0Input) -> Operation0Output {
todo!()
}
async fn handler1(input: Operation1Input) -> Operation1Output {
todo!()
}
let app: Router = OperationRegistryBuilder::default()
// Use the setters
.operation0(handler0)
.operation1(handler1)
// Convert to `OperationRegistry`
.build()
.unwrap()
// Convert to `Router`
.into();
During the survey we touch on the major mechanisms used to achieve this API.
Handlers
A core concept in the service builder is the Handler
trait:
pub trait Handler<T, Input> {
async fn call(self, req: http::Request) -> http::Response;
}
Its purpose is to provide an even interface over closures of the form FnOnce({Operation}Input) -> impl Future<Output = {Operation}Output>
and FnOnce({Operation}Input, State) -> impl Future<Output = {Operation}Output>
. It's this abstraction which allows the customers to supply both async fn handler(input: {Operation}Input) -> {Operation}Output
and async fn handler(input: {Operation}Input, state: Extension<S>) -> {Operation}Output
to the service builder.
We generate Handler
implementations for said closures in ServerOperationHandlerGenerator.kt:
impl<Fun, Fut> Handler<(), Operation0Input> for Fun
where
Fun: FnOnce(Operation0Input) -> Fut,
Fut: Future<Output = Operation0Output>,
{
async fn call(self, request: http::Request) -> http::Response {
let input = /* Create `Operation0Input` from `request: http::Request` */;
// Use closure on the input
let output = self(input).await;
let response = /* Create `http::Response` from `output: Operation0Output` */
response
}
}
impl<Fun, Fut, S> Handler<Extension<S>, Operation0Input> for Fun
where
Fun: FnOnce(Operation0Input, Extension<S>) -> Fut,
Fut: Future<Output = Operation0Output>,
{
async fn call(self, request: http::Request) -> http::Response {
let input = /* Create `Operation0Input` from `request: http::Request` */;
// Use closure on the input and fetched extension data
let extension = Extension(request.extensions().get::<S>().clone());
let output = self(input, extension).await;
let response = /* Create `http::Response` from `output: Operation0Output` */
response
}
}
Creating {Operation}Input
from a http::Request
and http::Response
from a {Operation}Output
involves protocol aware serialization/deserialization, for example, it can involve the HTTP binding traits. The RuntimeError enumerates error cases such as serialization/deserialization failures, extensions().get::<T>()
failures, etc. We omit error handling in the snippet above, but, in full, it also involves protocol aware conversions from the RuntimeError
to http::Response
. The reader should make note of the influence of the model on the different sections of this procedure.
The request.extensions().get::<T>()
present in the Fun: FnOnce(Operation0Input, Extension<S>) -> Fut
implementation is the current approach to injecting state into handlers. The customer is required to apply an AddExtensionLayer to the output of the service builder so that, when the request reaches the handler, the extensions().get::<T>()
will succeed.
To convert the closures described above into a Service
an OperationHandler
is used:
pub struct OperationHandler<H, T, Input> {
handler: H,
}
impl<B, H, T, Input> Service<Request<B>> for OperationHandler<H, T, Input>
where
    H: Handler<T, Input>,
{
type Response = http::Response;
type Error = Infallible;
#[inline]
fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
Poll::Ready(Ok(()))
}
async fn call(&mut self, req: Request<B>) -> Result<Self::Response, Self::Error> {
self.handler.call(req).await.map(Ok)
}
}
Builder
The service builder we provide to the customer is the OperationRegistryBuilder
, generated from ServerOperationRegistryGenerator.kt.
Currently, the reference model would generate the following OperationRegistryBuilder
and OperationRegistry
:
pub struct OperationRegistryBuilder<Op0, In0, Op1, In1> {
    operation0: Option<Op0>,
    operation1: Option<Op1>,
}
pub struct OperationRegistry<Op0, In0, Op1, In1> {
    operation0: Op0,
    operation1: Op1,
}
The OperationRegistryBuilder
includes a setter per operation, and a fallible build
method:
impl<Op0, In0, Op1, In1> OperationRegistryBuilder<Op0, In0, Op1, In1> {
pub fn operation0(mut self, value: Op0) -> Self {
self.operation0 = Some(value);
self
}
pub fn operation1(mut self, value: Op1) -> Self {
self.operation1 = Some(value);
self
}
pub fn build(
self,
) -> Result<OperationRegistry<Op0, In0, Op1, In1>, OperationRegistryBuilderError> {
Ok(OperationRegistry {
operation0: self.operation0.ok_or(/* OperationRegistryBuilderError */)?,
operation1: self.operation1.ok_or(/* OperationRegistryBuilderError */)?,
})
}
}
The OperationRegistry
does not include any methods of its own, however it does enjoy a From<OperationRegistry> for Router<B>
implementation:
impl<B, Op0, In0, Op1, In1> From<OperationRegistry<B, Op0, In0, Op1, In1>> for Router<B>
where
Op0: Handler<B, In0, Operation0Input>,
Op1: Handler<B, In1, Operation1Input>,
{
fn from(registry: OperationRegistry<B, Op0, In0, Op1, In1>) -> Self {
let operation0_request_spec = /* Construct Operation0 routing information */;
let operation1_request_spec = /* Construct Operation1 routing information */;
// Convert handlers into boxed services
let operation0_svc = Box::new(OperationHandler::new(registry.operation0));
let operation1_svc = Box::new(OperationHandler::new(registry.operation1));
// Initialize the protocol specific router
// We demonstrate it here with `new_rest_json_router`, but note that there is a different router constructor
// for each protocol.
aws_smithy_http_server::routing::Router::new_rest_json_router(vec![
(
operation0_request_spec,
operation0_svc
),
(
operation1_request_spec,
operation1_svc
)
])
}
}
Router
The aws_smithy_http::routing::Router provides the protocol-aware routing of requests to their target; it exists as
pub struct Route {
service: Box<dyn Service<http::Request, Response = http::Response>>,
}
enum Routes {
RestXml(Vec<(Route, RequestSpec)>),
RestJson1(Vec<(Route, RequestSpec)>),
AwsJson1_0(TinyMap<String, Route>),
AwsJson11(TinyMap<String, Route>),
}
pub struct Router {
routes: Routes,
}
and enjoys the following Service<http::Request>
implementation:
impl Service<http::Request> for Router
{
type Response = http::Response;
type Error = Infallible;
fn poll_ready(&mut self, _: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
Poll::Ready(Ok(()))
}
async fn call(&mut self, request: http::Request) -> Result<Self::Response, Self::Error> {
match &self.routes {
Routes::/* protocol */(routes) => {
let route: Result<Route, _> = /* perform route matching logic */;
match route {
Ok(ok) => ok.oneshot().await,
Err(err) => /* Convert routing error into http::Response */
}
}
}
}
}
Alongside the protocol specific constructors, Router includes a layer method. This provides a way for the customer to apply a tower::Layer to all routes. For every protocol, Router::layer has approximately the same behavior:
let new_routes = old_routes
.into_iter()
// Apply the layer
.map(|route| layer.layer(route))
// Re-box the service, to restore `Route` type
.map(|svc| Box::new(svc))
// Collect the iterator back into a collection (`Vec` or `TinyMap`)
.collect();
Comparison to Axum
Historically, smithy-rs
has borrowed from axum. Despite various divergences the code bases still have much in common:
- Reliance on a Handler trait to abstract over different closure signatures.
- A mechanism for turning an H: Handler into a tower::Service.
- A Router to route requests to various handlers.
To identify where the implementations should differ we should classify in what ways the use cases differ. There are two primary areas which we describe below.
Extractors and Responses
In axum
there is a notion of Extractor, which allows the customer to easily define a decomposition of an incoming http::Request
by specifying the arguments to the handlers. For example,
async fn request(Json(payload): Json<Value>, Query(params): Query<HashMap<String, String>>, headers: HeaderMap) {
todo!()
}
is a valid handler - each argument satisfies the axum::extract::FromRequest trait and therefore satisfies one of axum's blanket Handler implementations:
macro_rules! impl_handler {
    ( $($ty:ident),* $(,)? ) => {
        impl<F, Fut, Res, $($ty,)*> Handler<($($ty,)*)> for F
        where
            F: FnOnce($($ty,)*) -> Fut + Clone + Send + 'static,
            Fut: Future<Output = Res> + Send,
            Res: IntoResponse,
            $( $ty: FromRequest + Send,)*
        {
            fn call(self, req: http::Request) -> Self::Future {
                async {
                    let mut req = RequestParts::new(req);
                    $(
                        let $ty = match $ty::from_request(&mut req).await {
                            Ok(value) => value,
                            Err(rejection) => return rejection.into_response(),
                        };
                    )*
                    let res = self($($ty,)*).await;
                    res.into_response()
                }
            }
        }
    };
}
The implementations of Handler
in axum
and smithy-rs
follow a similar pattern - convert http::Request
into the closure's input, run the closure, convert the output of the closure to http::Response
.
In smithy-rs
we do not need a general notion of "extractor" - the http::Request
decomposition is specified by the Smithy model, whereas in axum
it's defined by the handler's signature. Despite the Smithy specification, the customer may still want an "escape hatch" that allows access to data outside of the Smithy service inputs; for this reason we should continue to support a restricted notion of extractor. This will help support use cases such as passing lambda_http::Context through to the handler despite it not being modeled in the Smithy model.
Dual to FromRequest
is the axum::response::IntoResponse trait. This plays the role of converting the output of the handler to http::Response
. Again, the difference between axum
and smithy-rs
is that smithy-rs
has the conversion from {Operation}Output
to http::Response
specified by the Smithy model, whereas in axum
the customer is free to specify a return type which implements axum::response::IntoResponse
.
Routing
The Smithy model not only specifies the http::Request
decomposition and http::Response
composition for a given service, it also determines the routing. The From<OperationRegistry>
implementation, described in Builder, yields a fully formed router based on the protocol and http traits specified.
This is in contrast to axum
, where the user specifies the routing by use of various combinators included on the axum::Router
, applied to other tower::Service
s. In an axum
application one might encounter the following code:
let user_routes = Router::new().route("/:id", /* service */);
let team_routes = Router::new().route("/", /* service */);
let api_routes = Router::new()
.nest("/users", user_routes)
.nest("/teams", team_routes);
let app = Router::new().nest("/api", api_routes);
Note that, in axum
handlers are eagerly converted to a tower::Service
(via IntoService
) before they are passed into the Router
. In contrast, in smithy-rs
, handlers are passed into a builder and then the conversion to tower::Service
is performed (via OperationHandler
).
Introducing state to handlers in axum
is done in the same way as smithy-rs
, described briefly in Handlers - a layer is used to insert state into incoming http::Request
s and the Handler
implementation pops it out of the type map layer. In axum
, if a customer wanted to scope state to all routes within /users/
they are able to do the following:
async fn handler(Extension(state): Extension</* State */>) -> /* Return Type */ {}
let api_routes = Router::new()
.nest("/users", user_routes.layer(Extension(/* state */)))
.nest("/teams", team_routes);
In smithy-rs
a customer is only able to apply a layer around the aws_smithy_http::routing::Router
or around every route via the layer method described above.
Proposal
The proposal is presented as a series of compatible transforms to the existing service builder, each paired with a motivation. Most of these can be independently implemented, and it is stated in the cases where an interdependency exists.
Although presented as a mutation to the existing service builder, the actual implementation should exist as an entirely separate builder, living in a separate namespace, reusing code generation from the old builder, while exposing a new Rust API. Preserving the old API surface will prevent breakage and make it easier to perform comparative benchmarks and testing.
Remove two-step build procedure
As described in Builder, the customer is required to perform two conversions. One from OperationRegistryBuilder
via OperationRegistryBuilder::build
, the second from OperationRegistry
to Router
via the From<OperationRegistry> for Router
implementation. The intermediary stop at OperationRegistry
is not required and can be removed.
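As a hedged sketch (reusing handler0 and handler1 from the example above), the builder would then convert directly into the Router:

let app: Router = OperationRegistryBuilder::default()
    .operation0(handler0)
    .operation1(handler1)
    // No intermediate `OperationRegistry`; the conversion (here still via `Into`) is applied directly
    .into();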
Statically check for missing Handlers
As described in Builder, the OperationRegistryBuilder::build
method is fallible - it yields a runtime error when one of the handlers has not been set.
pub fn build(
self,
) -> Result<OperationRegistry<Op0, In0, Op1, In1>, OperationRegistryBuilderError> {
Ok(OperationRegistry {
operation0: self.operation0.ok_or(/* OperationRegistryBuilderError */)?,
operation1: self.operation1.ok_or(/* OperationRegistryBuilderError */)?,
})
}
We can do away with fallibility if we allow Op0 and Op1 to switch types during the build and remove the Option from around the fields. The OperationRegistryBuilder then becomes
struct OperationRegistryBuilder<Op0, In0, Op1, In1> {
    operation0: Op0,
    operation1: Op1,
}
impl<Op0, In0, Op1, In1> OperationRegistryBuilder<Op0, In0, Op1, In1> {
    pub fn operation0<NewOp0>(self, value: NewOp0) -> OperationRegistryBuilder<NewOp0, In0, Op1, In1> {
        OperationRegistryBuilder {
            operation0: value,
            operation1: self.operation1,
        }
    }
    pub fn operation1<NewOp1>(self, value: NewOp1) -> OperationRegistryBuilder<Op0, In0, NewOp1, In1> {
        OperationRegistryBuilder {
            operation0: self.operation0,
            operation1: value,
        }
    }
}
impl<B, Op0, In0, Op1, In1> OperationRegistryBuilder<Op0, In0, Op1, In1>
where
    Op0: Handler<B, In0, Operation0Input>,
    Op1: Handler<B, In1, Operation1Input>,
{
    pub fn build(self) -> OperationRegistry<Op0, In0, Op1, In1> {
        OperationRegistry {
            operation0: self.operation0,
            operation1: self.operation1,
        }
    }
}
The customer will now get a compile time error rather than a runtime error when they fail to specify a handler.
Switch From<OperationRegistry> for Router
to an OperationRegistry::build
method
To construct a Router
, the customer must either give a type ascription
let app: Router = /* Service builder */.into();
or be explicit about the Router
namespace
let app = Router::from(/* Service builder */);
If we switch from a From<OperationRegistry> for Router
to a build
method on OperationRegistry
the customer may simply
let app = /* Service builder */.build();
There already exists a build
method taking OperationRegistryBuilder
to OperationRegistry
, this is removed in Remove two-step build procedure. These two transforms pair well together for this reason.
Operations as Middleware Constructors
As mentioned in Comparison to Axum: Routing and Handlers, the smithy-rs
service builder accepts handlers and only converts them into a tower::Service
during the final conversion into a Router
. There are downsides to this:
- The customer has no opportunity to apply middleware to a specific operation before they are all collected into the Router. The Router does have a layer method, described in Router, but this applies the middleware uniformly across all operations.
- The builder has no way to apply middleware around customer applied middleware. A concrete example of where this would be useful is described in the Middleware Position section of RFC: Logging in the Presence of Sensitive Data.
- The customer has no way of expressing readiness of the underlying operation - all handlers are converted to services with Service::poll_ready returning Poll::Ready(Ok(()))
The three use cases described above are supported by axum
by virtue of the Router::route method accepting a tower::Service
. The reader should consider a similar approach where the service builder setters accept a tower::Service<http::Request, Response = http::Response>
rather than the Handler
.
Throughout this section we purposely ignore the existence of handlers accepting state alongside the {Operation}Input
; this class of handlers serves as a distraction and can be accommodated with small perturbations to each approach.
Approach A: Customer uses OperationHandler::new
It's possible to make progress with a small changeset by requiring that the customer eagerly use OperationHandler::new rather than it being applied internally within From<OperationRegistry> for Router (see Handlers). The setter would then become:
pub struct OperationRegistryBuilder<Op0, Op1> {
    operation0: Option<Op0>,
    operation1: Option<Op1>,
}
impl<Op0, Op1> OperationRegistryBuilder<Op0, Op1> {
    pub fn operation0(mut self, value: Op0) -> Self {
        self.operation0 = Some(value);
        self
    }
}
The API usage would then become
async fn handler0(input: Operation0Input) -> Operation0Output {
todo!()
}
// Create a `Service<http::Request, Response = http::Response, Error = Infallible>` eagerly
let svc = OperationHandler::new(handler0);
// Middleware can be applied at this point
let operation0 = /* An HTTP `tower::Layer` */.layer(svc);
OperationRegistryBuilder::default()
.operation0(operation0)
/* ... */
Note that this requires that the OperationRegistryBuilder
stores services, rather than Handler
s. An unintended and superficial benefit of this is that we are able to drop In{n}
from the OperationRegistryBuilder<Op0, In0, Op1, In1>
- only Op{n}
remains and it parametrizes each operation's tower::Service
.
It is still possible to retain the original API which accepts Handler
by introducing the following setters:
impl<Op0, Op1> OperationRegistryBuilder<Op0, Op1> {
    fn operation0_handler<H: Handler>(self, handler: H) -> OperationRegistryBuilder<OperationHandler<H>, Op1> {
        OperationRegistryBuilder {
            operation0: OperationHandler::new(handler),
            operation1: self.operation1,
        }
    }
}
There are two points at which the customer might want to apply middleware: around tower::Service<{Operation}Input, Response = {Operation}Output>
and tower::Service<http::Request, Response = http::Response>
, that is, before and after the serialization/deserialization is performed. The change described only succeeds in the latter, and therefore is only a partial solution to (1).
This solves (2), the service builder may apply additional middleware around the service.
This does not solve (3), as the customer is not able to provide a tower::Service<{Operation}Input, Response = {Operation}Output>
.
Approach B: Operations as Middleware
In order to achieve all three we model operations as middleware:
pub struct Operation0<S> {
inner: S,
}
impl<S> Service<http::Request> for Operation0<S>
where
S: Service<Operation0Input, Response = Operation0Output, Error = Infallible>
{
type Response = http::Response;
type Error = Infallible;
fn poll_ready(&mut self, cx: &mut Context) -> Poll<Result<(), Self::Error>> {
// We defer to the inner service for readiness
self.inner.poll_ready(cx)
}
async fn call(&mut self, request: http::Request) -> Result<Self::Response, Self::Error> {
let input = /* Create `Operation0Input` from `request: http::Request` */;
let output = self.inner.call(input).await;
let response = /* Create `http::Response` from `output: Operation0Output` */
response
}
}
Notice the similarity between this and the OperationHandler
, the only real difference being that we hold an inner service rather than a closure. In this way we have separated all model aware serialization/deserialization, we noted in Handlers, into this middleware.
A consequence of this is that the user Operation0
must have two constructors:
- from_service, which takes a tower::Service<Operation0Input, Response = Operation0Output>.
- from_handler, which takes an async Operation0Input -> Operation0Output.
A brief example of how this might look:
use tower::util::{ServiceFn, service_fn};
impl<S> Operation0<S> {
pub fn from_service(inner: S) -> Self {
Self {
inner,
}
}
}
impl<F> Operation0<ServiceFn<F>> {
pub fn from_handler(inner: F) -> Self {
// Using `service_fn` here isn't strictly correct - there is slight misalignment of closure signatures. This
// still serves to illustrate the proposal.
Operation0::from_service(service_fn(inner))
}
}
The API usage then becomes:
async fn handler(input: Operation0Input) -> Operation0Output {
todo!()
}
// These are both `tower::Service` and hence can have middleware applied to them
let operation_0 = Operation0::from_handler(handler);
let operation_1 = Operation1::from_service(/* some service */);
OperationRegistryBuilder::default()
.operation0(operation_0)
.operation1(operation_1)
/* ... */
Approach C: Operations as Middleware Constructors
While Approach B solves all three problems, it fails to adequately model the Smithy semantics. An operation cannot uniquely define a tower::Service without reference to a parent Smithy service - information concerning serialization/deserialization and error modes is inherited from the Smithy service an operation is used within. In this way, Operation0 should not be a standalone middleware, but should become middleware once accepted by the service builder.
Any solution which provides an {Operation}
structure and wishes it to be accepted by multiple service builders must deal with this problem. We currently build one library per service and hence have duplicate structures when service closures overlap. This means we wouldn't run into this problem today, but it would be a future obstruction if we wanted to reduce the amount of generated code.
use tower::layer::util::{Stack, Identity};
use tower::util::{ServiceFn, service_fn};
// This takes the same form as `Operation0` defined in the previous attempt. The difference being that this is now
// private.
struct Service0Operation0<S> {
inner: S
}
impl<S> Service<http::Request> for Service0Operation0<S>
where
S: Service<Operation0Input, Response = Operation0Output, Error = Infallible>
{
/* Same as above */
}
pub struct Operation0<S, L> {
inner: S,
layer: L
}
impl<S> Operation0<S, Identity> {
pub fn from_service(inner: S) -> Self {
Self {
inner,
layer: Identity
}
}
}
impl<F> Operation0<ServiceFn<F>, Identity> {
pub fn from_handler(inner: F) -> Self {
Operation0::from_service(service_fn(inner))
}
}
impl<S, L> Operation0<S, L> {
pub fn layer<NewL>(self, layer: NewL) -> Operation0<S, Stack<L, NewL>> {
Operation0 {
inner: self.inner,
layer: Stack::new(self.layer, layer)
}
}
pub fn logging(self, /* args */) -> Operation0<S, Stack<L, LoggingLayer>> {
Operation0 {
inner: self.inner,
layer: Stack::new(self.layer, LoggingLayer::new(/* args */))
}
}
pub fn auth(self, /* args */) -> Operation0<S, Stack<L, AuthLayer>> {
Operation0 {
inner: self.inner,
layer: Stack::new(self.layer, /* Construct auth middleware */)
}
}
}
impl<Op1, Op2> OperationRegistryBuilder<Op1, Op2> {
    pub fn operation0<S, L>(self, operation: Operation0<S, L>) -> OperationRegistryBuilder<<L as Layer<Service0Operation0<S>>>::Service, Op2>
    where
        L: Layer<Service0Operation0<S>>,
    {
        // Convert `Operation0` to a `tower::Service`
        let http_svc = Service0Operation0 { inner: operation.inner };
        // Apply the accumulated layers and store the resulting service
        OperationRegistryBuilder {
            operation0: operation.layer.layer(http_svc),
            operation1: self.operation1,
        }
}
}
Notice that we get some additional type safety here when compared to Approach A and Approach B - operation0
accepts a Operation0
rather than a general tower::Service
. We also get a namespace to include utility methods - notice the logging
and auth
methods.
The RFC favours this approach out of all those presented.
Approach D: Add more methods to the Service Builder
An alternative to Approach C is to simply add more methods to the service builder while internally storing a tower::Service
:
- operation0_from_service, which accepts a tower::Service<Operation0Input, Response = Operation0Output>.
- operation0_from_handler, which accepts an async Fn(Operation0Input) -> Operation0Output.
- operation0_layer, which accepts a tower::Layer<Op0>.
This is functionally similar to Approach C except that all composition is done internally by the service builder and the namespace exists in the method name, rather than the {Operation}
struct.
Service parameterized Routers
Currently the Router
stores Box<dyn tower::Service<http::Request, Response = http::Response>>. As a result the Router::layer method, seen in Router, must re-box a service after every tower::Layer is applied. The heap allocation Box::new
itself is not cause for concern because Router
s are typically constructed once at startup, however one might expect the indirection to regress performance when the server is running.
Having the service type parameterized as Router<S>
allows us to write:
impl<S> Router<S> {
fn layer<L>(self, layer: &L) -> Router<L::Service>
where
L: Layer<S>
{
/* Same internal implementation without boxing */
}
}
Protocol specific Routers
Currently there is a single Router
structure, described in Router, situated in the rust-runtime/aws-smithy-http-server
crate, which is output by the service builder. This, roughly, takes the form of an enum
listing the different protocols.
#[derive(Debug)]
enum Routes {
    RestXml(/* Container */),
    RestJson1(/* Container */),
    AwsJson1_0(/* Container */),
    AwsJson1_1(/* Container */),
}
Recall the form of the Service::call
method, given in Router, which involved matching on the protocol and then performing protocol specific logic.
Two downsides of modelling Router
in this way are:
- Router is larger and has more branches than a protocol specific implementation.
- If a third-party wanted to extend smithy-rs to additional protocols, Routes would have to be extended. A synopsis of this obstruction is presented in the Should we generate the Router type issue.
After taking the Switch From<OperationRegistry> for Router
to an OperationRegistry::build
method transform, code generation is free to switch between return types based on the model. This allows for a scenario where a @restJson1
causes the service builder to output a specific RestJson1Router
.
Protocol specific Errors
Currently, protocol specific routing errors are either:
- Converted to RuntimeErrors and then http::Response (see unknown_operation).
- Converted directly to a http::Response (see method_not_allowed). This is an outlier to the common pattern.
The from_request
functions yield protocol specific errors which are converted to RequestRejection
s then RuntimeError
s (see ServerHttpBoundProtocolGenerator.kt).
In these scenarios protocol specific errors are converted into RuntimeError
before being converted to a http::Response
via into_response
method.
Two downsides of this are:
- RuntimeError enumerates all possible errors across all existing protocols, so is larger than modelling the errors for a specific protocol.
- If a third-party wanted to extend smithy-rs to additional protocols with differing failure modes, RuntimeError would have to be extended. As in Protocol specific Routers, a synopsis of this obstruction is presented in the Should we generate the Router type issue.
Switching from using RuntimeError to protocol specific errors which satisfy a common interface, IntoResponse, would resolve these problems.
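A hedged sketch of what such an interface could look like; IntoResponse is named in this RFC, but RestJson1RoutingError and the String body are illustrative placeholders rather than actual smithy-rs types:

use http::{Response, StatusCode};

pub trait IntoResponse {
    fn into_response(self) -> Response<String>;
}

// A protocol specific routing error only needs to enumerate its own failure modes
pub enum RestJson1RoutingError {
    NotFound,
    MethodNotAllowed,
}

impl IntoResponse for RestJson1RoutingError {
    fn into_response(self) -> Response<String> {
        let status = match self {
            RestJson1RoutingError::NotFound => StatusCode::NOT_FOUND,
            RestJson1RoutingError::MethodNotAllowed => StatusCode::METHOD_NOT_ALLOWED,
        };
        Response::builder()
            .status(status)
            .header("content-type", "application/json")
            .body("{}".to_string())
            .expect("valid response")
    }
}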
Type erasure with the name of the Smithy service
Currently the service builder is named OperationRegistryBuilder
. Despite the name being model agnostic, the OperationRegistryBuilder
mutates when the associated service mutates. Renaming OperationRegistryBuilder
to {Service}Builder
would reflect the relationship between the builder and the Smithy service and prevent naming conflicts if multiple service builders are to exist in the same namespace.
Similarly, the output of the service builder is Router
. This ties the output of the service builder to a structure in rust-runtime
. Introducing a type erasure here around Router
using a newtype named {Service}
would:
- Ensure we are free to change the implementation of {Service} without changing the Router implementation.
- Hide the router type, which is determined by the protocol specified in the model.
- Allow us to put a builder method on {Service} which returns {Service}Builder.
This is compatible with Protocol specific Routers: we simply newtype the protocol specific router rather than Router.
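A minimal sketch of the newtype, using placeholder types for the protocol specific router and the generated builder:

// Placeholders standing in for the protocol specific router and generated builder
pub struct RestJson1Router;
#[derive(Default)]
pub struct Service0Builder;

// The router type never appears in the public API, so it is free to change
pub struct Service0 {
    router: RestJson1Router,
}

impl Service0 {
    pub fn builder() -> Service0Builder {
        Service0Builder::default()
    }
}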
With both of these changes the API would take the form:
let service_0: Service0 = Service0::builder()
/* use the setters */
.build()
.unwrap()
.into();
With Remove two-step build procedure, Switch From<OperationRegistry> for Router
to a OperationRegistry::build
method, and Statically check for missing Handlers we obtain the following API:
let service_0: Service0 = Service0::builder()
/* use the setters */
.build();
Combined Proposal
A combination of all the proposed transformations results in the following API:
struct Context {
/* fields */
}
async fn handler(input: Operation0Input) -> Operation0Output {
todo!()
}
async fn handler_with_ext(input: Operation0Input, extension: Extension<Context>) -> Operation0Output {
todo!()
}
struct Operation1Service {
/* fields */
}
impl Service<Operation1Input> for Operation1Service {
type Response = Operation1Output;
/* implementation */
}
struct Operation1ServiceWithExt {
/* fields */
}
impl Service<(Operation1Input, Extension<Context>)> for Operation1ServiceWithExt {
type Response = Operation1Output;
/* implementation */
}
// Create an operation from a handler
let operation_0 = Operation0::from_handler(handler);
// Create an operation from a handler with extension
let operation_0 = Operation0::from_handler(handler_with_ext);
// Create an operation from a `tower::Service`
let operation_1_svc = Operation1Service { /* initialize */ };
let operation_1 = Operation1::from_service(operation_1_svc);
// Create an operation from a `tower::Service` with extension
let operation_1_svc = Operation1ServiceWithExt { /* initialize */ };
let operation_1 = Operation1::from_service(operation_1_svc);
// Apply a layer
let operation_0 = operation_0.layer(/* layer */);
// Use the service builder
let service_0 = Service0::builder()
.operation_0(operation_0)
.operation_1(operation_1)
.build();
A toy implementation of the combined proposal is presented in this PR.
Changes Checklist
- Add protocol specific routers to rust-runtime/aws-smithy-http-server.
- Add middleware primitives and error types to rust-runtime/aws-smithy-http-server.
- Add code generation which outputs the new service builder.
- Deprecate OperationRegistryBuilder, OperationRegistry and Router.
RFC: Dependency Versions
Status: Accepted
Applies to: Client and Server
This RFC outlines how Rust dependency versions are selected for the smithy-rs project, and strives to meet the following semi-conflicting goals:
- Dependencies are secure
- Vended libraries have dependency ranges that overlap other Rust libraries as much as possible
When in conflict, the security goal takes priority over the compatibility goal.
Categorization of Crates
The Rust crates within smithy-rs can be divided up into two categories:
- Library Crates: Crates that are published to crates.io with the intention that other projects will depend on them via their Cargo.toml files. This category does NOT include binaries that are published to crates.io with the intention of being installed with cargo install.
- Application Crates: All examples, binaries, tools, standalone tests, or other crates that are not published to crates.io with the intent of being depended on by other projects.
All generated crates must be considered library crates even if they're not published since they are intended to be pulled into other Rust projects with other dependencies.
Support crates for Applications
The aws-smithy-http-server-python
crate doesn't fit the categorization rules well since
it is a runtime crate for a generated Rust application with bindings to Python. This RFC
establishes this crate as an application crate since it needs to pull in application-specific
dependencies such as tracing-subscriber
in order to implement its full feature set.
Dependency Version Rules
Application crates should use the latest versions of dependencies, but must use a version greater than or equal to the minimum secure version as determined by the RUSTSEC advisories database. Library crates must use the minimum secure version. This is illustrated at a high level below:
graph TD
    S[Add Dependency] --> T{Crate Type?}
    T -->|Application Crate?| A[Use latest version]
    T -->|Library Crate?| L[Use minimum secure version]
What is a minimum secure version when there are multiple major versions?
If a dependency has multiple supported major versions, then the latest major version must be selected unless there is a compelling reason to do otherwise (such as the previous major version having been previously exposed in our public API). Choosing newer major versions will reduce the amount of upgrade work that needs to be done at a later date when support for the older version is inevitably dropped.
Changes Checklist
Some work needs to be done to establish these guidelines:
- Establish automation for enforcing minimum secure versions for the direct dependencies of library crates
RFC: Error Context and Compatibility
Status: Implemented
Applies to: Generated clients and shared rust-runtime crates
This RFC proposes a pattern for writing Rust errors to provide consistent error context AND forwards/backwards compatibility. The goal is to strike a balance between these four goals:
- Errors are forwards compatible, and changes to errors are backwards compatible
- Errors are idiomatic and ergonomic. It is easy to match on them and extract additional information for cases where that's useful. The type system prevents errors from being used incorrectly (for example, incorrectly retrieving context for a different error variant)
- Error messages are easy to debug
- Errors implement best practices with Rust's
Error
trait (for example, implementing the optionalsource()
function where possible)
Note: This RFC is not about error backwards compatibility when it comes to error serialization/deserialization for transfer over the wire. The Smithy protocols cover that aspect.
Past approaches in smithy-rs
This section examines some examples found in aws-config
that illustrate different problems
that this RFC will attempt to solve, and calls out what was done well, and what could be improved upon.
Case study: InvalidFullUriError
To start, let's examine InvalidFullUriError
(doc comments omitted):
#[derive(Debug)]
#[non_exhaustive]
pub enum InvalidFullUriError {
#[non_exhaustive] InvalidUri(InvalidUri),
#[non_exhaustive] NoDnsService,
#[non_exhaustive] MissingHost,
#[non_exhaustive] NotLoopback,
DnsLookupFailed(io::Error),
}
impl Display for InvalidFullUriError {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
match self {
InvalidFullUriError::InvalidUri(err) => write!(f, "URI was invalid: {}", err),
InvalidFullUriError::MissingHost => write!(f, "URI did not specify a host"),
// ... omitted ...
}
}
}
impl Error for InvalidFullUriError {
fn source(&self) -> Option<&(dyn Error + 'static)> {
match self {
InvalidFullUriError::InvalidUri(err) => Some(err),
InvalidFullUriError::DnsLookupFailed(err) => Some(err),
_ => None,
}
}
}
This error does a few things well:
- Using
#[non_exhaustive]
on the enum allows new errors to be added in the future. - Breaking out different error types allows for more useful error messages, potentially with error-specific context. Customers can match on these different error variants to change their program flow, although it's not immediately obvious if such use cases exist for this error.
- The error cause is available through the
Error::source()
impl for variants that have a cause.
However, there are also a number of things that could be improved:
- All tuple/struct enum members are public, and
InvalidUri
is an error from thehttp
crate. Exposing a type from another crate can potentially lock the GA SDK into a specific crate version if breaking changes are ever made to the exposed types. In this specific case, it prevents using alternate HTTP implementations that don't use thehttp
crate. DnsLookupFailed
is missing#[non_exhaustive]
, so new members can never be added to it.- Use of enum tuples, even with
#[non_exhaustive]
, adds friction to evolving the API since the tuple members cannot be named. - Printing the source error in the
Display
impl leads to error repetition by reporters that examine the full source chain. - The
source()
impl has a_
match arm, which means future implementers could forget to propagate a source when adding new error variants. - The error source can be downcasted to
InvalidUri
type fromhttp
in customer code. This is a leaky abstraction where customers can start to rely on the underlying library the SDK uses in its implementation, and if that library is replaced/changed, it can silently break the customer's application. Note: later in the RFC, I'll demonstrate why fixing this issue is not practical.
Case study: ProfileParseError
Next, let's look at a much simpler error. The ProfileParseError
is focused purely on the parsing
logic for the SDK config file:
#[derive(Debug, Clone)]
pub struct ProfileParseError {
location: Location,
message: String,
}
impl Display for ProfileParseError {
fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
write!(
f,
"error parsing {} on line {}:\n {}",
self.location.path, self.location.line_number, self.message
)
}
}
impl Error for ProfileParseError {}
What this error does well:
- The members are private, so
#[non_exhaustive]
isn't even necessary - The error is completely opaque (maximizing compatibility) while still being debuggable thanks to the flexible messaging
What could be improved:
- It needlessly implements
Clone
, which may prevent it from holding an error source in the future since errors are often notClone
. - In the future, if more error variants are needed, a private inner error kind enum could be added to change messaging, but there's not a nice way to expose new variant-specific information to the customer.
- Programmatic access to the error
Location
may be desired, but this can be trivially added in the future without a breaking change by adding an accessor method.
Case study: code generated client errors
The SDK currently generates errors such as the following (from S3):
#[non_exhaustive]
pub enum Error {
BucketAlreadyExists(BucketAlreadyExists),
BucketAlreadyOwnedByYou(BucketAlreadyOwnedByYou),
InvalidObjectState(InvalidObjectState),
NoSuchBucket(NoSuchBucket),
NoSuchKey(NoSuchKey),
NoSuchUpload(NoSuchUpload),
NotFound(NotFound),
ObjectAlreadyInActiveTierError(ObjectAlreadyInActiveTierError),
ObjectNotInActiveTierError(ObjectNotInActiveTierError),
Unhandled(Box<dyn Error + Send + Sync + 'static>),
}
Each error variant gets its own struct, which can hold error-specific contextual information.
Except for the Unhandled
variant, both the error enum and the details on each variant are extensible.
The Unhandled
variant should move the error source into a struct so that its type can be hidden.
Otherwise, the code generated errors are already aligned with the goals of this RFC.
Approaches from other projects
std::io::Error
The standard library uses an Error
struct with an accompanying ErrorKind
enum
for its IO error. Roughly:
#[derive(Debug)]
#[non_exhaustive]
pub enum ErrorKind {
    NotFound,
    // ... omitted ...
    Other,
}

#[derive(Debug)]
pub struct Error {
    kind: ErrorKind,
    source: Box<dyn std::error::Error + Send + Sync>,
}
What this error does well:
- It is extensible since the
ErrorKind
is non-exhaustive - It has an
Other
error type that can be instantiated by users in unit tests, making it easier to unit test error handling
What could be improved:
- There isn't an ergonomic way to add programmatically accessible error-specific context to this error in the future
- The source error can be downcasted, which could be a trap for backwards compatibility.
Hyper 1.0
Hyper has outlined some problems they want to address with errors for the coming 1.0 release. To summarize:
- It's difficult to match on specific errors (hyper 0.x's Error relies on is_x methods for error matching rather than enum matching).
- Error reporters duplicate information since the hyper 0.x errors include the display of their error sources.
- Error::source() can leak internal dependencies.
Opaque Error Sources
There is discussion in the errors working group about how to avoid leaking internal dependency error types through error source downcasting. One option is to create an opaque error wrapping new-type that removes the ability to downcast to the other library's error. This, however, can be circumvented via unsafe code, and also breaks the ability for error reporters to properly display the error (for example, if the error has backtrace information, that would be inaccessible to the reporter).
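For reference, a minimal sketch of one possible shape of such a wrapping new-type (illustrative only) is shown below; note how it deliberately exposes no source, which is exactly what hurts error reporters:
#[derive(Debug)]
pub struct OpaqueError {
    // The concrete library error stays private and is never returned from source()
    inner: Box<dyn std::error::Error + Send + Sync + 'static>,
}
impl std::fmt::Display for OpaqueError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // Forward the message, but nothing else: no downcastable source, no backtrace
        write!(f, "{}", self.inner)
    }
}
impl std::error::Error for OpaqueError {}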
This situation might improve if the nightly request_value
/request_ref
/provide
functions on
std::error::Error
are stabilized, since then contextual information needed for including things
such as a backtrace could still be retrieved through the opaque error new-type.
This RFC proposes that error types from other libraries not be directly exposed in the API, but rather,
be exposed indirectly through Error::source
as &dyn Error + 'static
.
Errors should not require downcasting to be useful. Downcasting the error's source should be a last resort, with the understanding that the type could change at a later date with no compile-time guarantees.
Error Proposal
Taking a customer's perspective, there are two broad categories of errors:
- Actionable: Errors that can/should influence program flow; where it's useful to do different work based on additional error context or error variant information
- Informative: Errors that inform that something went wrong, but where it's not useful to match on the error to change program flow
This RFC proposes that a consistent pattern be introduced to cover these two use cases for all errors in the public API for the Rust runtime crates and generated client crates.
Actionable error pattern
Actionable errors are represented as enums. If an error variant has an error source or additional contextual information, it must use a separate context struct that is referenced via tuple in the enum. For example:
// Good: new error types can be added in the future
#[non_exhaustive]
pub enum Error {
// Good: This is exhaustive and uses a tuple, but its sole member is an extensible struct with private fields
VariantA(VariantA),
// Bad: The fields are directly exposed and can't have accessor methods. The error
// source type also can't be changed at a later date since it is part of the public API.
#[non_exhaustive]
VariantB {
some_additional_info: u32,
source: AnotherError // AnotherError is from this crate
},
// Bad: There's no way to add additional contextual information to this error in the future, even
// though it is non-exhaustive. Changing it to a tuple or struct later leads to compile errors in existing
// match statements.
#[non_exhaustive]
VariantC,
// Bad: Not extensible if additional context is added later (unless that context can be added to `AnotherError`)
#[non_exhaustive]
VariantD(AnotherError),
// Bad: Not extensible. If new context is added later (for example, a second endpoint), there's no way to name it.
#[non_exhaustive]
VariantE(Endpoint, AnotherError),
// Bad: Exposes another library's error type in the public API,
// which makes upgrading or replacing that library a breaking change
#[non_exhaustive]
VariantF {
source: http::uri::InvalidUri
},
// Bad: The error source type is public, and even though it's a boxed error, it won't
// be possible to change it to an opaque error type later (for example, if/when
// opaque errors become practical due to standard library stabilizations).
#[non_exhaustive]
VariantG {
source: Box<dyn Error + Send + Sync + 'static>,
}
}
pub struct VariantA {
some_field: u32,
// This is private, so it's fine to reference the external library's error type
source: http::uri::InvalidUri
}
impl VariantA {
fn some_field(&self) -> u32 {
self.some_field
}
}
Error variants that contain a source must return it from the Error::source
method.
The source
implementation should not use the catch all (_
) match arm, as this makes it easy to miss
adding a new error variant's source at a later date.
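For illustration, a minimal sketch of such a source() implementation for the enum above (abbreviated in the same way as the Display examples that follow) could look like:
impl std::error::Error for Error {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Self::VariantA(context) => Some(&context.source),
            Self::VariantB { source, .. } => Some(source),
            // Variants without a source return None explicitly rather than falling
            // through a `_` arm, so adding a new variant forces this match to be revisited
            Self::VariantC => None,
            // ... and so on for the remaining variants
        }
    }
}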
The error Display
implementation must not include the source in its output:
// Good
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::VariantA(_) => write!(f, "variant a"),
Self::VariantB { some_additional_info, .. } => write!(f, "variant b ({some_additional_info})"),
// ... and so on
}
}
}
// Bad
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::VariantA(_) => write!(f, "variant a"),
// Bad: includes the source in the `Display` output, which leads to duplicate error information
Self::VariantB { some_additional_info, source } => write!(f, "variant b ({some_additional_info}): {source}"),
// ... and so on
}
}
}
Informative error pattern
Informative errors must be represented as structs. If error messaging changes based on an underlying cause, then a private error kind enum can be used internally for this purpose. For example:
#[derive(Debug)]
pub struct InformativeError {
some_additional_info: u32,
source: AnotherError,
}
impl fmt::Display for InformativeError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "some informative message with {}", self.some_additional_info)
}
}
impl Error for InformativeError {
fn source(&self) -> Option<&(dyn Error + 'static)> {
Some(&self.source)
}
}
In general, informative errors should be referenced by variants in actionable errors since they cannot be converted to actionable errors at a later date without a breaking change. This is not a hard rule, however. Use your best judgement for the situation.
Displaying full error context
In code where errors are logged rather than returned to the customer, the full error source chain
must be displayed. This will be made easy by placing a DisplayErrorContext
struct in aws-smithy-types
that
is used as a wrapper to get the better error formatting:
tracing::warn!(err = %DisplayErrorContext(err), "some message");
This might be implemented as follows:
#[derive(Debug)]
pub struct DisplayErrorContext<E: Error>(pub E);
impl<E: Error> fmt::Display for DisplayErrorContext<E> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write_err(f, &self.0)?;
// Also add a debug version of the error at the end
write!(f, " ({:?})", self)
}
}
fn write_err(f: &mut fmt::Formatter<'_>, err: &dyn Error) -> fmt::Result {
write!(f, "{}", err)?;
if let Some(source) = err.source() {
write!(f, ": ")?;
write_err(f, source)?;
}
Ok(())
}
Changes Checklist
- Update every struct/enum that implements Error in all the non-server Rust runtime crates
- Hide the error source type in the Unhandled variant in code generated errors
- Remove Clone from ProfileParseError and any others that have it
Error Code Review Checklist
This is a checklist meant to aid code review of new errors:
- The error fits either the actionable or informative pattern
- If the error is informative, it's clear that it will never be expanded with additional variants in the future
- The Display impl does not write the error source to the formatter
- The catch all _ match arm is not used in the Display or Error::source implementations
- Error types from external libraries are not exposed in the public API
- Error enums are #[non_exhaustive]
- Error enum variants that don't have a separate error context struct are #[non_exhaustive]
- Error context is exposed via accessors rather than by public fields
- Actionable errors and their context structs are in an error submodule for any given module. They are not mixed with other non-error code
RFC: Evolving the new service builder API
Status: Accepted
Applies to: Server
RFC 20 introduced a new service builder API.
It supports fine-grained configuration at multiple levels (per-handler middlewares, router middlewares, plugins) while trying to prevent some misconfiguration issues at compile-time (i.e. missing operation handlers).
There is consensus that the new API is an improvement over the pre-existing OperationRegistryBuilder
/OperationRegistry
, which is now on its way to deprecation in one of the next releases.
This RFC builds on top of RFC 20 to explore an alternative API design prior to its stabilisation. The API proposed in this RFC has been manually implemented for the Pokemon service. You can find the code here.
Overview
Type-heavy builders can lead to a poor developer experience when it comes to writing function signatures, conditional branches and clarity of error messages.
This RFC provides examples of the issues we are trying to mitigate and showcases an alternative design for the service builder, cutting generic parameters from 2N+1 to 2, where N
is the number of operations on the service.
We rely on eagerly upgrading the registered handlers and operations to Route<B>
to achieve this reduction.
Goals:
- Maximise API ergonomics, with a particular focus on the developer experience for Rust beginners.
Strategy:
- Reduce type complexity, exposing a less generic API;
- Provide clearer errors when the service builder is misconfigured.
Trade-offs:
- Reduce compile-time safety. Missing handlers will be detected at runtime instead of compile-time.
Constraints:
- There should be no significant degradation in runtime performance (i.e. startup time for applications).
Handling missing operations
Let's start by reviewing the API proposed in RFC 20. We will use the Pokemon service as our driving example throughout the RFC. This is what the startup code looks like:
#[tokio::main]
pub async fn main() {
// [...]
let app = PokemonService::builder()
.get_pokemon_species(get_pokemon_species)
.get_storage(get_storage)
.get_server_statistics(get_server_statistics)
.capture_pokemon(capture_pokemon)
.do_nothing(do_nothing)
.check_health(check_health)
.build();
// Setup shared state and middlewares.
let shared_state = Arc::new(State::default());
let app = app.layer(&AddExtensionLayer::new(shared_state));
// Start the [`hyper::Server`].
let bind: SocketAddr = /* */;
let server = hyper::Server::bind(&bind).serve(app.into_make_service());
// [...]
}
The builder is infallible: we are able to verify at compile-time that all handlers have been provided using the typestate builder pattern.
Compiler errors cannot be tuned
What happens if we stray away from the happy path? We might forget, for example, to add the check_health
handler.
The compiler greets us with this error:
error[E0277]: the trait bound `MissingOperation: Upgradable<AwsRestJson1, CheckHealth, (), _, IdentityPlugin>` is not satisfied
--> pokemon-service/src/bin/pokemon-service.rs:38:10
|
38 | .build();
| ^^^^^ the trait `Upgradable<AwsRestJson1, CheckHealth, (), _, IdentityPlugin>` is not implemented for `MissingOperation`
|
= help: the following other types implement trait `Upgradable<Protocol, Operation, Exts, B, Plugin>`:
FailOnMissingOperation
Operation<S, L>
The compiler complains that MissingOperation
does not implement the Upgradable
trait. Neither MissingOperation
nor Upgradable
appear in the startup code we looked at. This is likely to be the first time the developer sees those traits, assuming they haven't spent time getting familiar with aws-smithy-http-server
's internals.
The help
section is unhelpful, if not actively misdirecting.
How can the developer figure out that the issue lies with check_health
?
They need to inspect the generic parameters attached to Upgradable
in the code label or the top-level error message - we see, among other things, a CheckHealth
parameter. That is the hint they need to follow to move forward.
We unfortunately do not have agency on the compiler error we just examined. Rust does not expose hooks for crate authors to tweak the errors returned when a type does not implement a trait we defined. All implementations of the typestate builder pattern accept this shortcoming in exchange for compile-time safety.
Is it a good tradeoff in our case?
The cost of a runtime error
If build
returns an error, the HTTP server is never launched. The application fails to start.
Let's examine the cost of this runtime error along two dimensions:
- Impact on developer productivity;
- Impact on end users.
We'd love for this issue to be caught on the developer machine - it provides the shortest feedback loop.
The issue won't be surfaced by a cargo check
or cargo build
invocation, as it happens with the typestate builder approach.
It should be surfaced by executing the application test suite, assuming that the developer has written at least a single integration test - e.g. a test that passes a request to the call
method exposed by PokemonService
or launches a full-blown instance of the application which is then probed via an HTTP client.
If there are no integration tests, the issue won't be detected on the developer machine nor in CI. Nonetheless, it is unlikely to cause any end-user impact even if it manages to escape detection and reach production. The deployment will never complete if they are using a progressive rollout strategy: instances of the new version will crash as soon as they are launched, never getting a chance to mark themselves as healthy; all traffic will keep being handled by the old version, with no visible impact on end users of the application.
Given the above, we think that the impact of a runtime error is low enough to be worth exploring designs that do not guarantee compile-time safety for the builder API.[1]
Providing clear feedback
Moving from a compile-time error to a runtime error does not require extensive refactoring.
The definition of PokemonServiceBuilder
goes from:
pub struct PokemonServiceBuilder<
Op1,
Op2,
Op3,
Op4,
Op5,
Op6,
Exts1 = (),
Exts2 = (),
Exts3 = (),
Exts4 = (),
Exts5 = (),
Exts6 = (),
Pl = aws_smithy_http_server::plugin::IdentityPlugin,
> {
check_health: Op1,
do_nothing: Op2,
get_pokemon_species: Op3,
get_server_statistics: Op4,
capture_pokemon: Op5,
get_storage: Op6,
#[allow(unused_parens)]
_exts: std::marker::PhantomData<(Exts1, Exts2, Exts3, Exts4, Exts5, Exts6)>,
plugin: Pl,
}
to:
pub struct PokemonServiceBuilder<
Op1,
Op2,
Op3,
Op4,
Op5,
Op6,
Exts1 = (),
Exts2 = (),
Exts3 = (),
Exts4 = (),
Exts5 = (),
Exts6 = (),
Pl = aws_smithy_http_server::plugin::IdentityPlugin,
> {
check_health: Option<Op1>,
do_nothing: Option<Op2>,
get_pokemon_species: Option<Op3>,
get_server_statistics: Option<Op4>,
capture_pokemon: Option<Op5>,
get_storage: Option<Op6>,
#[allow(unused_parens)]
_exts: std::marker::PhantomData<(Exts1, Exts2, Exts3, Exts4, Exts5, Exts6)>,
plugin: Pl,
}
All operation fields are now Option
-wrapped.
We introduce a new MissingOperationsError
error to hold the names of the missing operations and their respective setter methods:
#[derive(Debug)]
pub struct MissingOperationsError {
service_name: &'static str,
operation_names2setter_methods: HashMap<&'static str, &'static str>,
}
impl Display for MissingOperationsError { /* */ }
impl std::error::Error for MissingOperationsError {}
which is then used in build
as error type (not shown here for brevity).
We can now try again to stray away from the happy path by forgetting to register a handler for the CheckHealth
operation.
The code compiles just fine this time, but the application fails when launched via cargo run
:
<timestamp> ERROR pokemon_service: You must specify a handler for all operations attached to the `Pokemon` service.
We are missing handlers for the following operations:
- com.aws.example#CheckHealth
Use the dedicated methods on `PokemonServiceBuilder` to register the missing handlers:
- PokemonServiceBuilder::check_health
The error speaks the language of the domain, Smithy's interface definition language: it mentions operations, services, handlers.
Understanding the error requires no familiarity with smithy-rs' internal type machinery or advanced trait patterns in Rust.
We can also provide actionable suggestions: Rust beginners should be able to easily process the information, rectify the mistake and move on quickly.
Simplifying PokemonServiceBuilder
's signature
Let's take a second look at the (updated) definition of PokemonServiceBuilder
:
pub struct PokemonServiceBuilder<
Op1,
Op2,
Op3,
Op4,
Op5,
Op6,
Exts1 = (),
Exts2 = (),
Exts3 = (),
Exts4 = (),
Exts5 = (),
Exts6 = (),
Pl = aws_smithy_http_server::plugin::IdentityPlugin,
> {
check_health: Option<Op1>,
do_nothing: Option<Op2>,
get_pokemon_species: Option<Op3>,
get_server_statistics: Option<Op4>,
capture_pokemon: Option<Op5>,
get_storage: Option<Op6>,
#[allow(unused_parens)]
_exts: std::marker::PhantomData<(Exts1, Exts2, Exts3, Exts4, Exts5, Exts6)>,
plugin: Pl,
}
We have 13 generic parameters:
- 1 for plugins (Pl);
- 2 for each operation (OpX and ExtsX);
All those generic parameters were necessary when we were using the typestate builder pattern. They kept track of which operation handlers were missing: if any OpX
was set to MissingOperation
when calling build
-> compilation error!
Do we still need all those generic parameters if we move forward with this RFC? You might be asking yourselves: why do those generics bother us? Is there any harm in keeping them around? We'll look at the impact of those generic parameters on two scenarios:
- Branching in startup logic;
- Breaking down a monolithic startup function into multiple smaller functions.
Branching -> "Incompatible types"
Conditional statements appear quite often in the startup logic for an application (or in the setup code for its integration tests).
Let's consider a toy example: if a check_database
flag is set to true
, we want to register a different check_health
handler - one that takes care of pinging the database to make sure it's up.
The "obvious" solution would look somewhat like this:
let check_database: bool = /* */;
let app = if check_database {
app.check_health(check_health)
} else {
app.check_health(check_health_with_database)
};
app.build();
The compiler is not pleased:
error[E0308]: `if` and `else` have incompatible types
--> pokemon-service/src/bin/pokemon-service.rs:39:9
|
36 | let app = if check_database {
| _______________-
37 | | app.check_health(check_health)
| | ------------------------------ expected because of this
38 | | } else {
39 | | app.check_health(check_health_with_database)
| | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected fn item, found a different fn item
40 | | };
| |_____- `if` and `else` have incompatible types
|
= note: expected struct `PokemonServiceBuilder<Operation<IntoService<_, fn(CheckHealthInput) -> impl Future<Output =
CheckHealthOutput> {check_health}>>, _, _, _, _, _, _, _, _, _, _, _>`
found struct `PokemonServiceBuilder<Operation<IntoService<_, fn(CheckHealthInput) -> impl Future<Output =
CheckHealthOutput> {check_health_with_database}>>, _, _, _, _, _, _, _, _, _, _, _>`
The developer must be aware of the following facts to unpack the error message:
- The two branches of an
if
/else
statement need to return the same type. - Each function closure has a new unique type (represented as
fn(CheckHealthInput) -> impl Future<Output = CheckHealthOutput> {check_health}
forcheck_health
); - The handler function type becomes part of the overall
PokemonServiceBuilder
type, a cog in the largerOp1
generic parameter used to hold the handler for theCheckHealth
operation (i.e.Operation<IntoService<_, fn(CheckHealthInput) -> impl Future<Output = CheckHealthOutput> {check_health}>>
);
The second fact requires an intermediate understanding of Rust's closures and opaque types (impl Trait
). It's quite likely to confuse Rust beginners.
The developer has three options to move forward:
- Convert
check_health
andcheck_health_with_database
into a common type that can be passed as a handler toPokemonServiceBuilder::check_health
; - Invoke the
build
method inside the two branches in order to return a "plain"PokemonService<Route<B>>
from both branches. - Embed the configuration parameter (
check_database
) in the application state, retrieve it insidecheck_health
and perform the branching there.
I can't easily see a way to accomplish 1) using the current API. Pursuing 2) is straightforward with a single conditional:
let check_database: bool = /* */;
let app = if check_database {
app.check_health(check_health).build()
} else {
app.check_health(check_health_with_database).build()
};
It becomes more cumbersome when we have more than a single conditional:
let check_database: bool = /* */;
let include_cpu_statistics: bool = /* */;
match (check_database, include_cpu_statistics) {
(true, true) => app
.check_health(check_health_with_database)
.get_server_statistics(get_server_statistics_with_cpu)
.build(),
(true, false) => app
.check_health(check_health_with_database)
.get_server_statistics(get_server_statistics)
.build(),
(false, true) => app
.check_health(check_health)
.get_server_statistics(get_server_statistics_with_cpu)
.build(),
(false, false) => app
.check_health(check_health)
.get_server_statistics(get_server_statistics)
.build(),
}
A lot of repetition compared to the code for the "obvious" approach:
let check_database: bool = /* */;
let include_cpu_statistics: bool = /* */;
let app = if check_database {
app.check_health(check_health)
} else {
app.check_health(check_health_with_database)
};
let app = if include_cpu_statistics {
app.get_server_statistics(get_server_statistics_with_cpu)
} else {
app.get_server_statistics(get_server_statistics)
};
app.build();
The obvious approach becomes viable if we stop embedding the handler function type in PokemonServiceBuilder
's overall type.
Refactoring into smaller functions -> Prepare for some type juggling!
Services with a high number of routes can lead to fairly long startup routines. Developers might be tempted to break down the startup routine into smaller functions, grouping together operations with common requirements (similar domain, same middlewares, etc.).
What does the signature of those smaller functions look like?
The service builder must be one of the arguments if we want to register handlers. We must also return it to allow the orchestrating function to finish the application setup (our setters take ownership of self
).
A first sketch:
fn partial_setup(builder: PokemonServiceBuilder) -> PokemonServiceBuilder {
/* */
}
The compiler demands to see those generic parameters in the signature:
error[E0107]: missing generics for struct `PokemonServiceBuilder`
--> pokemon-service/src/bin/pokemon-service.rs:28:27
|
28 | fn partial_setup(builder: PokemonServiceBuilder) -> PokemonServiceBuilder {
| ^^^^^^^^^^^^^^^^^^^^^ expected at least 6 generic arguments
|
note: struct defined here, with at least 6 generic parameters: `Op1`, `Op2`, `Op3`, `Op4`, `Op5`, `Op6`
error[E0107]: missing generics for struct `PokemonServiceBuilder`
--> pokemon-service/src/bin/pokemon-service.rs:28:53
|
28 | fn partial_setup(builder: PokemonServiceBuilder) -> PokemonServiceBuilder {
| ^^^^^^^^^^^^^^^^^^^^^ expected at least 6 generic arguments
|
note: struct defined here, with at least 6 generic parameters: `Op1`, `Op2`, `Op3`, `Op4`, `Op5`, `Op6`
We could try to nudge the compiler into inferring them:
fn partial_setup(
builder: PokemonServiceBuilder<_, _, _, _, _, _>,
) -> PokemonServiceBuilder<_, _, _, _, _, _> {
/* */
}
but that won't fly either:
error[E0121]: the placeholder `_` is not allowed within types on item signatures for return types
--> pokemon-service/src/bin/pokemon-service.rs:30:28
|
30 | ) -> PokemonServiceBuilder<_, _, _, _, _, _> {
| ^ ^ ^ ^ ^ ^ not allowed in type signatures
| | | | | |
| | | | | not allowed in type signatures
| | | | not allowed in type signatures
| | | not allowed in type signatures
| | not allowed in type signatures
| not allowed in type signatures
We must type it all out:
fn partial_setup<Op1, Op2, Op3, Op4, Op5, Op6>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>,
) -> PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6> {
builder
}
That compiles, at last. Let's try to register an operation handler now:
fn partial_setup<Op1, Op2, Op3, Op4, Op5, Op6>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>,
) -> PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6> {
builder.get_server_statistics(get_server_statistics)
}
That looks innocent, but it doesn't fly:
error[E0308]: mismatched types
--> pokemon-service/src/bin/pokemon-service.rs:31:5
|
28 | fn partial_setup<Op1, Op2, Op3, Op4, Op5, Op6>(
| --- this type parameter
29 | builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>,
30 | ) -> PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6> {
| --------------------------------------------------- expected `PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>` because of return type
31 | builder.get_server_statistics(get_server_statistics)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `Op4`, found struct `Operation`
|
= note: expected struct `PokemonServiceBuilder<_, _, _, Op4, _, _, _>`
found struct `PokemonServiceBuilder<_, _, _, Operation<IntoService<GetServerStatistics, fn(GetServerStatisticsInput, Extension<Arc<State>>) -> impl Future<Output = GetServerStatisticsOutput> {get_server_statistics}>>, _, _, _>
By registering a handler we have changed the corresponding OpX
generic parameter.
Fixing this error requires some non-trivial type gymnastics - I gave up after trying for ~15 minutes.
Cut them down: going from 2N+1 to 2 generic parameters
The previous two examples should have convinced you that the 2N+1 generic parameters on PokemonServiceBuilder
harm the ergonomics of our API.
Can we get rid of them?
Yes! Let's look at one possible approach:
pub struct PokemonServiceBuilder<Body, Plugin> {
check_health: Option<Route<Body>>,
do_nothing: Option<Route<Body>>,
get_pokemon_species: Option<Route<Body>>,
get_server_statistics: Option<Route<Body>>,
capture_pokemon: Option<Route<Body>>,
get_storage: Option<Route<Body>>,
plugin: Plugin,
}
We no longer store the raw handlers inside PokemonServiceBuilder
.
We eagerly upgrade the operation handlers to a Route
instance when they are registered with the builder.
impl<Body, Plugin> PokemonServiceBuilder<Body, Plugin> {
pub fn get_pokemon_species<Handler, Extensions>(mut self, handler: Handler) -> Self
/* Complex trait bounds */
{
let route = Route::new(Operation::from_handler(handler).upgrade(&self.plugin));
self.get_pokemon_species = Some(route);
self
}
/* other setters and methods */
}
The existing API performs the upgrade when build
is called, forcing PokemonServiceBuilder
to store the raw handlers and keep two generic parameters around (OpX
and ExtsX
) for each operation.
The proposed API requires plugins to be specified upfront, when creating an instance of the builder. They cannot be modified after a PokemonServiceBuilder
instance has been built:
impl PokemonService<()> {
/// Constructs a builder for [`PokemonService`].
pub fn builder<Body, Plugin>(plugin: Plugin) -> PokemonServiceBuilder<Body, Plugin> {
PokemonServiceBuilder {
check_health: None,
do_nothing: None,
get_pokemon_species: None,
get_server_statistics: None,
capture_pokemon: None,
get_storage: None,
plugin,
}
}
}
This constraint guarantees that all operation handlers are upgraded to a Route
using the same set of plugins.
Having to specify all plugins upfront is unlikely to have a negative impact on developers currently using smithy-rs
.
We have seen how cumbersome it is to break the startup logic into different functions using the current service builder API. Developers are most likely specifying all plugins and routes in the same function even if the current API allows them to intersperse route registrations and plugin registrations: they would simply have to re-order their registration statements to adopt the API proposed in this RFC.
Alternatives: allow new plugins to be registered after builder creation
The new design prohibits the following invocation style:
let plugin = ColorPlugin::new();
PokemonService::builder(plugin)
// [...]
.get_pokemon_species(get_pokemon_species)
// Add PrintPlugin
.print()
.get_storage(get_storage)
.build()
We could choose to remove this limitation and allow handlers to be upgraded using a different set of plugins depending on where they were registered. In the snippet above, for example, we would have:
- get_pokemon_species is upgraded using just the ColorPlugin;
- get_storage is upgraded using both the ColorPlugin and the PrintPlugin.
There are no technical obstacles preventing us from implementing this API, but I believe it could easily lead to confusion and runtime surprises due to a mismatch between what the developer might expect PrintPlugin
to apply to (all handlers) and what it actually applies to (handlers registered after .print()
).
We can provide developers with other mechanisms to register plugins for a single operation or a subset of operations without introducing ambiguity.
For attaching additional plugins to a single operation, we could introduce a blanket Pluggable
implementation for all operations in aws-smithy-http-server
:
impl<P, Op, Pl, S, L> Pluggable<Pl> for Operation<S, L> where Pl: Plugin<P, Op, S, L> {
type Output = Operation<Pl::Service, Pl::Layer>;
fn apply(self, new_plugin: Pl) -> Self::Output {
new_plugin.map(self)
}
}
which would allow developers to invoke op.apply(MyPlugin)
or call extension methods such as op.print()
where op
is an Operation
.
For attaching additional plugins to a subgroup of operations, instead, we could introduce nested builders:
let initial_plugins = ColorPlugin;
let mut builder = PokemonService::builder(initial_plugins)
.get_pokemon_species(get_pokemon_species);
let additional_plugins = PrintPlugin;
// PrintPlugin will be applied to all handlers registered on the scoped builder returned by `scope`.
let nested_builder = builder.scoped(additional_plugins)
.get_storage(get_storage)
.capture_pokemon(capture_pokemon)
// Register all the routes on the scoped builder with the parent builder.
// API names are definitely provisional and bikesheddable.
.attach(builder);
let app = builder.build();
Both proposals are outside the scope of this RFC, but they are shown here for illustrative purposes.
Alternatives: lazy and eager-on-demand type erasure
A lot of our issues stem from type mismatch errors: we are encoding the type of our handlers into the overall type of the service builder and, as a consequence, we end up modifying that type every time we set a handler or modify its state.
Type erasure is a common approach for mitigating these issues - reduce those generic parameters to a common type to avoid the mismatch errors.
This whole RFC can be seen as a type erasure proposal - done eagerly, as soon as the handler is registered, using Option<Route<B>>
as our "common type" after erasure.
We could try to strike a different balance - i.e. avoid performing type erasure eagerly, but allow developers to erase types on demand. Based on my analysis, this could happen in two ways:
- We cast handlers into a
Box<dyn Upgradable<Protocol, Operation, Exts, Body, Plugin>>
to which we can later apply plugins (lazy type erasure); - We upgrade registered handlers to
Route<B>
and apply plugins in the process (eager type erasure on-demand).
Let's ignore these implementation issues for the time being to focus on what the ergonomics would look like assuming we can actually perform type erasure. In practice, we are going to assume that:
- In approach 1), we can call .boxed() on a registered operation and get a Box<dyn Upgradable> back;
- In approach 2), we can call .erase() on the entire service builder and convert all registered operations to Route<B> while keeping the MissingOperation entries as they are. After erase has been called, you can no longer register plugins (or, alternatively, the plugins you register will only apply to new handlers).
We are going to explore both approaches under the assumption that we want to preserve compile-time verification for missing handlers. If we are willing to abandon compile-time verification, we get better ergonomics since all OpX
and ExtsX
generic parameters can be erased (i.e. we no longer need to worry about MissingOperation
).
On Box<dyn Upgradable<Protocol, Operation, Exts, Body, Plugin>>
This is the current definition of the Upgradable
trait:
/// Provides an interface to convert a representation of an operation to a HTTP [`Service`](tower::Service) with
/// canonical associated types.
pub trait Upgradable<Protocol, Operation, Exts, Body, Plugin> {
type Service: Service<http::Request<Body>, Response = http::Response<BoxBody>>;
/// Performs an upgrade from a representation of an operation to a HTTP [`Service`](tower::Service).
fn upgrade(self, plugin: &Plugin) -> Self::Service;
}
In order to perform type erasure, we need to determine:
- what type parameters we are going to pass as generic arguments to
Upgradable
; - what type we are going to use for the associated type
Service
.
We have:
- there is a single known protocol for a service, therefore we can set Protocol to its concrete type (e.g. AwsRestJson1);
- each handler refers to a different operation, therefore we cannot erase the Operation and the Exts parameters;
- both Body and Plugin appear as generic parameters on the service builder itself, therefore we can set them to the same type;
- we can use Route<B> to normalize the Service associated type.
The above leaves us with two unconstrained type parameters, Operation
and Exts
, for each operation. Those unconstrained type parameters leak into the type signature of the service builder itself. We therefore find ourselves having, again, 2N+2 type parameters.
Branching
Going back to the branching example:
let check_database: bool = /* */;
let builder = if check_database {
builder.check_health(check_health)
} else {
builder.check_health(check_health_with_database)
};
let app = builder.build();
In approach 1), we could leverage the .boxed()
method to convert the actual OpX
type into a Box<dyn Upgradable>
, thus ensuring that both branches return the same type:
let check_database: bool = /* */;
let builder = if check_database {
builder.check_health_operation(Operation::from_handler(check_health).boxed())
} else {
builder.check_health_operation(Operation::from_handler(check_health_with_database).boxed())
};
let app = builder.build();
The same cannot be done when conditionally registering a route, because on the else
branch we cannot convert MissingOperation
into a Box<dyn Upgradable>
since MissingOperation
doesn't implement Upgradable
- the pillar on which we built all our compile-time safety story.
// This won't compile!
let builder = if check_database {
builder.check_health_operation(Operation::from_handler(check_health).boxed())
} else {
builder
};
In approach 2), we can erase the whole builder in both branches when they both register a route:
let check_database: bool = /* */;
let boxed_builder = if check_database {
builder.check_health(check_health).erase()
} else {
builder.check_health(check_health_with_database).erase()
};
let app = boxed_builder.build();
but, like in approach 1), we will still get a type mismatch error if one of the two branches leaves the route unset.
Refactoring into smaller functions
Developers would still have to spell out all generic parameters when writing a function that takes in a builder as a parameter:
fn partial_setup<Op1, Op2, Op3, Op4, Op5, Op6, Body, Plugin>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6, Body, Plugin>,
) -> PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6, Body, Plugin> {
builder
}
Writing the signature after having modified the builder becomes easier though. In approach 1), they can explicitly change the touched operation parameters to the boxed variant:
fn partial_setup<Op1, Op2, Op3, Op4, Op5, Op6, Exts4, Body, Plugin>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6, Body, Plugin, Exts4=Exts4>,
) -> PokemonServiceBuilder<
Op1, Op2, Op3, Box<dyn Upgradable<AwsRestJson1, GetServerStatistics, Exts4, Body, Plugin>>,
Op5, Op6, Body, Plugin, Exts4=Exts4
> {
builder.get_server_statistics(get_server_statistics)
}
It becomes trickier in approach 2), since to retain compile-time safety on the builder we expect erase
to map MissingOperation
into MissingOperation
. Therefore, we can't write something like this:
fn partial_setup<B, Op1, Op2, Op3, Op4, Op5, Op6>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>,
) -> PokemonServiceBuilder<Route<B>, Route<B>, Route<B>, Route<B>, Route<B>, Route<B>> {
builder.get_server_statistics(get_server_statistics).erase()
}
The compiler would reject it since it can't guarantee that all other operations can be erased to a Route<B>
. This is likely to require something along the lines of:
fn partial_setup<Body, Op1, Op2, Op3, Op4, Op5, Op6>(
builder: PokemonServiceBuilder<Op1, Op2, Op3, Op4, Op5, Op6>,
) -> PokemonServiceBuilder<<Op1 as TypeErase>::Erased, <Op2 as TypeErase>::Erased, <Op3 as TypeErase>::Erased, <Op4 as TypeErase>::Erased, <Op5 as TypeErase>::Erased, <Op6 as TypeErase>::Erased>
where
// Omitting a bunch of likely needed additional generic parameters and bounds here
Op1: TypeErase,
Op2: TypeErase,
Op3: TypeErase,
Op4: TypeErase,
Op5: TypeErase,
Op6: TypeErase,
{
builder.get_server_statistics(get_server_statistics).erase()
}
Summary
Both approaches force us to have a number of generic parameters that scales linearly with the number of operations on the service, affecting the ergonomics of the resulting API in both the branching and the refactoring scenarios. We believe that the ergonomics advantages of the proposal advanced by this RFC outweigh the limitation of having to specify your plugins upfront, when creating the builder instance.
Builder extensions: what now?
The Pluggable
trait was an interesting development out of RFC 20: it allows you to attach methods to a service builder using an extension trait.
/// An extension to service builders to add the `print()` function.
pub trait PrintExt: aws_smithy_http_server::plugin::Pluggable<PrintPlugin> {
/// Causes all operations to print the operation name when called.
///
/// This works by applying the [`PrintPlugin`].
fn print(self) -> Self::Output
where
Self: Sized,
{
self.apply(PrintPlugin)
}
}
This pattern needs to be revisited if we want to move forward with this RFC, since new plugins cannot be registered after the builder has been instantiated.
My recommendation would be to implement Pluggable
for PluginStack
, providing the same pattern ahead of the creation of the builder:
// Currently you'd have to go for `PluginStack::new(IdentityPlugin, IdentityPlugin)`,
// but that can be smoothed out even if this RFC isn't approved.
let plugin_stack = PluginStack::default()
// Use the extension method
.print();
let app = PokemonService::builder(plugin_stack)
.get_pokemon_species(get_pokemon_species)
.get_storage(get_storage)
.get_server_statistics(get_server_statistics)
.capture_pokemon(capture_pokemon)
.do_nothing(do_nothing)
.build()?;
Playing around with the design
The API proposed in this RFC has been manually implemented for the Pokemon service. You can find the code here.
Changes checklist
- Update codegen-server to generate the proposed service builder API
- Implement Pluggable for PluginStack
- Evaluate the introduction of a PluginBuilder as the primary API to compose multiple plugins (instead of PluginStack::new(IdentityPlugin, IdentityPlugin).apply(...))
[1]: The impact of a runtime error on developer productivity can be further minimised by encouraging the adoption of integration testing; this can be achieved, among other options, by authoring guides that highlight its benefits and provide implementation guidance.
RFC: RequestID in business logic handlers
Status: Implemented
Applies to: server
For a summarized list of proposed changes, see the Changes Checklist section.
Terminology
- RequestID: a service-wide request's unique identifier
- UUID: a universally unique identifier
RequestID is an element that uniquely identifies a client request. RequestID is used by services to map all logs, events and specific data to a single operation. This RFC discusses whether and how smithy-rs can make that value available to customers.
Services use a RequestID to collect logs related to the same request and see its flow through the various operations, help clients debug requests by sharing this value and, in some cases, use this value to perform their business logic. RequestID is unique across a service at least within a certain timeframe.
For the purposes above, this value must be set by the service.
Having the client send the value brings the following challenges:
1. The client could repeatedly send the same RequestID
2. The client could send no RequestID
3. The client could send a malformed or malicious RequestID (like in 1 and 2).
To minimise the attack surface and provide a uniform experience to customers, servers should generate the value. However, services should be free to read the ID sent by clients in HTTP headers: it is common for services to read the request ID a client sends, record it and send it back upon success. A client may want to send the same value to multiple services. Services should still decide to have their own unique request ID per actual call.
RequestIDs are not to be used by multiple services, but only within a single service.
The user experience if this RFC is implemented
The proposal is to implement a RequestId
type and make it available to middleware and business logic handlers, through FromParts and as a Service
.
To aid customers already relying on clients' request IDs, there will be two types: ClientRequestId
and ServerRequestId
.
- Implementing
FromParts
forExtension<RequestId>
gives customers the ability to write their handlers:
pub async fn handler(
input: input::Input,
request_id: Extension<ServerRequestId>,
) -> ...
pub async fn handler(
input: input::Input,
request_id: Extension<ClientRequestId>,
) -> ...
ServerRequestId
and ClientRequestId
will be injected into the extensions by a layer.
This layer can also be used to open a span that will log the request ID: subsequent logs will be in the scope of that span.
- ServerRequestId format:
Common formats for RequestIDs are:
- UUID: a random string, represented in hex, of 128 bits from IETF RFC 4122: 7c038a43-e499-4162-8e70-2d4d38595930
- The hash of a sequence such as date+thread+server: 734678902ea938783a7200d7b2c0b487
- A verbose description: current_ms+hostname+increasing_id
For privacy reasons, any format that provides service details should be avoided. A random string is preferred. The proposed format is to use UUID, version 4.
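As a rough sketch (assuming the uuid crate; not necessarily the final implementation), ServerRequestId could simply wrap a freshly generated version 4 UUID:
use uuid::Uuid;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServerRequestId(Uuid);
impl ServerRequestId {
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }
}
impl std::fmt::Display for ServerRequestId {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}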
A Service
that inserts a RequestId in the extensions will be implemented as follows:
impl<R, S> Service<http::Request<R>> for ServerRequestIdProvider<S>
where
S: Service<http::Request<R>>,
{
type Response = S::Response;
type Error = S::Error;
type Future = S::Future;
fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
self.inner.poll_ready(cx)
}
fn call(&mut self, mut req: http::Request<R>) -> Self::Future {
req.extensions_mut().insert(ServerRequestId::new());
self.inner.call(req)
}
}
For client request IDs, the process will be, in order:
- If a header is found matching one of the possible ones, use it
- Otherwise, None
An Option is used to distinguish whether a client had provided an ID or not.
impl<R, S> Service<http::Request<R>> for ClientRequestIdProvider<S>
where
S: Service<http::Request<R>>,
{
type Response = S::Response;
type Error = S::Error;
type Future = S::Future;
fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
self.inner.poll_ready(cx)
}
fn call(&mut self, mut req: http::Request<R>) -> Self::Future {
for possible_header in self.possible_headers {
if let Some(id) = req.headers().get(possible_header) {
req.extensions_mut().insert(Some(ClientRequestId::new(id)));
return self.inner.call(req)
}
}
req.extensions_mut().insert(None::<ClientRequestId>);
self.inner.call(req)
}
}
The string representation of a generated ID will be valid for this regex:
- For
ServerRequestId
:/^[A-Za-z0-9_-]{,48}$/
- For
ClientRequestId
: see the spec
Although the generated ID is opaque, this will give guarantees to customers as to what they can expect, if the server ID is ever updated to a different format.
Changes checklist
- Implement ServerRequestId: a new() function that generates a UUID, with Display, Debug and ToStr implementations
- Implement ClientRequestId: new() that wraps a string (the header value) and the header in which the value could be found, with Display, Debug and ToStr implementations
- Implement FromParts for Extension<ServerRequestId>
- Implement FromParts for Extension<ClientRequestId>
Changes since the RFC has been approved
This RFC has been changed to only implement ServerRequestId
.
RFC: Constraint traits
Status: Implemented.
See the description of the PR that laid the foundation for the implementation of constraint traits for a complete reference. See the Better Constraint Violations RFC too for subsequent improvements to this design.
See the uber tracking issue for pending work.
Constraint traits are used to constrain the values that can be provided for a shape.
For example, given the following Smithy model,
@range(min: 18)
integer Age
the integer Age
must take values greater than or equal to 18.
Constraint traits are most useful when enforced as part of input model validation to a service. When a server receives a request whose contents deserialize to input data that violates the modeled constraints, the operation execution's preconditions are not met, and as such rejecting the request without executing the operation is expected behavior.
Constraint traits can also be applied to operation output member shapes, but the expectation is that service implementations not fail to render a response when an output value does not meet the specified constraints. From awslabs/smithy#1039:
This might seem counterintuitive, but our philosophy is that a change in server-side state should not be hidden from the caller unless absolutely necessary. Refusing to service an invalid request should always prevent server-side state changes, but refusing to send a response will not, as there's generally no reasonable route for a server implementation to unwind state changes due to a response serialization failure.
In general, clients should not enforce constraint traits in generated code. Clients must also never enforce constraint traits when sending requests. This is because:
- addition and removal of constraint traits are backwards-compatible from a client's perspective (although this is not documented anywhere in the Smithy specification),
- the client may have been generated with an older version of the model; and
- the most recent model version might have lifted some constraints.
On the other hand, server SDKs constitute the source of truth for the service's behavior, so they interpret the model in all its strictness.
The Smithy spec defines 8 constraint traits: enum, idRef, length, pattern, private, range, required, and uniqueItems.
The idRef
and private
traits are enforced at SDK generation time by the
awslabs/smithy
libraries and bear no relation to generated Rust code.
The only constraint trait enforcement that is generated by smithy-rs clients should be, and is, that of the enum trait, which renders Rust enums.
The required
trait is already and only enforced by smithy-rs servers since
#1148.
That leaves 4 traits: length
, pattern
, range
, and uniqueItems
.
Implementation
This section addresses how to implement and enforce the length
, pattern
,
range
, and uniqueItems
traits. We will use the length
trait applied to a
string
shape as a running example. The implementation of this trait mostly
carries over to the other three.
Example implementation for the length
trait
Consider the following Smithy model:
@length(min: 1, max: 69)
string NiceString
The central idea to the implementation of constraint traits is: parse, don't
validate. Instead of code-generating a Rust String
to represent NiceString
values and perform the validation at request deserialization, we can
leverage Rust's type system to guarantee domain invariants. We can generate a
wrapper tuple struct that parses the string's value and is "tight" in the
set of values it can accept:
pub struct NiceString(String);
impl TryFrom<String> for NiceString {
type Error = nice_string::ConstraintViolation;
fn try_from(value: String) -> Result<Self, Self::Error> {
let num_code_points = value.chars().count();
if 1 <= num_code_points && num_code_points <= 69 {
Ok(Self(value))
} else {
Err(nice_string::ConstraintViolation::Length(num_code_points))
}
}
}
(Note that we're using the linear time check chars().count()
instead of
len()
on the input value, since the Smithy specification says the length
trait counts the number of Unicode code points when applied to string
shapes.)
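A quick illustration of why the two differ:
fn main() {
    // "é" is a single Unicode code point (U+00E9), encoded as two bytes in UTF-8
    let s = "é";
    assert_eq!(s.len(), 2); // byte length
    assert_eq!(s.chars().count(), 1); // code point count, which is what @length constrains
}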
The goal is to enforce, at the type-system level, that these constrained
structs always hold valid data. It should be impossible for the service
implementer, without resorting to unsafe
Rust, to construct a NiceString
that violates the model. The actual check is performed in the implementation of
TryFrom
<InnerType>
for the generated struct, which makes it convenient to use
the ?
operator for error propagation. Each constrained struct will have a
related std::error::Error
enum type to signal the first parsing failure,
with one enum variant per applied constraint trait:
pub mod nice_string {
pub enum ConstraintViolation {
/// Validation error holding the number of Unicode code points found, when a value between `1` and
/// `69` (inclusive) was expected.
Length(usize),
}
impl std::error::Error for ConstraintViolation {}
}
std::error::Error
requires Display
and Debug
. We will
#[derive(Debug)]
, unless the shape also has the sensitive
trait, in which
case we will just print the name of the struct:
impl std::fmt::Debug for ConstraintViolation {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let mut formatter = f.debug_struct("ConstraintViolation");
formatter.finish()
}
}
Display
is used to produce human-friendlier representations. Its
implementation might be called when formatting a 400 HTTP response message in
certain protocols, for example.
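Continuing the nice_string example, a sketch of what such a Display implementation could look like (the exact generated message may differ):
use nice_string::ConstraintViolation;
impl std::fmt::Display for ConstraintViolation {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            // The message mirrors the modeled constraint: between 1 and 69 code points
            ConstraintViolation::Length(found) => write!(
                f,
                "expected a string with between 1 and 69 Unicode code points (inclusive), found {found}"
            ),
        }
    }
}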
Request deserialization
We will continue to deserialize the different parts of the HTTP message into
the regular Rust standard library types. However, just before the
deserialization function returns, we will convert the type into the wrapper
tuple struct that will eventually be handed over to the operation handler. This
is what we're already doing when deserializing strings into enums
. For
example, given the Smithy model:
@enum([
{ name: "Spanish", value: "es" },
{ name: "English", value: "en" },
{ name: "Japanese", value: "jp" },
])
string Language
the code the client generates when deserializing a string from a JSON document
into the Language
enum is (excerpt):
...
match key.to_unescaped()?.as_ref() {
"language" => {
builder = builder.set_language(
aws_smithy_json::deserialize::token::expect_string_or_null(
tokens.next(),
)?
.map(|s| {
s.to_unescaped()
.map(|u| crate::model::Language::from(u.as_ref()))
})
.transpose()?,
);
}
_ => aws_smithy_json::deserialize::token::skip_value(tokens)?,
}
...
Note how the String
gets converted to the enum via Language::from()
.
impl std::convert::From<&str> for Language {
fn from(s: &str) -> Self {
match s {
"es" => Language::Spanish,
"en" => Language::English,
"jp" => Language::Japanese,
other => Language::Unknown(other.to_owned()),
}
}
}
For constrained shapes we would do the same to parse the inner deserialized value into the wrapper tuple struct, except for these differences:
- For enums, the client generates an Unknown variant that "contains new variants that have been added since this code was generated". The server does not need such a variant (#1187).
- Conversions into the tuple struct are fallible (try_from() instead of from()). These errors will result in a my_struct::ConstraintViolation.
length
trait
We will enforce the length constraint by calling len() on Rust's Vec (list and set shapes), HashMap (map shapes) and our aws_smithy_types::Blob (blob shapes).
We will enforce the length constraint trait on String (string shapes) by calling .chars().count().
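For instance, a minimal sketch of the same parse-don't-validate pattern for a hypothetical list shape constrained with @length(min: 1, max: 10) (names invented for illustration) could be:
pub struct NiceList(Vec<String>);
impl TryFrom<Vec<String>> for NiceList {
    // For brevity this sketch reports only the offending length; the generated code
    // would use a dedicated ConstraintViolation enum like the one shown earlier
    type Error = usize;
    fn try_from(value: Vec<String>) -> Result<Self, Self::Error> {
        let len = value.len();
        if (1..=10).contains(&len) {
            Ok(Self(value))
        } else {
            Err(len)
        }
    }
}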
pattern
trait
The pattern
trait
restricts string shape values to a specified regular expression.
We will implement this by using the regex crate's is_match. We will use once_cell to compile the regex only the first time it is required.
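A minimal sketch of that approach, using an arbitrary example pattern (illustrative, not generated code):
use once_cell::sync::Lazy;
use regex::Regex;
// Compiled lazily, the first time a value is checked
static PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new("^[a-z]+$").expect("the pattern is valid"));
fn matches_pattern(value: &str) -> bool {
    PATTERN.is_match(value)
}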
uniqueItems
trait
The uniqueItems trait indicates that the items in a List MUST be unique.
If the list shape is sparse
, more than one null
value violates this
constraint.
We will enforce this by copying references to the Vec
's elements into a
HashSet
and checking that the sizes of both containers coincide.
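A minimal sketch of that check (leaving aside the sparse case and the reporting of which items collided):
use std::collections::HashSet;
use std::hash::Hash;
fn items_are_unique<T: Hash + Eq>(items: &[T]) -> bool {
    // Collect references only; if any element repeats, the set ends up smaller than the slice
    let unique: HashSet<&T> = items.iter().collect();
    unique.len() == items.len()
}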
Trait precedence and naming of the tuple struct
From the spec:
Some constraints can be applied to shapes as well as structure members. If a constraint of the same type is applied to a structure member and the shape that the member targets, the trait applied to the member takes precedence.
structure ShoppingCart {
@range(min: 7, max:12)
numberOfItems: PositiveInteger
}
@range(min: 1)
integer PositiveInteger
In the above example,
the range trait applied to numberOfItems takes precedence over the one applied to PositiveInteger. The resolved minimum will be 7, and the maximum 12.
When the constraint trait is applied to a member shape, the tuple struct's name
will be the PascalCased name of the member shape, NumberOfItems
.
Unresolved questions
- Should we code-generate unsigned integer types (u16, u32, u64) when the range trait is applied with min set to a value greater than or equal to 0?
  - A user has even suggested using the std::num::NonZeroUX types (e.g. NonZeroU64) when range is applied with min set to a value greater than 0.
  - UPDATE: This requires further design work. There are interoperability concerns: for example, the positive range of a u32 is strictly greater than that of an i32, so clients wouldn't be able to receive values within the non-overlapping range.
- In request deserialization, should we fail with the first violation and
immediately render a response, or attempt to parse the entire request and
provide a complete and structured report?
- UPDATE: We will provide a response containing all violations. See the "Collecting Constraint Violations" section in the Better Constraint Violations RFC.
- Should we provide a mechanism for the service implementer to construct a
Rust type violating the modeled constraints in their business logic e.g. a
T::new_unchecked()
constructor? This could be useful (1) when the user knows the provided inner value does not violate the constraints and doesn't want to incur the performance penalty of the check; (2) when the struct is in a transient invalid state. However:
- (2) is arguably a modelling mistake and a separate struct to represent the transient state would be a better approach,
- the user could use
unsafe
Rust to bypass the validation; and - adding this constructor is a backwards-compatible change, so it can always be added later if this feature is requested.
- UPDATE: We decided to punt on this until users express interest.
Alternative design
An alternative design with less public API surface would be to perform
constraint validation at request deserialization, but hand over a regular
"loose" type (e.g. String
instead of NiceString
) that allows for values
violating the constraints. If we were to implement this approach, we could do so by wrapping the incoming value in the aforementioned tuple struct to perform the validation, and immediately unwrapping it.
Comparative advantages:
- Validation remains an internal detail of the framework. If the semantics of a constraint trait change, the behavior of the service is still backwards-incompatibly affected, but user code is not.
- Less "invasive". Baking validation in the generated type might be deemed as the service framework overreaching responsibilities.
Comparative disadvantages:
- It becomes possible to send responses with invalid operation outputs. All the service framework could do is log the validation errors.
- Baking validation at the type-system level gets rid of an entire class of logic errors.
- Less idiomatic (this is subjective). The pattern of wrapping a more primitive type to guarantee domain invariants is widespread in the Rust ecosystem. The standard library makes use of it extensively.
Note that both designs are backwards incompatible in the sense that you can't migrate from one to the other without breaking user code.
UPDATE: We ended up implementing both designs, adding a flag to opt into the alternative design. Refer to the mentions of the publicConstrainedTypes flag in the description of the Builders of builders PR.
RFC: Client Crate Organization
Status: Implemented
Applies to: clients (and may impact servers due to shared codegen)
This RFC proposes changing the organization structure of the generated client crates to:
- Make discovery in the crate documentation easier.
- Facilitate re-exporting types from runtime crates in related modules without name collisions.
- Facilitate feature gating operations for faster compile times in the future.
Previous Organization
Previously, crates were organized as such:
.
├── client
| ├── fluent_builders
| | └── <One fluent builder per operation>
| ├── Builder (*)
| └── Client
├── config
| ├── retry
| | ├── RetryConfig (*)
| | ├── RetryConfigBuilder (*)
| | └── RetryMode (*)
| ├── timeout
| | ├── TimeoutConfig (*)
| | └── TimeoutConfigBuilder (*)
| ├── AsyncSleep (*)
| ├── Builder
| ├── Config
| └── Sleep (*)
├── error
| ├── <One module per error to contain a single struct named `Builder`>
| ├── <One struct per error named `${error}`>
| ├── <One struct per operation named `${operation}Error`>
| └── <One enum per operation named `${operation}ErrorKind`>
├── http_body_checksum (empty)
├── input
| ├── <One module per input to contain a single struct named `Builder`>
| └── <One struct per input named `${operation}Input`>
├── lens (empty)
├── middleware
| └── DefaultMiddleware
├── model
| ├── <One module per shape to contain a single struct named `Builder`>
| └── <One struct per shape>
├── operation
| ├── customize
| | ├── ClassifyRetry (*)
| | ├── CustomizableOperation
| | ├── Operation (*)
| | └── RetryKind (*)
| └── <One struct per operation>
├── output
| ├── <One module per output to contain a single struct named `Builder`>
| └── <One struct per output named `${operation}Output`>
├── paginator
| ├── <One struct per paginated operation named `${operation}Paginator`>
| └── <Zero to one struct(s) per paginated operation named `${operation}PaginatorItems`>
├── presigning
| ├── config
| | ├── Builder
| | ├── Error
| | └── PresigningConfig
| └── request
| └── PresignedRequest
├── types
| ├── AggregatedBytes (*)
| ├── Blob (*)
| ├── ByteStream (*)
| ├── DateTime (*)
| └── SdkError (*)
├── AppName (*)
├── Client
├── Config
├── Credentials (*)
├── Endpoint (*)
├── Error
├── ErrorExt (for some services)
├── PKG_VERSION
└── Region (*)
(*) signifies that a type is re-exported from one of the runtime crates
Proposed Changes
This RFC proposes reorganizing types by operation first and foremost, and then rearranging other pieces to reduce codegen collision risk.
Establish a pattern for builder organization
Builders (distinct from fluent builders) are generated alongside all inputs, outputs, models, and errors.
They all follow the same overall pattern (where shapeType is Input, Output, or empty for models/errors):
.
└── module
├── <One module per shape to contain a single struct named `Builder`>
└── <One struct per shape named `${prefix}${shapeType}`>
This results in large lists of modules that all have exactly one item in them, which makes browsing the documentation difficult, and introduces the possibility of name collisions when re-exporting modules from the runtime crates.
Builders should adopt a prefix and go into a single builders module, similar to how the fluent builders currently work:
.
├── module
| └── builders
| └── <One struct per shape named `${prefix}${shapeType}Builder`>
└── <One struct per shape named `${prefix}${shapeType}`>
Organize code generated types by operation
All code generated for an operation that isn't shared between operations will go into operation-specific modules. This includes inputs, outputs, errors, parsers, and paginators. Types shared across operations will remain in another module (discussed below), and serialization/deserialization logic for those common types will also reside in that common location for now. If operation feature gating occurs in the future, further optimization can be done to track which of these are used by feature, or they can be reorganized (this would be discussed in a future RFC and is out of scope here).
With code generated operations living in crate::operation, there is a high chance of name collision with the customize module. To resolve this, customize will be moved into crate::client.
The new crate::operation module will look as follows:
.
└── operation
└── <One module per operation named after the operation in lower_snake_case>
├── paginator
| ├── `${operation}Paginator`
| └── `${operation}PaginatorItems`
├── builders
| ├── `${operation}FluentBuilder`
| ├── `${operation}InputBuilder`
| └── `${operation}OutputBuilder`
├── `${operation}Error`
├── `${operation}Input`
├── `${operation}Output`
└── `${operation}Parser` (private/doc hidden)
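As an illustration (using S3's GetObject as a hypothetical example), code that previously imported these types from separate input, output, and error modules would import them from the operation's own module under the proposed layout:

use aws_sdk_s3::operation::get_object::{GetObjectError, GetObjectInput, GetObjectOutput};
use aws_sdk_s3::operation::get_object::builders::GetObjectFluentBuilder;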
Reorganize the crate root
The crate root should only host the most frequently used types, or phrased differently, the types that are critical to making a service call with default configuration, or that are required for the most frequent config changes (such as setting credentials, or changing the region/endpoint).
Previously, the following were exported in root:
.
├── AppName
├── Client
├── Config
├── Credentials
├── Endpoint
├── Error
├── ErrorExt (for some services)
├── PKG_VERSION
└── Region
The AppName is infrequently set, so it will be moved into crate::config. Customers are encouraged to use the aws-config crate to resolve credentials, region, and endpoint. Thus, these types no longer need to be at the top level, and will be moved into crate::config. ErrorExt will be moved into crate::error, but Error will stay in the crate root so that customers who alias the SDK crate can easily reference it in their Results:
use aws_sdk_s3 as s3;
fn some_function(/* ... */) -> Result<(), s3::Error> {
/* ... */
}
The PKG_VERSION should move into a new meta module, which can also include other values in the future, such as the SHA-256 hash of the model used to produce the crate or the version of smithy-rs that generated it.
Conditionally remove Builder from crate::client
Previously, the Smithy Client builder was re-exported alongside the SDK fluent Client so that non-SDK clients could easily customize the underlying Smithy client by using the fluent client's Client::with_config function or From<aws_smithy_client::client::Client<C, M, R>> trait implementation.
This makes sense for non-SDK clients, where customization of the connector and middleware types is supported generically, but makes less sense for SDKs since the SDK clients are hardcoded to use DynConnector and DynMiddleware.
Thus, the Smithy client Builder should not be re-exported for SDKs.
Create a primitives module
Previously, crate::types held re-exported types from aws-smithy-types that are used by code generated structs/enums. This module will be renamed to crate::primitives so that the name types can be repurposed in the next section.
Repurpose the types module
The name model is meaningless outside the context of code generation (although there is precedent since both the Java V2 and Kotlin SDKs use the term). Previously, this module held all the generated structs/enums that are referenced by inputs, outputs, and errors.
This RFC proposes that this module be renamed to types, and that all code generated types for shapes that are reused between operations (basically anything that is not an input, output, or error) be moved here. This would look as follows:
.
└── types
├── error
| ├── builders
| | └── <One struct per error named `${error}Builder`>
| └── <One struct per error named `${error}`>
├── builders
| └── <One struct per shape named `${shape}Builder`>
└── <One struct per shape>
Customers using the fluent builder should be able to just use ${crate}::types::*; to immediately get access to all the shared types needed by the operations they are calling.
Additionally, moving the top-level code generated error types into crate::types will eliminate a name collision issue in the crate::error module.
Repurpose the original crate::error module
The error module is significantly smaller after all the code generated error types are moved out of it. This top-level module is now available for re-exports and utilities.
The following will be re-exported in crate::error:
- aws_smithy_http::result::SdkError
- aws_smithy_types::error::display::DisplayErrorContext
For crates that have an ErrorExt, it will also be moved into crate::error.
Flatten the presigning module
The crate::presigning module only has four members, so it should be flattened from:
.
└── presigning
├── config
| ├── Builder
| ├── Error
| └── PresigningConfig
└── request
└── PresignedRequest
to:
.
└── presigning
├── PresigningConfigBuilder
├── PresigningConfigError
├── PresigningConfig
└── PresignedRequest
At the same time, Builder and Error will be renamed to PresigningConfigBuilder and PresigningConfigError respectively, since these will rarely be referred to directly (preferring PresigningConfig::builder() instead; the error will almost always be unwrapped).
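For example, constructing a presigning config under the new flattened layout might look like the following sketch; the expires_in setter is assumed to carry over from the existing builder:

use std::time::Duration;
use aws_sdk_s3::presigning::{PresigningConfig, PresigningConfigError};

fn presigning_config() -> Result<PresigningConfig, PresigningConfigError> {
    // PresigningConfig::builder() returns the renamed PresigningConfigBuilder.
    PresigningConfig::builder()
        .expires_in(Duration::from_secs(900))
        .build()
}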
Remove the empty modules
The lens and http_body_checksum modules have nothing inside them, and their documentation descriptions are not useful to customers:
- lens: Generated accessors for nested fields
- http_body_checksum: Functions for modifying requests and responses for the purposes of checksum validation
These modules hold private functions that are used by other generated code, and should just be made private or #[doc(hidden)] if necessary.
New Organization
All combined, the following is the new publicly visible organization:
.
├── client
| ├── customize
| | ├── ClassifyRetry (*)
| | ├── CustomizableOperation
| | ├── Operation (*)
| | └── RetryKind (*)
| ├── Builder (only in non-SDK crates) (*)
| └── Client
├── config
| ├── retry
| | ├── RetryConfig (*)
| | ├── RetryConfigBuilder (*)
| | └── RetryMode (*)
| ├── timeout
| | ├── TimeoutConfig (*)
| | └── TimeoutConfigBuilder (*)
| ├── AppName (*)
| ├── AsyncSleep (*)
| ├── Builder
| ├── Config
| ├── Credentials (*)
| ├── Endpoint (*)
| ├── Region (*)
| └── Sleep (*)
├── error
| ├── DisplayErrorContext (*)
| ├── ErrorExt (for some services)
| └── SdkError (*)
├── meta
| └── PKG_VERSION
├── middleware
| └── DefaultMiddleware
├── operation
| └── <One module per operation named after the operation in lower_snake_case>
| ├── paginator
| | ├── `${operation}Paginator`
| | └── `${operation}PaginatorItems`
| ├── builders
| | ├── `${operation}FluentBuilder`
| | ├── `${operation}InputBuilder`
| | └── `${operation}OutputBuilder`
| ├── `${operation}Error`
| ├── `${operation}Input`
| ├── `${operation}Output`
| └── `${operation}Parser` (private/doc hidden)
├── presigning
| ├── PresigningConfigBuilder
| ├── PresigningConfigError
| ├── PresigningConfig
| └── PresignedRequest
├── primitives
| ├── AggregatedBytes (*)
| ├── Blob (*)
| ├── ByteStream (*)
| └── DateTime (*)
├── types
| ├── error
| | ├── builders
| | | └── <One struct per error named `${error}Builder`>
| | └── <One struct per error named `${error}`>
| ├── builders
| | └── <One struct per shape named `${shape}Builder`>
| └── <One struct per shape>
├── Client
├── Config
└── Error
(*) signifies that a type is re-exported from one of the runtime crates
Changes Checklist
- Move crate::AppName into crate::config
- Move crate::PKG_VERSION into a new crate::meta module
- Move crate::Endpoint into crate::config
- Move crate::Credentials into crate::config
- Move crate::Region into crate::config
- Move crate::operation::customize into crate::client
- Finish refactor to decouple client/server modules
- Organize code generated types by operation
- Reorganize builders
- Rename crate::types to crate::primitives
- Rename crate::model to crate::types
- Move crate::error into crate::types
- Only re-export aws_smithy_client::client::Builder for non-SDK clients (remove from SDK clients)
- Move crate::ErrorExt into crate::error
- Re-export aws_smithy_types::error::display::DisplayErrorContext and aws_smithy_http::result::SdkError in crate::error
- Move crate::paginator into crate::operation
- Flatten crate::presigning
- Hide or remove crate::lens and crate::http_body_checksum
- Move fluent builders into crate::operation::x::builders
- Remove/hide operation ParseResponse implementations in crate::operation
- Update "Crate Organization" top-level section in generated crate docs
- Update all module docs
- Break up modules/files so that they're not 30k lines of code
  - models/types: each struct/enum should probably get its own file with pub-use
  - models/types::builders: now this needs to get split up
  - client.rs
- Fix examples
- Write changelog
RFC: Endpoints 2.0
Status: RFC
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines how the Rust SDK will integrate with the next generation of endpoint resolution logic (Endpoints 2.0). Endpoints 2.0 defines a rules language for resolving endpoints. The Rust SDK will code-generate Rust code from this intermediate language and use this to create service-specific endpoint resolvers.
Endpoints 2.0 will be a core feature and be available for generic clients as well as the AWS SDK.
Terminology
- Generic client: In reference to features/code that is not AWS specific and is supported for all Smithy clients.
- Rules language: A JSON-based rules language used to resolve endpoints
- Smithy Endpoint: An endpoint, as returned from the rules language. This contains a URI, headers, and a configuration map of String -> Document (properties). This must undergo another level of transformation before it can be used as an AwsEndpoint.
. - AWS Endpoint: An endpoint with explicit signing configuration applied. AWS Endpoints need to contain region & service metadata to control signing.
- Middleware: A transformation applied to a request, prior to request dispatch
- Endpoint Parameters: A code-generated structure for each service which contains service-specific (and general) endpoint parameters.
The user experience if this RFC is implemented
Overview
SDKs will generate a new, public endpoint module. The module will contain a Params structure and a DefaultResolver. Supporting these, a private endpoints_impl module will be generated.
Why generate two modules? Generating two separate modules, endpoint and endpoints_impl, ensures that we don't have namespace collisions between hand-written and generated code.
SDK middleware will be updated to use the new smithy_types::Endpoint. During request construction in make_operation, a Smithy endpoint will be inserted into the property bag. The endpoint middleware will be updated to extract the Smithy endpoint from the property bag and set the request endpoint & signing information accordingly (see Converting a Smithy Endpoint to an AWS Endpoint).
The following flow chart traces the endpoints 2.0 influence on a request via the green boxes.
flowchart TD
  globalConfig("SDK global configuration (e.g. region provider, UseFIPS, etc.)")
  serviceConfig("Modeled, service specific configuration information (clientContextParams)")
  operationConfig("Operation-specific configuration (S3 Bucket, accountId, etc.)")
  getObject["S3::GetObject"]
  params["Create endpoint parameters"]
  evaluate["Evaluate ruleset"]
  rules["Generated Endpoint Ruleset for S3"]
  middleware["Apply endpoint & properties to request via endpoint middleware"]
  style getObject fill:green,stroke:#333,stroke-width:4px
  style params fill:green,stroke:#333,stroke-width:4px
  style evaluate fill:green,stroke:#333,stroke-width:4px
  style middleware fill:green,stroke:#333,stroke-width:4px
  getObject ==> params
  globalConfig ---> params
  operationConfig --> params
  serviceConfig ---> params
  rules --> evaluate
  params --> evaluate
  evaluate --> middleware
Overriding Endpoints
In the general case, users will not be impacted by Endpoints 2.0, with one exception: today, users can provide a global endpoint provider that can override different services. There is a single ResolveAwsEndpoint trait that is shared across all services. However, this isn't the case for Endpoints 2.0, where the trait actually has a generic parameter:
pub trait ResolveEndpoint<T>: Send + Sync {
fn resolve_endpoint(&self, params: &T) -> Result<Endpoint, BoxError>;
}
The trait itself would then be parameterized by the service-specific endpoint parameters, e.g. aws_sdk_s3::endpoint::Params. The endpoint parameters we would use for S3 (e.g. including Bucket) are different from the endpoint parameters we might use for a service like DynamoDB which, today, doesn't have any custom endpoint behavior.
Going forward, we will provide two different avenues for customers to customize endpoints:
- Configuration-driven URL override. This mechanism hasn't been specified, but suppose that the Rust SDK supported an SDK_ENDPOINT environment variable. This variable would be an input to the existing endpoint resolver machinery and would be backwards compatible with other SDKs (e.g. by prefixing the bucket as a host label for S3).
- Wholesale endpoint resolver override. In this case, customers would gain access to all endpoint parameters and be able to write their own resolver.
This RFC proposes making the following changes:
- For the current global ability to override an endpoint, instead of accepting an AwsEndpoint, accept a URI. This will simplify the interface for most customers who don't actually need logic-driven endpoint construction. The endpoint that can be set will be passed in as the SDK::Endpoint built-in. This will be renamed to endpoint_url for clarity. All AWS services MUST accept the SDK::Endpoint built-in.
- For complex, service-specific behavior, customers will be able to provide a service-specific endpoint resolver at client construction time. This resolver will be parameterized with the service-specific parameters type (e.g. aws_sdk_s3::endpoint::Params). Finally, customers will be able to access the default_resolver() for AWS services directly. This will enable them to utilize the default S3 endpoint resolver in their resolver implementation.
Example: overriding the endpoint URI globally
async fn main() {
let sdk_conf = aws_config::from_env().endpoint_url("http://localhost:8123").load().await;
let dynamo = aws_sdk_dynamodb::Client::new(&sdk_conf);
// snip ...
}
Example: overriding the endpoint resolver for a service
/// Resolve to Localhost when an environment variable is set
struct CustomDdbResolver;
impl ResolveEndpoint<aws_sdk_dynamodb::endpoint::Params> for CustomDdbResolver {
fn resolve_endpoint(&self, params: &Params) -> Result<Endpoint, EndpointResolutionError> {
// custom resolver to redirect to DDB local if a flag is set
let base_endpoint = aws_sdk_dynamodb::endpoint::default_resolver().resolve_endpoint(params).expect("valid endpoint should be resolved");
if env::var("LOCAL") == Ok("true") {
// update the URI on the returned endpoint to localhost while preserving the other properties
Ok(base_endpoint.builder().uri("http://localhost:8888").build())
} else {
Ok(base_endpoint)
}
}
}
async fn main() {
let conf = aws_config::load_from_env().await;
let ddb_conf = aws_sdk_dynamodb::config::Builder::from(&conf).endpoint_resolver(CustomDdbResolver);
let dynamodb = aws_sdk_dynamodb::Client::from_conf(ddb_conf);
}
Note: generic clients cannot use endpoint_url, because endpoint_url is dependent on rules and generic clients do not necessarily have rules. However, they can use the impl<T> ResolveEndpoint<T> for &'static str { ... } implementation.
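For instance, a generic client could plug a static string straight in as its endpoint resolver thanks to that blanket implementation; the service crate and config builder method names below are assumptions for illustration only:

let config = my_generic_service::Config::builder()
    // `&'static str` implements `ResolveEndpoint<T>`, so it can be used directly.
    .endpoint_resolver("http://localhost:8080")
    .build();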
What about alternative S3 implementations? How do we say "don't put prefix bucket on this?"
For cases where users want to use the provided URL directly with no modification users will need to rely on service specific configuration, like forcing path style addressing for S3.
Alternative Design: Context Aware Endpoint Trait
Optional addition: We could add an additional EndpointResolver parameter to SdkConfig that exposes a global trait where Params is &dyn Any, similar to the Context Aware Endpoint Trait described under Alternative Designs. If both were set, a runtime panic would alert users to the misconfiguration.
New Endpoint Traits
The new endpoint resolution trait and Endpoint struct will be available for generic clients. AWS endpoint middleware will pull the Endpoint out of the property bag and read the properties to determine auth/signing as well as any other AWS metadata that may be required.
An example of the Endpoint struct is below. This struct will live in aws-smithy-types; however, it should initially be gated with a documentation warning about stability.
The Endpoint Struct
// module: `aws_smithy_types::endpoint`
// potential optimization to reduce / remove allocations for keys which are almost always static
// this can also just be `String`
type MaybeStatic<T> = Cow<'static, T>;
/// Endpoint
#[derive(Debug, PartialEq)]
pub struct Endpoint {
// Note that this allows `Endpoint` to contain an invalid URI. During conversion to an actual endpoint, the
// the middleware can fail, returning a `ConstructionFailure` to the user
url: MaybeStatic<str>,
headers: HashMap<MaybeStatic<str>, Vec<MaybeStatic<str>>>,
properties: HashMap<MaybeStatic<str>, aws_smithy_types::Document>,
}
// not shown:
// - impl block with standard accessors
// - builder, designed to be invoked / used by generated code
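As a sketch of how generated code might construct one of these endpoints, assuming builder methods named after the fields above (the builder itself is not shown, so these method names are assumptions):

let endpoint = aws_smithy_types::endpoint::Endpoint::builder()
    .url("https://amazonaws.com")
    .header("x-amz-example", "example-value")
    // Properties carry AWS-specific metadata such as the authSchemes list.
    .property("authSchemes", aws_smithy_types::Document::Array(vec![]))
    .build();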
What's an Endpoint property?
Endpoint properties, on their own, have no intrinsic meaning. Endpoint properties have established conventions for AWS SDKs; other Smithy implementors may choose a different pattern. For AWS SDKs, the authSchemes key is an ordered list of authentication/signing schemes supported by the endpoint that the SDK should use.
To produce an Endpoint struct we have a ResolveEndpoint trait which is both generic in terms of parameters and "Smithy-generic":
// module: `smithy_types::endpoint` or `aws_smithy_client`??
pub trait ResolveEndpoint<Params>: Send + Sync {
/// Resolves an `Endpoint` for `Params`
fn resolve_endpoint(&self, params: &Params) -> Result<aws_smithy_types::Endpoint, EndpointResolutionError>;
}
All Smithy services that have the @endpointRuleSet trait applied to the service shape will code generate a default endpoint resolver implementation. The default endpoint resolver MUST be public so that customers can delegate to it if they wish to override the endpoint resolver.
Endpoint Params
We've mentioned "service specific endpoint parameters" a few times. In Endpoints 2.0, we will code generate Endpoint Parameters for every service based on their rules. Note: the endpoint parameters themselves are generated solely from the ruleset. The Smithy model provides additional information about parameter binding, but that only influences how the parameters are set, not how they are generated.
Example Params struct for S3:
#[non_exhaustive]
#[derive(std::clone::Clone, std::cmp::PartialEq, std::fmt::Debug)]
/// Configuration parameters for resolving the correct endpoint
pub struct Params {
pub(crate) bucket: std::option::Option<std::string::String>,
pub(crate) region: std::option::Option<std::string::String>,
pub(crate) use_fips: bool,
pub(crate) use_dual_stack: bool,
pub(crate) endpoint: std::option::Option<std::string::String>,
pub(crate) force_path_style: std::option::Option<bool>,
pub(crate) accelerate: bool,
pub(crate) disable_access_points: std::option::Option<bool>,
pub(crate) disable_mrap: std::option::Option<bool>,
}
impl Params {
/// Create a builder for [`Params`]
pub fn builder() -> crate::endpoint_resolver::Builder {
crate::endpoint_resolver::Builder::default()
}
/// Gets the value for bucket
pub fn bucket(&self) -> std::option::Option<&str> {
self.bucket.as_deref()
}
/// Gets the value for region
pub fn region(&self) -> std::option::Option<&str> {
self.region.as_deref()
}
/// Gets the value for use_fips
pub fn use_fips(&self) -> std::option::Option<bool> {
Some(self.use_fips)
}
/// Gets the value for use_dual_stack
pub fn use_dual_stack(&self) -> std::option::Option<bool> {
Some(self.use_dual_stack)
}
// ... more accessors
}
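A resolver (or the generated make_operation code) might then build the parameters like the sketch below; the setter names mirror the fields above and are assumptions, while the .expect(...) after build() matches the usage in the code generator shown later:

let params = Params::builder()
    .region("us-east-1".to_string())
    .bucket("my-bucket".to_string())
    .use_fips(false)
    .use_dual_stack(false)
    .build()
    .expect("invalid endpoint parameters");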
The default endpoint resolver
When an endpoint ruleset is present, Smithy will code generate an endpoint resolver from that ruleset. The endpoint resolver MUST be a struct so that it can store/cache computations (such as a partition resolver that has compiled regexes).
pub struct DefaultEndpointResolver {
partition_resolver: PartitionResolver
}
impl ResolveEndpoint<crate::endpoint::Params> for DefaultEndpointResolver {
fn resolve_endpoint(&self, params: &Params) -> Result<aws_smithy_types::Endpoint, EndpointResolutionError> {
// delegate to private impl
crate::endpoints_impl::resolve_endpoint(params)
}
}
DefaultEndpointResolver MUST be publicly accessible and offer both a default constructor and the ability to configure resolution behavior (e.g. by supporting additional partitions).
How to actually implement this RFC
To describe how this feature will work, let's take a step-by-step path through endpoint resolution.
-
A user defines a service client, possibly with some client specific configuration like region.
@clientContextParams
are code generated onto the clientConfig
. Code generating@clientContextParams
-
A user invokes an operation like
s3::GetObject
. A params object is created. In the body ofmake_operation()
, this is passed toconfig.endpoint_resolver
to load a generic endpoint. TheResult
of the endpoint resolution is written into the property bag.
The generic smithy middleware (
SmithyEndpointStage
) sets the request endpoint. -
The AWS auth middleware (
AwsAuthStage
) reads the endpoint out of the property bag and applies signing overrides. -
The request is signed & dispatched
The other major piece of implementation required is actually implementing the rules engine. To learn more about rules-engine internals, skip to implementing the rules engine.
Code generating client context params
When a smithy model uses the @clientContextParams
trait, we need to generate client params onto the Rust SDK. This is
a Smithy-native feature. This should be implemented as a "standard" config decorator that reads traits from the
current model.
Kotlin Snippet for Client context params
class ClientContextDecorator(ctx: ClientCodegenContext) : NamedSectionGenerator<ServiceConfig>() {
private val contextParams = ctx.serviceShape.getTrait<ClientContextParamsTrait>()?.parameters.orEmpty().toList()
.map { (key, value) -> ContextParam.fromClientParam(key, value, ctx.symbolProvider) }
data class ContextParam(val name: String, val type: Symbol, val docs: String?) {
companion object {
private fun toSymbol(shapeType: ShapeType, symbolProvider: RustSymbolProvider): Symbol =
symbolProvider.toSymbol(
when (shapeType) {
ShapeType.STRING -> StringShape.builder().id("smithy.api#String").build()
ShapeType.BOOLEAN -> BooleanShape.builder().id("smithy.api#Boolean").build()
else -> TODO("unsupported type")
}
)
fun fromClientParam(
name: String,
definition: ClientContextParamDefinition,
symbolProvider: RustSymbolProvider
): ContextParam {
return ContextParam(
RustReservedWords.escapeIfNeeded(name.toSnakeCase()),
toSymbol(definition.type, symbolProvider),
definition.documentation.orNull()
)
}
}
}
override fun section(section: ServiceConfig): Writable {
return when (section) {
is ServiceConfig.ConfigStruct -> writable {
contextParams.forEach { param ->
rust("pub (crate) ${param.name}: #T,", param.type.makeOptional())
}
}
ServiceConfig.ConfigImpl -> emptySection
ServiceConfig.BuilderStruct -> writable {
contextParams.forEach { param ->
rust("${param.name}: #T,", param.type.makeOptional())
}
}
ServiceConfig.BuilderImpl -> writable {
contextParams.forEach { param ->
param.docs?.also { docs(it) }
rust(
"""
pub fn ${param.name}(mut self, ${param.name}: #T) -> Self {
self.${param.name} = Some(${param.name});
self
}
""",
param.type
)
}
}
ServiceConfig.BuilderBuild -> writable {
contextParams.forEach { param ->
rust("${param.name}: self.${param.name},")
}
}
else -> emptySection
}
}
}
Creating Params
Params will be created and utilized in generic code generation.
make_operation() needs to load the parameters from several configuration sources. These sources have a priority order. To handle this, we load from all sources in reverse priority order, so that higher priority sources override lower priority ones.
Implementation of operation decorator
class EndpointParamsDecorator(
private val ctx: ClientCodegenContext,
private val operationShape: OperationShape,
) : OperationCustomization() {
val idx = ContextIndex.of(ctx.model)
private val ruleset = EndpointRuleset.fromNode(ctx.serviceShape.expectTrait<EndpointRuleSetTrait>().ruleSet)
override fun section(section: OperationSection): Writable {
return when (section) {
is OperationSection.MutateInput -> writable {
rustTemplate(
"""
let params = #{Params}::builder()
#{builder:W}.expect("invalid endpoint");
""",
"Params" to EndpointParamsGenerator(ruleset).paramsStruct(),
"builder" to builderFields(section)
)
}
is OperationSection.MutateRequest -> writable {
rust("// ${section.request}.properties_mut().insert(params);")
}
else -> emptySection
}
}
private fun builderFields(section: OperationSection.MutateInput) = writable {
val memberParams = idx.getContextParams(operationShape)
val builtInParams = ruleset.parameters.toList().filter { it.isBuiltIn }
// first load builtins and their defaults
builtInParams.forEach { param ->
val defaultProviders = section.endpointCustomizations.mapNotNull { it.defaultFor(param, section.config) }
if (defaultProviders.size > 1) {
error("Multiple providers provided a value for the builtin $param")
}
defaultProviders.firstOrNull()?.also { defaultValue ->
rust(".set_${param.name.rustName()}(#W)", defaultValue)
}
}
// these can be overridden with client context params
idx.getClientContextParams(ctx.serviceShape).forEach { (name, _param) ->
rust(".set_${name.toSnakeCase()}(${section.config}.${name.toSnakeCase()}.as_ref())")
}
// lastly, allow these to be overridden by members
memberParams.forEach { (memberShape, param) ->
rust(".set_${param.name.toSnakeCase()}(${section.input}.${ctx.symbolProvider.toMemberName(memberShape)}.as_ref())")
}
rust(".build()")
}
}
Loading values for builtIns
The fundamental point of builtIn values is enabling other code generators to define where these values come from.
Because of that, we will need to expose the ability to customize AwsBuiltIns. One way to do this is with a new customization type, EndpointCustomization:
fun endpointCustomizations(
clientCodegenContext: C,
operation: OperationShape,
baseCustomizations: List<EndpointCustomization>
): List<EndpointCustomization> = baseCustomizations
abstract class EndpointCustomization {
abstract fun defaultFor(parameter: Parameter, config: String): Writable?
}
Customizations have the ability to specify the default value for a parameter. (Of course, these customizations need to be wired in properly.)
Converting a Smithy Endpoint to an AWS Endpoint
A Smithy endpoint has an untyped, String -> Document collection of properties. We need to interpret these properties to handle actually resolving an endpoint. As part of the AwsAuthStage, we load authentication schemes from the endpoint properties and use these to configure signing on the request.
Note: authentication schemes are NOT required as part of an endpoint. When the auth schemes are not set, the default authentication should be used. The Rust SDK will set SigningRegion and SigningName in the property bag by default as part of make_operation.
Implementing the rules engine
The Rust SDK code generator converts the rules into Rust code that will be compiled.
Changes checklist
Rules Engine
- Endpoint rules code generator
- Endpoint params code generator
- Endpoint tests code generator
- Implement ruleset standard library functions as inlineables. Note: pending future refactoring work, the aws. functions will need to be integrated into the smithy core endpoint resolver.
- Implement partition function & ability to customize partitions
SDK Integration
- Add a Smithy endpoint resolver to the service config, with a default that loads the default endpoint resolver.
- Update SdkConfig to accept a URI instead of an implementation of ResolveAwsEndpoint. This change can be done standalone.
- Remove/deprecate the ResolveAwsEndpoint trait and replace it with the vanilla Smithy trait. Potentially, provide a bridge.
- Update make_operation to write a smithy::Endpoint into the property bag
- Update AWS endpoint middleware to work off of a smithy::Endpoint
- Wire the endpoint override to the SDK::Endpoint builtIn parameter
- Remove the old smithy endpoint
Alternative Designs
Context Aware Endpoint Traits
An alternative design that could provide more flexibility is a context-aware endpoint trait where the return type would give context about the endpoint being returned. This would, for example, allow a customer to say explicitly "don't modify this endpoint":
enum ContextualEndpoint {
/// Just the URI please. Pass it into the default endpoint resolver as a baseline
Uri { uri: Uri, immutable: bool },
/// A fully resolved, ready to rumble endpoint. Don't bother hitting the default endpoint resolver, just use what
/// I've got.
AwsEndpoint(AwsEndpoint)
}
trait ResolveGlobalEndpoint {
fn resolve_endpoint(params: &dyn Any) -> Result<ContextualEndpoint, EndpointResolutionError>;
}
Service clients would then use ResolveGlobalEndpoint, optionally specified from SdkConfig, to perform routing decisions.
RFC: SDK Credential Cache Type Safety
Status: Implemented in smithy-rs#2122
Applies to: AWS SDK for Rust
At time of writing (2022-10-11), the SDK's credentials provider can be customized by providing:
- A profile credentials file to modify the default provider chain
- An instance of one of the credentials providers implemented in aws-config, such as the AssumeRoleCredentialsProvider, ImdsCredentialsProvider, and so on.
- A custom struct that implements the ProvideCredentials trait
The problem this RFC examines is that when options 2 and 3 above are exercised, the customer needs to be aware of credentials caching and put in additional effort to ensure caching is set up correctly (and that double caching doesn't occur). This is especially difficult to get right since some built-in credentials providers (such as AssumeRoleCredentialsProvider) already have caching, while most others do not and need to be wrapped in LazyCachingCredentialsProvider.
The goal of this RFC is to create an API where Rust's type system ensures caching is set up correctly, or explicitly opted out of.
CredentialsCache and ConfigLoader::credentials_cache
A new config method named credentials_cache() will be added to ConfigLoader and the generated service Config builders that takes a CredentialsCache instance. This CredentialsCache will be a struct with several functions on it to create and configure the cache.
Client creation will ultimately be responsible for taking this CredentialsCache instance and wrapping the given (or default) credentials provider.
The CredentialsCache would look as follows:
enum Inner {
Lazy(LazyConfig),
// Eager doesn't exist today, so this is purely for illustration
Eager(EagerConfig),
// Custom may not be implemented right away
// Not naming or specifying the custom cache trait for now since it's out of scope
Custom(Box<dyn SomeCacheTrait>),
NoCaching,
}
pub struct CredentialsCache {
inner: Inner,
}
impl CredentialsCache {
// These methods use default cache settings
pub fn lazy() -> Self { /* ... */ }
pub fn eager() -> Self { /* ... */ }
// The *_builder methods return a builder that can take customizations
pub fn lazy_builder() -> LazyBuilder { /* ... */ }
pub fn eager_builder() -> EagerBuilder { /* ... */ }
// Later, when custom implementations are supported
pub fn custom(cache_impl: Box<dyn SomeCacheTrait>) -> Self { /* ... */ }
pub(crate) fn create_cache(
self,
provider: Box<dyn ProvideCredentials>,
sleep_impl: Arc<dyn AsyncSleep>
) -> SharedCredentialsProvider {
// Note: SharedCredentialsProvider would get renamed to SharedCredentialsCache.
// This code is using the old name to make it clearer that it already exists,
// and the rename is called out in the change checklist.
SharedCredentialsProvider::new(
match self.inner {
Inner::Lazy(lazy_config) => LazyCachingCredentialsProvider::new(provider, sleep_impl, /* settings from `lazy_config`, ... */),
Inner::Eager(_inner) => unimplemented!(),
Inner::Custom(_custom) => unimplemented!(),
Inner::NoCaching => unimplemented!(),
}
)
}
}
Using a struct over a trait prevents custom caching implementations, but if customization is desired, a Custom variant could be added to the inner enum that has its own trait that customers implement.
The SharedCredentialsProvider needs to be updated to take a cache implementation in addition to the impl ProvideCredentials + 'static. A sealed trait could be added to facilitate this.
Customers that don't care about credential caching can configure credential providers without needing to think about it:
let sdk_config = aws_config::from_env()
.credentials_provider(ImdsCredentialsProvider::builder().build())
.load()
.await;
However, if they want to customize the caching, they can do so without modifying the credentials provider at all (in case they want to use the default):
let sdk_config = aws_config::from_env()
.credentials_cache(CredentialsCache::eager())
.load()
.await;
The credentials_cache will default to CredentialsCache::lazy() if not provided.
Changes Checklist
- Remove cache from AssumeRoleProvider
- Implement CredentialsCache with its Lazy variant and builder
- Add credentials_cache method to ConfigLoader
- Refactor ConfigLoader to take CredentialsCache instead of impl ProvideCredentials + 'static
- Refactor SharedCredentialsProvider to take a cache implementation in addition to an impl ProvideCredentials + 'static
- Remove ProvideCredentials impl from LazyCachingCredentialsProvider
- Rename LazyCachingCredentialsProvider -> LazyCredentialsCache
- Refactor the SDK Config code generator to be consistent with ConfigLoader
- Write changelog upgrade instructions
- Fix examples (if there are any for configuring caching)
Appendix: Alternatives Considered
Alternative A: ProvideCachedCredentials trait
In this alternative, aws-types has a ProvideCachedCredentials trait in addition to ProvideCredentials. All individual credential providers (such as ImdsCredentialsProvider) implement ProvideCredentials, while credential caches (such as LazyCachingCredentialsProvider) implement ProvideCachedCredentials. The ConfigLoader would only take impl ProvideCachedCredentials.
This allows customers to provide their own caching solution by implementing ProvideCachedCredentials, while requiring that caching be done correctly through the type system since ProvideCredentials is only useful inside the implementation of ProvideCachedCredentials.
Caching can be opted out of by creating a NoCacheCredentialsProvider that implements ProvideCachedCredentials without any caching logic, although this wouldn't be recommended and this provider wouldn't be vended in aws-config.
Example configuration:
// Compiles
let sdk_config = aws_config::from_env()
.credentials(
LazyCachingCredentialsProvider::builder()
.load(ImdsCredentialsProvider::new())
.build()
)
.load()
.await;
// Doesn't compile
let sdk_config = aws_config::from_env()
// Wrong type: doesn't implement `ProvideCachedCredentials`
.credentials(ImdsCredentialsProvider::new())
.load()
.await;
Another method could be added to ConfigLoader that makes it easier to use the default cache:
let sdk_config = aws_config::from_env()
.credentials_with_default_cache(ImdsCredentialsProvider::new())
.load()
.await;
Pros/cons
- :+1: It's flexible, and somewhat enforces correct cache setup through types.
- :+1: Removes the possibility of double caching since the cache implementations won't implement ProvideCredentials.
- :-1: Customers may unintentionally implement ProvideCachedCredentials instead of ProvideCredentials for a custom provider, and then not realize they're not benefiting from caching.
- :-1: The documentation needs to make it very clear what the differences are between ProvideCredentials and ProvideCachedCredentials since they will look identical.
- :-1: It's possible to implement both ProvideCachedCredentials and ProvideCredentials, which breaks the type safety goals.
Alternative B: CacheCredentials trait
This alternative is similar to alternative A, except that the cache trait is distinct from ProvideCredentials so that it's more apparent when mistakenly implementing the wrong trait for a custom credentials provider.
A CacheCredentials trait would be added that looks as follows:
pub trait CacheCredentials: Send + Sync + Debug {
async fn cached(&self, now: SystemTime) -> Result<Credentials, CredentialsError>;
}
Instances implementing CacheCredentials need to own the ProvideCredentials implementation to make both lazy and eager credentials caching possible.
The configuration examples look identical to Option A.
Pros/cons
- :+1: It's flexible, and enforces correct cache setup through types slightly better than Option A.
- :+1: Removes the possibility of double caching since the cache implementations won't implement ProvideCredentials.
- :-1: Customers can still unintentionally implement the wrong trait and miss out on caching when creating custom credentials providers, but it will be more apparent than in Option A.
- :-1: It's possible to implement both CacheCredentials and ProvideCredentials, which breaks the type safety goals.
Alternative C: CredentialsCache struct with composition
The struct approach posits that customers don't need or want to implement custom credential caching, but at the same time, doesn't make it impossible to add custom caching later.
The idea is that there would be a struct called CredentialsCache that specifies the desired caching approach for a given credentials provider:
pub struct LazyCache {
credentials_provider: Arc<dyn ProvideCredentials>,
// ...
}
pub struct EagerCache {
credentials_provider: Arc<dyn ProvideCredentials>,
// ...
}
pub struct CustomCache {
credentials_provider: Arc<dyn ProvideCredentials>,
// Not naming or specifying the custom cache trait for now since it's out of scope
cache: Arc<dyn SomeCacheTrait>
}
enum CredentialsCacheInner {
Lazy(LazyCache),
// Eager doesn't exist today, so this is purely for illustration
Eager(EagerCache),
// Custom may not be implemented right away
Custom(CustomCache),
}
pub struct CredentialsCache {
inner: CredentialsCacheInner,
}
impl CredentialsCache {
// Methods prefixed with `default_` just use the default cache settings
pub fn default_lazy(provider: impl ProvideCredentials + 'static) -> Self { /* ... */ }
pub fn default_eager(provider: impl ProvideCredentials + 'static) -> Self { /* ... */ }
// Unprefixed methods return a builder that can take customizations
pub fn lazy(provider: impl ProvideCredentials + 'static) -> LazyBuilder { /* ... */ }
pub fn eager(provider: impl ProvideCredentials + 'static) -> EagerBuilder { /* ... */ }
pub(crate) fn create_cache(
self,
sleep_impl: Arc<dyn AsyncSleep>
) -> SharedCredentialsProvider {
// ^ Note: SharedCredentialsProvider would get renamed to SharedCredentialsCache.
// This code is using the old name to make it clearer that it already exists,
// and the rename is called out in the change checklist.
SharedCredentialsProvider::new(
match self.inner {
CredentialsCacheInner::Lazy(inner) => LazyCachingCredentialsProvider::new(inner.credentials_provider, sleep_impl, /* ... */),
CredentialsCacheInner::Eager(_inner) => unimplemented!(),
CredentialsCacheInner::Custom(_custom) => unimplemented!(),
}
)
}
}
Using a struct over a trait prevents custom caching implementations, but if customization is desired, a Custom variant could be added to the inner enum that has its own trait that customers implement.
The SharedCredentialsProvider needs to be updated to take a cache implementation rather than impl ProvideCredentials + 'static. A sealed trait could be added to facilitate this.
Configuration would look as follows:
let sdk_config = aws_config::from_env()
.credentials(CredentialsCache::default_lazy(ImdsCredentialsProvider::builder().build()))
.load()
.await;
The credentials_provider method on ConfigLoader would only take CredentialsCache as an argument so that the SDK could not be configured without credentials caching, or if opting out of caching becomes a use case, then a CredentialsCache::NoCache variant could be made.
Like alternative A, a convenience method can be added to make using the default cache easier:
let sdk_config = aws_config::from_env()
.credentials_with_default_cache(ImdsCredentialsProvider::builder().build())
.load()
.await;
In the future if custom caching is added, it would look as follows:
let sdk_config = aws_config::from_env()
.credentials(
CredentialsCache::custom(ImdsCredentialsProvider::builder().build(), MyCache::new())
)
.load()
.await;
The ConfigLoader wouldn't be able to immediately set its credentials provider since other values from the config are needed to construct the cache (such as sleep_impl). Thus, the credentials setter would merely save off the CredentialsCache instance, and then when load is called, the complete SharedCredentialsProvider would be constructed:
pub async fn load(self) -> SdkConfig {
// ...
let credentials_provider = self.credentials_cache.create_cache(sleep_impl);
// ...
}
Pros/cons
- :+1: Removes the possibility of missing out on caching when implementing a custom provider.
- :+1: Removes the possibility of double caching since the cache implementations won't implement ProvideCredentials.
- :-1: Requires thinking about caching when only wanting to customize the credentials provider.
- :-1: Requires a lot of boilerplate in aws-config for the builders, enum variant structs, etc.
RFC: Finding New Home for Credential Types
Status: Implemented in smithy-rs#2108
Applies to: clients
This RFC supplements RFC 28 and discusses, for the design selected there, where to place the types for credentials providers, credentials caching, and everything else that comes with them.
It is assumed that the primary motivation behind the introduction of type safe credentials caching remains the same as the preceding RFC.
Assumptions
This document assumes that the following items in the changes checklist in the preceding RFC have been implemented:
- Implement CredentialsCache with its Lazy variant and builder
- Add the credentials_cache method to ConfigLoader
- Rename SharedCredentialsProvider to SharedCredentialsCache
- Remove ProvideCredentials impl from LazyCachingCredentialsProvider
- Rename LazyCachingCredentialsProvider -> LazyCredentialsCache
- Refactor the SDK Config code generator to be consistent with ConfigLoader
Problems
Here is how our attempt to implement the selected design in the preceding RFC can lead to an obstacle. Consider this code snippet we are planning to support:
let sdk_config = aws_config::from_env()
.credentials_cache(CredentialsCache::lazy())
.load()
.await;
let client = aws_sdk_s3::Client::new(&sdk_config);
A CredentialsCache created by CredentialsCache::lazy() above will internally go through three crates before the variable client has been created:
1. aws-config: after it has been passed to aws_config::ConfigLoader::credentials_cache
// in lib.rs
impl ConfigLoader {
// --snip--
pub fn credentials_cache(mut self, credentials_cache: CredentialsCache) -> Self {
self.credentials_cache = Some(credentials_cache);
self
}
// --snip--
}
2. aws-types: after aws_config::ConfigLoader::load has passed it to aws_types::sdk_config::Builder::credentials_cache
// in sdk_config.rs
impl Builder {
// --snip--
pub fn credentials_cache(mut self, cache: CredentialsCache) -> Self {
self.set_credentials_cache(Some(cache));
self
}
// --snip--
}
3. aws-sdk-s3: after aws_sdk_s3::Client::new has been called with the variable sdk_config
// in client.rs
impl Client {
// --snip--
pub fn new(sdk_config: &aws_types::sdk_config::SdkConfig) -> Self {
Self::from_conf(sdk_config.into())
}
// --snip--
}
which in turn calls
// in config.rs
impl From<&aws_types::sdk_config::SdkConfig> for Builder {
fn from(input: &aws_types::sdk_config::SdkConfig) -> Self {
let mut builder = Builder::default();
builder = builder.region(input.region().cloned());
builder.set_endpoint_resolver(input.endpoint_resolver().clone());
builder.set_retry_config(input.retry_config().cloned());
builder.set_timeout_config(input.timeout_config().cloned());
builder.set_sleep_impl(input.sleep_impl());
builder.set_credentials_cache(input.credentials_cache().cloned());
builder.set_credentials_provider(input.credentials_provider().cloned());
builder.set_app_name(input.app_name().cloned());
builder.set_http_connector(input.http_connector().cloned());
builder
}
}
impl From<&aws_types::sdk_config::SdkConfig> for Config {
fn from(sdk_config: &aws_types::sdk_config::SdkConfig) -> Self {
Builder::from(sdk_config).build()
}
}
What this all means is that CredentialsCache needs to be accessible from aws-config, aws-types, and aws-sdk-s3 (SDK client crates, to be more generic). We originally assumed that CredentialsCache would be defined in aws-config along with LazyCredentialsCache, but the assumption no longer holds because aws-types and aws-sdk-s3 do not depend upon aws-config.
Therefore, we need to find a new place in which to create credentials caches accessible from the aforementioned crates.
Proposed Solution
We propose to move the following items to a new crate called aws-credential-types:
- All items in aws_types::credentials and their dependencies
- All items in aws_config::meta::credentials and their dependencies
For the first bullet point, we move the types and traits associated with credentials out of aws-types. Crucially, the ProvideCredentials trait now lives in aws-credential-types.
For the second bullet point, we move the items related to credentials caching. CredentialsCache with its Lazy variant and builder lives in aws-credential-types, and CredentialsCache::create_cache will be marked as pub. One adjustment we make, though, is that LazyCredentialsCache depends on aws_types::os_shim_internal::TimeSource, so we need to move TimeSource into aws-credential-types as well.
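For illustration only, code would then import the moved items from the new crate; the exact module paths inside aws-credential-types shown here are assumptions:

use aws_credential_types::provider::ProvideCredentials;
use aws_credential_types::cache::CredentialsCache;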
A result of the above arrangement will give us the following module dependencies (only showing what's relevant):
- :+1: aws_types::sdk_config::Builder and a service client config::Builder can create a SharedCredentialsCache with a concrete type of credentials cache.
with a concrete type of credentials cache. - :+1: It avoids cyclic crate dependencies.
- :-1: There is one more AWS runtime crate to maintain and version.
Rejected Alternative
An alternative design is to move the following items to a separate crate (tentatively called aws-XXX):
- All items in aws_types::sdk_config, i.e. SdkConfig and its builder
- All items in aws_types::credentials and their dependencies
- All items in aws_config::meta::credentials and their dependencies
The reason for the first bullet point is that the builder needs to be somewhere it has access to the credentials caching factory function, CredentialsCache::create_cache. The factory function is in aws-XXX, and if the builder stayed in aws-types, it would cause a cyclic dependency between those two crates.
A result of the above arrangement will give us the following module dependencies:
We have dismissed this design mainly because we want to move as little as possible out of the aws-types crate. Another downside is that SdkConfig sitting together with the items for credentials providers & caching does not give us a coherent mental model for the aws-XXX crate, making it difficult to choose the right name for XXX.
Changes Checklist
The following list does not repeat what is listed in the preceding RFC, but does include the new items mentioned in the Assumptions section:
- Create aws-credential-types
- Move all items in aws_types::credentials and their dependencies to the aws-credential-types crate
- Move all items in aws_config::meta::credentials and their dependencies to the aws-credential-types crate
- Update use statements and fully qualified names in the affected places
RFC: Serialization and Deserialization
Status: RFC
Applies to: Output, Input, and Builder types, as well as DateTime, Document, Blob, and Number implemented in the aws_smithy_types crate.
Terminology
- Builder: Refers to data types prefixed with Builder, which convert themselves into a corresponding data type upon being built, e.g. aws_sdk_dynamodb::input::PutItemInput.
- serde: Refers to the serde crate.
- Serialize: Refers to the Serialize trait available in the serde crate.
- Deserialize: Refers to the Deserialize trait available in the serde crate.
Overview
We are going to implement the Serialize and Deserialize traits from the serde crate for some data types. The affected data types are:
- builder data types
- operation Input types
- operation Output types
- data types that builder types may have in their field(s)
- aws_smithy_types::DateTime
- aws_smithy_types::Document
- aws_smithy_types::Blob
- aws_smithy_types::Number
DateTime and Blob implement different serialization/deserialization formats for human-readable and non-human-readable serializers; we must emphasize that these two formats are not compatible with each other. The reason for this is explained in the Blob and DateTime sections.
Additionally, we add fn set_fields to fluent builders to allow users to pass deserialized data to fluent builders (a sketch is shown below).
Lastly, we emphasize that this RFC does NOT aim to serialize the entire response or request, or to implement serde traits on data types for server-side code.
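A sketch of how the proposed set_fields method could be used, assuming the serde-deserialize feature is enabled and input is a PutItemInput deserialized from JSON elsewhere; the method name and signature are part of this proposal, not existing API:

let response = client
    .put_item()
    .set_fields(input) // copy all fields from the deserialized input onto the fluent builder
    .send()
    .await?;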
Use Case
Users have requested serde traits to be implemented on data types implemented in the Rust SDK. We have created this RFC with the following use cases in mind:
- [request]: Serialize/Deserialize of models for Lambda events #269
- Tests, as suggested in the design FAQ
- Building tools
Feature Gate
Enabling Feature
To enable any of the features from this RFC, users must pass --cfg aws-sdk-unstable to rustc. You can do this in .cargo/config.toml or via an environment variable.
- Specifying it in .cargo/config.toml:
[build]
rustflags = ["--cfg", "aws-sdk-unstable"]
- As an environment variable
export RUSTFLAGS="--cfg aws-sdk-unstable"
cargo build
We considered allowing users to enable this feature at the crate level, e.g.:
[dependencies]
aws_sdk_dynamodb = { version = "0.22.0", features = ["unstable", "serialize"] }
Compared to the cfg approach, this would be a lot easier for users to enable. However, we believe the cfg approach ensures users won't enable this feature by surprise, and communicates that features behind this gate can be taken away or experience breaking changes at any time in the future.
Feature Gate for Serialization and De-serialization
Serde traits are implemented behind feature gates. Serialize is implemented behind serde-serialize, while Deserialize is implemented behind serde-deserialize. Users must also enable the unstable cfg flag described above to expose these features.
We considered giving each feature a dedicated feature gate, such as unstable-serde-serialize. In that case, we would need to rename the feature gates entirely once they leave unstable status, which would force users to change their code base. We concluded that this brings no benefit to users.
Furthermore, we considered naming the feature gates serialize/deserialize. However, that would be confusing for users when we add support for a different serialization/deserialization framework, such as deser. Thus, to emphasize that the traits come from the serde crate, we decided on serde-serialize/serde-deserialize.
We considered keeping both features behind the same feature gate. There is no significant difference in the complexity of implementation. We do not see any benefit in keeping them behind the same feature gate as this will only increase compile time when users do not need one of the features.
Different feature gates for different data types
We considered implementing different feature gates for output, input, and their corresponding data types.
For example, output and input types could have output-serde-* and input-serde-*. We are unable to do this, as the relevant metadata is not available during code generation.
Implementation
Smithy Types
aws_smithy_types is the crate that implements Smithy's data types. These data types must implement the serde traits as well, since the SDK uses them.
Blob
Serialize and Deserialize are not implemented with the derive macro.
In human-readable formats, Blob is serialized as a base64-encoded string, and any data to be deserialized into this data type must be encoded in base64. Encoding must be carried out by the base64::encode function available from the aws_smithy_types crate.
Non-human-readable formats serialize Blob with fn serialize_bytes.
- Reason behind the implementation of the human-readable format
The aws_smithy_types crate comes with functions for encoding/decoding base64, which makes the implementation simpler.
Additionally, the AWS CLI and the AWS SDKs for other languages require data to be base64-encoded when a Blob is given as input.
We also considered serializing them with serialize_bytes without base64-encoding them first.
In this case, the resulting representation depends on how each serialization library implements serialize_bytes.
There are many different crates, so we decided to survey how some of the most popular crates implement this feature.
library | version | implementation | all-time downloads on crate.io as of writing (Dec 2022) |
---|---|---|---|
serde_json | 1.0 | Array of number | 109,491,713 |
toml | 0.5.9 | Array of number | 63,601,994 |
serde_yaml | 0.9.14 | Unsupported | 23,767,300 |
First of all, bytes could have hundreds of elements; reading an array of hundreds of numbers is never a pleasant experience, and it is especially troublesome when writing data for test cases. Additionally, it has come to our attention that some crates simply don't support serializing bytes at all, which would hinder users' productivity and tie their hands.
For the reasons described above, we believe it is crucial to encode the bytes as a string, and base64 is preferable to other encoding schemes such as base16, base32, or Ascii85.
- Reason behind the implementation of a non-human-readable format We considered using the same logic for non-human-readable formats as well. However, readability is not necessary there, and non-human-readable formats tend to emphasize resource efficiency more than human-readable ones; a base64-encoded string would take up more space, which is not what users want.
Thus, we believe that implementing a tailored serialization logic would be beneficial to the users.
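To illustrate the mechanism, a manual Serialize implementation can branch on the serializer's human-readable flag. This is only a sketch, not the generated code; the Blob stand-in and the base64_encode placeholder below are hypothetical, with the real helpers living in aws_smithy_types.
use serde::{Serialize, Serializer};

// Hypothetical stand-in for `aws_smithy_types::Blob`.
pub struct Blob {
    inner: Vec<u8>,
}

impl Serialize for Blob {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        if serializer.is_human_readable() {
            // Human-readable formats (e.g. JSON) get a base64-encoded string.
            serializer.serialize_str(&base64_encode(&self.inner))
        } else {
            // Non-human-readable formats get the raw bytes via `serialize_bytes`.
            serializer.serialize_bytes(&self.inner)
        }
    }
}

// Placeholder for the base64 helper; the RFC relies on the encode/decode
// functions shipped in `aws_smithy_types`.
fn base64_encode(_bytes: &[u8]) -> String {
    unimplemented!()
}
Deserialize would do the inverse: expect a base64 string in human-readable formats and raw bytes otherwise.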
DateTime
Serialize and Deserialize are not implemented with the derive macro.
In human-readable formats, DateTime is serialized in RFC 3339 format, and the value is expected to be in RFC 3339 format when it is deserialized.
Non-human-readable formats serialize DateTime as a tuple of u32 and i64; the i64 corresponds to the seconds field and the u32 to subsecond_nanos.
- Reason behind the implementation of a human-readable format
For serialization, the DateTime type already implements a function to encode itself into RFC 3339 format.
For deserialization, it would be possible to accept other formats; we can add this later if we find it reasonable.
- Reason behind the implementation of a non-human readable format
Serializing them as a tuple of two integers results in smaller data and requires less computing power than any string-based format. A tuple is also smaller than a map-based representation, as it does not require field tags.
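A sketch of this branching follows; the DateTime stand-in and the to_rfc3339 placeholder are illustrative, since the real type in aws_smithy_types already knows how to format itself.
use serde::{Serialize, Serializer};

// Hypothetical stand-in for `aws_smithy_types::DateTime`.
pub struct DateTime {
    seconds: i64,
    subsecond_nanos: u32,
}

impl DateTime {
    // Placeholder: the real type already implements RFC 3339 formatting.
    fn to_rfc3339(&self) -> String {
        unimplemented!()
    }
}

impl Serialize for DateTime {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        if serializer.is_human_readable() {
            // Human-readable formats get an RFC 3339 string.
            serializer.serialize_str(&self.to_rfc3339())
        } else {
            // Non-human-readable formats get a `(subsecond_nanos, seconds)` tuple.
            (self.subsecond_nanos, self.seconds).serialize(serializer)
        }
    }
}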
Document
Serialize and Deserialize are implemented with the derive macro.
Additionally, the type carries the container attribute #[serde(untagged)].
Serde can distinguish each variant without tagging thanks to the difference in each variant's datatypes.
Number
Serialize and Deserialize are implemented with the derive macro.
Additionally, the type carries the container attribute #[serde(untagged)].
Serde can distinguish each variant without a tag as each variant's content is different.
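For instance, a simplified stand-in for Number (not the exact aws_smithy_types definition) shows how untagged variants round-trip because their content types differ:
use serde::{Deserialize, Serialize};

// Simplified stand-in for `aws_smithy_types::Number`.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
#[serde(untagged)]
enum Number {
    PosInt(u64),
    NegInt(i64),
    Float(f64),
}

fn main() {
    // Serializes as the bare value, with no tag.
    let json = serde_json::to_string(&Number::Float(1.5)).unwrap();
    assert_eq!(json, "1.5");
    // Deserialization picks the variant whose content type matches.
    let n: Number = serde_json::from_str("42").unwrap();
    assert_eq!(n, Number::PosInt(42));
}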
Builder Types and Non-Builder Types
Builder types and non-builder types implement Serialize and Deserialize with the derive macro.
Example:
#[cfg_attr(
all(aws-sdk-unstable, feature = "serialize"),
derive(serde::Serialize)
)]
#[cfg_attr(
all(aws-sdk-unstable, feature = "deserialize"),
derive(serde::Deserialize)
)]
#[non_exhaustive]
#[derive(std::clone::Clone, std::cmp::PartialEq)]
pub struct UploadPartCopyOutput {
...
}
Enum Representation
serde
allows programmers to use one of four tagging styles (internal, external, adjacent, and untagged) when serializing an enum.
untagged
In some cases, untagged serialized data cannot be deserialized back.
For example, aws_sdk_dynamodb::model::AttributeValue has both Null(bool) and Bool(bool) variants, whose serialized values cannot be distinguished without a tag.
internal
This results in a compile-time error: using a #[serde(tag = "...")] attribute on an enum containing a tuple variant is rejected by the compiler.
external and adjacent
We are left with external
and adjacent
tagging.
External tagging is serde's default.
This RFC can be achieved either way.
The resulting size of the serialized data is smaller when tagged externally, as adjacent tagging will require a tag even when a variant has no content.
For the reasons mentioned above, we implement an enum that is externally tagged.
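As a quick illustration of why this resolves the Null(bool)/Bool(bool) ambiguity mentioned earlier (a simplified enum, not the generated AttributeValue type):
use serde::{Deserialize, Serialize};

// Simplified sketch of the problematic variants; externally tagged (serde's default).
#[derive(Serialize, Deserialize, Debug, PartialEq)]
enum AttributeValue {
    Null(bool),
    Bool(bool),
}

fn main() {
    let null = serde_json::to_string(&AttributeValue::Null(true)).unwrap();
    let boolean = serde_json::to_string(&AttributeValue::Bool(true)).unwrap();
    // The external tag keeps the two variants distinguishable.
    assert_eq!(null, r#"{"Null":true}"#);
    assert_eq!(boolean, r#"{"Bool":true}"#);
    // Round-tripping recovers the original variant.
    let round_trip: AttributeValue = serde_json::from_str(&null).unwrap();
    assert_eq!(round_trip, AttributeValue::Null(true));
}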
Data Types to Skip Serialization/Deserialization
We are going to skip serialization and deserialization of fields whose data types correspond to @streaming blob in Smithy.
Any field with such a data type is tagged with #[serde(skip)].
When a field is skipped, its value is assigned the value generated by the Default trait on deserialization.
As of writing, aws_smithy_http::byte_stream::ByteStream
is the only data type that is affected by this decision.
Here is an example of data types affected by this decision:
aws_sdk_s3::input::put_object_input::PutObjectInput
We considered serializing them as bytes; however, it could take a long time for a stream to reach its end, and the resulting serialized data may be too big to fit into RAM.
Here is an example snippet.
#[allow(missing_docs)]
#[cfg_attr(
all(aws-sdk-unstable, feature = "serde-serialize"),
derive(serde::Serialize)
)]
#[cfg_attr(
all(aws-sdk-unstable, feature = "serde-deserialize"),
derive(serde::Deserialize)
)]
#[non_exhaustive]
#[derive(std::fmt::Debug)]
pub struct PutObjectInput {
pub acl: std::option::Option<crate::model::ObjectCannedAcl>,
pub body: aws_smithy_http::byte_stream::ByteStream,
// ... other fields
}
Data types to exclude from ser/de code generation
For data types that include @streaming union
in any of their fields, we do NOT implement serde
traits.
As of writing, the following Rust data types correspond to @streaming union:
aws_smithy_http::event_stream::Receiver
aws_smithy_http::event_stream::EventStreamSender
Here is an example of data type affected by this decision;
aws_sdk_transcribestreaming::client::fluent_builders::StartMedicalStreamTranscription
We considered skipping the relevant fields on serialization and creating a custom de-serialization function that produces an event stream which always results in an error when a user tries to send or receive data. However, we believe that our decision is justified for the following reasons.
- For operations that feature event streams, the stream is ephemeral (tied to the HTTP connection) and is effectively unusable after serialization and deserialization.
- Most event stream operations don't have other fields that go along with them, making the stream their sole component, so ser/de would not be very useful for them anyway.
- SDKs that use event streams, such as aws-sdk-transcribestreaming, have just over 5,000 all-time downloads and just under 1,000 recent downloads as of writing (2023-01-21); this makes the implementation difficult to justify since it would benefit a comparatively small number of users.
Serde
traits implemented on Builder of Output Types
Output data, such as aws_sdk_dynamodb::output::UpdateTableOutput
has builder types.
These builder types are available to users; however, no API requires users to build these data types by themselves.
We considered removing the traits from these builder types.
Removing serde traits from them would help reduce compile time; however, builder types can be useful, for example, for testing. We have prepared examples here.
fn set_fields
to allow users to use externally created Input
Currently, to set values on fluent builders, users must call the setter method for each field.
The SDK does not have a method that allows users to supply a deserialized Input directly.
Thus, we add a new method, fn set_fields, to the fluent builders on Client types.
This method accepts an input and replaces all parameters that the fluent builder currently holds.
pub fn set_fields(mut self, input: path::to::input_type) -> Self {
    self.inner = input;
    self
}
Users can use fn set_fields
to replace the parameters in fluent builders.
You can find examples here.
Other Concerns
Model evolution
The SDK will introduce new fields, and we may see new data types in the future.
We believe that this will not be a problem.
Introduction of New Fields
Most fields are of type Option<T>.
When a user deserializes data written before the new fields were introduced, the new fields will be assigned None.
If a field isn't an Option, serde uses the Default trait to generate a value to fill the field, unless custom de-serialization/serialization logic is specified.
If the new field is not an Option<T> and its type has no Default implementation, we must implement custom de-serialization logic.
In the case of serialization, the introduction of new fields will not be an issue unless the data format requires a schema (e.g. Parquet, Avro); however, this is outside the scope of this RFC.
Introduction of New Data Type
If a new field introduces a new data type, it will not require any additional work if the data type can derive serde
traits.
If the data type cannot derive serde traits on its own, then we have two options.
To clarify, this is the same approach we took in the Data Types to Skip Serialization/Deserialization section above.
- Skip: we will simply skip serializing/de-serializing the field. However, we may need to implement custom serialization/de-serialization logic if the value is not wrapped in Option.
- Custom serialization/de-serialization logic: we can implement tailored serialization/de-serialization logic.
Either way, we will mention this on the generated docs to avoid surprising users.
e.g.
#[derive(serde::Serialize, serde::Deserialize)]
struct OutputV1 {
string_field: Option<String>
}
#[derive(serde::Serialize, serde::Deserialize)]
struct OutputV2 {
string_field: Option<String>,
// this will always be treated as None value by serde
#[serde(skip)]
skip_not_serializable: Option<SomeComplexDataType>,
// We can implement a custom serialization logic
#[serde(serialize_with = "custom_serilization_logic", deserialize_with = "custom_deserilization_logic")]
not_derive_able: SomeComplexDataType,
// Serialization will be skipped, and de-serialization will be handled with the function provided on default tag
#[serde(skip, default = "default_value")]
skip_with_custom: DataTypeWithoutDefaultTrait,
}
Discussions
Sensitive Information
If serialized data contains sensitive information, it will not be masked. We note on every struct field that serialized data may expose such information, to ensure that users are aware of this.
Compile Time
We ran the following benchmark on a c6a.2xlarge instance with 50 GB of gp2 SSD. The commit hash of the code is a8e2e19129aead4fbc8cf0e3d34df0188a62de9f.
It clearly shows an increase in compile time. Users are advised to consider tools such as sccache or mold to reduce compile time.
- aws-sdk-dynamodb
  - when compiled with the debug profile

    command | real time | user time | sys time
    ---|---|---|---
    cargo build | 0m35.728s | 2m24.243s | 0m11.868s
    cargo build --features unstable-serde-serialize | 0m38.079s | 2m26.082s | 0m11.631s
    cargo build --features unstable-serde-deserialize | 0m45.689s | 2m34.000s | 0m11.978s
    cargo build --all-features | 0m48.959s | 2m45.688s | 0m13.359s

  - when compiled with the release profile

    command | real time | user time | sys time
    ---|---|---|---
    cargo build --release | 0m52.040s | 5m0.841s | 0m11.313s
    cargo build --release --features unstable-serde-serialize | 0m53.153s | 5m4.069s | 0m11.577s
    cargo build --release --features unstable-serde-deserialize | 1m0.107s | 5m10.231s | 0m11.699s
    cargo build --release --all-features | 1m3.198s | 5m26.076s | 0m12.311s

- aws-sdk-ec2
  - when compiled with the debug profile

    command | real time | user time | sys time
    ---|---|---|---
    cargo build | 1m20.041s | 2m14.592s | 0m6.611s
    cargo build --features unstable-serde-serialize | 2m0.555s | 4m24.881s | 0m16.131s
    cargo build --features unstable-serde-deserialize | 3m10.857s | 5m34.246s | 0m18.844s
    cargo build --all-features | 3m31.473s | 6m1.052s | 0m19.681s

  - when compiled with the release profile

    command | real time | user time | sys time
    ---|---|---|---
    cargo build --release | 2m29.480s | 9m19.530s | 0m15.957s
    cargo build --release --features unstable-serde-serialize | 2m45.002s | 9m43.098s | 0m16.886s
    cargo build --release --features unstable-serde-deserialize | 3m47.531s | 10m52.017s | 0m18.404s
    cargo build --release --all-features | 3m45.208s | 8m46.168s | 0m10.211s
Misleading Results
The SDK team previously expressed concern that serialized data may be misleading. We believe that the features implemented as part of this RFC do not produce misleading results, as we focus on builder types and their corresponding data types, which are mapped to serde's data model with the derive macro.
Appendix
Use Case Examples
use aws_sdk_dynamodb::{Client, Error};
async fn example(read_builder: bool) -> Result<(), Error> {
    // getting the client
    let shared_config = aws_config::load_from_env().await;
    let client = Client::new(&shared_config);
    // de-serializing the input's builder type or the input type itself from JSON
    let deserialized_input = if read_builder {
        let parameter: aws_sdk_dynamodb::input::list_tables_input::Builder =
            serde_json::from_str(include_str!("./builder.json")).unwrap();
        parameter
            .set_exclusive_start_table_name(Some("some_name".to_string()))
            .build()
            .expect("invalid input")
    } else {
        let input: aws_sdk_dynamodb::input::ListTablesInput =
            serde_json::from_str(include_str!("./input.json")).unwrap();
        input
    };
    // sending the request using the deserialized input
    let res = client.list_tables().set_fields(deserialized_input).send().await?;
    println!("DynamoDB tables: {:?}", res.table_names());
    let out: aws_sdk_dynamodb::output::ListTablesOutput = {
        // say you want some of the fields to have certain values
        let out_builder: aws_sdk_dynamodb::output::list_tables_output::Builder =
            serde_json::from_str(r#"{ "table_names": [ "table1", "table2" ] }"#).unwrap();
        // but you don't really care about some other values
        out_builder
            .set_last_evaluated_table_name(res.last_evaluated_table_name().map(String::from))
            .build()
    };
    assert_eq!(res, out);
    // serializing the output to JSON
    let json_output = serde_json::to_string(&res).unwrap();
    // you can save the serialized output
    println!("{json_output}");
    Ok(())
}
Changes checklist
- Implement human-readable serialization for DateTime and Blob in aws_smithy_types
- Implement non-human-readable serialization for DateTime and Blob in aws_smithy_types
- Implement Serialize and Deserialize for relevant data types in aws_smithy_types
- Modify the Kotlin codegen so that generated builder and non-builder types implement Serialize and Deserialize
- Add feature gates for Serialize and Deserialize
- Prepare examples
- Prepare a reproducible compile-time benchmark
RFC: Providing fallback credentials on external timeout
Status: Implemented in smithy-rs#2246
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC proposes a fallback mechanism for credentials providers on external timeout (see the Terminology section), allowing them to continue serving (possibly expired) credentials for the sake of overall reliability of the intended service; the IMDS credentials provider is an example that must fulfill such a requirement to support static stability.
Terminology
- External timeout: The name of the timeout that occurs when a duration elapses before an async call to
provide_credentials
returns. In this case,provide_credentials
returns no credentials. - Internal timeout: The name of the timeout that occurs when a duration elapses before an async call to some function, inside the implementation of
provide_credentials
, returns. Examples include connection timeouts, TLS negotiation timeouts, and HTTP request timeouts. Implementations ofprovide_credentials
may handle these failures at their own discretion e.g. by returning (possibly expired) credentials or aCredentialsError
. - Static stability: Continued availability of a service in the face of impaired dependencies.
Assumption
This RFC is concerned only with external timeouts, as the cost of poor API design is much higher in this case than for internal timeouts. The former will affect a public trait implemented by all credentials providers whereas the latter can be handled locally by individual credentials providers without affecting one another.
Problem
We have mentioned static stability. Supporting it calls for the following functional requirement, among others:
- REQ 1: Once a credentials provider has served credentials, it should continue serving them in the event of a timeout (whether internal or external) while obtaining refreshed credentials.
Today, we have the following trait method to obtain credentials:
fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a>
where
Self: 'a,
This method returns a future, which can be raced against a timeout future as demonstrated by the following code snippet from LazyCredentialsCache
:
let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds.
// --snip--
let future = Timeout::new(provider.provide_credentials(), timeout_future);
let result = cache
.get_or_load(|| async move {
let credentials = future.await.map_err(|_err| {
CredentialsError::provider_timed_out(load_timeout)
})??;
// --snip--
}).await;
// --snip--
This creates an external timeout for provide_credentials
. If timeout_future
wins the race, a future for provide_credentials
gets dropped, timeout_future
returns an error, and the error is mapped to CredentialsError::ProviderTimedOut
and returned. This makes it impossible for the variable provider
above to serve credentials as stated in REQ 1.
A more complex use case involves CredentialsProviderChain
. It is a manifestation of the chain of responsibility pattern and keeps calling the provide_credentials
method on each credentials provider down the chain until credentials are returned by one of them. In addition to REQ 1, we have the following functional requirement with respect to CredentialsProviderChain
:
- REQ 2: Once a credentials provider in the chain has returned credentials, it should continue serving them even in the event of a timeout (whether internal or external) without falling back to another credentials provider.
Referring back to the code snippet above, we analyze two relevant cases (and suppose provider 2 below must meet REQ 1 and REQ 2 in each case):
Case 1: Provider 2 successfully loaded credentials but later failed to do so because an external timeout kicked in.
The figure above illustrates an example. This CredentialsProviderChain
consists of three credentials providers. When CredentialsProviderChain::provide_credentials
is called, provider 1's provide_credentials
is called but does not find credentials so passes the torch to provider 2, which in turn successfully loads credentials and returns them. The next time the method is called, provider 1 does not find credentials but neither does provider 2 this time, because an external timeout by timeout_future
given to the whole chain kicked in and the future is dropped while provider 2's provide_credentials
was running. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from LazyCredentialsCache
returns a CredentialsError::ProviderTimedOut
instead.
Case 2: Provider 2 successfully loaded credentials but later was not reached because its preceding provider was still running when an external timeout kicked in.
The figure above illustrates an example with the same setting as the previous figure. Again, when CredentialsProviderChain::provide_credentials
is called the first time, provider 1 does not find credentials but provider 2 does. The next time the method is called, provider 1 is still executing provide_credentials
and then an external timeout by timeout_future
kicked in. Consequently, the execution of CredentialsProviderChain::provide_credentials
has been terminated. Given the functional requirements, provider 2 should return the previously available credentials but today the code snippet from LazyCredentialsCache
returns CredentialsError::ProviderTimedOut
instead.
Proposal
To address the problem in the previous section, we propose to add a new method to the ProvideCredentials
trait called fallback_on_interrupt
. This method allows credentials providers to have a fallback mechanism on an external timeout and to serve credentials to users if needed. There are two options as to how it is implemented, either as a synchronous primitive or as an asynchronous primitive.
Option A: Synchronous primitive
pub trait ProvideCredentials: Send + Sync + std::fmt::Debug {
// --snip--
fn fallback_on_interrupt(&self) -> Option<Credentials> {
None
}
}
- :+1: Users can be guided to use only synchronous primitives when implementing
fallback_on_interrupt
. - :-1: It cannot support cases where fallback credentials are asynchronously retrieved.
- :-1: It may turn into a blocking operation if it takes longer than it should.
Option B: Asynchronous primitive
mod future {
// --snip--
// This cannot use `OnlyReady` in place of `BoxFuture` because
// when a chain of credentials providers implements its own
// `fallback_on_interrupt`, it needs to await fallback credentials
// in its inner providers. Thus, `BoxFuture` is required.
pub struct FallbackOnInterrupt<'a>(NowOrLater<Option<Credentials>, BoxFuture<'a, Option<Credentials>>>);
// impls for FallbackOnInterrupt similar to those for the ProvideCredentials future newtype
}
pub trait ProvideCredentials: Send + Sync + std::fmt::Debug {
// --snip--
fn fallback_on_interrupt<'a>(&'a self) -> future::FallbackOnInterrupt<'a> {
future::FallbackOnInterrupt::ready(None)
}
}
- :+1: It is async from the beginning, so less likely to introduce a breaking change.
- :-1: We may have to consider yet another timeout for
fallback_on_interrupt
itself.
Option A is not reversible in the future if we are to support the use case of asynchronously retrieving the fallback credentials, whereas option B allows us to continue supporting both ready and pending futures when retrieving fallback credentials. However, fallback_on_interrupt is supposed to return credentials that have been set aside in case provide_credentials times out. To express that intent, we choose option A and document that users should NOT go and fetch new credentials in fallback_on_interrupt.
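As an illustration, a provider that wants to participate might stash the credentials it last loaded and hand them back synchronously. This is only a sketch: MyProvider and load_fresh_credentials are hypothetical, while Credentials, CredentialsError, ProvideCredentials, and future refer to the types discussed in this RFC.
use std::sync::Mutex;

#[derive(Debug)]
struct MyProvider {
    // Credentials set aside for `fallback_on_interrupt`.
    last_loaded: Mutex<Option<Credentials>>,
}

impl MyProvider {
    // Hypothetical helper that actually goes and fetches fresh credentials.
    async fn load_fresh_credentials(&self) -> Result<Credentials, CredentialsError> {
        // --snip--
        unimplemented!()
    }
}

impl ProvideCredentials for MyProvider {
    fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a>
    where
        Self: 'a,
    {
        future::ProvideCredentials::new(async move {
            let creds = self.load_fresh_credentials().await?;
            // Remember the credentials so `fallback_on_interrupt` can serve them
            // if a later call is cut short by an external timeout.
            *self.last_loaded.lock().unwrap() = Some(creds.clone());
            Ok(creds)
        })
    }

    fn fallback_on_interrupt(&self) -> Option<Credentials> {
        // Synchronous and cheap: only returns what was already set aside,
        // never goes out to fetch new credentials.
        self.last_loaded.lock().unwrap().clone()
    }
}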
The user experience for the code snippet in question will look like this once this proposal is implemented:
let timeout_future = self.sleeper.sleep(self.load_timeout); // by default self.load_timeout is 5 seconds.
// --snip--
let future = Timeout::new(provider.provide_credentials(), timeout_future);
let result = cache
.get_or_load(|| {
async move {
let credentials = match future.await {
Ok(creds) => creds?,
Err(_err) => match provider.fallback_on_interrupt() { // can provide fallback credentials
Some(creds) => creds,
None => return Err(CredentialsError::provider_timed_out(load_timeout)),
}
};
// --snip--
}
}).await;
// --snip--
How to actually implement this RFC
Almost all credentials providers do not have to implement their own fallback_on_interrupt
except for CredentialsProviderChain
(ImdsCredentialsProvider
also needs to implement fallback_on_interrupt
when we are adding static stability support to it but that is outside the scope of this RFC).
Considering the two cases we analyzed above, implementing CredentialsProviderChain::fallback_on_interrupt
is not so straightforward. Keeping track of whose turn in the chain it is to call provide_credentials
when an external timeout has occurred is a challenging task. Even if we figured it out, that would still not satisfy Case 2
above, because it was provider 1 that was actively running when the external timeout kicked in, but the chain should return credentials from provider 2, not from provider 1.
With that in mind, consider instead the following approach:
impl ProvideCredentials for CredentialsProviderChain {
// --snip--
fn fallback_on_interrupt(&self) -> Option<Credentials> {
for (_, provider) in &self.providers {
match provider.fallback_on_interrupt() {
creds @ Some(_) => return creds,
None => {}
}
}
None
}
}
CredentialsProviderChain::fallback_on_interrupt
will invoke each provider's fallback_on_interrupt
method until credentials are returned by one of them. It ensures that the updated code snippet for LazyCredentialsCache
can return credentials from provider 2 in both Case 1
and Case 2
. Even if timeout_future
wins the race, the execution subsequently calls provider.fallback_on_interrupt()
to obtain fallback credentials from provider 2, assuming provider 2's fallback_on_interrupt
is implemented to return fallback credentials accordingly.
The downside of this simple approach is that the behavior is not clear if more than one credentials provider in the chain can return credentials from their fallback_on_interrupt
. Note, however, that it is the exception rather than the norm for a provider's fallback_on_interrupt
to return fallback credentials, at least at the time of writing (01/13/2023). The fact that it returns fallback credentials means that the provider successfully loaded credentials at least once, and it usually continues serving credentials on subsequent calls to provide_credentials
.
Should we have more than one provider in the chain that can potentially return fallback credentials from fallback_on_interrupt
, we could make the behavior of CredentialsProviderChain configurable, controlling in what order and how each provider's fallback_on_interrupt is executed. See the Possible enhancement
section for more details. The use case described there is an extreme edge case, but it's worth exploring what options are available to us with the proposed design.
Alternative
In this section, we will describe an alternative approach that we ended up dismissing as unworkable.
Instead of fallback_on_interrupt
, we considered the following method to be added to the ProvideCredentials
trait:
pub trait ProvideCredentials: Send + Sync + std::fmt::Debug {
// --snip--
/// Returns a future that provides credentials within the given `timeout`.
///
/// The default implementation races `provide_credentials` against
/// a timeout future created from `timeout`.
fn provide_credentials_with_timeout<'a>(
&'a self,
sleeper: Arc<dyn AsyncSleep>,
timeout: Duration,
) -> future::ProvideCredentials<'a>
where
Self: 'a,
{
let timeout_future = sleeper.sleep(timeout);
let future = Timeout::new(self.provide_credentials(), timeout_future);
future::ProvideCredentials::new(async move {
let credentials = future
.await
.map_err(|_err| CredentialsError::provider_timed_out(timeout))?;
credentials
})
    }
}
provide_credentials_with_timeout
encapsulated the timeout race and allowed users to specify how long the external timeout for provide_credentials
would be. The code snippet from LazyCredentialsCache
then looked like
let sleeper = Arc::clone(&self.sleeper);
let load_timeout = self.load_timeout; // by default self.load_timeout is 5 seconds.
// --snip--
let result = cache
.get_or_load(|| {
async move {
let credentials = provider
.provide_credentials_with_timeout(sleeper, load_timeout)
.await?;
// --snip--
}
}).await;
// --snip--
However, implementing CredentialsProviderChain::provide_credentials_with_timeout
quickly ran into the following problem:
impl ProvideCredentials for CredentialsProviderChain {
// --snip--
fn provide_credentials_with_timeout<'a>(
&'a self,
sleeper: Arc<dyn AsyncSleep>,
timeout: Duration,
) -> future::ProvideCredentials<'a>
where
Self: 'a,
{
future::ProvideCredentials::new(self.credentials_with_timeout(sleeper, timeout))
}
}
impl CredentialsProviderChain {
// --snip--
async fn credentials_with_timeout(
&self,
sleeper: Arc<dyn AsyncSleep>,
timeout: Duration,
) -> provider::Result {
for (_, provider) in &self.providers {
match provider
.provide_credentials_with_timeout(Arc::clone(&sleeper), /* how do we calculate timeout for each provider ? */)
.await
{
Ok(credentials) => {
return Ok(credentials);
}
Err(CredentialsError::ProviderTimedOut(_)) => {
// --snip--
}
Err(err) => {
// --snip--
}
}
}
Err(CredentialsError::provider_timed_out(timeout))
    }
}
There are mainly two problems with this approach. The first problem is that as shown above, there is no sensible way to calculate a timeout for each provider in the chain. The second problem is that exposing a parameter like timeout
at a public trait's level is giving too much control to users; delegating overall timeout to the individual provider means each provider has to get it right.
Changes checklist
- Add the fallback_on_interrupt method to the ProvideCredentials trait with a default implementation
- Implement CredentialsProviderChain::fallback_on_interrupt
- Implement DefaultCredentialsChain::fallback_on_interrupt
- Add unit tests for Case 1 and Case 2
Possible enhancement
We will describe how to customize the behavior for CredentialsProviderChain::fallback_on_interrupt
. We are only demonstrating how much the proposed design can be extended and currently do not have concrete use cases to implement using what we present in this section.
As described in the Proposal section, CredentialsProviderChain::fallback_on_interrupt
traverses the chain from the head to the tail and returns the first fallback credentials found. This precedence policy works most of the time, but when we have more than one provider in the chain that can potentially return fallback credentials, it could break in the following edge case (we are still basing our discussion on the code snippet from LazyCredentialsCache
but forget REQ 1 and REQ 2 for the sake of simplicity).
During the first call to CredentialsProviderChain::provide_credentials
, provider 1 fails to load credentials, maybe due to an internal timeout, and then provider 2 succeeds in loading its credentials (call them credentials 2) and internally stores them for Provider2::fallback_on_interrupt
to return them subsequently. During the second call, provider 1 succeeds in loading credentials (call them credentials 1) and internally stores them for Provider1::fallback_on_interrupt
to return them subsequently. Suppose, however, that credentials 1's expiry is earlier than credentials 2's expiry. Finally, during the third call, CredentialsProviderChain::provide_credentials
did not complete due to an external timeout. CredentialsProviderChain::fallback_on_interrupt
then returns credentials 1, when it should return credentials 2 whose expiry is later, because of the precedence policy.
This is a case where CredentialsProviderChain::fallback_on_interrupt
requires the recency policy for fallback credentials found in provider 1 and provider 2, not the precedence policy. The following figure shows how we can set up such a chain:
The outermost chain is a CredentialsProviderChain
and follows the precedence policy for fallback_on_interrupt
. It contains a sub-chain that, in turn, contains provider 1 and provider 2. This sub-chain implements its own fallback_on_interrupt
to realize the recency policy for fallback credentials found in provider 1 and provider 2. Conceptually, we have
pub struct FallbackRecencyChain {
provider_chain: CredentialsProviderChain,
}
impl ProvideCredentials for FallbackRecencyChain {
fn provide_credentials<'a>(&'a self) -> future::ProvideCredentials<'a>
where
Self: 'a,
{
// Can follow the precedence policy for loading credentials
// if it chooses to do so.
}
fn fallback_on_interrupt(&self) -> Option<Credentials> {
// Iterate over `self.provider_chain` and return
// fallback credentials whose expiry is the most recent.
}
}
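A sketch of that recency policy follows. It assumes a hypothetical providers() accessor over the sub-chain's inner providers and relies on the expiry accessor exposed by Credentials; it is illustrative only.
impl ProvideCredentials for FallbackRecencyChain {
    // --snip--
    fn fallback_on_interrupt(&self) -> Option<Credentials> {
        // Ask every inner provider for its fallback credentials and keep the
        // set whose expiry is the most recent.
        self.provider_chain
            .providers() // hypothetical accessor over the inner providers
            .filter_map(|provider| provider.fallback_on_interrupt())
            .max_by_key(|credentials| credentials.expiry())
    }
}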
We can then compose the entire chain like so:
let provider_1 = /* ... */
let provider_2 = /* ... */
let provider_3 = /* ... */
let sub_chain = CredentialsProviderChain::first_try("Provider1", provider_1)
.or_else("Provider2", provider_2);
let recency_chain = /* Create a FallbackRecencyChain with sub_chain */
let final_chain = CredentialsProviderChain::first_try("fallback_recency", recency_chain)
.or_else("Provider3", provider_3);
The fallback_on_interrupt
method on final_chain
still traverses from the head to the tail, but once it hits recency_chain
, fallback_on_interrupt
on recency_chain
respects the expiry of fallback credentials found in its inner providers.
What we have presented in this section can be generalized thanks to chain composability. We could have different sub-chains, each implementing its own policy for fallback_on_interrupt
.
RFC: Better Constraint Violations
Status: Accepted
Applies to: server
During and after the design and the core implementation of constraint traits in the server SDK, some problems relating to constraint violations were identified. This RFC sets out to explain and address three of them: impossible constraint violations, collecting constraint violations, and the "tightness" of constraint violations. The RFC explains each of them in turn, solving them in an iterative and pedagogical manner, i.e. the solution of a problem depends on the previous ones having been solved with their proposed solutions. The three problems are meant to be addressed atomically in one changeset (see the Checklist section).
Note: code snippets from generated SDKs in this document are abridged so as to
be didactic and relevant to the point being made. They are accurate with
regards to commit 2226fe
.
Terminology
The design and the description of the PR where the core implementation of constraint traits was made are recommended prior reading to understand this RFC.
- Shape closure: the set of shapes a shape can "reach", including itself.
- Transitively constrained shape: a shape whose closure includes:
- a shape with a constraint trait attached,
- a (member) shape with a
required
trait attached, - an
enum
shape; or - an
intEnum
shape.
- A directly constrained shape is any of these:
- a shape with a constraint trait attached,
- a (member) shape with a
required
trait attached, - an
enum
shape, - an
intEnum
shape; or - a
structure
shape with at least onerequired
member shape.
- Constrained type: the Rust type a constrained shape gets rendered as. For
shapes that are not
structure
,union
,enum
orintEnum
shapes, these are wrapper newtypes.
In the absence of a qualifier, "constrained shape" should be interpreted as "transitively constrained shape".
Impossible constraint violations
Background
A constrained type has a fallible constructor by virtue of it implementing the
TryFrom
trait. The error type this constructor may yield is known as a
constraint violation:
impl TryFrom<UnconstrainedType> for ConstrainedType {
type Error = ConstraintViolation;
fn try_from(value: UnconstrainedType) -> Result<Self, Self::Error> {
...
}
}
The ConstraintViolation
type is a Rust enum
with one variant per way
"constraining" the input value may fail. So, for example, the following Smithy
model:
structure A {
@required
member: String,
}
Yields:
/// See [`A`](crate::model::A).
pub mod a {
#[derive(std::cmp::PartialEq, std::fmt::Debug)]
/// Holds one variant for each of the ways the builder can fail.
pub enum ConstraintViolation {
/// `member` was not provided but it is required when building `A`.
MissingMember,
}
}
Constraint violations are always Rust enum
s, even if they only have one
variant.
Constraint violations can occur in application code:
use my_server_sdk::model
let res = model::a::Builder::default().build(); // We forgot to set `member`.
match res {
Ok(a) => { ... },
Err(e) => {
assert_eq!(model::a::ConstraintViolation::MissingMember, e);
}
}
Problem
Currently, the constraint violation types we generate are used by both:
- the server framework upon request deserialization; and
- by users in application code.
However, the kinds of constraint violations that can occur in application code can sometimes be a strict subset of those that can occur during request deserialization.
Consider the following model:
@length(min: 1, max: 69)
map LengthMap {
key: String,
value: LengthString
}
@length(min: 2, max: 69)
string LengthString
This produces:
pub struct LengthMap(
pub(crate) std::collections::HashMap<std::string::String, crate::model::LengthString>,
);
impl
std::convert::TryFrom<
std::collections::HashMap<std::string::String, crate::model::LengthString>,
> for LengthMap
{
type Error = crate::model::length_map::ConstraintViolation;
/// Constructs a `LengthMap` from an
/// [`std::collections::HashMap<std::string::String,
/// crate::model::LengthString>`], failing when the provided value does not
/// satisfy the modeled constraints.
fn try_from(
value: std::collections::HashMap<std::string::String, crate::model::LengthString>,
) -> Result<Self, Self::Error> {
let length = value.len();
if (1..=69).contains(&length) {
Ok(Self(value))
} else {
Err(crate::model::length_map::ConstraintViolation::Length(length))
}
}
}
pub mod length_map {
pub enum ConstraintViolation {
Length(usize),
Value(
std::string::String,
crate::model::length_string::ConstraintViolation,
),
}
...
}
Observe how the ConstraintViolation::Value
variant is never constructed.
Indeed, this variant cannot be constructed in application code: a
user has to provide a map whose values are already constrained LengthString
s
to the try_from
constructor, which only enforces the map's @length
trait.
The reason why these seemingly "impossible violations" are being generated is because they can arise during request deserialization. Indeed, the server framework deserializes requests into fully unconstrained types. These are types holding unconstrained types all the way through their closures. For instance, in the case of structure shapes, builder types (the unconstrained type corresponding to the structure shape) hold builders all the way down.
In the case of the above model, below is the alternate pub(crate)
constructor
the server framework uses upon deserialization. Observe how
LengthMapOfLengthStringsUnconstrained
is fully unconstrained and how the
try_from
constructor can yield ConstraintViolation::Value
.
pub(crate) mod length_map_of_length_strings_unconstrained {
#[derive(Debug, Clone)]
pub(crate) struct LengthMapOfLengthStringsUnconstrained(
pub(crate) std::collections::HashMap<std::string::String, std::string::String>,
);
impl std::convert::TryFrom<LengthMapOfLengthStringsUnconstrained>
for crate::model::LengthMapOfLengthStrings
{
type Error = crate::model::length_map_of_length_strings::ConstraintViolation;
fn try_from(value: LengthMapOfLengthStringsUnconstrained) -> Result<Self, Self::Error> {
let res: Result<
std::collections::HashMap<std::string::String, crate::model::LengthString>,
Self::Error,
> = value
.0
.into_iter()
.map(|(k, v)| {
let v: crate::model::LengthString = v.try_into().map_err(|inner| Self::Error::Value(k.clone(), inner))?;
Ok((k, v))
})
.collect();
let hm = res?;
Self::try_from(hm)
}
}
}
In conclusion, the user is currently exposed to an internal detail of how the
framework operates that has no bearing on their application code. They
shouldn't be exposed to impossible constraint violation variants in their Rust
docs, nor have to match
on these variants when handling errors.
Note: this comment alludes to the problem described above.
Solution proposal
The problem can be mitigated by adding #[doc(hidden)]
to the internal
variants and #[non_exhaustive]
to the enum. We're already doing this in some
constraint violation types.
However, a "less leaky" solution is achieved by splitting the constraint violation type into two types, which this RFC proposes:
- one for use by the framework, with
pub(crate)
visibility, namedConstraintViolationException
; and - one for use by user application code, with
pub
visibility, namedConstraintViolation
.
pub mod length_map {
pub enum ConstraintViolation {
Length(usize),
}
pub (crate) enum ConstraintViolationException {
Length(usize),
Value(
std::string::String,
crate::model::length_string::ConstraintViolation,
),
}
}
Note that, to some extent, the spirit of this approach is already currently
present
in the case of builder types when publicConstrainedTypes
is set to false
:
ServerBuilderGenerator.kt
renders the usual builder type that enforces constraint traits, setting its visibility topub (crate)
, for exclusive use by the framework.ServerBuilderGeneratorWithoutPublicConstrainedTypes.kt
renders the builder type the user is exposed to: this builder does not take in constrained types and does not enforce all modeled constraints.
Collecting constraint violations
Background
Constrained operations are currently required to have
smithy.framework#ValidationException
as a member in their errors
property. This is the
shape that is rendered in responses when a request contains data that violates
the modeled constraints.
The shape is defined in the
smithy-validation-model
Maven package, as
follows:
$version: "2.0"
namespace smithy.framework
/// A standard error for input validation failures.
/// This should be thrown by services when a member of the input structure
/// falls outside of the modeled or documented constraints.
@error("client")
structure ValidationException {
/// A summary of the validation failure.
@required
message: String,
/// A list of specific failures encountered while validating the input.
/// A member can appear in this list more than once if it failed to satisfy multiple constraints.
fieldList: ValidationExceptionFieldList
}
/// Describes one specific validation failure for an input member.
structure ValidationExceptionField {
/// A JSONPointer expression to the structure member whose value failed to satisfy the modeled constraints.
@required
path: String,
/// A detailed description of the validation failure.
@required
message: String
}
list ValidationExceptionFieldList {
member: ValidationExceptionField
}
It was mentioned in the constraint traits
RFC, and
implicit in the definition of Smithy's
smithy.framework.ValidationException
shape, that server frameworks should respond with a complete collection of
errors encountered during constraint trait enforcement to the client.
Problem
As of writing, the TryFrom
constructor of constrained types whose shapes have
more than one constraint trait attached can only yield a single error. For
example, the following shape:
@pattern("[a-f0-5]*")
@length(min: 5, max: 10)
string LengthPatternString
Yields:
pub struct LengthPatternString(pub(crate) std::string::String);
impl LengthPatternString {
fn check_length(
string: &str,
) -> Result<(), crate::model::length_pattern_string::ConstraintViolation> {
let length = string.chars().count();
if (5..=10).contains(&length) {
Ok(())
} else {
Err(crate::model::length_pattern_string::ConstraintViolation::Length(length))
}
}
fn check_pattern(
string: String,
) -> Result<String, crate::model::length_pattern_string::ConstraintViolation> {
let regex = Self::compile_regex();
if regex.is_match(&string) {
Ok(string)
} else {
Err(crate::model::length_pattern_string::ConstraintViolation::Pattern(string))
}
}
pub fn compile_regex() -> &'static regex::Regex {
static REGEX: once_cell::sync::Lazy<regex::Regex> = once_cell::sync::Lazy::new(|| {
regex::Regex::new(r#"[a-f0-5]*"#).expect(r#"The regular expression [a-f0-5]* is not supported by the `regex` crate; feel free to file an issue under https://github.com/smithy-lang/smithy-rs/issues for support"#)
});
&REGEX
}
}
impl std::convert::TryFrom<std::string::String> for LengthPatternString {
type Error = crate::model::length_pattern_string::ConstraintViolation;
/// Constructs a `LengthPatternString` from an [`std::string::String`],
/// failing when the provided value does not satisfy the modeled constraints.
fn try_from(value: std::string::String) -> Result<Self, Self::Error> {
Self::check_length(&value)?;
let value = Self::check_pattern(value)?;
Ok(Self(value))
}
}
Observe how a failure to adhere to the @length
trait will short-circuit the
evaluation of the constructor, when the value could technically also not adhere
with the @pattern
trait.
Similarly, constrained structures fail upon encountering the first member that violates a constraint.
Additionally, in framework request deserialization code:
- collections whose members are constrained fail upon encountering the first member that violates the constraint,
- maps whose keys and/or values are constrained fail upon encountering the first violation; and
- structures whose members are constrained fail upon encountering the first member that violates the constraint,
In summary, any shape that is transitively constrained yields types whose constructors (both the internal one and the user-facing one) currently short-circuit upon encountering the first violation.
Solution proposal
The deserializing architecture lends itself to be easily refactored so that we
can collect constraint violations before returning them. Indeed, note that
deserializers enforce constraint traits in a two-step phase: first, the
entirety of the unconstrained value is deserialized, then constraint traits
are enforced by feeding the entire value to the TryFrom
constructor.
Let's consider a ConstraintViolations
type (note the plural) that represents
a collection of constraint violations that can occur within user application
code. Roughly:
pub struct ConstraintViolations<T>(pub(crate) Vec<T>);
impl<T> IntoIterator<Item = T> for ConstraintViolations<T> { ... }
impl std::convert::TryFrom<std::string::String> for LengthPatternString {
type Error = ConstraintViolations<crate::model::length_pattern_string::ConstraintViolation>;
fn try_from(value: std::string::String) -> Result<Self, Self::Error> {
// Check constraints and collect violations.
...
}
}
- The main reason for wrapping a vector in
ConstraintViolations
as opposed to directly returning the vector is forwards-compatibility: we may want to expandConstraintViolations
with conveniences. - If the constrained type can only ever yield a single violation, we will
dispense with
ConstraintViolations
and keep directly returning the crate::model::shape_name::ConstraintViolation
type.
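Under the proposed design, handling the error in application code could then look roughly like this (hypothetical usage; LengthPatternString is the constrained type from the example above):
let result = LengthPatternString::try_from("not valid".to_owned());
match result {
    Ok(_value) => { /* the string satisfied all modeled constraints */ }
    Err(violations) => {
        // Every violated constraint is reported, not just the first one.
        for violation in violations {
            eprintln!("constraint violated: {violation:?}");
        }
    }
}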
We will analogously introduce a ConstraintViolationExceptions
type that
represents a collection of constraint violations that can occur within the
framework's request deserialization code. This type will be pub(crate)
and
will be the one the framework will map to Smithy's ValidationException
that
eventually gets serialized into the response.
Collecting constraint violations may constitute a DOS attack vector
This is a problem that already exists as of writing, but that collecting constraint violations highlights, so it is a good opportunity, from a pedagogical perspective, to explain it here. Consider the following model:
@length(max: 3)
list ListOfPatternStrings {
member: PatternString
}
@pattern("expensive regex to evaluate")
string PatternString
Our implementation currently enforces constraints from the leaf to the root:
when enforcing the @length
constraint, the TryFrom
constructor the server
framework uses gets a Vec<String>
and first checks the members adhere to
the @pattern
trait, and only after is the @length
trait checked. This
means that if a client sends a request with n >>> 3
list members, the
expensive check runs n
times, when a constant-time check inspecting the
length of the input vector would have sufficed to reject the request.
Additionally, we may want to avoid serializing n
ValidationExceptionField
s
due to performance concerns.
- A possibility to circumvent this is making the @length validator special, having it bound the other validators via effectively permuting the order of the checks and thus short-circuiting.
  - In general, it's unclear what constraint traits should cause short-circuiting. A probably reasonable rule of thumb is to include traits that can be attached directly to aggregate shapes: as of writing, that would be @uniqueItems on list shapes and @length on list shapes.
- Another possibility is to do nothing and value complete validation exception response messages over trying to mitigate this with special handling. One could argue that these kinds of DoS attack vectors should be taken care of with a separate solution, e.g. a layer that bounds a request body's size to a reasonable default (see how Axum added this). We will provide a similar request body limiting mechanism regardless.
This RFC advocates for implementing the first option, arguing that it's fair to say that the framework should return an error that is as informative as possible, but it doesn't necessarily have to be complete. However, we will also write a layer, applied by default to all server SDKs, that bounds a request body's size to a reasonable (yet high) default. Relying on users to manually apply the layer is dangerous, since such a configuration is trivially exploitable. Users can always manually apply the layer again to their resulting service if they want to further restrict a request's body size.
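A standalone sketch of the intended ordering follows. It is not generated code: the function name is hypothetical, and error handling is simplified to a single message per failure instead of the collected violations proposed above.
// Validate the cheap `@length` bound before the expensive per-member
// `@pattern` check, mirroring the short-circuiting this RFC advocates.
fn validate_list(
    members: Vec<String>,
    max_len: usize,
    pattern: &regex::Regex,
) -> Result<Vec<String>, String> {
    if members.len() > max_len {
        // Constant-time rejection; no regex has been evaluated.
        return Err(format!(
            "expected at most {max_len} members, found {}",
            members.len()
        ));
    }
    // Only now run the expensive `@pattern` validation on each member.
    members
        .into_iter()
        .map(|member| {
            if pattern.is_match(&member) {
                Ok(member)
            } else {
                Err(format!("`{member}` does not match the modeled pattern"))
            }
        })
        .collect()
}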
"Tightness" of constraint violations
Problem
ConstraintViolationExceptions
is not
"tight" in that there's nothing in the type
system that indicates to the user, when writing the custom validation error
mapping function, that the iterator will not return a sequence of
ConstraintViolationException
s that is actually impossible to occur in
practice.
Recall that ConstraintViolationException
s are enum
s that model both direct
constraint violations as well as transitive ones. For example, given the model:
@length(min: 1, max: 69)
map LengthMap {
key: String,
value: LengthString
}
@length(min: 2, max: 69)
string LengthString
The corresponding ConstraintViolationException
Rust type for the LengthMap
shape is:
pub mod length_map {
pub enum ConstraintViolation {
Length(usize),
}
pub (crate) enum ConstraintViolationException {
Length(usize),
Value(
std::string::String,
crate::model::length_string::ConstraintViolationException,
),
}
}
ConstraintViolationExceptions
is just a container over this type:
pub struct ConstraintViolationExceptions<T>(pub(crate) Vec<T>);
impl<T> IntoIterator<Item = T> for ConstraintViolationExceptions<T> { ... }
There might be multiple map values that fail to adhere to the constraints in
LengthString
, which would make the iterator yield multiple
length_map::ConstraintViolationException::Value
s; however, at most one
length_map::ConstraintViolationException::Length
can be yielded in
practice. This might be obvious to the service owner when inspecting the model
and the Rust docs, but it's not expressed in the type system.
The above tightness problem has been formulated in terms of
ConstraintViolationExceptions
, because the fact that
ConstraintViolationExceptions
contain transitive constraint violations
highlights the tightness problem. Note, however, that the tightness problem
also afflicts ConstraintViolations
.
Indeed, consider the following model:
@pattern("[a-f0-5]*")
@length(min: 5, max: 10)
string LengthPatternString
This would yield:
pub struct ConstraintViolations<T>(pub(crate) Vec<T>);
impl<T> IntoIterator<Item = T> for ConstraintViolations<T> { ... }
pub mod length_pattern_string {
pub enum ConstraintViolation {
Length(usize),
Pattern(String)
}
}
impl std::convert::TryFrom<std::string::String> for LengthPatternString {
type Error = ConstraintViolations<crate::model::length_pattern_string::ConstraintViolation>;
fn try_from(value: std::string::String) -> Result<Self, Self::Error> {
// Check constraints and collect violations.
...
}
}
Observe how the iterator of an instance of
ConstraintViolations<crate::model::length_pattern_string::ConstraintViolation>
,
may, a priori, yield e.g. the
length_pattern_string::ConstraintViolation::Length
variant twice, when it's
clear that the iterator should contain at most one of each of
length_pattern_string::ConstraintViolation
's variants.
Final solution proposal
We propose a tighter API design.
- We substitute
enum
s forstruct
s whose members are allOption
al, representing all the constraint violations that can occur. - For list shapes and map shapes:
- we implement
IntoIterator
on an additionalstruct
Members
representing only the violations that can occur on the collection's members. - we add a non
Option
-al field to thestruct
representing the constraint violations of typeMembers
.
- we implement
Let's walk through an example. Take the last model:
@pattern("[a-f0-5]*")
@length(min: 5, max: 10)
string LengthPatternString
This would yield, as per the first substitution:
pub mod length_pattern_string {
pub struct ConstraintViolations {
pub length: Option<constraint_violation::Length>,
pub pattern: Option<constraint_violation::Pattern>,
}
pub mod constraint_violation {
pub struct Length(usize);
pub struct Pattern(String);
}
}
impl std::convert::TryFrom<std::string::String> for LengthPatternString {
type Error = length_pattern_string::ConstraintViolations;
// The error type returned by this constructor, `ConstraintViolations`,
// will always have _at least_ one member set.
fn try_from(value: std::string::String) -> Result<Self, Self::Error> {
// Check constraints and collect violations.
...
}
}
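With this shape, user code can check individual fields instead of iterating; a hypothetical usage sketch under the proposed design:
let res = LengthPatternString::try_from("not valid".to_owned());
if let Err(violations) = res {
    // `violations` is `length_pattern_string::ConstraintViolations`; at least
    // one of its fields is guaranteed to be `Some`.
    if violations.length.is_some() {
        eprintln!("value violates the modeled `@length` constraint");
    }
    if violations.pattern.is_some() {
        eprintln!("value violates the modeled `@pattern` constraint");
    }
}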
We now expand the model to highlight the second step of the algorithm:
@length(min: 1, max: 69)
map LengthMap {
key: String,
value: LengthString
}
This gives us:
pub mod length_map {
pub struct ConstraintViolations {
pub length: Option<constraint_violation::Length>,
// Would be `Option<T>` in the case of an aggregate shape that is _not_ a
// list shape or a map shape.
pub member_violations: constraint_violation::Members,
}
pub mod constraint_violation {
// Note that this could now live outside the `length_map` module and be
// reused across all `@length`-constrained shapes, if we expanded it with
// another `usize` indicating the _modeled_ value in the `@length` trait; by
// keeping it inside `length_map` we can hardcode that value in the
// implementation of e.g. error messages.
pub struct Length(usize);
pub struct Members(pub(crate) Vec<Member>);
pub struct Member {
// If the map's key shape were constrained, we'd have a `key`
// field here too.
value: Option<Value>
}
pub struct Value(
std::string::String,
crate::model::length_string::ConstraintViolation,
);
impl IntoIterator<Item = Member> for Members { ... }
}
}
The above examples have featured the tight API design with
ConstraintViolation
s. Of course, we will apply the same design in the case of
ConstraintViolationException
s. For the sake of completeness, let's expand our
model yet again with a structure shape:
structure A {
@required
member: String,
@required
length_map: LengthMap,
}
And this time let's feature both the resulting
ConstraintViolationExceptions
and ConstraintViolations
types:
pub mod a {
pub struct ConstraintViolationExceptions {
// All fields must be `Option`, despite the members being `@required`,
// since violations for their values might not have occurred.
pub missing_member_exception: Option<constraint_violation_exception::MissingMember>,
pub missing_length_map_exception: Option<constraint_violation_exception::MissingLengthMap>,
pub length_map_exceptions: Option<crate::model::length_map::ConstraintViolationExceptions>,
}
pub mod constraint_violation_exception {
pub struct MissingMember;
pub struct MissingLengthMap;
}
pub struct ConstraintViolations {
pub missing_member: Option<constraint_violation::MissingMember>,
pub missing_length_map: Option<constraint_violation::MissingLengthMap>,
}
pub mod constraint_violation {
pub struct MissingMember;
pub struct MissingLengthMap;
}
}
As can be intuited, the only differences are that:
- ConstraintViolationExceptions hold transitive violations while ConstraintViolations only need to expose direct violations (as explained in the Impossible constraint violations section); and
- ConstraintViolationExceptions have members suffixed with _exception, as is the module name.
Note that while the constraint violation (exception) type names are plural, the module names are always singular.
We also make a conscious decision of, in this case of structure shapes, making
the types of all members Option
s, for simplicity. Another choice would have
been to make length_map_exceptions
not Option
-al, and, in the case where no
violations in LengthMap
values occurred, set
length_map::ConstraintViolations::length
to None
and
length_map::ConstraintViolations::member_violations
eventually reach an empty
iterator. However, it's best that we use the expressiveness of Option
s at the
earliest ("highest" in the shape hierarchy) opportunity: if a member is Some
,
it means it (eventually) reaches data.
Checklist
Unfortunately, while this RFC could be implemented iteratively (i.e. solve each of the problems in turn), it would introduce too much churn and throwaway work: solving the tightness problem requires a more or less complete overhaul of the constraint violations code generator. It's best that all three problems be solved in the same changeset.
- Generate ConstraintViolations and ConstraintViolationExceptions types so as to not reify impossible constraint violations, add the ability to collect constraint violations, and solve the "tightness" problem of constraint violations.
- Special-case generated request deserialization code for operations using @length and @uniqueItems constrained shapes whose closures reach other constrained shapes, so that the validators for these two traits short-circuit upon encountering a number of inner constraint violations above a certain threshold.
- Write and expose a layer, applied by default to all generated server SDKs, that bounds a request body's size to a reasonable (yet high) default, to prevent trivial DoS attacks.
RFC: Improving access to request IDs in SDK clients
Status: Implemented in #2129
Applies to: AWS SDK clients
At time of writing, customers can retrieve a request ID in one of four ways in the Rust SDK:
- For error cases where the response parsed successfully, the request ID can be retrieved via accessor method on operation error. This also works for unmodeled errors so long as the response parsing succeeds.
- For error cases where a response was received but parsing fails, the response headers can be retrieved from the raw response on the error, but customers have to manually extract the request ID from those headers (there's no convenient accessor method).
- For all error cases where the request ID header was sent in the response, customers can call SdkError::into_service_error to transform the SdkError into an operation error, which has a request_id accessor on it.
- For success cases, the customer can't retrieve the request ID at all if they use the fluent client. Instead, they must manually make the operation and call the underlying Smithy client so that they have access to SdkSuccess, which provides the raw response where the request ID can be manually extracted from headers.
Only one of these mechanisms is convenient and ergonomic. The rest need considerable improvements. Additionally, the request ID should be attached to tracing events where possible so that enabling debug logging reveals the request IDs without any code changes being necessary.
This RFC proposes changes to make the request ID easier to access.
Terminology
- Request ID: A unique identifier assigned to and associated with a request to AWS that is sent back in the response headers. This identifier is useful to customers when requesting support.
- Operation Error: Operation errors are code generated for each operation in a Smithy model. They are an enum of every possible modeled error that that operation can respond with, as well as an Unhandled variant for any unmodeled or unrecognized errors.
- Modeled Errors: Any error that is represented in a Smithy model with the @error trait.
- Unmodeled Errors: Errors that a service responds with that do not appear in the Smithy model.
- SDK Clients: Clients generated for the AWS SDK, including "adhoc" or "one-off" clients.
- Smithy Clients: Any clients not generated for the AWS SDK, excluding "adhoc" or "one-off" clients.
SDK/Smithy Purity
Before proposing any changes, the topic of purity needs to be covered. Request IDs are not currently a Smithy concept. However, at time of writing, the request ID concept is leaked into the non-SDK Rust runtime crates and generated code via the generic error struct and the request_id functions on generated operation errors (e.g., the GetObjectError example in S3).
This RFC attempts to remove these leaks from Smithy clients.
Proposed Changes
First, we'll explore making it easier to retrieve a request ID from errors, and then look at making it possible to retrieve them from successful responses. To see the customer experience of these changes, see the Example Interactions section below.
Make request ID retrieval on errors consistent
One could argue that customers being able to convert an SdkError into an operation error that has a request ID on it is sufficient. However, there's no way to write a function that takes an error from any operation and logs a request ID, so it's still not ideal.
The aws-http crate needs to have a RequestId trait on it to facilitate generic request ID retrieval:
pub trait RequestId {
    /// Returns the request ID if it's available.
    fn request_id(&self) -> Option<&str>;
}
This trait will be implemented for SdkError in aws-http where it is declared, complete with logic to pull the request ID header out of the raw HTTP responses (it will always return None for event stream Message responses; an additional trait may need to be added to aws-smithy-http to facilitate access to the headers). This logic will try different request ID header names in order of probability since AWS services have a couple of header name variations: x-amzn-requestid is the most common, with x-amzn-request-id being the second most common.
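As a rough illustration of that lookup (not the final implementation; the helper name is hypothetical), the shared logic could look like this:

use http::HeaderMap;

// Hypothetical helper: try the known request ID header names in order of
// probability and return the first value that is present and valid UTF-8.
fn extract_request_id(headers: &HeaderMap) -> Option<&str> {
    ["x-amzn-requestid", "x-amzn-request-id"]
        .iter()
        .filter_map(|name| headers.get(*name))
        .find_map(|value| value.to_str().ok())
}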
aws-http will also implement RequestId for aws_smithy_types::error::Error, and the request_id method will be removed from aws_smithy_types::error::Error. Places that construct Error will place the request ID into its extras field, where the RequestId trait implementation can retrieve it.
A codegen decorator will be added to sdk-codegen to implement RequestId for operation errors, and the existing request_id accessors will be removed from CombinedErrorGenerator in codegen-core.
With these changes, customers can directly access request IDs from SdkError and operation errors by importing the RequestId trait. Additionally, the Smithy/SDK purity is improved since both places where request IDs are leaked to Smithy clients will be resolved.
Implement RequestId for outputs
To make it possible to retrieve request IDs when using the fluent client, the new RequestId trait can be implemented for outputs.
Some services (e.g., Transcribe Streaming) model the request ID header in their outputs, while other services (e.g., Directory Service) model a request ID field on errors. In some cases, services take RequestId as a modeled input (e.g., IoT Event Data). It follows that it is possible, but unlikely, that a service could have a field named RequestId that is not the same concept in the future.
Thus, name collisions are going to be a concern for putting a request ID accessor on outputs. However, if it is implemented as a trait, then this concern is partially resolved. In the vast majority of cases, importing RequestId will provide the accessor without any confusion. In cases where it is already modeled and is the same concept, customers will likely just use it and not even realize they didn't import the trait. The only concern is future cases where it is modeled as a separate concept, and as long as customers don't import RequestId for something else in the same file, that confusion can be avoided.
In order to implement RequestId for outputs, either the original response needs to be stored on the output, or the request ID needs to be extracted earlier and stored on the output. The latter will lead to a small amount of header lookup code duplication.
In either case, the StructureGenerator needs to be customized in sdk-codegen (Appendix B outlines an alternative approach to this and why it was dismissed). This will be done by adding customization hooks to StructureGenerator similar to the ones for ServiceConfigGenerator so that a sdk-codegen decorator can conditionally add fields and functions to any generated structs. A hook will also be needed to add additional trait impl blocks.
Once the hooks are in place, a decorator will be added to store either the original response or the request ID on outputs, and then the RequestId trait will be implemented for them. The ParseResponse trait implementation will be customized to populate this new field.
Note: To avoid name collisions of the request ID or response on the output struct, these fields can be prefixed with an underscore. It shouldn't be possible for SDK fields to code generate with this prefix given the model validation rules in place.
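For illustration only, a generated output that stores the request ID up front might look roughly like this; the struct and field names are hypothetical, and the trait path assumes the trait lands in aws-http as proposed:

use aws_http::request_id::RequestId;

pub struct SomeOperationOutput {
    pub name: Option<String>,
    // Underscore-prefixed so it cannot collide with modeled members.
    pub(crate) _request_id: Option<String>,
}

impl RequestId for SomeOperationOutput {
    fn request_id(&self) -> Option<&str> {
        self._request_id.as_deref()
    }
}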
Implement RequestId for Operation and operation::Response
In the case that a customer wants to ditch the fluent client, it should still be easy to retrieve a request ID. To do this, aws-http will provide RequestId implementations for Operation and operation::Response. These implementations will likely make the other RequestId implementations easier to implement as well.
Implement RequestId for Result
The Result returned by the SDK should directly implement RequestId when both its Ok and Err variants implement RequestId. This will make it possible for a customer to feed the return value from send() directly to a request ID logger.
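A minimal sketch of that blanket implementation, assuming the trait shape proposed above (the module path is assumed):

// Inside aws-http, where the RequestId trait is defined:
use crate::request_id::RequestId;

impl<O, E> RequestId for Result<O, E>
where
    O: RequestId,
    E: RequestId,
{
    fn request_id(&self) -> Option<&str> {
        match self {
            Ok(ok) => ok.request_id(),
            Err(err) => err.request_id(),
        }
    }
}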
Example Interactions
Generic Handling Case
// A re-export of the RequestId trait
use aws_sdk_service::primitives::RequestId;
fn my_request_id_logging_fn(request_id: &dyn RequestId) {
println!("request ID: {:?}", request_id.request_id());
}
let result = client.some_operation().send().await?;
my_request_id_logging_fn(&result);
Success Case
use aws_sdk_service::primitives::RequestId;
let output = client.some_operation().send().await?;
println!("request ID: {:?}", output.request_id());
Error Case with SdkError
use aws_sdk_service::primitives::RequestId;
match client.some_operation().send().await {
Ok(_) => { /* handle OK */ }
Err(err) => {
println!("request ID: {:?}", output.request_id());
}
}
Error Case with operation error
use aws_sdk_service::primitives::RequestId;
match client.some_operation().send().await {
Ok(_) => { /* handle OK */ }
Err(err) => match err.into_service_error() {
err @ SomeOperationError::SomeError(_) => { println!("request ID: {:?}", err.request_id()); }
_ => { /* don't care */ }
}
}
Changes Checklist
- Create the RequestId trait in aws-http
- Implement for errors
  - Implement RequestId for SdkError in aws-http
  - Remove request_id from aws_smithy_types::error::Error, and store request IDs in its extras instead
  - Implement RequestId for aws_smithy_types::error::Error in aws-http
  - Remove generation of request_id accessors from CombinedErrorGenerator in codegen-core
  - Implement RequestId for operation errors in sdk-codegen
- Implement for outputs
  - Add customization hooks to StructureGenerator
  - Add customization hook to ParseResponse
  - Add customization hook to HttpBoundProtocolGenerator
  - Customize output structure code gen in sdk-codegen to add either a request ID or a response field
  - Customize ParseResponse in sdk-codegen to populate the outputs
- Implement RequestId for Operation and operation::Response
- Implement RequestId for Result<O, E> where O and E both implement RequestId
- Re-export RequestId in generated crates
- Add integration tests for each request ID access point
Appendix A: Alternate solution for access on successful responses
Alternatively, for successful responses, a second send method (that is difficult to name) would be added to the fluent client that has a return value that includes both the output and the request ID (or entire response).
This solution was dismissed due to difficulty naming, and the risk of name collision.
Appendix B: Adding RequestId as a string to outputs via model transform
The request ID could be stored on outputs by doing a model transform in sdk-codegen to add a RequestId member field. However, this causes problems when an output already has a RequestId field, and requires the addition of a synthetic trait to skip binding the field in the generated serializers/deserializers.
Smithy Orchestrator
Status: Implemented
Applies to: The smithy client
This RFC proposes a new process for constructing client requests and handling service responses. This new process is intended to:
- Improve the user experience by
- Simplifying several aspects of sending a request
- Adding more extension points to the request/response lifecycle
- Improve the maintainer experience by
- Making our SDK more similar in structure to other AWS SDKs
- Simplifying many aspects of the request/response lifecycle
- Making room for future changes
Additionally, functionality that the SDKs currently provide like retries, logging, and auth with be incorporated into this new process in such a way as to make it more configurable and understandable.
This RFC references but is not the source of truth on:
- Interceptors: To be described in depth in a future RFC.
- Runtime Plugins: To be described in depth in a future RFC.
TLDR;
When a smithy client communicates with a smithy service, messages are handled by an "orchestrator." The orchestrator runs in two main phases:
- Constructing configuration.
- This process is user-configurable with "runtime plugins."
- Configuration is stored in a typemap.
- Transforming a client request into a server response.
- This process is user-configurable with "interceptors."
- Interceptors are functions that are run by "hooks" in the request/response lifecycle.
Terminology
- SDK Client: A high-level abstraction allowing users to make requests to remote services.
- Remote Service: A remote API that a user wants to use. Communication with a remote service usually happens over HTTP. The remote service is usually, but not necessarily, an AWS service.
- Operation: A high-level abstraction representing an interaction between an SDK Client and a remote service.
- Input Message: A modeled request passed into an SDK client. For example, S3's ListObjectsRequest.
- Transport Request Message: A message that can be transmitted to a remote service. For example, an HTTP request.
- Transport Response Message: A message that can be received from a remote service. For example, an HTTP response.
- Output Message: A modeled response or exception returned to an SDK client caller. For example, S3's ListObjectsResponse or NoSuchBucketException.
- Orchestrator: The code within an SDK client that handles the process of making requests and receiving responses from remote services. The orchestrator is configurable by modifying the runtime plugins it's built from. The orchestrator is responsible for calling interceptors at the appropriate times in the request/response lifecycle.
- Interceptor/Hook: A generic extension point within the orchestrator. Supports "anything that someone should be able to do", NOT "anything anyone might want to do". These hooks are:
- Either read-only or read/write.
- Able to read and modify the Input, Transport Request, Transport Response, or Output messages.
- Runtime Plugin: Runtime plugins are similar to interceptors, but they act on configuration instead of requests and responses. Both users and services may define runtime plugins. Smithy also defines several default runtime plugins used by most clients. See the F.A.Q. for a list of plugins with descriptions.
- ConfigBag: A typemap that's equivalent to http::Extensions. Used to store configuration for the orchestrator.
The user experience if this RFC is implemented
For many users, the changes described by this RFC will be invisible. Making a request with an orchestrator-based SDK client looks very similar to the way requests were made pre-RFC:
let sdk_config = aws_config::load_from_env().await;
let client = aws_sdk_s3::Client::new(&sdk_config);
let res = client.get_object()
.bucket("a-bucket")
.key("a-file.txt")
.send()
.await;
match res {
Ok(res) => println!("success: {:?}", res),
Err(err) => eprintln!("failure: {:?}", err),
};
Users may further configure clients and operations with runtime plugins, and they can modify requests and responses with interceptors. We'll examine each of these concepts in the following sections.
Service clients and operations are configured with runtime plugins
The exact implementation of runtime plugins is left for another RFC. That other RFC will be linked here once it's written. To get an idea of what they may look like, see the "Layered configuration, stored in type maps" section of this RFC.
Runtime plugins construct and modify client configuration. Plugin initialization is the first step of sending a request, and plugins set in later steps can override the actions of earlier plugins. Plugin ordering is deterministic and non-customizable.
While AWS services define a default set of plugins, users may define their own plugins, and set them by calling the appropriate methods on a service's config, client, or operation. Plugins are specifically meant for constructing service and operation configuration. If a user wants to define behavior that should occur at specific points in the request/response lifecycle, then they should instead consider defining an interceptor.
Requests and responses are modified by interceptors
Interceptors are similar to middlewares, in that they are functions that can read and modify request and response state. However, they are more restrictive than middlewares in that they can't modify the "control flow" of the request/response lifecycle. This is intentional. Interceptors can be registered on a client or operation, and the orchestrator is responsible for calling interceptors at the appropriate time. Users MUST NOT perform blocking IO within an interceptor. Interceptors are sync, and are not intended to perform large amounts of work. This makes them easier to reason about and use. Depending on when they are called, interceptors may read and modify input messages, transport request messages, transport response messages, and output messages. Additionally, all interceptors may write to a context object that is shared between all interceptors.
Currently supported hooks
- Read Before Execution (Read-Only): Before anything happens. This is the first thing the SDK calls during operation execution.
- Modify Before Serialization (Read/Write): Before the input message given by the customer is marshalled into a transport request message. Allows modifying the input message.
- Read Before Serialization (Read-Only): The last thing the SDK calls before marshaling the input message into a transport message.
- Read After Serialization (Read-Only): The first thing the SDK calls after marshaling the input message into a transport message.
- (Retry Loop)
- Modify Before Retry Loop (Read/Write): The last thing the SDK calls before entering the retry loop. Allows modifying the transport message.
- Read Before Attempt (Read-Only): The first thing the SDK calls “inside” of the retry loop.
- Modify Before Signing (Read/Write): Before the transport request message is signed. Allows modifying the transport message.
- Read Before Signing (Read-Only): The last thing the SDK calls before signing the transport request message.
- Read After Signing (Read-Only): The first thing the SDK calls after signing the transport request message.
- Modify Before Transmit (Read/Write): Before the transport request message is sent to the service. Allows modifying the transport message.
- Read Before Transmit (Read-Only): The last thing the SDK calls before sending the transport request message.
- Read After Transmit (Read-Only): The last thing the SDK calls after receiving the transport response message.
- Modify Before Deserialization (Read/Write): Before the transport response message is unmarshaled. Allows modifying the transport response message.
- Read Before Deserialization (Read-Only): The last thing the SDK calls before unmarshalling the transport response message into an output message.
- Read After Deserialization (Read-Only): The last thing the SDK calls after unmarshaling the transport response message into an output message.
- Modify Before Attempt Completion (Read/Write): Before the retry loop ends. Allows modifying the unmarshaled response (output message or error).
- Read After Attempt (Read-Only): The last thing the SDK calls “inside” of the retry loop.
- Modify Before Execution Completion (Read/Write): Before the execution ends. Allows modifying the unmarshaled response (output message or error).
- Read After Execution (Read-Only): After everything has happened. This is the last thing the SDK calls during operation execution.
Interceptor context
As mentioned above, interceptors may read/write a context object that is shared between all interceptors:
pub struct InterceptorContext<ModReq, TxReq, TxRes, ModRes> {
// a.k.a. the input message
modeled_request: ModReq,
// a.k.a. the transport request message
tx_request: Option<TxReq>,
// a.k.a. the output message
modeled_response: Option<ModRes>,
// a.k.a. the transport response message
tx_response: Option<TxRes>,
// A type-keyed map
properties: SharedPropertyBag,
}
The optional request and response types in the interceptor context can only be accessed by interceptors that are run after specific points in the request/response lifecycle. Rather than go into depth in this RFC, I leave that to a future "Interceptors RFC."
How to implement this RFC
Integrating with the orchestrator
Imagine we have some sort of request signer. This signer doesn't refer to any orchestrator types. All it needs is a HeaderMap along with two strings, and will return a signature in string form.
struct Signer;
impl Signer {
fn sign(headers: &http::HeaderMap, signing_name: &str, signing_region: &str) -> String {
todo!()
}
}
Now imagine things from the orchestrator's point of view. It requires something that implements an AuthOrchestrator, which will be responsible for resolving the correct auth scheme, identity, and signer for an operation, as well as signing the request:
pub trait AuthOrchestrator<Req>: Send + Sync + Debug {
fn auth_request(&self, req: &mut Req, cfg: &ConfigBag) -> Result<(), BoxError>;
}
// And it calls that `AuthOrchestrator` like so:
fn invoke() {
// code omitted for brevity
// Get the request to be signed
let tx_req_mut = ctx.tx_request_mut().expect("tx_request has been set");
// Fetch the auth orchestrator from the bag
let auth_orchestrator = cfg
.get::<Box<dyn AuthOrchestrator<Req>>>()
.ok_or("missing auth orchestrator")?;
// Auth the request
auth_orchestrator.auth_request(tx_req_mut, cfg)?;
// code omitted for brevity
}
The specific implementation of the AuthOrchestrator is what brings these two things together:
struct Sigv4AuthOrchestrator;
impl AuthOrchestrator<http::Request<SdkBody>> for Sigv4AuthOrchestrator {
fn auth_request(&self, req: &mut http::Request<SdkBody>, cfg: &ConfigBag) -> Result<(), BoxError> {
let signer = Signer;
let signing_name = cfg.get::<SigningName>().ok_or(Error::MissingSigningName)?;
let signing_region = cfg.get::<SigningRegion>().ok_or(Error::MissingSigningRegion)?;
let headers = req.headers_mut();
let signature = signer.sign(headers, signing_name, signing_region);
match cfg.get::<SignatureLocation>() {
Some(SignatureLocation::Query) => req.query.set("sig", signature),
Some(SignatureLocation::Header) => req.headers_mut().insert("sig", signature),
None => return Err(Error::MissingSignatureLocation),
};
Ok(())
}
}
This intermediate code should contain as little logic as possible. Whenever possible, we must maintain this encapsulation. Doing so will make the orchestrator more flexible, maintainable, and understandable.
Layered configuration, stored in type maps
Type map: A data structure where stored values are keyed by their type. Hence, only one value can be stored for a given type.
See typemap, type-map, http::Extensions, and actix_http::Extensions for examples.
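As a minimal illustration of the idea (not the actual ConfigBag API), a type map can be built on std::any:

use std::any::{Any, TypeId};
use std::collections::HashMap;

// A minimal type map: at most one value is stored per type.
#[derive(Default)]
struct TypeMap {
    values: HashMap<TypeId, Box<dyn Any>>,
}

impl TypeMap {
    fn insert<T: Any>(&mut self, value: T) {
        self.values.insert(TypeId::of::<T>(), Box::new(value));
    }

    fn get<T: Any>(&self) -> Option<&T> {
        self.values
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

With configuration stored this way, a ConfigBag can be assembled and layered as in the following example: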
let conf: ConfigBag = aws_config::from_env()
// Configuration can be common to all smithy clients
.with(RetryConfig::builder().disable_retries().build())
// Or, protocol-specific
.with(HttpClient::builder().build())
// Or, AWS-specific
.with(Region::from("us-east-1"))
// Or, service-specific
.with(S3Config::builder().force_path_style(false).build())
.await;
let client = aws_sdk_s3::Client::new(&conf);
client.list_buckets()
.customize()
// Configuration can be set on operations as well as clients
.with(HttpConfig::builder().conn(some_other_conn).build())
.send()
.await;
Setting configuration that will not be used wastes memory and can make debugging more difficult. Therefore, configuration defaults are only set when they're relevant. For example, if a smithy service doesn't support HTTP, then no HTTP client will be set.
What is "layered" configuration?
Configuration has precedence. Configuration set on an operation will override configuration set on a client, and configuration set on a client will override default configuration. However, configuration with a higher precedence can also augment configuration with a lower precedence. For example:
let conf: ConfigBag = aws_config::from_env()
.with(
SomeConfig::builder()
.option_a(1)
.option_b(2)
.option_c(3)
.build()
)
.await;
let client = aws_sdk_s3::Client::new(&conf);
client.list_buckets()
.customize()
.with(
SomeConfig::builder()
.option_a(0)
.option_b(Value::Inherit)
.option_c(Value::Unset)
.build()
)
.send()
.await;
In the above example, when the option_a, option_b, and option_c values of SomeConfig are accessed, they'll return:
- option_a: 0
- option_b: 2
- option_c: No value
Config values are wrapped in a special enum called Value with three variants:
- Value::Set: A set value that will override values from lower layers.
- Value::Unset: An explicitly unset value that will override values from lower layers.
- Value::Inherit: An explicitly unset value that will inherit a value from a lower layer.
Builders are defined like this:
struct SomeBuilder<T> {
    value: Value<T>,
}
impl<T> SomeBuilder<T> {
    fn new() -> Self {
        // By default, config values inherit from lower-layer configs
        Self { value: Value::Inherit }
    }
    fn some_field(&mut self, value: impl Into<Value<T>>) -> &mut Self {
        self.value = value.into();
        self
    }
}
Because of impl Into<Value<T>>, users don't need to reference the Value enum unless they want to "unset" a value.
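A sketch of how that conversion could work, assuming the Value enum described above:

enum Value<T> {
    Set(T),
    Unset,
    Inherit,
}

// Plain values convert into `Value::Set`, so `builder.some_field(7)` works
// without the caller ever naming `Value`.
impl<T> From<T> for Value<T> {
    fn from(value: T) -> Self {
        Value::Set(value)
    }
}

Explicitly unsetting or inheriting still requires naming the enum, which keeps those less common cases deliberate.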
Layer separation and precedence
Codegen defines default sets of interceptors and runtime plugins at various "levels":
- AWS-wide defaults set by codegen.
- Service-wide defaults set by codegen.
- Operation-specific defaults set by codegen.
Likewise, users may mount their own interceptors and runtime plugins:
- The AWS config level, e.g. aws_types::Config.
- The service config level, e.g. aws_sdk_s3::Config.
- The operation config level, e.g. aws_sdk_s3::Client::get_object.
Configuration is resolved in a fixed manner by reading the "lowest level" of config available, falling back to "higher levels" only when no value has been set. Therefore, at least 3 separate ConfigBags are necessary, and user configuration has precedence over codegen-defined default configuration. With that in mind, resolution of configuration would look like this (a sketch of the lookup follows the list):
- Check user-set operation config.
- Check codegen-defined operation config.
- Check user-set service config.
- Check codegen-defined service config.
- Check user-set AWS config.
- Check codegen-defined AWS config.
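The following sketch shows that lookup order, assuming a ConfigBag that exposes the get::<T>() accessor used throughout this RFC (the layer names are illustrative):

// Illustrative only: resolve a config value by checking the most specific
// layer first and falling back to broader layers. `ConfigBag` is the type
// described in this RFC, assumed to expose `get::<T>() -> Option<&T>`.
fn resolve<'a, T: 'static>(
    user_operation: &'a ConfigBag,
    codegen_operation: &'a ConfigBag,
    user_service: &'a ConfigBag,
    codegen_service: &'a ConfigBag,
    user_aws: &'a ConfigBag,
    codegen_aws: &'a ConfigBag,
) -> Option<&'a T> {
    user_operation
        .get::<T>()
        .or_else(|| codegen_operation.get::<T>())
        .or_else(|| user_service.get::<T>())
        .or_else(|| codegen_service.get::<T>())
        .or_else(|| user_aws.get::<T>())
        .or_else(|| codegen_aws.get::<T>())
}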
The aws-smithy-orchestrator crate
I've omitted some of the error conversion to shorten this example and make it easier to understand. The real version will be messier.
/// `In`: The input message e.g. `ListObjectsRequest`
/// `Req`: The transport request message e.g. `http::Request<SmithyBody>`
/// `Res`: The transport response message e.g. `http::Response<SmithyBody>`
/// `Out`: The output message. A `Result` containing either:
/// - The 'success' output message e.g. `ListObjectsResponse`
/// - The 'failure' output message e.g. `NoSuchBucketException`
pub async fn invoke<In, Req, Res, T>(
input: In,
interceptors: &mut Interceptors<In, Req, Res, Result<T, BoxError>>,
runtime_plugins: &RuntimePlugins,
cfg: &mut ConfigBag,
) -> Result<T, BoxError>
where
// The input must be Clone in case of retries
In: Clone + 'static,
Req: 'static,
Res: 'static,
T: 'static,
{
let mut ctx: InterceptorContext<In, Req, Res, Result<T, BoxError>> =
InterceptorContext::new(input);
runtime_plugins.apply_client_configuration(cfg)?;
interceptors.client_read_before_execution(&ctx, cfg)?;
runtime_plugins.apply_operation_configuration(cfg)?;
interceptors.operation_read_before_execution(&ctx, cfg)?;
interceptors.read_before_serialization(&ctx, cfg)?;
interceptors.modify_before_serialization(&mut ctx, cfg)?;
let request_serializer = cfg
.get::<Box<dyn RequestSerializer<In, Req>>>()
.ok_or("missing serializer")?;
let req = request_serializer.serialize_request(ctx.modeled_request_mut(), cfg)?;
ctx.set_tx_request(req);
interceptors.read_after_serialization(&ctx, cfg)?;
interceptors.modify_before_retry_loop(&mut ctx, cfg)?;
loop {
make_an_attempt(&mut ctx, cfg, interceptors).await?;
interceptors.read_after_attempt(&ctx, cfg)?;
interceptors.modify_before_attempt_completion(&mut ctx, cfg)?;
let retry_strategy = cfg
.get::<Box<dyn RetryStrategy<Result<T, BoxError>>>>()
.ok_or("missing retry strategy")?;
let mod_res = ctx
.modeled_response()
.expect("it's set during 'make_an_attempt'");
if retry_strategy.should_retry(mod_res, cfg)? {
continue;
}
interceptors.modify_before_completion(&mut ctx, cfg)?;
let trace_probe = cfg
.get::<Box<dyn TraceProbe>>()
.ok_or("missing trace probes")?;
trace_probe.dispatch_events(cfg);
interceptors.read_after_execution(&ctx, cfg)?;
break;
}
let (modeled_response, _) = ctx.into_responses()?;
modeled_response
}
// Making an HTTP request can fail for several reasons, but we still need to
// call lifecycle events when that happens. Therefore, we define this
// `make_an_attempt` function to make error handling simpler.
async fn make_an_attempt<In, Req, Res, T>(
ctx: &mut InterceptorContext<In, Req, Res, Result<T, BoxError>>,
cfg: &mut ConfigBag,
interceptors: &mut Interceptors<In, Req, Res, Result<T, BoxError>>,
) -> Result<(), BoxError>
where
In: Clone + 'static,
Req: 'static,
Res: 'static,
T: 'static,
{
interceptors.read_before_attempt(ctx, cfg)?;
let tx_req_mut = ctx.tx_request_mut().expect("tx_request has been set");
let endpoint_orchestrator = cfg
.get::<Box<dyn EndpointOrchestrator<Req>>>()
.ok_or("missing endpoint orchestrator")?;
endpoint_orchestrator.resolve_and_apply_endpoint(tx_req_mut, cfg)?;
interceptors.modify_before_signing(ctx, cfg)?;
interceptors.read_before_signing(ctx, cfg)?;
let tx_req_mut = ctx.tx_request_mut().expect("tx_request has been set");
let auth_orchestrator = cfg
.get::<Box<dyn AuthOrchestrator<Req>>>()
.ok_or("missing auth orchestrator")?;
auth_orchestrator.auth_request(tx_req_mut, cfg)?;
interceptors.read_after_signing(ctx, cfg)?;
interceptors.modify_before_transmit(ctx, cfg)?;
interceptors.read_before_transmit(ctx, cfg)?;
// The connection consumes the request but we need to keep a copy of it
// within the interceptor context, so we clone it here.
let res = {
let tx_req = ctx.tx_request_mut().expect("tx_request has been set");
let connection = cfg
.get::<Box<dyn Connection<Req, Res>>>()
.ok_or("missing connector")?;
connection.call(tx_req, cfg).await?
};
ctx.set_tx_response(res);
interceptors.read_after_transmit(ctx, cfg)?;
interceptors.modify_before_deserialization(ctx, cfg)?;
interceptors.read_before_deserialization(ctx, cfg)?;
let tx_res = ctx.tx_response_mut().expect("tx_response has been set");
let response_deserializer = cfg
.get::<Box<dyn ResponseDeserializer<Res, Result<T, BoxError>>>>()
.ok_or("missing response deserializer")?;
let res = response_deserializer.deserialize_response(tx_res, cfg)?;
ctx.set_modeled_response(res);
interceptors.read_after_deserialization(ctx, cfg)?;
Ok(())
}
Traits
At various points in the execution of invoke
, trait objects are fetched from the ConfigBag
. These are preliminary definitions of those traits:
pub trait TraceProbe: Send + Sync + Debug {
fn dispatch_events(&self, cfg: &ConfigBag) -> BoxFallibleFut<()>;
}
pub trait RequestSerializer<In, TxReq>: Send + Sync + Debug {
fn serialize_request(&self, req: &mut In, cfg: &ConfigBag) -> Result<TxReq, BoxError>;
}
pub trait ResponseDeserializer<TxRes, Out>: Send + Sync + Debug {
fn deserialize_response(&self, res: &mut TxRes, cfg: &ConfigBag) -> Result<Out, BoxError>;
}
pub trait Connection<TxReq, TxRes>: Send + Sync + Debug {
fn call(&self, req: &mut TxReq, cfg: &ConfigBag) -> BoxFallibleFut<TxRes>;
}
pub trait RetryStrategy<Out>: Send + Sync + Debug {
fn should_retry(&self, res: &Out, cfg: &ConfigBag) -> Result<bool, BoxError>;
}
pub trait AuthOrchestrator<Req>: Send + Sync + Debug {
fn auth_request(&self, req: &mut Req, cfg: &ConfigBag) -> Result<(), BoxError>;
}
pub trait EndpointOrchestrator<Req>: Send + Sync + Debug {
fn resolve_and_apply_endpoint(&self, req: &mut Req, cfg: &ConfigBag) -> Result<(), BoxError>;
fn resolve_auth_schemes(&self) -> Result<Vec<String>, BoxError>;
}
F.A.Q.
- The orchestrator is a large and complex feature, with many moving parts. How can we ensure that multiple people can contribute in parallel?
- By defining the entire orchestrator and agreeing on its structure, we can then move on to working on individual runtime plugins and interceptors.
- What is the precedence of interceptors?
- The precedence of interceptors is as follows:
- Interceptors registered via Smithy default plugins.
- (AWS Services only) Interceptors registered via AWS default plugins.
- Interceptors registered via service-customization plugins.
- Interceptors registered via client-level plugins.
- Interceptors registered via client-level configuration.
- Interceptors registered via operation-level plugins.
- Interceptors registered via operation-level configuration.
- The precedence of interceptors is as follows:
- What runtime plugins will be defined in smithy-rs?
  - RetryStrategy: Configures how requests are retried.
  - TraceProbes: Configures locations to which SDK metrics are published.
  - EndpointProviders: Configures which hostname an SDK will call when making a request.
  - HTTPClients: Configures how remote services are called.
  - IdentityProviders: Configures how customers identify themselves to remote services.
  - HTTPAuthSchemes & AuthSchemeResolvers: Configures how customers authenticate themselves to remote services.
  - Checksum Algorithms: Configures how an SDK calculates request and response checksums.
Changes checklist
- Create a new aws-smithy-runtime crate.
  - Add orchestrator implementation
  - Define the orchestrator/runtime plugin interface traits
    - TraceProbe
    - RequestSerializer<In, TxReq>
    - ResponseDeserializer<TxRes, Out>
    - Connection<TxReq, TxRes>
    - RetryStrategy<Out>
    - AuthOrchestrator<Req>
    - EndpointOrchestrator<Req>
- Create a new aws-smithy-runtime-api crate.
  - Add ConfigBag module
  - Add retries module
    - Add rate_limiting sub-module
  - Add interceptors module
    - Interceptor trait
    - InterceptorContext impl
  - Add runtime_plugins module
- Create a new integration test that ensures the orchestrator works.
RFC: Collection Defaults
Status: Implemented
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC proposes a breaking change to how generated clients automatically provide default values for collections. Currently, the SDK's generated fields for List shapes generate optional values:
/// <p> Container for elements related to a particular part.
pub fn parts(&self) -> Option<&[crate::types::Part]> {
self.parts.as_deref()
}
This is almost never what users want and leads to code noise when using collections:
async fn get_builds() {
let project = codebuild
.list_builds_for_project()
.project_name(build_project)
.send()
.await?;
let build_ids = project
.ids()
.unwrap_or_default();
// ^^^^^^^^^^^^^^^^^^ this is pure noise
}
This RFC proposes unwrapping into default values in our accessor methods.
Terminology
- Accessor: The Rust SDK defines accessor methods on modeled structures for fields to make them more convenient for users
- Struct field: The accessors point to concrete fields on the struct itself.
The user experience if this RFC is implemented
In the current version of the SDK, users must call .unwrap_or_default() frequently.
Once this RFC is implemented, users will be able to use these accessors directly. In the rare case where users need to distinguish between None and [], we will direct users towards model.<field>.is_some().
async fn get_builds() {
let project = codebuild
.list_builds_for_project()
.project_name(build_project)
.send()
.await?;
let build_ids = project.ids();
// Goodbye to this line:
// .unwrap_or_default();
}
How to actually implement this RFC
In order to implement this feature, we need to update the code-generated accessors for lists and maps to add .unwrap_or_default(). Because we are returning slices, unwrap_or_default() does not produce any additional allocations for empty collections.
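The generated accessor from the earlier example would then look something like this:

/// <p> Container for elements related to a particular part.
pub fn parts(&self) -> &[crate::types::Part] {
    self.parts.as_deref().unwrap_or_default()
}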
Could this be implemented for HashMap?
This works for lists because we are returning a slice (allowing a statically owned &[] to be returned). If we want to support HashMaps in the future, this is possible by using OnceCell to create empty HashMaps for the requisite types. This would allow us to return references to those empty maps.
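As a sketch of that approach, here the standard library's OnceLock stands in for the OnceCell idea, and the key/value types are placeholders:

use std::collections::HashMap;
use std::sync::OnceLock;

// Lazily create a single shared empty map and hand out references to it.
fn empty_tag_map() -> &'static HashMap<String, String> {
    static EMPTY: OnceLock<HashMap<String, String>> = OnceLock::new();
    EMPTY.get_or_init(HashMap::new)
}

// An accessor could then return &HashMap<_, _> instead of Option<&HashMap<_, _>>:
// self.tags.as_ref().unwrap_or_else(|| empty_tag_map())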
Isn't this handled by the default trait?
No, many existing APIs don't have the default trait.
Changes checklist
Estimated total work: 2 days
- Update accessor method generation to auto flatten lists
- Update docs for accessors to guide users to .field.is_some() if they MUST determine if the field was set.
RFC: Eliminating Public http dependencies
Status: Accepted
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines how we plan to refactor the SDK so that it can consume a 1.0 version of hyper, http-body, and http at a later date. Currently, hyper is 0.14.x and a 1.0 release candidate series is in progress. However, there are open questions that may significantly delay the launch of these three crates. We do not want to tie the 1.0 of the Rust SDK to these crates.
Terminology
- http-body: A crate (and trait) defining how HTTP bodies work. Notably, the change from 0.* to 1.0 changes http-body to operate on frames instead of having separate methods.
- http (crate): a low-level crate of http primitives (no logic, just requests and responses)
- ossified dependency: An ossified dependency describes a dependency that, when a new version is released, cannot be utilized without breaking changes. For example, if the mutate_request function on every operation operates on &mut http::Request where http = 0.2, that dependency is "ossified." Compare this to a function that offers the ability to convert something into an http = 0.2 request—since http = 1 and http = 0.2 are largely equivalent, the existence of this function does not prevent us from using http = 1 in the future. In general terms, functions that operate on references are much more likely to ossify—there is no practical way for someone to mutate an http = 0.2 request if you have an http = 1 request other than a time-consuming clone and reconversion process. (A short sketch contrasting the two API shapes follows this list.)
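To make the contrast concrete, here is a hypothetical sketch of the two shapes; http_02 stands for the http = 0.2 crate renamed in Cargo.toml:

// Ossified: callers hand us an http = 0.2 request to mutate in place, so this
// signature can never accept an http = 1 request without a breaking change.
pub fn mutate_request(request: &mut http_02::Request<Vec<u8>>) {
    request.headers_mut().insert("x-example", "value".parse().unwrap());
}

// Not ossified: we only need *a way to obtain* an http = 0.2 request at the
// boundary; an http = 1 type could implement the same conversion later.
pub fn dispatch(request: impl Into<http_02::Request<Vec<u8>>>) {
    let request: http_02::Request<Vec<u8>> = request.into();
    // ... send the request ...
    let _ = request;
}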
Why is this important?
Performance:
At some point in the future, hyper = 1, http = 1, and http-body = 1 will be released. It takes ~1-2 microseconds to rebuild an HTTP request. If we assume that hyper = 1 will only operate on http = 1 requests, then if we can't use http = 1 requests internally, our only way of supporting hyper = 1 will be to convert the HTTP request at dispatch time. Besides pinning us to a potentially unsupported version of the HTTP crate, this will prevent us from directly dispatching requests in an efficient manner. With a total overhead of 20µs for the SDK, 1µs is not insignificant. Furthermore, it grows as the number of request headers grows. A benchmark should be run for a realistic HTTP request, e.g. one that we send to S3.
Hyper Upgrade: Hyper 1 is significantly more flexible than Hyper 0.14.x, especially with respect to connection management & pooling. If we don't make these changes, the upgrade to Hyper 1.x could be significantly more challenging.
Security Fixes:
If we're still on http = 0.* and a vulnerability is identified, we may end up needing to manually contribute the patch. The http crate is not trivial and contains parsing logic and optimized code (including a non-trivial amount of unsafe). See this GitHub issue. Notably, one issue may be unsound and result in changing the public API.
API Friendliness
If we ship with an API that publicly exposes customers to http = 0.*, we have the API forever. We have to consider that we aren't shipping the Rust SDK for this month or even this year but probably the Rust SDK for the next 5-10 years.
Future CRT Usage: If we make this change, we enable a future where we can use the CRT HTTP request type natively without needing a last-minute conversion to the CRT HTTP Request type.
struct HttpRequest {
inner: Inner
}
enum Inner {
Httpv0(http_0::Request),
Httpv1(http_1::Request),
Crt(aws_crt_http::Request)
}
The user experience if this RFC is implemented
Customers are impacted in 3 main locations:
- HTTP types in Interceptors
- HTTP types in customize(...)
- HTTP types in Connectors
In all three of these cases, users would interact with our http wrapper types instead.
In the current version of the SDK, we expose public dependencies on the http crate in several key places:
- The sigv4 crate. The sigv4 crate currently operates directly on many types from the http crate. This is unnecessary and actually makes the crate more difficult to use. Although http may be used internally, http will be removed from the public API of this crate.
- Interceptor Context: interceptors can mutate the HTTP request through an unshielded interface. This requires creating a wrapper layer around http::Request and updating already-written interceptors.
- aws-config: http::Response and uri
- A long tail of exposed requests and responses in the runtime crates. Many of these crates will be removed post-orchestrator, so this can be temporarily delayed.
How to actually implement this RFC
Enabling API evolution
One key mechanism that we SHOULD use for allowing our APIs to evolve in the future is usage of ~ version bounds for the runtime crates after releasing 1.0.
Http Request Wrapper
In order to enable HTTP evolution, we will create a set of wrapper structures around http::Request and http::Response. These will use http = 0 internally. Since the HTTP crate itself is quite small, including private dependencies on both versions of the crate is a workable solution. In general, we will aim for an API that is close to drop-in compatible with the HTTP crate while ensuring that a different crate could be used as the backing storage.
// since it's our type, we can default `SdkBody`
pub struct Request<B = SdkBody> {
// this uses the http = 0.2 request. In the future, we can make an internal enum to allow storing an http = 1
http_0: http::Request<B>
}
Conversion to/from http::Request
One key property here is that although converting to/from an http::Request can be expensive, this is not ossification of the API. This is because the API can support converting from/to both http = 0 and http = 1 in the future—because it offers mutation of the request via a unified interface, the request would only need to be converted once for dispatch if there was a mismatch (instead of repeatedly). At some point in the future, the http = 0 representation could be deprecated and removed or feature gated.
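A sketch of those conversions, building on the Request wrapper shown above (the method name is illustrative):

impl<B> From<http::Request<B>> for Request<B> {
    fn from(http_0: http::Request<B>) -> Self {
        Self { http_0 }
    }
}

impl<B> Request<B> {
    /// Convert back into an http = 0.2 request, e.g. right before dispatch.
    pub fn into_http02x(self) -> http::Request<B> {
        self.http_0
    }
}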
Challenges
- Creating an HTTP API which is forwards compatible, idiomatic and "truthful" without relying on existing types from Hyper—e.g. when adding a header, we need to account for the possibility that a header is invalid.
- Allow for future forwards-compatible evolution in the API—A lot of thought went into the
http
crate API w.r.t method parameters, types, and generics. Although we can aim for a simpler solution in some cases (e.g. accepting&str
instead ofHeaderName
), we need to be careful that we do so while allowing API evolution.
Removing the SigV4 HTTP dependency
The SigV4 crate signs a number of HTTP types directly. We should change it to accept strings and, when appropriate, iterators of strings for headers.
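For illustration, the signing entry point might move toward a shape like the following; the names and parameters are hypothetical, not the final sigv4 API:

/// Sign request metadata without depending on http types: the method, URI, and
/// headers are plain strings supplied by the caller.
pub fn sign_headers<'a>(
    method: &str,
    uri: &str,
    headers: impl Iterator<Item = (&'a str, &'a str)>,
    signing_name: &str,
    signing_region: &str,
) -> String {
    // Canonical request construction and signature calculation elided.
    todo!()
}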
Removing the HTTP dependency from generated clients
Generated clients currently include a public HTTP dependency in customize. This should be changed to accept our HTTP wrapper type instead, or be restricted to a subset of operations (e.g. add_header) while forcing users to add an interceptor if they need full control.
Changes checklist
- Create the http::Request wrapper. Carefully audit for compatibility without breaking changes. 5 days.
- Refactor currently written interceptors to use the wrapper: 2 days.
- Refactor the SigV4 crate to remove the HTTP dependency from the public interface: 2 days.
- Add / validate support for http-body = 1.0rc.2 for SdkBody, either in a PR or behind a feature gate. Test this to ensure it works with Hyper. Some previous work here exists: 1 week.
- Remove http::Response and Uri from the publicly exposed types in aws-config: 1-4 days.
- Long tail of other usages: 1 week.
- Implement ~ versions for SDK crate => runtime crate dependencies: 1 week.
RFC: The HTTP Wrapper Type
Status: RFC
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines the API of our wrapper types around http::Request and http::Response. For more information about why we are wrapping these types, see RFC 0036: The HTTP Dependency.
Terminology
- Extensions / "Request Extensions": The http crate Request/Response types include a typed property bag to store additional metadata along with the request.
The user experience if this RFC is implemented
In the current version of the SDK, external customers and internal code interact directly with the http crate. Once this RFC is implemented, interactions at the public API level will occur with our own http types instead.
Our types aim to be nearly drop-in compatible with types in the http crate, however:
- We will not expose existing HTTP types in public APIs in ways that are ossified.
- When possible, we aim to simplify the APIs to make them easier to use.
- We will add SDK specific helper functionality when appropriate, e.g. first-level support for applying an endpoint to a request.
How to actually implement this RFC
We will need to add two types: HttpRequest and HttpResponse.
To string or not to String
Our header library restricts header names and values to Strings (UTF-8).
The http library is very precise in its representation: it allows for HeaderValues that are both a super and subset of String—a superset because headers support arbitrary binary data, but a subset because headers cannot contain control characters like \n.
Although technically allowed, headers containing arbitrary binary data are not widely supported. Generally, Smithy protocols will use base-64 encoding when storing binary data in headers.
Finally, it's nicer for users if they can stay in "string land". Because of this, HttpRequest and HttpResponse expose header names and values as strings. Internally, the current design uses HeaderName and HeaderValue, however, there is a gate on construction that enforces that values are valid UTF-8.
This is a one-way door because .as_str() would panic in the future if we allow non-string values into headers.
Where should these types live?
These types will be used by all orchestrator functionality, so they will be housed in aws-smithy-runtime-api.
At the onset, these types focus on supporting the most ossified usages: &mut
modification of HTTP types. They do not
support construction of HTTP types, other than impl From<http::Request>
and From<http::Response>
. We will also make it
possible to use http::HeaderName
/ http::HeaderValue
in a zero-cost way.
The AsHeaderComponent trait
All header insertion methods accept impl AsHeaderComponent. This allows us to provide a nice user experience while taking advantage of zero-cost usage of 'static str. We will seal this trait to prevent external usage. We will have separate implementations for the following (a sketch of the sealed trait follows this list):
- &'static str
- String
- http02x::HeaderName
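Here is that sketch; the exact method set is illustrative, and http::header::HeaderName stands in for the http02x::HeaderName alias used above:

type BoxError = Box<dyn std::error::Error + Send + Sync>;

mod sealed {
    pub trait Sealed {}
    impl Sealed for &'static str {}
    impl Sealed for String {}
    impl Sealed for http::header::HeaderName {}
}

/// Types accepted wherever a header name or value is expected. The trait is
/// sealed so that only the implementations listed above can exist.
pub trait AsHeaderComponent: sealed::Sealed {
    /// View the component as a string slice, failing if it is not valid UTF-8.
    fn as_str(&self) -> Result<&str, BoxError>;
}

impl AsHeaderComponent for &'static str {
    fn as_str(&self) -> Result<&str, BoxError> {
        Ok(*self)
    }
}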
Additional Functionality
Our wrapper type will add the following additional functionality:
- Support for self.try_clone()
- Support for &mut self.apply_endpoint(...)
Handling failure
There is no stdlib type that cleanly defines what may be placed into headers—String is too broad (even if we restrict to ASCII). This RFC proposes moving fallibility to the APIs:
impl HeadersMut<'_> {
pub fn try_insert(
&mut self,
key: impl AsHeaderComponent,
value: impl AsHeaderComponent,
) -> Result<Option<String>, BoxError> {
// ...
}
}
This allows us to offer user-friendly types while still avoiding runtime panics. We also offer insert and append, which panic on invalid values.
Request Extensions
There is ongoing work which MAY restrict HTTP extensions to clone types. We will preempt that by:
- Preventing Extensions from being present when initially constructing our HTTP request wrapper.
- Forbidding non-clone extensions from being inserted into the wrapped request.
This also enables supporting request extensions for different downstream providers by allowing cloning into different extension types.
Proposed Implementation
Proposed Implementation of `request`
{{#include ../../../rust-runtime/aws-smithy-runtime-api/src/client/http/request.rs}}
Future Work
Currently, the only way to construct Request is from a compatible type (e.g. http02x::Request).
Changes checklist
- Write the initial implementation and test it against the SDK as written
- Add test suite of HTTP wrapper
- External design review
- Update the SigV4 crate to remove the http API dependency
- Update the SDK to use the new type (breaking change)
RFC: User-configurable retry classification
Status: Implemented
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines the user experience and implementation of user-configurable retry classification. Custom retry classifiers enable users to change what responses are retried while still allowing them to rely on defaults set by SDK authors when desired.
Terminology
- Smithy Service: An HTTP service, whose API is modeled with the Smithy IDL.
- Smithy Client: An HTTP client generated by smithy-rs from a .smithy model file.
- AWS SDK: A smithy client that's specifically configured to work with an AWS service.
- Operation: A modeled interaction with a service, defining the proper input and expected output shapes, as well as important metadata related to request construction. "Sending" an operation implies sending one or more HTTP requests to a Smithy service, and then receiving an output or error in response.
- Orchestrator: The client code which manages the request/response pipeline.
The orchestrator is responsible for:
- Constructing, serializing, and sending requests.
- Receiving, deserializing, and (optionally) retrying requests.
- Running interceptors (not covered in this RFC) and handling errors.
- Runtime Component: A part of the orchestrator responsible for a specific function. Runtime components are used by the orchestrator itself, may depend on specific configuration, and must not be changed by interceptors. Examples include the endpoint resolver, retry strategy, and request signer.
- Runtime Plugin: Code responsible for setting runtime components and related configuration. Runtime plugins defined by codegen are responsible for setting default configuration and altering the behavior of Smithy clients, including the AWS SDKs.
How the orchestrator should model retries
A Retry Strategy is the process by which the orchestrator determines when and how to retry failed requests. Only one retry strategy may be set at any given time. During its operation, the retry strategy relies on a series of Retry Classifiers to determine if and how a failed request should be retried. Retry classifiers each have a Retry Classifier Priority so that regardless of whether they are set during config or operation construction, they'll always run in a consistent order.
Classifiers are each run in turn by the retry strategy:
pub fn run_classifiers_on_ctx(
classifiers: impl Iterator<Item = SharedRetryClassifier>,
ctx: &InterceptorContext,
) -> RetryAction {
// By default, don't retry
let mut result = RetryAction::NoActionIndicated;
for classifier in classifiers {
let new_result = classifier.classify_retry(ctx);
// If the result is `NoActionIndicated`, continue to the next classifier
// without overriding any previously-set result.
if new_result == RetryAction::NoActionIndicated {
continue;
}
// Otherwise, set the result to the new result.
tracing::trace!(
"Classifier '{}' set the result of classification to '{}'",
classifier.name(),
new_result
);
result = new_result;
// If the result is `RetryForbidden`, stop running classifiers.
if result == RetryAction::RetryForbidden {
tracing::trace!("retry classification ending early because a `RetryAction::RetryForbidden` was emitted",);
break;
}
}
result
}
NOTE: User-defined retry strategies are responsible for calling run_classifiers_on_ctx.
Lower-priority classifiers run first, but the retry actions they return may be overridden by higher-priority classifiers. Classification stops immediately if any classifier returns RetryAction::RetryForbidden.
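For example, a strategy's attempt-evaluation logic might use the classifier result like this (the function name here is hypothetical; only the call to run_classifiers_on_ctx is the point):

fn should_attempt_retry(
    classifiers: impl Iterator<Item = SharedRetryClassifier>,
    ctx: &InterceptorContext,
) -> bool {
    match run_classifiers_on_ctx(classifiers, ctx) {
        RetryAction::RetryIndicated(_reason) => true,
        RetryAction::RetryForbidden | RetryAction::NoActionIndicated => false,
        // RetryAction is non-exhaustive, so treat unknown variants conservatively.
        _ => false,
    }
}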
The user experience if this RFC is implemented
In the current version of the SDK, users are unable to configure retry classification, except by defining a custom retry strategy. Once this RFC is implemented, users will be able to define and set their own classifiers.
Defining a custom classifier
#[derive(Debug)]
struct CustomRetryClassifier;
impl ClassifyRetry for CustomRetryClassifier {
fn classify_retry(
&self,
ctx: &InterceptorContext,
) -> RetryAction {
// Check for a result
let output_or_error = ctx.output_or_error();
// Check for an error
let error = match output_or_error {
// Typically, when the response is OK or unset
// then `RetryAction::NoActionIndicated` is returned.
Some(Ok(_)) | None => return RetryAction::NoActionIndicated,
Some(Err(err)) => err,
};
todo!("inspect the error to determine if a retry attempt should be made.")
}
fn name(&self) -> &'static str { "my custom retry classifier" }
fn priority(&self) -> RetryClassifierPriority {
RetryClassifierPriority::default()
}
}
Choosing a retry classifier priority
Sticking with the default priority is often the best choice. Classifiers should restrict the number of cases they can handle in order to avoid having to compete with other classifiers. When two classifiers would classify a response in two different ways, the priority system gives us the ability to decide which classifier should be respected.
Internally, priority is implemented with a simple numeric system. In order to give the smithy-rs team the flexibility to make future changes, this numeric system is private and inaccessible to users. Instead, users may set the priority of classifiers relative to one another with the with_lower_priority_than and with_higher_priority_than methods:
impl RetryClassifierPriority {
/// Create a new `RetryClassifierPriority` with lower priority than the given priority.
pub fn with_lower_priority_than(other: Self) -> Self { ... }
/// Create a new `RetryClassifierPriority` with higher priority than the given priority.
pub fn with_higher_priority_than(other: Self) -> Self { ... }
}
For example, if it was important for our CustomRetryClassifier in the previous example to run before the default HttpStatusCodeClassifier, a user would define the CustomRetryClassifier priority like this:
impl ClassifyRetry for CustomRetryClassifier {
fn priority(&self) -> RetryClassifierPriority {
RetryClassifierPriority::with_lower_priority_than(RetryClassifierPriority::http_status_code_classifier())
}
}
The priorities of the three default retry classifiers (HttpStatusCodeClassifier, ModeledAsRetryableClassifier, and TransientErrorClassifier) are all public for this purpose. Users may ONLY set a retry priority relative to an existing retry priority.
RetryAction and RetryReason
Retry classifiers communicate to the retry strategy by emitting RetryActions:
/// The result of running a [`ClassifyRetry`] on a [`InterceptorContext`].
#[non_exhaustive]
#[derive(Clone, Eq, PartialEq, Debug, Default)]
pub enum RetryAction {
/// When a classifier can't run or has no opinion, this action is returned.
///
/// For example, if a classifier requires a parsed response and response parsing failed,
/// this action is returned. If all classifiers return this action, no retry should be
/// attempted.
#[default]
NoActionIndicated,
/// When a classifier runs and thinks a response should be retried, this action is returned.
RetryIndicated(RetryReason),
/// When a classifier runs and decides a response must not be retried, this action is returned.
///
/// This action stops retry classification immediately, skipping any following classifiers.
RetryForbidden,
}
When a retry is indicated by a classifier, the action will contain a RetryReason:
/// The reason for a retry.
#[non_exhaustive]
#[derive(Clone, Eq, PartialEq, Debug)]
pub enum RetryReason {
/// When an error is received that should be retried, this reason is returned.
RetryableError {
/// The kind of error.
kind: ErrorKind,
/// A server may tell us to retry only after a specific time has elapsed.
retry_after: Option<Duration>,
},
}
NOTE: RetryReason currently only has a single variant, but it's defined as an enum for forward-compatibility purposes.
RetryAction's impl defines several convenience methods:
impl RetryAction {
/// Create a new `RetryAction` indicating that a retry is necessary.
pub fn retryable_error(kind: ErrorKind) -> Self {
Self::RetryIndicated(RetryReason::RetryableError {
kind,
retry_after: None,
})
}
/// Create a new `RetryAction` indicating that a retry is necessary after an explicit delay.
pub fn retryable_error_with_explicit_delay(kind: ErrorKind, retry_after: Duration) -> Self {
Self::RetryIndicated(RetryReason::RetryableError {
kind,
retry_after: Some(retry_after),
})
}
/// Create a new `RetryAction` indicating that a retry is necessary because of a transient error.
pub fn transient_error() -> Self {
Self::retryable_error(ErrorKind::TransientError)
}
/// Create a new `RetryAction` indicating that a retry is necessary because of a throttling error.
pub fn throttling_error() -> Self {
Self::retryable_error(ErrorKind::ThrottlingError)
}
/// Create a new `RetryAction` indicating that a retry is necessary because of a server error.
pub fn server_error() -> Self {
Self::retryable_error(ErrorKind::ServerError)
}
/// Create a new `RetryAction` indicating that a retry is necessary because of a client error.
pub fn client_error() -> Self {
Self::retryable_error(ErrorKind::ClientError)
}
}
Setting classifiers
The interface for setting classifiers is very similar to the interface for setting interceptors:
// All service configs support these setters. Operations support a nearly identical API.
impl ServiceConfigBuilder {
/// Add type implementing ClassifyRetry that will be used by the RetryStrategy
/// to determine what responses should be retried.
///
/// A retry classifier configured by this method will run according to its priority.
pub fn retry_classifier(mut self, retry_classifier: impl ClassifyRetry + 'static) -> Self {
self.push_retry_classifier(SharedRetryClassifier::new(retry_classifier));
self
}
/// Add a SharedRetryClassifier that will be used by the RetryStrategy to
/// determine what responses should be retried.
///
/// A retry classifier configured by this method will run according to its priority.
pub fn push_retry_classifier(&mut self, retry_classifier: SharedRetryClassifier) -> &mut Self {
self.runtime_components.push_retry_classifier(retry_classifier);
self
}
/// Set SharedRetryClassifiers for the builder, replacing any that were
/// previously set.
pub fn set_retry_classifiers(&mut self, retry_classifiers: impl IntoIterator<Item = SharedRetryClassifier>) -> &mut Self {
self.runtime_components.set_retry_classifiers(retry_classifiers.into_iter());
self
}
}
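For illustration, a hedged usage sketch of these setters (where `aws_sdk_foo` is a placeholder for any generated SDK crate and `CustomRetryClassifier` is the example classifier defined earlier in this RFC):

```rust
// Register a custom classifier on a service config. It will run alongside the
// default classifiers, ordered by its priority.
let config = aws_sdk_foo::Config::builder()
    .retry_classifier(CustomRetryClassifier)
    .build();
let client = aws_sdk_foo::Client::from_conf(config);
```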
Default classifiers
Smithy clients have three classifiers enabled by default:
- `ModeledAsRetryableClassifier`: Checks for errors that are marked as retryable in the smithy model. If one is encountered, returns `RetryAction::RetryIndicated`. Requires a parsed response.
- `TransientErrorClassifier`: Checks for timeout, IO, and connector errors. If one is encountered, returns `RetryAction::RetryIndicated`. Requires a parsed response.
- `HttpStatusCodeClassifier`: Checks the HTTP response's status code. By default, this classifies `500`, `502`, `503`, and `504` errors as `RetryAction::RetryIndicated`. The list of retryable status codes may be customized when creating this classifier with the `HttpStatusCodeClassifier::new_from_codes` method.
AWS clients enable the three smithy classifiers as well as one more by default:
- `AwsErrorCodeClassifier`: Checks for errors with AWS error codes marking them as either transient or throttling errors. If one is encountered, returns `RetryAction::RetryIndicated`. Requires a parsed response. This classifier will also check the HTTP response for an `x-amz-retry-after` header. If one is set, then the returned `RetryAction` will include the explicit delay.
The priority order of these classifiers is as follows:
- (highest priority) `TransientErrorClassifier`
- `ModeledAsRetryableClassifier`
- `AwsErrorCodeClassifier`
- (lowest priority) `HttpStatusCodeClassifier`
The priority order of the default classifiers is not configurable. However, it's possible to wrap a default classifier in a newtype and set your desired priority when implementing the `ClassifyRetry` trait, delegating the `classify_retry` and `name` methods to the inner classifier.
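Such a newtype might look like the following sketch. This is illustrative only: `PriorityStatusCodeClassifier` is a hypothetical name, and the `classify_retry` signature is assumed to match the `ClassifyRetry` trait shown earlier in this RFC.

```rust
#[derive(Debug)]
struct PriorityStatusCodeClassifier(HttpStatusCodeClassifier);

impl ClassifyRetry for PriorityStatusCodeClassifier {
    fn classify_retry(&self, ctx: &InterceptorContext) -> RetryAction {
        // Delegate classification to the wrapped default classifier.
        self.0.classify_retry(ctx)
    }

    fn name(&self) -> &'static str {
        // Delegate the name as well.
        self.0.name()
    }

    fn priority(&self) -> RetryClassifierPriority {
        // Run at a custom priority instead of the wrapped classifier's default.
        RetryClassifierPriority::with_higher_priority_than(
            RetryClassifierPriority::http_status_code_classifier(),
        )
    }
}
```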
Disable default classifiers
Disabling the default classifiers is possible, but not easy. They are set at different points during config and operation construction, and must be unset at each of those places. A far simpler solution is to implement your own classifier that has the highest priority.
Still, if completely removing the other classifiers is desired, use the
set_retry_classifiers
method on the config to replace the config-level
defaults and then set a config override on the operation that does the same.
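As a hedged sketch of that approach (`aws_sdk_foo` and `some_operation` are placeholders, and `CustomRetryClassifier` is the example classifier from earlier):

```rust
// Replace the config-level default classifiers with only our own.
let mut builder = aws_sdk_foo::Config::builder();
builder.set_retry_classifiers([SharedRetryClassifier::new(CustomRetryClassifier)]);
let client = aws_sdk_foo::Client::from_conf(builder.build());

// Do the same in a config override so operation-level defaults are replaced too.
let mut override_builder = aws_sdk_foo::Config::builder();
override_builder.set_retry_classifiers([SharedRetryClassifier::new(CustomRetryClassifier)]);
client
    .some_operation()
    .customize()
    .config_override(override_builder)
    .send()
    .await;
```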
How to actually implement this RFC
In order to implement this feature, we must:
- Update the current retry classification system so that individual classifiers as well as collections of classifiers can be easily composed together.
- Create two new configuration mechanisms for users that allow them to customize retry classification at the service level and at the operation level.
- Update retry classifiers so that they may 'short-circuit' the chain, ending retry classification immediately.
The `RetryClassifier` trait
/// The result of running a [`ClassifyRetry`] on a [`InterceptorContext`].
#[non_exhaustive]
#[derive(Clone, Eq, PartialEq, Debug)]
pub enum RetryAction {
/// When an error is received that should be retried, this action is returned.
Retry(ErrorKind),
/// When the server tells us to retry after a specific time has elapsed, this action is returned.
RetryAfter(Duration),
/// When a response should not be retried, this action is returned.
NoRetry,
}
/// Classifies what kind of retry is needed for a given [`InterceptorContext`].
pub trait ClassifyRetry: Send + Sync + fmt::Debug {
/// Run this classifier on the [`InterceptorContext`] to determine if the previous request
/// should be retried. If the classifier makes a decision, `Some(RetryAction)` is returned.
/// Classifiers may also return `None`, signifying that they have no opinion of whether or
/// not a request should be retried.
fn classify_retry(
&self,
ctx: &InterceptorContext,
preceding_action: Option<RetryAction>,
) -> Option<RetryAction>;
/// The name of this retry classifier.
///
/// Used for debugging purposes.
fn name(&self) -> &'static str;
/// The priority of this retry classifier. Classifiers with a higher priority will run before
/// classifiers with a lower priority. Classifiers with equal priorities make no guarantees
/// about which will run first.
fn priority(&self) -> RetryClassifierPriority {
RetryClassifierPriority::default()
}
}
Resolving the correct order of multiple retry classifiers
Because each classifier has a defined priority, and because
RetryClassifierPriority
implements PartialOrd
and Ord
, the standard
library's sort method may be used to correctly arrange classifiers. The
RuntimeComponents
struct is responsible for storing classifiers, so it's also
responsible for sorting them whenever a new classifier is added. Thus, when a
retry strategy fetches the list of classifiers, they'll already be in the
expected order.
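A minimal sketch of that sorting step follows; the field and function names are illustrative, not the actual `RuntimeComponents` internals:

```rust
fn push_retry_classifier(
    classifiers: &mut Vec<SharedRetryClassifier>,
    new_classifier: SharedRetryClassifier,
) {
    classifiers.push(new_classifier);
    // `RetryClassifierPriority` implements `Ord`, so a plain sort keeps the list in
    // priority order; the retry strategy then reads the classifiers as-is.
    classifiers.sort_by_key(|classifier| classifier.priority());
}
```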
Questions and answers
- Q: Should retry classifiers be fallible?
- A: I think no, because of the added complexity. If we make them fallible then we'll have to decide what happens when classifiers fail. Do we skip them, or does classification end? The retry strategy is responsible for calling the classifiers, so it should be responsible for deciding how to handle a classifier error. I don't foresee a use case where an error returned by a classifier would be interpreted either by classifiers following the failed classifier or by the retry strategy.
Changes checklist
- Add retry classifiers field and setters to `RuntimeComponents` and `RuntimeComponentsBuilder`.
- Add unit tests ensuring that classifier priority is respected by `RuntimeComponents::retry_classifiers`, especially when multiple layers of config are in play.
- Add codegen customization allowing users to set retry classifiers on service configs.
- Add codegen for setting default classifiers at the service level.
- Add integration tests for setting classifiers at the service level.
- Add codegen for setting default classifiers that require knowledge of operation error types at the operation level.
- Add integration tests for setting classifiers at the operation level.
- Implement retry classifier priority.
- Add unit tests for retry classifier priority.
- Update existing tests that would fail for lack of a retry classifier.
RFC: Forward Compatible Errors
Status: RFC
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines an approach for making it forwards-compatible to convert unmodeled Unhandled
errors into modeled ones. This occurs as servers update their models to include errors that were previously unmodeled.
Currently, SDK errors are not forward compatible in this way. If a customer matches `Unhandled` in addition to the `_` branch and a new variant is added, they will fail to match the new variant. We currently handle this issue with enums by preventing useful information from being readable from the `Unknown` variant.
This is related to ongoing work on the non_exhaustive_omitted_patterns
lint which would produce a compiler warning when a new variant was added even when _
was used.
Terminology
For purposes of discussion, consider the following error:
#[non_exhaustive]
pub enum AbortMultipartUploadError {
NoSuchUpload(NoSuchUpload),
Unhandled(Unhandled),
}
- Modeled Error: An error with a named variant, e.g. `NoSuchUpload` above.
- Unmodeled Error: Any other error, e.g. if the server returned `ValidationException` for the above operation.
- Error code: All errors across all protocols provide a `code`, a unique way to identify an error across the service closure.
The user experience if this RFC is implemented
In the current version of the SDK, users match the Unhandled
variant. They can then read the code from the Unhandled
variant because Unhandled
implements the ProvideErrorMetadata
trait as well as the standard-library std::error::Error
trait.
Note: It's possible to write correct code today because the operation-level and service-level errors already expose `code()` via `ProvideErrorMetadata`. This RFC describes mechanisms to guide customers to write forward-compatible code.
fn docs() {
match client.get_object().send().await {
Ok(obj) => { ... },
Err(e) => match e.into_service_error() {
GetObjectError::NotFound => { ... },
GetObjectError::Unhandled(err) if err.code() == "ValidationException" => { ... }
other => { /** do something with this variant */ }
}
}
}
We must instead guide customers into the following pattern:
fn docs() {
match client.get_object().send().await {
Ok(obj) => { ... },
Err(e) => match e.into_service_error() {
GetObjectError::NotFound => { ... },
err if err.code() == "ValidationException" => { ... },
err => warn!("{}", err.code()),
}
}
}
In this example, because customers are not matching on the `Unhandled` variant explicitly, this code is forward compatible with `ValidationException` being introduced in the future.
Guiding Customers to this Pattern
There are two areas we need to handle:
- Prevent customers from extracting useful information from `Unhandled`
- Alert customers currently using `Unhandled` what to use instead. For example, the following code is still problematic:
match err {
    GetObjectError::NotFound => { ... },
    err @ GetObjectError::Unhandled(_) if err.code() == Some("ValidationException") => { ... }
}
For 1, we need to remove the `ProvideErrorMetadata` trait implementation from `Unhandled`. We would instead expose this through a layer of indirection so that generated code can still read the data.
For 2, we would deprecate the `Unhandled` variants with a message clearly indicating how this code should be written.
How to actually implement this RFC
Locking down Unhandled
In order to prevent accidental matching on Unhandled
, we need to make it hard to extract useful information from Unhandled
itself. We will do this by removing the ProvideErrorMetadata
trait implementation and exposing the following method:
#[doc(hidden)]
/// Introspect the error metadata of this error.
///
/// This method should NOT be used from external code because matching on `Unhandled` directly is a backwards-compatibility
/// hazard. See `RFC-0039` for more information.
pub fn introspect(&self) -> impl ProvideErrorMetadata + '_ {
struct Introspected<'a>(&'a Unhandled);
impl ProvideErrorMetadata for Introspected<'_> { ... }
Introspected(self)
}
Generated code would then use `introspect` when supporting top-level `ErrorMetadata` (e.g. for `aws_sdk_s3::Error`).
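For illustration only (this helper is hypothetical, not part of the proposal), crate-internal code could read metadata through the introspection helper without matching on `Unhandled` directly:

```rust
// `error_code_of` is an illustrative helper, not generated code.
fn error_code_of(err: &Unhandled) -> Option<String> {
    // Bind the introspected view, then read the error code through `ProvideErrorMetadata`.
    let introspected = err.introspect();
    introspected.code().map(ToString::to_string)
}
```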
Deprecating the Variant
The Unhandled
variant will be deprecated to prevent users from matching on it inadvertently.
enum GetObjectError {
NotFound(NotFound),
#[deprecated(note = "Matching on `Unhandled` directly is a backwards compatibility hazard. Use `err if err.error_code() == ...` instead. See [here](<docs about using errors>) for more information.")]
Unhandled(Unhandled)
}
Changes checklist
- Generate code to deprecate unhandled variants. Determine the best way to allow `Unhandled` to continue to be constructed in client code.
- Generate code to deprecate the `Unhandled` variant for the service meta-error. Consider how this interacts with non-service errors.
- Update `Unhandled` to make it useless on its own and expose information via an `Introspect` doc hidden struct.
- Update developer guide to address this issue.
- Changelog & Upgrade Guidance
RFC: Behavior Versions
Status: RFC
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC describes "Behavior Versions," a mechanism to allow SDKs to ship breaking behavioral changes like a new retry strategy, while allowing customers who rely on extremely consistent behavior to evolve at their own pace.
By adding behavior major versions (BMV) to the Rust SDK, we will make it possible to ship new secure/recommended defaults to new customers without impacting legacy customers.
The fundamental issue stems from our inability to communicate and decouple releases of service updates and behavior within a single major version.
Both legacy and new SDKs have the need to alter their SDK defaults. Historically, this caused new customers on legacy SDKs to be subject to legacy defaults, even when a better alternative existed.
For new SDKs, a GA cutline presents difficult choices around timeline and features that can’t be added later without altering behavior.
Both of these use cases are addressed by Behavior Versions.
The user experience if this RFC is implemented
In the current version of the SDK, users can construct clients without indicating any sort of behavior major version. Once this RFC is implemented, there will be two ways to set a behavior major version:
- In code via `aws_config::defaults(BehaviorVersion::latest())` and `<service>::Config::builder().behavior_version(...)`. This will also work for `config_override`.
- By enabling `behavior-version-latest` in either `aws-config` (which brings back `from_env`) OR a specific generated SDK crate:
# Cargo.toml
[dependencies]
aws-config = { version = "1", features = ["behavior-version-latest"] }
# OR
aws-sdk-s3 = { version = "1", features = ["behavior-version-latest"] }
If no `BehaviorVersion` is set, the client will panic during construction.
`BehaviorVersion` is an opaque struct with initializers like `::latest()` and `::v2023_11_09()`. Downstream code can check the version by calling methods like `::supports_v1()`.
When new BMVs are added, the previous version constructor will be marked as `deprecated`. This serves as a mechanism to alert customers that a new BMV exists, allowing them to upgrade.
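As a sketch of the in-code option (using `aws_sdk_s3` purely as an example crate; the version constructor shown is the one named above):

```rust
// Set the behavior version once on shared config...
let shared_config = aws_config::defaults(BehaviorVersion::latest()).load().await;
let client = aws_sdk_s3::Client::new(&shared_config);

// ...or pin a specific behavior major version on a single service config.
let config = aws_sdk_s3::Config::builder()
    .behavior_version(BehaviorVersion::v2023_11_09())
    .build();
let pinned_client = aws_sdk_s3::Client::from_conf(config);
```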
How to actually implement this RFC
In order to implement this feature, we need to create a BehaviorVersion
struct, add config options to SdkConfig
and aws-config
, and wire it throughout the stack.
/// Behavior major-version of the client
///
/// Over time, new best-practice behaviors are introduced. However, these behaviors might not be backwards
/// compatible. For example, a change which introduces new default timeouts or a new retry-mode for
/// all operations might be the ideal behavior but could break existing applications.
#[derive(Debug, Clone)]
pub struct BehaviorVersion {
    // currently there is only 1 MV so we don't actually need anything in here.
    _private: (),
}
To help customers migrate, we are including deprecated `from_env` hooks that set `behavior-version-latest`. This allows customers to see that they are missing the required cargo feature and add it to remove the deprecation warning.
Internally, BehaviorVersion
will become an additional field on <client>::Config
. It is not ever stored in the ConfigBag
or in RuntimePlugins
.
When constructing the set of "default runtime plugins," the default runtime plugin parameters will be passed the BehaviorVersion
. This will select the correct runtime plugin. Logging will clearly indicate which plugin was selected.
Design Alternatives Considered
An original design was also considered that made BMV optional and relied on documentation to steer customers in the right direction. This was deemed too weak of a mechanism to ensure that customers aren't broken by unexpected changes.
Changes checklist
- Create `BehaviorVersion` and the BMV runtime plugin
- Add BMV as a required runtime component
- Wire up setters throughout the stack
- Add tests of BMV (set via aws-config, cargo features & code params)
- ~~Remove `aws_config::from_env` deprecation stand-ins~~ We decided to persist these deprecations
- Update generated usage examples
RFC: Improve Client Error Ergonomics
Status: Implemented
Applies to: clients
This RFC proposes some changes to code generated errors to make them easier to use for customers. With the SDK and code generated clients, customers have two primary use-cases that should be made easy without compromising the compatibility rules established in RFC-0022:
- Checking the error type
- Retrieving information specific to that error type
Case Study: Handling an error in S3
The following is an example of handling errors with S3 with the latest generated (and unreleased) SDK as of 2022-12-07:
let result = client
.get_object()
.bucket(BUCKET_NAME)
.key("some-key")
.send()
.await;
match result {
Ok(_output) => { /* Do something with the output */ }
Err(err) => match err.into_service_error() {
GetObjectError { kind, .. } => match kind {
GetObjectErrorKind::InvalidObjectState(value) => println!("invalid object state: {:?}", value),
GetObjectErrorKind::NoSuchKey(_) => println!("object didn't exist"),
}
err @ GetObjectError { .. } if err.code() == Some("SomeUnmodeledError") => {}
err @ _ => return Err(err.into()),
},
}
The refactor that implemented RFC-0022 added the into_service_error()
method on SdkError
that
infallibly converts the SdkError
into the concrete error type held by the SdkError::ServiceError
variant.
This improvement lets customers discard transient failures and immediately handle modeled errors
returned by the service.
Despite this, the code is still quite verbose.
Proposal: Combine Error
and ErrorKind
At time of writing, each operation has both an Error
and ErrorKind
type generated.
The Error
type holds information that is common across all operation errors: message,
error code, "extra" key/value pairs, and the request ID.
The ErrorKind
is always nested inside the Error
, which results in the verbose
nested matching shown in the case study above.
To make error handling more ergonomic, the code generated Error
and ErrorKind
types
should be combined. Hypothetically, this would allow for the case study above to look as follows:
let result = client
.get_object()
.bucket(BUCKET_NAME)
.key("some-key")
.send()
.await;
match result {
Ok(_output) => { /* Do something with the output */ }
Err(err) => match err.into_service_error() {
GetObjectError::InvalidObjectState(value) => {
println!("invalid object state: {:?}", value);
}
err if err.is_no_such_key() => {
println!("object didn't exist");
}
err if err.code() == Some("SomeUnmodeledError") => {}
err @ _ => return Err(err.into()),
},
}
If a customer only cares about checking one specific error type, they can also do:
match result {
Ok(_output) => { /* Do something with the output */ }
Err(err) => {
let err = err.into_service_error();
if err.is_no_such_key() {
println!("object didn't exist");
} else {
return Err(err);
}
}
}
The downside of this is that combining the error types requires adding the general error metadata to each generated error struct so that it's accessible by the enum error type. However, this aligns with our tenet of making things easier for customers even if it makes it harder for ourselves.
Changes Checklist
- Merge the `${operation}Error`/`${operation}ErrorKind` code generators to only generate an `${operation}Error` enum:
  - Preserve the `is_${variant}` methods
  - Preserve error metadata by adding it to each individual variant's context struct
- Write upgrade guidance
- Fix examples
RFC: File-per-change changelog
Status: Implemented
Applies to: client and server
For a summarized list of proposed changes, see the Changes Checklist section.
Historically, the smithy-rs and AWS SDK for Rust's changelogs and release notes have been
generated from the changelogger
tool in tools/ci-build/changelogger
. This is a tool built
specifically for development and release of smithy-rs, and it requires developers to add
changelog entries to a root CHANGELOG.next.toml
file. Upon release, the [[smithy-rs]]
entries
in this file go into the smithy-rs release notes, and the [[aws-sdk-rust]]
entries are associated
with a smithy-rs release commit hash, and added to the aws/SDK_CHANGELOG.next.json
for
incorporation into the AWS SDK's changelog when it releases.
This system has gotten us far, but it has always made merging PRs into main more difficult
since the central CHANGELOG.next.toml
file is almost always a merge conflict for two PRs
with changelog entries.
This RFC proposes a new approach to change logging that will remedy the merge conflict issue, and explains how this can be done without disrupting the current release process.
The proposed developer experience
There will be a changelog/
directory in the smithy-rs root where
developers can add changelog entry Markdown files. Any file name can be picked
for these entries. Suggestions are the development branch name for the
change, or the PR number.
The changelog entry format will change to make it easier to duplicate entries across both smithy-rs and aws-sdk-rust, a common use-case.
This new format will make use of Markdown front matter in the YAML format. This change in format has a couple benefits:
- It's easier to write change entries in Markdown than in a TOML string.
- There's no way to escape special characters (such as quotes) in a TOML string, so the text that can be a part of the message will be expanded.
While it would be preferable to use TOML for the front matter (and there are libraries that support that), it will use YAML so that GitHub's Markdown renderer will recognize it.
A changelog entry file will look as follows:
---
# Adding `aws-sdk-rust` here duplicates this entry into the SDK changelog.
applies_to: ["client", "server", "aws-sdk-rust"]
authors: ["author1", "author2"]
references: ["smithy-rs#1234", "aws-sdk-rust#1234"]
# The previous `meta` section is broken up into its constituents:
breaking: false
# This replaces "tada":
new_feature: false
bug_fix: false
---
Some message for the change.
Implementation
When a release is performed, the release script will generate the release notes,
update the CHANGELOG.md
file, copy SDK changelog entries into the SDK,
and delete all the files in changelog/
.
SDK Entries
The SDK changelog entries currently end up in aws/SDK_CHANGELOG.next.json
, and each entry
is given age
and since_commit
entries. The age is a number that starts at zero, and gets
incremented with every smithy-rs release. When it reaches a hardcoded threshold, that entry
is removed from aws/SDK_CHANGELOG.next.json
. The SDK release process uses the since_commit
to determine which changelog entries go into the next SDK release's changelog.
The SDK release process doesn't write back to smithy-rs, and history has shown that it
can't since this leads to all sorts of release issues as PRs get merged into smithy-rs
while the release is in progress. Thus, this age
/since_commit
dichotomy needs to
stay in place.
The aws/SDK_CHANGELOG.next.json
will stay in place in its current format without changes.
Its JSON format is capable of escaping characters in the message string, so it will be
compatible with the transition from TOML to Markdown with YAML front matter.
The SDK_CHANGELOG.next.json
file has had merge conflicts in the past, but this only
happened when the release process wasn't followed correctly. If we're consistent with
our release process, it should never have conflicts.
Safety requirements
Implementation will be tricky since it needs to be done without disrupting the existing
release process. The biggest area of risk is the SDK sync job that generates individual
commits in the aws-sdk-rust repo for each commit in the smithy-rs release. Fortunately,
the changelogger
is invoked a single time at the very end of that process, and only the latest `changelogger` version included in the build image is used. Thus, we can safely
refactor the changelogger
tool so long as the command-line interface for it remains
backwards compatible. (We could change the CLI interface as well, but it will
require synchronizing the smithy-rs changes with changes to the SDK release scripts.)
At a high level, these requirements must be observed to do this refactor safely:
- The CLI for the
changelogger render
subcommand MUST stay the same, or have minimal backwards compatible changes made to it. - The
SDK_CHANGELOG.next.json
format can change, but MUST remain a single JSON file. If it is changed at all, the existing file MUST be transitioned to the new format, and a mechanism MUST be in place for making sure it is the correct format after merging with other PRs. It's probably better to leave this file alone though, or make any changes to it backwards compatible.
Future Improvements
After the initial migration, additional niceties could be added such as pulling authors from git history rather than needing to explicitly state them (at least by default; there should always be an option to override the author in case a maintainer adds a changelog entry on behalf of a contributor).
Changes checklist
- Refactor changelogger and smithy-rs-tool-common to separate the changelog serialization format from the internal representation used for rendering and splitting.
- Implement deserialization for the new Markdown entry format
- Incorporate new format into the `changelogger render` subcommand
- Incorporate new format into the `changelogger split` subcommand
- Port existing `CHANGELOG.next.toml` to individual entries
- Update `sdk-lints` to fail if `CHANGELOG.next.toml` exists at all to avoid losing changelog entries during merges.
- Dry-run test against the smithy-rs release process.
- Dry-run test against the SDK release process.
RFC: Identity Cache Partitions
Status: Implemented
Applies to: AWS SDK for Rust
Motivation
In the below example two clients are created from the same shared SdkConfig
instance and each
invoke a fictitious operation. Assume the operations use the same auth scheme relying on the same identity resolver.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let config = aws_config::defaults(BehaviorVersion::latest())
.load()
.await;
let c1 = aws_sdk_foo::Client::new(&config);
c1.foo_operation().send().await;
let c2 = aws_sdk_bar::Client::new(&config);
c2.bar_operation().send().await;
Ok(())
}
There are two problems with this currently.
- The identity resolvers (e.g. `credentials_provider` for SigV4) are re-used, but we end up with a different `IdentityCachePartition` each time a client is created.
  - More specifically, this happens every time a `SharedIdentityResolver` is created. The `From<SdkConfig>` conversion sets the credentials provider, which associates it as the identity resolver for the auth scheme. Internally this is converted to a `SharedIdentityResolver`, which creates the new partition (if it were already a `SharedIdentityResolver` this would be detected and a new instance would not be created, which means it must be a `SharedCredentialsProvider` or `SharedTokenProvider` that is getting converted). The end result is that the credentials provider from shared config is re-used, but the cache partition differs, so a cache miss occurs the first time any new client created from that shared config needs credentials.
- The `SdkConfig` does not create an identity cache by default. Even if the partitioning is fixed, any clients created from a shared config instance will end up with their own identity cache, which also results in having to resolve identity again. Only if a user supplies an identity cache explicitly when creating shared config would it be re-used across different clients.
Design intent
Identity providers and identity caching are intentionally decoupled. This allows caching behavior to be more easily
customized and centrally configured while also removing the need for each identity provider to have to implement
caching. There is some fallout from sharing an identity cache though. This is fairly well documented on
IdentityCachePartition
itself.
/// ...
///
/// Identities need cache partitioning because a single identity cache is used across
/// multiple identity providers across multiple auth schemes. In addition, a single auth scheme
/// may have many different identity providers due to operation-level config overrides.
///
/// ...
pub struct IdentityCachePartition(...)
Cache partitioning allows for different identity types to be stored in the same cache instance as long as they are assigned to a different partition. Partitioning also solves the issue of overriding configuration on a per operation basis where it would not be the correct or desired behavior to re-use or overwrite the cache if a different resolver is used.
In other words cache partitioning is effectively tied to a particular instance of an identity resolver. Re-using the same instance of a resolver SHOULD be allowed to share a cache partition. The fact that this isn't the case today is an oversight in how types are wrapped and threaded through the SDK.
The user experience if this RFC is implemented
In the current version of the SDK, users are unable to share cached results of identity resolvers via shared SdkConfig
across clients.
Once this RFC is implemented, users that create clients via SdkConfig
with the latest behavior version will share
a default identity cache. Shared identity resolvers (e.g. credentials_provider
, token_provider
, etc) will provide
their own cache partition that is re-used instead of creating a new one each time a provider is converted into a
SharedIdentityResolver
.
Default behavior
let config = aws_config::defaults(BehaviorVersion::latest())
.load()
.await;
let c1 = aws_sdk_foo::Client::new(&config);
c1.foo_operation().send().await;
let c2 = aws_sdk_bar::Client::new(&config);
// will re-use credentials/identity resolved via c1
c2.bar_operation().send().await;
Operations invoked on c2
will see the results of cached identities resolved by client c1
(for operations that use
the same identity resolvers). The creation of a default identity cache in SdkConfig
if not provided will be added
behind a new behavior version.
Opting out
Users can disable the shared identity cache by explicitly setting it to None
. This will result in each client
creating their own identity cache.
let config = aws_config::defaults(BehaviorVersion::latest())
// new method similar to `no_credentials()` to disable default cache setup
.no_identity_cache()
.load()
.await;
let c1 = aws_sdk_foo::Client::new(&config);
c1.foo_operation().send().await;
let c2 = aws_sdk_bar::Client::new(&config);
c2.bar_operation().send().await;
The same can be achieved by explicitly supplying a new identity cache to a client:
let config = aws_config::defaults(BehaviorVersion::latest())
.load()
.await;
let c1 = aws_sdk_foo::Client::new(&config);
c1.foo_operation().send().await;
let modified_config = aws_sdk_bar::Config::from(&config)
.to_builder()
.identity_cache(IdentityCache::lazy().build())
.build();
// uses its own identity cache
let c2 = aws_sdk_bar::Client::from_conf(modified_config);
c2.bar_operation().send().await;
Interaction with operation config override
How per/operation configuration override behaves depends on what is provided for an identity resolver.
let config = aws_config::defaults(BehaviorVersion::latest())
.load()
.await;
let c1 = aws_sdk_foo::Client::new(&config);
let scoped_creds = my_custom_provider();
let config_override = c1
.config()
.to_builder()
.credentials_provider(scoped_creds);
// override config for two specific operations
c1.operation1()
    .customize()
    .config_override(config_override.clone()) // cloned so the same override can be reused below
    .send()
    .await;
c1.operation2()
    .customize()
    .config_override(config_override)
    .send()
    .await;
By default, if an identity resolver does not provide its own cache partition, then `operation1` and `operation2` will be wrapped in new `SharedIdentityResolver` instances and get distinct cache partitions. If `my_custom_provider()` provides its own cache partition, then `operation2` will see the cached results.
Users can control this by wrapping their provider in a `SharedCredentialsProvider`, which will claim its own cache partition.
let scoped_creds = SharedCredentialsProvider::new(my_custom_provider());
let config_override = c1
.config()
.to_builder()
.set_credentials_provider(Some(scoped_creds));
...
How to actually implement this RFC
In order to implement this RFC, implementations of `ResolveIdentity` need to be allowed to provide their own cache partition.
pub trait ResolveIdentity: Send + Sync + Debug {
...
/// Returns the identity cache partition associated with this identity resolver.
///
/// By default this returns `None` and cache partitioning is left up to `SharedIdentityResolver`.
/// If sharing instances of this type should use the same partition then you should override this
/// method and return a claimed partition.
fn cache_partition(&self) -> Option<IdentityCachePartition> {
None
}
}
Crucially, cache partitions must remain globally unique, so this method returns `IdentityCachePartition`, which is unique by construction. It doesn't matter whether partitions are claimed early by an implementation of `ResolveIdentity` or at the time they are wrapped in `SharedIdentityResolver`.
This is because `SdkConfig` stores instances of `SharedCredentialsProvider` (or `SharedTokenProvider`) rather than `SharedIdentityResolver`, which is what currently knows about cache partitioning. By allowing implementations of `ResolveIdentity` to provide their own partition, `SharedCredentialsProvider` can claim a partition at construction time and return it, which re-uses the same partition anywhere that the provider is shared.
#[derive(Clone, Debug)]
pub struct SharedCredentialsProvider(Arc<dyn ProvideCredentials>, IdentityCachePartition);
impl SharedCredentialsProvider {
pub fn new(provider: impl ProvideCredentials + 'static) -> Self {
Self(Arc::new(provider), IdentityCachePartition::new())
}
}
impl ResolveIdentity for SharedCredentialsProvider {
...
fn cache_partition(&self) -> Option<IdentityCachePartition> {
Some(self.1)
}
}
Additionally a new behavior version must be introduced that conditionally creates a default IdentityCache
on SdkConfig
if not explicitly configured (similar to how credentials provider works internally).
Alternatives Considered
`SdkConfig` internally stores `SharedCredentialsProvider`/`SharedTokenProvider`. Neither of these types knows anything about cache partitioning.
One alternative would be to create and store a SharedIdentityResolver
for each identity resolver type.
pub struct SdkConfig {
...
credentials_provider: Option<SharedCredentialsProvider>,
credentials_identity_provider: Option<SharedIdentityResolver>,
token_provider: Option<SharedTokenProvider>,
token_identity_provider: Option<SharedIdentityResolver>,
}
Setting one of the identity resolver types like credentials_provider
would also create and set the equivalent
SharedIdentityResolver
which would claim a cache partition. When generating the From<SdkConfig>
implementations
the identity resolver type would be favored.
There are a few downsides to this approach:
- `SdkConfig` would have to expose accessor methods for the equivalents (e.g. `credentials_identity_provider(&self) -> Option<&SharedIdentityResolver>`). This creates additional noise and confusion as well as the chance of using the wrong type.
- Every new identity type added to `SdkConfig` would have to be sure to use `SharedIdentityResolver`.
The advantage of the proposed approach of letting ResolveIdentity
implementations provide a cache partition means
SdkConfig
does not need to change. It also gives customers more control over whether an identity resolver implementation
shares a cache partition or not.
Changes checklist
- Add new `cache_partition()` method to `ResolveIdentity`
- Update `SharedIdentityResolver::new` to use the new `cache_partition()` method on the `resolver` to determine if a new cache partition should be created or not
- Claim a cache partition when `SharedCredentialsProvider` is created and override the new `ResolveIdentity` method
- Claim a cache partition when `SharedTokenProvider` is created and override the new `ResolveIdentity` method
- Introduce new behavior version
- Conditionally (gated on behavior version) create a new default `IdentityCache` on `SdkConfig` if not explicitly configured
- Add a new `no_identity_cache()` method to `ConfigLoader` that marks the identity cache as explicitly unset
RFC: Environment-defined service configuration
Status: RFC
Applies to: client
For a summarized list of proposed changes, see the Changes Checklist section.
In the AWS SDK for Rust today, customers are limited to setting global configuration variables in their environment; they cannot set service-specific variables. Other SDKs and the AWS CLI do allow for setting service-specific variables.
This RFC proposes an implementation that would enable users to set service-specific variables in their environment.
Terminology
- Global configuration: configuration which will be used for requests to any service. May be overridden by service-specific configuration.
- Service-specific configuration: configuration which will be used for requests only to a specific service.
- Configuration variable: A key-value pair that defines configuration, e.g. `key = value`, `key: value`, `KEY=VALUE`, etc.
  - Key and value as used in this RFC refer to each half of a configuration variable.
- Sub-properties: When parsing config variables from a profile file, sub-properties are a newline-delimited list of key-value pairs in an indented block following a `<service name>=\n` line. For an example, see the Profile File Configuration section of this RFC where sub-properties are declared for two different services.
The user experience if this RFC is implemented
While users can already set global configuration in their environment, this RFC proposes two new ways to set service-specific configuration in their environment.
Environment Variables
When defining service-specific configuration with an environment variable, all keys are formatted like so:
"AWS" + "_" + "<config key in CONST_CASE>" + "_" + "<service ID in CONST_CASE>"
As an example, setting an endpoint URL for different services would look like this:
export AWS_ENDPOINT_URL=http://localhost:4444
export AWS_ENDPOINT_URL_ELASTICBEANSTALK=http://localhost:5555
export AWS_ENDPOINT_URL_DYNAMODB=http://localhost:6666
The first variable sets a global endpoint URL. The second variable overrides the first variable, but only for the Elastic Beanstalk service. The third variable overrides the first variable, but only for the DynamoDB service.
Profile File Configuration
When defining service-specific configuration in a profile file, it looks like this:
[profile dev]
services = testing-s3-and-eb
endpoint_url = http://localhost:9000
[services testing-s3-and-eb]
s3 =
endpoint_url = http://localhost:4567
elasticbeanstalk =
endpoint_url = http://localhost:8000
When dev
is the active profile, all services will use the
http://localhost:9000
endpoint URL except where it is overridden. Because the
dev
profile references the testing-s3-and-eb
services, and because two
service-specific endpoint URLs are set, those URLs will override the
http://localhost:9000
endpoint URL when making requests to S3
(http://localhost:4567
) and Elastic Beanstalk (http://localhost:8000
).
Configuration Precedence
When configuration is set in multiple places, the value used is determined in this order of precedence:
highest precedence
- EXISTING Programmatic client configuration
- NEW Service-specific environment variables
- EXISTING Global environment variables
- NEW Service-specific profile file variables in the active profile
- EXISTING Global profile file variables in the active profile
lowest precedence
How to actually implement this RFC
This RFC may be implemented in several steps which are detailed below.
Sourcing service-specific config from the environment and profile
aws_config::profile::parser::ProfileSet
is responsible for storing the active
profile and all profile configuration data. Currently, it only tracks
sso_session
and profile
sections, so it must be updated to store arbitrary
sections, their properties, and sub-properties. These sections will be publicly
accessible via a new method ProfileSet::other_sections
which returns a ref to
a Properties
struct.
The Properties
struct is defined as follows:
type SectionKey = String;
type SectionName = String;
type PropertyName = String;
type SubPropertyName = String;
type PropertyValue = String;
/// A key to a property value.
///
/// ```txt
/// # An example AWS profile config section with properties and sub-properties
/// [section-key section-name]
/// property-name = property-value
/// property-name =
/// sub-property-name = property-value
/// ```
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct PropertiesKey {
section_key: SectionKey,
section_name: SectionName,
property_name: PropertyName,
sub_property_name: Option<SubPropertyName>,
}
impl PropertiesKey {
/// Create a new builder for a `PropertiesKey`.
pub fn builder() -> Builder {
Default::default()
}
}
// The builder code is omitted from this RFC. It allows users to set each field
// individually and then build a PropertiesKey
/// A map of [`PropertiesKey`]s to property values.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Properties {
inner: HashMap<PropertiesKey, PropertyValue>,
}
impl Properties {
/// Create a new empty [`Properties`].
pub fn new() -> Self {
Default::default()
}
#[cfg(test)]
pub(crate) fn new_from_slice(slice: &[(PropertiesKey, PropertyValue)]) -> Self {
let mut properties = Self::new();
for (key, value) in slice {
properties.insert(key.clone(), value.clone());
}
properties
}
/// Insert a new key/value pair into this map.
pub fn insert(&mut self, properties_key: PropertiesKey, value: PropertyValue) {
let _ = self
.inner
// If we don't clone then we don't get to log a useful warning for a value getting overwritten.
.entry(properties_key.clone())
.and_modify(|v| {
tracing::trace!("overwriting {properties_key}: was {v}, now {value}");
*v = value.clone();
})
.or_insert(value);
}
/// Given a [`PropertiesKey`], return the corresponding value, if any.
pub fn get(&self, properties_key: &PropertiesKey) -> Option<&PropertyValue> {
self.inner.get(properties_key)
}
}
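As a usage sketch of this map (the builder method names are assumed from the omitted builder code), storing the S3 `endpoint_url` sub-property from the earlier profile example might look like:

```rust
// Builder method names below are assumed; only `builder()`, `insert`, and `get`
// are defined earlier in this RFC.
let key = PropertiesKey::builder()
    .section_key("services")
    .section_name("testing-s3-and-eb")
    .property_name("s3")
    .sub_property_name("endpoint_url")
    .build()
    .expect("all fields set");

let mut properties = Properties::new();
properties.insert(key.clone(), "http://localhost:4567".to_string());
assert_eq!(
    properties.get(&key).map(String::as_str),
    Some("http://localhost:4567")
);
```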
The aws_config::env
module remains unchanged. It already provides all the
necessary functionality.
Exposing valid service configuration during <service>::Config
construction
Environment variables (from Env
) and profile variables (from
EnvConfigSections
) must be available during the conversion of SdkConfig
to
<service>::Config
. To accomplish this, we'll define a new trait
LoadServiceConfig
and implement it for EnvServiceConfig
which will be
stored in the SdkConfig
struct.
/// A struct used with the [`LoadServiceConfig`] trait to extract service config from the user's environment.
// [profile active-profile]
// services = dev
//
// [services dev]
// service-id =
// config-key = config-value
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ServiceConfigKey<'a> {
service_id: &'a str,
profile: &'a str,
env: &'a str,
}
impl<'a> ServiceConfigKey<'a> {
/// Create a new [`ServiceConfigKey`] builder struct.
pub fn builder() -> builder::Builder<'a> {
Default::default()
}
/// Get the service ID.
pub fn service_id(&self) -> &'a str {
self.service_id
}
/// Get the profile key.
pub fn profile(&self) -> &'a str {
self.profile
}
/// Get the environment key.
pub fn env(&self) -> &'a str {
self.env
}
}
/// Implementers of this trait can provide service config defined in a user's environment.
pub trait LoadServiceConfig: fmt::Debug + Send + Sync {
/// Given a [`ServiceConfigKey`], return the value associated with it.
fn load_config(&self, key: ServiceConfigKey<'_>) -> Option<String>;
}
#[derive(Debug)]
pub(crate) struct EnvServiceConfig {
pub(crate) env: Env,
pub(crate) env_config_sections: EnvConfigSections,
}
impl LoadServiceConfig for EnvServiceConfig {
fn load_config(&self, key: ServiceConfigKey<'_>) -> Option<String> {
let (value, _source) = EnvConfigValue::new()
.env(key.env())
.profile(key.profile())
.service_id(key.service_id())
.load(&self.env, Some(&self.env_config_sections))?;
Some(value.to_string())
}
}
Code generation
We require two things to check for when constructing the service config:
- The service's ID
- The service's supported configuration variables
We only have this information once we get to the service level. Because of that, we must use code generation to define:
- What config to look for in the environment
- How to validate that config
Codegen for configuration must be updated for all config variables that we want
to support. For an example, here's how we'd update the RegionDecorator
to check
for service-specific regions:
class RegionDecorator : ClientCodegenDecorator {
// ...
override fun extraSections(codegenContext: ClientCodegenContext): List<AdHocCustomization> {
return usesRegion(codegenContext).thenSingletonListOf {
adhocCustomization<SdkConfigSection.CopySdkConfigToClientConfig> { section ->
rust(
"""
${section.serviceConfigBuilder}.set_region(
${section.sdkConfig}
.service_config()
.and_then(|conf| {
conf.load_config(service_config_key($envKey, $profileKey))
.map(Region::new)
})
.or_else(|| ${section.sdkConfig}.region().cloned()),
);
""",
)
}
}
}
// ...
}
To construct the keys necessary to locate the service-specific configuration, we
generate a service_config_key
function for each service crate:
class ServiceEnvConfigDecorator : ClientCodegenDecorator {
override val name: String = "ServiceEnvConfigDecorator"
override val order: Byte = 10
override fun extras(
codegenContext: ClientCodegenContext,
rustCrate: RustCrate,
) {
val rc = codegenContext.runtimeConfig
val serviceId = codegenContext.serviceShape.sdkId().toSnakeCase().dq()
rustCrate.withModule(ClientRustModule.config) {
Attribute.AllowDeadCode.render(this)
rustTemplate(
"""
fn service_config_key<'a>(
env: &'a str,
profile: &'a str,
) -> aws_types::service_config::ServiceConfigKey<'a> {
#{ServiceConfigKey}::builder()
.service_id($serviceId)
.env(env)
.profile(profile)
.build()
.expect("all field sets explicitly, can't fail")
}
""",
"ServiceConfigKey" to AwsRuntimeType.awsTypes(rc).resolve("service_config::ServiceConfigKey"),
)
}
}
}
Changes checklist
- In `aws-types`:
  - Add new `service_config: Option<Arc<dyn LoadServiceConfig>>` field to `SdkConfig` and builder.
  - Add setters and getters for the new `service_config` field.
  - Add a new `service_config` module.
    - Add new `ServiceConfigKey` struct and builder.
    - Add new `LoadServiceConfig` trait.
- In `aws-config`:
  - Move profile parsing out of `aws-config` into `aws-runtime`.
  - Deprecate the `aws-config` reëxports and direct users to `aws-runtime`.
  - Add a new `EnvServiceConfig` struct and implement `LoadServiceConfig` for it.
  - Update `ConfigLoader` to set the `service_config` field in `SdkConfig`.
  - Update all default providers to use the new `EnvConfigValue::validate` method.
- In `aws-runtime`:
  - Rename all profile-related code moved from `aws-config` to `aws-runtime` so that it's easier to understand in light of the API changes we're making.
  - Add new `PropertiesKey` and `Properties` structs to store profile data.
- Add an integration test that ensures service-specific config has the expected precedence.
- Update codegen to generate a method to easily construct `ServiceConfigKey`s.
- Update codegen to generate code that loads service-specific config from the environment for a limited initial set of config variables:
  - Region
  - Endpoint URL
  - Endpoint-related "built-ins" like `use_arn_region` and `disable_multi_region_access_points`.
- Write a guide for users.
  - Explain to users how they can determine a service's ID.
Contributing
This is a collection of written resources for smithy-rs and SDK contributors.
Writing and debugging a low-level feature that relies on HTTP
Background
This article came about as a result of all the difficulties I encountered while developing the request checksums feature laid out in the internal-only Flexible Checksums spec (the feature is also highlighted in this public blog post.) I spent much more time developing the feature than I had anticipated. In this article, I'll talk about:
- How the SDK sends requests with a body
- How the SDK sends requests with a streaming body
- The various issues I encountered and how I addressed them
- Key takeaways for contributors developing similar low-level features
How the SDK sends requests with a body
All interactions between the SDK and a service are modeled as "operations". Operations contain:
- A base HTTP request (with a potentially streaming body)
- A typed property bag of configuration options
- A fully generic response handler
Users create operations piecemeal with a fluent builder. The options set in the builder are then used to create the inner HTTP request, becoming headers or triggering specific request-building functionality (In this case, calculating a checksum and attaching it either as a header or a trailer.)
Here's an example from the QLDB SDK of creating a body from inputs and inserting it into the request to be sent:
let body = aws_smithy_http::body::SdkBody::from(
crate::operation_ser::serialize_operation_crate_operation_send_command(&self)?,
);
if let Some(content_length) = body.content_length() {
request = aws_smithy_http::header::set_request_header_if_absent(
request,
http::header::CONTENT_LENGTH,
content_length,
);
}
let request = request.body(body).expect("should be valid request");
Almost all request body creation in the SDKs looks like that. Note how it automatically sets the `Content-Length` header whenever the size of the body is known; it'll be relevant later. The body is read into memory and can be inspected before the request is sent. This allows for things like calculating a checksum and then inserting it into the request as a header.
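For example, a simplified sketch of that checksum-as-header flow (this is not the SDK's actual implementation; it assumes the `sha2` and `base64` crates and an in-scope `http::Request` named `request`):

```rust
use base64::Engine as _;
use sha2::{Digest, Sha256};

// Checksum the in-memory body, base64-encode the digest, and attach it as a header.
let body_bytes = b"Hello world";
let digest = Sha256::digest(body_bytes);
let header_value = base64::engine::general_purpose::STANDARD.encode(digest);
request.headers_mut().insert(
    http::header::HeaderName::from_static("x-amz-checksum-sha256"),
    http::header::HeaderValue::from_str(&header_value).expect("base64 is a valid header value"),
);
```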
How the SDK sends requests with a streaming body
Often, sending a request with a streaming body looks much the same. However, it's not possible to read a streaming
body until you've sent the request. Any metadata that needs to be calculated by inspecting the body must be sent as
trailers. Additionally, some metadata, like Content-Length
, can't be sent as a trailer at all.
MDN maintains a helpful list of metadata that can only be sent as a header.
// When trailers are set, we must send an AWS-specific header that lists them named `x-amz-trailer`.
// For example, when sending a SHA256 checksum as a trailer,
// we have to send an `x-amz-trailer` header telling the service to watch out for it:
request
.headers_mut()
.insert(
http::header::HeaderName::from_static("x-amz-trailer"),
http::header::HeaderValue::from_static("x-amz-checksum-sha256"),
);
The issues I encountered while implementing checksums for streaming request bodies
Content-Encoding: aws-chunked
When sending a request body with trailers, we must use an AWS-specific content encoding called aws-chunked
. To encode
a request body for aws-chunked
requires us to know the length of each chunk we're going to send before we send it. We
have to prefix each chunk with its size in bytes, represented by one or more hexadecimal digits. To close the body, we
send a final chunk with a zero. For example, the body "Hello world" would look like this when encoded:
B\r\n
Hello world\r\n
0\r\n
When sending a request body encoded in this way, we need to set two length headers:
- `Content-Length` is the length of the entire request body, including the chunk size prefix and zero terminator. In the example above, this would be 19.
- `x-amz-decoded-content-length` is the length of the decoded request body. In the example above, this would be 11.
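To make the arithmetic concrete, here's a small sketch (not an SDK helper) that reproduces those two numbers for a body sent as a single chunk with no trailers:

```rust
// Encoded length = hex size prefix + CRLF + chunk bytes + CRLF + "0" terminator + CRLF.
fn aws_chunked_encoded_length(decoded_len: usize) -> usize {
    let prefix = format!("{decoded_len:X}\r\n"); // "B\r\n" for an 11-byte chunk
    prefix.len() + decoded_len + "\r\n".len() + "0\r\n".len()
}

assert_eq!(aws_chunked_encoded_length(11), 19); // Content-Length for "Hello world"
// x-amz-decoded-content-length is simply the decoded length: 11.
```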
NOTE: Content-Encoding
is distinct from Transfer-Encoding
. It's possible to
construct a request with both Content-Encoding: aws-chunked
AND Transfer-Encoding: chunked
, although we don't ever need
to do that for SDK requests.
S3 requires a Content-Length
unless you also set Transfer-Encoding: chunked
S3 does not require you to send a Content-Length
header if you set the Transfer-Encoding: chunked
header. That's
very helpful because it's not always possible to know the total length of a stream of bytes if that's what you're
constructing your request body from. However, when sending trailers, this part of the spec can be misleading.
- When sending a streaming request, we must send metadata like checksums as trailers
- To send a request body with trailers, we must set the
Content-Encoding: aws-chunked
header - When using
aws-chunked
encoding for a request body, we must set thex-amz-decoded-content-length
header with the pre-encoding length of the request body.
This means that we can't actually avoid having to know and specify the length of the request body when sending a request to S3. This turns out to not be much of a problem for common use of the SDKs because most streaming request bodies are constructed from files. In these cases we can ask the operating system for the file size before sending the request. So long as that size doesn't change during sending of the request, all is well. In any other case, the request will fail.
Adding trailers to a request changes the size of that request
Headers don't count towards the size of a request body, but trailers do. That means we need to take trailers (which aren't sent until after the body) into account when setting the `Content-Length` header (which is sent before the body). This means that, without setting `Transfer-Encoding: chunked`, the SDKs only support trailers of known length. In the case of checksums, we're lucky because they're always going to be the same size. We must also take into account the fact that checksum values are base64 encoded before being set (this lengthens them).
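The base64 lengthening is deterministic, so it's easy to account for. A quick check (not SDK code):

```rust
// With padding, base64 produces 4 characters for every 3 input bytes (rounded up).
// A SHA-256 digest is 32 raw bytes, so its base64 form is always 44 characters.
let base64_len = |raw_len: usize| 4 * raw_len.div_ceil(3);
assert_eq!(base64_len(32), 44);
```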
hyper
supports HTTP request trailers but isn't compatible with Content-Encoding: aws-chunked
This was a big source of confusion for me, and I only figured out what was happening with the help of @seanmonstar.
When using aws-chunked
encoding, the trailers have to be appended to the body as part of poll_data
instead of
relying on the poll_trailers
method. The working http_body::Body
implementation of an aws-chunked
encoded body
looked like this:
impl Body for AwsChunkedBody<Inner> {
type Data = Bytes;
type Error = aws_smithy_http::body::Error;
fn poll_data(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Result<Self::Data, Self::Error>>> {
let this = self.project();
if *this.already_wrote_trailers {
return Poll::Ready(None);
}
if *this.already_wrote_chunk_terminator {
return match this.inner.poll_trailers(cx) {
Poll::Ready(Ok(trailers)) => {
*this.already_wrote_trailers = true;
let total_length_of_trailers_in_bytes = this.options.trailer_lens.iter().sum();
Poll::Ready(Some(Ok(trailers_as_aws_chunked_bytes(
total_length_of_trailers_in_bytes,
trailers,
))))
}
Poll::Pending => Poll::Pending,
Poll::Ready(Err(e)) => Poll::Ready(Some(Err(e))),
};
};
match this.inner.poll_data(cx) {
Poll::Ready(Some(Ok(mut data))) => {
let bytes = if *this.already_wrote_chunk_size_prefix {
data.copy_to_bytes(data.len())
} else {
// A chunk must be prefixed by chunk size in hexadecimal
*this.already_wrote_chunk_size_prefix = true;
let total_chunk_size = this
.options
.chunk_length
.or(this.options.stream_length)
.unwrap_or_default();
prefix_with_total_chunk_size(data, total_chunk_size)
};
Poll::Ready(Some(Ok(bytes)))
}
Poll::Ready(None) => {
*this.already_wrote_chunk_terminator = true;
Poll::Ready(Some(Ok(Bytes::from("\r\n0\r\n"))))
}
Poll::Ready(Some(Err(e))) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
}
}
fn poll_trailers(
self: Pin<&mut Self>,
_cx: &mut Context<'_>,
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> {
// When using aws-chunked content encoding, trailers have to be appended to the body
Poll::Ready(Ok(None))
}
fn is_end_stream(&self) -> bool {
self.already_wrote_trailers
}
fn size_hint(&self) -> SizeHint {
SizeHint::with_exact(
self.encoded_length()
.expect("Requests made with aws-chunked encoding must have known size")
as u64,
)
}
}
"The stream is closing early, and I don't know why"
In my early implementation of http_body::Body
for an aws-chunked
encoded body, the body wasn't being completely read
out. The problem turned out to be that I was delegating to the is_end_stream
trait method of the inner body. Because
the innermost body had no knowledge of the trailers I needed to send, it was reporting that the stream had ended.
The fix was to instead rely on the outermost body's knowledge of its own state in order to determine if all data had
been read.
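To make the bug concrete, here is a minimal sketch of the shape of the problem using a stand-in trait rather than the real http_body::Body; the type and field names are illustrative, not the SDK's:
// Stand-in for http_body::Body, reduced to the one method that mattered here.
trait BodyLike {
    fn is_end_stream(&self) -> bool;
}

// The innermost body only knows about its own data; it has no idea that the
// wrapper still needs to write a chunk terminator and trailers.
struct InnerBody {
    data_exhausted: bool,
}

impl BodyLike for InnerBody {
    fn is_end_stream(&self) -> bool {
        self.data_exhausted
    }
}

struct AwsChunkedWrapper<B> {
    inner: B,
    already_wrote_trailers: bool,
}

impl<B: BodyLike> BodyLike for AwsChunkedWrapper<B> {
    fn is_end_stream(&self) -> bool {
        // Buggy version: `self.inner.is_end_stream()`. The inner body reports
        // "done" as soon as its data is exhausted, so hyper stops polling
        // before the trailers are ever written.
        // Fixed version: answer from the wrapper's own state.
        self.already_wrote_trailers
    }
}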
What helped me to understand the problems and their solutions
- Reaching out to others that had specific knowledge of a problem: Talking to a developer that had tackled this feature for another SDK was a big help. Special thanks is due to @jasdel and the Go v2 SDK team. Their implementation of an aws-chunked encoded body was the basis for my own implementation.
- Avoiding codegen: The process of updating codegen code and then running codegen for each new change you make is slow compared to running codegen once at the beginning of development and then just manually editing the generated SDK as necessary. I still needed to run ./gradlew :aws:sdk:relocateAwsRuntime :aws:sdk:relocateRuntime whenever I made changes to a runtime crate, but that was quick because it's just copying the files. Keep as much code out of codegen as possible. It's much easier to modify/debug Rust than it is to write a working codegen module that does the same thing. Whenever possible, write the codegen modules later, once the design has settled.
- Using the Display impl for errors: The Display impl for an error can often contain helpful info that might not be visible when printing with the Debug impl. Case in point was an error I was getting because of the is_end_stream issue. When Debug printed, the error looked like this: DispatchFailure(ConnectorError { err: hyper::Error(User(Body), hyper::Error(BodyWriteAborted)), kind: User }). That wasn't too helpful for me on its own. I looked into the hyper source code and found that the Display impl contained a helpful message, so I matched into the error and printed the hyper::Error with its Display impl: "user body write aborted: early end, expected 2 more bytes". This helped me understand that I wasn't encoding things correctly and was missing a CRLF. (A generic sketch of surfacing those Display messages appears after this list.)
- Echo Server: I first used netcat and then later a small echo server written in Rust to see the raw HTTP request being sent out by the SDK as I was working on it. The Rust SDK supports setting custom endpoints for requests. This is often used to send requests to something like LocalStack, but I used it to send requests to localhost instead:
#[tokio::test]
async fn test_checksum_on_streaming_request_against_s3() {
    let sdk_config = aws_config::from_env()
        .endpoint_resolver(Endpoint::immutable(
            "http://localhost:8080".parse().expect("valid URI"),
        ))
        .load()
        .await;
    let s3_client = aws_sdk_s3::Client::new(&sdk_config);
    let input_text = b"Hello world";
    let _res = s3_client
        .put_object()
        .bucket("some-real-bucket")
        .key("test.txt")
        .body(aws_sdk_s3::types::ByteStream::from_static(input_text))
        .checksum_algorithm(ChecksumAlgorithm::Sha256)
        .send()
        .await
        .unwrap();
}
The echo server was based off of an axum example and looked like this:
use axum::{
    body::{Body, Bytes},
    http::{request::Parts, Request, StatusCode},
    middleware::{self, Next},
    response::IntoResponse,
    routing::put,
    Router,
};
use std::net::SocketAddr;
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

#[tokio::main]
async fn main() {
    tracing_subscriber::registry()
        .with(tracing_subscriber::EnvFilter::new(
            std::env::var("RUST_LOG").unwrap_or_else(|_| "trace".into()),
        ))
        .with(tracing_subscriber::fmt::layer())
        .init();

    let app = Router::new()
        .route("/", put(|| async move { "200 OK" }))
        .layer(middleware::from_fn(print_request_response));

    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    tracing::debug!("listening on {}", addr);
    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

async fn print_request_response(
    req: Request<Body>,
    next: Next<Body>,
) -> Result<impl IntoResponse, (StatusCode, String)> {
    let (parts, body) = req.into_parts();
    print_parts(&parts).await;
    let bytes = buffer_and_print("request", body).await?;
    let req = Request::from_parts(parts, Body::from(bytes));
    let res = next.run(req).await;
    Ok(res)
}

async fn print_parts(parts: &Parts) {
    tracing::debug!("{:#?}", parts);
}

async fn buffer_and_print<B>(direction: &str, body: B) -> Result<Bytes, (StatusCode, String)>
where
    B: axum::body::HttpBody<Data = Bytes>,
    B::Error: std::fmt::Display,
{
    let bytes = match hyper::body::to_bytes(body).await {
        Ok(bytes) => bytes,
        Err(err) => {
            return Err((
                StatusCode::BAD_REQUEST,
                format!("failed to read {} body: {}", direction, err),
            ));
        }
    };
    if let Ok(body) = std::str::from_utf8(&bytes) {
        tracing::debug!("{} body = {:?}", direction, body);
    }
    Ok(bytes)
}
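As a follow-up to the Display tip above, a generic way to surface those messages is to walk an error's source chain and print each link with its Display impl. This is a sketch of that general technique, not the exact match statement used while debugging:
// Walk an error's source chain, printing each cause with Display. The Display
// output often carries details (like hyper's "early end, expected 2 more
// bytes") that the top-level Debug representation hides.
fn print_error_chain(err: &dyn std::error::Error) {
    eprintln!("error: {}", err);
    let mut source = err.source();
    while let Some(cause) = source {
        eprintln!("caused by: {}", cause);
        source = cause.source();
    }
}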