Hiram Software Blog


Day One with the Amazon API Gateway

Update on July 22, 2015: This post is now out of date due to helpful and welcome updates from the Amazon API Gateway team within 10 days! See the next post to learn about the changes.

On Thursday, July 9, 2015, AWS launched "Amazon API Gateway", a product whose pitch is to remove the need to host APIs on virtual machines and instead compose them from AWS Lambda functions.

I love the vision and direction, and I was excited to give the product a try. Unlike the Lambda release, there was no "opt-in" form; one only had to wait for the AWS Console updates to propagate across regions.

tl;dr — Amazon API Gateway should be labeled as "preview" since it is blocking two basic scenarios:

  • Your API cannot see both the request body and the request metadata (headers, path params, and query params).
  • Your API cannot see which IAM role was authenticated with the request.

Once these two items are unblocked, I think I will be able to recommend trying out Amazon API Gateway. Read on to follow my experience prototyping the next version of the SheetsDB API.

Here's a summary diagram I'll cover throughout:

API Gateway Configured

Note: as of July 22, 2015, this diagram is out of date! See the next post for more info.

What is Amazon API Gateway (for AWS developers)?

The marketing website claims the Amazon API Gateway is:

a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. With a few clicks in the AWS Management Console, you can create an API that acts as a “front door” for applications to access data, business logic, or functionality from your back-end.

Yes, that is true, but here's how I would summarize it to fellow developers:

Amazon API Gateway transforms and routes HTTP Requests into AWS Lambda Events to trigger your Lambda code. It also supports sending these events to legacy systems, though it is less clear how it adds any value over ELB in this release.

Ok, so the primary value-add is getting AWS Lambda code to fire based on HTTP Requests. That's excellent. I had previously prototyped a poor-man's bookmarklet script that:

  • Made a GET request to a static website hosted on S3, something like throwaway.html?x-bookmark=http://...
  • Had Bucket Logging enabled
  • Had a Lambda event fire when new bucket logs were available
  • Parsed the logs (using regexes and not JSalParser) looking for x-bookmark
  • Appended the URL to my bookmarks.html file.
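
For the curious, the log-parsing step can be sketched in a few lines of Python. This is a minimal sketch rather than the original script: the sample log lines, the regex, and the function name extract_bookmarks are all assumptions made for illustration, and real bucket logs carry more fields than shown here.

```python
import re
from urllib.parse import unquote

# Matches the x-bookmark query parameter inside the quoted request line
# of a log entry. The log layout used below is an assumption.
BOOKMARK_RE = re.compile(r'"GET [^" ]*[?&]x-bookmark=([^&" ]+)')


def extract_bookmarks(log_text):
    """Return the decoded x-bookmark URLs found in a blob of log lines."""
    return [unquote(m.group(1)) for m in BOOKMARK_RE.finditer(log_text)]


# Two made-up log lines: one bookmark request and one ordinary request.
sample_log = (
    'bucket-owner my-bucket [09/Jul/2015:12:00:00 +0000] 203.0.113.7 - REQ1 '
    'REST.GET.OBJECT throwaway.html '
    '"GET /throwaway.html?x-bookmark=http%3A%2F%2Fexample.com%2Fpost HTTP/1.1" 200 -\n'
    'bucket-owner my-bucket [09/Jul/2015:12:01:00 +0000] 203.0.113.7 - REQ2 '
    'REST.GET.OBJECT index.html "GET /index.html HTTP/1.1" 200 -\n'
)

print(extract_bookmarks(sample_log))
```

Only the first line carries an x-bookmark parameter, so only that URL would be appended to the bookmarks file.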

Hacky? Yes! Fun to put together? Yes! Would I have loved an API endpoint to replace using S3? Yes, Yes, Yes!

Motivating example: SheetsDB API

Hiram Software runs a profitable lifestyle business, SheetsDB, on the side. The basic premise is that some organizations have two groups of people, technical and non-technical, who need to use the same human-sized dataset for products. Technical people prefer JSON, non-technical people prefer spreadsheets. How do you convert between them reliably and regularly? We think SheetsDB is a credible solution.

SheetsDB is implemented as an AngularJS Single Page Application with an API backend. Today there are four environments for the API backend: Development (localhost), Test (ec2 instance), Retail, and Enterprise.

The reality is that only enterprise customers are profitable for us (and we suspect for most SaaS products). We've been looking for ways to cut costs on retail for a while. Most retail customers kick the tires, and then a small percentage return to sign enterprise agreements. However, we have to pay to keep servers available for these retail customers even though their average usage is low. We just can't predict which future enterprise customers will kick the tires, or when.

The way most SaaS products tackle this is by getting monthly subscriptions, but we don't think that is a good fit for SheetsDB since the product is purely transactional. It's hard to convince a retail customer there is value in paying $X per month to perform $Z operations. Enterprise customers, in contrast, need $Z operations done securely, reliably, and with support.

One idea we consistently reject is merging retail and enterprise environments. A key reason for this is regulatory compliance and guaranteeing enterprise customer data is isolated and secured in ways that are not feasible for retail customers. For example, one (of many) defense-in-depth strategies is that only specific whitelisted IP addresses can resolve the enterprise API backend servers. Re-using this environment for retail customers would jeopardize our existing contracts.

With the release of Amazon API Gateway we have the opportunity to migrate the retail environment for cost savings. It may have higher variable costs compared with the dedicated servers we use today, but it will lower the total cost of the retail environment because of infrequent retail usage.

Prototyping the Post Spreadsheet endpoint

Why start with Post? The simple reason is that it's the riskiest operation. By prototyping high-risk stuff first, we can optimize for failing fast and limit total investment if there are blocking bugs. We're by no means the only people who do this or thought of it, but we recommend thinking in terms of risk when prototyping.

Today that request is done by the following:

POST /v1/a/:accountId/spreadsheet
{
    "spreadsheet-metadata": {
        //...
    },
    "spreadsheet-content": {
        //...
    }
}

(Authentication uses session cookies with server-side storage for the OAuth2 tokens)

Part 1: Prototype the API Endpoint

Following the helpful UI, to create the /v1/a/:accountId/spreadsheet endpoint we need to prototype the following:

  1. API route for the POST request
  2. The Lambda function to handle the request
  3. Authentication
  4. (Some security and validation stuff that isn't relevant)

I've used Lambda before, so I jump right in to configuring the API route. I do the following:

  1. Create an API
  2. Create a Resource at /v1/a/{accountId}/spreadsheet
    1. The curly braces represent a path parameter like :accountId in other places
  3. Create a method "POST"
  4. Map the post method to my favorite hello-world Lambda function.
    1. The hello world function returns the raw event and context variables in the HTTP response for debugging.
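
A debugging function of that kind could look roughly like the sketch below. It is written in Python purely for illustration and is not the function used in this prototype; the handler name, the FakeContext stand-in, and the choice of context attributes to echo are assumptions.

```python
import json


def handler(event, context):
    """Echo the incoming event and a few context attributes for debugging."""
    # Attributes are read defensively because this sketch does not assume
    # any particular runtime's context object.
    context_info = {
        name: getattr(context, name, None)
        for name in ("function_name", "aws_request_id")
    }
    return {"event": event, "context": context_info}


class FakeContext:
    """Stand-in context so the sketch can run outside Lambda."""
    function_name = "hello-world"
    aws_request_id = "local-test"


result = handler({"spreadsheet-metadata": {}}, FakeContext())
print(json.dumps(result, indent=2))
```

Returning both objects verbatim makes it easy to see exactly what the gateway hands to the function.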

When done, my screen looks like the following:

API Gateway Configured

So far, so good.

Part 2: Testing to figure out what is in event by default

What is in the event object by default? The request body. Ok, I can work with that:

Default event object

But how do I figure out the account ID? It's not in the context object. Neither is the IAM account that made the request. Ok, weird.

Part 3: Configuring Input Mapping

After some trolling of the AWS Discussion Forums, I discerned I would need to do some input mapping. Thank you to Stefano@AWS for your posts.

Basically, you need to implement a Velocity Template (VTL) to transform an undocumented input state to JSON that represents your event object. How to do this?

Well, you need to figure out how to use a magic variable called $input that has a function-like operation params(). The function params() returns a named parameter from the API endpoint. So to get the accountId parameter, you use $input.params('accountId'):

Mapping parameter to the input

An important note is that $input doesn't return JSON, but the resulting document has to be well-formed JSON. You need to put the call to $input inside quotes. I'm also not a fan of having to (re-)learn VTL, but I appreciate AWS for "buying" a standard. There are caveats about VTL: Macros don't work, for example.
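
To see why the quotes matter, here is a toy Python stand-in for the substitution step. It is only a sketch: the render function is hypothetical and does plain text replacement, which is an assumption about the behavior and not the gateway's actual template engine, and the accountId value is made up.

```python
import json
import re


def render(template, params):
    """Toy stand-in for the gateway: substitute $input.params('name') calls."""
    return re.sub(
        r"\$input\.params\('([^']+)'\)",
        lambda m: str(params[m.group(1)]),
        template,
    )


params = {"accountId": "abc123"}  # hypothetical path parameter value

quoted = """{"accountId": "$input.params('accountId')"}"""
unquoted = """{"accountId": $input.params('accountId')}"""

# With quotes, the rendered document is well-formed JSON.
print(json.loads(render(quoted, params)))

# Without quotes, the raw value lands in the document and parsing fails.
try:
    json.loads(render(unquoted, params))
except json.JSONDecodeError as err:
    print("unquoted template is not valid JSON:", err)
```

The quoted template renders to a document with a string value, while the unquoted one renders a bare token that no JSON parser will accept.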

Going back to test it:

Testing my mapping

Where did my request body go? Amazon API Gateway discarded the request body when using the mapping.

On one level that makes sense since I mapped the input of the HTTP request to the output of the event object. On another level that is frustrating because that means I have to go explicitly map everything for every request. My Lambda function will have to do the mapping again as I will need to do input validation. This input mapping feels like it violates DRY.

Part 4: Adding back in the request body

Ok, so I need to use the $input variable to get access to the input. There is another method $input.path() that provides information about the request body. That's obvious, right? path() gets the body, not the request path.

Trying again:

Mapping parameter to the input

And testing (full-disclosure — there was a lot of back and forth testing where I needed quotes. You need them.):

Mapping parameter to the input

Great. Now the request is mangled in a format that is NOT JSON. This is stupid. How do I get JSON?

I'll save you some time, you can't.

You have to specify each part of your request body and select using an undocumented selector syntax. The dollar sign ($) is root, but the selector syntax behaves like it's getting passed to eval():

$input.path('$') => {metadata={something=true}}
$input.path('metadata.something') => true

But what's stupid is the output isn't JSON. The whole point of the API is accepting JSON requests, but the mapping output can't get JSON? There is still nothing useful on the context object.

If you're curious, it looks like path is a nested HashMap, and the output is some JVM language's default way of formatting hashes. Great for reading in debug logs, but since it isn't escaped, it is useless to consume here.
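
A small Python sketch shows why unescaped output of this kind cannot be consumed reliably. The function java_map_to_string is a hypothetical imitation of the {key=value, ...} style seen above, written on the assumption that the output really is a default map rendering; it is not code from the gateway.

```python
def java_map_to_string(value):
    """Mimic the {key=value, ...} style of a default JVM map rendering."""
    if isinstance(value, dict):
        items = ", ".join(
            f"{key}={java_map_to_string(val)}" for key, val in value.items()
        )
        return "{" + items + "}"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


# Reproduces the shape of the output shown above.
print(java_map_to_string({"metadata": {"something": True}}))

# Two different request bodies collapse to the same text, so the
# original structure cannot be recovered from the rendered string.
one_key = {"note": "a, b=c"}
two_keys = {"note": "a", "b": "c"}
print(java_map_to_string(one_key) == java_map_to_string(two_keys))
```

Because keys, values, and separators are not quoted or escaped, a value containing a comma or an equals sign is indistinguishable from extra keys.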

I also tried getting the root document:

Mapping parameter to the input

Testing it:

Mapping parameter to the input

That looks like the default implementation of a Java object's toString(). No obvious methods to get JSON.

The obvious search results pointed to other frustrated people using VTL, and the best solution involved macros, which aren't supported here.

I got blocked. I won't be able to migrate the SheetsDB API today. Sad face.

A kind note about testability

A great feature is how easy it is to go back and forth between updating the input mapping and testing it. This allowed quick iteration and exploration of my API; without it, I would have stopped using the API Gateway almost immediately.

I also generally like the flow and feel of the configuration experience. I'd rather configure using code, but this was a competent alternative.

Part 5: Realizing the input mapping is broken: A rant

This is the part of the post where I rant and conclude that the API Gateway isn't ready for release. You can't actually configure a POST request to pass along the request body and request parameters to your Lambda event object.

My working hypothesis is that, for political reasons, senior management moved up the release to the AWS event last week. The team wasn't ready, but they got something out the door. Maybe there is news Azure is coming out with a similar product? Plus, the team didn't want to release with an opt-in like they did with Lambda.

Put together you have a great promise with zero practical usability. The product team should be proud about their direction, but management should question their judgment around releasing when they did. AWS has a reputation, unlike Microsoft's, that in 3 months these issues will be ironed out. Management should consider sticking a "beta" or "preview" label on the product for the time being. I will be back in 3 months.

A second part of this rant is about the direction of the input mapping. There is a requirement to couple API methods to JSON schema. I support schema validation as an optional requirement, but not as an imposed requirement. When I see this decision made in software products, my hypothesis is a high-titled "architect" came up with theoretical and academic arguments that no one had the energy or reputation to fight.

Schema validation is great when there are firm contracts in place between independent systems. It works well in places where you would likely write integration tests. This answer on StackOverflow gives some great reasoning. However, during development and prototyping, schema validation (like input validation) is unnecessary overhead. You may also have a downstream (or upstream if you think in nginx) system that will do both for you, and you just need the API Gateway to do IAM auth. Schemas were one of the three key reasons I observed that drove people to migrate to JSON from XML (the others being native "compatibility" in the browser and aesthetics). Schema validation is not one-size-fits-all, and it should be removed as a requirement. Amazon risks pushing people to use something else to work around the overhead.

If there are architects on AWS who would like to talk about this more, I'm happy to sit down for coffee. (The email is hello[at]hiramsoftware.com).

Part 6: Blocked on authentication

Even assuming I could refactor the API calls to bundle everything into the body of the request (and bypass all the benefits of putting state in path parameters, query parameters, etc), there still is no way to authenticate.

The core problem is that the API Gateway uses one IAM policy to grant access to invoke the API, and then an internal permission is used to invoke the Lambda function separately. Lambda still runs using its own separate IAM Role, and Lambda does not get told what IAM instance (user, role, group) invoked the API. I first expected Lambda to be executed in the context of the IAM Role that invoked the API. Failing that, I expected the IAM instance to be on the context object. The API Gateway does neither. This is a major oversight, and for me corroborates the hypothesis the product is not ready.

What is the impact? Say you have an API that allows users to update spreadsheets. You only want users to be able to update their own spreadsheets. Reasonable? Could you ship without it? So, not only do you have to authenticate users, but you also have to authorize them. IAM provides both features (authentication in the form of the signatures, authorization in the form of policies). Since the IAM policy used to invoke the API is not used in executing the code, an implicit elevation of privilege is possible. That's why it would be ideal if the same role were used both to invoke the API and execute the Lambda code.

Normally you would hack around this by writing some silly checks (if IAM user == 123), but since the Lambda code has no idea what IAM instance got authenticated, the poor-man's authorization is impossible. If you wanted to use session cookies, you can't because you can't get both the request body and headers in the same event. If you wanted to bundle the session inside the request body, you have no way of corroborating the user input beyond adding HMACs or other overly-complex validation for something a good API server would just handle.
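
To give a sense of what the HMAC workaround would involve, here is a minimal Python sketch using the standard hmac module. The secret, the token layout (session id, a dot, then a hex signature), and the function names are all assumptions for illustration; a real deployment would also need key management, expiry, and replay protection.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical shared secret


def sign_session(session_id):
    """Return 'session_id.signature' so the server can detect tampering."""
    digest = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{digest}"


def verify_session(token):
    """Return the session id if the signature checks out, else None."""
    session_id, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return session_id
    return None


token = sign_session("user-42-session")
print(verify_session(token))

# Reusing a valid signature with a different session id is rejected.
forged = "user-99-session." + token.rpartition(".")[2]
print(verify_session(forged))
```

Every endpoint would have to carry and check a token like this inside the request body, which is exactly the kind of plumbing an API server normally hides.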

Authentication does not work in a way I would expect it to work for authenticated APIs.

Conclusion

The Amazon API Gateway has great promise, and it is a great start toward routing HTTP requests to Lambda events.

However, the Amazon API Gateway is not yet ready to handle requests where your code wants to read from both the metadata (headers, path parameters, and query parameters) and the request body.

It also is not yet ready to handle authentication based on IAM, despite what the advertising says. As a reminder, here is how the API Gateway really works:

API Gateway Configured

Note: as of July 22, 2015, this diagram is out of date! See the next post for more info.

The whole point is to map HTTP requests to Lambda events. Different information gets discarded, some one-size-fits-all decisions are imposed during input mapping, and your Lambda code is executed in a different IAM instance than the user or role who invoked the API.

If you are from the API Gateway team, here are some friendly suggestions:

  • Thank you for the testing flow. It's very useful for prototyping. You should be proud of this experience.
  • Thank you for surfacing the Delete API button. Originally it was buried in a very-hard-to-find screen, and after one day it was moved to a more obvious spot.
  • Make it so Lambda has access to the request body AND request parameters in the event object without much configuration.
    • Why not follow express or some other popular NodeJS model?
    • If I were on the team, I would have done something closer to this:
"event" : {
    "body": {
        // complex JSON body here (if content-type matches application/json, yada, yada, yada)
    },
    "httprequest": {
        "method" : "POST",
        "path": "/full/abs/path",
        "endpoint" : "http://useful-so-I-can-compose-redirects...",
        "clientIP": "every-good-API-has-IP-throttling"
    },
    "cookies": {
        "chocolate-chip": "everyone-reads-cookies, so please parse them..."
    }
}
  • Remove the requirement for schema validation. It's myopic as a hard requirement.
  • Fix the mismatch between the IAM policy that can invoke the API and the IAM role that invokes Lambda. Ideally it would be the same role and policy.
  • Add in the IAM information to the context or make it so the Lambda function uses the IAM role that invoked the API (this is ideal!).
  • Add in the environment to the context.
  • Add in method and path to the context.
  • Step back and revisit the $input magic variable. $input.path() -> request body is not obvious and is contradictory.
  • Have a way to format the return result from $input.path() to JSON.
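
As an illustration of the event shape suggested in the list above, here is a sketch of a handler that could consume it. Everything in it is hypothetical: the event layout is only the proposal from this post, and the function name, the "session" cookie name, and the returned status dictionaries are assumptions, not anything the API Gateway provides.

```python
def handle_post_spreadsheet(event):
    """Route on request metadata and read the body from one event object."""
    request = event["httprequest"]
    if request["method"] != "POST":
        return {"status": 405}
    session = event["cookies"].get("session")  # hypothetical cookie name
    if session is None:
        return {"status": 401}
    metadata = event["body"].get("spreadsheet-metadata", {})
    return {"status": 200, "path": request["path"], "metadata": metadata}


# Made-up event following the proposed layout.
example_event = {
    "body": {"spreadsheet-metadata": {"title": "Budget"}, "spreadsheet-content": {}},
    "httprequest": {
        "method": "POST",
        "path": "/v1/a/abc123/spreadsheet",
        "endpoint": "http://example.com",
        "clientIP": "203.0.113.7",
    },
    "cookies": {"session": "opaque-session-id"},
}

print(handle_post_spreadsheet(example_event))
```

With the body, request metadata, and cookies in one object, the handler needs no per-endpoint mapping to do its job.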
