
Infrastructure as Code (IaC)

Infrastructure as Code (IaC) as in cloud infrastructure, not infrastructure code as in writing your own middleware library.

Is it a misnomer?

In most cases IaC refers to some kind of template-based solution.

Template in this article refers to AWS CloudFormation template

It is strongly debatable if templates are code. If anything, code inside templates is generally an anti-pattern.
Think of templates as an “intermediary language” rather than code.
Templates are executed by respective platform “engines” driving actual cloud infrastructure configuration management.
Consider it “code” insofar as it is an asset managed in version control together with the rest of the source code.

Templates

Templates will be sufficient for most infrastructure (IaC) jobs. Most templating languages will have constructs to express logic, for example conditions. Most will also allow inline code to be included as part of the configuration; nonetheless, that is code mixed into a template, for example an AWS Velocity template for DynamoDB.

Writing Infrastructure as Code templates this way – in YAML or JSON – has been reasonable, but not without concerns. A good article on the topic is In defense of YAML.

Templates for templates

The AWS SAM is a good example of simplifying CloudFormation templates, notice the Transform: 'AWS::Serverless-2016-10-31' elements in SAM. A single SAM element may transform into a list of CloudFormation elements deploying a series of resources as a result.
Another example is serverless.com: its templating language supports multiple platforms while also simplifying the templating compared to CloudFormation.

One of my concerns, however, is the mixing of Infrastructure as Code with Function as a Service definitions. For example, the definition of a function AWS::Serverless::Function sits in the same place as an API Gateway AWS::Serverless::Api and the related Usage Plan AWS::ApiGateway::UsagePlan.
I would like to keep application and infrastructure concerns separate.

My immediate approach was going to be to split the template into multiple files and use AWS::Include to bring them back together.
AWS::Include, however, does not work for all parts of the schema. Trying to use AWS::Include under Resources to include a set of functions simply does not work.

My next approach was going to be Nested Stacks. It is the recommended approach for large stacks anyway, so it seemed like a winner. It turns out Nested Stacks are great for reusable templates – see Use Nested Stacks to Create Reusable Templates and Support Role Specialization – not so much for decomposing an application (template).

Using actual code for infrastructure

The AWS API gives access to the platform and resources through a range of SDK-s (the Python SDK is called boto3). It is truly low-level access to resources, typically used for developing software on the platform.
Infrastructure could be managed using the SDK. There are many good automation examples on how AWS Lambda can react to events from the infrastructure and respond with changes to it.
Managing more than a few resources using the SDK is not feasible, considering the coordination it would require: dependencies, delay in resource setup.
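
For illustration, a Lambda handler reacting to an infrastructure event and changing a resource through the SDK might look like the following minimal, hypothetical sketch; the event shape assumes a CloudWatch Events EC2 state-change notification, and the tag values are made up.

import boto3

EC2_CLIENT = boto3.client('ec2')

def handler(event, context):
	# hypothetical example: tag an EC2 instance when its state changes
	# the event shape assumes a CloudWatch Events EC2 state-change notification
	instance_id = event['detail']['instance-id']
	EC2_CLIENT.create_tags(
		Resources=[instance_id],
		Tags=[{'Key': 'monitored', 'Value': 'true'}]
	)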

There is a better approach: code

The troposphere library allows for easier creation of the AWS CloudFormation JSON by writing Python code to describe the AWS resources.

The GitHub project has many good examples
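
To give a sense of it, here is a minimal sketch; the bucket resource name is made up for illustration.

from troposphere import Template
from troposphere.s3 import Bucket

t = Template()
t.add_resource(Bucket('UploadBucket'))  # hypothetical resource name
print(t.to_yaml())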

Using troposphere for Infrastructure as Code

The examples would lead you to believe that using troposphere is not much different from writing a CloudFormation or SAM template in Python. Depending on your use case, however, there are opportunities to explore:

  • Function as a Service (FaaS) implementations are heavily influenced by deployment. It would make sense if the deployment details were close – ideally right next – to the code.
  • In a similar fashion database – for example DynamoDB – resource definition could be close to the ORM or data access implementation.
  • Make infrastructure and deployment decisions in code. There are of course conditions in CloudFormation, and CloudFormation macros for more complex processing.

Having the infrastructure and deployment configuration close to the implementation code has its own pros and cons.

Development time requirements

Ideally troposphere would only be used at development time (not at runtime, if we can avoid it); therefore the deployment package would not include this library.

# Development requirements, not for Lambda runtime
# pip install -r requirements-dev.txt
awacs==0.9.0
troposphere==2.4.6

I use template.py in the root of the project, the same place where the template.yaml (or .json) would be, for producing the CloudFormation template.

I have followed two patterns for defining platform resources in the source code.

Class type resource definition

The resource definitions are part of the class implementation.

Note that the troposphere library is only imported at the time of collecting the resources, and it is only used when generating the CloudFormation template. None of this needs to run in the Lambda runtime.

# the application component
class ApplicationComponent(...):
	# class attributes and methods
	# ...
	@classmethod
	def cf_resources(cls):
		from troposphere import serverless
		# return an array of resources associated with the application component
		return [...]

# the template generator
from troposphere import Template
t = Template()
t.set_transform('AWS::Serverless-2016-10-31')
for r in ApplicationComponent.cf_resources():
	t.add_resource(r)
print(t.to_yaml())

Decorator type resource definition

I use decorators for Lambda function implementations. The decorator registers a function (on the function) returning the associated resources (array).

# the wrapper
def cf_function(**func_args):
	def wrap(f):
		def wrapped_f(*args, **kwargs):
			return f(*args, **kwargs)

		def cf_resources():
			from troposphere import serverless
			# use relevant arguments from func_args
			function = serverless.Function(...)  # include all necessary parameters
			# add any other resources
			return [function]

		# register the resource generator on the wrapped function
		wrapped_f.cf_resources = cf_resources
		return wrapped_f
	return wrap

# the lambda function definition
@cf_function(...)  # add any arguments
def lambda1(event, context):
	# implementation
	return {
		'statusCode': 200
	}
	
# the template generator
from troposphere import Template
t = Template()
t.set_transform('AWS::Serverless-2016-10-31')
for r in lambda1.cf_resources():
	t.add_resource(r)
print(t.to_yaml())

You could implement lambda handlers in classes by wrapping them as Ben Kehoe shows in his Gist (https://gist.github.com/benkehoe/efb75a793f11d071b36fed155f017c8f).

Conclusion

AWS recently released the alpha version of their Cloud Development Kit (CDK). I have not tried it yet, but the Python CDK looks super similar to troposphere. Of course, they both represent the same CloudFormation resource definition in Python.
troposphere or AWS CDK, they both bring a set of new use cases for managing cloud infrastructure and let us truly define Infrastructure as Code.
I will explore the use of troposphere on my next project.


AWS Lambda in Python with SAM

The AWS SAM and CloudFormation mix works well for my projects.
I have been working mainly with Python for building Lambda functions on AWS.
However, managing code from project to production has been less than trivial – see AWS Lambda Deployment Package in Python in the AWS documentation.

This article and the sample project on GitHub show how to

  • structure a Python project for Serverless functions,
  • deploy the app using SAM and CloudFormation.

A sample project demonstrating this approach is at GitHub:designfultech/python-aws

Building and deploying JavaScript functions using SAM was super simple – see Watcher project.
SAM’s original project structure for Python is less than ideal.

  • 3rd party libraries, dependencies, are expected to be in the project root.
  • All project assets in the root will be deployed.
  • I want to maintain a virtualenv for development and define different dependencies (requirements.txt) for the runtime.

Project structure

I use the Pycharm IDE for development. My projects have their own virtual environments, and I use the following project structure.

Where

  • source has all the deployable source code
    • ext 3rd party libraries, not a Python package
      • requirements.txt – use it to install dependencies
    • lib keeps my own shared libraries
      • __init__.py – it is a Python package
      • vendor.py library for vendoring, more on this later
    • functions all function(s) code
      • __init__.py – it is a Python package
    • I may add other packages like models for ORM
    • runtime_context.py more on this later
    • requirements.txt this one is kept empty for SAM build
  • dist is created by SAM build for the deployment
  • requirements.txt development-time dependencies for the project
  • template.yaml AWS serverless application template
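
Put together, the layout looks roughly like this (reconstructed from the list above):

.
├── source/
│   ├── ext/
│   │   └── requirements.txt      (runtime dependencies)
│   ├── lib/
│   │   ├── __init__.py
│   │   └── vendor.py
│   ├── functions/
│   │   └── __init__.py
│   ├── runtime_context.py
│   └── requirements.txt          (kept empty for SAM build)
├── dist/                         (created by SAM build)
├── requirements.txt              (development-time dependencies)
└── template.yaml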

Vendoring

Vendoring is a technique to incorporate 3rd party packages, dependencies into the project. It is a neat trick used in other languages, and this specific one is adopted from Google’s App Engine library.

  1. Create a directory in the root of the project for the 3rd party packages ext. Add ext to your Python path so dependencies resolve during development.
  2. Create and maintain the requirements.txt inside ext for the deployed runtime.
  3. Install the packages in the ext directory.
    pip install -t . -r requirements.txt
  4. How to make the code in the ext directory available to the runtime?
    This is where Google’s helper – google.appengine.ext.vendor – comes in. It adjusts the path for the Python runtime.
  5. Add the code to a project file, for example: /lib/vendor.py – a minimal sketch of such a helper follows after this list.
  6. My approach is then to create the runtime_context.py in the root of the project
    import os
    from lib import vendor
    vendor.add(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'ext'))
  7. Any Python application that needs access to the packages in the ext directory needs a single line of import.
    import runtime_context
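
For reference, a minimal sketch of such a vendoring helper, modelled on google.appengine.ext.vendor (names and error handling simplified):

# lib/vendor.py
import os
import site
import sys

def add(folder, index=1):
	"""Insert the given folder into sys.path so vendored packages resolve."""
	if not os.path.isdir(folder):
		raise ValueError('vendor folder %s does not exist' % folder)
	# site.addsitedir appends new entries; move them towards the front of sys.path
	original = list(sys.path)
	site.addsitedir(folder)
	added = [path for path in sys.path if path not in original]
	for path in reversed(added):
		sys.path.remove(path)
		sys.path.insert(index, path)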

Using the runtime_context.py

I use the runtime_context.py to setup vendoring for all functions.
I also place shared configurations and variables here, for example:

  • logging
  • environment variables

import logging
LOGGER = logging.getLogger()
LOGGER.setLevel(logging.DEBUG)

import os
GREETING = os.environ.get('GREETING', 'Hello World!')

SAM

The approach and project structure described here works for development, local test with SAM, and AWS runtime.

When deploying to AWS, run the SAM CLI from the root of the project, where the template file is. Use a distribution directory dist for building and packaging.

  • Install the 3rd party modules
cd source/ext
pip install -r requirements.txt
  • Build the function
sam build --template template.yaml --build-dir ./dist
  • Package the function
    [BUCKET_NAME] is the name for the bucket where the deployment artefacts are uploaded.
sam package --s3-bucket [BUCKET_NAME] --template-file dist/template.yaml\
  --output-template-file dist/packaged.yaml
  • Deploy the function
    [STACK_NAME] is the name of the stack for the deployment.
aws cloudformation deploy --template-file dist/packaged.yaml\
  --stack-name [STACK_NAME] --s3-bucket [BUCKET_NAME]

Remove deployed app

When done with the application, un-deploy it by removing the stack.

aws cloudformation delete-stack --stack-name PythonAppStack

Final notes

This is just one approach that works for me. There are probably numerous other ways to make coding Python functions easy and comfortable.



Building the Filestore Serverless app on AWS

The application

Filestore is a cloud file storage API backed by S3. The project includes a sample client based on FilePond.

User guide, source code and deployment instructions for AWS are available on GitHub:designfultech/filestore-aws

Planning

I wanted to use the latest Python – 3.7 – for coding.

Avoiding infrastructure code was a key principle for the project. I also wanted to avoid the use of 3rd party libraries if possible.

CloudFormation and SAM templates – in YAML – have worked very well in past projects, and did so for this one.
API definitions were written in Swagger (OpenAPI 2.0) – in YAML.

I wanted to use S3’s API directly for uploads and downloads.

Ideally the app would run behind a domain set up on Route 53; this setup is not included in the project for now.

Presigned URL-s with S3

The application builds upon a key platform capability: S3’s presigned URL-s.

Presigned URL-s allow anyone – unauthenticated users – to access S3 objects for a specified duration. In this case the application allows upload (PUT) and download (GET) of S3 objects.

Presigned URL-s also allow the client to use S3’s API directly. There is no need to go through Lambda for uploading or downloading files, which could incur high costs.
Since Lambda’s cost is time based, a large amount of data transferred over a slow connection would eat up a lot of compute time.
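
Generating the presigned URL-s with Boto 3 is a one-liner per operation – a minimal sketch, with hypothetical bucket and key names:

import boto3

S3_CLIENT = boto3.client('s3')

# upload (PUT) URL, valid for 15 minutes
upload_url = S3_CLIENT.generate_presigned_url(
	'put_object',
	Params={'Bucket': 'my-filestore-bucket', 'Key': 'file-id'},
	ExpiresIn=900
)

# download (GET) URL for the same object
download_url = S3_CLIENT.generate_presigned_url(
	'get_object',
	Params={'Bucket': 'my-filestore-bucket', 'Key': 'file-id'},
	ExpiresIn=900
)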

Getting started

Review Building the Watcher Serverless app on AWS for details on:

  • Approach to infrastructure as code and SAM’s intricacies
  • API Design and Swagger specifics
  • Cross-cutting concerns: securing API with a key

AWS Lambda in Python

Python has first class support on the platform. The AWS SDK is known as Boto 3 and it is bundled with the Lambda runtime environment, so there is no need to include it as a dependency. However, Lambda does not include the very latest version of the Boto 3 library (at the time of this writing).

UPDATE: SAM’s support for Python has a few gotchas when including 3rd party libraries. More about this in the article dedicated to AWS Lambda in Python.

Platform services

The diagram shows all services included in the app.

Aside from the S3 service and bucket, there isn’t anything new compared to the previous project, Building the Watcher Serverless app on AWS.

Working with DynamoDB

I was hoping to make use of an Object Mapper library that abstracts away the DynamoDB low-level API. There are a few good candidates out there such as PynamoDB and Flywheel.

After a short evaluation, I ended up coding up my own lightweight, albeit less generic abstraction, see /source/lib/ddb.py.
On a larger project with multiple entities I would definitely use one of the OM libraries.

I originally wanted to build the application to support multi-tenancy, but decided to leave that for another project where it would make more sense. However supporting some of the multi-tenant strategies (SaaS Solutions on AWS: Tenant Isolation Architectures) with these libraries is not trivial, or simply not possible – something to remember.

AWS S3 (Simple Storage Service)

The S3 API is significantly simpler than the DynamoDB API. It was simple enough to use directly from the functions without any abstraction.

Mime-type

When I started, I did not realise S3 uploads recognise the mime-type of the files. I was going to use python-magic for this (magic.from_buffer).
It turned out this is not necessary, as S3 objects have a ContentType attribute for it.

file_object = S3_CLIENT.get_object(Bucket=..., Key=...)
content_type = file_object.get('ContentType')

Download

The file uploads do not preserve the original file name when placed into S3; they have the same name as the DynamoDB key. When the presigned URL is generated for the download, the ResponseContentDisposition parameter can be used to set the file name for the download.
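
A hypothetical example of setting the download file name this way (bucket, key and file name are made up):

download_url = S3_CLIENT.generate_presigned_url(
	'get_object',
	Params={
		'Bucket': 'my-filestore-bucket',
		'Key': 'file-id',
		'ResponseContentDisposition': 'attachment; filename="original-name.pdf"'
	},
	ExpiresIn=900
)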

Events

AWS Lambda supports a range of S3 events as triggers. This, and all the other event sources, make the AWS platform and the Serverless model really powerful.
The application uses the s3:ObjectCreated event to update the DynamoDB item with properties of the s3 object (file) such as size and mime-type.
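
A sketch of such a handler – the table name, key schema and attribute names are hypothetical:

import boto3

S3_CLIENT = boto3.client('s3')
TABLE = boto3.resource('dynamodb').Table('files')  # hypothetical table name

def uploaded(event, context):
	# triggered by s3:ObjectCreated; one record per created object
	for record in event['Records']:
		bucket = record['s3']['bucket']['name']
		key = record['s3']['object']['key']
		size = record['s3']['object']['size']
		content_type = S3_CLIENT.head_object(Bucket=bucket, Key=key).get('ContentType')
		TABLE.update_item(
			Key={'id': key},
			UpdateExpression='SET #s = :s, mime_type = :m',
			ExpressionAttributeNames={'#s': 'size'},  # "size" is a DynamoDB reserved word
			ExpressionAttributeValues={':s': size, ':m': content_type}
		)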

CORS

Most likely the browser client is going to be on a different domain than the API and S3, therefore CORS settings are necessary to make this application work.

API

There are two steps for the API to work with CORS:

  1. Create an endpoint for the OPTIONS and return the necessary CORS headers. The API Gateway makes this very easy using a mock integration.
    This is configured in the Swagger API definition.
  2. Return the Access-Control-Allow-Origin header in the Lambda response.
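
For the second step, a minimal sketch of a Lambda response carrying the header:

import json

def handler(event, context):
	return {
		'statusCode': 200,
		'headers': {'Access-Control-Allow-Origin': '*'},  # or restrict to the client origin
		'body': json.dumps({'ok': True})
	}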

S3

CORS configuration for S3 resources has a specific place in the CloudFormation template: CorsConfiguration.

Conditional deployment

If the application is configured at the time of deployment to store uploads immediately – StoreOnLoad=True – then the FileExpireFunction function is not needed.

CloudFormation Condition facility allows control over what gets deployed, amongst other conditional behaviours. In this project, depending on the parameter value, the expire function may or may not get deployed.

Conditions:
  CreateExpirationScheduler: !Equals [ !Ref StoreOnLoad, False ]
...
Resources:
  FileExpireFunction:
    Type: AWS::Serverless::Function
    Condition: CreateExpirationScheduler
    Properties:
	...

Browser Client

I picked FilePond for the browser client. It offers a high degree of customisation, and comes with a good set of capabilities.

The server API interaction had to be customised to work with the Filestore and S3 API-s. I implemented the customisation in a wrapper library – static/web/uploader.js. It takes care of the uploading (process) and deleting (revert) of files.

The sample webpage and form static/web/index.html is built using jQuery and Bootstrap to keep it simple. The form has a single file upload managed by the FilePond widget. In this example there is no backend to pick up the data from the form.

See the README.md Web app client section for more details on how to deploy the sample web app client on S3 as a static site.

Testing

I have experimented with two types of tests for this project: unit and integration.

Unit test

Unit tests are fairly straightforward in Python. The interesting bit here is the stubbing of AWS services for the tests. botocore has a Stubber class that can stub practically any service in AWS.

There is one unit test implemented for the preprocess function, which shows how to stub the DynamoDB and S3 services.

import boto3
from botocore.stub import Stubber

ddb_stubber = Stubber(boto3.client('dynamodb'))
s3_stubber = Stubber(boto3.client('s3'))

See tests/unit/file.py for more detail on the specific test code.
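
A hypothetical continuation of the stub setup, queuing a canned DynamoDB response before calling the code under test (table and key names are made up):

ddb_stubber.add_response(
	'get_item',
	{'Item': {'id': {'S': 'abc'}}},                        # canned service response
	{'TableName': 'files', 'Key': {'id': {'S': 'abc'}}}    # expected request parameters
)
with ddb_stubber:
	pass  # call the function under test, which must use this stubbed client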

Integration test

I have found Tavern ideal for most of my integration testing scenarios.

REST API testing has first class support. Defining tests with multiple steps (stages) in YAML is easy.

There are 4 integration tests defined: upload, store, delete, download. These tick the boxes on the preprocess, store, delete, info and uploaded functions. However, they do not help with testing functions like expire, which is triggered by a scheduled event only.

Tavern can pick up and use environment variables in the scripts. See the README.md Test section for more details on how to setup and run the integration tests.

What about the non-REST API-s?

I am reluctant to add a REST endpoint to functions such as expire just for the sake of testability.

The aws CLI can invoke Lambda functions directly and so can Boto 3 via an API call – Lambda.Client.invoke. If there was a way to include non-REST Lambda function invocations in Tavern test cases, that would be ideal.
Tavern supports plugins for adding new protocols – it has REST and MQTT added already. I wonder if it is feasible to build a plugin to support Lambda invocations?
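
For reference, a direct invocation through Boto 3 takes only a few lines (the function name is hypothetical):

import json
import boto3

LAMBDA_CLIENT = boto3.client('lambda')

response = LAMBDA_CLIENT.invoke(
	FunctionName='filestore-expire',        # hypothetical function name
	InvocationType='RequestResponse',
	Payload=json.dumps({}).encode()
)
result = json.loads(response['Payload'].read())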

Final thoughts on the architecture

The serverless architecture worked very well for this app.
In the end the amount of code was relatively small considering the functionality the app provides.


Building the Watcher Serverless app on AWS

The application

Watcher is a simple productivity tool to check for changes on websites. Create individual watchers by setting the location (URL) for the page and path (XPath) for the content to monitor. The application then checks for any change in the content every 6 hours. It is an API first and API only application with 4 endpoints: create a watcher, list watchers, delete a watcher, test run a watcher.

Source code and deployment instructions for AWS are available on GitHub:designfultech/watcher-aws

Planning

I implemented the same application in Python running on Google App Engine, which gave a good baseline for comparison.

I was looking to build on a new architecture in a new environment. The application was going to be built on AWS in Node JS using as much of the AWS Serverless capabilities as possible.

AWS is in many ways the front-runner of Serverless, and their Lambda service is one of the leading Function as a Service (FaaS) engines.
Node.js is a frequent choice of language for FaaS implementations and I wanted to see how it would compare to Python.
I was keen to use as many of the readily available functionality and services as possible, and write as little infrastructure code as possible (not to be confused with infrastructure as code).

Getting started

Serverless is full of infrastructure configurations, also known as infrastructure as code. It was a decision from the beginning that everything goes into code, no (Web) console monkey business in the final result. Wiring up and configuring the services in code (YAML) has a bit of a learning curve, and comes with a few surprises – more about these later. Production ready code requires a good amount of DevOps skills, there is no way around it with this architecture.

Other than picking the right language and suitable libraries there is not much else for the main logic. As for the infrastructure code, there are far too many options available.

Approach to infrastructure as code

The choice was to go with SAM and drop into CloudFormation where necessary.
Ultimately all the solutions will have a straight line down to the platform SDK.
The main question with 3rd party solutions is: how well and how closely can they follow platform changes? Do they support the entire platform and all the attributes needed?

Operating a platform in actual code lacks abstraction and is too verbose.
Native platform scripts and templates offer the right level – close enough to the platform SDK for fine grain control, but well above in abstraction to be manageable. SAM as a bonus can raise the level of abstraction further to make life even easier. If there is a future for infrastructure as code it must be to do more with less code.

Other projects with different conditions may find the alternative solutions more suitable.

  • A quick prototype probably makes better use of a toolkit like Serverless Framework.
  • A large scale, multi-provider cloud environment could use a general purpose engine like Terraform.
  • An adaptive environment with lots of logic in provisioning resources would need a coding approach like Troposphere.

Designing the application

Given the service oriented nature of Serverless and FaaS, a service definition seems to be a good start. Swagger (OpenAPI) is a good specification and has good platform support on AWS and in general. Another reason for choosing SAM templates was the reasonable level of support for Swagger.

Swagger specifics

  • SAM does not handle multi-part Swagger files at the time of writing this article, which would be an issue for a larger project. One solution could be to use a pre-processor, something like Emrichen which looks very promising.
  • Unfortunately the platform specifics and infrastructure as code leaks into the API definition, there is no easy and clean way around it. Therefore the Swagger definition is peppered with x-amazon- tags in support of the AWS platform and SAM. Perhaps the right pre-processor could merge external, platform-specific definitions in an additional step during deployment.

Platform services

Here is a high-level diagram of the AWS platform services in use in the application. Many more relevant bits, which are not obvious at first, could go onto this diagram – more about these later.

API Gateway was a given for reaching the REST API-s in the application.
DynamoDB can easily meet the database needs. It is also easy to expose CRUD + Query functions via the API Gateway.
Lambda was the primary candidate for the two services doing most of the work for the application.
CloudWatch has a scheduler function that comes in handy for triggering the regular checks.
Fanning out the individual checks can be done using a queue mechanism, SQS does the job well.
Finally, notification e-mails go out via SES.

Cross-cutting concerns

This project was specifically avoiding getting into cross-cutting concerns such as application configuration, security, logging, monitoring. They will be explored in another project.

Application configuration uses environment variables set at deployment time.

Security uses simple API key authentication. The implementation uses IAM’s roles and policies to authorise access, but it does not follow any recognisable model or method at this time.

Logging and monitoring were not explored in any detail. The first impressions of the vanilla logging facility were poor. The lack of out-of-the-box aggregated logging is a big surprise.

Building the application

Database (as a Service)

DynamoDB is a versatile key-value store with built-in REST endpoints for manipulating the data. I was keen to make use of the built-in capabilities and see how far they stretch. While DynamoDB does the job for this project, it will be increasingly more laborious to use for even slightly more complex jobs.

My very first task was to hook up DynamoDB operations to the API Gateway and be able to create new items (records), list them, and delete them without involving Lambda. It is all configurable in Swagger with relative ease.

Observations

  • I could not return the newly created item in the response to the client. There is no support for ALL_NEW or UPDATED_NEW in ReturnValues for the putItem operation, so now the API returns an empty object with a 201 status.
  • Mapping templates (request and/or response) in Swagger look like code in the wrong place.
  • Mapping actually uses yet another template language – Velocity. An immediate limitation I hit was trying to add a timestamp to the created_at attribute. There is no way to insert an ISO 8601 UTC timestamp. The solution was to use $context.requestTime, which is CLF-formatted. The only other alternative would be epoch time.

AWS Lambda in JavaScript (Node)

There are two functions implemented on AWS Lambda:

  1. Checking a single item, web page, for any change.
  2. Periodically getting the list of watcher items and launching individual checks.

The code is fairly simple and self-explanatory for both functions, see comments in source code for more detail on GitHub.

Coding Lambda functions in Node/JavaScript was a breeze. Adopting the function*() and yield combo in the programming model makes it very easy to write sync-like code in this inherently async environment. Defining the functions in SAM is straightforward.

I managed to focus on coding the application features without much distraction from infrastructure code. The one task that did not work well was manipulating data in DynamoDB. AWS SDK is too low level, it would not be efficient to use on a larger, more complex project. Next time I would definitely look for a decent ORM library, an abstraction on top of the SDK, something like Dynamoose.

The AWS SDK is bundled with the Lambda runtime. It does not need to be a dependency in the package.json, it can be a development dependency only.

Application integration

Every once in a while – set to 6 hours by default – one of the Lambda functions retrieves the list of watcher items to check for any change. There are a few different strategies to overcome the restrictions of Lambda function timeout, fanning out the check function seemed an easy and cost effective way to go. Each item from the list is passed to an SQS queue in a message, then picked by the target function.
A few considerations:

  • Messages are processed in batches of 1 to 10 depending on the configuration. This use case needed to process only 1 message at a time.
  • If the check fails for any reason, the message is put back into the queue and retried after a short while. Set a maximum number of deliveries on the message to avoid countless failed function calls and associated costs.
    Another control mechanism, for other use cases, is to set the expiration on undelivered messages.
  • Worth defining a dead letter queue where failed messages are delivered after all tries are exhausted or timed out.
  • Queue resources are not available in SAM, so I had to use CloudFormation AWS::SQS::Queue definitions.

The wide range of event sources supported by Lambda is very powerful. Be aware that event sources can have very different looking event data and determining the source from either context or event is not trivial.

One of the Functions – run-watcher – can be invoked via API Gateway and via SQS message. These two sources have entirely different message schemas. A small code piece is responsible for detecting the source, based on event specific attributes and parsing the message accordingly. In an ideal world, the AWS infrastructure would provide explicit information about the source of the different events.
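
The detection boils down to branching on attributes that only one of the sources carries; here is an illustrative sketch in Python (the project code itself is in Node.js):

import json

def parse_event(event):
	# SQS deliveries arrive as Records with eventSource set to "aws:sqs"
	if 'Records' in event and event['Records'][0].get('eventSource') == 'aws:sqs':
		return json.loads(event['Records'][0]['body'])
	# API Gateway (proxy integration) requests carry an httpMethod attribute
	if 'httpMethod' in event:
		return json.loads(event.get('body') or '{}')
	raise ValueError('Unknown event source')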

SAM’s intricacies

The aforementioned scheduling of periodic checks was done using the AWS CloudWatch Timer service. This was easy enough to configure in SAM as an event on the Function.

SAM’s ability to create resources as necessary is a double-edged sword. While it significantly reduces the amount of infrastructure as code, it also introduces surprises for those who are not intimately familiar with all the platform-specific resources of their application.

The scheduler resource is a good example to investigate. In SAM the scheduler event was defined in 4 lines as

Timer:
  Type: Schedule
  Properties:
    Schedule: rate(6 hours)

The CloudFormation equivalent would be something like this

WatcherScanScheduler:
  Type: AWS::Events::Rule
  Properties:
    Description: Schedule regular scans of watchers
    Name: WatcherScanScheduler
    ScheduleExpression: rate(6 hours)
    State: ENABLED
    Targets:
      -
        Arn: !GetAtt WatcherScanFunction.Arn
        Id: WatcherScanFunction

The SAM version is ultimately expanded to the same CloudFormation definition by the tool. Notice that in SAM there is no place for some of the properties found in the explicit CloudFormation definition. This is why many of the resources generated by SAM behind the scenes will show up with seemingly random names. A good summary of generated resources by SAM is on https://awslabs.github.io.

This behaviour is not limited to SAM, many 3rd party tools that use an abstraction on top of CloudFormation have the same effect.

Resources depend on other resources. In most cases dependencies are satisfied with inserting a reference to the resource using their name or their ARN.
When creating the individual resources that are part of a stack, some of the references may not resolve because the pertinent resources do not exist yet at that point. The resolution to this problem is to explicitly state dependencies in the template. The usage plan (AWS::ApiGateway::UsagePlan) in the SAM template is a good example to analyse.

WatcherApiUsagePlan:
  Type: AWS::ApiGateway::UsagePlan
  DependsOn: WatcherApiStage
  Properties:
    ApiStages:
    - ApiId: !Ref WatcherApi
      Stage: !Ref EnvironmentParameter
    Description: Private Usage Plan
    Quota:
      Limit: 500
      Period: MONTH
    Throttle:
      BurstLimit: 2
      RateLimit: 1
    UsagePlanName: PrivateUse

Trying to deploy the stack without the explicitly stated dependency DependsOn: WatcherApiStage will fail because it cannot find the Stage, which is automagically generated by the SAM template together with the API (AWS::Serverless::Api) later.
This case is further complicated by SAM’s abstraction of API resources and hiding of the Stage resource definition. Where is the name WatcherApiStage coming from? It is generated by SAM as described in the SAM documentation.

The SAM CLI has a validation function to run on the SAM template, and the Swagger API Definition. The current validator has little value, failing to catch even simple schema violations. The true validation of the template happens when CloudFormation attempts to build the stack. In the future, infrastructure as code must have the same level of syntax and semantic check support as other programming languages.

Cross-over between infrastructure as code and application code

Occasionally the application code must share information with the infrastructure as code, for example the ARN of the queue the application sends messages to. In this case, SAM has an environment variable for the Function that picks up the queue ARN using a utility function (GetAtt). The environment variable is read during execution to get the queue ARN for sending messages. This works well in most cases, where infrastructure definitions are made in one place only and picked up during deployment, and the references do not change after deployment.

Setting the region for SES is a similar case, but warrants a short explanation. SES is only available in 3 specific regions, and the application must use one of those for sending e-mails even if it is deployed elsewhere. There are no SES resources (AWS::SES::…) defined in this project, yet the SES region is defined in SAM as an environment variable. It could have been a configuration in application code, but since SES is an infrastructure element, the region configuration is best placed in SAM.

Summary

The application works; it does what it is supposed to. There were some trade-offs made and there are areas for improvement. But most importantly, the architecture and the platform have huge potential.

Building the application was easy enough. Was it as easy as on Google App Engine? That is for the next time.


The Search for a New Stack

The Google App Engine Standard Environment and the ecosystem around it has provided a comfortable and solid software stack in the past.

There is a general move towards standalone services, as is the case with Tasks and Scheduler.

  • New ones were introduced (Firestore, Memorystore),
  • some of them were deprecated (mail, Channel API),
  • while the fate of a few is unknown (memcache, search).

There was a mix of changes on the (Standard) platform:

  • Version 2 runtimes, auto-scaling containers (gVisor)
  • Move to REST API-s, away from the proprietary Google API.

These are generally welcome changes, and while the platform is currently in a state of flux, some of these changes have removed the comfort and ease of building software for the platform.

The search begins

Writing infrastructure code has eaten into IT projects for a long time. It causes all sorts of project slippages, budget overruns, and enormous technical debt.

Developing your own server – web, messaging, socket, even database – or building a new web application framework, or client library are all “writing infrastructure code”.

Therefore it is imperative that coding focuses on business capabilities and value with the new stack.

It also means we can do away with the server infrastructure – servers, VMs, containers. While we are at it, it should remove the associated scalability and availability concerns too.

Costs should align with business value, and include effective use only, for example: computing time (no idle), messages sent, users registered, data stored, files transferred, jobs scheduled, and so on.

The stack has to sit on a platform with a rich ecosystem – a good range of core services for example:

  • database,
  • object storage,
  • messaging,
  • identity management.

Some of the best in class capabilities could live outside of the platform, for example:

  • payments,
  • bulk and transactional emails.

The stack must have the facilities to gain insight into its inner workings: support for variable levels of centralised logging from hosted services, and from a logging API in bespoke code.

Support for monitoring of platform and application services, alerting on specified issues and metrics.

There must be adequate tooling support for the entire application lifecycle, extended to

  • environments, including local development,
  • programming languages, including relevant tool-chains.

A worthy candidate

The emerging paradigm, Function as a Service (FaaS) together with Serverless, promises to meet my expectations. It would have to be part of a comprehensive cloud offering that delivers the rest of the stack.

FaaS/Serverless reinforces cloud native concepts, and that leaves a lot of questions open for this new stack.

There is still the Google App Engine Standard 2nd Generation with the rest of the services from the Google Cloud Platform. If it wasn’t for the challenges that prompted the search for a new stack, it would be one of the best options available. Perhaps it will be again some day. In the meantime, is there something better?

Is it going to deliver on the benefits? What are the trade-offs going to be? Let’s see.