We’ve been investigating AWS IoT as a platform for our upcoming IoT projects. A typical use case for us would be providing the backend for a consumer device. To achieve that with AWS we’re looking at using Cognito User Pools to provide user authentication, API Gateway with Lambda to provide functionality to the apps, and AWS IoT to communicate with the devices.
For consumer devices, we require that a user can own multiple devices and can choose to share access to them with other users. How user/device ownership is established must be very flexible, depending on the type of product (e.g. lightbulbs vs. smart thermostats).
We’d also like to use the Device Shadow and MQTT Queues directly from users' devices to minimise latency in registering a user’s actions, and to minimise system running costs.
The example projects for AWS IoT are very impressive, but miss out most of the detail that would be essential for us to achieve the above.
Things we tried:
Cognito Identity Pools (Federated Identities)
We took this approach after looking at the Temperature Monitor example in the AWS IoT Node SDK Examples. This example uses the Identity Pool's 'unauthenticated' role to give guest users access to IoT shadows and queues. Even if we switch to the authenticated role, the policy is still too loose and allows access to all devices. We need an explicit user-to-device relationship mapping. We thought about using device registry attributes to match up to policy rules, but decided against it as we’d rather model our user/device mapping in a database to contain the complexity. Next.
IoT Custom Authentication
We then investigated not using Cognito at all, instead providing our own externally created JSON Web Tokens with claims to prove access to devices, and building the policy in the authorizer function. This worked really well in our Node.js PoC. We then tried to apply this approach in the browser, but it wouldn’t work. The authorizer function requires the token to be passed as a header, but browsers do not support custom headers for websockets (the issue is tracked at github.com/aws/aws-iot-device-sdk-js/issues/169). Next.
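To illustrate the idea of building a policy from token claims, here is a minimal sketch rather than the code from the PoC. It assumes the JWT has already been verified and carries a hypothetical `devices` claim listing thing names; the region and account ID are placeholders. The response field names follow AWS's documented custom authorizer response as far as we can tell, so check them against the current documentation before relying on them.

```javascript
// Sketch only: turns already-verified JWT claims into a custom authorizer response.
// The `devices` claim name is an assumption made for this example.
function buildAuthorizerResponse(claims, { region, accountId }) {
  const things = Array.isArray(claims.devices) ? claims.devices : [];
  const arn = (type, path) => `arn:aws:iot:${region}:${accountId}:${type}/${path}`;
  const shadowTopics = things.map((t) => `$aws/things/${t}/shadow/*`);

  const policy = {
    Version: '2012-10-17',
    Statement: [
      // Connect is left open to any client ID here; a real policy would narrow it.
      { Effect: 'Allow', Action: ['iot:Connect'], Resource: [arn('client', '*')] },
      { Effect: 'Allow', Action: ['iot:Publish', 'iot:Receive'],
        Resource: shadowTopics.map((t) => arn('topic', t)) },
      { Effect: 'Allow', Action: ['iot:Subscribe'],
        Resource: shadowTopics.map((t) => arn('topicfilter', t)) },
    ],
  };

  const allowed = things.length > 0;
  return {
    isAuthenticated: allowed,
    // Non-alphanumeric characters are stripped from the principal ID.
    principalId: String(claims.sub).replace(/[^a-zA-Z0-9]/g, ''),
    disconnectAfterInSeconds: 3600,
    refreshAfterInSeconds: 300,
    policyDocuments: allowed ? [policy] : [],
  };
}
```

A token without any devices yields `isAuthenticated: false` and no policy, so the connection would be refused rather than granted an empty permission set.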
Cognito User Pool with STS AssumeRole
The steps we'd taken so far had shown that we needed strong, dynamic policy creation for user identities that could be natively authorized with the AWS Signature Version 4 signing process, as the signature can be passed in the query string (which lets us use websockets, as per the SDK examples).
We created a small API Gateway endpoint that requires a Cognito-authenticated user and then calls AWS STS AssumeRole with a dynamic policy, restricting that user to the IoT resources they should be able to access. The permissions granted are the intersection of a pre-defined role and a JSON policy document built in the function. Our pre-defined role has a very liberal policy allowing access to any IoT resource, while the dynamic policy lists the exact resources needed to control a single device. The function returns a set of temporary credentials that can then be passed into the Device SDK (as per the SDK examples).
Note: The JSON policy is limited to 2048 characters, so a user managing several devices at once may need multiple calls, each returning credentials for a different permission set. This isn’t a major problem for us at the moment. If AWS enhance the IoT Custom Authorizers to support websockets, that may become a viable alternative.
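As a rough sketch of what the function has to assemble (not the code from the example project), the snippet below builds a single-device policy document and the parameter object for an STS AssumeRole call. `RoleArn`, `RoleSessionName`, `Policy` and `DurationSeconds` are real AssumeRole parameters; the helper names, the `iot-` session prefix and all argument values are assumptions made for illustration.

```javascript
// Sketch only: builds the session policy for one device.
// Region, account ID, thing name and client ID are supplied by the caller.
function buildDevicePolicy({ region, accountId, thingName, clientId }) {
  const arn = (type, path) => `arn:aws:iot:${region}:${accountId}:${type}/${path}`;
  const shadow = `$aws/things/${thingName}/shadow/*`;
  return {
    Version: '2012-10-17',
    Statement: [
      { Effect: 'Allow', Action: ['iot:Connect'], Resource: [arn('client', clientId)] },
      { Effect: 'Allow', Action: ['iot:Publish', 'iot:Receive'], Resource: [arn('topic', shadow)] },
      { Effect: 'Allow', Action: ['iot:Subscribe'], Resource: [arn('topicfilter', shadow)] },
    ],
  };
}

// Builds the parameter object that would be passed to STS AssumeRole.
// The effective permissions are the intersection of the role's own policy
// and the session policy supplied in `Policy`.
function buildAssumeRoleParams({ roleArn, username, policy }) {
  return {
    RoleArn: roleArn,                    // the liberal, pre-defined role
    RoleSessionName: `iot-${username}`,  // hypothetical naming scheme; must satisfy STS naming rules
    Policy: JSON.stringify(policy),      // the dynamic, per-user restriction
    DurationSeconds: 900,                // shortest lifetime STS accepts
  };
}
```

The resulting object would be handed to the SDK's AssumeRole call inside the Lambda function, and the temporary credentials in the response returned to the caller.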
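One way to cope with the size limit is to split a user's devices into batches, each small enough to fit in one policy, and request credentials per batch. The sketch below assumes the limit applies to the serialised JSON string and covers only shadow topics (connect permissions are left out for brevity); the function names are hypothetical.

```javascript
const MAX_POLICY_CHARS = 2048;

// Builds a shadow-access policy covering every thing name in the list.
function policyFor(thingNames, { region, accountId }) {
  const prefix = `arn:aws:iot:${region}:${accountId}`;
  return {
    Version: '2012-10-17',
    Statement: [
      { Effect: 'Allow', Action: ['iot:Publish', 'iot:Receive'],
        Resource: thingNames.map((t) => `${prefix}:topic/$aws/things/${t}/shadow/*`) },
      { Effect: 'Allow', Action: ['iot:Subscribe'],
        Resource: thingNames.map((t) => `${prefix}:topicfilter/$aws/things/${t}/shadow/*`) },
    ],
  };
}

// Greedily packs thing names into as few policies as fit under the limit.
function batchPolicies(thingNames, account) {
  const size = (names) => JSON.stringify(policyFor(names, account)).length;
  const batches = [];
  let current = [];
  for (const name of thingNames) {
    if (size([name]) > MAX_POLICY_CHARS) {
      throw new Error(`Policy for ${name} alone exceeds ${MAX_POLICY_CHARS} characters`);
    }
    if (size([...current, name]) > MAX_POLICY_CHARS) {
      // Adding this thing would overflow, so close the current batch.
      batches.push(policyFor(current, account));
      current = [];
    }
    current.push(name);
  }
  if (current.length > 0) batches.push(policyFor(current, account));
  return batches;
}
```

Each returned policy would need its own AssumeRole call, so the client ends up holding one set of credentials per batch.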
We put together a project to show the steps taken: github.com/sauce-consultants/aws-iot-cognito-user-pool-example. It uses Terraform to set up a Cognito user pool and all the static policies we need, Claudia.js to create the API Gateway with a Lambda function, the AWS Node SDK to create a user, and then a test script to call the API and use the token to publish a message. We also updated the IoT SDK temperature monitor demo to prove the browser websocket authentication worked, but have not included that in the project.
Next steps:
- Find ways to perform automated tests against this policy. We need to know if a policy change gives users extra access. The Node.js AWS SDK prefers to treat credentials as a global static resource, which makes this more error-prone than we’d like. We fell foul of this more than once when manually testing policy changes. We are considering having canary devices and queues that shouldn’t be accessible, and a process that repeatedly attempts to access them as a normal user. This won’t catch everything but will find some big holes.
If you've got any thoughts or comments we'd love to hear them. Feel free to start an issue on the GitHub project page.
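The canary idea could look something like the sketch below. It is an assumption about how such a check might be structured, not something from the example project: `attemptAccess` is a hypothetical caller-supplied function that tries to reach one resource using the credentials it is given, resolving on success and rejecting when access is denied. Passing the credentials in explicitly avoids depending on the SDK's global configuration.

```javascript
// Sketch only: reports every canary resource that a normal user's
// credentials can unexpectedly reach.
async function findLeaks(attemptAccess, credentials, canaryResources) {
  const leaks = [];
  for (const resource of canaryResources) {
    try {
      await attemptAccess(credentials, resource);
      // Access succeeded, so the policy is looser than intended.
      leaks.push(resource);
    } catch (err) {
      // A rejection is the expected outcome for a canary. Note that network
      // failures also land here, so a real check should inspect the error
      // before treating it as a denial.
    }
  }
  return leaks;
}
```

Run on a schedule with a normal user's credentials, a non-empty result would signal that a policy change has opened up something it shouldn't have.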