2.4 Function Orchestration
Sometimes a task is more complex than can be easily implemented with a single function. In this lesson I will show you how to use AWS Step Functions to create a state machine which can tie together a sequence of functions to compute complex tasks in the background.
Hi, and welcome back to Introduction to Serverless. In this lesson, I'm going to show you how you can combine functions with AWS Step Functions. As far as I know, this functionality isn't available on Google Cloud. On the Microsoft Azure platform, it's called Logic Apps.

When you do function orchestration this way, you are creating a state machine, or workflow graph. There are defined start and end points, and in between, you can create a flow. In our scenario, the flow looks like this. First, we check whether the audio file already exists. If not, we determine the voice we want to use, and then use the function from the last lesson to process and upload the audio file. Finally, we serve it from S3.

Before we create our custom state machine, I want to have all the functions ready, because you can only copy state machines, not edit them. Let's go to the Lambda section of the AWS console and create a new function from scratch. I'm calling it check-audio-cache. It's so simple that we can edit it inline. First, we need the AWS SDK and an S3 object. In the handler, we extract the file name, text, and language from the event parameters. If there is no language, we fall back to US English. Then we try to get an object from S3. It's in the same bucket we used before, with the file name plus the extension mp3 as the key. If there is an error, then the object wasn't found, so I return isCached false and pass on all the other parameters. If there was no error, I return isCached true and a key property made up of the file name and the extension. I'm using the same role as in the last lesson to keep it easy. All the other defaults are fine.

The next function is detect-voice. Here, we use the same options. In the editor, let's also import the SDK and create a Polly object. I'm going to extract all three parameters here as well, to pass them on later. Polly has a function called describeVoices.
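To make the two handlers a bit more concrete, here is a minimal Python sketch of check-audio-cache and detect-voice. It is an illustration under assumptions, not the code from the lesson: the runtime used in the lesson isn't stated, the bucket name is hypothetical, and the event keys (filename, text, language) and the voice output key are guesses based on the description. The fallback to Joanna and to the first voice in the list follows what the lesson goes on to describe. The AWS clients are passed in as arguments so the logic can be tried without an AWS account.

```python
# Sketch of the check-audio-cache and detect-voice logic.
# BUCKET is a hypothetical name; the lesson only says "the same bucket as before".
BUCKET = "my-audio-bucket"
DEFAULT_LANGUAGE = "en-US"
DEFAULT_VOICE = "Joanna"


def check_audio_cache(event, s3, bucket=BUCKET):
    """Report whether <filename>.mp3 already exists in the bucket."""
    filename = event["filename"]
    text = event["text"]
    language = event.get("language") or DEFAULT_LANGUAGE
    key = f"{filename}.mp3"
    try:
        s3.get_object(Bucket=bucket, Key=key)
    except Exception:
        # As in the lesson, any error is treated as "object not found".
        return {"isCached": False, "filename": filename,
                "text": text, "language": language}
    return {"isCached": True, "key": key}


def detect_voice(event, polly):
    """Pick the first Polly voice for the language, or fall back to Joanna."""
    filename = event["filename"]
    text = event["text"]
    language = event.get("language") or DEFAULT_LANGUAGE
    try:
        voices = polly.describe_voices(LanguageCode=language)["Voices"]
        voice = voices[0]["Id"] if voices else DEFAULT_VOICE
    except Exception:
        voice = DEFAULT_VOICE
    return {"voice": voice, "filename": filename, "text": text}


def check_audio_cache_handler(event, context):
    """Lambda entry point; boto3 is imported lazily so the logic above runs without it."""
    import boto3
    return check_audio_cache(event, boto3.client("s3"))


def detect_voice_handler(event, context):
    """Lambda entry point for detect-voice."""
    import boto3
    return detect_voice(event, boto3.client("polly"))
```

Keeping the logic separate from the Lambda entry points is a design choice of this sketch, not something the lesson does. It makes it easy to call the functions with stand-in clients.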
describeVoices returns a list of available voices for a language. You just have to pass in a language code. If there was an error or there is no voice, I'm defaulting to Joanna. Otherwise, I'm using the first voice in the list, passing on the file name and text in both cases. Again, I'm using the role from the last lesson.

Our final function is serve-audio-file. It's super easy: I'm just returning a hash with a location key that holds the S3 URL.

Now that all the functions are set up, we can create a workflow. To access Step Functions, go to Services > Application Services > Step Functions. There are different blueprints available, like a simple Hello World that only has one step. A more complex one is the job poller blueprint, which submits a job to AWS Batch and then waits until the job has completed, either successfully or with a failure. To define your state machine, you have to use the Amazon States Language, which uses the JSON format. Unfortunately, there is no visual editor yet; you can only view the state machine.

Let's create our own custom state machine for processing audio files. I'm going to call it text-to-speech. First, I'm going to define all the states. The first one is the LookupAudioFile state. It's of type Task, and the resource is a Lambda function ARN. Conveniently, Amazon suggests all the functions in the region. Then I have to say which state comes next. It's FileIsCached. This state is of type Choice. It has only one choice, which is determined by querying the isCached parameter that was passed in. If it's true, we go to ServeAudioFile. For any other case, we can use a Default, which goes to DetermineVoice. It's a Task state again, pointing to the detect-voice Lambda function. Next in line is ProcessAndUploadAudio. It is followed by ServeAudioFile, which we have already used in the Choice state. This state is also special because it gets an End flag, which indicates that the state machine has reached its end once the state has completed its task.
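The states described above can be written down roughly as follows. This is a sketch, not the definition from the lesson: it builds the Amazon States Language document as a Python dictionary and prints it as JSON. The ARNs are placeholders, and process-audio is a hypothetical name for the function from the last lesson. StartAt is included because a definition needs an initial state. The small walk helper is only an illustration of how the single Choice rule routes an execution; it is not part of Step Functions.

```python
import json


def lambda_arn(name):
    # Placeholder ARN: REGION and ACCOUNT_ID are not given in the lesson.
    return f"arn:aws:lambda:REGION:ACCOUNT_ID:function:{name}"


DEFINITION = {
    "StartAt": "LookupAudioFile",
    "States": {
        "LookupAudioFile": {
            "Type": "Task",
            "Resource": lambda_arn("check-audio-cache"),
            "Next": "FileIsCached",
        },
        "FileIsCached": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.isCached", "BooleanEquals": True,
                 "Next": "ServeAudioFile"},
            ],
            "Default": "DetermineVoice",
        },
        "DetermineVoice": {
            "Type": "Task",
            "Resource": lambda_arn("detect-voice"),
            "Next": "ProcessAndUploadAudio",
        },
        "ProcessAndUploadAudio": {
            "Type": "Task",
            # Hypothetical function name for the function from the last lesson.
            "Resource": lambda_arn("process-audio"),
            "Next": "ServeAudioFile",
        },
        "ServeAudioFile": {
            "Type": "Task",
            "Resource": lambda_arn("serve-audio-file"),
            "End": True,
        },
    },
}


def walk(definition, is_cached):
    """Follow the transitions for a given isCached value; return the visited states."""
    data = {"isCached": is_cached}
    name = definition["StartAt"]
    visited = []
    while True:
        visited.append(name)
        state = definition["States"][name]
        if state["Type"] == "Choice":
            name = state["Default"]
            for rule in state["Choices"]:
                field = rule["Variable"].split(".", 1)[1]  # "$.isCached" -> "isCached"
                if data.get(field) == rule["BooleanEquals"]:
                    name = rule["Next"]
                    break
        elif state.get("End"):
            return visited
        else:
            name = state["Next"]


if __name__ == "__main__":
    # The JSON printed here is what would be pasted into the console.
    print(json.dumps(DEFINITION, indent=2))
    print(walk(DEFINITION, is_cached=False))
    print(walk(DEFINITION, is_cached=True))
```

With isCached false, the walk passes through every state; with isCached true, it goes from FileIsCached straight to ServeAudioFile.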
A state machine can have multiple end states. Now the only thing that's missing is defining the initial state. In our case, that's LookupAudioFile.

After creating the state machine, we can create an execution to run it. In this case, I'm going to pass in a filename and a text. It will default to English. When we start this execution, you can see the state machine on the left. The blue node is the one currently being executed. By clicking on a step, you can see more about it, for instance the input and output. A green node means that it was executed successfully. In this case, it ran through the whole system. Let's run it again with the very same parameters. This time, it finds the file cached and skips the audio generation.

Now, how do we use it with API Gateway? To show this, I am creating a new API. In the root resource, I'm adding a POST method that connects to the Step Functions service. You need to select the region and the service. Next, we need to specify the action, which is StartExecution, and provide an execution role. I've created one already to save some time, but it needs the Step Functions full access policy attached. And finally, there needs to be a body mapping template. I'm using application/json, since an HTML form is a real pain to get working with this. We need an input key containing the body JSON as a string, as well as the ARN of our state machine.

You might think that we just receive the output of the execution in the response, but that's not how it works. We are in async land, so the state machine executes in the background. What we receive is a reference to the execution. I'm going to deploy this API to prod as well, and copy the URL to make a POST request with curl. You could also use any other HTTP tool. It's going to be a POST request with a Content-Type header of application/json, as well as some JSON data in the body. If the request was successful, you will receive the execution ARN and the start date.
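As a rough illustration of the request flow just described, the Python sketch below shows two things: the shape of the StartExecution payload that the body mapping template has to produce (the request body as a JSON string under input, plus stateMachineArn), and a POST request made with the standard library instead of curl. The ARN and the API URL are placeholders, since neither appears in the lesson, and the mapping template itself is not reproduced here.

```python
import json
import urllib.request

# Placeholder values: neither the ARN nor the deployed URL appears in the lesson.
STATE_MACHINE_ARN = "arn:aws:states:REGION:ACCOUNT_ID:stateMachine:text-to-speech"
API_URL = "https://API_ID.execute-api.REGION.amazonaws.com/prod"


def start_execution_payload(body, state_machine_arn=STATE_MACHINE_ARN):
    """Shape of the StartExecution request: "input" is a JSON string, not an object."""
    return {"input": json.dumps(body), "stateMachineArn": state_machine_arn}


def post_json(url, body):
    """POST a JSON body with a Content-Type header of application/json."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    body = {"filename": "hello", "text": "Hello world"}
    # What the integration sends on to Step Functions:
    print(json.dumps(start_execution_payload(body), indent=2))
    # Against a real deployment, the client only sends the plain body:
    # result = post_json(API_URL, body)
    # print(result["executionArn"], result["startDate"])
```

The client never sends the state machine ARN itself. In the setup from the lesson, that is added on the API Gateway side by the mapping template.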
What you could do to get the result is to periodically query the execution you started and wait for it to finish, so you can read its output. A look at the S3 bucket shows that the file is also there.
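The polling idea mentioned above could look roughly like this in Python. It is a sketch under assumptions: it talks to Step Functions directly through a client object with a describe_execution method, such as the one returned by boto3.client("stepfunctions"), rather than through the API from the lesson. The interval and the attempt limit are arbitrary example values.

```python
import json
import time


def wait_for_execution(sfn, execution_arn, interval=2.0, max_attempts=30,
                       sleep=time.sleep):
    """Poll an execution until it leaves the RUNNING state.

    Returns (status, output), where output is the parsed JSON output of the
    execution, or None if there is no output (for example after a failure).
    """
    for _ in range(max_attempts):
        result = sfn.describe_execution(executionArn=execution_arn)
        status = result["status"]
        if status != "RUNNING":
            raw_output = result.get("output")
            output = json.loads(raw_output) if raw_output else None
            return status, output
        sleep(interval)
    raise TimeoutError(f"execution still running after {max_attempts} attempts")


# Usage against a real account (requires boto3 and credentials):
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   status, output = wait_for_execution(sfn, execution_arn)
```

On success, the status is SUCCEEDED and the output is whatever the last state returned, which in this workflow would be the hash from serve-audio-file.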