# Serverless in the wild: Characterizing and optimizing the serverless workload

## Meta Info

Serverless in the wild: Characterizing and optimizing the serverless workload at a large cloud provider

Presented at [ATC 2020](https://www.usenix.org/conference/atc20/presentation/shahrad).

Trace: <https://github.com/Azure/AzurePublicDataset>

## Understanding the paper

### TL;DRs

This paper characterizes the entire production FaaS workload of *Azure Functions*.\
It also proposes a policy for reducing the number of cold-start function executions.

### Background

* Function as a Service (FaaS): A software paradigm
  * Users simply upload the code of their functions to the cloud.
  * Functions get executed when “triggered” or “invoked” by events.
  * The cloud provider is responsible for *provisioning the needed resources*, *providing high function performance*, and *billing users for their actual function executions*.
* Function execution: Requires the code (e.g., user code, language runtime libraries) to be in memory.
  * Warm start: The code is already in memory, so the function can be started quickly.
  * Cold start: The code has to be brought in from persistent storage.

### Some Observations

* Most functions are *invoked very infrequently*.
* The most popular functions are invoked *8 orders of magnitude* more frequently than the least popular ones.
* Functions exhibit *a variety of triggers*, producing invocation patterns that are often *difficult to predict*.
* Function memory usage spans *a 4x range*, and *50% of functions* run for less than 1 second.

### Existing Work

AWS and Azure use *a fixed “keep-alive” policy* that retains the resources in memory for *10 and 20 minutes* after a function execution, respectively.
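To make the effect of a fixed keep-alive window concrete, here is a minimal simulation sketch. The function name `count_cold_starts`, the sample trace, and the simplification that executions take zero time are all assumptions for illustration; they do not come from the paper or from either provider.

```python
def count_cold_starts(invocation_times, keep_alive_minutes):
    """Count cold starts for one function under a fixed keep-alive window.

    invocation_times: sorted invocation timestamps, in minutes.
    Assumption: each execution is instantaneous, so the keep-alive
    window starts at the invocation time itself.
    """
    cold_starts = 0
    warm_until = None  # time until which the code stays in memory
    for t in invocation_times:
        if warm_until is None or t > warm_until:
            cold_starts += 1  # code was evicted (or never loaded)
        warm_until = t + keep_alive_minutes  # every execution resets the window
    return cold_starts


# Hypothetical trace: invocation times in minutes.
trace = [0, 5, 20, 35, 60, 120]
cold_with_10_min = count_cold_starts(trace, 10)  # 5 cold starts
cold_with_20_min = count_cold_starts(trace, 20)  # 3 cold starts
```

On this trace, the longer window removes two cold starts, but it also holds memory for longer after every execution, including the ones that are never followed by another invocation in time.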

### Proposed Policy

* Goal: reduce the number of cold-start invocations through two mechanisms.
  * Use *a different keep-alive value for each user’s workload*, according to its actual invocation frequency and pattern.
  * Enable the provider to *pre-warm a function execution* before its invocation happens (making it a warm start).
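The two mechanisms above can be sketched as a per-workload choice of a pre-warm window and a keep-alive window derived from observed idle times between invocations. This is a simplified illustration, not the paper's exact algorithm: the helper names, the nearest-rank percentile, and the 5th/99th percentile cutoffs are assumptions made for this sketch.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile (p in [0, 100]) of a non-empty sorted list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def choose_windows(idle_times, head_pct=5, tail_pct=99):
    """Pick (pre_warm, keep_alive) in minutes from observed idle times.

    pre_warm:   how long to wait after an execution before reloading the code.
    keep_alive: how long to keep the code loaded once it has been reloaded.
    The percentile cutoffs are illustrative assumptions.
    """
    ordered = sorted(idle_times)
    head = percentile(ordered, head_pct)
    tail = percentile(ordered, tail_pct)
    return head, tail - head


def is_warm(idle_time, pre_warm, keep_alive):
    """True if an invocation arriving after idle_time finds the code in memory."""
    return pre_warm <= idle_time <= pre_warm + keep_alive


# Hypothetical workload invoked roughly every 30 minutes.
idle_times = [28, 29, 30, 30, 31, 31, 32, 33, 30, 29]
pre_warm, keep_alive = choose_windows(idle_times)  # (28, 5)
```

For this hypothetical workload, a fixed 20-minute keep-alive would miss every invocation, while the adaptive windows keep the code in memory for only about 5 minutes per cycle and still serve invocations that arrive 28 to 33 minutes after the previous one as warm starts.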

### Implementation

Implemented in simulation and on the *Apache OpenWhisk FaaS platform*.

