Yarokuk

What are persisted queries and how to set them up with Relay and express-graphql?

Yaroslav Kukytsyak

Updated Jun 6, 2021

This is an overview of persisted queries in the context of GraphQL that also shows how to set them up with Relay on the client and express-graphql on the server.

Persisted queries overview

To explain what persisted GraphQL queries are, let's consider an example GraphQL query:

query ProfileQuery($userId: ID!) {
  user(id: $userId) {
    name
    picture {
      url
    }
  }
}

Generally, you would send the query text along with the variables in the body of a POST request:

POST /graphql

{
  "query": "query ProfileQuery(\n...",
  "variables": {
    "userId": "u1"
  }
}

With persisted queries, on the other hand, the query text is already stored on the server, so you only need to send a query ID that refers to that particular query. The query ID is generally generated by hashing the query text with a hash function such as MD5.
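To make the hashing step concrete, here is a minimal sketch of how such an ID could be derived in Node.js. The computeQueryId helper is a hypothetical name used only for illustration; it is not part of Relay or express-graphql, and a real tool may normalize the query text before hashing it.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: derives a query ID as the hex MD5 digest of the query text.
function computeQueryId(queryText: string): string {
  return createHash('md5').update(queryText, 'utf8').digest('hex');
}

const profileQueryText = `query ProfileQuery($userId: ID!) {
  user(id: $userId) {
    name
    picture {
      url
    }
  }
}`;

// The same text always produces the same 32-character hexadecimal ID.
const profileQueryId = computeQueryId(profileQueryText);
console.log(profileQueryId);
```

Because the ID depends only on the text, the client and the server can agree on it without any coordination beyond sharing the same query.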

For example, if the ID of the ProfileQuery shown above is "5eb63bbbe01eeed093cb22bb8f5acdc3", then we would send the queryId instead of query in the POST request body:

POST /graphql

{
  "queryId": "5eb63bbbe01eeed093cb22bb8f5acdc3",
  "variables": {
    "userId": "u1"
  }
}

The server then needs to have a way to map a query ID to the actual query, for example by having a JSON file with all the persisted queries:

{
  "5eb63bbbe01eeed093cb22bb8f5acdc3": "query ProfileQuery(\n...",
  ...
}
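As a rough sketch of the lookup on the server, assuming the map has already been loaded into memory, resolving an ID could look like the following. The lookupQuery function and the shortened sample query text are assumptions made for illustration.

```typescript
// Mapping from query ID to query text, in the same shape as the JSON file above.
type QueryMap = Record<string, string>;

// Hypothetical sample data with a shortened query text; a real map would be loaded from the JSON file.
const sampleQueryMap: QueryMap = {
  '5eb63bbbe01eeed093cb22bb8f5acdc3':
    'query ProfileQuery($userId: ID!) { user(id: $userId) { name } }',
};

// Returns the query text for a known ID, or undefined for anything else.
function lookupQuery(queryMap: QueryMap, queryId: unknown): string | undefined {
  if (typeof queryId !== 'string') {
    return undefined;
  }
  // Check own properties only, so inherited keys such as "toString" never match.
  return Object.prototype.hasOwnProperty.call(queryMap, queryId)
    ? queryMap[queryId]
    : undefined;
}

console.log(lookupQuery(sampleQueryMap, '5eb63bbbe01eeed093cb22bb8f5acdc3'));
console.log(lookupQuery(sampleQueryMap, 'unknown-id')); // undefined
```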

Advantages of using persisted queries

There are several advantages of using persisted queries:

  • The amount of data that the client needs to upload to the server when performing a GraphQL query or mutation is significantly reduced in many cases because a query ID is generally much shorter than the entire query/mutation text.
  • If a server allows only persisted queries, all the GraphQL operations that can be performed by clients are restricted, which improves security.
  • The HTTP GET method can be used to perform GraphQL queries (e.g., GET /graphql?queryId=5eb63...&variables={"userId":"u1"}) without having to worry about exceeding the maximum query string length. This means that we can leverage HTTP caching as well as preload queries with <link rel="preload">.

Notice that these advantages are mainly useful in production, so it's totally acceptable to use persisted queries in production and non-persisted queries in development.

Persisted queries with Relay

Let's see now how to set up persisted queries with Relay.

First of all, we need to pass the --persist-output flag to the Relay compiler along with the path specifying where to save the generated JSON file containing the mapping from query IDs to query text.

relay-compiler --persist-output ./server/queryMap.json

The first time we run this, the ./server/queryMap.json file will be created and will contain all the queries used by the app. Subsequent runs simply update the JSON file by adding any new queries. Old queries are left in place even if they are no longer used, since someone might still be running an older version of the app that relies on them.
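The additive behavior described above can be pictured as a simple merge of two maps. This is only an illustration of the idea, not the Relay compiler's actual implementation; mergeQueryMaps and the sample entries are hypothetical.

```typescript
type QueryMap = Record<string, string>;

// Adds newly compiled queries to the existing map without dropping old entries.
function mergeQueryMaps(existing: QueryMap, compiled: QueryMap): QueryMap {
  return { ...existing, ...compiled };
}

// Hypothetical IDs and query texts, shortened for readability.
const previousMap: QueryMap = { 'id-of-old-query': 'query OldQuery { me { name } }' };
const latestCompile: QueryMap = { 'id-of-new-query': 'query NewQuery { me { id } }' };

const mergedMap = mergeQueryMaps(previousMap, latestCompile);
console.log(Object.keys(mergedMap)); // both the old and the new ID are present
```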

In addition, running the Relay compiler with --persist-output changes the files generated by Relay: the fetchQuery function that is part of the network layer is now invoked with an operation object whose id field is non-null and whose text field is null. You should therefore update fetchQuery to use id instead of text.

For example, consider the following fetchQuery function, which is using operation.text:

const fetchQuery: FetchFunction = async (operation, variables) => {
  const payload = {
    query: operation.text,
    variables,
  };

  const response = await post('/graphql', payload);

  if (!response.ok) {
    throw response;
  }

  return response.json();
};

We would update it to use operation.id:

const fetchQuery: FetchFunction = async (operation, variables) => {
  const payload = {
    queryId: operation.id, // Passing the id instead of text
    variables,
  };

  const response = operation.operationKind === 'query'
    ? await get('/graphql', payload) // Using HTTP GET for queries
    : await post('/graphql', payload);

  if (!response.ok) {
    throw response;
  }

  return response.json();
};

Notice that we now send an HTTP GET request for queries and an HTTP POST request for mutations. In both cases, the payload is the same: the query/mutation ID and the variables.
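The get and post helpers used above are not defined in this article. As one possible sketch, they could be built on top of any fetch-compatible function. Everything here (createHttpHelpers, toQueryString, the Fetcher type) is a hypothetical name, and the local types are deliberately minimal so the sketch does not depend on browser or Node typings.

```typescript
// Minimal shapes of the response and request options that the helpers rely on.
interface HttpResponse {
  ok: boolean;
  json(): Promise<unknown>;
}

interface RequestOptions {
  method: 'GET' | 'POST';
  headers?: Record<string, string>;
  body?: string;
}

type Fetcher = (url: string, options: RequestOptions) => Promise<HttpResponse>;

interface Payload {
  queryId?: string | null;
  query?: string | null;
  variables: Record<string, unknown>;
}

// Serializes the payload into a query string; variables are sent as URL-encoded JSON.
function toQueryString(payload: Payload): string {
  const params: string[] = [];
  if (payload.queryId) {
    params.push(`queryId=${encodeURIComponent(payload.queryId)}`);
  }
  if (payload.query) {
    params.push(`query=${encodeURIComponent(payload.query)}`);
  }
  params.push(`variables=${encodeURIComponent(JSON.stringify(payload.variables))}`);
  return params.join('&');
}

// Builds `get` and `post` helpers on top of any fetch-compatible function.
function createHttpHelpers(fetcher: Fetcher) {
  const get = (url: string, payload: Payload) =>
    fetcher(`${url}?${toQueryString(payload)}`, { method: 'GET' });

  const post = (url: string, payload: Payload) =>
    fetcher(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    });

  return { get, post };
}
```

In a browser, the fetcher argument could simply wrap the global fetch function. Taking it as a parameter also makes the helpers easy to exercise with a fake in tests.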

Some of the reasons why we use an HTTP GET request for queries are:

  • Since the payload consists only of an ID and the query variables, we don't have to worry about exceeding the maximum query string length.
  • We can leverage HTTP caching.
  • We can preload queries with <link rel="preload">.

Some of the reasons why we use an HTTP POST request for mutations are:

  • Variables consisting of mutation inputs can get quite large and exceed the maximum query string length.
  • Mutations are generally not allowed with HTTP GET requests by GraphQL HTTP servers.
  • We don't need HTTP caching or preloading for mutations.

Even though we can use persisted queries both in development and in production, I would recommend using non-persisted queries in development for the following reasons:

  • We get a better DevTools experience since we can see the query text rather than an opaque query ID.
  • We don't need to update the persisted queries on the server every time we update a query on the client during development.
  • Since we are not updating the persisted queries on the server during the development of features, we don't need to add commits with changes to queryMap.json every time we open a pull request. Instead, we update queryMap.json on the server only when going to production.

To use non-persisted queries in development, we can update the fetchQuery function to call a helper function sendGraphqlRequest, which sends the query ID when in production and the query text when in development:

const sendGraphqlRequest = (operation: RequestParameters, variables: Variables) => {
  if (process.env.NODE_ENV === 'production') {
    const payload = {
      queryId: operation.id,
      variables,
    };

    return operation.operationKind === 'query'
      ? get('/graphql', payload)
      : post('/graphql', payload);
  } else {
    const payload = {
      query: operation.text,
      variables,
    };

    return post('/graphql', payload);
  }
};

const fetchQuery: FetchFunction = async (operation, variables) => {
  const response = await sendGraphqlRequest(operation, variables);

  if (!response.ok) {
    throw response;
  }

  return response.json();
};

Notice that in development, we send a POST request both for queries and mutations since the query text can get quite large and exceed the maximum query string length. Besides that, we don't really need the advantages that HTTP GET requests offer when we are in development.
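One thing that can go wrong with this setup is a mismatch between the environment and the generated files, for example a production build whose operations have no id because the compiler ran without --persist-output. As a sketch of how to fail loudly in that case, the payload construction could be pulled into a small function. The buildPayload name and the OperationParams type are assumptions; the type is a minimal stand-in for the fields of Relay's RequestParameters that are used here.

```typescript
// Minimal stand-in for the fields of Relay's RequestParameters used in this sketch.
interface OperationParams {
  name: string;
  id?: string | null;
  text?: string | null;
}

type Variables = Record<string, unknown>;

type GraphqlPayload =
  | { queryId: string; variables: Variables }
  | { query: string; variables: Variables };

// Picks the ID in production and the text in development, with a clear error on mismatch.
function buildPayload(
  operation: OperationParams,
  variables: Variables,
  isProduction: boolean,
): GraphqlPayload {
  if (isProduction) {
    if (!operation.id) {
      throw new Error(
        `Operation "${operation.name}" has no id. Was the Relay compiler run with --persist-output?`,
      );
    }
    return { queryId: operation.id, variables };
  }

  if (!operation.text) {
    throw new Error(
      `Operation "${operation.name}" has no text. Was the Relay compiler run without --persist-output?`,
    );
  }
  return { query: operation.text, variables };
}
```

With this in place, sendGraphqlRequest would call buildPayload(operation, variables, process.env.NODE_ENV === 'production') and only decide between GET and POST.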

In order to use persisted queries in the production build of an application, we need to do the following:

  1. First, run the Relay compiler with the --persist-output flag.
  2. Then, build the app for production (e.g., using webpack).
  3. Finally, re-run the Relay compiler without the --persist-output flag so that the files generated by Relay go back to using non-persisted queries.

We can do this by first adding a relay script that runs the Relay compiler with all the options except --persist-output:

"scripts": {
  "relay": "yarn run relay-compiler --schema ./schema.graphql --src ./src/ --language typescript --customScalars.DateTime=String",
  ...
}

Then, we add a relay:persist script that runs relay with the --persist-output flag:

"scripts": {
  "relay:persist": "yarn run relay --persist-output ./server/queryMap.json",
  ...
}

Lastly, we update the build script to first run relay:persist, then webpack-build, and finally relay:

"scripts": {
  "build": "yarn run relay:persist && webpack-build && yarn run relay",
  ...
}
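One caveat with chaining the commands using &&: if the build step fails, the last command never runs, and the files generated by Relay stay in persisted mode. If that matters for your workflow, the same sequence could be driven from a small script with try/finally. This is only a sketch; buildForProduction and the RunCommand type are hypothetical, and the command runner is injected so that the logic can be tested without actually running a build.

```typescript
// Runs a shell command and throws if it fails.
type RunCommand = (command: string) => void;

function buildForProduction(run: RunCommand): void {
  // Step 1: generate persisted artifacts and update queryMap.json.
  run('yarn run relay:persist');
  try {
    // Step 2: build the app for production.
    run('webpack-build');
  } finally {
    // Step 3: always restore the non-persisted artifacts, even if the build failed.
    run('yarn run relay');
  }
}

// Example with a runner that only logs the commands instead of executing them.
buildForProduction((command) => console.log(`$ ${command}`));
```

In a real Node.js script, the runner could be a thin wrapper around execSync from node:child_process, which throws when a command exits with a non-zero code.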

Persisted queries with express-graphql

When using express-graphql on the server, we generally do something like this:

app.use('/graphql', graphqlHTTP({ schema }));

With this kind of setup, we can send POST requests to /graphql with the body containing query and variables as well as GET requests with the query string containing query and variables (e.g., GET /graphql?query={me{name}}&variables={}) and everything will work as expected.

However, when using persisted queries, we send queryId instead of query. express-graphql does not handle queryId, so we'll get an error saying that query is missing in the request.

To fix this, we need an Express middleware that executes right before graphqlHTTP and maps queryId to query by using the generated queryMap.json.

Let's call this middleware persistedQueriesMiddleware. A simplistic implementation of this middleware can look like this:

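const persistedQueriesMiddleware: RequestHandler = (req, res, next) => {
  // req.body is only populated if a body parser such as express.json() ran earlier.
  const { queryId } = (req.method === 'GET' ? req.query : req.body) ?? {};

  // Only accept string IDs that are own keys of queryMap.
  const persistedQuery =
    typeof queryId === 'string' && Object.prototype.hasOwnProperty.call(queryMap, queryId)
      ? queryMap[queryId]
      : undefined;

  if (persistedQuery) {
    // req.body can be undefined (e.g., on GET requests), so create it if needed.
    req.body = { ...req.body, query: persistedQuery };
  }

  next();
};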

We first get the queryId from the query string in the case of a GET request or from the body in the case of a POST request.

Then, we try to get the persisted query with ID equal to queryId and, if it exists, we set req.body.query to that persisted query so that graphqlHTTP will find the query that it expects in the body of the request.

Finally, we call next() to move on to the next middleware.

Let's now use this middleware before graphqlHTTP:

app.use(
  '/graphql',
  persistedQueriesMiddleware,
  graphqlHTTP({ schema }),
);

Notice that this middleware implementation does nothing to prevent non-persisted queries.

As mentioned earlier, for improved security, we should allow only persisted queries in production while still allowing both persisted and non-persisted queries in development.

I created an npm package called express-graphql-persisted-queries that provides a middleware that allows doing exactly that!

So, let's replace our middleware with the one from express-graphql-persisted-queries:

import { persistedQueries } from 'express-graphql-persisted-queries';
import queryMap from './queryMap.json';

app.use(
  '/graphql',
  persistedQueries({
    queryMap,
    strict: process.env.NODE_ENV === 'production',
  }),
  graphqlHTTP({ schema }),
);

With this in place, when NODE_ENV is 'production', only persisted queries will be allowed, meaning that we will get a 400 Bad Request for requests that contain the query text or that do not contain a valid query ID.
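To make the strict behavior more tangible, here is a framework-free sketch of the decision such a middleware has to make. This is not the actual code of express-graphql-persisted-queries; resolveQuery, the Resolution type, and the error messages are assumptions made for illustration.

```typescript
type QueryMap = Record<string, string>;

interface IncomingParams {
  queryId?: unknown;
  query?: unknown;
}

type Resolution =
  | { status: 200; query: string }
  | { status: 400; error: string };

// Decides which query text to execute, or rejects the request with a 400.
function resolveQuery(queryMap: QueryMap, params: IncomingParams, strict: boolean): Resolution {
  const { queryId, query } = params;

  // Look up the persisted query, considering only own keys of the map.
  const persisted =
    typeof queryId === 'string' && Object.prototype.hasOwnProperty.call(queryMap, queryId)
      ? queryMap[queryId]
      : undefined;

  // In strict mode, any request carrying query text is rejected.
  if (strict && query !== undefined) {
    return { status: 400, error: 'Only persisted queries are allowed.' };
  }
  if (persisted !== undefined) {
    return { status: 200, query: persisted };
  }
  // Outside strict mode, plain query text is still accepted.
  if (!strict && typeof query === 'string') {
    return { status: 200, query };
  }
  return { status: 400, error: 'Missing or unknown query ID.' };
}

// Hypothetical sample map and requests.
const demoMap: QueryMap = { abc: 'query ProfileQuery { me { name } }' };
console.log(resolveQuery(demoMap, { queryId: 'abc' }, true).status); // 200
console.log(resolveQuery(demoMap, { query: '{ me { name } }' }, true).status); // 400
```

An Express middleware would then either write the resolved query text into the request or respond with the 400 status and error message.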