What are persisted queries and how to set them up with Relay and express-graphql?
Yaroslav Kukytsyak
Updated Jun 6, 2021
This is an overview of persisted queries in the context of GraphQL that also shows how to set them up with Relay on the client and express-graphql on the server.
Persisted queries overview
To explain what persisted GraphQL queries are, let's consider an example GraphQL query:
query ProfileQuery($userId: ID!) {
  user(id: $userId) {
    name
    picture {
      url
    }
  }
}
Generally, you would send the query text along with the variables in the body of a POST request:
POST /graphql
{
  "query": "query ProfileQuery(\n...",
  "variables": {
    "userId": "u1"
  }
}
On the other hand, with persisted queries, the query text is already on the server and you would only need to send a query ID that refers to that particular query (the query ID is generally generated by hashing the query text using a hash function like md5).
For example, if the ID of the ProfileQuery shown above is "5eb63bbbe01eeed093cb22bb8f5acdc3", then we would send the queryId instead of query in the POST request body:
POST /graphql
{
  "queryId": "5eb63bbbe01eeed093cb22bb8f5acdc3",
  "variables": {
    "userId": "u1"
  }
}
The server then needs to have a way to map a query ID to the actual query, for example by having a JSON file with all the persisted queries:
{
  "5eb63bbbe01eeed093cb22bb8f5acdc3": "query ProfileQuery(\n...",
  ...
}
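To make the ID-to-query mapping concrete, here is a minimal sketch of how a query ID could be derived from the query text and then looked up on the server. It assumes Node's built-in crypto module and md5 as the hash function. The helper names getQueryId, buildQueryMap, and resolveQuery are hypothetical and exist only for illustration; this is not how the Relay compiler or any particular server is implemented.

```typescript
import { createHash } from 'crypto';

// Hypothetical helper: derives a query ID by hashing the query text with md5.
const getQueryId = (queryText: string): string =>
  createHash('md5').update(queryText).digest('hex');

// Hypothetical helper: builds an ID -> text map from a list of query texts.
const buildQueryMap = (queryTexts: string[]): Record<string, string> => {
  const queryMap: Record<string, string> = {};
  for (const text of queryTexts) {
    queryMap[getQueryId(text)] = text;
  }
  return queryMap;
};

// Hypothetical helper: returns the query text for an ID, or undefined if the ID is unknown.
// The own-property check avoids matching inherited keys such as "toString".
const resolveQuery = (
  queryMap: Record<string, string>,
  queryId: string,
): string | undefined =>
  Object.prototype.hasOwnProperty.call(queryMap, queryId)
    ? queryMap[queryId]
    : undefined;
```

Because the ID depends only on the query text, the client and the server can agree on it without any extra coordination, as long as both use the same text and the same hash function.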
Advantages of using persisted queries
There are several advantages of using persisted queries:
- The amount of data that the client needs to upload to the server when performing a GraphQL query or mutation is significantly reduced in many cases because a query ID is generally much shorter than the entire query/mutation text.
- If a server allows only persisted queries, all the GraphQL operations that can be performed by clients are restricted, which improves security.
- The HTTP GET method can be used to perform GraphQL queries (e.g., GET /graphql?queryId=5eb63...&variables={"userId":"u1"}) without having to worry about exceeding the maximum query string length. This means that we can leverage HTTP caching as well as preload queries with <link rel="preload">.
Notice that these advantages are mainly useful in production, so it's totally acceptable to use persisted queries in production and non-persisted queries in development.
Persisted queries with Relay
Let's see now how to set up persisted queries with Relay.
First of all, we need to pass the --persist-output flag to the Relay compiler along with the path specifying where to save the generated JSON file containing the mapping from query IDs to query text:
relay-compiler --persist-output ./server/queryMap.json
The first time we run this, the ./server/queryMap.json file will be created and contain all the queries used by the app. Subsequent runs will simply update the JSON file by adding all the new queries (old queries are left in place even if those queries are no longer used since someone might still be using an older version of the app relying on those queries).
Besides that, when we run the Relay compiler with --persist-output, the files generated by Relay are updated so that the fetchQuery function that is part of the network layer is now invoked with an operation object containing a non-null id field and a null text field. You should therefore update the fetchQuery function to use id instead of text.
For example, consider the following fetchQuery function, which is using operation.text:
const fetchQuery: FetchFunction = async (operation, variables) => {
  const payload = {
    query: operation.text,
    variables,
  };
  const response = await post('/graphql', payload);
  if (!response.ok) {
    throw response;
  }
  return response.json();
};
We would update it to use operation.id:
const fetchQuery: FetchFunction = async (operation, variables) => {
  const payload = {
    queryId: operation.id, // Passing the id instead of text
    variables,
  };
  const response = operation.operationKind === 'query'
    ? await get('/graphql', payload) // Using HTTP GET for queries
    : await post('/graphql', payload);
  if (!response.ok) {
    throw response;
  }
  return response.json();
};
Notice that we now send an HTTP GET request for queries and an HTTP POST request for mutations. In both cases, the payload is the same: the query/mutation ID and the variables.
Some of the reasons why we use an HTTP GET request for queries are:
- Since the payload consists only of an ID and the query variables, we don't have to worry about exceeding the maximum query string length.
- We can leverage HTTP caching.
- We can preload queries with <link rel="preload">.
Some of the reasons why we use an HTTP POST request for mutations are:
- Variables consisting of mutation inputs can get quite large and exceed the maximum query string length.
- Mutations are generally not allowed with HTTP GET requests by GraphQL HTTP servers.
- We don't need HTTP caching or preloading for mutations.
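The get and post helpers called by fetchQuery are not shown in this article. Here is a minimal sketch of what they could look like, assuming the standard fetch API is available and that the server reads queryId and variables from the query string for GET requests. The GraphqlPayload type and the buildGetUrl helper are hypothetical names introduced only for this example.

```typescript
// Hypothetical payload shape: either a query ID or the query text, plus variables.
type GraphqlPayload = {
  queryId?: string | null;
  query?: string | null;
  variables: Record<string, unknown>;
};

// Hypothetical helper: serializes the payload into the query string of a GET URL.
const buildGetUrl = (path: string, payload: GraphqlPayload): string => {
  const params = new URLSearchParams();
  if (payload.queryId) params.set('queryId', payload.queryId);
  if (payload.query) params.set('query', payload.query);
  // Variables are sent as a JSON string; URLSearchParams handles the percent-encoding.
  params.set('variables', JSON.stringify(payload.variables));
  return `${path}?${params.toString()}`;
};

// Sends the payload in the query string of a GET request.
const get = (path: string, payload: GraphqlPayload): Promise<Response> =>
  fetch(buildGetUrl(path, payload), { method: 'GET' });

// Sends the payload as a JSON body of a POST request.
const post = (path: string, payload: GraphqlPayload): Promise<Response> =>
  fetch(path, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
```

Keeping the URL construction in its own function makes the GET variant easy to reuse, for instance to compute the href of a preload link for a known query ID and set of variables.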
Even though we can use persisted queries both in development and in production, I would recommend using non-persisted queries in development for the following reasons:
- We get a better DevTools experience since we can see the query text rather than an opaque query ID.
- We don't need to update the persisted queries on the server every time we update a query on the client during development.
- Since we are not updating the persisted queries on the server during the development of features, we don't need to add commits with changes to queryMap.json every time we open a pull request. Instead, we update queryMap.json on the server only when going to production.
To use non-persisted queries in development, we can update the fetchQuery function to call a helper function sendGraphqlRequest, which sends the query ID when in production and the query text when in development:
import { FetchFunction, RequestParameters, Variables } from 'relay-runtime';

// get and post are the app's own HTTP helpers (not shown here)
const sendGraphqlRequest = (operation: RequestParameters, variables: Variables) => {
  if (process.env.NODE_ENV === 'production') {
    // Production: send only the query ID
    const payload = {
      queryId: operation.id,
      variables,
    };
    return operation.operationKind === 'query'
      ? get('/graphql', payload)
      : post('/graphql', payload);
  } else {
    // Development: send the full query text
    const payload = {
      query: operation.text,
      variables,
    };
    return post('/graphql', payload);
  }
};

const fetchQuery: FetchFunction = async (operation, variables) => {
  const response = await sendGraphqlRequest(operation, variables);
  if (!response.ok) {
    throw response;
  }
  return response.json();
};
Notice that in development, we send a POST request both for queries and mutations since the query text can get quite large and exceed the maximum query string length. Besides that, we don't really need the advantages that HTTP GET requests offer when we are in development.
In order to use persisted queries in the production build of an application, we need to do the following:
- First, run the Relay compiler with the --persist-output flag.
- Then, build the app for production (e.g., using webpack).
- Finally, re-run the Relay compiler without the --persist-output flag so that the files generated by Relay go back to using non-persisted queries.
We can do this by first adding a relay script that runs the Relay compiler with all the options except --persist-output:
"scripts": {
  "relay": "yarn run relay-compiler --schema ./schema.graphql --src ./src/ --language typescript --customScalars.DateTime=String",
  ...
}
Then, we add a relay:persist script that runs relay with the --persist-output flag:
"scripts": {
  "relay:persist": "yarn run relay --persist-output ./server/queryMap.json",
  ...
}
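Lastly, we update the build script to first run relay:persist, then webpack-build, and finally relay (webpack-build is assumed here to be another script defined in package.json, so it is invoked through yarn run like the other two):
"scripts": {
  "build": "yarn run relay:persist && yarn run webpack-build && yarn run relay",
  ...
}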
Persisted queries with express-graphql
When using express-graphql on the server, we generally do something like this:
app.use('/graphql', graphqlHTTP({ schema }));
With this kind of setup, we can send POST requests to /graphql with the body containing query and variables as well as GET requests with the query string containing query and variables (e.g., GET /graphql?query={me{name}}&variables={}) and everything will work as expected.
However, when using persisted queries we would send queryId instead of query, which is not handled by express-graphql, and therefore we'll get an error saying that query is missing in the request.
Therefore, we need to have an Express middleware that executes right before graphqlHTTP and maps queryId to query by using the generated queryMap.json.
Let's call this middleware persistedQueriesMiddleware. A simplistic implementation of this middleware can look like this:
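const persistedQueriesMiddleware: RequestHandler = (req, res, next) => {
  // The ID comes from the query string for GET requests and from the body otherwise.
  // Reading req.body assumes that a body parser such as express.json() ran earlier.
  const { queryId } = req.method === 'GET' ? req.query : req.body;
  const persistedQuery = queryMap[queryId];
  if (persistedQuery) {
    // Create the body object if it is missing so that setting query cannot throw
    req.body = { ...req.body, query: persistedQuery };
  }
  next();
};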
We first get the queryId from the query string in the case of a GET request or from the body in the case of a POST request.
Then, we try to get the persisted query with ID equal to queryId and if it exists, we set the req.body.query to that persisted query so that graphqlHTTP will find the query that it expects in the body of the request.
Finally, we call next() to move on to the next middleware.
Let's now use this middleware before graphqlHTTP:
app.use(
  '/graphql',
  persistedQueriesMiddleware,
  graphqlHTTP({ schema }),
);
Notice that this middleware implementation does nothing to prevent non-persisted queries.
As mentioned earlier, for improved security, we should allow only persisted queries in production while still allowing both persisted and non-persisted queries in development.
I created an npm package called express-graphql-persisted-queries that provides a middleware that allows doing exactly that!
So, let's replace our middleware with the one from express-graphql-persisted-queries:
import { persistedQueries } from 'express-graphql-persisted-queries';
import queryMap from './queryMap.json';

app.use(
  '/graphql',
  persistedQueries({
    queryMap,
    strict: process.env.NODE_ENV === 'production',
  }),
  graphqlHTTP({ schema }),
);
With this in place, when NODE_ENV is 'production', only persisted queries will be allowed, meaning that we will get a 400 Bad Request for requests that contain the query text or that do not contain a valid query ID.
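To illustrate the strict behavior described above, here is a rough sketch of the decision logic as a pure function. It is not the actual implementation of express-graphql-persisted-queries: the resolveRequestQuery name, the Resolution type, and the error messages are hypothetical, and the handling of non-strict requests that carry neither a known ID nor a query text is an assumption.

```typescript
type QueryMap = Record<string, string>;

// Hypothetical result type: either the query text to execute or a 400 rejection.
type Resolution =
  | { ok: true; query: string }
  | { ok: false; status: 400; message: string };

const reject = (message: string): Resolution => ({ ok: false, status: 400, message });

// Hypothetical function: decides which query text (if any) to execute for a request.
const resolveRequestQuery = (
  params: { queryId?: unknown; query?: unknown },
  queryMap: QueryMap,
  strict: boolean,
): Resolution => {
  const { queryId, query } = params;

  // In strict mode, any request that carries query text is rejected.
  if (strict && query !== undefined) {
    return reject('Only persisted queries are allowed');
  }

  // A known query ID always resolves to its persisted query text.
  if (
    typeof queryId === 'string' &&
    Object.prototype.hasOwnProperty.call(queryMap, queryId)
  ) {
    return { ok: true, query: queryMap[queryId] };
  }

  // Outside strict mode, fall back to the query text sent by the client.
  if (!strict && typeof query === 'string') {
    return { ok: true, query };
  }

  // Assumption: everything else (missing or unknown ID, no usable text) is a bad request.
  return reject('Missing or unknown query ID');
};
```

A middleware built on this function would read the parameters from the query string or the body, respond with the 400 status on rejection, and otherwise set the query on the request before calling next().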