No more reasons not to try it!
25 October 2019


Disclaimer
I may be a GraphQL evangelist




A civic start-up, driven by collective intelligence, that develops participatory applications.

- 25 people 🏃
- 9 devs 👩‍💻






Consultation


🇫🇷 Le Grand Débat National
- 1.9 M+ contributions
- 10,000+ local events
- 2.7 M+ visitors
Allows clients to ask what they want.
Easily aggregate data from multiple sources.
Uses a type system to describe the data.

A GraphQL request is called a query and looks like this:
query {
  users {
    name
    votes {
      id
      value
    }
  }
}
➡️
👏 The way we ask for the data is the way we receive it.
{
  "data": {
    "users": [
      {
        "name": "John Doe",
        "votes": [
          {
            "id": "1",
            "value": 1
          },
          {
            "id": "2",
            "value": 0
          }
        ]
      },
      {
        "name": "Foo Bar",
        "votes": []
      }
    ]
  }
}
A query supports arguments. For example, to display a specific user, you can pass an id argument to the user field:
query {
  user(id: "user1") {
    id
    name
    votes(first: 1) {
      value
    }
  }
}
✌️ Arguments let us refine our queries: filter, sort, paginate…
{
  "data": {
    "user": {
      "id": "user1",
      "name": "John Doe",
      "votes": [
        { "value": 1 }
      ]
    }
  }
}
➡️

Introducing REST fanboy

GraphQL is a Facebook thing, why should I care…


Some GraphQL users






Some public GraphQL APIs







An open and neutral home for the GraphQL community
I already have a REST API…


REST and GraphQL are not enemies
Supporting multiple protocols for a single API is ideal from the consumer’s perspective.
GitHub is using their GraphQL (v4) API to back some parts of their REST (v3) API.
It's a very smart way of handling multiple protocols without the maintenance effort!
Dogfooding
Eating your own dog food

Do you use your own API?

Welcome to frontend development
Current status of Frontend requirements
Realtime
Offline support
Local First Architecture
Optimistic Updates
{
  "users": [
    {
      "id": "1",
      "votes": [
        {
          "id": "1"
        }
      ]
    }
  ]
}
query {
  users {
    id
    votes {
      id
    }
  }
}
Client Caching
Global Unique Cache Key
Refetch identifier
Opaque to clients
public function resolveUserId(User $user): string
{
    return \base64_encode('User:' . $user->getId());
}
Global Object Identification
Associate a unique id to each object in order to be able to identify and retrieve it again regardless of its type.
{
  "users": [
    {
      "id": "VXNlcjox",
      "votes": [
        {
          "id": "Vm90ZTox"
        }
      ]
    }
  ]
}
query {
  users {
    id
    votes {
      id
    }
  }
}
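As a rough illustration of Global Object Identification, the sketch below encodes a type name and a local id into one opaque string, and decodes it again. The helper names `toGlobalId` and `fromGlobalId` are hypothetical and only for illustration; the `Type:id` layout follows the resolver shown earlier.

```php
<?php
// Hypothetical helpers: the names toGlobalId/fromGlobalId are illustrative.

/** Builds an opaque global ID from a type name and a local ID. */
function toGlobalId(string $type, string $id): string
{
    return base64_encode($type . ':' . $id);
}

/**
 * Decodes a global ID back into its type name and local ID.
 *
 * @return array{type: string, id: string}
 */
function fromGlobalId(string $globalId): array
{
    $decoded = base64_decode($globalId, true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        throw new InvalidArgumentException('Malformed global ID: ' . $globalId);
    }
    // Split on the first colon only, so local IDs may contain colons.
    [$type, $id] = explode(':', $decoded, 2);

    return ['type' => $type, 'id' => $id];
}

echo toGlobalId('User', '1'), PHP_EOL; // VXNlcjox
echo toGlobalId('Vote', '1'), PHP_EOL; // Vm90ZTox
```

Only the server should ever decode these ids. Clients treat them as opaque cache keys and refetch identifiers.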
Relay Store
All records are normalized in the store; this is why we need a unique and global ID for each record.
GraphQL Response
{
  "users": [
    {
      "id": "VXNlcjox",
      "name": "John Doe",
      "votes": [
        {
          "id": "Vm90ZTox",
          "value": 0
        }
      ]
    }
  ]
}
Client Store
{
  "VXNlcjox": {
    "name": "John Doe",
    "votes": [
      { "__ref": "Vm90ZTox" }
    ]
  },
  "Vm90ZTox": {
    "value": 0
  }
}
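To make the normalization step concrete, here is a minimal sketch that turns the nested response into the flat store shown on this slide. It is an assumption-laden illustration, not Relay's actual implementation: the functions `normalize` and `normalizeValue` are hypothetical, and a record is simply "any array with an id key".

```php
<?php
/**
 * Flattens a nested record into a store keyed by global ID.
 * Nested records (arrays carrying an "id") are replaced by {"__ref": id}.
 * Hypothetical helper, not Relay's real algorithm.
 */
function normalize(array $record, array &$store): array
{
    $id = $record['id'];
    $fields = [];
    foreach ($record as $name => $value) {
        if ($name === 'id') {
            continue;
        }
        $fields[$name] = normalizeValue($value, $store);
    }
    // Merge so that fields fetched by an earlier query are kept.
    $store[$id] = array_merge($store[$id] ?? [], $fields);

    return ['__ref' => $id];
}

function normalizeValue($value, array &$store)
{
    if (!is_array($value)) {
        return $value; // scalar field
    }
    if (isset($value['id'])) {
        return normalize($value, $store); // nested record
    }
    // List of scalars or records: normalize each item.
    return array_map(
        function ($item) use (&$store) {
            return normalizeValue($item, $store);
        },
        $value
    );
}

$response = ['users' => [[
    'id' => 'VXNlcjox',
    'name' => 'John Doe',
    'votes' => [['id' => 'Vm90ZTox', 'value' => 0]],
]]];

$store = [];
foreach ($response['users'] as $user) {
    normalize($user, $store);
}
echo json_encode($store, JSON_PRETTY_PRINT), PHP_EOL;
```

Because every record lives under its own key, a later query that returns the same vote updates a single entry, and every view that references it sees the change.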
GraphQL Frontend DX
👍 Ask what they want.
💖 The schema facilitates communication with backend devs.
😍 Removes most data-fetching code.
🚆 Optimistic UI and cached data.
✅ Generate Flow/TypeScript typings from the strongly typed schema.
I don't want to rewrite all my code


GraphQL fits very well on a monolith; that is even the case for most users.
It also integrates very well in front of microservices and REST APIs.
Did Facebook rewrite all their code to use GraphQL?
No. Because it's only a thin layer.

Business Logic
Storage Layer
GraphQL
Client
GraphQL Gateway
REST API
Service
Service
Service
GraphQL breaks caching


What kind of caching?
Server-side caching ✅
Client-side caching ✅
HTTP caching ❓
HTTP Caching
| Verb | Operation | Cacheable |
|---|---|---|
| GET | Read | Yes |
| POST | Write | No |
Cache-Control: max-age=<seconds>
GraphQL and HTTP
The same URL is called for different queries, producing different results.
GraphQL commonly uses the POST verb.
Cache duration depends on the fields in the response.
We have 2 problems: differentiating mutations from queries, and caching POST responses.
| Verb | Operation | Cacheable |
|---|---|---|
| POST | Query | Yes |
| POST | Mutation | No |
GraphQL doesn't say much about transport: it's up to you, and POST is not the only way to use GraphQL.
#1 Differentiating mutations and queries
The Hacker method
Always use POST, but send an extra header for every query:
X-HTTP-Method-Override: GET
#1 Differentiating mutations and queries
The Regex method
You can use a regex to make sure mutations are always sent to the backend.
sub vcl_recv {
    #...
    # Always send GraphQL mutations to the backend.
    if (bodyaccess.rematch_req_body("mutation") == 1) {
        return (pass);
    }
    return (hash);
}
#2 POST requests cannot be cached
When everyone says it's impossible, there is one solution left…
Responses to POST requests are only cacheable when they include
explicit freshness information (see Section 4.2.1 of [RFC7234]).
However, POST caching is not widely implemented. For cases where an
origin server wishes the client to be able to cache the result of a
POST in a way that can be reused by a later GET, the origin server
MAY send a 200 (OK) response containing the result and a
Content-Location header field that has the same value as the POST's
effective request URI (Section 3.1.4.2).
POST requests cannot be cached

The solution is to make the request body a part of the hash, and let the normal caching logic happen.
The result is that only clients who supply the same body will receive the same reply.
The Fake News method
You can use a regex to make sure mutations are always sent to the backend.
import std;
import bodyaccess;

# Called at the beginning of a request, after the complete request has been received and parsed.
sub vcl_recv {
    # Never trust a length header supplied by the client.
    unset req.http.X-Body-Len;
    # Only cache POST GraphQL API requests.
    if (req.method == "POST" && req.url ~ "graphql$") {
        # Will store up to 500 kilobytes of request body.
        std.cache_req_body(500KB);
        set req.http.X-Body-Len = bodyaccess.len_req_body();
        # If a client supplies a very big request (more than 500KB)
        if (req.http.X-Body-Len == "-1") {
            return (pass);
        }
        # Always send GraphQL mutations to the backend.
        if (bodyaccess.rematch_req_body("mutation") == 1) {
            return (pass);
        }
        return (hash);
    }
    #...
}
# Change the hashing function to handle POST requests.
sub vcl_hash {
    # To cache POST requests
    if (req.http.X-Body-Len) {
        bodyaccess.hash_req_body();
    } else {
        hash_data("");
    }
}
# On a cache miss Varnish fetches with GET by default;
# restore POST so that the backend receives the body.
sub vcl_backend_fetch {
    if (bereq.http.X-Body-Len) {
        set bereq.method = "POST";
    }
}
✅ It's quick, simple and it works well.
We have been using this in production for a year. 🙃
But people who read RFCs will not be happy 😅

If using POST is not an option
The Query hash method
GET is a valid way to query a GraphQL server over HTTP. This means that we can indeed cache GraphQL responses.
The only issue with GET is the size limit of the query string, which depends on the browser.
query ($id: ID!) {
  node(id: $id) {
    ... on User {
      name
      createdAt
    }
  }
}
4fde400d10bdc6ca010d199cfce6091e3537d6a3
hash
GET /graphql?query=123&id=user1
ℹ️ Requires some frontend and server tooling: the server must know how to turn hashes into queries.
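The server-side half of the query hash method could look like the sketch below. Everything in it is an assumption made for illustration: the class `PersistedQueryStore` is hypothetical, the registry lives in memory, and SHA-1 is chosen only because the hash on the slide has the same length as a SHA-1 digest.

```php
<?php
/** Hypothetical in-memory registry mapping query hashes to query text. */
final class PersistedQueryStore
{
    /** @var array<string, string> */
    private $queries = [];

    /** Registers a query (typically at build time) and returns its hash. */
    public function register(string $query): string
    {
        $hash = sha1($query); // assumption: any stable hash would do
        $this->queries[$hash] = $query;

        return $hash;
    }

    /** Returns the query text for a hash, or null if it was never registered. */
    public function resolve(string $hash): ?string
    {
        return $this->queries[$hash] ?? null;
    }
}

$store = new PersistedQueryStore();
$hash = $store->register('query ($id: ID!) { node(id: $id) { ... on User { name createdAt } } }');

// The client now sends only the hash and the variables, which fits in a GET URL.
$url = '/graphql?' . http_build_query(['query' => $hash, 'id' => 'user1']);
echo 'GET ', $url, PHP_EOL;
```

A side effect of this design is that the server can refuse any hash it does not know, so only queries registered ahead of time are ever executed.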
How do we know cache freshness?
It depends on every field used…
{
  contributions(first: 10) { # Can be cached for 60s
    id
    title
    votes {
      totalCount # Can be cached for 30s
    }
  }
  latestContribution { # Can be cached for 5s
    id
  }
}
type Contribution @cacheControl(maxAge: 240) {
  id: Int!
  title: String
  author: Author
  votesCount: Int @cacheControl(maxAge: 30)
}
type Query {
  latestContribution: Contribution @cacheControl(maxAge: 10)
}
A GraphQL directive describes the cache policy for each field.
The lowest value is used for the entire response.
A proposal from Apollo Server
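The "lowest value wins" rule can be sketched in a few lines. The function `cacheControlHeader` is hypothetical, and it assumes the server has already collected the maxAge hint of every field touched by the query; it ignores other aspects of a cache policy such as public or private scope.

```php
<?php
/**
 * Returns the Cache-Control header for a response, given the maxAge
 * (in seconds) of every field the query touched. Hypothetical helper.
 *
 * @param int[] $fieldMaxAges
 */
function cacheControlHeader(array $fieldMaxAges): string
{
    // Without any hint, stay on the safe side and do not cache.
    if ($fieldMaxAges === []) {
        return 'Cache-Control: no-store';
    }
    // The response is only as fresh as its most volatile field.
    $maxAge = min($fieldMaxAges);
    if ($maxAge <= 0) {
        return 'Cache-Control: no-store';
    }

    return 'Cache-Control: max-age=' . $maxAge;
}

// Contribution (240s), votesCount (30s), latestContribution (10s)
echo cacheControlHeader([240, 30, 10]), PHP_EOL; // Cache-Control: max-age=10
```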
Is HTTP caching worth it?
Have an authenticated-only API? ❌
Have data that changes often? ❌
Highly customizable API?
Consider the tradeoffs.
GraphQL is slow


A resolver has no idea whether its data has been loaded before, whether it will be loaded later, or whether other resolvers will end up asking for the same data.
# Query.users
public function resolveUsers(): array
{
    return $this->userRepository->findAll();
}
# User.name
public function resolveUserName(User $user): string
{
    return $user->getName();
}
# User.votes
public function resolveUserVotes(User $user): array
{
    return $this->votesRepository->findByUser($user);
}
N+1 problem
query {
  users {     # fetches users (1 query)
    name
    votes {   # fetches votes for each user
      id      # (N queries for N users)
      value
    }
  }
}             # Therefore = N+1 round trips
Data-fetching problem that occurs when you need to fetch related items.
How do I make GraphQL efficient?
Batching: within a single query, lets you run one batch operation instead of many small lookups.
SELECT * FROM votes WHERE id IN ('1', '2', '3');
-- Instead of:
SELECT * FROM votes WHERE id = '1';
SELECT * FROM votes WHERE id = '2';
SELECT * FROM votes WHERE id = '3';
DataLoader, a data loading mechanism.
Nothing to do with GraphQL, but it pairs well with it.
// User.votes
public function resolveUserVotes(User $user): Promise
{
    return $this->userVotesDataLoader->load($user->getId());
}
#load takes as an argument the loading key for the data the caller is interested in and it returns a promise, which will eventually be fulfilled with the data the caller asked for.
This method is used within resolvers
class UserVotesDataLoader {
    public function load($key): Promise
    {
        // Adds the key to an eventual batch and returns a promise
    }
}
A batch loading function accepts an Array of keys, and returns a Promise which resolves to an Array of values.
class UserVotesDataLoader {
    // Receives all keys that were asked to be loaded
    public function all(array $keys): Promise
    {
        // Resolve data using batching
        $votes = $this->repository->findVotesForUsersIds($keys);
        // Build the array of values, in the same order as the keys (key => votes)
        $results = array_map(
            function ($key) { /* your logic */ },
            $keys
        );
        // Fulfills resolver promises with the data they asked for.
        return $this->promiseAdapter->all($results);
    }
}
Lazy execution: we take an asynchronous approach to resolvers. This means resolvers don’t always return a value anymore; they can return a kind of “incomplete result” (a promise).

Are our database engineers happy?
It's better, but what about duplicates…

Memoization caching: within the same query, keep the result of a job in memory to avoid duplicating it.
Application caching: we can also use a DataLoader key to cache its result between requests.
DataLoader enables caching by default.
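Batching, lazy execution and memoization fit together as in the following sketch. It is a deliberately simplified, synchronous stand-in: `SimpleDataLoader` is a hypothetical class, not the API of an existing DataLoader library, and a plain callable plays the role of the promise.

```php
<?php
/**
 * Minimal, synchronous sketch of the DataLoader idea (hypothetical class,
 * not the API of an existing library). Keys must be ints or strings.
 */
final class SimpleDataLoader
{
    /** @var callable receives a list of keys, returns values in the same order */
    private $batchLoad;
    /** @var array<int|string, bool> keys waiting for the next batch */
    private $queue = [];
    /** @var array<int|string, mixed> memoized values, per key */
    private $cache = [];

    public function __construct(callable $batchLoad)
    {
        $this->batchLoad = $batchLoad;
    }

    /** Queues a key and returns a callable that yields its value later. */
    public function load($key): callable
    {
        if (!array_key_exists($key, $this->cache)) {
            $this->queue[$key] = true; // array keys deduplicate for free
        }

        return function () use ($key) {
            $this->dispatch();

            return $this->cache[$key];
        };
    }

    /** Loads every queued key with a single call to the batch function. */
    private function dispatch(): void
    {
        if ($this->queue === []) {
            return;
        }
        $keys = array_keys($this->queue);
        $this->queue = [];
        $values = ($this->batchLoad)($keys);
        foreach ($keys as $index => $key) {
            $this->cache[$key] = $values[$index];
        }
    }
}

$calls = 0;
$votesByUser = ['user1' => [1, 0], 'user2' => []]; // stand-in for the database
$loader = new SimpleDataLoader(function (array $keys) use (&$calls, $votesByUser) {
    ++$calls; // stands in for one "WHERE user_id IN (...)" query
    return array_map(function ($key) use ($votesByUser) {
        return $votesByUser[$key] ?? [];
    }, $keys);
});

// Three resolvers ask for data; nothing is fetched until a value is needed.
$deferred = [$loader->load('user1'), $loader->load('user2'), $loader->load('user1')];
$results = array_map(function (callable $resolve) {
    return $resolve();
}, $deferred);
echo $calls, PHP_EOL; // 1
```

The duplicate request for `user1` costs nothing, and a later `load('user2')` in the same request is served from memory without another call to the batch function.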
GraphQL lets anyone query for everything


Limiting query complexity and depth
Sending a heavy query can consume too many resources, for example: user ➡️ friends ➡️ friends ➡️ friends …
One way to prevent this is to run a cost analysis before execution and to set a limit.
# app/config/config.yml
overblog_graphql:
    security:
        query_max_complexity: 1000
        query_max_depth: 10
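What a depth limit checks can be illustrated with a small sketch. It is not how the bundle implements it: the functions `queryDepth` and `assertDepthAllowed` are hypothetical, and the query is represented as nested arrays (field name mapped to its sub-selection) instead of a parsed GraphQL AST.

```php
<?php
/**
 * Computes the depth of a selection set described as nested arrays:
 * a field name maps to its sub-selection (an empty array for a leaf).
 */
function queryDepth(array $selection): int
{
    $depth = 0;
    foreach ($selection as $subSelection) {
        $depth = max($depth, 1 + queryDepth($subSelection));
    }

    return $depth;
}

/** Rejects a query before execution when it nests deeper than allowed. */
function assertDepthAllowed(array $selection, int $maxDepth): void
{
    $depth = queryDepth($selection);
    if ($depth > $maxDepth) {
        throw new RuntimeException("Query depth $depth exceeds the limit of $maxDepth.");
    }
}

// user -> friends -> friends -> friends -> name
$query = ['user' => ['friends' => ['friends' => ['friends' => ['name' => []]]]]];
echo queryDepth($query), PHP_EOL; // 5
```

Because the check runs on the query alone, an abusive request is refused before any resolver or database call is made.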
How do I implement Authentication?
Authentication is independent of GraphQL.

class VoteResolver {
    // Single source of truth for fetching
    public function fetch(string $id): ?Vote
    {
        $vote = $this->repository->find($id); // Nullable
        if (!$vote) return null;
        return $vote;
    }
}
How do I implement Authorization?
class VoteResolver {
    // Single source of truth for fetching
    public function fetch(string $id): ?Vote
    {
        $vote = $this->repository->find($id); // Nullable
        if (!$vote) return null;
        // Single source of truth for authorization
        $canSee = checkCanSee($vote);
        return $canSee ? $vote : null;
    }
}
function checkCanSee(Vote $vote): bool {
    return true;
}
class VoteResolver {
    // Single source of truth for fetching
    public function fetch(Viewer $viewer, string $id): ?Vote
    {
        $vote = $this->repository->find($id); // Nullable
        if (!$vote) return null;
        // Single source of truth for authorization
        $canSee = checkCanSee($viewer, $vote);
        return $canSee ? $vote : null;
    }
}
function checkCanSee(Viewer $viewer, Vote $vote): bool {
    return $vote->getAuthor()->getId() === $viewer->getId();
}
A Vote can only be seen by its creator
Authorization workflow
- You run authentication logic 🛡️
- You get a viewer, passed to resolvers through the context
- Your resolvers must use a single source of truth for fetching (from the business logic layer) 💡
GraphQL is impossible to monitor


Monitoring
REST


GraphQL

Monitoring

For an internal API, it's simple: give your queries a name, then use it as the name of the transaction.
💡 There is an ESLint rule for that.
query ProposalListViewPaginatedQuery {
  # your query
}
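On the server, turning the query name into a transaction name could be as simple as the sketch below. The function `operationName` is hypothetical and relies on a regex shortcut; the `viewer { id }` selection is made up for the example, and how the name is handed to a monitoring tool is left out.

```php
<?php
/**
 * Extracts the operation name from a GraphQL document with a regex.
 * This is a shortcut for illustration: a real parser is more robust
 * (leading comments, fragments defined first, several operations...).
 */
function operationName(string $query): ?string
{
    $pattern = '/^\s*(?:query|mutation|subscription)\s+([_A-Za-z][_0-9A-Za-z]*)/';
    if (preg_match($pattern, $query, $matches) === 1) {
        return $matches[1];
    }

    return null; // anonymous operation
}

$name = operationName('query ProposalListViewPaginatedQuery { viewer { id } }');
// Fall back to a fixed label so anonymous queries are still grouped together.
echo $name ?? 'AnonymousOperation', PHP_EOL; // ProposalListViewPaginatedQuery
```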


Inspirations for this talk
Dan Schafer
Marc-André Giroux
Thanks!
Any questions?

We work hard to update our democracy (with GraphQL)… 👍
Slides: cap-collectif.slides.com/spyl
GRAPHQL, NO MORE REASONS NOT TO TRY IT!
By Aurélien David