Why GraphQL?
Not long ago I wrote my first GraphQL service. I remember that for the first few days it was difficult to wrap my head around it: I had missing brackets and colons, unresolved types, and non-null errors all over the place. But once I understood what it stood for, which problems it solves, and how easy it makes querying the data a client needs, I couldn’t stop using it. I have been using it for almost all of my projects since. With this blog I hope to answer the very first question I asked when I heard of GraphQL.
Origin
GraphQL was developed internally by Facebook in 2012 and open sourced in 2015. The idea emerged when Facebook decided to rebuild its native mobile applications. They were looking for a data-fetching API powerful enough to describe all of Facebook’s data requirements, yet simple enough for their product developers to learn and adopt. The idea grew from a declarative API implementation into a query language, and then into a complete, standardized GraphQL specification. Today it powers billions of API requests and is widely adopted by big tech companies such as Netflix and GitHub in their production applications.
REST and its shortcomings
REST (REpresentational State Transfer) has become one of the most widely adopted API architectural styles. The key abstraction that REST-compliant systems, often called RESTful systems, are based on is the resource. Any piece of information that can be identified and named is a resource, e.g. a document, an image, user data, etc. Since REST is resource oriented, it uses a resource identifier, the URI (Uniform Resource Identifier), to identify the particular resource involved in an interaction (accessing, modifying) with the system. Resources are acted upon through a small set of simple, well-defined operations, and the client and server exchange representations of these resources over a standard interface and protocol (HTTP).
For example, an HTTP GET to a particular URI fetches a resource (sometimes also referred to as an object) and returns a server-specified set of fields, an HTTP PUT edits a resource, and an HTTP DELETE deletes it. Consider a blog resource:
- to access all blogs: /blogs
- to access the blog identified by id 27: /blogs/27
The various operations on a blog would then be:
- GET /blogs : a collection of blogs
- POST /blogs : create a blog
- PUT /blogs/27 : update blog identified by id 27
- DELETE /blogs/27 : delete blog identified by id 27
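To make the mapping between HTTP verbs and resource operations concrete, here is a minimal sketch of a router for the blog resource. It is an in-memory toy, not a real web framework: the `handle` function, the `blogs` store and the status codes it returns are assumptions made for illustration.

```python
# In-memory stand-in for a blog store; the ids and titles are made up.
blogs = {27: {"id": 27, "title": "Why GraphQL?"}}


def handle(method, path, body=None):
    """Dispatch a (method, path) pair roughly the way a tiny REST router might."""
    parts = [p for p in path.split("/") if p]
    if parts[:1] != ["blogs"] or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # collection: /blogs
        if method == "GET":
            return 200, list(blogs.values())
        if method == "POST":
            new_id = max(blogs, default=0) + 1
            blogs[new_id] = {**(body or {}), "id": new_id}
            return 201, blogs[new_id]
        return 405, None

    # single resource: /blogs/<id>
    if not parts[1].isdigit() or int(parts[1]) not in blogs:
        return 404, None
    blog_id = int(parts[1])
    if method == "GET":
        return 200, blogs[blog_id]
    if method == "PUT":
        blogs[blog_id] = {**(body or {}), "id": blog_id}
        return 200, blogs[blog_id]
    if method == "DELETE":
        del blogs[blog_id]
        return 204, None
    return 405, None
```

Each operation is tied to one URI shape and returns whatever the server decided to put in the resource, which is exactly the property the rest of this post pokes at.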
The resource-based approach is a great one and works well in many situations, but by its very nature it comes with downsides, and it becomes more difficult and complicated to deal with as systems grow. Let’s consider a situation where we have a blog application as defined above, with features where users can add blogs, comment on various blogs and so on…
Now, consider our application needs us to query all the blogs and comments that were posted by a user.
With a REST API we definitely have to deal with multiple endpoints and gather the required data ourselves. In our example these could be the /users/id endpoint to fetch the initial user data, next /users/id/blogs to fetch all the blogs that the user posted, and lastly /blogs/id/comments to get all the comments of the respective blogs.
Now, since the data we requested is accompanied by other unnecessary data, we have to filter it before we get the data we actually wanted. If this functionality is used often, we will probably have to create a separate endpoint for it, which brings its own challenge of deciding what the route should be. If our application had many such scenarios… deciding on and implementing REST endpoints would get more and more complicated and quite laborious🥴, and now REST is no more RESTful🥱.
Every time you want to build a new screen for your application, you have to go writing a new API on the backend to serve the needs of that screen. Most of it is boilerplate and none of it is reusable, and you end up closely entangling your backend and your frontend application when you do that.
In GraphQL, on the other hand, we’d simply form a single query describing our concrete data requirements and send it to the GraphQL API. Unlike REST, in GraphQL every query is sent to the same endpoint, typically as a POST request whose body describes the shape of the data we want. In our scenario above we could fetch the data we require with roughly the following query.
query FetchUserBlogsAndCommentsQuery {
  user(id: "123") {
    id
    username
    blogs {
      title
      description
      tags
      comments {
        author {
          username
        }
        text
      }
    }
  }
}
Then the server responds with a JSON object where these requirements are fulfilled.
{
  "data": {
    "user": {
      "id": "123",
      "username": "John Doe",
      "blogs": [
        {
          "title": "Why graphql??",
          "description": "GraphQL was developed to cope with the need for more flexibility!",
          "tags": ["GraphQl", "REST", "API"],
          "comments": [
            {
              "author": {"username": "Jane Doe"},
              "text": "This blog clearly explains why graphql was needed, Helped a lot!!"
            }
          ]
        }
      ]
    }
  }
}
As seen above, with GraphQL the client can specify exactly the data it needs in a query. Notice that the structure of the server’s response precisely follows the nested structure defined in the query. With this, the application logic stays simple and the code in the browser gets exactly the data it needs with a single query. With GraphQL we can better REST😉.
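For the curious, here is a minimal sketch of what "a single POST to a single endpoint" can look like from the client side, using only Python's standard library. The URL is a placeholder, and the `user(id: ...)` field with an `ID!` argument is an assumption about a hypothetical schema, not something a real server is guaranteed to expose.

```python
import json
import urllib.request

# Hypothetical endpoint; a GraphQL server usually exposes one URL for everything.
GRAPHQL_URL = "https://example.com/graphql"

# The user id is passed as a variable instead of being hard-coded in the query.
QUERY = """
query FetchUserBlogsAndComments($id: ID!) {
  user(id: $id) {
    id
    username
    blogs {
      title
      comments { text author { username } }
    }
  }
}
"""


def build_request(query, variables):
    """Wrap the query and its variables in the JSON body commonly used over HTTP."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(QUERY, {"id": "123"})
# urllib.request.urlopen(request) would actually send it; that call is left out
# here because the URL above is only a placeholder.
```

Whatever screen we build next, only the `QUERY` string changes; the endpoint and the transport stay the same.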
Smaller precise payloads
One of the most common problems with REST is over- and under-fetching of data, which also forces the client to filter out unnecessary data. This mainly happens because the only way for a client to query data is to hit multiple endpoints that each return a fixed shape of data. It is very difficult to design an API that provides precisely the data each client requires. GraphQL queries, by contrast, return predictable results: the response mirrors the query. Applications using GraphQL can be faster and more stable because they control the data they get.
Over-fetching leads to downloading more data
Over-fetching means the client downloads more data than the application actually requires. As in the example above, if we hit the /users/id endpoint to get only the id, username and email, it would also return all the other data belonging to the user, such as DOB, address, location etc., which was in fact not necessary. Over-fetching leads to larger network payloads and hence longer load times.
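A rough way to see the cost is to compare the serialized size of a full record against the three fields we actually wanted. The record below is made up, and `select_fields` is just a helper invented for this sketch.

```python
import json

# Made-up user record standing in for what GET /users/<id> might return.
full_user = {
    "id": "123",
    "username": "John Doe",
    "email": "john@doe.com",
    "dob": "1990-01-01",
    "address": "1 Example Street",
    "location": "Nowhere",
}


def select_fields(record, wanted):
    """Keep only the fields the client asked for."""
    return {key: record[key] for key in wanted if key in record}


needed = select_fields(full_user, ["id", "username", "email"])

full_size = len(json.dumps(full_user).encode("utf-8"))
needed_size = len(json.dumps(needed).encode("utf-8"))
print(f"full record: {full_size} bytes, needed fields: {needed_size} bytes")
```

The gap is tiny for one user, but it is paid on every request and grows with every field the server adds to the resource.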
Under-fetching and the N+1 problem
The other issue is under-fetching, which means that the endpoint we hit doesn’t provide enough of the data we actually require for our application. Situations like these result in the N+1 problem, where a client has to make further requests to other endpoints to gather the data it needs. As mentioned above in the blog list scenario, we first have to fetch the user details, then query the blogs endpoint, and then, for every blog in the returned list, make an additional request to the /blogs/id/comments endpoint to get its comments along with the comment authors’ usernames.
Because of these multiple round-trips and over/under-fetching, applications built on REST often end up growing ad hoc endpoints. These couple the data to a particular view. For applications with many views this can very easily turn into a maintenance nightmare of massive code duplication and orphaned endpoints, leaving us with inconsistency.
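The round-trips add up quickly. The sketch below fakes the three REST endpoints from the example with in-memory dictionaries and simply counts how many calls it takes to assemble one user's blogs and comments. All data and the `rest_get` helper are invented for illustration; no real HTTP is involved.

```python
# Fake data; user ids, blog ids and comments are invented for illustration.
USERS = {"123": {"id": "123", "username": "John Doe"}}
BLOGS_BY_USER = {
    "123": [{"id": "1", "title": "A"}, {"id": "2", "title": "B"}, {"id": "3", "title": "C"}]
}
COMMENTS_BY_BLOG = {"1": [{"text": "Nice"}], "2": [], "3": [{"text": "Thanks"}]}

request_count = 0


def rest_get(path):
    """Pretend HTTP GET against the endpoints named in the text; counts each call."""
    global request_count
    request_count += 1
    parts = path.strip("/").split("/")
    if parts[0] == "users" and len(parts) == 2:
        return USERS[parts[1]]
    if parts[0] == "users" and parts[2:] == ["blogs"]:
        return BLOGS_BY_USER[parts[1]]
    if parts[0] == "blogs" and parts[2:] == ["comments"]:
        return COMMENTS_BY_BLOG[parts[1]]
    raise KeyError(path)


def load_profile(user_id):
    user = rest_get(f"/users/{user_id}")                               # 1 request
    blogs = [dict(b) for b in rest_get(f"/users/{user_id}/blogs")]     # 1 request
    for blog in blogs:                                                 # N requests
        blog["comments"] = rest_get(f"/blogs/{blog['id']}/comments")
    return {**user, "blogs": blogs}


profile = load_profile("123")
print(request_count)  # 1 user + 1 blog list + 3 comment lists = 5
```

With a GraphQL query the same data would travel in one request, no matter how many blogs the user has.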
Benefits of a Schema and a Strict Type System
GraphQL uses a strong type system. All the types and shapes of data exposed by an API are defined in a schema, written in the GraphQL schema definition language (SDL). This schema serves as the contract for what operations look like and is used to determine whether a query is valid. The schema not only defines the resources available for retrieval but also the parameters accepted when fetching them. GraphQL APIs are organized in terms of types and fields, not endpoints, so clients can only ask for what is possible to return.
Once the schema is defined, frontend and backend teams can work against the API with far less back-and-forth, as both are aware of the shape of the data they are dealing with.
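To illustrate the "clients can only ask for what is possible to return" part, here is a toy validator. It is a deliberately simplified sketch and not how a real GraphQL server validates queries: the schema is a plain dictionary, queries are nested dictionaries instead of parsed query text, and lists, arguments and non-null types are ignored.

```python
# Toy schema: type name -> field name -> type of that field.
# Any type name without its own entry (ID, String) is treated as a scalar.
SCHEMA = {
    "Query": {"user": "User"},
    "User": {"id": "ID", "username": "String", "blogs": "Blog"},
    "Blog": {"title": "String", "comments": "Comment"},
    "Comment": {"text": "String", "author": "User"},
}


def validate(selection, type_name="Query", path=""):
    """Return error messages for selections the schema cannot satisfy."""
    errors = []
    fields = SCHEMA.get(type_name, {})
    for name, sub in selection.items():
        here = f"{path}.{name}" if path else name
        if name not in fields:
            errors.append(f"Cannot query field '{here}' on type '{type_name}'")
            continue
        field_type = fields[name]
        is_object = field_type in SCHEMA
        if is_object and not sub:
            errors.append(f"Field '{here}' of type '{field_type}' needs a sub-selection")
        elif not is_object and sub:
            errors.append(f"Field '{here}' is a scalar and cannot have a sub-selection")
        elif is_object:
            errors.extend(validate(sub, field_type, here))
    return errors


# Leaf fields are written as empty dicts.
valid = {"user": {"id": {}, "username": {}, "blogs": {"title": {}}}}
invalid = {"user": {"id": {}, "salary": {}}}

print(validate(valid))    # []
print(validate(invalid))  # ["Cannot query field 'user.salary' on type 'User'"]
```

The useful part is that the bad query is rejected before any data is fetched, and the error names the exact field and type involved.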
Documentation
Documentation is a first-class feature of a GraphQL system. To keep a service’s documentation consistent with its capabilities, descriptions are written right alongside the type and field definitions in the schema and are made available via introspection, which keeps them easy to discover and read. This is a pretty powerful capability that allows service designers to describe and publish their documentation alongside their implementation.
No more versioned APIs
Evolving a REST API is a challenge in itself. With newer implementations the endpoints have to be swapped, but already deployed clients must not break. With rapid release cycles and backward compatibility guarantees, applications end up with a large number of API versions, and under such constraints it is difficult to remove data from a custom endpoint. As a result, the payloads of these custom endpoints tend to grow monotonically as the server evolves.
With GraphQL we can add new fields and types without impacting existing queries. Aging fields can be deprecated and hidden from tooling. By using a single evolving version, GraphQL APIs give our application continuous access to newer features and encourage cleaner, more maintainable API implementations.
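Why do additive changes not break old clients? Because the response is built from the fields the client named, not from everything the server knows. The sketch below mimics that with a tiny `resolve` function over nested dictionaries; it is an illustration of the idea only, and the `avatarUrl` and `readingTimeMinutes` fields are made up.

```python
def resolve(record, selection):
    """Project a nested record down to the fields named in the selection."""
    if isinstance(record, list):
        return [resolve(item, selection) for item in record]
    result = {}
    for name, sub in selection.items():
        value = record[name]
        # An empty sub-selection marks a leaf field.
        result[name] = resolve(value, sub) if sub else value
    return result


# The query an old client shipped with.
selection = {"username": {}, "blogs": {"title": {}}}

user_v1 = {"username": "John Doe", "blogs": [{"title": "Why GraphQL?"}]}

# Later the server adds fields; the old selection never mentions them.
user_v2 = {
    "username": "John Doe",
    "avatarUrl": "https://example.com/a.png",
    "blogs": [{"title": "Why GraphQL?", "readingTimeMinutes": 6}],
}

# The old client sees exactly the same response before and after the change.
assert resolve(user_v1, selection) == resolve(user_v2, selection)
```

Removing or renaming a field is still a breaking change, which is why deprecating first and removing later is the usual path.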
Better performance and rapid frontend iteration
A common practice with REST APIs is to structure the endpoints according to the frontend views. This comes in handy since it allows clients to get all the data required for a particular view by simply accessing its corresponding endpoint. However, there is a major drawback to this approach: it doesn’t allow views to evolve or the frontend to iterate rapidly. With every change we make to our views we run the risk that the view now needs more or less data than before, so adapting to an evolving UI becomes quite difficult. The backend has to adapt as well and account for the new data needs. This kills development productivity, and together with over/under-fetching it notably slows down performance.
GraphQL was developed to cope with the need for more flexibility and efficiency! It solves many of these shortcomings. Thanks to its flexible nature and type system, clients can specify exactly what they need, in most cases without anyone having to adjust the underlying API.
REST is GOOD GraphQL is BETTER
It’s not that REST is completely bad and is going to die soon. It’s just that REST is an architectural style rather than a formal protocol, and there is a lot of debate about what exactly REST is and what it is not. GraphQL represents a novel way of structuring the client-server contract. The server publishes an application-specific type system, and GraphQL provides a unified language to query data within the constraints of that type system. That language allows developers to express their precise data requirements in a declarative and hierarchical form. Long live REST.
GraphQL was designed with the shortcomings and pain of designing, scaling and maintaining REST APIs in mind. With better performance, precise payloads, adaptability, a stronger type system and a declarative approach to data fetching, GraphQL is the better REST. Since its public release in 2015, GraphQL has quickly matured to the point where it can be adapted to nearly any infrastructure and architecture. The community, the related software ecosystem and company adoption have all grown at breakneck speed. Companies such as GitHub, Netflix, Airbnb, Coursera, New York Times, Shopify and many more have famously adopted GraphQL in their tech stacks.
📚 Further reading and resources
- GraphQL documentary explores the story of why and how GraphQL came to be and the impact it’s having on big tech companies worldwide, including Facebook, Twitter, Airbnb and GitHub.
- GraphQL specification
- A workshop-style tutorial with clear, organized explanations from Eve Porcello
- Exploring GraphQL: A Query Language For APIs: A free 7 week edX course, developed by Linux Foundation Training
- How to GraphQL: The Fullstack Interactive Tutorial for GraphQL
- Our learnings from adopting GraphQL: a blog post on Netflix’s adoption of GraphQL
- How Facebook organizes their GraphQL code