147. Client Side Caching

Caching is notoriously difficult. In fact, according to Dave, it might be one of the two (or three) hardest problems in computer science. In this episode, dedicated to all the cache money millionaires, we are talking about client-side caching. We look at how it’s different from server-side caching and why, if you want to improve user experience, it’s the way to go. While client-side caching only helps the client currently using the machine, there are some definite benefits that we walk through, particularly on larger websites. We also look at some of the tools that can help with client-side caching. Once again, it’s an opportunity for Dave to talk about GraphQL. The man can’t get enough, and it shows! Along with this, we talk about how client-side caching pretty much saves the planet, some nifty tools that are coming for all of our caches, and much more. Tune in today!

Key Points From This Episode:

A quick look at server-side caching and how client-side caching is different.
Why someone would do client-side caching and some of the great benefits of it.
Dave’s different experiences working on client-side caching.
Some of the reasons that caching can get so difficult.
How cache-control directives and ETags are useful for client-side caching.
Client-side caching is especially useful for large websites.
Why it’s good to have robust client-side caching.
The common bugs that are related to caching and ways GraphQL deals with them.
A look at the Vary Header, what it does, and William’s experience with it.
Cool features like garbage collection, cache eviction, and cache retention in Apollo Client 3.0.

Transcript for Episode 147. Client Side Caching

[INTRODUCTION]

[0:00:01.9] MN: Hello and welcome to The Rabbit Hole, the definitive developer’s podcast in fantabulous Chelsea Manhattan. I’m your host, Michael Nunez. Our co-host today.

[0:00:09.8] DA: Dave Anderson.

[0:00:10.8] MN: Our producer.

[0:00:12.0] WJ: William Jeffries.

[0:00:12.2] MN: Today, we’ll be talking about client-side caching. Before we begin, I just wanted to point out that Jeffries is doing some recording remotely, flying in the skies. Not really right now but he is currently traveling at the moment and he took some time to meet with us for the podcast. Never lets down.

[0:00:31.3] WJ: Hopefully my audio is okay. I’m recording from the lounge in the airport. I’m on like a 10-hour layover right now.

[0:00:40.9] MN: So nice of you to drop on. You’re probably like dead tired and can’t –

[0:00:45.6] WJ: I’m super loopy. I’m not going to make it.

[0:00:49.0] DA: It’s going to be great audio. Great material.

[0:00:51.2] MN: Great content right here, ladies and gentlemen.

[0:00:53.5] DA: Please don’t sell the lounge, it is an executive suite, right? Yeah.

[0:01:00.0] MN: Dave, as I’m calling him, the cache money millionaire, because he brought up the topic of wanting to talk about client-side caching.

[0:01:07.2] DA: Yeah, so much cache.

[0:01:08.0] WJ: Ka-ching.

[0:01:09.0] MN: Ka-ching. Got the Ka-ching.

[0:01:11.9] DA: It’s just data.

[0:01:14.0] MN: Let’s dive right into it. First off, what is client-side caching, why would you do it?

[0:01:19.5] DA: Yeah, that’s a great question. I guess there’s a lot of different forms of caching. Caching is like famously one of the two hard problems in computer science, right? Or is it three? I can never remember.

[0:01:31.2] MN: One is always naming variables, that one’s hard.

[0:01:36.1] DA: I know off-by-one errors is one of them. But then I don’t remember if there’s three and then there’s an off-by-one error. Or, if there’s two and an off-by-one error, which I guess in itself is an off-by-one error.

[0:01:43.8] MN: You’re always off by one, bro.

[0:01:46.8] DA: Right. Unless I’m off by two.

[0:01:47.8] WJ: You had one error in this joke.

[0:01:49.6] MN: Yeah, exactly.

[0:01:50.5] DA: Right, pretty much. Yeah, my brain’s melting.

[0:01:55.6] MN: Yeah, but caching I’m sure is the final one I’m going to say.

[0:01:58.6] DA: Yeah. I’m sure many people are familiar with server-side caching, where you might have a document that, you know, you access over HTTP, and that document doesn’t change, and you have to get it out of the database or do some calculation to get the document, like from a REST API. Then you might just cache the response, so say, like, book one is always going to be book one.

I’m just going to cache it, maybe through Redis or something and just send that back on its way every single time so you get faster response time.

[0:02:34.5] MN: That’s server-side?

[0:02:35.3] DA: That’s server-side caching. For client-side caching, it’s a little bit different because server-side caching benefits everybody who accesses the application. Client-side caching only benefits the user in their browser because it’s basically the same idea as usual caching: you’re storing a representation of the data. But you’re doing it all locally.

[0:02:58.2] MN: Right.

[0:02:58.9] DA: In the browser memory.

[0:03:00.5] MN: Right. The client-side caching is good for the user who is currently caching the information on their machine.

[0:03:08.9] DA: Right. So, when the user gets that book number one resource, they could cache that in memory and then if you go to another page that has book one then you don’t have any load time and that’s pretty nice.
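A minimal sketch of the in-memory client cache Dave describes, written in Python rather than browser JavaScript to keep it self-contained. The `ClientCache` class, the `fake_fetch` function, and the `"book/1"` key are all made up for illustration; `fake_fetch` stands in for a real HTTP request.

```python
from typing import Callable, Dict


class ClientCache:
    """Tiny in-memory cache keyed by resource name, e.g. "book/1"."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch              # hypothetical network call
        self._store: Dict[str, dict] = {}
        self.network_calls = 0

    def get(self, key: str) -> dict:
        # Serve from memory when we already hold the resource.
        if key in self._store:
            return self._store[key]
        # Otherwise pay for one network round trip and remember the result.
        self.network_calls += 1
        self._store[key] = self._fetch(key)
        return self._store[key]


def fake_fetch(key: str) -> dict:
    # Stand-in for an HTTP request to a REST API.
    return {"id": key, "title": "Book One"}


cache = ClientCache(fake_fetch)
first = cache.get("book/1")    # goes to the "network"
second = cache.get("book/1")   # served from memory, no load time
```

The second page that needs book one gets the stored object back without another request, which is the whole benefit being described.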

[0:03:23.9] MN: Is it resourceful to build one over the other?

[0:03:26.8] DA: It’s different effort to build one cache versus the other cache, for sure. But like, I mean, if you’re going for a good user experience then client-side caching is a pretty sweet deal. Well, it may reduce your bill as well because like maybe there’ll be fewer requests to the server for like redundant data.

Another benefit of caching, besides the improved perceived application performance, is that I can normalize the data that comes back. You know, some REST APIs aren’t pure about the definition of a resource: my book might have nested author information, and I might be making changes to that on the client side, like changing who the author is.

There might be a lot of different places that that author name could end up in my application. Especially like you know, if you have a very complicated page and it’s a huge book store that you’re managing and you know, there’s a list page and a detail page and if I update that author name on the detail page then I’d hope that like it would show up in the list page as well.

[0:04:39.3] MN: Right, and it would update accordingly depending on where you did the update for the author in your example.

[0:04:44.9] DA: Right, exactly. If you normalize the cached response, so, you break apart all of those little tiny nested pieces into entities, flatten it out and make note of what the relationships between them are, then you can more easily have those changes to the author name propagate to different places.
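The normalization Dave describes can be sketched roughly as follows. This is an illustration under assumptions, not how any particular library does it: the payload shape, the `"Type:id"` key format, and the helper names `normalize_book` and `author_name_for` are all invented here.

```python
from typing import Dict

Entities = Dict[str, dict]


def normalize_book(book: dict) -> Entities:
    """Split a nested book payload into flat, ID-keyed entities."""
    author = book["author"]
    author_key = f"Author:{author['id']}"
    book_key = f"Book:{book['id']}"
    return {
        author_key: dict(author),
        # The book keeps a reference to the author, not a copy of it.
        book_key: {"id": book["id"], "title": book["title"], "author": author_key},
    }


def author_name_for(entities: Entities, book_key: str) -> str:
    """Follow the reference from a book to its author."""
    return entities[entities[book_key]["author"]]["name"]


store: Entities = {}
store.update(normalize_book(
    {"id": 1, "title": "Book One", "author": {"id": 7, "name": "A. Writer"}}))
store.update(normalize_book(
    {"id": 2, "title": "Book Two", "author": {"id": 7, "name": "A. Writer"}}))

# One write to the author entity is visible from every book that references it.
store["Author:7"]["name"] = "Ann Writer"
```

Because both books point at the same `Author:7` entry, a list page and a detail page reading from this store would both see the new name.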

[0:05:08.9] MN: I imagine you’ve had to do some client-side caching recently which just brought up the conversation as of late?

[0:05:16.0] DA: Yeah. I was thinking a lot more about client-side caching because you guys know me, I love GraphQL and you know?

[0:05:27.1] MN: William. This is another GraphQL episode.

[0:05:28.5] DA: My god I was trying to pull a fast one.

[0:05:32.6] MN: He pulled a fast one on us.

[0:05:34.3] DA: I’m trying not to use the word GraphQL but there it was.

[0:05:39.5] MN: Sure.

[0:05:40.1] DA: How many minutes did I make it?

[0:05:43.8] MN: Probably like five.

[0:05:43.9] DA: All right, okay. I did my best. It’s okay. You know, I was thinking about like my different experiences like working with client-side cache and this different concept of like normalization. And React and Apollo and GraphQL and how that kind of, is a service that’s provided for you.

[0:06:06.2] MN: Right.

[0:06:06.7] DA: But just because it’s provided for you, doesn’t mean that you get away completely free because you still kind of have to understand what the heck’s going on under the hood? And it’s a little bit complicated.

[0:06:18.5] MN: Is that when using Apollo or in general you think?

[0:06:21.9] DA: I think in general, yeah. I mean, if you’re building this complicated system, this data layer to manage your data from the API and you know, you have all these like rich interactions that you're doing on it and manipulating the information and it gets pretty hard, there’s a reason why it’s in that number of hard problems in computers.

[0:06:46.5] WJ: Yeah. It’s funny. I actually didn’t think of Apollo at all when you said cache, like client-side caching, I was thinking of like cache-control directives, ETags and stuff like what you get from the browser.

[0:06:58.7] DA: Yeah, that’s like another thing on top of that even like, because you can tell the browser, “Hey. Don’t re-fetch this if you know, this cache header is still valid.”

[0:07:13.0] WJ: Yeah, you can set like a max age or different directives. And then the browser does a lot of automatic caching for you, it just keys everything by an ETag. And then even if the cache expires, you can make a request that’s abbreviated. The server can respond with 304 Not Modified, instead of sending over the whole payload again, so it’s better for bandwidth and data usage.

[0:07:43.0] MN: Does it just like give it a new much later expiry date with the same information for that cache?

[0:07:50.0] WJ: If you set like a two-minute time to live then it will have that same, however much you set last time it will extend it for that same amount of time.
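The exchange above (max age, ETags, 304 Not Modified, and the lifetime being extended on revalidation) can be simulated with a toy client and server. This is only a sketch: real browsers implement far more of the HTTP caching rules, and `server_get`, `BrowserCache`, and the 120-second lifetime are assumptions made for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: str) -> str:
    # Any stable fingerprint of the body works; a hash is one common choice.
    return '"' + hashlib.sha256(body.encode()).hexdigest()[:16] + '"'


def server_get(body: str, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], str]:
    """Toy origin server: returns (status, headers, body)."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=120"}
    if if_none_match == etag:
        return 304, headers, ""      # Not Modified: no payload sent
    return 200, headers, body


class BrowserCache:
    """Toy client cache holding one response with its ETag and expiry time."""

    def __init__(self) -> None:
        self.body: Optional[str] = None
        self.etag: Optional[str] = None
        self.expires_at = 0.0

    def fetch(self, origin_body: str, now: float) -> Tuple[str, str]:
        if self.body is not None and now < self.expires_at:
            return "fresh-hit", self.body            # no request at all
        status, headers, body = server_get(origin_body, self.etag)
        max_age = int(headers["Cache-Control"].split("=")[1])
        self.expires_at = now + max_age              # lifetime restarts
        if status == 304:
            assert self.body is not None
            return "revalidated", self.body          # reuse the stored copy
        self.body, self.etag = body, headers["ETag"]
        return "downloaded", body


cache = BrowserCache()
r1 = cache.fetch("hello", now=0.0)       # first visit: full download
r2 = cache.fetch("hello", now=60.0)      # still within max-age
r3 = cache.fetch("hello", now=200.0)     # expired, but the ETag still matches
r4 = cache.fetch("hello v2", now=400.0)  # expired and the content changed
```

The third call is the abbreviated request William mentions: the client sends only the ETag, the server answers 304 with an empty body, and the stored copy gets another two minutes.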

[0:08:01.7] DA: Yeah, that stuff is pretty cool. Really important when you have a high-traffic website, you know? The New York Times is on top of that. All those big websites are managing that, especially with other resources too, like JavaScript bundles and CSS; those can be huge. And even REST resources too. If you’re like a baller API resource guy then you know, you’re all about those cache headers, as well as caching the response like I mentioned earlier on the server side.

[0:08:39.3] MN: Shout out to all the cache money millionaires out there, saving cache by using cache.

[0:08:44.7] DA: Right, yeah. I mean, that’s really like electricity you’re saving.

[0:08:50.1] MN: Yeah.

[0:08:50.6] DA: You're saving the planet.

[0:08:52.2] MN: Yeah.

[0:08:52.6] DA: By using cache. But you know, it is –

[0:08:56.6] WJ: I guess Redux is kind of a data cache, right? I mean, data stores in general are kind of caches.

[0:09:02.9] DA: Yeah, if you’re using a REST API like you might get the API response and just shove it in Redux and then have it for later.

[0:09:11.3] MN: Right.

[0:09:12.1] DA: That’s like the simplest form.

[0:09:13.8] WJ: Apollo does.

[0:09:15.2] DA: Yeah, I mean, it’s kind of neat because GraphQL has this schema built into it and so that thing that I was talking about before like normalizing the API response is something that can be done programmatically because it just knows what the schema looks like and so it knows that this particular object is this kind of type and it has an ID. And then it can just you know, tear everything apart, normalize it, flatten it out. And you know, if you use Apollo dev tools, it’s really nice like they have a feature where you can actually peek into the cache and see what’s going on.

I remember opening it the first time and I was like, “Oh my god, what is happening right here?” There’s just like so much going on, really big page and there’s a lot of objects here, and they all just get shoved down there at, like, one level. You know, like every file on your computer in one directory or something.

[0:10:17.7] MN: Man. It is a little intimidating and you can’t really change it which is kind of surprising.

[0:10:25.7] DA: Yeah, you can use the cache API to read from it like read a fragment and then write back so that’s like definitely doable but you know?

[0:10:36.5] WJ: You just can’t do it from the dev tools.

[0:10:38.3] DA: Yeah, the dev tools don’t provide that, that sounds like a pretty dope feature though.

[0:10:42.3] WJ: Yeah. I was thinking about requesting that.

[0:10:45.6] DA: Just to be able to like poke around with the page. I guess you can do that with react-devtools though if you want it to be creative.

[0:10:56.7] WJ: My use case for it is like the actual server, your local development server is down or you know, busted in some annoying way that I don’t want to go and fix. And you know, I’m working on a feature and I want to be able to just have the expected response in cache be right or have whatever the change is that I’m expecting and then not have to have the server online at all.

[0:11:22.6] DA: Okay. Just mocking it out.

[0:11:25.0] WJ: Yeah. Maybe that’s an abuse of the cache. Cache abuse.

[0:11:31.9] DA: There’s some tools out there that lets you like, I guess there’s still servers there that are just like mock servers so –

[0:11:37.4] WJ: Yeah.

[0:11:39.3] MN: Be really interesting for client-side. I just had a thought like client-side caching would definitely help individuals who are running applications on like Lambdas because you don’t – you would rather have all that cached on the client rather than have to constantly hit the Lambdas and end up having to pay for it I guess.

[0:11:57.1] DA: Right, I mean I guess that is the really real cost, there’s like, “Oh my god if you call my function it is going to cost me a 100th of a penny or something like don’t call my function please.”

[0:12:06.4] MN: I mean if you have a big website and it is server-less, then it is going to cost a lot of money to have it on their end where it’s pretty much free I guess because they don’t need to do that.

[0:12:15.8] DA: Yeah. Definitely and like going back to what William was saying about like cache-control headers, there are more tools available to you to tweak caching with a REST API than with a GraphQL API. So, it’s really nice to have this really robust client-side caching as a tradeoff. Like you don’t get this fine-grained control like cache headers because you only have one API endpoint and it just always returns crazy stuff like whatever you ask for, it just sends back.

[0:12:55.0] MN: Oh man.

[0:12:56.3] DA: But getting that normalization that gives you that perceived application performance and like the really smooth update of data in different places, it’s pretty nice because that kind of stuff on the client-side could be like thousands of lines of code and you know just delete them. It’s fine.

[0:13:20.3] WJ: Although, I hear if you can find a way to make your GraphQL requests GET requests instead of POST requests and not have that break everything, then you can get a lot of free caching.

[0:13:35.2] MN: Yeah. That is yeah. Good old GraphQL. It’s POST all the things.

[0:13:42.4] DA: Yeah, 200 errors, too right?

[0:13:45.4] MN: Oh man. That sounds fun. Everything is okay because it’s not.

[0:13:55.4] DA: So, don’t worry about it.

[0:13:56.7] WJ: This is fine.

[0:13:59.0] DA: Exactly. But I don’t know, I think it is kind of interesting. All of a sudden it is like although you have this really powerful tool that helps you out in a big way, it helps you be more productive and gets you further, it can be a bit more mysterious than if you wrote all of those thousands of lines of code. Because like you can – well maybe they’re hard to reason about as well but it is something that is out in plain sight for you to look at and poke at.

[0:14:27.1] MN: Right.

[0:14:27.3] DA: I mean there are different kinds of bugs that come up related to caching especially when you start changing data. Like you might send a request.

[0:14:37.1] WJ: What kinds of bugs?

[0:14:37.7] DA: You might send a request to update that author name and it just doesn’t do anything like when the response comes back. So, Apollo provides some tools where you can tell it, “Okay, I just want you to re-fetch this query when I finish this update operation.” So it is like the easiest thing you can do is you just give it a string with the name of the query and it will just re-fetch it like it is very simple.

[0:15:07.7] MN: Yeah, it sounds very straightforward like, “Hey, I finished this action, re-fetch the thing that I am telling you to re-fetch,” and that’s it and it just does it.

[0:15:17.5] DA: Right and you know you do the same thing with REST APIs too. You’re like do the action and it’s like just blow everything away and just start over again, just re-fetch it, put it in there and we’ll see where everything lies. You can also write an update function, which takes the mutation response and figures out a way to read the data out of the cache, make some change with the response and then write it back in but it does feel kind of like open heart surgery. It feels like you’re doing something that is a little risky.

[0:15:50.3] MN: Yeah, it sounds like magic at first. But now it is like black magic.

[0:15:54.5] DA: Right and I mean you can do that too with a normalized cache and Redux like you can read the data out like when you get a response from an API, write a function that mutates it in some way and writes it back into Redux. My favorite option is to just design the payload of the response in such a way that like the Apollo cache is just able to figure out what happened off the bat.

Because it just has some rules of thumb for like how it’s going to merge data back into the cache where if you have an ID on the object that you sent back and the type name, which is always included in Apollo’s query by default, then it will be able to match that with whatever is already existing and then merge and update the fields that were sent back. So, when you do that then, “Hey, then it just works!” And like if you think about it, it makes sense.
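The rule of thumb Dave describes, matching on type name plus ID and updating only the fields that came back, might look something like this. It is a simplified sketch, not Apollo's actual merge logic; `cache_key` and `merge_response` are hypothetical helpers, and `__typename` is the GraphQL meta field that carries the type name.

```python
from typing import Dict, Optional

Cache = Dict[str, dict]


def cache_key(obj: dict) -> Optional[str]:
    """Build "Type:id" when both identifying fields are present."""
    if "__typename" in obj and "id" in obj:
        return f"{obj['__typename']}:{obj['id']}"
    return None


def merge_response(cache: Cache, obj: dict) -> bool:
    """Merge returned fields into the matching entity; report whether it worked."""
    key = cache_key(obj)
    if key is None:
        return False                       # nothing to match against
    cache.setdefault(key, {}).update(obj)  # only the sent fields change
    return True


cache: Cache = {
    "Author:7": {"__typename": "Author", "id": 7, "name": "A. Writer", "bio": "Writes."}
}
merged = merge_response(cache, {"__typename": "Author", "id": 7, "name": "Ann Writer"})
unmatched = merge_response(cache, {"name": "No Identity"})
```

The first response carries enough identity to land on the existing entry, so the name changes and the untouched `bio` field survives. The second has no ID or type name, which is the case where an update appears to do nothing.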

[0:16:56.5] WJ: This reminds me of the Vary Header, where you specify what parts of the HTTP request to pay attention to when deciding whether or not you have a cache hit.

[0:17:10.6] DA: So how do you use the Vary Header? What are the different parameters that go into it?

[0:17:16.0] WJ: So, you can give it like a comma separated list of all of the header names for the different headers that you want it to cache on. So like user agent for example, you can say like vary on the user agent and then that way it will never cache – well I mean I don’t know. That one would only make sense from the perspective of like a cache server like some intermediary, I think, is like presumably your browser isn’t going to change user agents while you’re on a website.

[0:17:47.5] MN: Yeah.

[0:17:48.7] DA: Right. Although I mean that does happen sometimes but like server side.

[0:17:51.4] WJ: If you like in solid font, yeah.

[0:17:53.8] DA: That happens sometimes but like on the server-side caching where the server provides a different asset to you, depending on what user agent you’re on, which always throws me off but that kind of makes sense.

[0:18:06.9] WJ: Yeah, I came across this when I was working with Fastly, and Fastly has a caching language called Varnish that allows you to do a lot of crazy, crazy caching stuff and they make very heavy use of the Vary Header. So, I remember I am pretty sure that we used user agent at some point for that but I think in this context when you are talking about caching for a specific user, you probably would want something more specific.

[0:18:34.3] DA: Right like the – I mean maybe part of the resource itself.

[0:18:38.7] WJ: Yeah. You might add your own header, like you could add an application-specific header just for caching and then that can be an arbitrary key. It could be whatever you want, so you cache on that, you know? In your example where you’re trying to cache a specific object, you can say, “Well you know, I really only care about the object’s ID.” So, if it’s got that same ID you cache it, or maybe you care about not just the ID but also a couple of key parameters, like, say, the author. You care about the author’s name.
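One way to picture what William is describing: a cache builds its lookup key from the URL plus whichever request headers the `Vary` value names. The function below is an illustrative sketch only, and `X-Cache-Key` is a made-up application header standing in for the custom header he mentions.

```python
from typing import Dict, Tuple


def vary_cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple[str, ...]:
    """Combine the URL with the request headers named in a Vary value."""
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple(f"{n}={lowered.get(n, '')}" for n in names)


vary = "User-Agent, X-Cache-Key"   # X-Cache-Key is a made-up application header
desktop = vary_cache_key(
    "/books/1", {"User-Agent": "desktop", "X-Cache-Key": "book-1"}, vary)
mobile = vary_cache_key(
    "/books/1", {"User-Agent": "mobile", "X-Cache-Key": "book-1"}, vary)
again = vary_cache_key(
    "/books/1", {"user-agent": "desktop", "x-cache-key": "book-1"}, vary)
```

Requests that differ in a listed header get different keys, so they never share a cached response, while requests that match on every listed header count as a cache hit.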

[0:19:06.9] DA: Right, if it doesn’t actually have an ID built into it. You are trying to work with what you have, you can try to build a composite key that makes sense. I got to bring it back to GraphQL because that’s just you know, that’s my MO. That’s my thing.

[0:19:23.6] MN: You do what you love bro. You got to do it yeah.

[0:19:26.7] DA: It was a bit challenging to customize like the key that you would use for caching a field in Apollo before the latest version, which is coming out soon, Apollo Client 3.0. They now let you define that kind of a custom key based upon different objects. Like if you didn’t have an ID then it would just guess based upon the path to the object. So, it will be like, “Okay, I know this object’s ID and its type name. And then I’ll just build a path.”

So, it’s like the book, book number one’s author zero like maybe there is two authors or so many authors. So, there’s authors zero or author one, it just does it by position but you know that makes it harder to work with and all of that. It is much easier if you have a key that makes sense. I mean there is lots of cool stuff in there like that is coming out like garbage collection and cache eviction, cache retention. So, it is actually getting a lot more mature.
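The difference between a positional key and a key built from meaningful fields can be sketched like this. It only illustrates the idea Dave describes: the function `entity_key`, its `key_fields` parameter, and the path format are hypothetical and are not taken from Apollo's API.

```python
from typing import Optional, Sequence


def entity_key(obj: dict, path: str, key_fields: Optional[Sequence[str]] = None) -> str:
    """Pick a cache key: custom fields first, then id, then position."""
    typename = obj.get("__typename", "Unknown")
    if key_fields and all(f in obj for f in key_fields):
        # Composite key built from fields the caller says identify the object.
        parts = ",".join(f"{f}={obj[f]}" for f in key_fields)
        return f"{typename}:{parts}"
    if "id" in obj:
        return f"{typename}:{obj['id']}"
    return path   # e.g. "Book:1.authors.0", tied to list position


author = {"__typename": "Author", "name": "Ann Writer", "born": 1970}
by_position = entity_key(author, "Book:1.authors.0")
by_fields = entity_key(author, "Book:1.authors.0", key_fields=["name", "born"])
```

The positional key changes if the author list is reordered and cannot be shared between two books, which is why a key that makes sense is easier to work with.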

[0:20:28.2] MN: What is the current implementation now though? Is it just – there is no garbage collection? Is just like this all exists on the client?

[0:20:36.9] WJ: Just letting it pile up forever.

[0:20:38.6] MN: Oh man.

[0:20:39.3] DA: Yeah, you got to close that browser tab sooner than later.

[0:20:42.2] MN: They are crawling like Staten Island there is just a ton of garbage just sitting on it.

[0:20:45.9] DA: It’s like you wonder what is up with Chrome like you know?

[0:20:49.1] WJ: Sarah Cynthia Sylvia Stout would not take the garbage out.

[0:20:54.2] MN: Oh, damn it, why don’t you take the garbage out? Get it together.

[0:21:00.9] WJ: I used a little Shel Silverstein reference.

[0:21:02.9] MN: Thank you Apollo Client 3.0 for taking out the garbage and managing it.

[0:21:08.5] DA: Yeah, I mean I think I took out the garbage like basically just by blowing up the cache.

[0:21:13.8] MN: Oh, that’s how you do it, you just dynamite?

[0:21:15.9] DA: Yeah just dynamite. Like it’s not like an elegant tool.

[0:21:21.6] MN: But it is awesome that this is like a growing feature in Apollo and they’re enhancing it so that developers could take advantage of the information being cached on the client-side. So, do you know when Apollo Client 3.0 is coming out?

[0:21:37.8] DA: It says the release candidate is October. So, I guess whenever they feel like it’s good.

[0:21:42.1] MN: So, let’s just say that. Coming soon TM.

[0:21:46.0] DA: Right no ear blast for it I mean it looks like the release candidate is out right now, they’re in beta. So, I think whenever they feel like they’ve squashed enough of those bugs it’s soon going to be out there.

[0:21:56.7] MN: Oh yeah, I mean if you’re a developer who’s brave enough to use the release version to figure out how garbage is being collected it would be really interesting.

[0:22:05.2] DA: Yeah, hopefully it is not trash.

[0:22:08.1] MN: Yeah, the trash all of Apollo Client though I highly doubt it. I mean a lot of people, a lot of developers and engineers are using it right now and it is the hot stuff. I mean it is with GraphQL, William he got us again to talk about GraphQL. Oh man once again. Oh yeah, I am. Looking forward to see how that changes developers and further pushing forward the client cache initiative for the cache money millionaires out there.

[0:22:37.9] WJ: Well speaking of cache key eviction I think I may be about to get evicted from this recording area.

[0:22:44.7] MN: Uh-oh well.

[0:22:46.3] DA: Yeah, don’t miss your flight. You don’t want to be trapped in New Delhi.

[0:22:50.8] WJ: Yeah. And also, I don’t actually have a recording studio here. It is literally just a section of the lounge that is normally closed off until after breakfast and it’s after breakfast.

[0:23:04.2] MN: There you go, uh-oh.

[0:23:05.9] DA: Get the heck out of here.

[0:23:08.5] MN: Yeah, it looks like you are getting evicted yourself.

[0:23:12.0] DA: Go eat an idli.

[0:23:14.0] MN: Yeah man, enjoy the Indian food for us.

[0:23:14.0] WJ: Sounds good.

[END OF INTERVIEW]

[0:23:17.6] MN: Follow us now on Twitter @radiofreerabbit so we can keep the conversation going. Like what you hear? Give us a five-star review and help developers like you find their way into The Rabbit Hole and never miss an episode, subscribe now however you listen to your favorite podcast. On behalf of our producer extraordinaire, William Jeffries and my amazing co-host, Dave Anderson and me, your host, Michael Nunez, thanks for listening to The Rabbit Hole.

[END]

Links and Resources:

The Rabbit Hole on Twitter