A Look at Caching in Drupal 8

In this presentation we introduce Drupal 8's Cache API, review its metadata settings and demonstrate D8's caching flexibility using a fun, real-life example. The fun part is in the form of tacos because they're the de facto currency of the HeyTaco! Slack integration, from which we will grab our cacheable data.

Transcript

Märt Matsoo: Okay, so yes. This session is basically called Drupal 8 Caching. I'd like to throw the word "metadata" in there. If you guys were deciding between the other session and this session, and you were hoping that I'd give you an awesome overarching caching strategy on how to make Drupal 8 super, super fast, this isn't it. This is more looking at the cache property in render arrays and its metadata and how it can be manipulated and used to do cool things.

At least I thought it was pretty cool. Let's get started. I won't be offended if you want to go to the other session. It's not a problem. I'll be sad, but I won't be offended. Drupal 8 Cache API and tacos. Now, I'm not going to be giving out tacos. If you want to leave now again, that's fine. I won't be offended. First, it's traditional to introduce oneself. My name is Märt Matsoo in English and Märt Matsoo in Estonian.

I'm originally from Canada, from Toronto, but now, I live in Estonia. I've been working with Drupal since 2008. I, fortunately, lucked into Drupal by deciding to use it to build the Estonian foreign ministry website back in 2008. I was kind of deciding between different open-source content management systems and I lucked out and I've been working with Drupal ever since. I work for Chromatic and I've been there for almost two years.

Chromatic is a US-based distributed company. We all work from home. Most of the people are in the States, but we have one Canadian who's in France and then another Canadian-Estonian who's in Estonia. That's our logo. We've been around for about 10 years. This is just a little blurb on the company. Distributed, I mentioned that. We're always looking for talent as every single company that works with Drupal seems to do, so you can check us out at chromatichq.com.

Now, it's taco time. First, I'm just going to make no assumptions. I like to use this little example. I actually like it so much that I put it into practice last night. If any of you went out after the dinner, we went to several bars. When it hit 2:30 AM or so, I thought it's time to go home. I walked by a McDonald's. I found that I was hungry. I went in and I got myself a Big Mac, no lie. Now, imagine if I actually wanted a Big Mac, but I didn't have a McDonald's to get one at.

If I had to go to a supermarket, buy all the ingredients, go home, cook it all, it's a difference-- I don't know. That would take me hours probably. So with a Big Mac, caching is the difference between minutes and hours. It's the same when we're talking about webpages: with a cached page, you're serving up something that's already finished and ready versus having the server put everything together, running all sorts of database queries, getting the markup, and then sending it to the browser.

Basically, on my local environment when I was preparing for this, the difference was that with a cold cache, my page would load in about three seconds, and the cached version would load in milliseconds, 281 milliseconds for instance. Anyway, and then I wanted to touch on Drupal render arrays because they're kind of part and parcel of this presentation. Simply, they came around in Drupal 7 and they make up the building blocks of the page.

It's something that I frankly don't-- I'm not an expert in them. One of our colleagues, Gus, did a presentation at DrupalCon New Orleans about render arrays. They're basically the preferred way, instead of using the theme function, to build the building blocks of what you're doing, and they've been out for years now, obviously. The power in them is that you can extend, alter, and override parts of the page before the markup is rendered in the template.

It gives a lot of flexibility in being able to change anything before it gets rendered. I don't know if you can see it from that far away. If you don't know what I'm talking about, you've maybe seen the render function surrounding the content variable and you've maybe wondered-- and, again, years ago, maybe you wondered when you started with Drupal 7 going, "What's this render? Why is render, all of a sudden, in all these templates, which it wasn't in Drupal 6 as I can recall?"

Anyway, the reason I mention all of this is because when I was looking into caching and doing the research, everywhere talked about render arrays. Here, the line is, "It's important that our render array knows to cache itself." From drupal.org, so DO: "It is of the utmost importance that you inform the Render API of the cacheability of a render array." That makes it sound pretty important.

I just wanted to touch on render arrays because they're part and parcel of what I'm talking about here. Now, I'm going to give you a bit of background as to why tacos because you're probably wondering why. This was originally a blog post that I wrote back in February because we at Chromatic are encouraged to write blog posts so that we can get smarter and just kind of as a marketing tool as well.

I wanted to learn more about caching, so I looked into it. At the same time, we had started using a service called HeyTaco, and I'll get to that in a minute. The reason I'm doing a presentation on it too, I thought, well, this is an interesting topic. My little badge of honor is that when I wrote that blog post, the big man, Dries himself retweeted it. I thought, "Wow, well, maybe this is something that people are interested in or just he is."

Anyway, that was my 15 minutes of fame. Now, tacos, I can't talk about them enough. Why tacos? Because of Slack. I'm sure a lot of you, if not, most of you, all of you use Slack. HeyTaco is an app for Slack that is a paid service, believe it or not, and the whole point of it is you hand out tacos to your colleagues to show appreciation. Basically, you help me debug something, I send you a taco, and the HeyTaco service keeps track of it.

It sounds stupid and silly, but we actually use it and it's a mini-motivator for us. We hand out tacos, whether it's for doing a good job, whether it's for saying something funny, and HeyTaco keeps track. For instance, we had a team meeting the other week. Since I'm in Estonia, which is seven or eight hours ahead of most of the other people I work with, their day is just starting, mine is ending.

I was actually having a glass of wine and somebody noticed this. Adam sent me a taco because he thought that was funny, and then everyone else started sending me tacos. I hope you can see this. Finally, I made the comment that, "Wow, I'm genuinely getting alcohol tacos." I was getting like appreciation for drinking on the job. Anyway, if that sounds good to you, you can use HeyTaco too and try to earn tacos.

When you send the HeyTaco bot a keyword like leaderboard, it spits out the leaderboard. Sadly, I'm down in 9th, so you get the idea. This is what it looks like. My idea with caching was, "Hey, we use this HeyTaco thing, but only we see it. I wonder if they have an API and I could grab the info from there and we could put it on our website if we want." It turns out, "Hey, HeyTaco does have an API."

You send a request and it sends back some JSON that you can do whatever you want with. Knowing that, I said, "Now, I'm going to write a blog post," and now, it's a presentation. I'm just going to mention all the parts that it took to do this because I found it pretty interesting. It wasn't just a one-step process. If you haven't used the drush core-quick-drupal command to spin up a self-standing Drupal instance, I recommend it.

Basically, with one command, you can have a fresh install of Drupal running. It runs its own web server and it uses a SQLite database, so you don't have to sit there and create your own with all the steps of maybe adding a new vhost in Apache or anything like that. You just run this one command and it just, boom, spins up Drupal for you and you can start playing around. You can test out some new modules or, in my case, I created a custom module called HeyTaco and the whole point of it was to create a block.

I use the block plug-in, extended it, and created my own custom block. That custom code that I have in there grabs the JSON from the HeyTaco API, and then we finally get to the caching part where I played around with the cache settings. What I also did just for fun is to suck up to the bosses. We have three owners at Chromatic. I would add a hundred tacos to their total before showing the leaderboard so that when they see it, they see, "Wow, I've got a lot of tacos."

When all the other workers see it, I've added information that their stats have been padded, don't feel bad. Anyway, it was basically just trying to have fun with it. I also had to create a Twig template so that I could output it the way I wanted. Anyway, just to give you an idea, if Chris, who's one of our partners, logs in, this is what he sees. It just says, "Hi, Chris." He's in first place and he's like, "Wow, I've got a lot of tacos. That's great."
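The padding step described above might look something like this in plain PHP. This is only a minimal sketch, not the module's actual code: the JSON shape (a `leaderboard` list with `username` and `count` fields), the sample numbers, and the function name `heytaco_pad_partners` are all assumptions made for illustration, and the real HeyTaco API response may look quite different.

```php
<?php
// Hypothetical sample response; the real HeyTaco JSON structure is not
// shown in the talk.
$json = '{"leaderboard":[{"username":"chris","count":12},{"username":"mart","count":30}]}';

/**
 * Adds a bonus to each partner's taco count and re-sorts the leaderboard.
 *
 * Each row also gets a 'padded' flag so a template could print an asterisk
 * next to padded results for non-partner viewers.
 */
function heytaco_pad_partners(array $rows, array $partners, $bonus = 100) {
  foreach ($rows as &$row) {
    $row['padded'] = in_array($row['username'], $partners, TRUE);
    if ($row['padded']) {
      $row['count'] += $bonus;
    }
  }
  // Break the reference left over from the foreach loop.
  unset($row);

  // Highest taco count first.
  usort($rows, function ($a, $b) {
    return $b['count'] - $a['count'];
  });
  return $rows;
}

$data = json_decode($json, TRUE);
$leaderboard = heytaco_pad_partners($data['leaderboard'], ['chris']);
print_r($leaderboard);
```

With this made-up data, the partner jumps from 12 to 112 tacos and lands in first place, which is the effect described in the talk.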

Now, if somebody else logs in, they get these asterisks and at the bottom, it says, "Partners results padded by 100 tacos." Now, the whole point of this is each of these blocks is cached, but it's cached differently for each person, so it's like dynamic caching. Depending on who you are, it's still cached. Now, we'll get into the cache property's metadata. There's four things: keys, contexts, tags, and max-age.

Basically, I'll just mention what it says about them here from drupal.org: that of those four properties, the contexts, the tags, and the max-age must always be set because they affect the cacheability of the entire response and they bubble up to the parents. The parents automatically receive these settings and then it also says that the keys must only be set if the render array should be cached. Basically, if you want it to be cached, you need to set the keys.

Now, you're going to see what it looks like. This is a screenshot, it's pretty tough to read, I suppose, of my custom block plug-in and there's a build function there. Basically, that creates the block. Just like any block in the block admin interface, I put it on to the sidebar, the first sidebar. This is the bigger part of what the cache keys look like. This whole talk is basically down to this part here, the cache property and its metadata. We'll go through them one by one.
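Since the slide itself isn't reproduced in the transcript, here is a rough reconstruction of what the cache property could look like. It is a sketch under assumptions: the four `#cache` values are the ones named in this talk, but the function name `heytaco_block_build`, the `#theme` name, and the `#leaderboard` variable are hypothetical. It is written as a plain function so it runs outside Drupal; in the module this array would be what the block plugin's build function returns.

```php
<?php
/**
 * Stand-in for the build function of the custom block plugin.
 */
function heytaco_block_build(array $leaderboard) {
  return [
    // Assumed theme hook and variable names, for illustration only.
    '#theme' => 'heytaco_block',
    '#leaderboard' => $leaderboard,
    '#cache' => [
      // The "what": uniquely identifies this thing being rendered.
      'keys' => ['heytaco_block'],
      // The "which": cache a separate variation per user.
      'contexts' => ['user'],
      // What invalidates it: any change to the list of users.
      'tags' => ['user_list'],
      // How long it lives: one hour, in seconds.
      'max-age' => 3600,
    ],
  ];
}

$build = heytaco_block_build([]);
print_r($build['#cache']);
```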

The keys. From drupal.org: what identifies the thing that I'm rendering? This is the name of it, what uniquely identifies it. Since I have a custom block called HeyTaco, I don't think anything else is going to be using HeyTaco, so I just gave it the name "heytaco_block." This is like the "what." Again, I mentioned that this must be set if the render array should be cached. Contexts, this is where things get a little bit more fun.

From drupal.org again: does the representation of the thing I'm rendering vary per something? This is what I mentioned. Our partner Chris would see it one way, Alana would see it a different way, and I would see it a third way, but it's all being cached. The context that I put there for this case is user. Drupal then knows that, "Okay, depending on which user it is, I have to show a different version of this block," so I call it the "which."

For people who like their HTTP headers, it also says that cache contexts are completely analogous to HTTP's Vary header. If that helps you understand it better, all the power to you. [chuckles] Now, it's not just user-based, there's all kinds of different contexts you can use. It can be based on IP, time zone. Basically, you can tell Drupal that, "Okay, if it's this person in a different time zone, use this version of the cached block."
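To make the "which" idea concrete, here is a toy illustration in plain PHP. It is not Drupal's actual implementation, just a sketch of the principle that the keys plus the current value of each context pick out one cached variation. The function `toy_cache_id` and the user IDs are made up; `user`, `ip`, and `timezone` are cache context names that ship with Drupal 8 core.

```php
<?php
/**
 * Toy model: combines cache keys with the current context values.
 */
function toy_cache_id(array $keys, array $context_values) {
  // Sort by context name so the same contexts always give the same ID.
  ksort($context_values);
  $parts = $keys;
  foreach ($context_values as $context => $value) {
    $parts[] = '[' . $context . ']=' . $value;
  }
  return implode(':', $parts);
}

// Same block, two users: two separate cache entries.
echo toy_cache_id(['heytaco_block'], ['user' => 7]), "\n";
echo toy_cache_id(['heytaco_block'], ['user' => 12]), "\n";

// Varying by time zone instead only means declaring a different context
// in the render array.
$cache = [
  'keys' => ['heytaco_block'],
  'contexts' => ['timezone'],
];
print_r($cache);
```

The design point is that the block's code never decides which variation to serve; it only declares what the output varies by, and the cache layer does the rest.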

From that perspective, there's all kinds of combinations that you can use. Tags: which things does it depend upon, so that when those things change, so should the representation? Basically, this is, what is it that would invalidate this cache? This is what was pretty neat for me during the development. I ran into some trouble with it because I was like, "Okay, it has to be when a user changes something."

Say I changed my user name from Märt to my full name Märt Matsoo or something, that should invalidate the cache. If you can read it from back there, it says "user_list" is the tag that I use. Now, finding out that that tag actually exists in Drupal, that Drupal knows what it means, was tough and I actually didn't find it documented anywhere. It was a colleague of mine who recommended that I use it because before-- I think I'm going to talk about this a little bit later.

I'll get back to that in a sec. Sorry, I got lost. You really can't see what it says down there. Basically, the point is this first one says, "Hi, Märt," and it's what I described just a minute ago. I changed my username. Now, it says, "Hi, Märt Matsoo," so the changing of the username invalidated the cache and then now, I get a new version. This is what I was alluding to before.

When I was originally developing it, I was like, "Okay, I have to tell Drupal which users have to be included to invalidate the cache." I had this laborious process of finding all the user IDs, creating an array with them, and then building this cache tag like that, but it turns out that all I needed to know was that I could do user_list. The only way we found that out was because Gus, again, that I mentioned before, said, "I've seen a cache tag called 'node_list.' I wonder if user can be used the same way."
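The two approaches can be put side by side in a short sketch. The user IDs below are made up, and the talk doesn't show the original code, so treat this as an illustration of the idea only. Per-entity cache tags in Drupal take the form `entity_type:id`, while `user_list` is the list tag for the whole entity type.

```php
<?php
// Hypothetical user IDs of everyone who appears on the leaderboard.
$uids = [1, 4, 9];

// The laborious version: one cache tag per user ("user:1", "user:4", ...).
$per_user_tags = array_map(function ($uid) {
  return 'user:' . $uid;
}, $uids);

// The short version: a single tag covering the list of users as a whole.
$list_tag = ['user_list'];

print_r($per_user_tags);
print_r($list_tag);
```

One trade-off worth keeping in mind: `user_list` is broader, so the block is invalidated when any user is added, changed, or removed, not only the users shown on the leaderboard.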

It turns out, yes, it can, so this is one of the beauties of, I guess, Drupal or any open source. The documentation is sometimes hard to find or non-existent, but there's all kinds of priceless little pearls that you can come across. Anyway, and then the fourth and final one is max-age and this one's pretty self-explanatory. After how long should this cache invalidate itself? The default is forever; you have Cache::PERMANENT for that. If you don't put anything, it'll just let itself be cached forever. Otherwise, it's measured in seconds.
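As a quick sketch of the max-age options in plain PHP: the constant name below is made up so the snippet runs outside Drupal, but its value mirrors Drupal's own `Cache::PERMANENT` constant, which is -1.

```php
<?php
// Hypothetical stand-in for \Drupal\Core\Cache\Cache::PERMANENT.
const HEYTACO_CACHE_PERMANENT = -1;

// Cached until a cache tag invalidates it (the default behavior).
$forever = ['max-age' => HEYTACO_CACHE_PERMANENT];

// Cached for one hour, expressed in seconds.
$one_hour = ['max-age' => 60 * 60];

// A max-age of zero means the output is not cacheable at all.
$never = ['max-age' => 0];

print_r([$forever, $one_hour, $never]);
```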

I have 3,600 seconds, one hour, in my example. Basically, that's pretty much it for these parts of it and this is just showing the Twig templates that I built. In conclusion, it's basically, "Let's remember to use render arrays," and this is the cache property that can be used with them. We have keys, contexts, tags, and max-age. What I really encourage is to play around with it because this whole process for me basically was a lot wider than just caching.

I got to play with core-quick-drupal and Twig templates. I created a custom block plug-in, which had its own challenges as well. I learned more about dependency injection thereby, and my bonus [chuckles] suggestion, sorry, the name was coming in Estonian, not English, is to use this drush core-quick-drupal, qd, to spin things up and play around because it really gives you the freedom to experiment with little cost.

I don't know. Did that even last five minutes? What I could try to do is take questions or I also have this drush qd spun up and I can basically just show you that. I may as well, since I'm talking about it, show you that it works differently for different people. Let's see. I'll just spin up the webserver again. Here's an example of me. Now, let's make sure the cache is not the book. Clear the cache. Let's hope this works.

Yes, it took 3.98 seconds. What I'm going to show you now is this: I'm in the same site as an administrator, and I'm going to change someone's user name. Let's change mine. It's tough to type with one hand, and so now-- Where was I? Opera, oops. With any luck, without clearing the cache, it knew to invalidate itself because the user, remember, this is from user_list, was updated and, boom, now, it changed it. Without clearing the cache, it knew. I think that's about it. Are there any questions?

Audience 1: What was the change you made to the date?

Märt: Sorry? Oh, sorry. My name.

Audience 1: No, I asked, what change did you make to the date?

Märt: The block originally that was cached had my first name and then I changed it. Is that what you mean?

Audience 1: Yes.

Märt: I added my last name and then it added my last name.

Audience 1: How did you make it cached?

Märt: It was this part here because, originally, it was cached with only one name. What I was showing was that just by updating the user, the cache knew that it has to update itself because if I didn't have-- Let's see.

Audience 1: What code did you change?

Märt: Which code did I change?

Audience 1: What code did you change to make it cached?

Märt: Well, it's cached anyway because of all of these-- I mean, it's cached out of the box because of this.

Audience 1: Okay.

Märt: Sorry, maybe I'm misunderstanding the question.

Audience 1: Yes, okay. I was just maybe-- That's okay.

Märt: Okay, sorry. Well, we can talk later if you can. Anyway--

Audience 1: I did get the answer. Thank you.

Märt: Oh, okay. Any other questions?

Audience 2: [inaudible 00:20:02]

Märt: That's a good question. I can't answer it. I'm not sure. [chuckles] Anything else? From way back there.

Audience 3: [inaudible 00:20:25]

Märt: The question that I heard was, can you pre-warm the cache after a change has been made so you don't have to wait for the initial page load to get it cached? I haven't tested with it. I'm not sure. This is basically the extent of what I was playing with. Just to get it to change and based on the user changing his/her information. Anything else? All right. Thank you very much. That's it. Enjoy lunch.

[applause]

[00:21:34] [END OF AUDIO]