This is a transcript of What's Up With That Episode 3, a 2022 video discussion between Sharon ([email protected]) and John ([email protected]).
The transcript was automatically generated by speech-to-text software. It may contain minor errors.
What lives in the content directory? What is the content layer? How does it fit into Chrome and the web at large? Here to answer all that and more is today’s special guest, John, who not only is a Content owner, but actually split the codebase to create the Content layer.
00:00 SHARON: Hello, and welcome to "What's Up with That", the series that demystifies all things Chrome. I'm your host, Sharon, and today, we're talking about content. What lives in the content directory? What is the content layer? How does it fit into Chrome and the web at large? Here to answer all of that and more is today's special guest, John. He's not only a content owner, but actually split the code base to create the content layer. Since then, a theme of his work has been Chrome's architecture, and how to make it usable by others. He's been involved far and wide across Chrome, but today, we're focusing on content. John, welcome to the program.
00:33 JOHN: Hi, everyone, and thanks for setting this up, Sharon. My name's John, and I'm happy to try to shed some light and history on this part of the Chrome codebase. I've had the pleasure of working on a lot of different parts of Chrome over a number of years I've worked on it. A theme of my work has been on the architecture of Chrome and making it reusable by other products. And one of the projects has been splitting up the codebase and helping create this content layer.
01:02 SHARON: So, can you tell us what the content layer is? Because content is a very overloaded term, and we're going to say it a lot today. So you mentioned the content layer. Can you tell us what that is?
01:10 JOHN: Yes. The content layer is a part of the Chrome codebase that's responsible for the multiprocess sandbox implementation of our platform.
01:24 SHARON: And another term that I had heard a lot tossed around before I really understood what was going on was the content/public API. So is that the same as the content layer, or is that different?
01:36 JOHN: It's part of it. So the content component is very large, and so, we've surrounded it by this small public API. So that you hide the implementation details and the private directories, and then, embedders just only have access to a small public layer.
01:56 SHARON: How did we end up with this content layer? Can you give us a bit of history of how we came up with it? And also, maybe why it's called content?
02:02 JOHN: Sure. The history is - in the beginning, Chrome, like all software projects begins nice and easy to understand. But over time, as you add a lot more features to go from zero users to billions of users, it becomes harder to understand. Small files, small classes become much larger; small functions kind of get numerous hooks to talk to every feature, because they want to know when something happens. And so, this idea started that let's separate the product - things that make Google Chrome what it is - from the platform, which is what any browser, any minimal browser doing the latest HTML specs would need to implement them in a sandbox, a multiprocess way. And so, content was the lower part, and that's how it started.
02:58 SHARON: How did we get the name content?
02:58 JOHN: The name is like a pun. And when we started Chrome, one of the ideas was, we'll focus on content and not chrome, and so, the browser will get out of the way. Chrome is a term used to refer to all the user interface parts of the browser. And so, we said, it's going to be content and not chrome. And so, when you open Chrome, you just see a very small UI. Most of what you see is the content. And so, when we split the directory, it was originally called src/chrome, and so, the content part, that's the pun. That's where it came from.
03:34 SHARON: That's fun. Earlier, you mentioned embedders of content. Can you tell us what an embedder of content is? And this is part of why I was very excited about this episode, because I was working on a team where we were embedders of content for a long time. Well over a year, and it took me a long time to really understand what that was. Because, as you mentioned now, Chrome's grown a lot. You work on a very specific thing understanding these more general concepts of what is content, what is a content embedder, are less important to what you do day-to-day. But can you tell us what an embedder of content is?
04:13 JOHN: Sure. An embedder of content is simply anybody who chooses to use that code to build a browser on top of it. And so, in the beginning, right when we did this, the goal was just to have one embedder. Or not the goal, what we had was just one embedder. It was Chrome. But then, right away, we were like, you know what? It would be nice for people who work on content and not the feature part to build a smaller binary. It builds faster. It debugs faster, runs faster. And so, we built this minimal example, also for other people, called content_shell. And then, we started running tests against that, and that was the first - or the second embedder of content. And then since then, what was unexpected, what we started for code health reasons turned out to be very useful for other projects to restart - or start building their browser from. And so, things like Android WebView, which was using its own fork of WebKit, then started using content. That was one first-party example. But then, other projects came along. Things like Electron and [Chromium] Embedded Framework, all started building not just products on top of it, but other frameworks.
05:30 SHARON: That was really surprising to learn about, because it seems unsurprising that you would build another browser based on Chromium. And people have heard about this when Edge switched over to Chromium. But to learn that things like Electron are built around content seems really surprising, because that's very different from what a browser is.
05:52 JOHN: But they have common needs. They have some HTML data, and they want to render it and do so in a safe, and stable, and secure way. And that's not their value add, working on that code. So it's better for them to use something else.
06:11 SHARON: That makes sense. You also mentioned that Chrome is dependent on content. And when I first started working on Chrome as an intern, I had it told to me so many times, because I couldn't remember, that Chrome can depend on content, but not the other way around. So can you tell us a bit about this layering, and why it's there?
06:31 JOHN: I should also start by saying, content is not just - when we say content, often what we mean, you embed content. You embed content in everything that sits below it in the layer tree. So that includes things like Blink, our rendering engine. V8, our JavaScript Engine. Net, our networking library, and so on. And there's also you can talk to the content/public APIs, but also, sometimes, you talk to the Blink API and the files, and V8, and so on.
07:07 SHARON: So you have this many layer API or product? And, at the bottom, we have things like Net, Blink, and those probably have dependencies on them that I don't know about. And on top of that, we have content, and then, on top of that, we have Chrome?
07:23 JOHN: Right. And so, Chrome as an embedder of content can directly include the content/public API. But since content can have multiple embedders, it can't include Chrome. If content reached out directly to Chrome, then other people wouldn't be able to use it. Because if you try to bring in this code, it includes files from a directory that you're not using. So, instead, the content/public API, it has APIs going two different directions. One direction is going into content, and then, one direction are these abstract interfaces that go out from content. And any embedder has to implement them. And so, these usually end in terms like client or delegate. And these are implemented by Chrome, and that's how content is able to call back to it. But then, of course, any other product or embedder can also implement these same interfaces.
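The two directions of the API described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the real content/public API: the names PlatformClient, Platform, and MyBrowserClient are hypothetical, and so is the method they share. The point is that the lower layer only sees an abstract interface, so it never has to include a header from the embedder.

```cpp
#include <string>

// Hypothetical stand-in for an "outgoing" interface that the platform layer
// declares and every embedder implements. Not a real content/public class.
class PlatformClient {
 public:
  virtual ~PlatformClient() = default;
  // The platform asks the embedder for a product-specific value.
  virtual std::string GetProductName() const = 0;
};

// Hypothetical stand-in for the platform layer. It only knows the abstract
// interface, so it never needs to include embedder headers.
class Platform {
 public:
  explicit Platform(const PlatformClient& client) : client_(client) {}

  // An "incoming" API that the embedder calls. Internally the platform
  // calls back out to the embedder through the client interface.
  std::string BuildUserAgent() const {
    return client_.GetProductName() + " (hypothetical platform)";
  }

 private:
  const PlatformClient& client_;  // Not owned; must outlive Platform.
};

// Embedder side: one possible product built on top of the platform.
class MyBrowserClient : public PlatformClient {
 public:
  std::string GetProductName() const override { return "MyBrowser"; }
};
```

A second embedder would add its own PlatformClient subclass and pass it to Platform in the same way, without any change to the Platform code.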
08:23 SHARON: You mentioned Blink and also some things called delegate and whatever. So we have a lot of things called something something host in content. Can you talk a bit about what the relationship between content and Blink is? Because there's a lot of mirroring in terms of how they might be set up, and how they relate to each other.
08:37 JOHN: So Blink was the rendering engine that originally started as WebKit. And we forked, and we named it Blink a number of years ago. And that did not have any concept of processes. So it was something that you call it in one process, and it does its job. And you give it whatever data it needs, and it gives you back the rendered data. And you can poke at it or whatever you want to do with it. But you needed to wrap that with some - you needed a bunch of code around it to make it multi-process. And also, to figure out when it needs something that's not available in the sandbox that it runs in, you have to provide that data. And so, this is where the content layer comes in. It's the one that wraps the rendering engine and uses the networking library and other things to be able to create a fully working browser.
09:33 SHARON: More about processes. So it's easy to think, maybe, that the content - the relationship between the content layer and the browser process. So can you just talk a bit about how processes work in content? And what the content API provides in terms of accessing these processes?
09:54 JOHN: So the content code runs in - it's the initial process that runs. Content starts up, and then - and so, it's in the browser process. But it also creates the render processes for where Blink runs. It creates a GPU process that talks to the GPU and where a bunch of the compositing happens. It creates a network process where we do networking. It creates other processes, things like audio on some platforms, storage process to isolate storage. And then, a lot of short lived processes for security and stability reasons. And so, you can have processes that run content code, but, sometimes, an embedder wants to run its own code in a different process. So it could re-use the same helpers that content has for creating a process, and we'll use that. And then, I think I didn't fully answer your previous question yet, which was the host part. So, often, you'll have classes in Blink that are running in the renderer process, and you need an equivalent class to drive it from the browser process. And that's where we often have the host suffix. So it'd be like a class for -
11:11 SHARON: Can you give an example of -
11:11 JOHN: Yes. So, for example, every renderer process has a class in content/browser called RenderProcessHost. And then, every tab object in Blink will have this class called RenderView, and then, in content/browser, it will have this class called RenderViewHost.
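The host naming pattern can be illustrated with a small single-process sketch. Everything here is hypothetical: Widget and WidgetHost are made-up names rather than content or Blink classes, and a std::function stands in for the IPC pipe that would connect two real processes.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical renderer-side object. In a real browser this would live in a
// sandboxed renderer process; here everything runs in one process.
class Widget {
 public:
  void OnMessage(const std::string& message) { received_.push_back(message); }
  const std::vector<std::string>& received() const { return received_; }

 private:
  std::vector<std::string> received_;
};

// Hypothetical browser-side counterpart, named with the Host suffix. It does
// not touch Widget directly; it only has a channel to send messages on.
class WidgetHost {
 public:
  using Channel = std::function<void(const std::string&)>;
  explicit WidgetHost(Channel channel) : channel_(std::move(channel)) {}

  // Drives the renderer-side object by sending it a message.
  void Resize(int width, int height) {
    channel_("resize " + std::to_string(width) + "x" + std::to_string(height));
  }

 private:
  Channel channel_;  // Stands in for an IPC pipe between processes.
};
```

Wiring a WidgetHost to a Widget with a lambda that forwards each message to Widget::OnMessage gives the one-to-one pairing described above, with the host as the only way the browser side reaches its counterpart.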
11:36 SHARON: Those are classes that, depending on what you work on, you might see pop up quite a bit. And there's a lot of them. They're all called render something host, and it's a bit tough to keep them straight. But that makes sense as to why they're called render and - why render and host are in the names for them. So you just listed a bunch of different process types. The GPU process, the browser process, render processes. And, usually, whenever we have different processes, we have some security boundary between them. Can you talk a bit about how security and the content layer overlap? Is the content API a security boundary? What happens if someone calls it maliciously? What could go wrong if they do and do it successfully?
12:26 JOHN: So the security boundaries in any browser built on top of content are the processes. We separate things to not just have render processes per tab, but there are multiple render processes per tab thanks to the amazing work of the Site Isolation project. And that's what split up different iframes into different processes. And so, how they talk, all these processes talk through IPC, and our current IPC system's called Mojo. And so, any time you talk, you use Mojo between processes. You're usually talking between processes of different privileges. And so, one could be sandboxed and the other one not sandboxed. Or one could be sandboxed, and the other one only partially sandboxed. So you have to scrutinize any time you use these Mojo calls to make sure that they can't inadvertently lead to a security vulnerability. Now, even then, as hard as you try, people could still misuse code. Or, also, embedders like Chrome or other content embedders can add their own IPCs. So content obviously doesn't know about the IPCs from other layers, and so, it's possible that it could be an embedder of content that has a security vulnerability in their own Mojo calls. And so, content doesn't know about them, so it can't do anything about them. You could write insecure code in content. You can also write insecure code in an embedder, and if someone finds a vulnerability - so let's say someone finds a vulnerability in Blink, and maybe they're only running their code in a minimal content shell. Maybe they can't find any other Mojo calls that they can abuse to be able to get access to the browser process. But maybe someone else, an embedder, is a more full-featured browser. It has more IPC surface, and that could be more of an attack surface for that - to start with that Blink vulnerability and then to hop into the browser process.
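The kind of scrutiny described above usually means the more privileged side treats every incoming message as untrusted. The sketch below shows that idea in plain C++ under loud assumptions: it does not use Mojo, and ReadFileRequest and FileRequestHandler are hypothetical names invented for the example.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical message sent from a sandboxed renderer to the browser process.
struct ReadFileRequest {
  std::string path;
};

// Hypothetical browser-side handler. It treats every field as untrusted,
// because a compromised renderer can send any value it likes.
class FileRequestHandler {
 public:
  explicit FileRequestHandler(std::set<std::string> granted_paths)
      : granted_paths_(std::move(granted_paths)) {}

  // Returns true only for paths the browser itself granted earlier.
  // The sender's claim about what it may read is never trusted.
  bool IsAllowed(const ReadFileRequest& request) const {
    return granted_paths_.count(request.path) != 0;
  }

 private:
  // Decided by the browser process, never by the sender.
  std::set<std::string> granted_paths_;
};
```

Each IPC an embedder adds needs a check of this kind on the privileged side, which is why a larger IPC surface means a larger attack surface.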
14:38 SHARON: And if you gain control of the browser process, that's a very highly privileged process.
14:44 JOHN: Because that has full access to your system. So that's the point where you can leave persistent changes to the user system, which is pretty bad.
14:55 SHARON: That sounds not great. So if you're an average, say, Chrome engineer, that could be anyone. This is probably not too much of a concern. All the stuff we mentioned, this is good to know. How would a Chrome engineer who doesn't directly work on content or in the content directory interact with the content layer?
15:20 JOHN: Well, they might need a signal from Blink, for example. That's often how someone will do that. They'll be working on a feature in the browser, and everything works great. But then, they'll be like, I just need something from Blink. But it's not there. And so, sometimes, they'll have to add an IPC between processes, and that might interact. They'll be like, how do I get it? It's in Blink. It's in the RenderView class. So I need an interface that talks between each RenderViewHost and each RenderView. And that's how they might get - well, that would be how they get interaction with the multiprocess part of it. But if someone is just working on something only in a browser process, they might still be trying to get information about the current tab. And that's represented by a WebContents class in content. So they'll look in content/public/browser, and they'll see WebContents. And there will be a lot of interfaces that hang off it. So they'll be looking at it, going through a trail of interfaces and classes to be able to get more information on what's going on in the current tab.
16:29 SHARON: Can you give us a quick overview of the WebContents class? Because it is one, massive, and two, called something like WebContents. Which suggests it's important because content plus the web, and it's also something you see all over the place. So can you just give us a quick overview of what that class does? What it's for? What it represents?
16:46 JOHN: Yes. Things now are a lot more complicated than before, but if you go back in a time machine and see how these things started, you can roughly think, in initial Chrome, every tab had a class to represent the content in that tab, and that was called WebContents. And then, it was called WebContents because we had other classes. We used to be able to put native stuff in a tab. And so, that would be called TabContents. But that's gone now, and we just have WebContents. So that's where the name comes from. And then even, for example, there was RenderProcessHost, which I mentioned earlier. And then, each tab, each WebContents roughly translated into one render process. And so, now, it's a bit more complicated. There are examples where you can have WebContents inside of WebContents, and that's more esoteric that most people don't have to deal with. And then, so that's what WebContents is for. It will do things like take input and feed it to the page. Every time there's a permission prompt, you usually go through that. If a page wants access to a microphone, or video, and so on. It keeps track if there's navigation going on. What's the current URL? What's the pending URL? It uses other classes to drive all that stuff as you send out the network request and get it back. And that's not inside of WebContents itself, but it's driven by other helper classes.
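The difference between the current URL and the pending URL can be shown with a heavily simplified model. TabNavigationState is a hypothetical class written for this illustration; it is not the real WebContents class or any of its helper classes, and real navigation has many more states.

```cpp
#include <string>

// Hypothetical, heavily simplified model of per-tab navigation state.
class TabNavigationState {
 public:
  // A navigation has started but no response has committed yet.
  void StartNavigation(const std::string& url) { pending_url_ = url; }

  // The response arrived, so the pending URL becomes the current one.
  void CommitNavigation() {
    if (pending_url_.empty()) return;  // Nothing in flight.
    current_url_ = pending_url_;
    pending_url_.clear();
  }

  // The navigation was cancelled or failed; the old page stays.
  void CancelNavigation() { pending_url_.clear(); }

  bool IsNavigating() const { return !pending_url_.empty(); }
  const std::string& current_url() const { return current_url_; }
  const std::string& pending_url() const { return pending_url_; }

 private:
  std::string current_url_;  // What the tab is showing now.
  std::string pending_url_;  // Where the tab is trying to go, if anywhere.
};
```

Until CommitNavigation is called, current_url() still reports the old page, which is why code asking about "the URL of this tab" has to decide which of the two it means.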
18:28 SHARON: I tend to think of content as being the home of navigation, which I think is a decent way to think about it and also is maybe biased because of the stuff I've been working on. But you have Chrome, and navigation, and content, and all the stuff here. And then, separately, you have the actual web, the internet. And that has things like actual websites. And there are web standards, and there's things like HTML. And these two things somehow have to intersect. But being on the Chrome side, working on Chrome, apart from writing some browser tests, maybe, you never really interact with any of the more web things. JavaScript, you don't really touch. That's more Blink and HTML only in a test kind of thing. So how do these web standards - there's navigation web standards and all that. How do we actually make sure that they're implemented in Chrome? And where does that happen?
19:32 JOHN: So that happens all over the code, but there's a few critical directories. If you look at net at a low level, a lot of IETF - and some web specs will be implemented there at that layer. Either net or in the network service, which is code that runs inside the network process. Then you've got V8, of course, our JavaScript engine, and that has to follow the ECMAScript standards. And then, there's a lot of the platform standards. Some of them don't need multiple processes to be - to implement them, so they'll just be completely inside Blink. But some of them require multiple processes, things that need access to devices and so on. And so, that implementation will be split across Blink and content/browser. But then, how do you ensure that, not only do you implement this correctly, but also that you don't regress it? So there's a whole slew of tests. There's the Blink tests, which used to be called the layout tests. And those run across the simple, simple test cases for many features to make sure that each one works. And there's also this cool thing where we share now a lot of these tests with other embedders, and that way, you run the same test in every browser. And so, when you write a test, you don't have to write it n times. You can just write it once. So that's how we ensure that we meet the specs.
21:10 SHARON: That makes sense. Because I've been pointed - when I was looking into a class. What does this do? I've been linked to, say, one of the HTML specs or web specs. But the whole time, I'm just thinking, how do we make sure - or who's checking that we're actually implementing this and correctly? But these tests seem like a good way to do it and also ensure some level of consistency across browsers. Assuming you know whether or not the browser you use chooses to run these tests or not, I guess.
21:41 JOHN: And as an engineer on a project like that, the first time you'll hit them is when you're breaking them. You'll make a change, and I think this is fine. And then, you send it to the commit queue, and you break some layout tests. What's happening to me today? And then, you have to drill into it. And the nice thing about layout test is because each one is small, you - it's faster to figure out what you broke because it's just like, hopefully, you only broke a small number of tests.
22:06 SHARON: For sure, and it's a good example of why we have all these tests, is to make sure things don't break. So that is pretty much all the questions I have written down. Is there anything else generally content layer, content/public API-ish related that is interesting that maybe we didn't get a chance to cover?
22:31 JOHN: Yes. The most common question is people will be like, well, does this belong in content or not? So I can have a chance to point people towards the README files. There's //content/README.md that describes what's supposed to go in or not. And then, there's also a //content/public/README.md that describes the guidelines we have for the API to make it consistent.
22:59 SHARON: I've definitely seen those questions before. You're updating one of the content/public APIs. Does this belong? While we're here, can you give us a quick breakdown heuristic of what things generally would belong in the content/public API versus you put it up for review, and the reviewer's like, no. This does not belong in content/public?
23:24 JOHN: So sometimes, for example, for convenience, maybe the Chrome layer wants to call other parts of the Chrome layer, but they don't have a direct connection. Or maybe the Chrome layer wants to talk to a different component. And so, they'll be like, we'll add something to the content API, and then, that way, Chrome can talk to this other part of Chrome or this other component through content as a shortcut. We don't allow that, and the reason for that is anybody who's gone through the content/public directory, it's already huge. And so, we feel that if Chrome wants to talk to Chrome or to another layer, they should have their own API to each other directly instead of hopping through content. Just because the content API's already very large, very complex, hard to understand. So we don't want to add things that are not absolutely necessary to it. And another thing we try to do is to not add multiple ways of doing something. We only add something to the content API when there's no other way of getting this data from inside content, or there's no other way of getting this data from the embedder to content. But if there's something similar that can do the same thing, we push back on that.
24:39 SHARON: And also, test-only things? Are those generally OK, or do you want to generally avoid those?
24:45 JOHN: Well, yes. Test-only methods, we try really hard to avoid - not just for the public API, but inside, because we don't want to bloat the binary. But we do have content/public/test, which gives you a lot more leeway to poke at things in your browser test, for example, or your unit tests. Another thing is, we also have guidelines for how the API should be. We don't have, really, concrete classes. It's mostly abstract interfaces. And so, there's a bunch of rules there, and they're all listed in content/public/README. Just so people know the guidelines we have for interfaces there.
25:28 SHARON: On the Chrome binary point, how much is the size of the binary dependent on the size of the content/public API? Is that a big part of the binary, or is it small enough where, sure, we want to keep it from being unnecessarily large but not too much of an issue?
25:48 JOHN: The size is not going to come as much from the content/public API but just from the entire content and all its dependencies. And those are in the tens of megabytes. So, sometimes, for example, if you're bundling the content layer, you're not going to be a small binary. You'll just start off in the 30 megabyte range or 40 megabyte range once you put everything together.
26:12 SHARON: And I guess that's something you have to be more conscious of if you're working in content versus another directory even in Chrome, is that you have to be wary of your dependencies more so than anywhere else. Not only for Chrome, but also, any other embedders who might want to use content.
26:31 JOHN: Yes. And so, for example, if someone's trying to add something in Chrome, we also ask, does this have to be in content? Or can this be part of Chrome, so that not every embedder has to pay that cost if they don't need it? Maybe we'll have an interface, and the embedder can plug the data in through that way but still not have it in content. Another problem, of course, with having data inside content is that not all embedders update at the same speed. So if you're putting something in content, it can quickly go stale, the content, whatever the data is if you're not updating quickly.
27:08 SHARON: That makes sense. So we mentioned a bit of what content is, a bit of the history of it. Can you tell us anything about what are upcoming changes that might happen in content? What is the future of the content directory, the layer, the API?
27:28 JOHN: Well, it's always changing. It's not static, driven by the needs of the product. And so, you look at big changes happening today like MPArch to support various use cases that we didn't have, or we never thought about initially. And that's where the WebContents inside WebContents, some of that comes in. There are big changes like banning, for example, raw pointers and replacing them with raw_ptr. So we can try to address some of the security problems we have with use-after-frees. So that's where, when you look at the content code or the Chrome code in general, too, it might look a little bit different than the average C++ project that you see. You'll be like, I'm getting errors if I try to have a raw pointer, and that's why.
28:15 SHARON: Check out episode one for more on that. We'll link it below. Anything else random content-related or otherwise you would like to share with us?
28:27 JOHN: I think the only other thing I would add is familiarize yourself with the READMEs in content/README and content/public/README before making changes. That will make the author and reviewer's time more efficient. And if you're working on content and below, you can build Content Shell instead of Chrome. That would be faster to build and debug and hopefully make you more productive.
28:52 SHARON: Good tips. Hopefully, our viewers follow them. They would never try to change a content/public API without reading the READMEs first. Well, thank you so much, John, for sitting down and chatting with me about content. This was great, and, hopefully, people find it useful.
29:14 JOHN: And thank you for hosting me, Sharon.
29:23 SHARON: Did you start working on Chrome from the very start, or just - obviously, pre-launch. Because, I think, based on your profile pictures, the picture of that comic book that released when Chrome did - which I was lucky enough to get a copy of when I was an intern. Shout-out Peter. But that obviously suggests you were a major contributor before the public launch of Chrome. So were you working on Chrome from the very beginning?
29:47 JOHN: I was not. It took about six months. I tried to join from the beginning, but I couldn't join right at the beginning. So my sneaky way was I found another project under that same director who was running Chrome, and then, once that project finished in six months, then I jumped into Chrome.
30:09 SHARON: And do you ever think about how crazy it is from this thing that you worked on, effectively, from the start before the public launch? To what it is now where Chrome is one of the foundational pieces of the internet at large? Any time the internet gets run period, probably something in Chrome is running like the net stack, if not, obviously, the browser? Do you ever think about that, and how crazy that is? And your place in that?
30:38 JOHN: Yes. It's amazing how far Chrome has come, and it's really humbling to see it be the number one browser, the most widely-used browser. Because when we were working on Chrome at the beginning, we were just trying to guess what market share it would have. And people would be like, it'll be 10%, and we're like, no way. Even the people working on it, we didn't think that was going to be possible. So to see users really enjoy using it, and for us to keep demonstrating value by sticking to our four principles, security and stability, simplicity and speed. And seeing people not just adopt Chrome as a product, but Chromium as a platform is - it's beyond our wildest dreams. And it's a responsibility that we have every time we make a change to Chrome to all these users and developers using it. You were asking earlier, how does it feel to be here from the start? There's almost a sense of feeling super lucky. But also this humbling feeling where we started in Chrome when it was really small, and our knowledge built up incrementally as it got more complicated. But so, it's like, well, what if I was to jump in Chrome today? It seems like way too many - the code is so complicated now compared to before. This almost responsibility we have as being in Chrome for a long time to share knowledge, to help people pick it up. Because we would ourselves struggle if we were to jump in now.
32:22 SHARON: Yes. As those people, we certainly did struggle. But people are pretty smart, I think, and they can figure it out. But that doesn't mean you can't make it easier for the people in the future figuring it out. Or even people who - you just work on a different part. If I were to do anything in Blink, I'm just like -
32:44 JOHN: Same. I've been on it for a long time. I don't touch Blink.
32:50 SHARON: Yes. Yes.