Introduction to Plugin Systems

Author: Alyssa Riceman

Posted: 2022-07-08

Updated: 2023-10-20

1. Why Plugins?

There are many sorts of program for which extensibility is useful: programs where, instead of having an atomic program which does everything it needs to do straight out of the metaphorical box and which can’t do anything else, one wants the versatility of being able to plug additional functionality into one’s program post-installation via a standardized interface, and of being able to code one’s own plugins for that interface.

This sort of plugin-centric architecture is used in all sorts of places, for all sorts of purposes. In Pandoc, plugins allow the program to read from / write to additional formats. In Firefox, plugins allow for various sorts of automated changes to displayed web content, interaction between the web browser and other programs, and increased ease of access to the browser’s inner workings (such as cookie storage). In Calibre, it would be only a moderate exaggeration to say that practically the whole program is made of plugins, that the core of Calibre is largely just a skeleton onto which plugins can be hung in order to supply it with flesh, although it comes with a large pile of out-of-the-box plugins to facilitate its core functionality; it has plugins to define its GUI, to read from and write to various sorts of files in various ways, to interface with various different sorts of eBook reader, and a variety of other things. And so forth. Plugin-using programs abound, each using plugins in different ways and with different degrees of centrality to the overall program functionality.

There are several advantages which can potentially be gained by supporting plugins:

For programs designed with an expectation of being used by a wide variety of people with different workflows and use cases, a plugin system can serve as a means of keeping the installation light and the interface clean, ensuring that, beyond the core shared functionality of expected interest to all users, the program will contain only those features a particular person wants or needs, and not every other feature relevant to other people.
For programs whose development teams’ output is constrained, plugins serve as a means of outsourcing development of program functionality to the userbase, allowing more and better functionality to be added faster than would otherwise be practical. Designing a plugin tends to be easier than making a useful pull request against the main program codebase, even if the program is open-source, since a program’s plugin interface will tend to be substantially smaller and better-documented than the overall mass of its source code. Furthermore, supporting plugins is easier from the developer side than merging pull requests is, because plugins are fundamentally extricable from the programs they plug into; before merging a PR, one needs to carefully review it and make sure it doesn’t break anything, for fear that that breakage will make it into a release and be passed onward to the entire userbase; but, if a user installs a plugin that breaks things, that’s a problem only for that one user, and moreover one fixable in many cases just by uninstalling the plugin.
For programs which might usefully interact with external programs or web APIs or hardware, it’s often hard to be thorough, as a developer, in adding support for every possible thing-that-can-be-usefully-interacted-with. Even aside from the development-team-output problem, there are problems of knowledge: to add support for interactions with something, one must first know that that thing exists, and have the thought that someone might get some use out of interacting with that thing. Plugin systems allow users to code up their own methods of interaction with those external things, ensuring that the option for such interaction exists even if the main program developers are entirely unaware of the things-being-interacted-with.

I’m sure additional potential benefits exist which I’m currently failing to think of, as well; but hopefully this list suffices to show at least some reasons why a plugin system might be worth building.

2. The Structure of a Plugin System

So. Suppose one has, for the reasons above or for others, decided to make a program which includes a plugin system in some capacity. How, exactly, does this work, from a high-level code-structure perspective?

Well, at the highest level, a plugin is a piece of code which gets read by the main program and then executed or otherwise acted upon. This can be as simple as a CSS stylesheet which gets read in and used to style a webview-based GUI, or as complicated as an entire big executable serving as essentially a whole other program which happens to be launched through the main one (and which presumably has functionality related to the main one, since otherwise you probably wouldn’t be launching it that way).

In theory, no more is needed for a plugin system than “identify the plugin code as such, then run it”. This is the way the prior example of the big executable worked, for instance. In practice, though, one often wants, not just to run the plugin code, but to run the plugin code based on specific inputs from the main program and/or to run it and then use its outputs in order to alter the behavior of the main program; after all, if the plugin isn’t interacting with the main program in some way, there’s a good chance the plugin would be more convenient as a standalone program instead of as a plugin. And thus we get to plugin APIs.

Let’s sort plugin APIs into two broad categories: write APIs (sending data from the main program to the plugin), and read APIs (reading plugin data into the main program). A very simple example of a read API would be the previously-discussed CSS stylesheet, which gets read in and then used to style the main program’s GUI. A very simple example of a write API would be an “open in editor” plugin for an image viewer, which, when activated, would open the currently-viewed image in one’s default image editor. And, of course, the two can be combined: one could have a filter plugin for an audio player which takes in the audio currently being played, applies a filter over it, and then returns the newly-filtered audio, thus enabling the player to play the audio back in filtered form.
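As a minimal sketch of that combined case, here’s roughly how the audio-filter round trip might look in Python. Everything named here (`play_with_filter`, `half_volume`, the list-of-floats sample format) is a hypothetical stand-in invented for illustration, not any real player’s API.

```python
from typing import Callable, Sequence

# Hypothetical plugin interface: a filter takes samples in and hands samples back.
AudioFilter = Callable[[Sequence[float]], Sequence[float]]


def half_volume(samples: Sequence[float]) -> list[float]:
    """An example plugin: halves the amplitude of every sample."""
    return [s * 0.5 for s in samples]


def play_with_filter(samples: Sequence[float], plugin: AudioFilter) -> list[float]:
    """Send audio to the plugin (write API), then take its result back (read API)."""
    filtered = list(plugin(samples))
    # The host doesn't accept the plugin's output blindly: same length, sane range.
    if len(filtered) != len(samples):
        raise ValueError("filter plugin changed the number of samples")
    return [max(-1.0, min(1.0, float(s))) for s in filtered]
```

Calling `play_with_filter(samples, half_volume)` sends the samples out and gets the filtered version back; the length check and clamping are one possible way for the host to sanity-check what it reads.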

So. For a plugin system involving neither read nor write APIs, the following pieces are required:

A means of locating plugin files / code. The program could come with a Plugins folder, or it could have an “install plugin” command which tells the main program where the plugin lives (and probably copies it to a dedicated folder for good measure), or it could have a “use plugins at these locations” command-line flag, or plugin code could be directly pasted into an in-program text box, or some other option could be used.
A set of rules for when plugins should be activated. These could include “activate at program launch”, “activate when the user selects a menu option to invoke the plugin”, “activate when the program performs a specific operation such as reading-from-disk”, or suchlike. (Of course, if you want there to be per-plugin variation in when a given plugin is activated, then you’ll need either a read API for the plugin to specify its activation conditions or a user-input option to allow the user to specify each plugin’s activation conditions.)
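Those two pieces can be sketched in Python as follows, assuming the “dedicated Plugins folder” option and user-supplied activation rules. The names `PluginHost`, `rules`, and the `activate()` entry point are assumptions made up for this sketch; only the `importlib` and `pathlib` calls are real standard-library APIs.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional


def discover_plugins(plugin_dir: Path) -> list[Path]:
    """Locate plugin files: here, every non-underscore .py file in one folder."""
    if not plugin_dir.is_dir():
        return []
    return sorted(p for p in plugin_dir.glob("*.py") if not p.name.startswith("_"))


def load_plugin(path: Path) -> ModuleType:
    """Import one plugin file. Note that this runs the plugin's top-level code."""
    spec = importlib.util.spec_from_file_location(f"plugin_{path.stem}", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load plugin from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


class PluginHost:
    """Loads every plugin in a folder and activates them on host events."""

    def __init__(self, plugin_dir: Path,
                 rules: Optional[dict[str, set[str]]] = None) -> None:
        self.plugins = {p.stem: load_plugin(p) for p in discover_plugins(plugin_dir)}
        # User-specified activation rules: plugin name -> events that trigger it.
        self.rules = rules or {}
        # Plugins without a rule fall back to "activate at program launch".
        self.default_events = {"launch"}

    def fire(self, event: str) -> int:
        """Activate each plugin whose rules include this event; return the count."""
        activated = 0
        for name, module in self.plugins.items():
            if event in self.rules.get(name, self.default_events):
                module.activate()  # hypothetical fixed entry point, no arguments
                activated += 1
        return activated
```

Here the activation rules come from the user (the `rules` mapping) rather than from the plugins themselves, so no read API is involved; the host would call something like `host.fire("launch")` at startup.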

For a plugin with write APIs, the following additional piece is required:

A standardized interface by which plugins can receive input from the main program. This could be a dynamically-linked main function which takes in a bytearray, or a text template which takes in substitutions to instantiate it into the proper written-to forms, or a variety of other such things, depending on exactly what sort of plugin structure one wants. (Given read APIs, it could even be “read the plugin’s instructions for how to activate it, then activate it that way”, although you’d then need a standardized way of interpreting such instructions.)
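For the text-template flavor of write API, a rough Python sketch might look like this. It assumes plugins are plain text using the `$name` placeholder syntax of the standard library’s `string.Template`; the field names and the `render_plugin` function are hypothetical.

```python
from string import Template

# The fields the host promises to send to every plugin (hypothetical names).
HOST_FIELDS = ("title", "artist", "seconds")


def render_plugin(template_text: str, track: dict) -> str:
    """Write the host's data into a plugin template and return the result."""
    context = {field: str(track[field]) for field in HOST_FIELDS}
    try:
        return Template(template_text).substitute(context)
    except KeyError as exc:
        # The plugin referenced a placeholder outside the standardized interface.
        raise ValueError(f"plugin asks for unknown field {exc}") from None
```

A plugin consisting of the text `Now playing: $title by $artist` would then be filled in by the host on each track change. The fixed `HOST_FIELDS` tuple is what makes the interface standardized: plugin authors know exactly which substitutions they can count on.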

And, for a plugin with read APIs, the following additional piece is required:

A standardized interface by which the main program can receive input from the plugins. This could be “read the plugin file, which is assumed to follow [insert schema here] and thus can be used as a meaningful source of input”; this could be “invoke a metadata function from the plugin code, which is assumed to have a certain name and to return information in a certain structure”; et cetera. And, of course, once you’ve done the initial pass of getting plugin information, reading the file or invoking the metadata function or whatever, you might then be informed of additional things-you-can-read, such as if the metadata function returns a list of other functions which can be invoked and have their outputs read too.
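The metadata-function variant might be sketched like so in Python. The function name `plugin_metadata` and the `"name"` / `"provides"` keys are assumptions invented for this example, and a `SimpleNamespace` stands in for a loaded plugin module so that the sketch runs on its own.

```python
from types import SimpleNamespace
from typing import Any


def read_plugin(plugin: Any) -> dict[str, Any]:
    """Read a plugin's metadata, then read every further output it advertises."""
    meta = plugin.plugin_metadata()  # hypothetical agreed-upon function name
    if not isinstance(meta, dict) or not isinstance(meta.get("name"), str):
        raise ValueError("plugin metadata must be a dict with a string 'name'")
    outputs = {}
    for function_name in meta.get("provides", []):
        function = None
        if isinstance(function_name, str):
            function = getattr(plugin, function_name, None)
        if not callable(function):
            raise ValueError(f"plugin advertises missing function {function_name!r}")
        # Second pass: invoke each advertised function and read its output too.
        outputs[function_name] = function()
    return {"name": meta["name"], "outputs": outputs}


# A stand-in for a loaded plugin module.
example_plugin = SimpleNamespace(
    plugin_metadata=lambda: {"name": "clock", "provides": ["menu_label"]},
    menu_label=lambda: "Show clock",
)
```

Calling `read_plugin(example_plugin)` first reads the metadata and then follows it to the `menu_label` function, which is the “informed of additional things-you-can-read” step in miniature.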

(And, of course, plugins can be more complex than just a single text file or CSS file or dynamically-linked library or executable or whatever; Firefox extensions, for instance, are structured as ZIP files, each containing a standardized metadata file, but then each additionally containing arbitrarily many further files to make up the extension content, most typically HTML and CSS and JavaScript and images, but also potentially including anything else that might be relevant to extension functionality.)

Put all of these pieces together, and you should end up with a functioning plugin system. What you do with your plugins and with any applicable inputs and outputs they might have will, of course, vary heavily with the nature of your program; but this broad structure will be applicable regardless.

3. Security Considerations

Plugin systems are, for the reasons discussed above, very useful. However, somewhat inherently, they are also a security risk: a plugin system is an attack surface through which plugin authors can get the main program to engage with whatever code their plugins contain. But the degree of risk, of course, varies heavily with what sort of code the plugins contain and how exactly it’s being engaged with.

For a plugin which contains code to be executed, of course, the risk is obvious. Plugin code, if you’re not running some sort of hyper-restrictive “all plugins must go through our extensive security-verification process before we give them the signatures necessary to make our program willing to interact with them” system, can potentially be malicious. If you’re running your plugin as a system-level executable, and it’s malicious, then that’s it; game over. Sandboxed execution, as offered by Lua or WASM or suchlike, can serve to ameliorate that risk, although the sandbox itself will then remain as an attack surface, with any attacker who can bypass the sandboxing being free to do whatever they want. And, for languages with relatively limited capabilities, the concern is lessened; allowing plugins to impose arbitrary CSS on your program, when reskinning it, is only a threat to the extent that malicious CSS is, and that extent is far less than the extent to which, say, a malicious Python script is a threat, even if it’s still more than nothing. But any execution of untrusted code carries some risk.

Even for a plugin system which doesn’t ever directly execute code, though—a plugin-based approach to translation of one’s program, for instance, where each plugin just contains a text file containing translations of each of the program’s UI’s text elements, which get read in and then used in place of the default text on each UI element—risk vectors remain. Careless reading-in of the plugin file will open the opportunity for code injection, for instance. Once again, you can ameliorate the risk—countermeasures to code injection are already standard when reading in files, and it’s not hard to extend those countermeasures to the plugin context—but, once again, it’s extra attack surface. The addition of a plugin system necessitates security effort at the time of its construction, and potentially later on as well should new vulnerabilities be revealed, in order to prevent said system from serving as an attack vector.
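To make the translation example concrete, here’s a hedged Python sketch of a careful read. It assumes, purely for illustration, that translation plugins are JSON files and that the UI is rendered as HTML (so that unescaped text could inject markup); the key names and the length limit are made up.

```python
import html
import json

# The UI text elements a translation plugin is allowed to override (hypothetical).
UI_KEYS = {"menu.open", "menu.save", "menu.quit"}
MAX_LENGTH = 200


def load_translations(raw: str) -> dict[str, str]:
    """Parse a translation plugin as pure data and keep only safe, known entries."""
    data = json.loads(raw)  # parsed as data; nothing in the file is ever executed
    if not isinstance(data, dict):
        raise ValueError("translation plugin must be a JSON object")
    clean = {}
    for key, value in data.items():
        if key not in UI_KEYS:
            continue  # ignore anything outside the standardized set of keys
        if not isinstance(value, str) or len(value) > MAX_LENGTH:
            continue  # ignore wrong types and suspiciously long strings
        # Escape before the text reaches an HTML-rendered UI.
        clean[key] = html.escape(value)
    return clean
```

A plugin that tries to smuggle in `<script>` tags ends up with inert, escaped text on the menu instead. Which countermeasures apply will depend on where the text ends up; escaping for HTML is only the right choice under the HTML-UI assumption above.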

In many cases, the correct response to all of this is to say: okay. The plugin system can be an attack vector, then. It’s up to users to avoid installing malicious plugins; I, as the developer, disclaim responsibility in this field. This is, after all, already the ordinary state of things in many other domains. Operating systems might contain some efforts at security, but they still, in the end, allow users sufficient leeway to install malicious software, and this is good and correct of them. Requiring users be as discerning about their plugins as they are about their non-plugin programs isn’t such an imposition, and it is, by far, the easiest option from the developer’s perspective.

But, sometimes, more protections than that might be warranted, for the sake of accessibility. Sure, not-guaranteed-safe plugins are no different from not-guaranteed-safe top-level software, but, well… sometimes one wants to give one’s grandma a computer which one can trust that she won’t fill up with malware the moment she touches the internet, and under such circumstances it’s nice that there do exist operating systems which do their best to secure things even at the cost of freedom-of-program-installation. By a similar token, then, sometimes one wants to make one’s plugin system accessible even to people whom one doesn’t trust to competently vet plugins for trustworthiness before installing them.

When pursuing that goal, to a decent extent, the thing to do is just follow the solutions described above. Sandbox any code you execute; make sure you’re not getting hit with code-injection; in general, secure your plugin APIs in much the same way that you’d secure any other API that might receive untrusted inputs, even if this comes at the cost of functionality (such as, for instance, barring plugins from interacting with the file system).

The other thing to do, for additional safety, is to compartmentalize plugin capabilities even within your own program. For some plugin APIs, this is a freebie, inherent to how they’re structured; the previously-described translation plugin system, for instance, has no access to the program beyond its ability to define what is written to the program’s UI’s text elements; if that’s all your plugin system does, you don’t need to worry about a malicious plugin, say, deleting your database. (Although you might still need to worry about malicious mislabeling of UI elements, performed as part of a social-engineering attack of some variety.)

But, if you’re aiming to allow plugins to interact more deeply with your program’s core functionality (as is done, for instance, by Firefox, which lets plugins do things like read the cookies the browser has stored for each site, change displayed webpage content, and so forth), that sort of compartmentalization is a lot harder, and not a freebie. The traditional solution, here, is a permissions system: for each plugin, for each API which might serve as a potential security risk, require the plugin to get explicit permission from the user before reading from / writing to that API. A plugin which wants to delete the user’s cookies first needs to get permission to change the cookie database; otherwise, the browser won’t allow the plugin access to the cookie-deletion section of its API.
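The gate logic of such a permissions system might be sketched in Python like this. The `PluginAPI` class, the permission names, and the `ask_user` callback are all hypothetical, and this shows only the gating itself: Python does nothing to stop code from reaching around an object like this, so a real system would need to pair it with sandboxing.

```python
from typing import Callable


class PermissionDenied(Exception):
    """Raised when a plugin uses an API the user has not approved."""


class PluginAPI:
    """The only object a plugin is handed; every sensitive call is gated."""

    def __init__(self, plugin_name: str, cookies: dict[str, dict[str, str]],
                 ask_user: Callable[[str, str], bool]) -> None:
        self._plugin_name = plugin_name
        self._cookies = cookies
        self._ask_user = ask_user  # e.g. shows a dialog; returns True if approved
        self._granted: set[str] = set()

    def _require(self, permission: str) -> None:
        """Ask the user once per permission; refuse the call if they decline."""
        if permission in self._granted:
            return
        if not self._ask_user(self._plugin_name, permission):
            raise PermissionDenied(f"{self._plugin_name} lacks {permission}")
        self._granted.add(permission)

    def read_cookies(self, site: str) -> dict[str, str]:
        self._require("cookies.read")
        return dict(self._cookies.get(site, {}))  # a copy, so reads can't mutate

    def delete_cookies(self, site: str) -> None:
        self._require("cookies.write")
        self._cookies.pop(site, None)
```

A plugin granted only `cookies.read` can look at cookies as often as it likes, but its first call to `delete_cookies` triggers a fresh prompt and fails with `PermissionDenied` if the user says no.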

This traditional solution, though, is very much not adequate to thwart the efforts of a sufficiently determinedly malware-attracted grandma. It’ll help, on the margins; fewer people will end up being hit by malicious plugins; but, for a sufficient standard of security, it will nonetheless not be enough.

So, in the end, all that’s left is a spectrum of different tradeoffs between one’s plugin system’s functionality and its security:

At the most secure, don’t have a plugin system at all.
Less secure than that, but still pretty secure, is a strictly-limited plugin system whose APIs are kept from interacting with anything important.
Less secure still is one whose APIs can interact with important program-components, but only given explicit user permission on a per-component basis.
Less secure than that, one which doesn’t bother asking for permissions beyond the initial plugin-installation.
On the least secure but most powerful end, plugin systems which allow for full unsandboxed OS interactions.

To reemphasize: in many cases, that least-secure option is warranted! Security is valuable, but it’s not always worth the costs; the most secure hard drive to store one’s secret data in to keep it away from others, after all, is an entirely inert and inaccessible one, and nonetheless we keep on building and using hard drives which can be read from. So it is with plugins: in many cases, it’s useful for a plugin to be able to read and write files, and more value would be lost in functionality if those options were removed than would be gained in protection-from-malware.

But, sometimes, that sort of file access is irrelevant to plugin functionality. And, sometimes, the tradeoff just goes the other way, with file reads/writes potentially useful but not worth their security costs. In the end, it all comes down to the needs of the specific program for which the plugin system is being built; some programs will benefit from more security around their plugin systems, others from less. And thus this spectrum of security options is one worth keeping in mind, when architecting one’s plugin system; tradeoffs may be inevitable, but one can at least trade informedly, and pick the point on the spectrum that best suits one’s needs rather than some other less-ideal point.

Tags: Architecture