Alyssa's Coding Journal


How To Set Up A Basic Website

Posted by Alyssa Riceman on

I have now created my first real website! And it happens to be a new home for this blog. As of today, I am officially moving this blog off of its old home at WordPress and on to its new home at my personal website. Future posts will be at the new location and not the old one; be accordingly warned.

…and I could, in theory, stop there. Post for the first time in most of a year, say I’m moving, walk back out. But that’d be boring. So let’s instead go dive into what the process was for setting up the new site!

Keep reading


Limits of Firefox Extension Pages

Posted by Alyssa Riceman on

Half a year ago, I decided I wanted to write an ebook-reader extension for Firefox.

I was, I thought, following in the grand tradition of such extensions as EPUBReader, whose interface I have a number of issues with but whose basic concept I’ve always found very solid. The EPUB format is sufficiently based on web technology that the web browser is the natural home for it, after all. And a browser extension is more conveniently accessible than either a loose bundle of HTML files (which less technically-inclined users are likely to struggle to make use of) or a website (which requires internet access to reach, rather than being storable as a convenient offline file).

This was, in retrospect, a mistake.

Keep reading


Introduction to Plugin Systems

Posted by Alyssa Riceman on

1. Why Plugins?

There are many sorts of program for which extensibility is useful: programs where, instead of just having an atomic program which does everything it needs to do straight out of the metaphorical box and which can’t do anything else, one instead wants the versatility of being able to plug additional functionality into one’s program post-installation via a standardized interface, and of being able to code one’s own plugins for that interface.
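As a concrete sketch of the idea in Rust: the host program defines the standardized interface as a trait, and a plugin is anything implementing that trait, registered after the fact. The Plugin, Shout, and Host names here are entirely hypothetical, not drawn from any real program.

```rust
// A minimal plugin system: the host knows only the trait, not the plugins.
trait Plugin {
    fn name(&self) -> &str;
    fn run(&self, input: &str) -> String;
}

// One toy plugin, implementing the standardized interface.
struct Shout;
impl Plugin for Shout {
    fn name(&self) -> &str { "shout" }
    fn run(&self, input: &str) -> String { input.to_uppercase() }
}

struct Host {
    plugins: Vec<Box<dyn Plugin>>,
}

impl Host {
    fn new() -> Self { Host { plugins: Vec::new() } }
    // Plugins are plugged in post-construction, against the interface,
    // rather than being baked into the host at compile time.
    fn register(&mut self, p: Box<dyn Plugin>) {
        self.plugins.push(p);
    }
    fn run_all(&self, input: &str) -> Vec<String> {
        self.plugins.iter().map(|p| p.run(input)).collect()
    }
}

fn main() {
    let mut host = Host::new();
    host.register(Box::new(Shout));
    for (p, out) in host.plugins.iter().zip(host.run_all("hello")) {
        println!("{} -> {}", p.name(), out);
    }
}
```

In a real program the plugins would come from separate crates or dynamically loaded libraries rather than the same file, but the trait-as-interface shape stays the same.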

Keep reading


EPUB 3.2 Structure: A Simplified Overview

Posted by Alyssa Riceman on

Introduction

This is an overview of the internal structure of EPUB 3.2 files. (Which I’ll henceforth just call ‘EPUB’, no version number specified, when not specifically contrasting with other EPUB versions.) It’s written as a followup / companion piece to my prior overview of EPUB 2.0.1 files, with similar goals in mind: to serve as a resource for programmers who want to make EPUB-generating software, as an easier-to-read (albeit less thorough) alternative to the official format specification (archive).

Like the previous summary, this one is targeted specifically at creators of EPUB-writing software, not EPUB-reading software, and will omit summaries of reader-specific requirements and of features which are deprecated or otherwise unlikely to be of relevance to creators of EPUB-writing software. For those interested in a more complete picture, see the format specification linked in the prior paragraph.

EPUB 3.2, as a format, is substantially more elaborate and feature-rich than EPUB 2.0.1 was. Moreover, even in those parts of the format which are superficially similar, many changes have been made. Thus, this summary will be very long, and will include many elements which are similar to but subtly different from those in the 2.0.1 summary.

Keep reading


Against Overreliance on Filenames For Metadata

Posted by Alyssa Riceman on

Several years ago, I was trying to extract some images out of an AZW6 container file. With some digging online, I found a script which purported to assist in doing just that, ran it against the container… and it returned an error claiming the target file wasn’t an AZW6 file. It took over an hour of haphazard debugging before I figured out what was wrong: the file’s name was <filename>.azw.res, and the script was assuming that the lack of .azw6 extension meant it wasn’t an AZW6 file, without even bothering to check its internals before throwing that error and shutting down. As soon as I renamed the file to <filename>.azw6 and re-ran the script against it, the extraction went off without a hitch.

This case is illustrative, I think, of why overreliance on filenames as a source of metadata is a bad habit which programmers should do their best to break themselves of.
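The more robust alternative is to check the file’s internals. Most container formats open with a fixed magic-number signature; here’s a sketch of the idea using PNG’s well-known eight-byte signature (I’m not going to vouch for AZW6’s internals here, so PNG stands in):

```rust
use std::fs::{self, File};
use std::io::Read;

// Every PNG file begins with this 8-byte signature.
const PNG_MAGIC: [u8; 8] = [0x89, b'P', b'N', b'G', b'\r', b'\n', 0x1A, b'\n'];

// Identify the file by its contents, not its extension.
fn is_png(path: &str) -> std::io::Result<bool> {
    let mut header = [0u8; 8];
    let n = File::open(path)?.read(&mut header)?;
    Ok(n == 8 && header == PNG_MAGIC)
}

fn main() -> std::io::Result<()> {
    // A PNG signature saved under a misleading name...
    fs::write("not_a_text_file.txt", &PNG_MAGIC)?;
    // ...is still recognized, because we checked the bytes.
    println!("{}", is_png("not_a_text_file.txt")?);
    fs::remove_file("not_a_text_file.txt")?;
    Ok(())
}
```

A script built this way would have handled the .azw.res file gracefully, and could fall back to the extension only when the bytes are ambiguous.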

Keep reading


EPUB 2.0.1 Structure: A Simplified Overview

Posted by Alyssa Riceman on

Introduction

This is an overview of the internal structure of EPUB 2.0.1 files. (Which I’ll henceforth call just ‘EPUB’, no version number specified, for the sake of conciseness.) My goal in writing this is to provide a useful resource for programmers who want to create programs which generate well-formed EPUB files; I intend to summarize all essential information about the format’s internal structures for that purpose, while hopefully being briefer and less intimidating to read than the format’s official specifications (archive 1, archive 2, archive 3).

While I will be going into a fair amount of detail, this is an overview, not a fully detailed exposition. In particular, if you’re designing an EPUB reader, I’d recommend looking at the official specifications instead; my summary will convey the information necessary to write well-formed EPUB files, but the format has various optional frills—deprecated features, optional but not-typically-used file format support, and so forth—which a writer doesn’t need to know how to generate but which a reader does need to know how to parse, and my summary won’t cover all of those.
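As a taste of the structure involved: every EPUB is a ZIP archive whose first entry is an uncompressed mimetype file containing application/epub+zip, plus a META-INF/container.xml pointing at the package document. A minimal container.xml looks like this (the OEBPS/content.opf path is a common convention, not mandated by the spec):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
```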

Keep reading


On Code Ordering

Posted by Alyssa Riceman on

When I first learned to program, it was in Python. Python has a very simple principle for how to order your code: you put the dependencies before the things which depend on them. You can’t assign a constant on line 5 as the output of a function defined on line 18; the interpreter doesn’t yet know about that function when interpreting the assignment of the constant.
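A minimal illustration of that dependency-first ordering, with hypothetical names:

```python
# Dependency-first ordering: the helper is defined before anything uses it.
def double(x):
    return 2 * x

# This works because double() already exists at this point in the file...
FOUR = double(2)
print(FOUR)  # prints 4

# ...whereas calling a function defined further down the file here would
# raise a NameError, since the interpreter executes the module top to bottom.
```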

In the last few months, I’ve been working with a lot of makefiles. And, in makefiles, things tend to be structured opposite to how they’re structured in Python: the all recipe, which is likely to be among the most dependency-heavy recipes in the file, is the first to be defined, and subsequent file structure tends (at least idiomatically) to follow that pattern onward, placing dependencies below the recipes which depend on them.

Both of these options—dependencies before dependers, dependencies after dependers—are perfectly reasonable and sensible ways to do code-ordering. They’re opposite, but they’re both clear and non-confusing. Where things get bad is when neither of these protocols is followed.

This afternoon, I was looking at some Haskell code, in anticipation of finally making an effort to learn the language (as I’ve been meaning to do for years). In particular, at the sample code on this page (archive). And I was struck: the code changed direction, midway through. At the beginning, it was defining its grammar dependency-last, starting with the highest-level pattern (top) and working its way down to lower-level ones on which that pattern depended. But then, suddenly, after it had defined its grammar, it went and defined a main function dependent on that grammar. Within the grammar, the dependencies were below the dependers; but, outside of the grammar, the dependencies were above the dependers.

Shifting directions like that, midway through a file, makes for a confusing reading experience.

Sometimes, perhaps, it’s mandatory due to the restrictions of the languages you’re working with. (To anyone designing languages: please don’t make that sort of direction-changing mandatory; it makes the code more confusing to essentially no benefit.) And sometimes, perhaps, it allows you to simplify the code on other axes enough to be worth it. (Earlier today, in a Rust program, I had an earlier-defined function call a later-defined function, because the later-defined function was part of a larger block of parser functions which sorted neatly together and which, as a group, were dependent on the earlier-defined helper functions as a group; reordering the functions to follow dependency order individually, rather than as blocks, would have reduced code readability more than the direction-changing did.)

But, as a general principle, code tends to be far more readable if you pick a single direction for your dependency-ordering and stick with it. I try my best to write my code that way, when practical, and I recommend that everyone else do the same.


Release: Fluorite

Posted by Alyssa Riceman on

For the last couple months, I’ve been working on building a dice roller in Rust, named Fluorite. It’s nowhere near complete, yet—still very much a work-in-progress—but, at least for my own use case, it’s successfully outstripped all other dice rollers I’m aware of in convenience and ease-of-use. As of this afternoon, I’ve put out Fluorite’s first public release. The repo is here, for the interested, and the release is here.

The current release is only for 64-bit Windows, since setting up a good cross-platform release pipeline has proven harder than expected. Nonetheless, it’s an exciting milestone, and over coming weeks I hope to improve my release process and add in support for further platforms.

I don’t intend to post Fluorite updates on here on an overly frequent basis; this is my programming blog, not my this-one-bit-of-software-I’m-working-on blog. Those interested in following Fluorite development in depth are free to do so via GitHub. But I figure the initial announcement, at least, is worth making here.


Python Supports Type Annotations Now!

Posted by Alyssa Riceman on

This week, I returned to Python for the first time in A While. And I discovered something wonderful and beautiful which was a continual source of delight to me throughout the week: as of Python 3.5 (with increasingly convenient syntax in later versions, like 3.9’s built-in generics), Python supports type annotations!

Being Python, of course, it’s supporting them in a weird Python-y way. They’re officially called ‘type hints’, because unlike traditional type annotations, they’re only there for humans; the interpreter doesn’t care the slightest bit about them. You can throw a dict into a function marked as taking in a str, or return a bool from a function marked as returning a tuple[int, int], or whatever else, and it’ll function just as it would with no type annotations at all.
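A quick demonstration of that non-enforcement; the function here is a hypothetical one of mine, runnable as-is:

```python
# The interpreter ignores type hints entirely; they exist for human readers
# (and for external static checkers like mypy).
def shout(message: str) -> str:
    return message * 2

# Passing a list to a function hinted as taking a str runs without complaint;
# a static checker would flag it, but Python itself won't.
print(shout([1, 2]))  # prints [1, 2, 1, 2]
```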

But, even if the interpreter doesn’t care, I, as a person who needs to interact with the source code, care a lot. The type annotations make it so much faster to figure out what I’m supposed to do with unfamiliar functions or methods, as compared with reading the docs (if I’m lucky and the docs are clear on the matter) or trial-and-error testing (if I’m less lucky).

(They also make it all the more conspicuously inconvenient, when a given library happens not to have implemented type annotations in its code yet and so I am still forced to do those things.)

For that matter, even when working with familiar code and functions, the annotations still end up being useful, just for the way they put type information right at my metaphorical fingertips such that I don’t need to rely on memory so much. When I hover over a variable name in VSCode, now, the IDE will let me know what type that variable is to the best of its ability to figure out, saving me the trouble of rereading the code which produced that variable to make sure it’s the right type.

Overall, this is an excellent change, and I’m glad it’s happened.


Explicitly Marking Unreachable Code

Posted by Alyssa Riceman on

I discovered, this week, that there’s a way of explicitly marking unreachable code in Rust, without needing to resort to something like panic!("This should be unreachable"). Namely: the unreachable! macro. When hit, it panics, because would-be unreachable code turning out to be actually reachable is panic-worthy; functionally, it’s no different from a normal panic. But it makes the code a bit more readable, and as such I’m happy to have discovered it.
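A sketch of the sort of place it fits; the example function is my own toy, not from any real codebase:

```rust
fn describe(n: u32) -> &'static str {
    match n % 3 {
        0 => "divisible by three",
        1 => "remainder one",
        2 => "remainder two",
        // n % 3 can only ever be 0, 1, or 2, but the compiler can't prove
        // that, so this arm is required; unreachable! documents the intent
        // and panics with this message if the assumption is ever violated.
        _ => unreachable!("n % 3 is always 0, 1, or 2"),
    }
}

fn main() {
    println!("{}", describe(7)); // prints "remainder one"
}
```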


Minimum Rust Binary Size

Posted by Alyssa Riceman on

Consider this very simple Rust program:

fn main() {
    ()
}

When compiled into a Windows executable in debug mode (target: x86_64-pc-windows-msvc), the resulting file is 135 KB. In release mode, 132 KB.

When compiled into a Linux binary in debug mode (target: x86_64-unknown-linux-gnu), the resulting file is 3.4 MB. In release mode, 3.3 MB.

When compiled into a Mac binary in debug mode (target: x86_64-apple-darwin), the resulting file size is 419 KB. In release mode, 414 KB.

I wish I knew enough about reading binaries to be able to get anything useful out of those files in a hex editor, because those file sizes, and especially the differences between them, are interesting. What’s all this machine code being run in order to start a program and immediately exit without doing anything? What are the differences between the systems (or between the compilers for the systems) which lead to such dramatic disparities in how much such code is run in order to do that? I don’t know, and I lack the skills necessary to find out, but I predict that the curiosity is going to keep gnawing at me now that I’ve discovered this.


Thoughts on Iced

Posted by Alyssa Riceman on

Iced is a GUI library for Rust. It was the first GUI library I ever tried, back slightly over a year ago when I was first learning the language—the first in any language, to be clear, not just in Rust—and it remains the one I’ve spent the most time with, because I haven’t done much GUI-related work since then.

Today, I finally revisited Iced for the first time since then. It’s advanced in that time, iterating from version 0.1 to version 0.3. I’ve advanced in that time, going from my first ever serious Rust project over to being an at-least-vaguely-experienced Rust programmer. So, in the context of all that advancement, have some thoughts:

That final point was a dealbreaker for me, when I noticed it; on that basis, I’ve decided to move my GUI-learning efforts over to Druid, whose minimal Windows executable size is a far-more-reasonable ~2.6MB, albeit at the cost of much less conveniently easy-to-eyeball example code. But, for those less concerned with bloated executable size than I am, I continue to find Iced to be the easiest-to-understand of the Rust GUI libraries I’ve looked into thus far.


Pest

Posted by Alyssa Riceman on

Last weekend, my brain was unexpectedly hijacked—very thoroughly hijacked—by my discovery of a Rust library called pest. Pest’s basic concept is relatively simple: it’s a general-purpose parser which, instead of parsing Insert Specific Language Here (in the manner of e.g. the various Serde formats, quick-xml, et cetera), lets you define your own grammar for the parser to use and parse any input stream captured by that grammar.

The grammar-definition part of pest took me a decent number of hours to learn—learning that was the really seriously brain-hijacking bit—but it was conveniently facilitated by the highly-convenient editor located at the bottom of the pest website (linked above). I only stumbled my way into a left-recursive loop once, during that time. And now I’ve got a grammar defined which ought to dramatically simplify the back-end text-parsing section of my old dice-roller project (which I ultimately chose to use as the test case for teaching myself pest, since it’s neither excessively simple nor excessively complicated and since I’d been meaning to do a full backend rewrite of it anyway for unrelated reasons).
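For flavor, here’s roughly what a pest grammar looks like; this is an illustrative sketch of a dice-expression grammar (for inputs like 2d6+3), not Fluorite’s actual one:

```
// Rule names here are hypothetical. `~` sequences sub-patterns,
// `|` is ordered choice, and `^"d"` matches "d" case-insensitively.
integer = { ASCII_DIGIT+ }
roll    = { integer ~ ^"d" ~ integer }
sign    = { "+" | "-" }
expr    = { roll ~ (sign ~ integer)* }
```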

The output trees pest produces once the input has been slurped up and parsed, on the other hand, leave something to be desired. They work—they successfully represent the parsed information in a useful way—but they’re relatively code-intensive to work through and drill down into, without much in the way of easy simplifying methods built into them. (At least unless you bring in an external library like pest_consume, but when I tried that one out, I found it to be pretty inconvenient in its own right, and ultimately abandoned it in favor of default pest.)

I haven’t yet finished rewriting my dice roller’s back end. Maybe it’ll turn out there’s some horrible impediment, or just a pile of lots and lots of small inconveniences, such that pest is less of a convenience-boost than I currently think it is. Or maybe the dice roller will go fine, but I’ll discover major limitations of some sort when I try to throw pest at a more ambitious project. But, tentatively, for the moment, I’m inclined to say: I like pest. Its grammar side is beautiful and powerful and something I’m likely to continue admiring for A While. Its tree outputs are… substantially less beautiful… but they still, as far as I can tell, basically work. And its overall functionality is very broadly applicable, something I anticipate having a use for in lots of different programs I write in the future. So, overall, at least for the moment, I am A Fan.


Forcing Resources Out Of Scope

Posted by Alyssa Riceman on

Rust’s scope-related rules are often highly convenient in their ability to ensure that things drop out of memory at appropriate times without the need for manual intervention. (And, occasionally, highly inconvenient in their ability to ensure that things drop out of memory while you’re still trying to use them; but I’m running afoul of that increasingly rarely as I improve my intuition for the language.) Memory is released when things fall out of scope, and there’s rarely any need to get more complicated than that.

Still, once in a while, manual intervention remains valuable; for example, when a resource has locked a file or a mutex, done everything it needs to do therewith, but not yet fallen out of scope in such a way as to unlock said file-or-mutex for the next resource that might want to access it.

When circumstances such as those arise, Rust has a convenient function to force a resource out of scope early. This function is drop. It is highly convenient (for those cases where the even-more-convenient scope rules fail to take precedence), and I’m glad to have discovered it.
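A minimal sketch of the mutex case (my own toy example): the guard would normally hold the lock until the end of its scope, and drop releases it early so the next lock() call doesn’t deadlock.

```rust
use std::sync::Mutex;

fn main() {
    let shared = Mutex::new(0);

    let mut guard = shared.lock().unwrap();
    *guard += 1;
    // We're done with the lock, but the guard hasn't fallen out of scope;
    // drop(guard) forces it out now, unlocking the mutex for the next user.
    drop(guard);

    // Without the drop above, this second lock() would deadlock.
    let guard2 = shared.lock().unwrap();
    println!("{}", *guard2); // prints 1
}
```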


Not All Operating Systems’ Time Measurements Have Nanosecond Precision

Posted by Alyssa Riceman on

Rust is normally pretty good about cross-system support. However, I discovered yesterday that, at least within a certain narrow domain, its cross-platform support ends up being somewhat limited.

The SystemTime struct serves as a convenient way to get timestamps for things. And, under normal circumstances, run on Linux, it offers nanosecond precision.

However, I discovered the hard way, while writing some tests, that said precision is heavily platform-dependent. Run on Mac OS, it only has microsecond precision; on Windows, tenth-of-microsecond precision. The nanosecond precision on Linux is, as it turns out, the exception, not the rule, dependent on the precision of the system call it underlyingly relies on.
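For anyone who wants to eyeball their own platform’s precision, here’s a small probe of my own; the trailing digits of subsec_nanos are where the platform differences show up:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set after 1970");
    // subsec_nanos() always reports in nanoseconds, but how many of its
    // trailing digits are meaningful depends on the OS: on Windows and
    // macOS they tend to come back as zeros, because the underlying
    // system clock is coarser than on Linux.
    println!("{}.{:09} seconds since the epoch", now.as_secs(), now.subsec_nanos());
}
```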

I’m not sure where the gap in precision between operating systems comes from. It’s not a matter of different underlying hardware; run on a Linux VM hosted on my Mac, the nanosecond precision is right there. It’s purely a difference between the operating systems themselves, in terms of how their respective system-time-retrieval functions work. And I don’t know why Microsoft and Apple wouldn’t offer nanosecond-precision time-checking.

But the fact of the matter is that they don’t, or at least not in any manner convenient enough for the Rust standard library’s developers to have taken advantage of it. Anyone planning on writing a program which expects nanosecond precision should be accordingly cautious.


Compiler Fences

Posted by Alyssa Riceman on

I recently discovered a very useful Rust function: compiler_fence.

When performing code optimization, the compiler is not allowed to reorder memory reads and writes across a compiler fence. The function takes an Ordering argument, and the different orderings vary in their restrictiveness regarding exactly what sorts of reordering can and can’t cross the fence, but that’s the core principle: no reordering [of certain sorts] across this boundary.

Thus, when working with the highly inconvenient subset of code which works fine in debug mode but is nonfunctional in release mode due to compiler reordering, there’s no need for the sorts of awkward workarounds I was taught were necessary under equivalent circumstances in C, back when I was learning that. Instead, just add a compiler fence, and enjoy the lack of unwanted reordering with absolutely no overhead in unnecessary calculations!
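A minimal sketch of the usage; the surrounding code is hypothetical, but the fence itself is std::sync::atomic::compiler_fence:

```rust
use std::sync::atomic::{compiler_fence, Ordering};

fn main() {
    let data = 42;
    // With Ordering::SeqCst, the compiler may not move any memory access
    // across this point in either direction. It constrains only the
    // compiler's reordering; it emits no CPU instruction of its own.
    compiler_fence(Ordering::SeqCst);
    let ready = true;
    // Without the fence, the optimizer would be free to move the write to
    // `data` past the write to `ready` (harmless here, but crucial when,
    // say, a signal handler inspects both values between the two writes).
    println!("{} {}", data, ready);
}
```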


An Introduction

Posted by Alyssa Riceman on

This is my first time attempting a project of this sort. I’m not quite sure how it’s going to go. I suppose we’ll see.

On this blog, I intend to post about whatever interesting new code-related things I learn as I program. Useful libraries or functions I discover; odd behavioral quirks of one system or another; et cetera.

A warning: my posts here will not be exhaustively researched. My posts will be about my best understanding of a given topic at the time of writing, and I make no guarantees as to either their accuracy or their completeness, although if I discover that a given post was egregiously wrong in some manner I may go back and insert a correction.

With that warning out of the way, to anyone who may have found this blog by one means or another: welcome! And I hope that your time spent reading here is pleasant, informative, or otherwise worthwhile.