Alyssa's Coding Journal

On Code Ordering

Posted by Alyssa Riceman on

When I first learned to program, it was in Python. Python has a very simple principle for how to order your code: you put the dependencies before the things which depend on them. You can't assign a variable on line 5 as the output of a function defined on line 18; the interpreter doesn't yet know about that function when it executes the assignment on line 5.

In the last few months, I've been working with a lot of makefiles. And, in makefiles, things tend to be structured opposite to how they're structured in Python: the all recipe, which is likely to be among the most dependency-heavy recipes in the file, is the first to be defined, and subsequent file structure tends (at least idiomatically) to follow that pattern onward, placing dependencies below the recipes which depend on them.

Both of these options—dependencies before dependers, dependencies after dependers—are perfectly reasonable and sensible ways to do code-ordering. They’re opposite, but they’re both clear and non-confusing. Where things get bad is when neither of these protocols is followed.

This afternoon, I was looking at some Haskell code, in anticipation of finally making an effort to learn the language (as I’ve been meaning to do for years). In particular, at the sample code on this page (archive). And I was struck: the code changed direction, midway through. At the beginning, it was defining its grammar dependency-last, starting with the highest-level pattern (top) and working its way down to lower-level ones on which that pattern depended. But then, suddenly, after it had defined its grammar, it went and defined a main function dependent on that grammar. Within the grammar, the dependencies were below the dependers; but, outside of the grammar, the dependencies were above the dependers.

Shifting directions like that, midway through a file, makes for a confusing reading experience.

Sometimes, perhaps, it’s mandatory due to the restrictions of the languages you’re working with. (To anyone designing languages: please don’t make that sort of direction-changing mandatory; it makes the code more confusing to essentially no benefit.) And sometimes, perhaps, it allows you to simplify the code on other axes enough to be worth it. (Earlier today, in a Rust program, I had an earlier-defined function call a later-defined function, because the later-defined function was part of a larger block of parser functions which sorted neatly together and which, as a group, were dependent on the earlier-defined helper functions as a group; reordering the functions to follow dependency order individually, rather than as blocks, would have reduced code readability more than the direction-changing did.)
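As a minimal illustration of why Rust permits that sort of block-level reordering at all (names here are hypothetical, not from the actual program): Rust resolves item-level names across the whole file, so an earlier-defined function can freely call one defined below it, with no forward declaration.

```rust
// Rust resolves item names file-wide, so this earlier-defined
// function can call one defined further down.
fn describe(input: &str) -> String {
    format!("parsed: {}", parse_number(input))
}

// Defined after its caller; the compiler has no trouble with this.
fn parse_number(s: &str) -> i32 {
    s.trim().parse().unwrap_or(0)
}

fn main() {
    assert_eq!(describe(" 42 "), "parsed: 42");
}
```

This is exactly what makes the direction-changing possible: the language doesn't force either ordering, so it's on the programmer to pick one and stay consistent.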

But, as a general principle, code tends to be far more readable if you pick a single direction for your dependency-ordering and stick with it. I try my best to write my code that way, when practical, and I recommend that everyone else do the same.


Explicitly Marking Unreachable Code

Posted by Alyssa Riceman on

I discovered, this week, that there's a way of explicitly marking unreachable code in Rust, without needing to resort to something like panic!("This should be unreachable"). Namely: the unreachable! macro. When hit, it panics, because would-be unreachable code turning out to be actually reachable is panic-worthy; functionally, it's no different from an ordinary panic! call. But it makes the code a bit more readable, and as such I'm happy to have discovered it.
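A minimal sketch of a typical use (contrived example): the match arms below cover every integer, but the compiler can't prove exhaustiveness through the guards, so a final arm is required anyway.

```rust
fn sign(n: i32) -> &'static str {
    match n {
        n if n > 0 => "positive",
        n if n < 0 => "negative",
        0 => "zero",
        // Every integer is covered above; this arm exists only to
        // satisfy the exhaustiveness check, and panics if ever hit.
        _ => unreachable!("every integer is positive, negative, or zero"),
    }
}

fn main() {
    assert_eq!(sign(-3), "negative");
}
```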


Minimum Rust Binary Size

Posted by Alyssa Riceman on

Consider this very simple Rust program:

fn main() {
    ()
}

When compiled into a Windows executable in debug mode (target: x86_64-pc-windows-msvc), the resulting file is 135 KB. In release mode, 132 KB.

When compiled into a Linux binary in debug mode (target: x86_64-unknown-linux-gnu), the resulting file is 3.4 MB. In release mode, 3.3 MB.

When compiled into a Mac binary in debug mode (target: x86_64-apple-darwin), the resulting file size is 419 KB. In release mode, 414 KB.

I wish I knew enough about reading binaries to be able to get anything useful out of those files in a hex editor, because those file sizes, and especially the differences between them, are interesting. What’s all this machine code being run in order to start a program and immediately exit without doing anything? What are the differences between the systems (or between the compilers for the systems) which lead to such dramatic disparities in how much such code is run in order to do that? I don’t know, and I lack the skills necessary to find out, but I predict that the curiosity is going to keep gnawing at me now that I’ve discovered this.


Thoughts on Iced

Posted by Alyssa Riceman on

Iced is a GUI library for Rust. It was the first GUI library I ever tried, back slightly over a year ago when I was first learning the language—the first in any language, to be clear, not just in Rust—and it remains the one I’ve spent the most time with, because I haven’t done much GUI-related work since then.

Today, I finally revisited Iced for the first time since then. It’s advanced in that time, iterating from version 0.1 to version 0.3. I’ve advanced in that time, going from my first ever serious Rust project over to being an at-least-vaguely-experienced Rust programmer. So, in the context of all that advancement, have some thoughts:

That final point was a dealbreaker for me, when I noticed it; on that basis, I've decided to move my GUI-learning efforts over to Druid, whose minimal Windows executable size is a far more reasonable ~2.6 MB, albeit at the cost of much less easy-to-eyeball example code. But, for those less concerned than I am with bloated executable sizes, I continue to find Iced the easiest-to-understand of the Rust GUI libraries I've looked into thus far.


Pest

Posted by Alyssa Riceman on

Last weekend, my brain was unexpectedly hijacked—very thoroughly hijacked—by my discovery of a Rust library called pest. Pest’s basic concept is relatively simple: it’s a general-purpose parser which, instead of parsing Insert Specific Language Here (in the manner of e.g. the various Serde formats, quick-xml, et cetera), lets you define your own grammar for the parser to use and parse any input stream captured by that grammar.

The grammar-definition part of pest took me a decent number of hours to learn—learning that was the really seriously brain-hijacking bit—but it was greatly eased by the interactive editor located at the bottom of the pest website (linked above). I only stumbled my way into a left-recursive loop once, during that time. And now I've got a grammar defined which ought to dramatically simplify the back-end text-parsing section of my old dice-roller project (which I ultimately chose to use as the test case for teaching myself pest, since it's neither excessively simple nor excessively complicated and since I'd been meaning to do a full backend rewrite of it anyway for unrelated reasons).
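For a sense of what the grammar side looks like, here's a hedged sketch of a pest grammar file—not the actual grammar from the dice-roller project, just a simplified dice-notation example I've written for illustration:

```pest
// Hypothetical grammar for simple dice notation like "2d6+3".
// Silent rule (leading underscore): whitespace is skipped, not captured.
WHITESPACE = _{ " " }
expr       = { roll ~ (("+" | "-") ~ number)? }
roll       = { number ~ ^"d" ~ number }
number     = { ASCII_DIGIT+ }
```

Each rule reads roughly like a regex with names: `~` sequences sub-patterns, `^"d"` matches case-insensitively, and `ASCII_DIGIT` is one of pest's built-in rules.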

The output trees pest produces once the input has been slurped up and parsed, on the other hand, leave something to be desired. They work—they successfully represent the parsed information in a useful way—but they’re relatively code-intensive to work through and drill down into, without much in the way of easy simplifying methods built into them. (At least unless you bring in an external library like pest_consume, but when I tried that one out, I found it to be pretty inconvenient in its own right, and ultimately abandoned it in favor of default pest.)

I haven't yet finished rewriting my dice roller's back end. Maybe it'll turn out there's some horrible impediment, or just a pile of lots and lots of small inconveniences, such that pest is less of a convenience-boost than I currently think it is. Or maybe the dice roller will go fine, but I'll discover major limitations of some sort when I try to throw pest at a more ambitious project. But, tentatively, for the moment, I'm inclined to say: I like pest. Its grammar side is beautiful and powerful and something I'm likely to continue admiring for A While. Its tree outputs are… substantially less beautiful… but they still, as far as I can tell, basically work. And its overall functionality is very broadly applicable, something I anticipate having a use for in lots of different programs I write in the future. So, overall, at least for the moment, I am A Fan.


Forcing Resources Out Of Scope

Posted by Alyssa Riceman on

Rust’s scope-related rules are often highly convenient in their ability to ensure that things drop out of memory at appropriate times without the need for manual intervention. (And, occasionally, highly inconvenient in their ability to ensure that things drop out of memory while you’re still trying to use them; but I’m running afoul of that increasingly rarely as I improve my intuition for the language.) Memory is released when things fall out of scope, and there’s rarely any need to get more complicated than that.

Still, once in a while, manual intervention remains valuable; for example, when a resource has locked a file or a mutex, done everything it needs to do therewith, but not yet fallen out of scope in such a way as to unlock said file-or-mutex for the next resource that might want to access it.

When circumstances such as those arise, Rust has a convenient function to force a resource out of scope early. This function is drop. It is highly convenient (for those cases where the even-more-convenient scope rules fail to take precedence), and I’m glad to have discovered it.
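A minimal sketch of the mutex case (a contrived example, not from any particular project):

```rust
use std::sync::Mutex;

fn main() {
    let counter = Mutex::new(0);

    let mut guard = counter.lock().unwrap();
    *guard += 1;
    // The guard would otherwise hold the lock until the end of main;
    // drop releases it here, as soon as we're done with it.
    drop(guard);

    // Without the drop above, this second lock would deadlock.
    let reread = counter.lock().unwrap();
    assert_eq!(*reread, 1);
}
```

Notably, drop itself is trivial—it's just an empty function that takes ownership of its argument—which is rather elegant: taking ownership and returning nothing is all it takes to end a value's scope early.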


Not All Operating Systems’ Time Measurements Have Nanosecond Precision

Posted by Alyssa Riceman on

Rust is normally pretty good about cross-system support. However, I discovered yesterday that, at least within a certain narrow domain, its cross-platform support ends up being somewhat limited.

The SystemTime struct serves as a convenient way to get timestamps for things. And, under normal circumstances, run on Linux, it offers nanosecond precision.

However, I discovered the hard way, while writing some tests, that said precision is heavily platform-dependent. Run on Mac OS, it only has microsecond precision; on Windows, tenth-of-microsecond precision. The nanosecond precision on Linux is, as it turns out, the exception, not the rule, dependent on the precision of the system call it underlyingly relies on.
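One way to eyeball the granularity on a given machine—a rough probe, not a guaranteed measurement—is to look at the sub-second component of a timestamp and see how many trailing zeros it reliably carries:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

fn main() {
    // subsec_nanos always reports in nanoseconds, but on platforms
    // with coarser clocks the low digits will consistently be zero.
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set after 1970");
    println!("subsecond nanoseconds: {}", now.subsec_nanos());
    assert!(now.as_secs() > 0);
}
```

On a microsecond-precision platform, that printed value would (as far as I can tell) always end in three zeros; on a tenth-of-microsecond platform, two.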

I'm not sure where the gap in precision between operating systems comes from. It's not a matter of different underlying hardware: on a Linux VM hosted on my Mac, the nanosecond precision works fine. It's purely a difference between the operating systems themselves, in terms of how their respective system-time-retrieval functions work. And I don't know why Microsoft and Apple wouldn't offer nanosecond-precision time-checking.

But the fact of the matter is that they don’t, or at least not in any manner convenient enough for the Rust standard library’s developers to have taken advantage of it. Anyone planning on writing a program which expects nanosecond precision should be accordingly cautious.


Compiler Fences

Posted by Alyssa Riceman on

I recently discovered a very useful Rust function: compiler_fence.

When performing code optimization, the compiler is unable to reorder memory reads and writes across a compiler fence. The function takes an ordering argument controlling exactly what sorts of reordering can and can't be done across it, but that's the core principle: no reordering [of certain sorts] across this boundary.

Thus, when working with the highly inconvenient subset of code which works fine in debug mode but is nonfunctional in release mode due to compiler reordering, there’s no need for the sorts of awkward workarounds I was taught were necessary under equivalent circumstances in C, back when I was learning that. Instead, just add a compiler fence, and enjoy the lack of unwanted reordering with absolutely no overhead in unnecessary calculations!
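A minimal sketch of the call itself (the surrounding code is contrived; a real use would involve memory the optimizer might otherwise reorder, such as signal-handler-shared or memory-mapped data):

```rust
use std::sync::atomic::{compiler_fence, Ordering};

fn publish(data: &mut i32, ready: &mut bool) {
    *data = 42;
    // SeqCst is the most restrictive ordering: the compiler may not
    // move memory reads or writes across this point in either direction,
    // so the write to data cannot be reordered after the write to ready.
    compiler_fence(Ordering::SeqCst);
    *ready = true;
}

fn main() {
    let mut data = 0;
    let mut ready = false;
    publish(&mut data, &mut ready);
    assert!(ready && data == 42);
}
```

One caveat worth keeping in mind: compiler_fence only constrains the compiler, not the CPU; for cross-thread ordering guarantees at the hardware level, the separate fence function (or atomic operations) is the appropriate tool.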