I have just graduated and would like to get into the Compiler Engineering field.
Is the field still in demand? Are there any opportunities? What will I be working on? What tips can you give me?
I lead a PL/VMs/compilers research group and the number of organisations interested in hiring people in this area keeps growing and growing. Now, that's not to say that there are ever going to be hundreds of thousands of people working in this field, but it turns out that lots of organisations have realised they have pressing language needs. It varies all the way from those wanting to make mainstream language implementations perform better on their codebases to those wanting to reimplement a major internal language in a modern fashion.
At the moment -- and I expect for the foreseeable future -- demand quite significantly outstrips supply. Because of that, organisations are often quite flexible in how they view potential hires in this area. Simplifying a bit, there are two main ways to get into the field: obtain specific training in the field (e.g. a PhD), or start / contribute to a project (not necessarily OSS) that shows expertise (e.g. if you start filing high quality PRs against Rust or LLVM, people will fairly quickly realise you know what you're doing).
Best of luck -- it's a really fun field, with lots of interesting challenges, and a good group of people!
Any comment on the salary in these positions? Are they higher than for the average software developer, and if so, by how much? I am wondering because, in my experience, applying and developing logics and verification procedures for programming language development requires a vastly different and more rigorous personality and working methods than your average web developer gig.
I worked as a compiler engineer at a big tech company in SV, lived very comfortably, and was able to put a lot of money aside. Demand outstrips supply and it's hard to hire qualified people, so salaries have to be high to attract them.
Tangentially, I think the supply is small because compiler engineering is not a fashionable field of study. Everyone wants to do "AI" these days.
I hope I am not being pushy, but could you please say how much you made and at which company? Thanks
I'm not a compiler engineer, but I'm willing to venture that the lower bound of a "competitive salary" for an engineer in Silicon Valley is somewhere between 200 & 250k. Compiler engineers may make much more, I don't know, but it's unlikely they make less than that. Perhaps that answers your question?
> an engineer in silicon valley
Good point. The biggest factor that determines an engineer's salary is location, followed by years of experience. Not whether they work on compilers/embedded/web/whatever.
> Not whether they work on compilers/embedded/web/whatever.
Not true! See: salaries for people who know kdb+, and the fact that embedded engineers almost always get screwed on comp.
> salaries for people who know kdb+
If you are suggesting that kdb+ salaries are extremely high, like I often read here, I disagree. I did some research in this area, and despite the rumors, found lots of kdb+ jobs posted with very mediocre salaries. It seems unlikely that such jobs would be filled if salaries were as high as people have suggested on HN.
For example, here's a senior kdb+ dev job in Manhattan (high cost of living) that pays $165,000-185,000/year. I make almost that much as a senior dev in a low cost of living area. I'd expect to make at least $250,000/year in NYC or California.
I know multiple people who have jobs using kdb+, and I use an array language myself for work.
Every type of job site will have postings that are way below market rate. The market rate for kdb+ devs is well above that, I can assure you.
So would you please give me an idea of what market rate is for a senior dev working with kdb+ in NYC? Roughly? If it's well above $180,000, what is it? Thank you.
And I don't mean quants. I mean software developers.
And yes, it's true that every job category will have outliers (high and low), but after spending some time several years ago looking at jobs and speaking with recruiters, I'm skeptical. I've enjoyed learning about array languages, and had heard the rumors of high kdb+ salaries, which is why I was curious.
And I get that kdb+ is just a technology, and much more depends on individual qualifications. So let's say an experienced, highly-qualified senior dev who has taught themselves kdb+, just for argument's sake.
Yes, on that order of magnitude, as a starting salary after grad studies.
Check levels.fyi for salary ranges at various companies. Most companies do not distinguish someone who works on compilers from a general software engineer, so the software engineer salaries there apply directly.
I don't pretend to have a total picture of all the possibilities, and there's quite a bit of not-always-explainable variation in what I do know, but I think it would be fair to say that people in this field are quite some way from going hungry. It tends to be better remunerated than a typical software gig but it's certainly possible to earn more in AI/ML. A lot also depends on whether you want to be a "normal" engineer or are willing/capable of taking on managerial responsibility.
Do you mean that compiler engineers are getting paid shit nowadays?
I wonder if it's similar to embedded software development (which also requires "rigorous working methods"). Salaries for embedded are generally lower than for web development because there's much less demand for embedded skills.
Embedded software development has improved over the past few years, the reasons being more silicon/ASICs in vehicles and other places, autonomous driving needs, robotics in the supply chain, IoT, and smart home devices. There are a lot of opportunities for embedded software these days.
Would you consider implementing the first part of the Tiger book as enough expertise to start in the field? This includes, in my case, generating x86-64 assembly and writing a register allocator.
I’ve worked in compilers (optimizations) at Microsoft (VC++) and NVIDIA (CUDA). Register allocators, schedulers and loop optimizers were super fun.
Then I worked on database (query) compilers. I managed Apache Hive at Hortonworks, and that has a compiler. These are very different beasts.
Now, my startup is working on code analysis and cross compilers for data engineering, including Spark.
Some of my colleagues are working in AI chip startups.
It’s hard to find high quality engineers for both kinds of compilers. Just as with database internals engineers, demand is high.
Plug - I’m hiring at Prophecy.io for compiler engineers.
I work at Oracle Labs where GraalVM (graalvm.org) is being developed. I think this is one of the most exciting places to be right now for compiler engineers, and people with the right skills are certainly in demand. Some things that came out of this: partial escape analysis, the Graal JIT compiler, SubstrateVM, native-image, several Truffle language implementations (JS, Ruby, Python, LLVM), IGV for visualizing different layers of IRs, integrating GraalVM in the Oracle Database (MLE).
Basic/naive/stupid question that I've been genuinely curious about for a little while now:
What are the real-world motivations behind reimplementing the noted languages on top of Truffle? These projects are frankly incredible - hundreds (thousands) of total man-years of work - and all to _re_build something that already exists. I'm missing context that I don't know where to look for.
Is there anywhere I might look online to immerse myself in the relevant discussion of the business use cases that drive these types of huge projects?
Truffle languages don't require hundreds or thousands of man years of work. That's the point of them: implementing a language on Truffle is very easy compared to writing something like V8. You "just" write an interpreter and then extra code after that is about performance.
As for why, well, Truffle languages have a bunch of advantages over implementing engines in other ways:
- They all interop with each other, with JVM bytecode languages, with anything that compiles to LLVM, and anything that runs on WebAssembly. All in the same VM with the same heap (i.e. they can all use pointers to each other's constructs).
- You get access to HotSpot and in particular its state of the art GCs.
- You get a lot of sophisticated visualisation and debugging tools, both to help you develop and optimise the language (IGV) but also to analyse programs in production (VisualVM, JFR).
- You can support native extensions using [Safe] Sulong.
- You get debugger support for free, because Truffle languages all support the Chrome DevTools debugger protocol.
- They just added support for the language server protocol so you get some automatic support for IDEs.
- You get a consistent sandboxing API for free.
- Probably a lot of other things I forgot.
So you get a better engine than most languages can afford, for way less cost, with way more features and access to a huge ecosystem of libraries from all the different languages. Pretty snazzy.
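To make the "you just write an interpreter" point concrete, here is a minimal sketch of the kind of tree-walking interpreter meant above, where each AST node knows how to execute itself. This is plain Python, not the Truffle API (Truffle itself is a Java framework), and the node classes `Const`, `Var`, `Add` and `Mul` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


class Node:
    """Base class: every AST node can execute itself against an environment."""

    def execute(self, env: Dict[str, int]) -> int:
        raise NotImplementedError


@dataclass
class Const(Node):
    value: int

    def execute(self, env: Dict[str, int]) -> int:
        return self.value


@dataclass
class Var(Node):
    name: str

    def execute(self, env: Dict[str, int]) -> int:
        return env[self.name]


@dataclass
class Add(Node):
    left: Node
    right: Node

    def execute(self, env: Dict[str, int]) -> int:
        return self.left.execute(env) + self.right.execute(env)


@dataclass
class Mul(Node):
    left: Node
    right: Node

    def execute(self, env: Dict[str, int]) -> int:
        return self.left.execute(env) * self.right.execute(env)


# The expression 2 + x * 10, written directly as a tree.
program = Add(Const(2), Mul(Var("x"), Const(10)))
result = program.execute({"x": 4})
print(result)  # 42
```

Nothing here is fast. As I understand the claim above, the framework's job is to turn an interpreter of roughly this shape into optimised machine code, so the language implementer mostly writes nodes like these.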
Communicating or interfacing between massive legacy code bases. You can write Python code and call it from Ruby code in a slightly awkward, but livable, way.
Also, I'm assuming there are eventually plans to compile these other languages with native-image. In my eyes native-image is Java's response to golang: reasonably small binaries that are perfect for containerization and quick to elastically scale up/down (startup time).
Not only is it still in demand, I think demand for compiler engineers is growing (and fast) due to the end of Moore’s law, since that means performance improvements are mainly (and soon only) happening due to improving software and making specialized hardware, and both of those involve heavy use of compilers and compiler writers.
JIT-compiled scripting languages and DSLs (domain specific languages) are all over the place. GPUs and the APIs that run on GPUs all need special compilers, and embedded processors and custom chips often need custom compilers and/or custom optimizations.
There’s certainly a ton of optimization work out there, so if you like designing optimizations on top of existing compilers, you can pretty much pick what kinds of API/software/hardware you want to work on.
Intel, IBM, Nvidia, Google, Red Hat, and MSFT are all currently hiring compiler engineers. Experience with LLVM is probably useful for many of those opportunities. https://us-redhat.icims.com/jobs/75775/software-engineer---l... https://nvidia.wd5.myworkdayjobs.com/NVIDIAExternalCareerSit... https://careers.ibm.com/ShowJob/Id/764855/Software-Developer...
Arm is another company that is frequently and currently hiring compiler engineers (mostly for UK Manchester/Cambridge; looking at the current listings it's largely looking for people with some experience at the moment but there are also some intern/student placements.)
I second this: My alma mater (in Germany) has a compiler group, and, as far as I'm aware, at least some of its alumni went to the companies mentioned by parent.
IMO, you should forget about compiler engineering as working on the backend of, say, gcc, or llvm. Yes, there are such positions (e.g., intel needs people adapting compilers to new processor instruction sets) but they are few and very specialized.
Instead, think about languages in general. There is a myriad of opportunities between the JS ecosystem, JVM and .NET, things like Rust and even low-level languages like LLVM IR. It should be straightforward to pick features from one ecosystem and port them to another. Say, linear types from Rust to the JVM. Pick one or two such features, port them to your favorite ecosystem and provide a prototype compiler/macro library on GitHub. With that kind of experience under your belt, you should be the goto guy for any software company that does tooling in that ecosystem. From that vantage point you could start looking for ever more interesting work.
I wonder if the growth of "low-code" will lead to many more well designed domain specific languages.
LookML in Looker is one example I can think of - imo it has enabled Looker to solve the self-service data issue much better via a low-code approach than Tableau and Qlikview (both "no-code").
This is absolutely valid. There is a comprehensive analytics platform that I work on where we went from (a litany of) REST services to our own domain language. It not only allowed us to go across databases (Graph / RDF / RDBMS) but even across processing engines (R/Python). This is really where I think the industry should go. I am sure someone will point out there is a great talk by Brian Kernighan on this on youtube.
minor nit: please use “goto person” if you’d like to keep all readers connected to your writing.
(or use “goto woman” to subvert so-called typical practitioner expectations)
thank you for considering
Please stop posting things like this.
Since there are people in the know here:
How would you get into the adjacent field of language design? I would be super interested in doing research on the mental load of different programming languages (and different representations of information). I.e. put a hacker in an fMRI and let her solve tasks in different languages.
People in academia are doing research in this area. Let me know if you want a PhD, I am hiring :)
This is an interesting idea.
I wonder how many people you'd need to measure to produce a usable stereotype model. (Hmm, I wonder if studies of other types of problem-solving challenges have asked the same question and published any useful data?)
I fear you'd need to parametrically plot developers, experience, and language together to get a useful result.
And then there's the "I get it" threshold point at which the brain suddenly finds a particular subject much easier to reason about, ie, fundamental learning. Everyone's on a spectrum there - one that seems to be very hard to self-assess.
There has been a resurgence due to the proliferation of more special-purpose processors, like Google TPU or Tesla FSD. I anticipate this trend continuing due to the slowdown of Moore's Law. However, if you're not really into this, there's no reason not to pursue some other area: on a relative basis, higher levels of the stack will likely have many more opportunities than infrastructure.
Is there a way I can combine machine learning with compiler stuff? I've been meddling with ML for a while and it's pretty interesting too.
Certainly! We do so :)
Can you elaborate on that please? Like how do they merge together? I am really interested in those two fields
This is as good an overview as any: https://arxiv.org/pdf/1805.03441.pdf
In general we can afford to treat the problem as a search problem since our kernels are small enough. There is a lot here that ML-aided stochastic search is amazing at compared to normal compilers, and in some cases even compared to hand-optimized assembly (!).
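As a rough illustration of "treat it as a search problem", here is a toy random search over tile sizes for a small kernel. Everything in it is an assumption made for the sketch: `toy_cost` is a made-up stand-in for actually timing a kernel, `TILE_SIZES` is an arbitrary search space, and plain random sampling replaces the learned, ML-guided proposals a real system would use. It is not the method from the linked paper.

```python
import random

# Hypothetical search space: candidate tile sizes for two loop dimensions.
TILE_SIZES = [1, 2, 4, 8, 16, 32, 64]


def toy_cost(tile_i, tile_j):
    """Made-up cost model standing in for a real run-time measurement.

    Tiny tiles pay loop overhead; tiles larger than 256 elements are
    assumed to spill out of a fast memory and pay a penalty.
    """
    footprint = tile_i * tile_j
    overhead = 1024 / footprint
    spill = max(0, footprint - 256) * 0.5
    return overhead + spill


def random_search(trials, seed=0):
    """Sample configurations at random and keep the cheapest one seen."""
    rng = random.Random(seed)
    best = (TILE_SIZES[0], TILE_SIZES[0])
    best_cost = toy_cost(*best)
    for _ in range(trials):
        candidate = (rng.choice(TILE_SIZES), rng.choice(TILE_SIZES))
        cost = toy_cost(*candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


best, best_cost = random_search(trials=200)
print(best, best_cost)
```

The ML part would come in by replacing `rng.choice` with a model that proposes promising candidates, or by replacing `toy_cost` with a learned predictor so that fewer real measurements are needed.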
Check what LLVM is doing with the MLIR project.
You may enjoy the paper "Learning to Optimize Tensor Programs": https://arxiv.org/abs/1805.08166
I have been curating this list for the last little while, and made it public about a month ago.
While it is not the list of positions that the URL suggests, it stands to help remind people that compilers are in way more companies than you think!
I'm curious about whether there is a large skills overlap between engineers working on compilers and embedded DSLs (for the case of external DSLs, such an overlap would IMO be quite significant).
By the way, are there any people here (or perhaps you could recommend someone) who are very capable and would be interested in doing some potential contract work on a DSL? I'm still pondering embedded vs. external: obviously, each category has its own pros and cons, but platform stack selection adds an additional dimension of complexity. Several people I have approached so far are definitely very knowledgeable, but they all work in academia and thus, practically, have no time for consulting or contract work. :-( Not to mention potential issues arising from IP clauses in some experts' university contracts, hence the question above.
I have no specific experience with compilers (I missed the chance to take the class in college due to scheduling conflicts).
My current day job, at a research nonprofit, is working on a system whose behavior is driven by an external DSL. I get to work on the parser/compiler type thing that generates executable units and the interpreter that actually runs them. This is certainly not the “big leagues” as far as compilers are concerned, we’re not executing anything directly on anyone’s CPU, but I do find it tremendously enjoyable.
Thank you for sharing your experience. Do you use a more or less well-known toolset like MPS or Xtext, or is it some kind of custom framework? Any lessons learned and best practices regarding external DSL design and development?
We built everything ourselves. The parser in particular was written in-house by a Person Who No Longer Works Here who was a parser enthusiast (Conway’s law). If we were starting from scratch today I would think we would use an off the shelf package. One excellent choice was to make the language an open standard, and now there is at least one totally novel implementation in addition to ours.
I appreciate your clarifications. Until now, I was aware of , but not  and . Your implementation seems to be much more popular than both the new alternative and a reference implementation. The (general and scientific) workflow engines universe is very large - it likely contains, at least, dozens of production-level implementations. Such diversity is both a blessing and a curse, though ...
There’s a lot of compiler work in query processors for analytical systems. High-level languages compiling down to machines at different levels of abstraction, LLVM being the lowest and database operator libraries a bit higher. I’d like to find someone interested in pursuing that direction.
There's been a resurgence due to interest in specialized processors and new computing architectures and systems in general, mostly for ML.
Quite a bit of work is happening around https://nim-lang.org/ and perhaps Status.im is still hiring.
i know an actual person who was hired to work on compilers/lang design there, so it's not out of the question, but it might require some kind of involvement with the language/community before that (if there are any open slots at all)
Mostly semiconductor companies. Compiler engineers do work for CPU/GPU/FPGA/AI chipsets.
Definitely in demand! For example, there's a bunch of really interesting commercial work happening building or working on compiler toolchains for WebAssembly.
What qualifies someone to be a Compiler Engineer?
Compsci degree? Working compilers? Opensource commits? Github repo full of parsers?
The ability to help deliver better compilers.
How you demonstrate that can be in a variety of ways. Any of those would probably be helpful if you can get the right person to notice them.
Having written one or more compilers, or large parts thereof. Obvious bootstrap problem notwithstanding.
Degree in CS, ideally with a thesis in the field.
This is the real answer: it's a huge career risk to make compilers your niche unless you have an academic background in it. The gatekeepers in most CS specialties are really good, and there's an attitude that if you're not at least pursuing a doctorate or working on a big Google project then you're probably a charlatan.
Meh. No doubt there are snobs in the field, but in my experience most of the grit and innovation tends to come from solid engineers without a language/compiler focus, let alone the PhD.
> Github repo full of parsers?
Definitely not that one.
Could be a C++ parser
Apple has teams working on LLVM, Clang, and Swift. They have some open positions right now.
There's some work happening in the blockchain space because people are starting to create their own smart contract languages. I don't know of any other space seeing so many 'new' languages being created in such a short space.
I've seen jobs advertised by teams developing Cardano, Tezos, and Ethereum. It's worth looking at their project pages, seeing which teams work on them, and then checking their careers pages. I know IOHK have been hiring in the space, although that's specifically functional compilers. However, nothing's up on their careers pages right now.
Are these compilers or mere transpilers?
Not sure if I am being sarcastic or serious.
This stopped being a useful distinction a long time ago.
A transpiler uses much of the same technology as a native code compiler.
What's a good paper (or book) reviewing current trends in compilers?
I found this, but it's just ML, and not easy to read:
A Survey on Compiler Autotuning using Machine Learning
Check out the annual proceedings of conferences like POPL, PLDI, and CC.
Check out http://www.compilerworks.com
I worked there for a while but compilers aren't really my thing. They're building cool stuff, though. Their tech converts database queries between dialects, and can even run as a translating proxy (for example, run Oracle code on Postgres) and provide performance insights into your code. Their clients are big name companies who need to maintain millions of lines of DB code.
Thanks for the kind words Karl.
We're hiring compiler engineers at Horizon Quantum Computing (http://horizonquantum.com).
A bit off topic and a shot in the dark, but no one has been able to tell me so far: what kind of machine code would a quantum computer execute? Is it an instruction set like any other computer's, or does it contain a lot of weirdness (instructions mapping to insane micro-op sequences or something)? Is there any information out on that topic at all? I am assuming it would still map some electrical 1/0 kind of values to machine code and up to higher-level languages, but the concept makes my brain fart a bit when I think about it. It would be super interesting to learn about or read up on.
Quantum processors don't have the same gate sets as are used in conventional computers, since they can be in superpositions rather than in just the 0 and 1 state. Instead of NAND gates, a universal set of quantum gates will include a gate capable of producing quantum mechanical interference, such as a Hadamard gate. Almost all quantum programming languages are currently at this gate level, often implemented through Python or another high-level language which you can use to build the circuit (e.g. IBM Qiskit, or Rigetti PyQuil). Our focus has been on building higher level languages to get to a point where the level of abstraction is such that the same code can be compiled for both conventional and quantum processors, which is not something that can be done with existing tools without resorting to the extremely inefficient direct simulation of quantum processors.
The exact output we shoot for depends on the processor. For the Rigetti processors, for example, the native set of gates is limited to specific single-qubit rotations and specific two-qubit gates (CZ or parameterised XY gates). It is sufficient to construct a program using these gates, although we implement a transpiler to convert between any universal set of gates.
For various reasons (usually to combat noise or increase performance of the processor) it is sometimes useful to be able to compile to lower level operations. In particular, for most quantum processors gate operations are implemented using laser/microwave/radiowave pulses, and so ultimately you aim to produce a sequence of such pulses. Being able to tweak the shape, phase, frequency and duration of such pulses is important for maximising the performance of the device. This is usually done to minimise error rates, but can also be done in an attempt to realise faster logic gates etc.
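For a feel of what "gate level" means, here is a minimal sketch that simulates a single qubit in plain Python: a state is a pair of amplitudes and a gate is a 2x2 matrix. It only illustrates the Hadamard gate and interference mentioned above. It does not use Qiskit or PyQuil, and the helper names `apply` and `probabilities` are made up for the example.

```python
import math

# Hadamard gate as a 2x2 matrix.
S = 1 / math.sqrt(2)
H = [[S, S],
     [S, -S]]


def apply(gate, state):
    """Multiply a 2x2 gate matrix by a 2-element state vector."""
    return [sum(gate[row][col] * state[col] for col in range(2)) for row in range(2)]


def probabilities(state):
    """Measurement probabilities are the squared magnitudes of the amplitudes."""
    return [abs(amplitude) ** 2 for amplitude in state]


zero = [1.0, 0.0]        # the |0> state
plus = apply(H, zero)    # equal superposition of |0> and |1>
back = apply(H, plus)    # second Hadamard: the |1> amplitudes cancel

print(probabilities(plus))  # roughly [0.5, 0.5]
print(probabilities(back))  # roughly [1.0, 0.0]
```

The second Hadamard returning the qubit to |0> is the interference part: the two paths leading to |1> have opposite signs and cancel, which has no counterpart in a classical NAND-style gate set.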
Thanks a lot for the response. Even though admittedly I can hardly comprehend what it means, it finally gives me some topics and concepts to dive a bit deeper into to try and get my head around such a complicated topic, especially your points on the logic gates used. I've always looked at it from a bottom-up approach, but this explains exactly why that has gotten me nowhere so far. Thanks a lot. :D
I'm interested in writing parsers, but my experience is only with small-scale web editors for specific types of documents full of business rules.
E.g. users write documents in something like Markdown, and the editor then renders the document with things like automatic numbering and references to other elements via syntax like '@element2'.
Are there jobs in writing parsers?
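For readers wondering what such an editor's parser looks like, here is a minimal two-pass sketch. The syntax is assumed for illustration, since the comment above only gives '@element2' as an example: a line starting with `[name]` defines a numbered element, and `@name` anywhere in the text refers to it. `DEF_RE`, `REF_RE` and `render` are hypothetical names.

```python
import re

DEF_RE = re.compile(r"^\[(\w+)\]\s*(.*)$")  # "[name] text" defines an element
REF_RE = re.compile(r"@(\w+)")              # "@name" refers to an element


def render(source):
    """Pass 1 numbers the defined elements; pass 2 resolves @references."""
    numbers = {}
    parsed = []
    for line in source.splitlines():
        match = DEF_RE.match(line)
        if match:
            name, text = match.groups()
            numbers[name] = len(numbers) + 1
            parsed.append((name, text))
        else:
            parsed.append((None, line))

    def resolve(match):
        ref = match.group(1)
        # Leave unknown references untouched so the author can spot them.
        return f"item {numbers[ref]}" if ref in numbers else match.group(0)

    rendered = []
    for name, text in parsed:
        text = REF_RE.sub(resolve, text)
        rendered.append(f"{numbers[name]}. {text}" if name else text)
    return "\n".join(rendered)


doc = "[element1] Scope\n[element2] Terms, see @element1\nAs defined in @element2 and @missing."
print(render(doc))
```

Two passes are used so that a reference can point at an element defined later in the document, which is the same reason compilers separate symbol collection from resolution.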
I always thought "Program Vector Representations for Deep Learning" was an interesting combination of compiler technology (parsers, ASTs) and machine learning.
Obfuscators. I work for a company that builds mobile obfuscation at the compiler level (LLVM IR and Java bytecode). There are also gaming companies that use similar techniques and technology for anti-cheat engines.
I also often see job postings for compiler developers in the embedded field.
Julia Computing hires compiler talent.
I saw this position for Dart (in Denmark) on twitter: https://twitter.com/mraleph/status/1218909246294765569
Codeplay Software develop compilers for new architectures and new high performance programming models. They are always looking to hire talented engineers.
Definitely in demand! We are hiring compiler developers at www.edument.se (remote or onsite in Sweden or Prague), contact me at tore (at) edument . se
I work at an embedded OS company on the entire command-line toolchain (as opposed to the GUI-based IDE). That includes maintaining existing and doing new compiler ports.
We always have a great deal of trouble finding qualified candidates to fill open reqs. We find few are interested in this sort of technology; everyone wants to work either on cool new applications (like self-driving automobiles) or on high-profile web-based stuff. Our entire department is either waiting for retirement or a heart attack (whichever strikes first, and retirement is not a realistic expectation).
tl;dr compiler engineers are in demand and they work on compilers.
A lot of work with compilers in blockchain/Ethereum.
But if you're considering an education choice, don't overspecialise. You will most likely change specialisation in 5-10 years (different market, different technology), so if all you know is just compiler engineering, it will be way tougher for you.
Also, make sure you learn a lot about algorithms / computer science in general. If you know that, switching engineering fields will be easy.
You should add some contact information to your profile.
And/or get in touch.
mostly i see people working in probabilistic programming languages and formal methods
It is very sad how the engineer term has been corrupted.
You are an engineer only if you pass the FE exam and are a registered member of the state association of engineers
I agree, what's wrong with "developer"? I know what engineers do, and devs don't really do the same thing.
You’re gonna be replying to a lot of HN posts with this trope then.
nice try, oregon