zhasha wrote:
Admittedly I don't know how ARM passes arguments etc, however from what I read, the calling conventions are different from x86.
Calling conventions are the least of your concerns when performing binary translation. Except for user-mode emulation layers it's a non-issue: how the program passes arguments internally doesn't matter.
zhasha wrote:
Now the goal of using LLVM would not be raw power, but more the ability to recompile to run the code natively.
That's what QEMU does. What do you mean by raw power exactly? You want recompilation because it's faster, correct? Otherwise there should be no reason not to be content with interpretive emulation, which is probably better in all other ways.
zhasha wrote:
It doesn't have to be JIT; it could easily be a full recompilation + optimization.
Actually, static recompilation has many complications that make it less attractive than dynamic recompilation, and it gives poorer compatibility. The set of indirect branch targets can't be determined directly, and while heuristics exist, they may not resolve all targets and will most likely cause more code to be generated than is ever branched to. The closer you get to the complete target set, the more false positives (and hence unused "code") you'll probably end up with. Dynamic loading and self-modifying code can't be handled at all, which is a more fundamental issue than may be evident. If you're intent on storing translated executables to disk, you'd also be passing a large amount of the conversion overhead on to your filesystem as bloat.
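To make the indirect branch problem concrete, here is a toy Python sketch of one common kind of heuristic: scan the data for words that look like code addresses. Everything in it (the segment bounds, the data words, the function name) is made up for illustration; it isn't how any particular recompiler does it.

```python
# Hypothetical code segment bounds for a toy guest image.
CODE_START, CODE_END = 0x1000, 0x1100


def candidate_branch_targets(data_words):
    """Heuristic: any data word that falls inside the code segment
    is treated as a possible indirect branch target."""
    return sorted({w for w in data_words if CODE_START <= w < CODE_END})


# Data section of the toy image. 0x1010 and 0x1040 are a real jump table;
# 0x1050 is just an integer constant that happens to look like an address.
data_words = [0x1010, 0x1040, 0x1050, 0xDEADBEEF, 0x7]

# Targets the program really branches to at run time. 0x1080 is computed
# arithmetically (say base + index * 16), so it never appears in the data.
real_targets = {0x1010, 0x1040, 0x1080}

found = candidate_branch_targets(data_words)
false_positives = [t for t in found if t not in real_targets]
missed = sorted(real_targets - set(found))

print([hex(t) for t in false_positives])  # code translated but never run
print([hex(t) for t in missed])           # code needed but never translated
```

The heuristic both over-approximates (0x1050) and under-approximates (0x1080). Loosening it to catch computed targets only makes the false positive list grow.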
zhasha wrote:
The idea is to eliminate the need for any layer between WINE and the application, or changes to WINE that aren't purely because of compilation/runtime issues - not cross-arch related.
Converting the code to LLVM and recompiling it does nothing to accomplish this. In fact, such a thing wouldn't really work, because WINE operates on x86 Windows executables and would almost definitely fail to understand a Windows executable that has been converted to ARM. To do things in this order (x86 Windows -> ARM Windows -> ARM Linux) you would need a completely different WINE implementation.
On the other hand, x86 Windows -> x86 Linux -> ARM Linux is a path that can work. By performing recompilation dynamically you can allow for this. WINE converts the application in memory to x86 Linux, and as this program is executed it is converted to ARM Linux.
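Roughly, "converted as it is executed" means a loop like the one below: translate a block the first time control reaches it, cache the result, and reuse it afterwards. This is a minimal Python sketch with an invented four-instruction guest ISA (add, dec, jnz, halt) and invented names; it only shows the shape of the dispatch loop, not QEMU's actual implementation.

```python
def translate_block(program, pc):
    """Translate the straight-line block starting at pc into a host callable.
    The block ends at the first control-flow instruction."""
    ops = []
    while True:
        op, arg = program[pc]
        ops.append((op, arg))
        pc += 1
        if op in ("jmp", "jnz", "halt"):
            break
    fallthrough = pc  # address of the instruction after the block

    def run(state):
        # Returns the next guest pc, or None when the guest halts.
        for op, arg in ops:
            if op == "add":
                state["acc"] += arg
            elif op == "dec":
                state["ctr"] -= 1
            elif op == "jmp":
                return arg
            elif op == "jnz":
                return arg if state["ctr"] != 0 else fallthrough
            elif op == "halt":
                return None

    return run


def execute(program, state, entry=0):
    """Dispatch loop: translate on first use, then run from the cache."""
    cache = {}
    translations = 0
    pc = entry
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = translate_block(program, pc)
            cache[pc] = block
            translations += 1
        pc = block(state)
    return translations


# Toy guest program: add 5 to acc three times, then halt.
program = [("add", 5), ("dec", None), ("jnz", 0), ("halt", None)]
state = {"acc": 0, "ctr": 3}
translations = execute(program, state)
```

The loop body runs three times but is translated once. Only code that is actually reached gets translated, which is why dynamically loaded code stops being a special case; self-modifying code would additionally need the cache to be invalidated on writes, which this sketch doesn't attempt.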
zhasha wrote:
In the case of qemu-llvm, I can imagine that a large performance hit would come from not having all the application code at start time, thus you can't optimize as aggressively.
This is a naive assumption and is most certainly not the root cause. You can't compare binary translation to backend compilation. LLVM's IR is designed as a target for compiler front ends, not as an intermediary between different machine codes - it just doesn't model certain low level details well enough. Typical compiler output is actually a lot simpler than machine code and contains more useful information. You won't have this information when converting from x86 to LLVM, nor will you have something that nicely models flags and other machine behavior in a way that LLVM's ARM code generator can use. Whole program optimization is nice at a high level, but when you're dealing with machine code most of the important context is sitting in registers that will usually have a limited liveness window for any particular allocation. Besides that, you can recompile adaptively to achieve a lot of "whole program" like optimizations, and you can recompile greedily to achieve things like inlining.
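As an illustration of the flags problem: a single 32-bit x86 ADD implicitly updates several status flags, and a translator going through a generic IR has to spell each one out unless it can prove the flag is dead. The Python below is only a sketch of that explicit modelling (the function name is mine, and AF and PF are left out for brevity), not anything taken from QEMU or LLVM.

```python
MASK32 = 0xFFFFFFFF


def add32_with_flags(a, b):
    """Model a 32-bit x86 ADD: return the result plus the flags it sets.
    a and b are unsigned 32-bit values. AF and PF are omitted here."""
    full = a + b
    result = full & MASK32
    flags = {
        # Carry: unsigned overflow out of bit 31.
        "CF": full > MASK32,
        # Zero: the truncated result is zero.
        "ZF": result == 0,
        # Sign: bit 31 of the result.
        "SF": bool(result >> 31),
        # Overflow: operands share a sign and the result's sign differs.
        "OF": bool(((~(a ^ b) & (a ^ result)) >> 31) & 1),
    }
    return result, flags


# Signed overflow: INT_MAX + 1 wraps to the most negative value.
r1, f1 = add32_with_flags(0x7FFFFFFF, 1)

# Unsigned overflow: 0xFFFFFFFF + 1 wraps to zero.
r2, f2 = add32_with_flags(0xFFFFFFFF, 1)
```

That is four extra computations hanging off one guest instruction. Most of them are usually overwritten by the next arithmetic instruction before anything reads them, so a translator that knows about flags can drop them cheaply; a general-purpose optimizer has to rediscover that from the expanded code.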
Actually, the big hit from llvm-qemu comes from generating LLVM code from QEMU's intermediate macros rather than going straight from x86 (which would have been much, much more work, and you can see why no one has done it yet). But what's telling is that it couldn't even do better than standard QEMU. Think about what's happening: normally QEMU (as of then) would compile blocks of code by pasting GCC-generated function bodies together, so each piece of a code block is compiled in isolation. llvm-qemu would paste llvm-gcc-generated function bodies together, then run the LLVM optimizer on the entire block. This should have provided some level of register allocation, liveness analysis, propagation, and so on, and yet it was still worse than the original version. It could have been llvm-gcc's fault, but I have to wonder.
zhasha wrote:
The only remaining piece of the puzzle would be the x86 frontend for LLVM, which is a huge piece of work - then comes all the stability/optimization work of course.
I'm not at all saying this is a good idea, I'm just saying it's a solution.
But so is using QEMU/WINE, and it's a solution where much more of the work is already done. In fact, it might be possible to run it in this manner right now, without any coding being necessary. Someone should try it.
zhasha wrote:
The idea of running x86 binaries on an ARM processor is ridiculous enough in itself, let alone binaries from Windows. All the good games have been ported anyway

Hm, I wouldn't count on most people agreeing with you. Maybe some big name games have been ported in recent years, but that isn't relevant. We're talking about games that ran on mid to late 90s PC hardware. How many of those were ported to Linux? Or do you think all of them are bad?