When your disk drive had an 11ms access time, it didn't matter much how many layers of code were between your application and the write head. Ten years on, a good enterprise SSD can have a write access time of 11µs, but CPUs aren't anything remotely close to 1000x faster.

Okay, just for grins: a typical desktop hard drive has an access time of about 9-15 ms, or 0.015 seconds at the slowest. So Pixy picks a number and says 11 ms, which is approximately correct.
The solution? Get rid of most of the operating system. Maybe.
That 11 ms, that's the time it takes from the moment the drive finishes receiving the read request until the moment that data starts flowing out of it. 11 ms is not a lot of time for us, as it is a smidge more than a hundredth of a second, but for your CPU it's an approximate eternity.
Understand that a processor core running at 2.5 GHz will spend some twenty-seven million clock cycles waiting for the data it asked for (and a four-core chip wastes four times that many cycle-slots). When your data is stored on spinning metal, that latency means there will be times when your CPU is waiting for the hard drive, and there's plenty of time for it to handle other tasks. For the most part, in a single-user environment, that means that nearly all the time, your CPU is running the OS itself and is otherwise waiting for input.
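That figure is easy to check. A quick back-of-the-envelope sketch (the 2.5 GHz clock and 11 ms access time come from the paragraph above; the rest is just arithmetic):

```python
# How many clock cycles one CPU core idles away during a single
# hard-drive access, using the numbers from the text.
clock_hz = 2.5e9      # 2.5 GHz core clock
hdd_access_s = 11e-3  # 11 ms access time for spinning metal

wait_cycles = clock_hz * hdd_access_s
print(f"{wait_cycles:,.0f} cycles wasted per access")  # 27,500,000 -- the "twenty-seven million" above
```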
In the old DOS days, this meant that perhaps 90% of the machine's total computing power was available for running programs. DOS was not a terribly complex operating system (simple enough that some people write their own for fun) and a lot of it had been written in assembly. It could not multitask, had no real security, and was limited to what we'd now consider an absurdly small memory map, but it was tight and fast and it stayed out of the way of applications. It worked well when clock speeds were single-digit megahertz and it cost ten dollars an hour to connect to the Internet at 300 baud.
The advent of graphical user interfaces like Windows and Mac OS meant more computing power had to go into running the OS. The presentation layer of the OS became vastly more complex and a lot of the gains from moving to 16-bit (and then 32-bit) processors were consumed by the needs of the OS itself.
But these days?
SSDs are at least an order of magnitude faster than any storage we've previously used for personal computers. Actually, it's more like three orders of magnitude, as an 11 µs access time is one thousandth of the hard drive's 11 ms access time. At this level the CPU is still waiting for the data to start moving, but now it's waiting twenty-seven thousand clock cycles; since we're still using SATA (mostly) the data flows at 600 MB/s--which is a speed at which the processor could (in theory) actually execute the code as quickly as it comes off the drive. In that case it would still spend time waiting between instructions, about four clock cycles each.
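Same arithmetic at SSD speeds, plus the cycles-per-byte figure for a saturated SATA link (numbers from the text again; the 2.5 GHz clock is carried over from above):

```python
clock_hz = 2.5e9      # 2.5 GHz core clock
ssd_access_s = 11e-6  # 11 µs SSD access time
sata_Bps = 600e6      # SATA III: 600 MB/s sustained

wait_cycles = clock_hz * ssd_access_s  # cycles until the first byte arrives
cycles_per_byte = clock_hz / sata_Bps  # idle cycles per byte once data flows
print(f"{wait_cycles:,.0f} cycles to first byte")  # 27,500
print(f"{cycles_per_byte:.1f} cycles per byte")    # 4.2 -- the "about four" above
```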
That is how slow the fastest drives are. I/O is, and always has been, the bottleneck. CPUs have always waited; and so the OS has bloated up, because all that processing power was just sitting there most of the time, waiting. But in a world where access times are shrinking and data rates are climbing, it's probably time to reengineer the OSes to take advantage of that speed.
There are three very important concepts that a good OS should be built on.
1) the actual user should never have to wait, at least not more than is necessary. Which is to say, when I click on the icon, there should be immediate feedback that the icon has been clicked and that the computer is doing something about it, even if the computer itself must wait to start acting.
2) the actual user should always have focus where he is working. So if I'm in window A, typing something, the computer should never pop up window B and steal focus. It is possible to pop up a window asking for input or giving a warning without stealing focus from the user. There is nothing that can happen on a modern personal computer which is such a dire emergency that it demands taking focus from the user's task. At least, not that the OS can do anything about.
3) The OS should never get in the way. By this, I mean several things.
a) The OS should never use more than maybe five percent of the total processing power and storage available; you should never have to look at limited storage space and say, "Well, I'll get rid of X because I need so much for the OS." (I had that happen this week: a certain client's computers use SSDs with 100 GB of storage, Windows 10 takes 27 GB, and the page file took at least another 16 GB. Almost half the drive taken up by the OS. No.)
b) The OS should never, never, ever tell a user that they must restart right now and then take half an hour to come back up because it's installing updates. It does not take that long to copy the data from one part of the drive to another, not at SATA speeds. There is no excuse for this; again, there's no update which can be so mission-critical that it cannot wait for lunch hour or EOD.
c) And the OS should never prioritize any of its operations ahead of the user's. There's plenty of wait time between keystrokes or mouse movements for the OS to do what it needs to do.
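Principle 1 boils down to "acknowledge now, work later": give feedback on the UI thread immediately and push the slow part somewhere else. A minimal sketch, assuming a toy event handler (all the names here are invented for illustration):

```python
import threading
import time

def long_running_task():
    """Stand-in for the actual work the click kicks off."""
    time.sleep(1)  # pretend this is slow disk or network I/O
    print("task finished")

def on_icon_click():
    # Immediate feedback, before any real work happens...
    print("icon highlighted, busy cursor shown")
    # ...then the slow part runs off the UI thread so nothing blocks.
    threading.Thread(target=long_running_task, daemon=True).start()

on_icon_click()
print("UI still responsive here")  # prints long before "task finished"
time.sleep(1.5)  # keep the demo process alive until the worker is done
```

The handler returns in microseconds; the user never sees the latency, which is the whole point.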
Still, when you are doing something irreversible, the OS should always, always, always ask first. It doesn't have to be intrusive, either. One of the best inventions in OS functionality is the recycle bin/trash can. You delete something, it goes there automatically, where it can be recovered if you realize you made a mistake. If it can't go there (for example, it's too big) then the OS asks if you want to delete it permanently. Deleting files, erasing disks, overwriting old files: the OS should check, every time. I don't know what Macs are like these days but Windows has gotten pretty good at protecting us from stupid mistakes--and even the bestest users ever make mistakes.
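The trash-can behavior described above is a small amount of code: deletion becomes a move, and only the cases that can't be made recoverable get a prompt. A sketch (the trash path and size cutoff are made up for illustration):

```python
import shutil
from pathlib import Path

TRASH = Path("/tmp/demo-trash")  # stand-in for the real recycle-bin location
SIZE_LIMIT = 100 * 1024 * 1024   # arbitrary "too big for the bin" cutoff

def delete(path: Path) -> str:
    """Soft-delete: recoverable when possible, confirmed when not."""
    if path.stat().st_size <= SIZE_LIMIT:
        TRASH.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(TRASH / path.name))
        return "trashed"  # recoverable, so no question asked
    # Too big to keep a recoverable copy: this is the irreversible
    # case, so ask before doing anything.
    if input(f"Permanently delete {path.name}? [y/N] ").lower() == "y":
        path.unlink()
        return "deleted"
    return "kept"
```

A real implementation would also handle name collisions in the trash (both Windows and macOS rename on collision), but that's the shape of it.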
True story: in the summer of 1983 I was working on writing a D&D character generator for the C64. It just automated the process of rolling the dice and writing everything down, and it would have (eventually) allowed one to print out his character sheet and to store the characters on disk. But something happened.
You see, I have (for a long time) had the habit of saving each bit of progress as a new file. So there's usually a subdirectory in each project folder called "old versions" and it will be full of filenames approximating LUDICROUS-DRECK-02-10-03 and LUDICROUS-DRECK-03-19-12 and so forth. Each file is a snapshot of the project as it was at the end of work for that day. This has two advantages: the document file doesn't get all crapped up with editing cruft (because Word saves that) and it keeps a record of my progress. If LUDICROUS-DRECK-05-01-19 coughs up a bucket of dicks in an unrecoverable fashion, I can (in theory) go to LUDICROUS-DRECK-04-29-19 and just retype what's missing.
I did that with my character generator. Every time I'd add some code, I'd save it as the latest version. But of course the floppy was filling up with program names, so I decided that I'd delete some of the oldest ones to save space. The C-1541 stored 170k on a single floppy and no single copy of this program was bigger than about 10k, but you could only have 144 entries in the directory. I didn't need the oldest versions on that disk, so I copied them to another disk.
This was 1983 and the DOS was the craptastic Commodore 1541 ROM. Deleting a file took this:
open 15,8,15,"s0:THE-FILE":close 15

But that is not what I typed. Oh, no. I got one character wrong and typed N rather than S.
...and the thing obediently erased the disk. "S" was for "scratch" (delete) and "N" was for "new" (format). The "NEW" command syntax was actually "N0:[disk label],[disk ID]", but the drive didn't balk at the lack of a disk ID; without one it simply cleared the directory.
Of course it was my fault; I wasn't double-checking what I was doing. But even so, the OS should have stopped me: it should have required a disk ID with each format command regardless of the situation, precisely because the "format" command would erase the whole disk.
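The guard being asked for is tiny: refuse the destructive command unless it is fully specified. A sketch of the check the 1541's ROM could have made, in Python rather than 6502 code, with the parsing greatly simplified:

```python
def parse_command(cmd: str):
    """Vet a 1541-style DOS command such as 'S0:THE-FILE' or 'N0:LABEL,ID'."""
    cmd = cmd.upper()
    op, _, rest = cmd.partition("0:")
    if op == "S":  # scratch: deletes one named file, relatively low stakes
        return ("scratch", rest)
    if op == "N":  # new: formats the disk, i.e. erases everything on it
        label, sep, disk_id = rest.partition(",")
        if not sep or not disk_id:
            # No disk ID supplied -- refuse, rather than quietly wiping
            # the directory, which is what the real drive did.
            raise ValueError("format requires a disk ID: N0:LABEL,ID")
        return ("format", label, disk_id)
    raise ValueError(f"unknown command: {cmd!r}")
```

With that check in place, the fatal typo ("n0:THE-FILE" instead of "s0:THE-FILE") becomes an error message instead of an empty disk.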
I had backups. Not recent enough. I gave up.
The way we make an OS right now is a layered structure:
applications
shell
libraries
drivers
kernel
hardware

...more-or-less, and your mileage may vary. As you can see there is a lot of nonsense between the user and the hardware. Applications are the programs (WoW, Word). The shell is where they operate; in Windows it's called "Windows Explorer" and is otherwise known as "the desktop". "Libraries" are the zillions of files that end in ".DLL". Drivers--you know what those are; they tell the OS how to talk to devices. The kernel is the core of the OS, the thing that talks to the hardware; and "hardware" is self-explanatory.
Everything communicates with the layers next to it. Shell doesn't talk to hardware, applications don't talk to the kernel, etc. (Which is to say they are not supposed to, and any program which does that will only work on a specific version of the OS--which is why they're not supposed to do that kind of thing. Programmers break these rules, pat themselves on the back for being 1337 hackers--and then the next time the OS is patched and their program breaks, they blame Microsoft.)
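In miniature, that discipline looks like this: each layer exposes one interface to the layer above and calls only the layer below. The layer names match the stack above; everything else (method names, the fake sector data) is invented for illustration:

```python
class Hardware:
    def read_sector(self, n):
        return f"raw bytes of sector {n}"  # the only code that touches metal

class Kernel:
    def __init__(self):
        self._hw = Hardware()
    def read_block(self, n):               # the kernel talks to hardware...
        return self._hw.read_sector(n)

class Library:
    def __init__(self):
        self._kernel = Kernel()
    def read_file(self, name):             # ...libraries talk to the kernel...
        return self._kernel.read_block(abs(hash(name)) % 683)

class Application:
    def __init__(self):
        self._lib = Library()
    def open_document(self, name):         # ...and apps talk only to libraries.
        return self._lib.read_file(name)

doc = Application().open_document("letter.txt")
```

An application that skipped the stack and called read_sector() directly would work fine--right up until the sector layout changed underneath it, which is exactly the breakage described above.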
This works, but it's bloaty. For comparison, here's how the C64's architecture looked:
ApplicationsIt cut out the libraries and drivers because the hardware was dead simple and was not different from system to system, and didn't change, and the computer's ROM contained all the function calls needed to run it. In fact, the shell and the kernel were actually in the same layer.
But compare what a C64 can do to what a modern Windows 10 machine can.
There is some irreducible level of complexity that a modern computer will have. I don't think Windows 10 is an example of that; Windows 10--while better in many respects than prior efforts--is craptastic bloatware. The speed and power of a modern CPU makes it possible to write very elaborate programs that are very sloppy, but work well enough and perform satisfactorily. We don't notice the difference, either, as long as the new PC is faster than the old one--and that's held up, mostly, pretty well.
Moore's Law, however, has a practical end point. We haven't reached it quite yet; they're still finding ways to reduce feature sizes and I'm hearing tell that 7 nm is the new hotness. (Compare that with the 70 µm processes everyone worried about in 1991, which is a mere 10,000 times bigger than the feature size Intel is having trouble with.) I don't think 7 nm is the bottom--but the closer we get to the bottom the harder it is to make progress, and absent a breakthrough I'd think that the limit will be somewhere around 0.8-0.9 nm, which is so f-ing tiny that it boggles the mind we're even talking about it. (100,000 times smaller than 70 µm, and it's about 4-6 years away, maybe. Maybe, if someone doesn't discover something new and interesting between now and then...and I won't place a bet, either way.)
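For what it's worth, the ratios in that paragraph are internally consistent; a quick check using the figures exactly as the text gives them:

```python
um, nm = 1e-6, 1e-9  # meters

old_process = 70 * um    # the 1991 figure cited above
current = 7 * nm         # the node "Intel is having trouble with"
floor_guess = 0.8 * nm   # the guessed practical limit

print(old_process / current)      # 10,000x -- "a mere 10,000 times bigger"
print(old_process / floor_guess)  # 87,500x -- "100,000 times smaller", roughly
```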
Once we do hit the practical limit, though, the next thing will be stacking processors--32, 64, 256, 1024 cores--until that hits a limit of one sort or another. Not sure where that limit is, to be honest, but it's probably somewhere in the "power consumption" dimension. Each processor core needs power and the more you have, the more you use. No one wants a computer that uses as much power as an air conditioner.
After that, finally, any further performance gains will have to come from improving the software. But hardware will be improved first, because that's actually easier than fixing the software. There are many times as many programmers as there are hardware engineers, and they'd all have to learn a different paradigm and work harder on their code.
* * *
When I say "practical limit" I mean just that. Moore's Law can be kept going a very long time if you're not worried about economy. Intel hit a practical limit with processor speed in the early 2000s; having bet on the Pentium 4 architecture they expected it to hit 10 GHz, but it ran into a wall at less than half that.
You can run a P4 at 10 GHz but it requires cooling with liquid nitrogen. Absent that, no. Cryocoolers are manufactured in job lots but they're spendy; effectively no one is going to spend $5,000 on a cooling rig for his $800 computer. (There will be some geeks who do. They are in a vanishingly small minority.) Multicore processors are easier and cheaper to manufacture than anything that runs faster than about 5 GHz.
Moore's Law will be the same. They'll keep cutting feature size in half for a good long time, but it'll be proof of what they can do in a laboratory rather than anything that can be manufactured on a large scale. "We made a transistor out of three atoms!" for example, or "We made a functioning atomic-scale computer." Which you can do with the right kind of gear, but "the right kind of gear" costs $100,000.
Optical computers won't help us. The scale is too big, optical wavelengths too long; recall that the optical equivalent of a Core i7 processor would take up about 60 square yards of real estate. Optical-electronic hybrids will be faster than pure electronic, but I wonder if they'll be fast enough to justify the expense?
Quantum machines will be very, very, very fast indeed, but I'm not sure how useful they'll be for running Word. The paradigm doesn't seem suited for general applications; you'd use a quantum processor for solving complex equations extremely fast but you wouldn't use it for putting words on a screen. I think there might be quantum coprocessors to handle math etc, which would speed up some operations, but not make the computer run faster in general. (Real-time ray tracing--that would be the "killer app" for quantum processors; but they're already starting to do that with conventional silicon.)
Overall, I am convinced that computers will continue to get faster and more powerful for quite a spell, yet. Like everyone else I can make educated guesses as to when that will end, but it almost certainly won't happen in my lifetime. The pace of progress is slowing, but it's going to be twenty years at least before any real serious resistance is encountered on the path upward, and probably thirty or forty before they reach the hard practical limit.
...after which the programs will have to start improving.