That was about eleven years ago -- the little boy in the video above is turning 13 tomorrow -- and the machine stopped working shortly after the video was taken.
This month I started investigating the cause of the trouble. I found that the +5V power was sagging severely when the machine was powered, which isn't terribly surprising because the CPU alone draws something like 40A! Evidently the industrial power supply I had purchased wasn't doing its job. Unfortunately, what used to be a $50 power supply was now upwards of $400, since the manufacturer had closed. After a little looking around, I decided to replace only the +5V supply, with one rated for 50A.
The resulting power supply got the machine to start up, but it was very erratic. It would crash at various weird places, for no particularly good reason. This had me going for a long while, but fortunately the machine was made to be debugged... you can select from two internal clocks: a crystal (the default) and an RC oscillator. You can adjust the frequency of the RC oscillator, but it's not very stable. You can also wire in a single-step switch, which I had done in the past. None of this seemed to help, but in the process I learned that my frequency counter is pretty stable.
By accident, I found that the machine would crash when I bumped the power supply wiring harness... and then found that the screws that connected the harness to the power supply were loose.
Tightening the screws fixed that problem!
But there were more problems. Several of them were easily resolved by polishing all of the card edge contacts. This is not hard, but takes some doing.
This got the machine to the point where it had been before it stopped working. It could load my custom, simple OS, EMON, and run programs entered from the console.
But I had previously wanted to load other software. Could I do that now? Since I do not have any peripherals aside from serial lines, using an emulated TU58 tape drive seemed the best option. Since I last checked about a decade ago, tu58fs has been written, and it seems to be the most convenient. So I tried to get XXDP and RT11 running, but both failed.
XXDP in particular was stopping at address 000550, which seemed to be similar to what was described here. Using the disassembly provided there, I agreed that this was a checksum error. But since the tape file booted on SIMH, I knew I had a good image.
By manually digging through what the boot loader had loaded into the PDP-11's memory, I verified that everything was perfectly loaded. The checksum was being computed incorrectly! I wrote a simple program to compute the checksum of that good block... and the checksum came back wrong. After simulating the code carefully in Python, I determined that the cause was that the carry bit was getting set way too often. The carry was set whenever it should be, but also whenever the highest bit of the destination register was set.
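To make that failure mode concrete, here is a minimal Python sketch of how such a fault corrupts a checksum. It is not the actual XXDP routine: it assumes a 16-bit sum with end-around carry (an ADD followed by an ADC), and it assumes the spurious carry follows bit 15 of the result. The function names are made up for illustration.

```python
MASK = 0xFFFF  # PDP-11 registers are 16 bits wide


def add16(dst, src):
    """Correct 16-bit add: return (result, carry out of bit 15)."""
    total = dst + src
    return total & MASK, total > MASK


def add16_faulty(dst, src):
    """Same add, but carry is also set whenever bit 15 of the result is set
    (an assumed model of the fault described above)."""
    result, carry = add16(dst, src)
    return result, carry or bool(result & 0x8000)


def checksum(words, add):
    """Assumed checksum: 16-bit sum with end-around carry (ADD then ADC)."""
    acc = 0
    for w in words:
        acc, carry = add(acc, w)
        if carry:
            acc = (acc + 1) & MASK  # ADC folds the carry back in
    return acc


block = [0x4000, 0x4000, 0x0001]  # hypothetical test data
print(f"good:   {checksum(block, add16):06o}")
print(f"faulty: {checksum(block, add16_faulty):06o}")
```

With this model the two results diverge as soon as the running sum has its top bit set, which matches the symptom of a known-good block failing its checksum.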
I traced this back to the output of one multiplexer chip. It was being held low when it shouldn't be, but only at about 0.9V... an uncertain level. Fearing the worst, I replaced the chip.
I socketed the replacement just in case I had to replace it again. This is perhaps unnecessary, and it might cause issues if I ever get a floating point unit. But since this board is the frontmost large board, it's not a problem.
Sadly, this didn't do the job... the carry was still getting set incorrectly. Tracing forward, it turned out that the destination register's highest bit has a special active LOW line that runs on a trace parallel to the carry bit. A tiny bit of solder had bridged those two lines at the following chip.
Evidently, this chip had been replaced previously -- not by me -- and the job was not done carefully. Cleaning up the solder job and removing the bridge fixed the carry problem.
The machine now boots XXDP!
But, as the screenshot shows, there are still problems... I originally had 32 kW of memory loaded, for a total of 28 kW available for programs. (The top 4 kW is always reserved for memory mapped I/O.) I have verified that all of this memory can be accessed and used without issue. Although the memory management unit is not needed to access 28 kW of memory, apparently XXDP prefers to have it available. The memory management unit (embodied in the M8107 and M8108 boards) is actually installed, but it is evidently not working correctly.
In order to get XXDP to boot, I had to pull the top 16 kW out, which is the four cards above that are obviously displaced. (The memory lives in a separate cabinet above the CPU because I had trouble finding a sufficiently powerful power supply for it. Since it takes unusual voltages, I didn't fancy building an appropriate power supply from scratch.)
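For reference, the memory sizes above translate into octal address ranges as in the short Python sketch below. It assumes the usual PDP-11 arrangement without memory management: 16-bit byte addresses and 2-byte words, so 1 kW is 2048 bytes. The helper name is made up for illustration.

```python
WORD_BYTES = 2  # a PDP-11 word is two bytes
KW = 1024       # 1 kW = 1024 words


def top_address(kwords):
    """Highest byte address covered by `kwords` kilowords starting at 0."""
    return kwords * KW * WORD_BYTES - 1


for label, kw in [
    ("full 16-bit address space (32 kW)", 32),
    ("usable below the I/O page (28 kW)", 28),
    ("left after pulling the top 16 kW", 16),
]:
    print(f"{label}: 000000 to {top_address(kw):06o}")
```

Under these assumptions the top 4 kW reserved for I/O starts at octal 160000, and the cards still in the machine cover addresses up to 077777.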
Here's a video of the entire boot process, in which I key in the boot loader using the front panel.