22-V-1987.

The day is approximate, and the story actually extends over two weeks or more, probably into June.

The app I did for stambena grew gradually. Not so much in code; there wasn't much to do except add the postal numbers table and move the data for any members from other municipalities into separate tables - as the thing grew, they wanted to make one unit per municipality.

This was a minor victory for me, as I had wanted to have the table of postal numbers from day one, but Radoje said it'd be overkill - and I was a complete newbie then, this being perhaps my third day on the gig, a year ago. Now I had to pick out all the possible names of places, put them together and assign them postal numbers. I found that one place, with a two-word name, was written seven ways - and that's all in uppercase (everything was uppercase, a leftover habit from the PDP; this was on a Partner, and the m$ Cobol for CP/M could use lowercase without a problem). Something like "First Second", "First Sec.", "Fi.Second", "Fi. Second", "F. Second", "F.Second" and "First S.". Wow. There were perhaps six places like that. It took me about two days to comb this out.

The next problem was the members' statements. They wouldn't match the reports. For many new users, the amount (of money deposited or debt, if any, or was it the materials bought) that I'd get when totalling wouldn't match what I'd get when printing the statements individually. It was quite baffling, and I tried this and that and eventually came up with two separate ways to print the statement: one would go the standard way, by index. The other would go sequentially through the whole table and filter for that one member. For some members, the two were identical. For those who had the discrepancy, the sequential pass would always find more records.

After a while I found out that the problem began when the number of records exceeded 32767. Bingo. I had found an error in m$'s indexing routine. It just didn't index records beyond this number, or overwrote older index entries, I don't remember which - it practically made any records over this number inaccessible via the index. Radoje couldn't believe that such a newbie would find a bug in system software, but there it was. Being a fellow mathematician, he recognized proof when he saw it.
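To make the symptom concrete, here is a small simulation in modern Python, not the original Cobol. It assumes the first of the two possible failure modes (the index silently stops accepting entries after record 32767, which happens to be the largest signed 16-bit integer); the table layout, the member numbering and all function names are invented for the illustration.

```python
# Hypothetical model of an index that ignores everything past record 32767.
INDEX_LIMIT = 32767  # 2**15 - 1, the largest signed 16-bit integer


def build_table(n_records, n_members):
    """Record i belongs to member i % n_members and carries an amount of 1."""
    return [(i % n_members, 1) for i in range(n_records)]


def build_capped_index(table, limit=INDEX_LIMIT):
    """Map member -> record numbers, but only for the first `limit` records."""
    index = {}
    for recno, (member, _amount) in enumerate(table):
        if recno >= limit:
            break  # assumed failure mode: later records never reach the index
        index.setdefault(member, []).append(recno)
    return index


def total_by_index(table, index, member):
    """The 'standard' statement: follow the index."""
    return sum(table[recno][1] for recno in index.get(member, []))


def total_sequential(table, member):
    """The control statement: scan the whole table and filter."""
    return sum(amount for m, amount in table if m == member)


table = build_table(40000, 1000)
index = build_capped_index(table)
print(total_by_index(table, index, 5), total_sequential(table, 5))  # 33 40
```

With fewer than 32767 records the two totals agree; past that point the sequential scan always finds at least as many records as the index does, which matches the pattern in the statements.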

So he started calling Iskra Delta, their unit in Novi, to see what's up. We waited.

I still had a deadline to print all those statements... and came up with the idea to write a sort. Export the table, indexless, into a text file (what would later be called SDF format) with no header and records of fixed length. Then read each record into memory, use shellsort to put them in order, write them out. Yeah right, but the memory constraints... there was no way I could hold the whole table in memory when I had perhaps 36K of memory available. So I wrote it in chunks - with something like the square root of the number of records being the size of a chunk. So I'd read one chunk, sort it, write it into a chunk file. Then the next, until the end. Then I'd open n buffers, one for each chunk file, and find the alphabetically first among the one record each buffer held. Write that out, take the next record from its chunk file. Find which one is first now. Repeat until all chunks are exhausted.
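The scheme described here is what is usually called an external merge sort. Below is a minimal sketch of it in modern Python rather than the Turbo Pascal of the time. The record length, the key position, the chunk file names and the function names are all assumptions made for the example, and the gap sequence in the shellsort is simply the easiest one to write down.

```python
import math
import os
import tempfile

RECORD_LEN = 16    # assumed fixed record length in bytes
KEY = slice(0, 8)  # assumed position of the sort key inside a record


def key_of(record):
    return record[KEY]


def shellsort(records):
    """In-place Shell sort with the simple gap sequence n/2, n/4, ..., 1."""
    gap = len(records) // 2
    while gap > 0:
        for i in range(gap, len(records)):
            item = records[i]
            j = i
            while j >= gap and key_of(records[j - gap]) > key_of(item):
                records[j] = records[j - gap]
                j -= gap
            records[j] = item
        gap //= 2


def external_sort(src_path, dst_path, workdir):
    n_records = os.path.getsize(src_path) // RECORD_LEN
    chunk_size = max(1, math.isqrt(n_records))  # about sqrt(n) records per chunk

    # Phase 1: read a chunk, sort it in memory, write it to its own chunk file.
    chunk_paths = []
    with open(src_path, "rb") as src:
        while True:
            data = src.read(chunk_size * RECORD_LEN)
            if not data:
                break
            chunk = [data[i:i + RECORD_LEN] for i in range(0, len(data), RECORD_LEN)]
            shellsort(chunk)
            path = os.path.join(workdir, f"chunk{len(chunk_paths)}.tmp")
            with open(path, "wb") as out:
                out.write(b"".join(chunk))
            chunk_paths.append(path)

    # Phase 2: hold one record per chunk file and keep writing out the smallest.
    files = [open(p, "rb") for p in chunk_paths]
    try:
        heads = [f.read(RECORD_LEN) for f in files]
        with open(dst_path, "wb") as dst:
            while any(heads):
                live = [i for i, head in enumerate(heads) if head]
                first = min(live, key=lambda i: key_of(heads[i]))
                dst.write(heads[first])
                heads[first] = files[first].read(RECORD_LEN)  # b"" once exhausted
    finally:
        for f in files:
            f.close()


# Usage: sort seven made-up names through four chunk files.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "table.sdf")
    dst = os.path.join(tmp, "sorted.sdf")
    names = ["MILAN", "ANA", "ZORAN", "JOVAN", "BOJAN", "PETAR", "IVANA"]
    with open(src, "wb") as f:
        for name in names:
            f.write(name.ljust(RECORD_LEN).encode("ascii"))
    external_sort(src, dst, tmp)
    with open(dst, "rb") as f:
        data = f.read()
    result = [data[i:i + RECORD_LEN].decode("ascii").strip()
              for i in range(0, len(data), RECORD_LEN)]
print(result)
```

With chunks of about sqrt(n) records there are also about sqrt(n) chunk files, so both the sorting phase and the merging phase need room for only about sqrt(n) records at a time, which is the point when memory is counted in kilobytes.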

Radoje thought I was nuts: a sort was a piece of high-level system software and it would take months to write and test. But I said I'm not writing a general sort routine, I'm writing one for this specific table and this specific key and I don't care about the general case. We discussed it on a Friday; I said I'd write it on paper, and if by the end of the day on Monday I didn't see an end to it, we'd give up. I got it working by Monday noon. I wrote it in Turbo Pascal 3.1 or thereabouts, which was blazingly fast compared with Cobol.

Iskra Delta eventually invited us to come and see what they had.

It turned out that they knew about the bug (but kept silent like a turd in the grass); upper management had decided that the M$ Cobol was for small business only, and they never expected anyone to reach the deadly number of records. For those who did, they had an alternate set of indexing routines. It did involve some serious mumbo-jumbo, but we got the diskettes with this alternate set, with instructions, and I went on to implement it.

The trick was to create buffers, of a size which had to be calculated, and which had to contain a pointer to the record buffer itself, plus the length (I learned later in the year that this is called a string descriptor). Luckily, we had a disassembler and some kind of debugger, so I could first compile it as it was, then find the location where the key in the record was, write down (on paper!) its address, then swap the low and high bytes (the code was little-endian), and write the number into the location-of-key field of the index buffer. Then I also had to link the routine and initialize it... but after that everything went sort of automagically. And, surprise, it ran about six times faster.
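The byte swap done on paper is the usual little-endian layout of a 16-bit address: low byte first, high byte second. A tiny Python illustration follows; the address value is made up and the function name is hypothetical, and nothing here reproduces the actual layout of the index buffer.

```python
import struct


def to_little_endian(address):
    """Return the two bytes of a 16-bit address, low byte first."""
    low = address & 0xFF
    high = (address >> 8) & 0xFF
    return bytes([low, high])


address = 0x3A7C  # hypothetical address of the key, as read off the debugger
stored = to_little_endian(address)

# The standard library does the same thing with the "<H" format
# (little-endian unsigned 16-bit integer).
assert stored == struct.pack("<H", address)
print(stored.hex())  # 7c3a
```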

Then I went on to redo this for every bit of code we had - perhaps some 20 or 30 .cob files.


Mentions: 20-V-2008., Majkrosoft (m$), Novi Sad, Partner, PDP, Radoje Maletin, stambena zadruga, in serbian