
Preface

I'm considering the idea of porting the time-shared BASIC (TSB) that was developed by Hewlett-Packard for the HP 2000F TSB system, which used two processors at the time (one for the I/O and the other for time-shared BASIC operations for each user on the system.)

I can elaborate on why this particular edition was so good, if asked. But suffice it to say that it was very good.

At the time, we (yes, I worked on TSB back in the day) worked with core memory (which remains one of the best inventions for non-volatile memory.) Texas Instruments is including FRAM (up to 256 kB of it) in its MCU ICs. FRAM is not the same thing as core, but it has many of its excellent features. And it is attractive (to me) to consider adapting the HP 2000F TSB language features to the MCU. Many of the design choices made are appropriate for this kind of limited memory size (256 kB FRAM + 8 kB SRAM) and I think it would be a good fit.

Question

I'm facing a question, right now, for which I could use some thoughtful input. TSB only supports one numeric data type -- a floating point format. All variables, arrays, and matrix operations assume this single, simple format. There are no integers and, obviously, no variations in memory footprint. Every numeric value is floating point and occupies exactly the same space.

It worked well enough 'back in the day.' But it imposed some limitations on array sizes -- especially in cases where only integers were being kept there. We'd spend time "packing data" into FP for denser formatting, with added code to achieve it. But it was a pain and required some care because FP doesn't follow some math rules (like the distributive property.)
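To illustrate the kind of "packing" described above, here is a minimal sketch in Python. It is not TSB code: it assumes a 32-bit float with a 24-bit significand (IEEE 754 single precision is used as a stand-in, which is not the HP format), and the names `SCALE`, `pack2` and `unpack2` are hypothetical.

```python
import struct

SCALE = 1000  # hypothetical field width: each packed integer is 0..999

def to_f32(x):
    """Round to IEEE 754 single precision (an assumed stand-in for a 32-bit BASIC float)."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def pack2(hi, lo):
    """Pack two small non-negative integers into one float value."""
    if not (0 <= hi < SCALE and 0 <= lo < SCALE):
        raise ValueError("field out of range")
    return to_f32(hi * SCALE + lo)

def unpack2(value):
    """Recover the two integers from a packed float."""
    n = int(value)
    return n // SCALE, n % SCALE
```

Two three-digit fields fit because 999,999 is below 2^24 and so survives the rounding exactly. A third field would not: a value such as 123456789 cannot be represented in a 24-bit significand, which is the sort of care the packing required.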

Also, I'm planning on using the FRAM for storage of "compiled-save" and "ASCII-save" BASIC code, and also for "FILES" containing preserved data. I'd like to reserve the SRAM for running variable storage. This will limit the size occupied by all the arrays and variables.

The question is this: Is there a strong reason for supporting integer data types?

The downside for me is that expression execution will have to accommodate different types if I support them. This will increase the FRAM footprint for the simulator and will force me to carefully consider conversion rules. The reduction in remaining FRAM will impact saved code and data space. And I don't think there will be much advantage in execution time. On the other hand, it will allow smaller SRAM allocations for arrays of integers. And that may be worth the trouble.
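To put rough numbers on the SRAM side of that trade-off, here is a small sketch. It assumes 4-byte floats and 2-byte integers, and the `reserved_bytes` figure is purely hypothetical (stack and interpreter state would take some of the 8 kB); none of these sizes are fixed by anything above.

```python
SRAM_BYTES = 8 * 1024  # the 8 kB of SRAM mentioned in the question

def max_elements(element_bytes, reserved_bytes=0):
    """Largest single array that fits in SRAM for a given element size."""
    return (SRAM_BYTES - reserved_bytes) // element_bytes

floats_all = max_elements(4)          # 4-byte float elements (assumed size)
ints_all = max_elements(2)            # 2-byte integer elements (assumed size)
floats_left = max_elements(4, 2048)   # with a hypothetical 2 kB reserved
ints_left = max_elements(2, 2048)
```

Under these assumptions an integer array holds twice as many elements as a float array (4096 vs 2048 with nothing reserved, 3072 vs 1536 with 2 kB reserved), which is the whole of the SRAM benefit being weighed.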

This isn't an easy question for me, as I'm a little unsure of the market interest for end users. Only time will tell on that score. I'm also not looking to make any money on this. I make plenty already doing my regular activities -- way more than I need. But I enjoy writing interpreters (not the first by any shot) and I would like to build something that will help others. This 'balancing' issue is bothering me right now and I'm interested in any thoughtful comments.

Summary

This is something I will do. And I have the experience and background to complete it. (Done it before, at least.) My hope is to allow users to leverage the TI launchpad products (which can be used to program individual, external ICs on protoboards, for example) to generate their own custom-programmed MCUs that include the execution-code for BASIC, as well as creating, editing, and saving BASIC code in FRAM. Using the techniques developed with HP's TSB, RAM usage can be mitigated/adjusted by breaking the code up into multiple programs (the CHAIN command can be used to preserve certain variable values in SRAM while releasing others, allowing the 8 kB to be more effectively used in tight situations.)

My main thrust right now is about the value/cost relationship of supporting more than one datatype. I'm not expecting 'the answer' that clarifies everything for me. But I'd love to hear some good arguments. I'll select the best, regardless of how much it actually helps me with this project.

Raffzahn
jonk
  • Would you mind cutting this novel down to one answerable question? RC.SE doesn't work best when used to chitchat around. Focusing on a dedicated question supported by basic related information is most useful. – Raffzahn Jun 26 '22 at 12:02
  • @Raffzahn I did write, "The question is this: Is there a strong reason for supporting integer data types?" But perhaps I should be asking elsewhere. Do you have a better site to recommend? – jonk Jun 26 '22 at 12:04
  • RC.SE might be perfect for that question. The point is simply to cut that page-long essay down to the question and its basis. While it's interesting to read your personal history, it isn't really helpful for boiling down a useful answer. – Raffzahn Jun 26 '22 at 12:06
  • @Raffzahn How would you improve it? I'm interested and will follow any good guidance. Perhaps I'll make the question the lead and bring it to the top? I do feel some context is required, though. (I admit I'm not a good writer.) And thanks for the edit. – jonk Jun 26 '22 at 12:07
  • Point is that most of the information given is just not helpful, as it describes your project. RC.SE fits best when asking a specific question. It fails when looking for project consulting. You still need to make your own decisions in weighing your preferences. Personally I would have worded it in 3 lines: "I like to downport HP TSB (likes) to an MCU (liked)"; "HP TSB does not have an integer type like many other BASICs do"; "Are there advantages in adding one?" One could elaborate a bit on the last by asking in addition for reasons to abstain. – Raffzahn Jun 26 '22 at 12:39
  • Asking for implementation details or optimization of the implementation would already be a subsequent question, as those depend entirely on first answering whether there is a need to do so. – Raffzahn Jun 26 '22 at 12:41
  • @Raffzahn I'm in the early stages. Perhaps that's the flaw and this is not the right place, then. I probably should instead spend time with local universities (I used to teach at one.) There I can have a useful dialog. I may have been misguided writing here at all. Thanks. – jonk Jun 26 '22 at 12:45
  • If it were my project, I think I'd consider a build-time configuration for my interpreter. That way, the user can decide which data type(s) are important to support and whether to pay the "toll" to have it built into the interpreter. – Brian H Jun 26 '22 at 12:57
  • @BrianH That is counter to one of my goals, which is to have the complete interpreter located in the IC. I had considered the idea of splitting things up so that only the execution engine occupied FRAM, with other parts (compiled to scrunch format vs ASCII, for example) residing in the PC itself. But I decided against it as I want the end user to decide the circumstances leading to modifications, which may not include a PC present. Having everything needed in FRAM means it is O/S independent and only requires ASCII input by whatever means. That said, I will think about what you wrote. – jonk Jun 26 '22 at 13:05
  • @BrianH It may be possible to consider, taking both your own point as well as mine, to offer a command that would remove the code portions that relate to supporting development of code, leaving only the run-interpreter in FRAM. Further development would be then blocked. I think this might be useful to consider. Thanks! – jonk Jun 26 '22 at 21:13
  • @Raffzahn I'd like to find one or more individuals who'd consider tearing down stupid thoughts. I am not looking for coding help. I'll write the TSB implementation. That's my problem. But I really could use advice and crafted thinking of others to help avoid stewing in my own juices. Would this site tolerate very specific and narrowly scoped questions that come to mind from time to time? Or, if not, any thoughts where I may go to find such individual(s)? – jonk Jul 02 '22 at 22:35
  • @jonk "consider tearing down stupid thoughts"??? Not really sure what that is supposed to mean. There are some really proficient minds on the site, more than one having already written their own BASIC (or similar). So shoot your questions at us. That's what this site is about. Now if you're looking for closer cooperation, you may want to set up some cooperation tool. One of the easiest open platforms to start with might be a Discord setup. Especially helpful for throwing ideas around in a time-spread fashion (i.e. chatting independent of time zones). – Raffzahn Jul 02 '22 at 22:44
  • @Raffzahn I am slowly working through the TSB manual. Something I will NOT support are IBM 2741 terminals. In fact, I'm going to ignore parity and just process 7-bit ASCII. But I'm currently struggling over whether or not to support FLASH. (FRAM is an easy slam-dunk.) There are some trade-offs I'll have to make if I choose to support FLASH. Discussing these choices will be a matter of opinion, not settled fact. And I worry that this kind of thing doesn't fit this site's mission. I'm thinking perhaps more of a blog, where I post up thoughtful but vague questions and gather comments from others. – jonk Jul 02 '22 at 22:48
  • @Raffzahn I've decided the issue with respect to FLASH vs FRAM. That's done, now. I have other issues. But I think I can resolve them, one at a time. The simh simulator system, supporting both the IOP and the MAIN processors, provides me with a precise model for testing and verification. So I'm in good shape, there. I'll be fine, I think. It would be nice to have someone to debate the issues. But I think I will be okay on my own. I'm moving forward. And it looks good, right now. Just FYI. Thanks for everything!! – jonk Jul 04 '22 at 07:16
  • Imagine using floating-point FOR loops. The improved HPL in HP9825's had a choice of split-floating or integers. – Tony Stewart EE75 Aug 03 '22 at 21:14
  • @Tony I've decided already to follow Jerry's advice. For now. Time will tell about the rest. – jonk Aug 04 '22 at 06:13

3 Answers


The question is this: Is there a strong reason for supporting integer data types?

In the MCU with limited memory? Sure. But you can get most of the benefits without a new type with very little additional code.

In this former thread the format on the Sinclair machines was discussed. Their BASIC set a flag in the data to indicate it was "short" and then short-circuited the evals in those cases. This gives you better performance on integers. However, I see a major flaw in their implementation - they used the same storage format for these values, meaning they took up the same amount of memory even for small constants. The space in the variable value table is really a non-issue because it tends to be small (~70 vars in the largest program I have), but in an app with lots of DATA statements or doing file reads into an array, their implementation could be greatly improved by using a second storage format.
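As a rough sketch of the idea in that paragraph (a "short" flag that short-circuits evaluation, plus a smaller storage format for short values), something like the following could serve as a model. It is not the Sinclair layout: the tuple representation, the 16-bit range, and the 3-byte and 5-byte encodings are all assumptions made for illustration.

```python
import struct

INT_MIN, INT_MAX = -32768, 32767  # assumed 16-bit "short" range

def make(value):
    """Tag a number: (True, int) if it fits the short form, else (False, float)."""
    if float(value).is_integer() and INT_MIN <= value <= INT_MAX:
        return (True, int(value))
    return (False, float(value))

def add(a, b):
    """Add two tagged numbers, using integer arithmetic when both are short."""
    a_short, a_val = a
    b_short, b_val = b
    if a_short and b_short:
        total = a_val + b_val            # fast path: no floating point involved
        if INT_MIN <= total <= INT_MAX:
            return (True, total)
        return (False, float(total))     # overflowed the short form: promote
    return make(float(a_val) + float(b_val))

def encode(num):
    """Hypothetical storage formats: 1 tag byte + 2 bytes, or 1 tag byte + 4 bytes."""
    is_short, value = num
    if is_short:
        return struct.pack("<Bh", 1, value)
    return struct.pack("<Bf", 0, value)
```

In this model `encode(make(7))` takes 3 bytes where `encode(make(7.5))` takes 5, which is the second storage format suggested above for DATA-heavy programs; the user-visible language still has only one numeric type.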

Answering these exact questions is why I wrote RetroBASIC. It has some basic stats gathering that I found extremely revealing. For instance, an integer type seems like a very good idea when you consider that 669 of the 712 numbers (not including line numbers) in Super Star Trek are 16-bit ints. But that obscures another important stat that 1/3rd of all the numbers are 1 or 0, and the vast majority of those are found in logic statements (IF X>0...) or loop increments. An example: there are 101 zeroes in Super Star Trek, and 77 of those are in =0 or [<|>|=]0 tests. So if you're trying to save memory in a typical program, you're better off making a token for "1" and "0", or perhaps looking for and tokenizing the entire A=A+1 into inc(A) and things like IF X>0... into IF GT_ZER(X)..., which will reduce the size of the tokenized code and improve performance.
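A much-simplified sketch of that kind of statistics gathering is shown below. It is not RetroBASIC's code: the regular expressions and the `constant_stats` function are hypothetical, only the leading line number is skipped (branch targets such as THEN 50 are still counted), and string literals are blanked out first.

```python
import re
from collections import Counter

# A numeric constant not glued to a preceding letter or digit (so A1 is skipped).
NUMBER = re.compile(r"(?<![A-Za-z0-9.])(?:\d+\.?\d*|\.\d+)(?:E[+-]?\d+)?")
STRING = re.compile(r'"[^"]*"')
LINE_NO = re.compile(r"^\s*\d+\s*")

def constant_stats(source):
    """Classify numeric constants in BASIC source text into three buckets."""
    stats = Counter()
    for line in source.splitlines():
        body = LINE_NO.sub("", line, count=1)   # drop the leading line number
        body = STRING.sub('""', body)           # ignore digits inside strings
        for text in NUMBER.findall(body):
            value = float(text)
            if value in (0.0, 1.0):
                stats["zero_or_one"] += 1
            elif value.is_integer() and -32768 <= value <= 32767:
                stats["int16"] += 1
            else:
                stats["other"] += 1
    return stats

SAMPLE = """10 LET A=0
20 IF A>0 THEN 50
30 LET A=A+1
40 PRINT "ROOM 101", 3.5
50 END"""
sample_stats = constant_stats(SAMPLE)
```

On the five-line sample this counts three constants that are 0 or 1, one other 16-bit integer (the branch target), and one non-integer. Run over a real program library, the same three buckets would show whether dedicated tokens for 0 and 1 pay off more than a full integer type.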

Maury Markowitz
  • I'll need to reread this when I wake up. Yeah, I should have been asleep hours ago. Thanks! – jonk Jun 26 '22 at 12:42
  • Okay. Thanks, Maury. That's a VERY interesting link and I'm glad to be aware of it, now! (RetroBASIC.) I was around and used 'paper tape BASIC' at a time when Microsoft was just a tiny company. I have the full source to QB 4.5, also. But it was terribly implemented and used the very worst methods for the transcendentals (Taylor's instead of Chebychev + non-linear minimax, which HP used correctly.) I'm interested in taking advantage of HP's approach to compiling-before-running. (Which isn't really compiling, but greatly improves interpretation.) And also their methods for the variable tables. – jonk Jun 26 '22 at 17:45
  • Which, once 'compiled' the variables within the code were adjusted to use address links to the table, instead, so there was no lookup required while running. Finally, and most importantly to me, HP's full support for matrix operations is an essential goal. I will definitely now spend some time looking over your own work, though. It will be worth the effort, no question, and may help me think a little more and find interesting concepts. (Or, if nothing else, recall and strengthen some things I didn't like if you followed Microsoft's approach too closely.) – jonk Jun 26 '22 at 17:48
  • I do need to think more about your thoughts on tokenizing certain constants. I'm going to look over some old code (there is a VERY large contributed library set that HP kept track of and is now available, I believe, at bitsavers.) So I can do some analysis of that code to see what is worth doing, perhaps. (Though my goals are to make this useful for embedded MCUs, with peripherals, where the old minis had somewhat different target application spaces.) – jonk Jun 26 '22 at 17:52
  • Just skimming, it looks like you do generate a linked list of statements. But it appears that variables are not compiled prior to the run. I will replace variable names with address references to the variable table, so code has consistent, predictable execution times. (This is what HP did, too.) All line number references in code (computed goto, for example) should also directly link to addresses before execution starts. I'll be coding entirely by hand-assembly. I can perform topological code inversions (my term) that no compiler even today can compete with. – jonk Jun 26 '22 at 19:07
  • jonk, if you do run it on the HP library, PLEASE post the results! I am still slowly working my way through the HP code in What to Do After Hitting Return. – Maury Markowitz Jun 26 '22 at 21:25
  • Retro does compile a list of variables during the scan, before runtime. This is one of the changes I made from gnbasic, which did variable parsing lazy at runtime like MS. I did this for collecting stats though, not performance or size. You could hook into interpreter_post_parse and loop over the statement list and add a pointer into the variable storage tree to each of the variables found in code. – Maury Markowitz Jun 26 '22 at 21:34
  • I'd be glad to let you know when and if. The mathematical approaches taken by HP engineers are peerless. It took another 30 years before I saw the same technology applied in other BASIC libraries. Not done with RSTS. Not done with anything from Microsoft. Not for many decades. They were behind for a very long time. I need to think about how to write something to analyze all that contributed library, though, with an eye towards the hint you provided. Much could be learned. I just need to work out how to learn it. ;) – jonk Jun 26 '22 at 21:35
  • I can't use your code. Unfortunately. (And thanks for the correction with respect to the variables. I had noticed interpreter_post_parse, already. And was planning to look more into it, later.) I'll be writing this in straight-up MSP430 assembly. It must entirely execute within the FRAM space on a genuine, real MCU IC. Nothing will run under Windows or Linux or FreeBSD, other than PuTTY terminal code to access the virtual COM port via the USB connection during development. (Saves the trouble of a VGA shield and hosting HID over USB for keyboard.) – jonk Jun 26 '22 at 21:37
  • Given the context, I would strongly recommend the Sinclair solution, but add another token for "short constant" so you don't have to write out the exponent or extra digits in the source code. At runtime, expand the value back out to the full size and add the exponent 80 when copying into the variable table or FP "registers". In the FP code itself, check the token to see if you need to do decimal adjust at the start or end, and change the token if any of the ops overflow it so the variable table gets updated. – Maury Markowitz Jun 27 '22 at 13:41
  • I very much like the short-constant idea. It's good. People don't write long, drawn out constants... often (maybe PI or E or a few others.) (Hmm... perhaps I should make certain constant names part of the language.) Well, these are (at this time) 2nd order details. I've a lot bigger fish to fry, right now. But I'll tuck these away. I should be able to get the entire system done in under 15kb of FRAM. (If not, I'm not trying hard enough.) – jonk Jun 27 '22 at 17:23
  • "perhaps I should make certain constant names part of the language" - oh yes for sure, for the cost of a token you can reduce source size considerably in some cases. Some dialects use functions like PI(), others meta-vars like &PI. Choose which you prefer! – Maury Markowitz Jun 27 '22 at 18:40
  • Thanks, Maury! I did select "the answer" that actually convinced me about moving forward on the issue I presented. But that doesn't mean I haven't enjoyed the time you've also offered me here. I'm in the process of writing up my selection of features and the reasoning for including them, taken from HP 2000F TSB -- the CHAIN command and the COMMON statement are key elements that have very important value in the context I'm considering, for example. I appreciate all the thoughts! – jonk Jun 27 '22 at 18:44
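The pre-run variable resolution discussed in the comments above (replacing variable names with direct references into the variable table, so nothing is looked up while the program runs) can be sketched roughly as follows. The token format and the `resolve_variables` name are hypothetical; this is neither HP's table layout nor RetroBASIC's `interpreter_post_parse` hook.

```python
def resolve_variables(program):
    """Replace ("var", name) tokens with ("slot", index) before execution.

    program is a list of statements, each a list of (kind, value) tokens.
    Returns the rewritten program, a zeroed variable table, and the name map.
    """
    slots = {}
    resolved = []
    for statement in program:
        out = []
        for kind, value in statement:
            if kind == "var":
                # First sighting allocates the next free slot in the table.
                index = slots.setdefault(value, len(slots))
                out.append(("slot", index))
            else:
                out.append((kind, value))
        resolved.append(out)
    return resolved, [0.0] * len(slots), slots

program = [
    [("var", "A"), ("op", "="), ("num", 1.0)],
    [("var", "B"), ("op", "="), ("var", "A")],
]
resolved, table, names = resolve_variables(program)
```

After this pass every variable access is a fixed-cost index into `table`, which is the property wanted for consistent, predictable execution times. Line-number references could be linked to statement addresses in the same pass.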

As far as I can see the question here is:

Is there a strong reason for supporting integer data types?

Well, there is no definitive answer. As usual, it comes down to naming the advantages and disadvantages and prioritizing them according to your needs and profile. In addition, there's a historical dimension, especially when recreating classic software:

  • Original BASIC was intended to be a simple (learner's) language. So simple it should be.

  • Data was only numeric (*1).

  • Float can cover (*2) integer fine.

  • At that point it was similar to using a combined type today.

  • When (much later) the need for strings arose, the added typing was more a hack than a planned development.

  • Original Dartmouth BASIC and many early BASICs - including HP - did not add other types at all, or only very late in the game.

  • The main reason to add integers is space saving. An integer takes some 1..4 bytes of data storage compared to 4..8 for a float. This matters most with shorter int sizes (like 16 bit).

  • Space saving was important at both ends:

    • in professional computing it allowed fitting large(r) data sets into limited memory
    • in home/micros with their initially small memory it may have made useful applications possible at all
  • Especially with early micros space saving was important.

  • Using integers may increase speed, in searching as well as in execution.

  • The latter may require a good deal of additional code not needed otherwise.


So ask yourself, do you ...

  • ... see any advantage of dedicated integers in your project?

    • That is speed or storage size.
  • ... want to diverge from the original language by adding?

    • Or better stay as close as possible to the original language?
  • ... want to consider various ways of adding integer?

    • Like adding
      • type suffix ('%') or
      • type definition (DEFINT A-F *3) or
      • internal discrimination.
  • ... want to only change the internal workings by adding an automated integer type - to realize the speed advantage without changing the language?

    • That is, if speed is an issue at all?

You port, your language, your choice.


Now, if you allow a personal opinion, I would go ahead and do it as close to HP TSB as possible, so all materials about that language could be used 1:1. At the same time I would think about extending the original idea of Dartmouth BASIC of not having types at all.

This can easily be done by only noting the type as the result of a variable being set, using implicit type conversion. If a certain type is needed, explicit conversion can be forced with the usual functions (INT, STR, etc.), eventually adding a FLOAT() for completeness.

It would add back simplicity for average small programs while preserving the ability to force certain formats when needed.

Under the hood (aka in the interpreter) one can do any other optimization as well without telling the user - like for example Sinclair did with their int/float handling.
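A minimal sketch of that "no declared types" idea, where a variable simply takes on the type of whatever was last assigned and the user forces a type only through conversion functions, might look like this. The class and function names are hypothetical, and `basic_int` assumes that INT rounds down, as in most BASICs.

```python
import math

def basic_int(x):
    """INT(): assumed here to round down to the nearest integer."""
    return math.floor(x)

def basic_float(x):
    """FLOAT(): the suggested explicit conversion back to floating point."""
    return float(x)

def basic_str(x):
    """STR(): explicit conversion to a string."""
    return str(x)

class Variables:
    """Variable store in which the type is noted at assignment, never declared."""

    def __init__(self):
        self._store = {}

    def let(self, name, value):
        self._store[name] = value      # the value's own type is the variable's type

    def get(self, name):
        return self._store[name]

    def type_of(self, name):
        return type(self._store[name]).__name__

v = Variables()
v.let("A", 3)                 # A is now an integer
v.let("B", 2.5)               # B is a float
v.let("C", basic_int(-2.5))   # explicit conversion: C is the integer -3
```

The language surface stays exactly as typeless as the original, while the interpreter is free to store `A` and `C` in a shorter format than `B`.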


*1 - Strings were eye candy for printing, like with most early computing at the time.

*2 - Within limits that are fine for a learners language.

*3 - Like Microsoft did with Extended BASIC. Here any variable name or name range could be (pre)defined as a specific type, saving the need to add a type suffix. This made a lot of sense, as it allowed one to just program ahead and add specific types only later to optimize.

Raffzahn
  • My tendency is to stay with the original design. But that doesn't mean a good argument that I've missed cannot be mounted. If I knew more about who may use this and in what circumstances... But I don't. Consumers today are in no way similar to those then. (Primarily education then. Not so, now ) Only time will tell. Perhaps that's the answer. – jonk Jun 26 '22 at 12:38
  • @jonk Well, that pessimistic 'not so now' part might be the key to go with the original language. After all, it was exactly the idea of BASIC not to confuse users with the need to decide if they want an integer or a float. They just had to use a name to name the variable content. Much like modern script languages have rediscovered as a feature :)) Then again, while keeping the language the same, one could still improve under the hood. For size and speed, like Maury argues. Or go one step ahead and get rid of typing by adding automatic type conversion - reviving a basic idea of BASIC. – Raffzahn Jun 26 '22 at 12:45
  • I taught computer architecture and concurrent programming and operating systems at the largest university in my state for some years. One thing I learned from that process is just how different the mix today is from when I was learning. I had a student actually tell me, during office hours, that they struggled over computer science vs accounting. In my day, that dilemma simply didn't happen. There are new tiers in the pyramid today that didn't exist in my day. – jonk Jun 26 '22 at 12:52
  • @jonk don't tell me :)) I've been in the business since the late 70s. It changed. But it changed for the better. Much of what we often see as 'degrading' quality in students or new job entrants is in fact a sign of way higher overall education. Back when I started, a career in EDP was very special. Only a tiny percentage of people went there, and with high dedication. Today IT education is mainstream. It's needed everywhere and everyone needs to learn about it - not just the ones that are drawn there by nature. Sometimes one could compare us to monks when the printing press was established ... – Raffzahn Jun 26 '22 at 13:09
  • @jonk ... and literally everyone learned to read and write. Yes, books no longer were all as beautiful, and worse, 'those' everymen started writing without even an idea what they were doing or how. Looking from a distance, it wasn't a downturn but an overall uplift. We'd never be where we are without giving everyone the chance to read and write. Endless great works would never have existed. Same with IT. It's everyman's science now. Let's enjoy that - and propel it. Like by making HP TSB a thing again :)) – Raffzahn Jun 26 '22 at 13:10
  • Hehe. Thanks. Yes, what you write is true enough. Even secretaries are programmers today. In my day, I built my own computer from 7400 parts using my own design. (Newspaper reporters showed up to report on the kid who made his own computer in 1974.) I suffered very much to get there. But I wanted it so bad, too. Like a monk, perhaps. The problem I have is that I'm not representative of those I'm hoping to reach. – jonk Jun 26 '22 at 13:20
  • @jonk you're in good company here on RC.SE :)) – Raffzahn Jun 26 '22 at 13:22
  • By the way, I added time shared assembly and a symbolic assembler and linker to the HP 2000F system in 1975, worked on the Unix v6 kernel code in 1978 where I learned to use C, and worked at Intel doing BX chipset design. Been around. And the times are a wonderland to me today. FPGAs are so accessible now. I could have only wished back in the day. Youth has so little idea how much they have today. (Just built up the very nice PiDP-11/70 front panel from Oscar. Cool.) – jonk Jun 26 '22 at 13:26

My immediate inclination would be to say no, don't add it, at least not now.

Although I'm a long ways from being a religious believer in Agile methodology, I think in this respect they have at least something of a point. It's best to start with some "minimum viable product", and add features primarily in response to demand from users.

I'd say TSB probably doesn't need features added to be viable. As such, I wouldn't add features until or unless users ask for it.

But I'll admit that's something of a blind assumption. If memory serves, HP had previously been purely an instrumentation company, and the 2000 was their first computer, so it was used primarily for controlling instrumentation (and similar). So the obvious question would be whether that's the market you're addressing today, or whether your likely users are doing things that are different enough that they demand different features.

The other obvious question would be whether during its life there were features that were heavily demanded for TSB, but (for whatever reason) simply weren't practical to implement at the time. Given that you're basically rebooting the project, it may well be worth considering including things you know people wanted but you just couldn't do at the time.

But ultimately, I'd go back to the basic idea: don't add features until you have some factual basis to support doing so. And right now, the available facts say TSB was pretty solid exactly as it was. You don't yet know enough about how anybody is likely to use it today to make a meaningful decision about what changes will be good, so it's probably better to leave it alone.

Jerry Coffin
  • Jerry, you start off with a convincing conclusion -- I think you've decided the issue for me and you hardly wrote much about why. Just you are right about it! But I think you are right. So you have given me the answer here. But TSB, in my experience, was mostly sold into educational institutions -- particularly ones larger than a single school, which could not afford it on their own. So regional (educational service districts) was very common. I never saw one used for controlling instrumentation. These were time sharing systems with up to 32 dial-up users, after all! – jonk Jun 27 '22 at 17:43
  • (There was HEAVY software support in the IOP for dial-up users -- it was very complex code to only serve that one purpose.) I can't say what their actual goals were *before* they sold their first unit. But I do know, for sure, that their key end-user was expected to be sitting at a "user terminal" and that their manuals would often refer to "beginners" and "teaching". So I think its target market really was "schools" just as BASIC was developed as an early language to learn. – jonk Jun 27 '22 at 17:47
  • @jonk: Okay. Sounds like my guess at its original market was dead wrong. I don't think that has much effect on the fundamental idea of adding features only when it becomes apparent they'll be needed though. – Jerry Coffin Jun 27 '22 at 17:59
  • Correct. You were spot-on and cleared my mind on this issue. Thanks very much for that! I knew you were right the moment I read the first few sentences from you. Just took me seconds to close the issue in my mind after reading that. Once in a while an answer is like that -- clear as a bell. – jonk Jun 27 '22 at 18:00
  • @jonk: Cool. Glad it was helpful. – Jerry Coffin Jun 27 '22 at 18:02
  • It was the kick in the pants that I had needed to have. – jonk Jun 27 '22 at 18:02