About

ZX-Basicus is a PC console program written in C++ that synthesizes, analyzes, optimizes and directly runs ZX Spectrum 48K Sinclair BASIC programs (it also does some management of .TAP, .TZX, .TAB and .SNA files). Under the hood, it uses a formal grammar specification that follows the original language very closely, and the ZXEcosystem library for interpreting programs with the original ZX Spectrum 48K peripherals.

The program is easy to use. All its options can be shown simply by calling it with --help, or you can consult the help for a single option with --help <opt-name-without-first-dash>. If you are more visual, you may use the integrated graphical option selector, which opens if you pass no option at all, although that is only visual sugar for the specification of parameters (it is not an IDE -yet-) and we strongly recommend the console mode.

More detailed descriptions of the tools are in the sections below (the file management is really straightforward and therefore omitted; just look for option -f in the help, or write zxbasicus --help f in console).

Over the years I have also written other software related to the ZX Spectrum that might interest you. In particular, if you wish to program the PC with ZX behaviour in modern C++ instead of the original Sinclair BASIC, you can use the ZXEcosystem library.

ANALYZER

As an analyzer, ZX-Basicus is perhaps the most capable tool available: it extracts information from the BASIC program contained in a file and performs a number of automated analyses on it. It can work with:

  • A snapshot of the ZX memory (.sna 48K format), extracting the BASIC source code and also information about the existing variables, UDGs -User Defined Graphics- and raw screen data.
  • Any BASIC program in a .tap file, extracting the BASIC source code and also information about the existing variables.
  • A BASIC source code plain text file (.bas).

In all cases, it allows the user to inspect the internals of existing programs, obtain help for coding efficiently, and also test compliance with the original ZX Sinclair BASIC for 48K. Moreover, it can detect various obfuscation techniques and clean the code in the case of binary programs (.sna and .tap).

The general syntax for this tool is:

zxbasicus -a [options] -i bas-tap-tab-or-sna-file
The options are:

--obfuscated. If the input file is not binary, this option is ignored. Otherwise, the analyzer tries to fix the following obfuscation tricks and recover a clean BASIC code:
  • Hidden numbers that do not match previous number literals or tokens (the latter are forced to be numeric literals that match the hidden values).
  • Excess of lines (unaddressable bytes after the end of the line; the excess of bytes is returned as a new REM statement with those bytes as content).
  • Variables used but not created (no error is issued when this is found).
  • Unrecognizable tokens or elements in the code (they are dropped).
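
To make the "hidden numbers" item concrete: in a stored Sinclair BASIC line, each numeric literal is followed by the byte 0x0E and a 5-byte binary value, and the ROM uses that binary value rather than the visible digits. The sketch below decodes that 5-byte form and reports literals whose visible text disagrees with the hidden value. It only illustrates the idea and is not ZX-Basicus code; the function names are hypothetical, and the scan is simplified (it does not skip string literals or REM contents).

```python
NUMBER_MARKER = 0x0E  # byte that precedes each 5-byte hidden number


def decode_hidden(five):
    """Decode the 5-byte ZX Spectrum number format."""
    if len(five) != 5:
        raise ValueError("a hidden number is exactly 5 bytes long")
    if five[0] == 0:
        # Small-integer form: 00, sign (00 or FF), low byte, high byte, 00.
        value = five[2] | (five[3] << 8)
        return value - 65536 if five[1] == 0xFF else value
    # Floating-point form: exponent biased by 128, then a 32-bit mantissa
    # whose top bit stores the sign (the leading 1 bit is implicit).
    mantissa = int.from_bytes(five[1:], "big")
    sign = -1.0 if mantissa & 0x80000000 else 1.0
    mantissa |= 0x80000000
    return sign * mantissa / 2**32 * 2.0 ** (five[0] - 128)


def find_mismatches(body):
    """Return (visible_text, hidden_value) pairs that disagree.

    Simplification (assumption): every 0x0E byte is treated as a number
    marker, and the visible literal is the run of digits and dots before it.
    """
    mismatches = []
    i = 0
    while i < len(body):
        if body[i] == NUMBER_MARKER and i + 6 <= len(body):
            hidden = decode_hidden(body[i + 1:i + 6])
            start = i
            while start > 0 and chr(body[start - 1]) in "0123456789.":
                start -= 1
            text = body[start:i].decode("ascii")
            try:
                visible = float(text)
            except ValueError:
                visible = None  # no readable literal before the marker
            if visible != hidden:
                mismatches.append((text, hidden))
            i += 6
        else:
            i += 1
    return mismatches
```

For example, a line body that shows `LET a=1` but carries a hidden value of 42 would be reported as the pair ("1", 42), which is the kind of discrepancy the first item of the list describes.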

SYNTHESIZER

As a synthesizer, ZX-Basicus creates, from a plain text file containing a BASIC program source, a tape file (in .tap format) that can be loaded by any emulator. With this you can write your own BASIC programs in any code editor you like (which will certainly be more comfortable than writing them directly in an emulator of the original ZX!).

Again, the syntax is straightforward:

zxbasicus -s [--line 10] [--progname NAME] [--obfsnum] -i bas-file -o tap-file
where the optional --line parameter indicates the starting execution line when the file is loaded into an emulator, --progname provides a name for the header of the tap file, and --obfsnum replaces all number literals with "1" to save space, keeping the hidden numbers valid; it also replaces negated numbers with "1", storing the already negated number in the hidden value, which saves the ZX BASIC interpreter one mathematical operation at runtime.
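
As a rough illustration of where the program name and the starting line end up, the following sketch builds the header block of a BASIC program in the standard .tap layout: a 2-byte little-endian block length, a flag byte, the payload and an XOR checksum. The function names are hypothetical, and whether ZX-Basicus builds its header exactly this way is an assumption; only the .tap layout itself is standard.

```python
def tap_block(flag, payload):
    """Wrap a payload as a .tap block: length, flag, payload, XOR checksum."""
    body = bytes([flag]) + payload
    checksum = 0
    for b in body:
        checksum ^= b
    body += bytes([checksum])
    return len(body).to_bytes(2, "little") + body


def program_header(progname, autostart_line, program_length):
    """Build the 17-byte header of a BASIC program and wrap it as a block.

    progname       -> 10-character name field (what --progname would set)
    autostart_line -> line to run after loading (what --line would set)
    program_length -> size in bytes of the tokenized program (no variables)
    """
    name = progname.encode("ascii")[:10].ljust(10, b" ")
    payload = (
        bytes([0])                                  # type 0 = BASIC program
        + name
        + program_length.to_bytes(2, "little")      # length of the data block
        + autostart_line.to_bytes(2, "little")      # autostart line number
        + program_length.to_bytes(2, "little")      # offset of the variables area
    )
    return tap_block(0x00, payload)                 # flag 0x00 = header block


header = program_header("DEMO", 10, 100)
print(len(header), header.hex())
```

The resulting block is 21 bytes long; a data block with flag 0xFF holding the tokenized program would follow it in the file.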

The synthesizer is also able to generate some helper code automatically for your BASIC programs. In particular:

--defadd defline:actline:memloc. Add to the input BASIC file two subroutines that do the DEFADD trick for fast memory copies, and then save the resulting BASIC source code. (The DEFADD trick needs a hardware emulator for execution; the resulting program will not work well in the ZX-Basicus interpreter.) The parameter memloc indicates the location in RAM where the 18 bytes needed for the trick will be stored, defline is the BASIC line where the first subroutine, in charge of installing those bytes, will be placed, and actline is the line where the second subroutine, which makes a fast copy of L bytes of memory from address O to address D, will be placed. Both lines must be beyond the last line existing in the input program.

--comprscr firstrow:lastrow. Read an entire ZX screen from the input file and compress it for BASIC usage in a "Wudang" style (Wudang is the pure BASIC game that won the bytemaniacos 2020 contest). Concretely, it synthesizes a .tap file that contains two elements: a code element with the screen data in a compressed format, based on a possibly new character set, and a BASIC program that loads and shows the compressed screen using just PRINT (the program must be run in a hardware emulator, not in ZX-Basicus, since it uses the DEFADD trick). The part of the original screen that is processed is the one within character rows firstrow:lastrow (both rows included). The code element within the .tap result is a binary compression of the screen with the following format:
  • 0..8: a fake DEFADD that holds a pointer to the compressed screen and can therefore be used as a string variable P$ that can be printed.
  • 9+: the new charset for the compressed screen, if it needs one.
  • 9+: the content of the compressed screen, including chars and color controls.

OPTIMIZER

ZX-Basicus contains a number of tools that transform BASIC source files. The main goal is increasing the execution efficiency on a ZX Spectrum, but some transformations are useful for other things, such as publishing BASIC code in web pages. The optimizing tools are inspired by the way the ZX ROM interpreter works and what it needs to execute faster.

You can find complete and detailed explanations of the optimizations in these posts.

The general syntax for these tools is:

zxbasicus -t [--tool [parameter]] -i bas-file -o resulting-bas-file
The tools are:

--delempty. Delete empty statements.

--delrem [parm]. Delete all REM statements whose comments do not begin with any of the letters included in the string parameter, or all the REM statements if that parameter is 'all'.

--subscts [w:preff] [--subs1var in v1.*.* versions]. Substitute all uses of a variable when it is assigned only once to a literal value (either string or number) in a LET statement; the literal value of that assignment is used for the substitutions. In other words, treat that variable as a constant. If [w:preff] is provided, w must be either * (to detect all constants, strings and numbers), n (to detect only numeric constants) or s (to detect only string constants), while preff, which may be empty, is a prefix string that constant variables must have at the start of their names to be considered in this optimization. If the parameter is missing, it is taken as *: (strings + numbers, no prefix).

--summexprs. Reduce all expressions that can be partially or entirely evaluated before runtime.

--delunreach. Delete all statements that cannot be reached in the execution flow of the program: those after GOTO, RUN or NEW, but also those IF statements that have an always-false or always-true condition (in the former case, all statements of the IF body are deleted, not only the IF).

--delunusedv. Delete all statements that assign/write to a variable that is unused in the program.

--delunusedfn. Delete all statements that define a function (DEFFN) which is never used in the program.

--shortenv. Shorten, as much as possible, all scalar variable names that are longer than one character.

--mergelines. Merge contiguous lines into one wherever that does not affect the control flow of the program, therefore making lines longer and fewer. It also takes care not to affect the speed of the program: for instance, FOR statements that are at the beginning of their lines are not merged with the previous line, since that would produce slower iterations (the interpreter would have to search for the FOR statement within the line after each NEXT).

--alloptim. Apply the previous optimization options in the predefined sequence shown above.

--valtrick firstline:lastline. Automatically replace isolated numeric literals with the VAL trick or similar in the program section enclosed within [firstline,lastline].
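
For readers unfamiliar with the VAL trick: a numeric literal in a stored BASIC line takes its visible characters plus 6 more bytes (the 0x0E marker and a 5-byte binary value), while VAL "123" takes one token byte, two quotes and the digits, and is evaluated at run time. The sketch below computes those sizes and picks a replacement. The function names and the shortcut table (NOT PI, SGN PI, INT PI, which are common tricks for 0, 1 and 3) are illustrative assumptions, not necessarily the substitutions that ZX-Basicus applies.

```python
def literal_cost(text):
    """Bytes used by a plain numeric literal: visible chars + 0x0E + 5 bytes."""
    return len(text) + 6


def val_trick(text):
    """Return (replacement, cost_in_bytes) for one numeric literal."""
    # Common token-only spellings of small constants (assumed table).
    shortcuts = {"0": "NOT PI", "1": "SGN PI", "3": "INT PI"}
    if text in shortcuts:
        return shortcuts[text], 2          # two one-byte tokens
    # VAL token (1 byte) + two quote characters + the digits themselves.
    return 'VAL "%s"' % text, len(text) + 3


for literal in ("0", "1000", "23296"):
    replacement, cost = val_trick(literal)
    print(literal, "->", replacement, "saves", literal_cost(literal) - cost, "bytes")
```

In this model every VAL replacement saves 3 bytes regardless of the number of digits, and the token-only shortcuts save 5 bytes each.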

--move parm. Move, if possible, a given set of lines to another place, renumbering them. It must be followed by an argument with the format nnnn:mmmm:dddd:iiii, where nnnn is the first line to move, mmmm the last line to move, dddd the first destination line, and iiii the line increment to use for renumbering the moved lines.
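
The renumbering implied by the nnnn:mmmm:dddd:iiii argument can be sketched as follows. This is only a model of the arithmetic: the function names are hypothetical, and the rule used here for "if possible" (a move is rejected when a new number collides with a line that stays in place) is an assumption about the tool, not its documented behaviour.

```python
def parse_move(arg):
    """Split 'nnnn:mmmm:dddd:iiii' into four integers."""
    parts = arg.split(":")
    if len(parts) != 4:
        raise ValueError("expected nnnn:mmmm:dddd:iiii")
    first, last, dest, step = (int(p) for p in parts)
    if first > last or step < 1:
        raise ValueError("invalid line range or increment")
    return first, last, dest, step


def plan_move(line_numbers, arg):
    """Map each moved line number to its new number."""
    first, last, dest, step = parse_move(arg)
    moved = [n for n in sorted(line_numbers) if first <= n <= last]
    kept = {n for n in line_numbers if not first <= n <= last}
    mapping = {old: dest + i * step for i, old in enumerate(moved)}
    for new in mapping.values():
        if new in kept:  # assumed criterion for an impossible move
            raise ValueError("destination line %d already exists" % new)
    return mapping


print(plan_move([10, 20, 30, 40], "20:30:100:5"))
```

With lines 10, 20, 30 and 40 and the argument 20:30:100:5, lines 20 and 30 become 100 and 105, while 10 and 40 stay where they are.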

--strlitwchr. Change all non-standard ASCII chars in string literals into calls to the CHR$() function.

--HTML parm. With a parameter that can be either dark or light, produce source code in HTML format for dark or light backgrounds, respectively.


Below you can see an example of the optimizations done on the file "test_alloptim.bas" included in the ZX-Basicus download; a more detailed video is also available.

INTERPRETER + DEBUGGER

As an interpreter, ZX-Basicus runs ZX Spectrum 48K BASIC programs from their source text form (plain ASCII files, both .BAS and .TAB), not by emulating the internal hardware of the ZX, but by directly interpreting each statement and expression at PC speed!

These programs can be written to be synthesized for later execution on a real ZX or in a ZX hardware emulator, using ZX-Basicus as a development aid that is more convenient than the original machine. Alternatively, they can be intended for the PC from the beginning, and then be more powerful than the original ZX BASIC software in both speed and memory space: they are limited neither in code size (although, for compatibility reasons, they can have at most 32767 lines and up to 127 statements per line) nor in the number of variables, since neither code nor variables reside in the 64KB memory space of the running environment. The ROM is not there either (USR calls do not work), and its place becomes just another part of the RAM. The 64KB RAM then serves to hold the screen, a few system vars (again for the sake of compatibility, because BASIC programs very commonly use them through POKE/PEEK), UDGs and fonts, and to be POKEd with any raw data the user program needs to store.

The interpreter has a built-in console debugger that can aid in developing pure BASIC programs, with the usual debugging options (step-by-step execution, variable inspection, expression watch, breakpoints, etc.).

The syntax of the interpreter is like this:

zxbasicus -r [options] -i bas-file
The available options are:

-im parm. The parm is either mico or full (default). In full mode, the interpreter uses the ZXEcosystem library to execute programs with full-fledged original ZX Spectrum sound, screen and keyboard; in mico mode, it executes them in a much simpler black/white, mute PC console (that would be a sort of retro-scripting language for the PC!).

-fm parm. Mode of file management when executing file operations (LOAD, SAVE, etc.). It is either raw (binary mode), tap:<filename> (tape mode) or TAP:<filename> (also tape mode, but forcing a call to the tape manager at each tape operation). In binary mode, each tape operation works on an individual disk file given by the name argument of the operation (which does not have the limitations of ZX BASIC file name arguments), with the data in raw, binary format; in tape mode, the provided filename corresponds to a .tap file where all operations occur using the .tap format.
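
To give an idea of what working "using the .tap format" involves, here is a minimal reader for the standard .tap container, in which every block is stored as a 2-byte little-endian length followed by a flag byte, the data and an XOR checksum. The function name is hypothetical, and the sketch says nothing about how ZX-Basicus itself reads or writes tapes.

```python
def iter_tap_blocks(data):
    """Yield (flag, payload, checksum_ok) for each block in .tap file bytes."""
    pos = 0
    while pos + 2 <= len(data):
        length = int.from_bytes(data[pos:pos + 2], "little")
        block = data[pos + 2:pos + 2 + length]
        if length < 2 or len(block) != length:
            raise ValueError("bad or truncated block at offset %d" % pos)
        checksum = 0
        for b in block:          # XOR of flag, payload and checksum is 0
            checksum ^= b
        yield block[0], block[1:-1], checksum == 0
        pos += 2 + length


# A tiny hand-made tape: one data block (flag 0xFF) holding bytes 01 02.
sample = b"\x04\x00\xff\x01\x02\xfc"
for flag, payload, ok in iter_tap_blocks(sample):
    print(hex(flag), payload.hex(), ok)
```

A flag of 0x00 conventionally marks a header block and 0xFF a data block, so LOAD and SAVE operations in tape mode correspond to pairs of such blocks.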

--delay parm. Delay parm milliseconds after executing each BASIC statement; if 0 (default), there is no delay. When running a program originally written for a ZX, this can be useful to reduce the execution speed (delays of 5-10 ms are usually enough), and also to avoid overloading the CPU of your computer: if your program never pauses, it will tend to consume all the available CPU power!

--line parm. The parm is the number of the line where the program starts (default is 0).

--profile parm. If specified, dump the results of profiling the execution to the file given by parm.

--debug. If specified, the interpreter will enable debug mode.

--ign-usrn. If specified, any machine code call (USR n) will be executed without effect. Otherwise, such calls stop the interpreter.

--randomize. If specified, the interpreter will seed the random number generator of the ZX with a random number at startup. Otherwise, its original default value is used.

--stopatend. If specified, the interpreter will pause after the program stops, without closing the window, until the user presses a key.

--silent. Prevent any output in the console except what the BASIC program produces through stream #3 (printer, see below).

The ZX-Basicus interpreter can execute old programs, or you can write your own.

The interpreter has been thoroughly tested with many programs of the 80s but also with programs taken from the BASIC Jam 2017 contest, some of the Spanish "bytemaniacos" contests, and several books preserved at proyectoBasicZX.



The particularities of the interpreter in full mode are the following:

  • You can observe the inner bitmap of the screen at any time by pressing F2. This "decolour" mode does not affect the behaviour of the BASIC program at all (it will still "see" the colours, so to speak), and can be toggled off by pressing F2 again.
  • You can access a tape manager by pressing F3, but only with -fm tap:*, -fm TAP:* or when running a .tab file.
  • Kempston mouse support with the same ports and behaviour as in the original ZX.
  • UDGs and font bitmaps are filled initially in an automatic way; their locations can be changed through BASIC programming. However, if they are placed in the ZX memory such that their addresses would wrap around to 0 past the 64K of RAM, that wrap-around will not happen in the interpreter.
  • It makes use of the following system vars only (with their original behaviour unless specified):
    • LASTK
    • UDG
    • CHARS
    • BORDCR (set the attrs for K and for the border; in the original ZX set the attrs for K and, after the next beep or key tic, set the border)
    • FLAGS2 (only bit 3, which sets the caps lock status)
    • SCRCT
  • Printing/inputting at streams other than #0, #1, #2 or #3 produces errors. Printing at stream #3 redirects the output to the console where ZX-Basicus is running (also LPRINT).
  • All substatements of a PRINT are evaluated before the PRINT takes place; thus, any change in sysvars made during the PRINT (e.g., the scroll count) is not visible to the program within that PRINT. For instance, if the same PRINT reads such a value (e.g., with a PEEK of SCRCT), it will not see the update until the next statement (internally, the change is correctly applied). Also, if there is some STR$() in the PRINT, it will not show the side effect it has in the original ZX (there, it restores global colors).
  • It is not possible to delete the enclosing quotes in string INPUT.
  • Any token typed in INPUT with a single key press appears as a multiple char string and is deleted char by char, not as a token. The token texts are the same as in the original ZX except if they have a space in between, which does not exist here. Also, some tokens do not exist here (all of this is a byproduct of using the current BASIC lexer and grammar).
  • A numeric INPUT does not produce any error if it is left empty; after ENTER is pressed, the result will be 0.
  • PAUSE, INPUT, INKEY$ do not respond to special keys in the PC keyboard, only to those in the ZX keyboard.
  • INKEY$ returns a text with all the characters of a token if a token is pressed through SYMBOL_SHIFT, instead of the token code (single byte) in the original ZX. INKEY$ returns '?' when CAPS + 2 is pressed (in the original ZX it returns the comma ctrl char) or CAPS + 1 (edit ctrl char) or CAPS + 3 (0x04 char) or CAPS + 4 (0x05 char) or CAPS + 9 (0x0f char) or CAPS + 0 (delete ctrl char) or CAPS + SYMB (14 char).
  • An error in a numeric input is not shown exactly as in the original ZX (with a flashing '?' marker in the error spot).
  • The IN function and the OUT statement only work for the Kempston mouse ports and for port 254 (keyboard and border). For any other port, IN always returns 0xff and OUT does nothing. The keyboard membrane works as in the original ZX but without the phantom key presses caused by completing a rectangle in the physical membrane; the highest bits are set as in most emulators when reading port 254.

On the other hand, the particularities when using the simpler mico mode are:

  • No graphical output.
  • No graphical input: ATTR always yields 0, SCREEN$ always yields a space, and POINT always yields 0.
  • No effects of AT or # (stream).
  • The number of columns is unlimited.
  • TAB moves column to the value of its parameter, without doing modulus 32 on that parameter, inserting spaces.
  • APOSTROPHE works as usual (inserts a new line if it is not the last arg of PRINT/INPUT).
  • COMMA works as usual (as TAB to move to the next 32 portion of the columns, but without newlines).
  • Floating point numbers are not printed with the same original ZX algorithm but with a standard C++ one.
  • No effects of BORDER, CLS, PAPER, INK, etc.
  • No graphical characters or UDGs. The former are printed as 'G' and the latter as their lowercase letters.
  • Control characters (<' ') are not printed.
  • Characters corresponding to tokens are printed as the token names.
  • Other non-standard ASCII characters are printed as '?'.
  • Unless a newline is printed, the console may not show its content.
  • INPUT and INKEY$ admit escaped codes.
  • No sound output (just pauses are done for the duration of BEEP).
  • No pause is breakable except PAUSE 0.
  • IN() always yields 0xff.

Download

The development history of ZX-Basicus started in Spring 2019 (v1.0.0). It is listed below along with some downloads:

  • [Jul 2023] ZX-Basicus v2.2.2. Improved detection and management of obfuscated elements in binary BASIC programs; improved management of tape items with diverse flags; improved listing of tokens in ZXBasicus Analyzer.
  • [Jul 2023] ZX-Basicus v2.2.1. Fixed a minor bug when parsing optimization options of procedural tapes.
  • [May 2023] ZX-Basicus v2.2.0. Rewritten BASIC language expression parser that can now reduce any part of an expression and thus produce better optimization; added a new option to the "alloptim" transformer that generates a dynamic HTML file showing the changes made by the optimizer; improved documentation and minor bugs fixed.
  • [Feb 2023] ZX-Basicus v2.1.5. Changed internally the "all" optimization option to be more modular, improved its documentation and also improved some of the options to get better effects. Fixed a bug with recognizing creation of variables in BASIC source code.
  • [Jan 2023] ZX-Basicus v2.1.4. Fixed several bugs related to the interpreter.
  • [Jan 2023] ZX-Basicus v2.1.3. New optimization flag in procedural tapes for not obfuscating number literals; minor bugs fixed.
  • [Jan 2023] ZX-Basicus v2.1.2. Now the conversion from procedural tape to conventional tape generates a file with info about the optimizations used, if any; fixed a bug in obfuscating number literals. All versions compiled for 64 bits.
  • [Jan 2023] ZX-Basicus v2.1.1. Minor bugs fixed. Previously: ZX-Basicus v2.0.1. Added the possibility of only substituting constant strings to --subscts; added the detection of IF with always-true condition to --delunreach; added support for loading .png images to procedural tapes. Windows version compiled for 64 bits.
  • [Dec 2022 - Jan 2023] ZX-Basicus v2.0.1. Improved and more robust optimization procedures in the optimizer tool; added a new optimizer option --summexprs to evaluate expressions before runtime; replaced the optimizer option --subs1var with the more powerful --subscts; added a new optimizer option --delunreach to delete all statements unreachable in the code; added a new --obfsnum option to the synthesizer for obfuscating number literals; improved some reports of the analyzer; fixed a number of bugs. Windows version compiled for 64 bits.
  • [Dec 2022] ZX-Basicus v1.12.0 (erroneously labelled as v2.0.0). Support for the new procedural tape format (.tab); included a tape manager for the manual use of tapes during execution; support for BASIC keywords with inner spaces (e.g., both "DEF FN" and "DEFFN" are valid) and for empty lines within the code to better separate meaningful sections; support for escaped control codes written in source text at any place; improved some reports of the analyzer; fixed minor bugs. Windows version compiled for 64 bits.
  • [May 2022] ZX-Basicus v1.11.1. Emulation of the NEWPPC:NSPPC and PPC:SUBPPC system variables for allowing to do GOTO to particular statements through POKE in those variables; fixed minor bugs. Windows version compiled for 64 bits.
  • [Mar 2022] ZX-Basicus v1.11.0. Added options to the file manager for creating TZX files from TAP files (and vice versa) and to create WAV files from TAPs; fixed minor bugs. Windows version compiled for 32 bits.
  • [Oct 2021] ZX-Basicus v1.10.0. Fixed minor bugs; added the --progname option to specify the BASIC name of the program in the synthesized tape file. Windows version compiled for 64 bits.
  • [June 2020] ZX-Basicus v1.9.0. Fixed several minor bugs, increased robustness, improved fidelity to the original ZX interpreter behaviour, better console help, new BASIC code generators in the synthesizer (for the DEFADD trick and compression of screens), new "decolour" mode in the interpreter and viewer of files for inspecting screen bitmaps.
  • [May 2020] ZX-Basicus v1.8.1. Fixed several minor bugs.
  • [May 2020] ZX-Basicus v1.8.0. Added a simple GUI for selecting options of the different modules of the program if it is called without any option; added option --silent to the interpreter to not produce any output in console except stream #3; added support for stream #3 (printer) to the full executor redirected to the console, as well as LPRINT statement; added support for Kempston mouse in proper ports in the full mode of the interpreter; added the --valtrick option to the optimizer for substituting automatically isolated numeric literals by the VAL trick or similar; added the --alloptim option to the optimizer to perform several optimizations sequentially and automatically; fixed a few minor bugs.
  • [May 2020] ZX-Basicus v1.7.0. Added the new file management component (-f); extended the utility of the --obfuscated option to both interpreter and transformer, with improved functionality; changed the encoding of tokens and other non-ASCII codes in source files to a more powerful and clear syntax; improved binary program token listing and included binary variables listing; improvements in the clarity of produced analysis reports; improved robustness in the management of files and in the fidelity of the interpreter with the original ZX; fixed several minor bugs.
  • [Apr 2020] ZX-Basicus v1.6.0. Added support for analyzing .TAP files, improved detection of obfuscated hidden numbers, proper detection that the program is empty in a .SNA file, and changed / clarified the order of dumping info into files while analyzing.
  • [Apr 2020] ZX-Basicus v1.5.1. Fixed a (stupid) bug that prevented correct synthesizing.
  • [Apr 2020] ZX-Basicus v1.5.0. This version improves the information obtained from the lexical analyzer about the tokens in a SNA file, useful to detect invalid code, and adds a new option to the analyzer for dealing with obfuscated code.
  • [Feb 2020] ZX-Basicus v1.4.1. This version fixes a bug with the --mergelines transformation tool.
  • [Jan 2020] ZX-Basicus v1.4.0. This version comes with substantial improvements with respect to previous ones, including the optimizer and debugger as main novelties.
  • [Dec 2019] ZX-Basicus v1.3.1. This version fixes some minor bugs.
  • [Dec 2019] ZX-Basicus v1.3.0. This version fixes some bugs, improves fidelity to the original ZX and supports additional options.
  • [Nov 2019] ZX-Basicus v1.2.1. This version supports transparent BRIGHT and FLASH and fixes some minor bugs in the previous one. It has also been tested with more BASIC programs taken from several Spanish "bytemaniacos" contests.
  • [Sep-Oct 2019] ZX-Basicus v1.2.0 both for win32 (tested in Windows 10) and lin64 (tested in Debian 10), and also some BASIC programs taken from here and there for you to test and play.
  • [Aug 2019] ZX-Basicus v1.1.0 for Windows 7 and up and for Linux 64 bits, (not including the interpreter), plus a SNA test file ready to be analyzed and a BASIC plain text test file ready to be synthesized.

Contact

ZX-Basicus has been developed by Juan-Antonio Fernández-Madrigal. If you are interested in contacting me, you can use "software" (remove quotes) at jafma.net.