Appendix D. Backend Details

Table of Contents

Limits
Datastructure Limits
Symbol Uniqueness
Performance and Efficiency considerations
Math
Strings
Interrupts

Note

This appendix is under development

This appendix describes details of SISC on particular backends. This is not intended to guide programming. The programmer should code according to the main body of this document. However, this section still describes useful performance tips and limitations of SISC's operation.

Limits

This appendix lays out the various limits in SISC running on a JVM backend. These limits are not specifications for an expected set of limits on all platforms, but serve as a real-world guide.

Datastructure Limits

Table D.1. SISC Limits

DescriptionLimit
Fixed-point Exact Integers-231 < n < +231-1
AP Exact Integers -2(232-1) < x < +2(232-1)-1.
Inexacts (32Float)See IEEE 754-1985 Floating Point Standard
Inexacts (64Float)See IEEE 754-1985 Floating Point Standard
Inexact Mantissa (APFloat)same as AP Exact Integers
Inexact Exponent (APFloat)same as fixed-point exact integer
Max vector elementsSame as max fixed-point integer
Max string elementsSame as max fixed-point integer (?)
Representable characterssee the section called “ Characters ”
Maximum formal parametersSame as max fixed-point integer
Maximum lexical depthSame as max fixed-point integer
Maximum symbolic-environment bindingsSame as max fixed-point integer (?)
Addressable file sizemin of 264-1 and operating system limit


Arbitrary precision integers aren't quite arbitrary precision. SISC has a hard limit to the number of bits in an exact integer and thus to the range of representable numbers. Exact integers are stored as two's complement signed integers, with a bit limit (including the sign bit) of 232. This limits the range of representable exact integers to the numbers quoted above.

Likewise, arbitrary precision inexact numbers (when present) have a similar hard limit. The arbitrary precision inexact is constructed with an arbitrary precision exact number with the limits described above as the number's mantissa, and an exponent whose range is equivalent to that of a fixed-point exact integer. The inexact is then then mantissa*10exponent.

Symbol Uniqueness

In order to support compilation in multiple threads, on multiple machines, or in multiple times, generated symbols for module bindings, lexical variables, etc. must be 4d-unique, that is they must be unique across space and time. SISC attempts to balance this requirement with the space inefficiency of generating symbols with very long names.

SISC's unique symbols are generated by creating a number of the form current-time + (random-16-bit-natural*311040000000) + (counter*155520000000). The current time variable is only updated when the value of counter reaches 65536. In this manner, two entities that generate the same symbol will only generate a colliding symbol if they generate the symbol on the same system millisecond, with the same counter value, and the same random number, or, if one entity happens to generate the symbol with the same millisecond and does so (counter-1)*50 years, (counter-2)*100 years... in the future, and with the same random or (random-1)*100 years, (random-2)*200 years, etc in the future. Only if all of these factors align will a colliding symbol be generated. This is not as unlikely as say a Microsoft GUID or Java VMID number, but it should be sufficiently unlikely. The advantage over other GUID algorithms is that the value produced by SISC's is significantly smaller (and thus does not bloat expanded code).

Performance and Efficiency considerations

Math

The SISC numeric library is most efficient when operating on fixed bitlength numbers. Exact numbers are in their fixed bitlength mode if they are in the representable range for fixed exact integers, as described in Table D.1, “SISC Limits”. Fixed bitlength inexact numbers are only available in the 64Float and 32Float libraries. For SISC on Java on the x86 32-bit architecture, the 64Float library is generally more efficient than the 32Float library, while both are more efficient than the APFloat library.

Fixed bitlength exact integers are only used for whole numbers. Rational numbers use arbitrary precision components and thus are less efficient than whole fixed integers.

Arbitrary precision inexact numbers are progressively slower as the bitlength of the mantissa and the scale of the exponent increase. Using the precision constraints can prevent an unbounded increase in the scale of arbitrary precision inexacts which will very rapidly slow calculations.

Strings

At the time of this writing, the Scheme string type can be represented either as a character array, a native string, or simultaneously as both. The character array representation allows efficient, constant time modification of a mutable string (using string-set! for example). The native string representation allows efficient output to ports, string comparison, and substring operations.

By default, SISC allows the Scheme string to contain both representations simultaneously, ensuring that there is not a costly representation conversion necessary to perform certain operations. However, in this default mode, strings may occupy twice the memory as a string in a single representation. If a program uses many strings or several very large strings, the programmer may wish to create strings that may only be in one representation at any given time. SISC provides a parameter to control this behavior.

parameter: (compact-string-rep [boolean]) => #t/#f

This parameter, if set #t, will force strings to be represented either as a character array, or as a native string, but never both. If false, simultaneous representations are possible.

Interrupts

Interrupts allow running Scheme code to be forcibly broken from another thread, causing the Scheme code to raise an error. The interrupt signal handling code does add an appreciable overhead (usually between 1-5% depending on the JVM) to execution. It can disabled using the sisc.permitInterrupts system property.