cobertos 2 hours ago

You mention integer ranges as a specific type you hope gets handled. But how does something like that work? It feels somewhere between the general type and the specialization type? Would this just be a higher level in the integer section of the specialization lattice? Or would you want it out of the lattice structure completely? Because it would kinda conflict with integer unions, when thinking about it lattice-y

(Admittedly I'm in the same boat, I don't know the math but stumbled upon implementing my own type system recently)

  • sparkie 2 hours ago

    Every _BitInt(N) is a subtype of _BitInt(N+1). For practical purposes though, you'd usually make each intN a subtype of intN*2, up to some top integer type which supports arbitrary precision.

        int8 <= int16 <= int32 <= int64 <= bigint
    
    For this mini subtype hierarchy, you can literally just number the types in increasing order, and use `<=`, rather than bitmasks.

    For inclusion of unsigned integers, the same rule applies, but each `unsigned _BitInt(N)` is also a subtype of `signed _BitInt(N+1)`. It might make sense to number the unsigned integers one higher than their signed variants of the same width to simplify testing for compatibility. A specific bit, such as 1000b could indicate a signed integer.

        UNSIGNED    = 0000b;
        SIGNED      = 1000b;
    
        enum IntType {
            UINT8   = UNSIGNED | 1,
            UINT16  = UNSIGNED | 2,
            UINT32  = UNSIGNED | 3,
            UINT64  = UNSIGNED | 4,
            UBIGINT = UNSIGNED | 7,
    
            INT8    = SIGNED | 0,
            INT16   = SIGNED | 1,
            INT32   = SIGNED | 2,
            INT64   = SIGNED | 3,
            BIGINT  = SIGNED | 7,
        };
    
        is_uint8_subtype(ty)    = (ty | UNSIGNED) <= UINT8
        is_uint16_subtype(ty)   = (ty | UNSIGNED) <= UINT16
        is_uint32_subtype(ty)   = (ty | UNSIGNED) <= UINT32
        is_uint64_subtype(ty)   = (ty | UNSIGNED) <= UINT64
        is_natural(ty)          = (ty | UNSIGNED) <= UBIGINT
    
        is_int8_subtype(ty)     = (ty | SIGNED) <= INT8
        is_int16_subtype(ty)    = (ty | SIGNED) <= INT16
        is_int32_subtype(ty)    = (ty | SIGNED) <= INT32
        is_int64_subtype(ty)    = (ty | SIGNED) <= INT64
        is_integer(ty)          = (ty | SIGNED) <= BIGINT
    
    This is approximately how I handle it the numerical tower in my interpreter (extended with some ad-hoc rules to also support rationals, real, complex and quaternions).
noelwelsh 18 hours ago

No parametric polymorphism aka generic types?

  • choeger 17 hours ago

    There's little value for parametric polymorphism at this level as it requires parameters (not only type parameters, but also value parameters, as otherwise the type parameter are dangling).

    It's better to think of these types as different. Maybe "primitive types" or "shapes".

  • tekknolagi 18 hours ago

    It's in the context of a Python JIT, where we're looking for a different kind of type information

    • abeppu 18 hours ago

      JIT internals are very much not my area so this might be an ignorant question but ... why doesn't the parametric issue come up in this context? The author starts with the example of a value which could be a List or a String, where the code calls for its `len`. But one could also need to reason about `x: list[list[str]] | list[str]` and where we'll do `y = len(x[0]) if len(x) > 0 else -1` which requires the system to be able to represent both the type of x and x[0] right?

      • deredede 17 hours ago

        You typically don't represent that information in the type of the object itself but rather as separate information for the type of the values (ie fields). So maybe it's "a list, and also the values are lists" or maybe it's "a list, and also the first value is a string, and the other values are integers"... and maybe it's "a list, and the values are strings, and also it has a field named 'foo' that's an integer for some reason".

        You can still have the same information (depending on the actual system), just represented differently - parametric polymorphism is a bit too rigid here because not everything fits neatly into generic schemas.

        • tekknolagi 17 hours ago

          And another bit of info: it's much harder to reliably track that, because as soon as the list (for example) escapes, its elements could have any type and it could be any size. PyPy has storage strategies and speculative typing (with guards and deoptimization) that help with this kind of thing.

  • nsajko 16 hours ago

    Julia does this for parametric types, too.

PhilipRoman 18 hours ago

wow I guess great minds really do think alike, I did almost the same exact thing a few years ago, but eventually gave up as my type hierarchy grew too complex.

You could probably represent a lot more complex relations with similar strategies by adding one or two cleanup instructions to union/intersection operations, but whenever I've tried to do it, my head gets dizzy from all the possibilities. And so far I've been unable to find software that can assist in generating such functions.

  • sparkie 14 hours ago

    One possible trick for unions is to use bloom filters, providing you have some way of hashing types. If you create a bloom filter `Bloom(t1, t2)`, it's the same as doing `Bloom(t1) | Bloom(t2)`, where `|` is just bitwise-OR.

    Obviously not perfect as it can produce false positives, but if we keep a filter of sufficient size, this will be low, and still be more space-efficient than keeping a bit per type in a large type hierarchy.

    Intersections can also be done, but with a potentially higher false-positive rate. The result of `Bloom(t1, t2)` has at least the bits set by `Bloom(t1) & Bloom(t2)`.

    • recursive 11 hours ago

      I don't think space is going to be the first issue that causes this to fall. It will probably be the human mind's ability to conceptualize the space.

mncharity 11 hours ago

OT, I'm so very looking forward to some language supporting a pushout (path independent) lattice of theories (types, operators, laws). So abstraction has math-like locality and composability and robustness.

pizlonator 11 hours ago

Looks almost exactly like what JavaScriptCore calls SpeculatedType.

  • tekknolagi 11 hours ago

    Excellent, thanks for the pointer. Mentioned!

IsTom 9 hours ago

This is also how Erlang's Dialyzer works.

  • tekknolagi 9 hours ago

    Have a link to the implementation?

pjs_ 16 hours ago

Come on man just do duck typing. You are killing me with this stuff.

It all reads so technical and long and mathematically academic but it’s just a bit mask.

I absolutely hate writing python now because I have to reason about a “list of list of errors” type defined by a teenager and they get mad at me if I don’t then define a new “list of list of errors, or none” type when I manipulate it. You guys are now employed by VSCode to make those hints look beautiful. VSCode is your boss now and your job is to make VSCode happy

I predict that in five years we will retvrn to duck typing in the same way that we are now retvrning to server side rendering and running on bare metal. Looking forward to the viral “you don’t need compound types” post on here. “Amazing - you can write code which does normal business tasks without learning or ever thinking about homotopy type theory”

Yes I get it if we are writing embedded code or navigation systems or graphics or whatever, please help yourself from the types bucket. Go ahead and define a dictionary of lists where one list can contain strings but all the other lists either contain 8-bit integers or None. But the academic cachet of insanely complex composable type systems bleeds through into a web server that renders a little more than “hello world” and it ruins my life

  • cobertos 2 hours ago

    Duck typing requires the primitives of the properties to also be typed. This is a building block you must built to build up to duck typing.

    I'm also unsure how you would represent a "list of list of errors" succintly without calling it just that. A list of a list of errors. An `Error[][]`. How would "duck typing" as you describe make this type somehow less complex?

  • tekknolagi 15 hours ago

    This is not about surface level typing. This is about compiler internals.

    • tekknolagi 15 hours ago

      I added a couple of sentences in the intro to clarify.

  • __MatrixMan__ 11 hours ago

    I don't think there will be a movement away from type hints in python any time soon, they're too useful as guardrails for an LLM. But even without using an LLM I'm definitely faster with type hints enforced because I can just sort of feel around with autocomplete. Sure, the transition sucks, but once you're there the benefits start stacking up pretty fast.

  • Dr_Incelheimer 15 hours ago

    >filtered by Python types

    lmaooo sub-2σ golem detected