Haskell, Erlang, and Frank walk into a bar – and begin new project to work in Unison

At the Strange Loop conference in St. Louis, Missouri, earlier this month, Paul Chiusano, founder of Unison Computing, gave the audience a tour of Unison, an emerging programming language designed for building distributed systems. Created by Chiusano, Arya Irani and Rúnar Bjarnason, Unison was inspired by Haskell, Erlang, and …

  1. sorry, what?
    WTF?

    Making a hash of it...?

    A couple of things sprang to mind when reading this (and, no, I haven't gone about finding out any more by reading any of the cited links):

    1. How are they going to deal with hash collisions? Hashes are NOT unique identifiers; they are semi-unique at best.

    2. How do you ensure that a bug fix, which changes the hash of a function, is correctly applied (updating all references to the broken function with the new one)?

    Yeah, my brain isn't big enough to handle Erlang so I'm guessing it would go "*phut*" in Unison...

    1. richard?

      Re: Making a hash of it...?

      I guess the chances of a hash collision are so much smaller than the collisions we already have with a combination of name + types that it isn't a priority to worry about.

      The second point is probably an advantage - no unexpected changes, the references must be _deliberately_ updated; it also allows parallel testing and stuff in a real environment, and since you can uniquely identify a function you can easily see if the old one is being used.

    2. Charlie Clark Silver badge

      Re: Making a hash of it...?

      When collisions are detected, the hash algorithm is considered broken. For something as deterministic as a function's AST, the absence of hash conflicts with the AST of another function might even be provable. Basically: a different function that has the same hash has the same AST.

    3. Anonymous Coward
      Anonymous Coward

      Re: Making a hash of it...?

      1. These hashes are essentially random 512-bit numbers. To get an idea how big a space of numbers that is, imagine a two-dimensional plane with every atom in the known universe on the X axis, then again on the Y axis. A 512-bit hash is basically throwing a dart at a dartboard that size. You are more likely to win every lottery on the planet in the same week, while simultaneously getting struck by lightning and hit by a meteor, than you are to find a hash collision in SHA3-512.

      2. I imagine a bug fix is done much the same way it's done in other languages, by shipping new code (which would have a new hash).

      1. Brewster's Angle Grinder Silver badge

        You've ignored the birthday paradox. There's a 50% chance two of them will share a hash if you have a pool of 1E77 programs.

        1. hythlodaeus

          So if you hashed a billion programs per second, the expected time to the first collision would be 10^60 years or so :)
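
          (Sanity-checking that with the usual birthday approximation, where a 50% collision chance needs about 1.18 * sqrt(N) samples from a space of N hashes - just a throwaway Haskell calculation, nothing Unison-specific:)

              -- Rough birthday-bound arithmetic for a 512-bit hash space.
              main :: IO ()
              main = do
                let pool    = 1.18 * 2 ** 256 :: Double  -- ~1.4e77 programs for 50% odds
                    perSec  = 1e9                        -- a billion hashes per second
                    secPerY = 3.15e7                     -- seconds in a year
                putStrLn ("programs for a 50% collision: " ++ show pool)
                putStrLn ("years at 1e9 programs/sec:    " ++ show (pool / perSec / secPerY))
                -- prints on the order of 4e60 years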

        2. Michael Wojcik Silver badge

          I've got 10^77 problems, and an SHA3-512 collision ain't one.

      2. sorry, what?

        Re: Making a hash of it...?

        1. They are deterministically computed, not random. Yeah, the likelihood is slim but it COULD still happen. If it does, this language is hosed. My point stands.

        2. My point is that the original "function" was referenced by one hash and now it has another. I think you are implying that you have to build/deploy all the code again so that the referring code correctly references the bug-fixed code, but I'm not sure. That implies you must "own" all the code in the solution, since it all has to be re-built and re-deployed to resolve all those references, so don't even think about models like shared libraries, right?

        (And why hide behind anonymous posting?)

        1. Michael Wojcik Silver badge

          Re: Making a hash of it...?

          Yeah, the likelihood is slim but it COULD still happen. If it does, this language is hosed. My point stands.

          It's much, much, much, much slimmer than, say, the likelihood of all life on earth being wiped out by a gamma-ray burst. Your threat model is idiotic, and your point does not stand.

        2. hythlodaeus

          Re: Making a hash of it...?

          Hello, I’m one of the creators of the Unison language. Happy to answer your questions!

          While it’s true that a hash collision is theoretically possible, the chances are extremely remote. If you found one, you would win some kind of cryptography reward and you would be famous. We don’t expect any two Unison programs to have the same hash for at least a few trillion trillion trillion trillion trillion trillion years. :)

          And if they do, the language isn’t “hosed”, we just outlaw the offending hash and move on. Every program turns out to have an infinite number of equivalent programs that have a different hash.

          I’m not sure I understand the second problem. If you want to have a shared library and publish an update, it’s always going to be true that users of your library have to change (or at least relink) their code to get your latest version. In Unison you accomplish this by publishing a patch that upgrades users’ code. They just apply the patch and now their code references the latest hash. Maybe I’m misunderstanding what you mean.
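
          If it helps to picture it, here is a toy sketch (illustrative Haskell, not our actual implementation; all the names are made up): code refers to definitions by hash, and a patch is essentially a map from old hashes to new ones, applied to the references in a caller.

              -- Sketch: a "patch" rewrites hash references; everything else is untouched.
              import qualified Data.Map as Map
              import Data.Map (Map)

              type Hash = String  -- stand-in for a 512-bit digest

              data Term = Ref Hash | App Term Term | Lam Term | Var Int
                deriving Show

              type Patch = Map Hash Hash

              applyPatch :: Patch -> Term -> Term
              applyPatch p (Ref h)   = Ref (Map.findWithDefault h h p)
              applyPatch p (App f x) = App (applyPatch p f) (applyPatch p x)
              applyPatch p (Lam b)   = Lam (applyPatch p b)
              applyPatch _ v         = v

              main :: IO ()
              main = do
                let patch  = Map.fromList [("#oldSquare", "#newSquare")]
                    caller = App (Ref "#oldSquare") (Var 0)
                print (applyPatch patch caller)  -- App (Ref "#newSquare") (Var 0)

          Applying the patch is the whole upgrade: afterwards the caller references the latest hash, and the old definition is still there for anything that deliberately keeps using it.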

  2. sum_of_squares
    Trollface

    Yet another lisp flavoured haskell?

    Oh well, why not. One can never have enough functional crossbreeds...

    1. hammarbtyp

      waskell?

    2. MiguelC Silver badge
      Trollface

      "hathkell"

      1. BebopWeBop
        Happy

        You are obviously a lithpel enthusiast

  3. SVV

    we don't have to keep running the same tests over and over

    Once you discover that unexpected side effects resulting from changes elsewhere will still propagate through the system, you will have to admit that you will have to keep running the same tests over and over. Distributed systems are always much more tricky when it comes to testing, when changes occur in one place that other places aren't aware of. With the wild claims being made, as they hype yet another niche language that they think will revolutionise the world of programming, they might have done better to call it Unicorn. There are always downsides to any approach, and I can just feel in my old programmer bones that this extreme form of dynamic linking will have many.

    1. hythlodaeus

      Re: we don't have to keep running the same tests over and over

      Hello, I'm one of the creators of Unison. You are of course correct that if a test has a side-effect then we cannot cache the results. This is going to be the case for things like integration tests. So Unison will not cache the results of all tests. But we can actually tell which tests are going to have side effects and which won't (Unison's type system gives us this ability), and it turns out that most unit tests can be run without any side effects at all.
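
      As a toy illustration of the caching idea (a Haskell sketch with invented types; Unison's real mechanism is its ability system): a test the type system can see is pure gets cached against the hash of the test and its dependencies, while anything effectful is always re-run.

          -- Sketch only: 'Pure' stands for tests whose types admit no effects.
          import qualified Data.Map as Map
          import Data.Map (Map)
          import Data.IORef

          type Hash = String

          data Test = Pure Hash Bool       -- hash of test + deps; pure result
                    | Effectful (IO Bool)  -- e.g. an integration test

          runTests :: IORef (Map Hash Bool) -> [Test] -> IO [Bool]
          runTests cacheRef = mapM run
            where
              run (Effectful io)  = io  -- never cached: the world may have changed
              run (Pure h result) = do
                cache <- readIORef cacheRef
                case Map.lookup h cache of
                  Just r  -> pure r       -- same hash, same deps => same result
                  Nothing -> modifyIORef cacheRef (Map.insert h result) >> pure result

          main :: IO ()
          main = do
            cache <- newIORef Map.empty
            print =<< runTests cache [Pure "#t1" (2 + 2 == 4), Effectful (pure True)]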

      Your comment about this being an "extreme form of dynamic linking" got me thinking. I too have nightmares about dynamic linking and "DLL hell". But I think Unison's approach is actually a form of extreme *static* linking. That is, the problem with dynamic linking is that the address of linked code isn't known in advance--you only have a link which is maybe a name, a version, and an offset. Links can be broken, so dynamic linking can fail and routinely does. But hashes in Unison are essentially static pointers into a vast shared memory space. So the referent of a given hash is always going to be at the same address in that space, just like in a statically linked program.
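
      A toy model of that shared space (again just a sketch; real Unison hashes the AST with SHA3-512, while here a 64-bit hash of the source text stands in): the store maps hash -> definition, so a given hash either dereferences to exactly the definition it was computed from or to nothing - it can never silently resolve to something else.

          -- Toy content-addressed store; FNV-1a is a stand-in for SHA3-512.
          import Data.Bits (xor)
          import Data.Char (ord)
          import Data.Word (Word64)
          import qualified Data.Map as Map
          import Data.Map (Map)

          contentHash :: String -> Word64
          contentHash = foldl step 0xcbf29ce484222325
            where step h c = (h `xor` fromIntegral (ord c)) * 0x100000001b3

          type Store = Map Word64 String

          -- Adding a definition returns its "address" in the shared space.
          add :: String -> Store -> (Word64, Store)
          add def s = let h = contentHash def in (h, Map.insert h def s)

          deref :: Word64 -> Store -> Maybe String
          deref = Map.lookup

          main :: IO ()
          main = do
            let (addr, s) = add "square x = x * x" Map.empty
            print (deref addr s)  -- Just "square x = x * x"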

    2. Claptrap314 Silver badge

      Re: we don't have to keep running the same tests over and over

      I agree with your warning instincts, but you've not articulated the problem quite the way I see it.

      The problem is not that unit tests might have side effects. After all, it's a paddling if they do. The problem is that unit tests might be affected by things that are not part of the unit test suite at all. In the end, unit tests are themselves just code, and therefore just as vulnerable to problematic thinking as the code that they are testing. There are any number of things that someone might assume are constant that are not. I personally observed a unit test that would fail if executed in the vicinity of a daylight savings time change.

      So maybe I'm being a princess about this pea, but I really like to see that unit test suite run in its entirety on a very regular basis.

      1. hythlodaeus

        Re: we don't have to keep running the same tests over and over

        I think you’ll find that if your test is able to observe a DST change, it has a side-effect. Cached Unison tests are not allowed to observe the system time. Tests that do this do not get cached.

        1. Claptrap314 Silver badge

          Re: we don't have to keep running the same tests over and over

          Having side effects is about making external state changes. I'm talking about being vulnerable to them.

          1. hythlodaeus

            Re: we don't have to keep running the same tests over and over

            Either way, Unison is absolutely aware of whether a test (or any code) is able to observe or manipulate the outside world.

  4. adam 40 Silver badge
    Joke

    Closed shop

    "Unison can transfer an arbitrary computation, including its dependencies, to a remote node"

    No it won't, Brother! One out, all out!!!

  5. Arthur the cat Silver badge

    Frank

    I thought Frank was the chap the government wanted you to talk to about drugs.

    1. Michael Wojcik Silver badge

      Re: Frank

      Nah, Frank is "a strict functional programming language designed from the ground up around a novel variant of Plotkin and Pretnar’s effect handler abstraction".

      You've probably already guessed this, but "in Frank, the equational theory is taken to be the free theory, in which there are no equations".

  6. Claptrap314 Silver badge

    How does this speed refactoring?

    The idea of hash-based addressing is intriguing. But do we really want our binaries to require 512 bits for every call? That's a huge cost. Or is there some sort of index table in the executable used solely to manage updates?

    And as for renaming, that's a job for the source code. Nothing saved there except for the recompile step. The big win is not that it makes recompiling unnecessary, but rather that it can be used as a check--if the recompile changes anything, then the rename was bad!

    1. hythlodaeus

      Re: How does this speed refactoring?

      Hello, I’m one of the creators of Unison. It’s true that 512 bits (64 bytes) is a bit larger than what today’s CPUs typically use for pointers (8 times larger to be exact), but this alone is not going to contribute significantly to the memory footprint of the typical Unison program (user data is going to do that). We think this is a fair price to pay for the abilities we get from content-addressing code.

      Regarding renaming... if you have e.g. a Java library where you’ve named something x, and lots of user code refers to it as x, then if you rename it to y and republish your library you’re going to break everyone’s code. Names are really important in traditional languages. But in Unison, the name is just metadata. You can rename a function from x to y, republish your code, and everyone else’s code still works! Because they weren’t referring to the name anyway. Their code was referencing the hash.
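
      A tiny sketch of what "the name is just metadata" means (illustrative Haskell, not our storage format): definitions live in a hash-keyed map, names live in a separate map beside it, and a rename rewrites only the name table.

          -- Sketch: renaming touches the name table, never the code store.
          import qualified Data.Map as Map
          import Data.Map (Map)

          type Hash = String

          data Codebase = Codebase
            { defs  :: Map Hash String  -- hash -> definition (immutable)
            , names :: Map String Hash  -- name -> hash (just metadata)
            }

          rename :: String -> String -> Codebase -> Codebase
          rename old new cb = cb { names = retag (names cb) }
            where retag m = case Map.lookup old m of
                    Just h  -> Map.insert new h (Map.delete old m)
                    Nothing -> m

          main :: IO ()
          main = do
            let cb  = Codebase (Map.fromList [("#h1", "x -> x * x")])
                               (Map.fromList [("square", "#h1")])
                cb' = rename "square" "sq" cb
            print (names cb')  -- fromList [("sq","#h1")]

      Code that was compiled against the hash never notices the rename; only the human-facing lookup changes.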

      Because of the hashes, Unison knows a lot more about the structure of your codebase than a typical IDE, so we can make refactoring a very controlled experience. The typical workflow in most languages is that you make a change and your codebase is broken until you finish propagating that change (manually) throughout. But a Unison codebase is never broken that way, even in the middle of a refactoring.
