• ExLisperA
    1 day ago

    Microbenchmarks which are heavily gamed

    Which benchmarks aren’t?

    • FizzyOrange@programming.dev
      21 hours ago

      Private or obscure ones I guess.

      Real-world (macro) benchmarks are at least harder to game, e.g. how long does it take to launch Chrome and open Gmail? That’s actually a useful task, so if you speed it up, great!

      Also, these benchmarks are particularly easy to game because it’s the benchmark itself that gets gamed (i.e. the code for each language), not the thing you are trying to measure with the benchmark (the compilers). Usually the benchmark is fixed and it’s the targets that contort themselves to it, which is at least a little harder.

      For example, some of the benchmarks for language X literally just call into C libraries to do the work.
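
      A rough sketch of what that looks like, using Python as a stand-in for "language X". This is not taken from any actual benchmark suite; the function names and input size are made up for illustration. Both functions return the same answer, but the second one spends almost all of its time inside CPython's C-implemented `sum()`, so timing it says little about how fast Python code itself runs:

      ```python
      import timeit


      def sum_pure(values):
          """Sum computed by a loop that runs as interpreted Python bytecode."""
          total = 0
          for v in values:
              total += v
          return total


      def sum_delegated(values):
          """Same result, but the loop runs inside the C-implemented built-in sum()."""
          return sum(values)


      # Hypothetical workload: the size is arbitrary.
      data = list(range(100_000))

      # Both "benchmark entries" must agree on the answer.
      assert sum_pure(data) == sum_delegated(data)

      t_pure = timeit.timeit(lambda: sum_pure(data), number=20)
      t_delegated = timeit.timeit(lambda: sum_delegated(data), number=20)

      print(f"pure Python loop: {t_pure:.4f}s")
      print(f"built-in sum (C): {t_delegated:.4f}s")
      ```

      On a typical CPython install I'd expect the second number to be noticeably smaller, but the exact ratio depends on the interpreter version and machine, so treat the timings as illustrative only.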

      • ExLisperA
        21 hours ago

        Private or obscure ones I guess.

        Private and obscure benchmarks are very often gamed by the benchmarkers. It’s very difficult to design a fair benchmark (e.g. Chrome can be optimized to load Gmail, for obvious reasons. Maybe we should choose a fairer website when comparing browsers? But which? How can we know that neither browser has optimizations specific to page X?). Obscure benchmarks are useless because we don’t know if they measure the same thing. Private benchmarks are definitely fun but only useful to the author.

        If a benchmark is well established you can be sure everyone is trying to game it.