Почему эта программа OCaml быстрее, чем моя программа C?

Данные должны всегда развертываться для проживания с помощью сценариев, которые могут репетироваться так много раз, как они требуются, чтобы разбираться в нем на dev. Когда существуют зависимые данные для сценария, чтобы работать правильно на dev, подготовить его соответственно - Вам не может сойти с рук этот шаг, если Вы действительно хотите быть осторожными.

11
задан Jason Baker 19 October 2009 в 00:27
поделиться

5 ответов

Your C code isn't the equivalent of the OCaml code - you used 'else if' in the OCaml to avoid having to recompute moduli quite so much.

There's an awful lot of code in that 'read the long integer'. Why not just use fscanf(); it skips blanks and all that automatically, and avoids you doing the malloc() etc. I don't often recommend using fscanf(), but this looks like a setup for it - a single line, possibly with spaces either side, no funny stuff.


Curiosity killed the cat - but in this case, not the Leopard.

I downloaded OCaml 3.11.1 for MacOS X Intel and copied the OCaml code from the question into xxx.ml (OCaml), and compiled that into an object file xxx (using "ocamlc -o xxx xxx.ml"); I copied the C code verbatim into yyy.c and created a variant zzz.c using fscanf() and fclose(), and compiled them using "gcc -O -o yyy yyy.c" and "gcc -O -o zzz zzz.c". I created a file 'file3' containing: "987654" plus a newline. I created a shell script runthem.sh as shown. Note that 'time' is a pesky command that thinks its output must go to stderr even when you'd rather it didn't - you have to work quite hard to get the output where you want it to go. (The range command generates numbers in the given range, inclusive - hence 11 values per program.)

Osiris JL: cat runthem.sh
for prog in "ocaml xxx.ml" ./xxx ./yyy ./zzz
do
    for iter in $(range 0 10)
    do
        r=$(sh -c "time $prog file3 >/dev/null" 2>&1)
        echo $prog: $r
    done
done
Osiris JL: 

I ran all this on a modern MacBook Pro (3 GHz Core 2 Duo etc, 4GB RAM) running Leopard (10.5.8). The timings I got are shown by:

Osiris JL: sh runthem.sh
ocaml xxx.ml: real 0m0.961s user 0m0.524s sys 0m0.432s
ocaml xxx.ml: real 0m0.953s user 0m0.516s sys 0m0.430s
ocaml xxx.ml: real 0m0.959s user 0m0.517s sys 0m0.431s
ocaml xxx.ml: real 0m0.951s user 0m0.517s sys 0m0.430s
ocaml xxx.ml: real 0m0.952s user 0m0.516s sys 0m0.431s
ocaml xxx.ml: real 0m0.952s user 0m0.514s sys 0m0.431s
ocaml xxx.ml: real 0m0.951s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.959s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.950s user 0m0.515s sys 0m0.431s
ocaml xxx.ml: real 0m0.956s user 0m0.516s sys 0m0.431s
ocaml xxx.ml: real 0m0.952s user 0m0.514s sys 0m0.432s
./xxx: real 0m0.928s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.938s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.927s user 0m0.494s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.492s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.493s sys 0m0.430s
./xxx: real 0m0.927s user 0m0.493s sys 0m0.430s
./xxx: real 0m0.928s user 0m0.492s sys 0m0.430s
./xxx: real 0m0.933s user 0m0.497s sys 0m0.428s
./xxx: real 0m0.926s user 0m0.494s sys 0m0.429s
./xxx: real 0m0.921s user 0m0.492s sys 0m0.428s
./xxx: real 0m0.925s user 0m0.494s sys 0m0.428s
./yyy: real 0m0.027s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.031s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.029s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.029s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.031s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./yyy: real 0m0.030s user 0m0.026s sys 0m0.002s
./yyy: real 0m0.028s user 0m0.026s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
./zzz: real 0m0.029s user 0m0.027s sys 0m0.001s
./zzz: real 0m0.030s user 0m0.027s sys 0m0.002s
Osiris JL:

I don't see the OCaml code running faster than the C code. I ran the tests with smaller numbers in the file that was read, and the results were similarly in favour of the C code:

Stop number: 345

ocaml xxx.ml: real 0m0.027s user 0m0.020s sys 0m0.005s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.005s
ocaml xxx.ml: real 0m0.025s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.003s
ocaml xxx.ml: real 0m0.022s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.019s user 0m0.015s sys 0m0.003s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.004s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
ocaml xxx.ml: real 0m0.020s user 0m0.015s sys 0m0.004s
ocaml xxx.ml: real 0m0.021s user 0m0.016s sys 0m0.004s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.002s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.005s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.002s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./xxx: real 0m0.003s user 0m0.001s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.003s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.001s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.002s user 0m0.000s sys 0m0.001s
./yyy: real 0m0.003s user 0m0.000s sys 0m0.002s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.001s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.003s user 0m0.000s sys 0m0.002s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s
./zzz: real 0m0.002s user 0m0.000s sys 0m0.001s

Stop number: 87654

ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.101s user 0m0.060s sys 0m0.040s
ocaml xxx.ml: real 0m0.103s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.101s user 0m0.059s sys 0m0.041s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.103s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.101s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.102s user 0m0.059s sys 0m0.040s
ocaml xxx.ml: real 0m0.105s user 0m0.059s sys 0m0.041s
./xxx: real 0m0.092s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.087s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.086s user 0m0.045s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.085s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.038s
./xxx: real 0m0.084s user 0m0.044s sys 0m0.039s
./xxx: real 0m0.083s user 0m0.044s sys 0m0.038s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.005s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.005s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.004s user 0m0.003s sys 0m0.001s
./yyy: real 0m0.006s user 0m0.003s sys 0m0.002s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.005s user 0m0.003s sys 0m0.002s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.004s user 0m0.003s sys 0m0.001s
./zzz: real 0m0.005s user 0m0.003s sys 0m0.001s

Obviously, YMMV - but it appears that OCaml is slower than C by a considerable margin, but that if the number in the given file is small enough, then start up and file reading dominate the process time.

The C timings, especially at the smaller numbers, are so fast that they are not all that reliable.

9
ответ дан 3 December 2019 в 05:34
поделиться

Time under 0.05 can be a simple noise. Repeat the main program enough times to actually get ~1s of execution time in C. (I mean repeating it in a loop in the program itself, not by running it again)

Did you compile your code with optimisations turned on? Did you try reducing the number of branches? (and comparisons)

if (i % 3 == 0) {
  if (i % 5 == 0) {
    printf("Hop\n");
    continue;
  }
  printf("Hoppity\n");
} else if (i % 5 == 0){
  printf("Hophop\n");
}

Did you try looking at the assembler output?

Also printf is pretty slow. Try puts("Hop") instead, since you don't use the formating anyways.

8
ответ дан 3 December 2019 в 05:34
поделиться

Мне было бы интересно узнать, сколько времени тратится на get_count ().

Я не уверен, насколько это важно, но вы читаете долгое время строка, что означает, что строка не может быть больше 20 байтов или 10 байтов (2 ^ 64 = какое-то десятичное число длиной 20 символов или 2 ^ 32 = какое-то десятичное число длиной 10 символов), поэтому вам не нужен цикл while в get_count. Кроме того, вы можете выделить file_text в стеке, а не вызывать calloc - но я думаю, вам все равно нужно обнулить его или иным способом найти длину и установить последний байт в null.

file_length = lseek(fileptr, 0, SEEK_END);
1
ответ дан 3 December 2019 в 05:34
поделиться

Any program which mostly involves opening a file and reading it is limited by the speed of opening a file and reading it. The C computations you are doing here will take between 1 millionth and one thousandth of the time of opening the file and reading it.

I thought this site was useful: http://norvig.com/21-days.html#answers

1
ответ дан 3 December 2019 в 05:34
поделиться

In a tiny program like this, it's often hard to guess why things work out the way they do. I think if I was doing it, I'd write the code like this (leaving out error checking for the moment):

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) { 
    static char buffer[20];   
    int limit, i;

    freopen(argv[1], "r", stdin);
 fgets(buffer, sizeof(buffer), stdin);
    limit = atoi(buffer);

    for (i=1; i<=limit; i++) {
        int div3=i%3==0;
        int div5=i%5==0;
        if (div3 && div5) 
            puts("Hop");
        else if (div3)
            puts("Hoppity");
        else if (div5)
            puts("HopHop");
    }
    return 0;
}

Using freopen avoids creating another file stream, and instead just connects standard input to the specified file. No guarantee that it's faster, but it's unlikely to be any slower anyway.

Likewise, a good compiler may note that i is constant throughout the loop body, and factor out the two remainder operations so it only does each once. Here I've done that by hand, which might not be any faster, but almost certainly won't be any slower.

Using puts instead of printf is fairly similar -- it might not be any faster, but it almost certainly won't be any slower. Using printf the way you have, it has to scan through the entire string looking for a '%', in case you're asking for a conversion, but since puts doesn't do any conversions, it doesn't have to do that.

With such a tiny program as this, there's another factor that could be considerably more significant: puts will usually be considerably smaller than printf. You haven't said how you're doing the timing, but if it includes the time to load the code, really small code may well make more difference than the execution time.

2
ответ дан 3 December 2019 в 05:34
поделиться
Другие вопросы по тегам:

Похожие вопросы: