Карта, кластеризирующая алгоритм

Question

Карта, кластеризирующая алгоритм

mirrors.kernel.org

20

php performance algorithm google-maps cluster-analysis

задан Chris B 29 December 2009 в 16:11

8 ответов

Похоже, что ускорение функции pixelDistance () может быть частью вашего решения, поскольку она выполняется внутри цикла. Это хорошее место, чтобы сначала поискать, но вы не включили этот код, поэтому я не могу быть уверен.

1

ответ дан 30 November 2019 в 01:16

Продолжая то, что сказал Джон, я думаю, вам следует попробовать встроить эту функцию. Вызов функций в PHP очень медленный, поэтому вы должны получить приличное ускорение.

4

ответ дан 30 November 2019 в 01:16

I can see two more possible improvements here:

Can you just loop through $markers with a for loop instead of popping them off the array? The array popping is completely unneeded - you should really only be using arrays as queues if you're adding and removing items to them at the same time (which you aren't; you're just processing them then throwing them away)
You should try calculating the count() of the arrays at the beginning, and then manually increase or decrease a $count variable. Recalculating the size of an array each loop is wasteful.

1

ответ дан 30 November 2019 в 01:16

The following are some ideas you can implement in case performance is of great issue:

Reduce the dimensionality of the data: you have 2d data of long/lat, perhaps you can try projecting it down to 1D using something like Multidimensional Scaling (MDS) which tries to reduce the number of dimensions while preserving the distances in the original space, thus the distance function will only have to deal with one feature instead of two. An alternative way is to use Principal component analysis (PCA)
Faster search: the step of computing the distance to each marker can be improved using KD-trees.

Both of these technique are applied in an offline setting, so they are usually computed once and then used many times..

1

ответ дан 30 November 2019 в 01:16

A simple optimization would be to take advantage that sqrt(x) < sqrt(y) is true iff x < y, so you can omit the sqrt in the pixelDistance and calculate $distance squared outside the loop. You can also calculate the 21 - $zoomLevel outside the loop, you'll have to multiply it by 2 if you're comparing the squared values. Another small optimization would be to save 2 multiplies by doing $x1-$x2 before scaling by 10000000. And for a tiny bit more, storing the delta in a var and multiplying it by itself is probably faster than the pow function. And for some more you can inline the pixeldistance function. This kind of optimization will only yield a constant factor speedup.

For a larger speedup you'll need some kind of acceleration datastructure. An easy one would be to bin markers into distance sized squares. Then you can run over the bins look for markers to cluster with in only the same bin and 3 others chosen depending on which side of the center of the bin the marker falls. This will result in linear time clustering that will beat any optimizations to the quadratic algorithm for larger resultsets.

1

ответ дан 30 November 2019 в 01:16

Если можете, отсортируйте маркеры по долготе на первоначальный поиск; тогда, как только маркер будет больше, чем ширина маркера от следующего маркера в отсортированном списке, вы определенно будете знать, что оставшиеся маркеры не будут перекрываться, поэтому вы можете разорвать цикл foreach и сэкономить массу времени на обработку. Я реализовал это на своем собственном сайте, и он работает очень эффективно.

У меня есть исходный код на Python здесь .

вы точно знаете, что оставшиеся маркеры не будут перекрываться, поэтому вы можете разорвать цикл foreach и сэкономить массу времени на обработку. Я реализовал это на своем собственном сайте, и он работает очень эффективно.

У меня есть исходный код на Python здесь .

вы точно знаете, что оставшиеся маркеры не будут перекрываться, поэтому вы можете разорвать цикл foreach и сэкономить массу времени на обработку. Я реализовал это на своем собственном сайте, и он работает очень эффективно.

У меня есть исходный код на Python здесь .

1

ответ дан 30 November 2019 в 01:16

Вот что я сделал - добавил в таблицу маркеров(точек) две дополнительные колонки, в которых меркатор преобразовывал значения для широты и долготы, используя следующие функции:

public static $offset = 268435456;
public static $radius = 85445659.44705395; /* $offset / pi(); */

function LonToX($lon)
{
    return round(self::$offset + self::$radius * $lon * pi() / 180);        
}

function LatToY($lat)
{
    return round(self::$offset - self::$radius * log((1 + sin($lat * pi() / 180)) / (1 - sin($lat * pi() / 180))) / 2);
}

Таким образом, я мог получить точно расположенные кластеры. Я до сих пор пытаюсь придумать, как избежать использования array_pop и каждый раз циклически проезжать через него. Пока это довольно эффективно с под-1000 маркерами. Результаты для маркеров +5K и +10K я выложу позже.

Избегая функции pixelDistance и имея ее в строке, я увеличиваю производительность почти наполовину!

.

2

ответ дан 30 November 2019 в 01:16

Другие вопросы по тегам:

php performance algorithm google-maps cluster-analysis

Карта, кластеризирующая алгоритм

8 ответов

Похожие вопросы: