Алгоритмы: Интересный diffing алгоритм

Question

Алгоритмы: Интересный diffing алгоритм

Прежде всего, я думаю, что ты не должен делать то, что ты делаешь! Но вот как вы можете это сделать (кстати, это предполагает, что вы добавляете только границы в вашу сетку):

if (myGrid.Children.Count > 5)
{
   (myGrid.Children[myGrid.Children.Count - 6] as Border).Visibility = Visibility.Hidden;
}

также я рекомендую удалить, чтобы не скрывать ребенка, так как в противном случае он останется в существовании без какой-либо точки.

myGrid.Children.Remove(myGrid.Children[0]);

8

algorithm list language-agnostic diff

задан Bill the Lizard 18 September 2012 в 14:19

8 ответов

Другие вопросы по тегам:

algorithm list language-agnostic diff

Похожие вопросы:

score 3 · Answer 1

Есть способ сделать это в O (n), если вы хотите сделать копию одного из списков в другой структуре данных. Это классический компромисс между временем и пространством.

Создайте хэш-карту списка R, где ключом является элемент, а значением является исходный индекс в массиве; в C ++ вы можете использовать unordered_map из tr1 или boost.

Сохраните индекс для необработанной части списка R, инициализированной для первого элемента.

Для каждого элемента в списке L проверьте карту хеша для совпадения в списке R. Если вы не нашли ни одного, выведите ( Значение L, NULL). Если есть совпадение, получите соответствующий индекс из хэш-карты. Для каждого необработанного элемента в списке R вплоть до соответствующего индекса выведите (NULL, значение R). Для совпадения выведите (значение, значение).

Когда вы дойдете до конца списка L, просмотрите оставшиеся элементы списка R и выведите (NULL, значение R).

Редактировать: Здесь такое решение в Python. Для тех, кто говорит, что это решение зависит от наличия хорошей функции хеширования - конечно, это так. Оригинальный плакат может добавить дополнительные ограничения к вопросу, если это проблема, но я буду занимать оптимистическую позицию до тех пор.

def FindMatches(listL, listR):
    result=[]
    lookupR={}
    for i in range(0, len(listR)):
        lookupR[listR[i]] = i
    unprocessedR = 0
    for left in listL:
        if left in lookupR:
            for right in listR[unprocessedR:lookupR[left]]:
                result.append((None,right))
            result.append((left,left))
            unprocessedR = lookupR[left] + 1
        else:
            result.append((left,None))
    for right in listR[unprocessedR:]:
        result.append((None,right))
    return result

>>> FindMatches(('d'),('a','b','c'))
[('d', None), (None, 'a'), (None, 'b'), (None, 'c')]
>>> FindMatches(('a','b','c','d','e'),('b','q','c','d','g','e'))
[('a', None), ('b', 'b'), (None, 'q'), ('c', 'c'), ('d', 'd'), (None, 'g'), ('e','e')]

score 1 · Answer 2

Diffing ordered lists can be done in linear time by traversing both lists and matching as you go. I will try to post some psuedo Java code in an update.

Since we don't know the ordering algorithm and can't determine any ordering based on less than or greater than operators, we must consider the lists unordered. Also, given how the results are to be formatted you are faced with scanning both lists (at least until you find a match and then you can bookmark and start from there again). It will still be O(n^2) performance, or yes more specifically O(nm).

score 2 · Answer 3

Наихудший случай, определенный и использующий только равенство, должен быть O (n * m). Рассмотрим следующие два списка:

A [] = {a, b, c, d, e, f, g}

B [] = {h, i, j, k, l, m, n}

Предположим, что существует ровно одно совпадение между этими двумя «упорядоченными» списками. Потребуется O (n * m) сравнений, поскольку не существует сравнения, которое устраняет необходимость в других сравнениях позже.

Итак, любой алгоритм, который вы придумаете, будет O (n * m) или хуже ,

score 0 · Answer 4

This is a pretty simple problem since you already have an ordered list.

//this is very rough pseudocode
stack aList;
stack bList;
List resultList;
char aVal;
char bVal;

while(aList.Count > 0 || bList.Count > 0)
{
  aVal = aList.Peek; //grab the top item in A
  bVal = bList.Peek; //grab the top item in B

  if(aVal < bVal || bVal == null)
  {
     resultList.Add(new Tuple(aList.Pop(), null)));
  }
  if(bVal < aVal || aVal == null)
  {
     resultList.Add(new Tuple(null, bList.Pop()));
  }
  else //equal
  {
     resultList.Add(new Tuple(aList.Pop(), bList.Pop()));
  }
}

Note... this code WILL NOT compile. It is just meant as a guide.

EDIT Based on the OP comments

If the ordering algorithm is not exposed, then the lists must be considered unordered. If the lists are unordered, then the algorithm has a time complexity of O(n^2), specifically O(nm) where n and m are the number of items in each list.

EDIT Algorithm to solve this

L(a,b,c,d,e) R(b,q,c,d,g,e)

//pseudo code... will not compile
//Note, this modifies aList and bList, so make copies.
List aList;
List bList;
List resultList;
var aVal;
var bVal;

while(aList.Count > 0)
{
   aVal = aList.Pop();
   for(int bIndex = 0; bIndex < bList.Count; bIndex++)
   {
      bVal = bList.Peek();
      if(aVal.RelevantlyEquivalentTo(bVal)
      {
         //The bList items that come BEFORE the match, are definetly not in aList
         for(int tempIndex = 0; tempIndex < bIndex; tempIndex++)
         {
             resultList.Add(new Tuple(null, bList.Pop()));
         }
         //This 'popped' item is the same as bVal right now
         resultList.Add(new Tuple(aVal, bList.Pop()));

         //Set aVal to null so it doesn't get added to resultList again
         aVal = null;

         //Break because it's guaranteed not to be in the rest of the list
         break;
      }
   }
   //No Matches
   if(aVal != null)
   {
      resultList.Add(new Tuple(aVal, null));
   }
}
//aList is now empty, and all the items left in bList need to be added to result set
while(bList.Count > 0)
{
   resultList.Add(new Tuple(null, bList.Pop()));
}

The result set will be

L(a,b,c,d,e) R(b,q,c,d,g,e)

Result ((a,null),(b,b),(null,q),(c,c),(d,d),(null,g),(e,e))

score 0 · Answer 5

Я не думаю, что у вас достаточно информации. Все, что вы утверждали, это то, что элементы, которые совпадают, совпадают в том же порядке, но поиск первой совпадающей пары является операцией O (nm), если у вас нет другого порядка, который вы можете определить.

score -1 · Answer 6

ВЫБЕРИТЕ отдельный l.element, r.element
FROM LeftList l
OUTER JOIN RightList r
ON l.element = r.element
ORDER BY l.id, r.id

Assumes the ID of each element is its ordering. And of course, that your lists are contained in a Relational Database :)

score 0 · Answer 7

Нет реального осязаемого ответа, только смутная интуиция. Поскольку вы не знаете алгоритм упорядочения, только то, что данные упорядочены в каждом списке, это отдаленно похоже на алгоритмы, используемые для "сравнения" файлов (например, в Beyond Compare) и сопоставления последовательностей строк вместе. Или также отдаленно похожий на алгоритмы регулярного выражения.

~~Также может быть несколько решений.~~ (неважно, если нет строго упорядоченных повторяющихся элементов. Я слишком много думал о сравнении файлов)

score 1 · Answer 8

Это похоже на выравнивание последовательностей, вы можете использовать алгоритм Нидлмана-Вунша для ее решения. Ссылка включает код на Python. Просто убедитесь, что вы установили оценку так, чтобы несоответствие было отрицательным, совпадение было положительным, а выравнивание с пробелом было 0 при максимизации. Алгоритм работает во времени и пространстве O (n * m), но его пространственную сложность можно улучшить.

Функция оценки

int score(char x, char y){
    if ((x == ' ') || (y == ' ')){
        return 0;
    }
    else if (x != y){
        return -1;
    }
    else if (x == y){
        return 1;
    }
    else{
        puts("Error!");
        exit(2);
    }
}

Код

#include <stdio.h>
#include <stdbool.h>

int max(int a, int b, int c){
    bool ab, ac, bc;
    ab = (a > b);
    ac = (a > c);
    bc = (b > c);
    if (ab && ac){
        return a;
    }
    if (!ab && bc){
        return b;
    }
    if (!ac && !bc){
        return c;
    }
}

int score(char x, char y){
    if ((x == ' ') || (y == ' ')){
        return 0;
    }
    else if (x != y){
        return -1;
    }
    else if (x == y){
        return 1;
    }
    else{
        puts("Error!");
        exit(2);
    }
}


void print_table(int **table, char str1[], char str2[]){
    unsigned int i, j, len1, len2;
    len1 = strlen(str1) + 1;
    len2 = strlen(str2) + 1;
    for (j = 0; j < len2; j++){
        if (j != 0){
            printf("%3c", str2[j - 1]);
        }
        else{
            printf("%3c%3c", ' ', ' ');
        }
    }
    putchar('\n');
    for (i = 0; i < len1; i++){
        if (i != 0){
            printf("%3c", str1[i - 1]);
        }
        else{
            printf("%3c", ' ');
        }
        for (j = 0; j < len2; j++){
            printf("%3d", table[i][j]);
        }
        putchar('\n');
    }
}

int **optimal_global_alignment_table(char str1[], char str2[]){
    unsigned int len1, len2, i, j;
    int **table;
    len1 = strlen(str1) + 1;
    len2 = strlen(str2) + 1;
    table = malloc(sizeof(int*) * len1);
    for (i = 0; i < len1; i++){
        table[i] = calloc(len2, sizeof(int));
    }
    for (i = 0; i < len1; i++){
        table[i][0] += i * score(str1[i], ' ');
    }
    for (j = 0; j < len1; j++){
        table[0][j] += j * score(str1[j], ' ');
    }
    for (i = 1; i < len1; i++){
        for (j = 1; j < len2; j++){
            table[i][j] = max(
                table[i - 1][j - 1] + score(str1[i - 1], str2[j - 1]),
                table[i - 1][j] + score(str1[i - 1], ' '),
                table[i][j - 1] + score(' ', str2[j - 1])
            );
        }
    }
    return table;
}

void prefix_char(char ch, char str[]){
    int i;
    for (i = strlen(str); i >= 0; i--){
        str[i+1] = str[i];
    }   
    str[0] = ch;
}

void optimal_global_alignment(int **table, char str1[], char str2[]){
    unsigned int i, j;
    char *align1, *align2;
    i = strlen(str1);
    j = strlen(str2);
    align1 = malloc(sizeof(char) * (i * j));
    align2 = malloc(sizeof(char) * (i * j));
    align1[0] = align2[0] = '\0';
    while((i > 0) && (j > 0)){
        if (table[i][j] == (table[i - 1][j - 1] + score(str1[i - 1], str2[j - 1]))){
            prefix_char(str1[i - 1], align1);
            prefix_char(str2[j - 1], align2);
            i--;
            j--;
        }
        else if (table[i][j] == (table[i - 1][j] + score(str1[i-1], ' '))){
            prefix_char(str1[i - 1], align1);
            prefix_char('_', align2);
            i--;
        }
        else if (table[i][j] == (table[i][j - 1] + score(' ', str2[j - 1]))){
            prefix_char('_', align1);
            prefix_char(str2[j - 1], align2);
            j--;
        }
    }
    while (i > 0){
        prefix_char(str1[i - 1], align1);
        prefix_char('_', align2);
        i--;
    }
    while(j > 0){
        prefix_char('_', align1);
        prefix_char(str2[j - 1], align2);
        j--;
    }
    puts(align1);
    puts(align2);
}

int main(int argc, char * argv[]){
    int **table;
    if (argc == 3){
        table = optimal_global_alignment_table(argv[1], argv[2]);
        print_table(table, argv[1], argv[2]);
        optimal_global_alignment(table, argv[1], argv[2]);
    }
    else{
        puts("Reqires to string arguments!");
    }
    return 0;
}

Пример ввода-вывода

$ cc dynamic_programming.c && ./a.out aab bba
__aab
bb_a_
$ cc dynamic_programming.c && ./a.out d abc
___d
abc_
$ cc dynamic_programming.c && ./a.out abcde bqcdge
ab_cd_e
_bqcdge