Если Вы готовы пойти коммерческий выезд маршрута FishEye из Atlassian (также посмотрите их демонстрационный сайт ). FishEye также поддерживает добавление плагины (хотя это, кажется, очень хорошо не поддерживается в это время).
This could be as bad as O(n^2), but it works:
create table QuestionTags (questionid int, tag int);
select q1.questionid, q2.questionid, count(*) as commontags
from QuestionTags q1 join QuestionTags q2
where q1.tag = q2.tag and q1.questionid < q2.questionid
group by q1.questionid, q2.questionid order by commontags desc;
I didn't have time to optimize the WHERE IN() clause which is slow as death. I also didn't both to include indexes or specify the engine or collations but this should hopefully do the trick. Please point out any mistakes as I have not actually tested this sloppy code:
CREATE TABLE question (
id INT(11) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(255) NOT NULL,
question MEDIUMTEXT
);
CREATE TABLE question_rel_tag (
id INT(11) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
question_id INT(11) UNSIGNED NOT NULL,
tag_id INT(11) UNSIGNED NOT NULL
);
CREATE TABLE tag (
id INT(11) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(20) NOT NULL
);
SELECT question.id, question.title, question.question, count(tag.id) AS `count`
FROM question
INNER JOIN question_rel_tag ON (question.id = question_rel_tag.question_id)
INNER JOIN tag ON (question_rel_tag.tag_id = tag.id)
WHERE question.id != YOUR_QUESTION_ID_HERE
AND tag.id IN
(SELECT tag.id FROM question
INNER JOIN question_rel_tag ON (question.id = question_rel_tag.question_id)
INNER JOIN tag ON (question_rel_tag.tag_id = tag.id)
WHERE question.id = YOUR_QUESTION_ID_HERE)
GROUP BY tag.id
ORDER BY `count` DESC
Suppose I have the following:
A table of taggable entities:
Taggable
id, stuff
A table of tags:
Tag
id, tagName
A join table of what tags are associated with what Taggables
TagInfo
taggableId tagId
Then :
SELECT COUNT(Taggable_1.id) AS score, Taggable_1.id
FROM dbo.Taggable INNER JOIN
dbo.TagInfo ON dbo.Taggable.id = dbo.TagInfo.taggableId INNER JOIN
dbo.TagInfo TagInfo_1 ON dbo.TagInfo.tagId = TagInfo_1.tagId INNER JOIN
dbo.Taggable Taggable_1 ON TagInfo_1.taggableId = Taggable_1.id
WHERE (dbo.Taggable.id = 1) AND (Taggable_1.id <> 1)
GROUP BY Taggable_1.id
would join the tag-join table against itself (for the queried question id; 1 in the above sql) and count up the results for a score.
Assuming a table Questions
with a primary key column Id
, a table Tags
with a primary key column Id
, too, and a join table QuestionTags
with a composite primary key QuestionId
and TagId
referencing the primary keys of the two former tables, the following query will give the desired result (in SQL Server 2005).
SELECT q1.Id AS Id1, q2.Id AS Id2,
(SELECT COUNT(*) FROM QuestionTags qt1 INNER JOIN QuestionTags qt2
ON qt1.QuestionId = q1.Id AND qt2.QuestionId = q2.Id AND qt1.TagId = qt2.TagId) AS TagCount
FROM Questions q1 INNER JOIN Questions q2 ON q1.Id < q2.Id
ORDER BY TagCount DESC
This can be improved to the following.
SELECT qt1.QuestionId AS Id1, qt2.QuestionId AS Id2, COUNT(*) AS TagCount
FROM QuestionTags qt1 INNER JOIN QuestionTags qt2
ON qt1.QuestionId < qt2.QuestionId AND qt1.TagId = qt2.TagId
GROUP BY qt1.QuestionId, qt2.QuestionId
ORDER BY TagCount DESC
select *
, count(t.*) as matched_tag_count
from questions q
join tags_to_questions t
on t.question_id = q.id
and t.tag_id in (select tag_id
from tags_to_questions
where question_id = ?)
group
by q.id
order
by matched_tag_count desc