2007年8月18日 星期六

Range Search的演算法

原本找到kd-tree覺得夠威了,沒想到還有很多有趣的相關研究。像是”Data Structures for Range Searching”鞭闢入理地分析了6種方法,針對hyperrectangle range query (或稱orthogonal range search)的time complexity、建data structure的time complexity和所需的space complexity,簡潔易懂。結論是一般來說kd-tree最適用,cell(切成格狀)在特殊條件下最適用。

摘錄自1.8 Comparison of Methods:

Both the cell and k-d tree structures are appropriate in situations where the query restricts several of the attributes. If the approximate size and shape of the queries are roughly constant and known in advance, then cells defined by a fixed grid with size and shape similar to those of the expected queries is most advantageous. For queries with sizes and shapes that differ considerably from the design,

For most applications of range searching that are not characterized in the preceding paragraphs, k-d trees are likely to be the method of choice.

但這篇沒提到R-tree,也許那時R-tree還沒發明,得再評估一下R-tree的可能性。

關於演算法的早期研究就是這麼美麗、完備,但現在也沒什麼發展空間,只能拿來用就是了,cite ”Data Structures for Range Searching”的兩篇range query延伸研究分別是“On the Average Performance of Orthogonal Range Search in Multidimensional Data Structures”和”Improving the performance of multidimensional search using fingers”,前者看來像理論分析,後者則是實作分析。

原本感到奇怪,為什麼這些研究都是用orthogonal range search,而不是圓形(球形)的range query,後來才想通,圓形的range query可以先用正方形的range query取出資料,再用圓形的條件去掉多餘的資料即可,效率分析差異不大。

沒有留言:

張貼留言