Google Obtains Patent for Similarity Engine

January 09, 2007

Google Inc. recently obtained a patent directed to calculating similarity metrics for objects, including web pages. U.S. Patent No. 7,158,961, entitled “Methods and apparatus for estimating similarity,” issued on January 2, 2007, and includes 24 claims directed to systems and computer-implemented methods for generating a compact representation of objects. The method of claim 1 comprises identifying a set of features corresponding to a first object, generating a hashing vector having n coordinates for each feature, “summing the hashing vectors to obtain a summed vector,” and creating “an nx-bit representation of the summed vector by calculating an x-bit value for each coordinate of the summed vector, the nx-bit representation of the summed vector defining the compact representation of the first object.”

According to the patent, from the search engine's perspective, “one problem in cataloging the large number of available web pages is that multiple ones of the web documents are often identical or nearly identical,” and “[s]eparately cataloging similar documents is inefficient and can be frustrating for the user if, in response to a request, a list of nearly identical documents is returned.” The patent further states that “it is desirable for the search engine to identify documents that are similar or ‘roughly the same’ so that this type of redundancy in search results can be avoided,” and that “there is a need in the art for improved techniques for determining similarity between documents.”

According to the DailyTech news article (link below), several companies, including IBM and Hitachi, have also filed patent applications for “similarity-engines” over the past decade.
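For readers who think better in code, the steps recited in claim 1 can be roughly sketched in Python as follows. This is only an illustration under my own assumptions, not the patent's implementation and not a reading of the claim's scope: the function names, the use of SHA-256 to derive each feature's vector, the +1/-1 coordinate values, the choice of x = 1 (one sign bit per coordinate), and the Hamming-distance comparison are all hypothetical choices made for the example.

```python
import hashlib
from typing import Iterable


def feature_vector(feature: str, n: int) -> list[int]:
    """Hypothetical hashing vector: n coordinates, each +1 or -1.

    Assumption: coordinates are taken from the bits of a SHA-256 digest,
    which limits this sketch to n <= 256.
    """
    if not 1 <= n <= 256:
        raise ValueError("this sketch supports 1 <= n <= 256")
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    bits = int.from_bytes(digest, "big")
    return [1 if (bits >> i) & 1 else -1 for i in range(n)]


def compact_representation(features: Iterable[str], n: int = 64) -> int:
    """Sum the per-feature vectors, then keep x = 1 bit per coordinate.

    Assumption: the x-bit value of a coordinate is simply its sign
    (bit set when the summed coordinate is positive), giving an
    n-bit integer as the compact representation.
    """
    summed = [0] * n
    for feature in features:
        for i, value in enumerate(feature_vector(feature, n)):
            summed[i] += value

    fingerprint = 0
    for i, total in enumerate(summed):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")
```

In this sketch, the features of a web page might be its words, so that `compact_representation(page_text.split())` yields a 64-bit fingerprint, and two pages could then be compared with `hamming_distance`. Whether a small distance is a useful signal of near-duplication depends on the choice of features, n, and x, none of which this example attempts to tune.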

U.S. Patent No. 7,158,961: LINK
DailyTech News Article: LINK

Disclaimer

Copyright 2006-2010, Mark Reichel. The Daily Dose of IP is my personal website, and I am not providing any legal advice or financial analysis. Any views expressed herein should not be viewed as being the views of my employer, Ice Miller LLP. Any comments submitted to this blog will not be held in confidence and will not be considered as establishing an attorney-client relationship. Information submitted to this blog should be considered as being public information, and the submitter takes full responsibility for any consequences of any information submitted. No claims, promises, or guarantees are made or available regarding the completeness or accuracy of the information contained in this blog or otherwise available by searching from or linking away from this blog.
