Personalization of Content Ranking in the Context of Local Search

My paper entitled “Personalization of Content Ranking in the Context of Local Search” has been accepted for publication at the IEEE/WIC/ACM International Conference on Web Intelligence (WI2009) as a regular paper. The acceptance rate was low, at 16%. The conference will be co-located with the Intelligent Agent Technology Conference (IAT2009) in Milan, Italy. My colleague and co-author Philip O’Brien and I will be presenting the paper. The other co-authors are Xiao Luo, Weizheng Gao, and Shujie Li. The conference will be held on September 15-18, 2009, at Università degli Studi di Milano Bicocca.

If you’re planning to attend WI/IAT 2009 and would like to connect, contact me.

The abstract of the paper follows:

Ranking search results using a single ranking function for all search engine visitors is inherently bounded in the performance the ranking algorithm can achieve when considering the variety of requirements of Web searchers and the proliferation of topics and types of data modern search engines rank. Adding a geographical dimension to the mix by way of local search engines further reduces the average satisfaction a ranking algorithm can garner from local search users. Personalization has been proposed in Web search with some success but has not, to our knowledge, been investigated thoroughly in local search. As initial steps in local search personalization, we propose a model for personalizing search results in a local search engine using a hybrid of profile- and click-based user modeling methods. User profiles are used to compare local search results to the topical interests of users and the specific businesses in which they have shown interest by way of search result “clicks”. Our model is tested through a user study and is shown to result in significantly improved mean average precision over the baseline ranking system.
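For readers unfamiliar with the metric named at the end of the abstract, here is a minimal sketch of how mean average precision can be computed. It is only an illustration of the standard metric, not the evaluation code from the paper; the function names and the two toy rankings are made up for this example, and average precision is normalized by the number of relevant results that appear in the list.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision of one ranked list.

    relevance[i] says whether the result at rank i + 1 was relevant.
    Normalized by the number of relevant results found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank: relevant results so far / results so far.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the average precision over several queries."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy data (hypothetical): the same four results in two different orders.
baseline = [False, True, False, True]      # relevant results at ranks 2 and 4
personalized = [True, True, False, False]  # relevant results at ranks 1 and 2

print(average_precision(baseline))      # 0.5
print(average_precision(personalized))  # 1.0
```

Moving the relevant results toward the top of the list is exactly what raises the score, which is why the metric suits a comparison between a personalized ranking and a baseline.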

Although the experiments were conducted in the context of Local Search, the framework for modelling user interests is transferable to any domain where semantic similarity between users, or between users and objects, is of interest. If you have any questions or comments about our approach, I’d be happy to discuss it in person or electronically – just give me a shout!

I would like to thank the staff at GenieKnows.com for their assistance and feedback both during the experiments and while writing the paper. Special shouts go to Stephanie Armsworthy, Jason Hines, Jacek Wolkowicz, and Tapajyoti Das. Additional acknowledgements will appear in the paper.

Advice from Successful Ladies and Men

CNN Money has an excellent article in which they interview 22 people, including political leaders, company executives, investors, entrepreneurs, and even a chef, asking them to share the best advice they ever got. Below is a summary of their answers:

  • “Keep it Simple” – Tiger Woods
  • “Show, don’t tell” – Jim Sinegal
  • “Do what you love” – Mort Zuckerman
  • “Empower a subordinate” – Lloyd Blankfein
  • “Push beyond your comfort zone” – Mohamed El-Erian
  • “Ignore conventional wisdom” – David Axelrod
  • “Trust your instincts” – Tory Burch
  • “Read everything” – Jim Rogers
  • “Be effective, not popular” – Scott Boras
  • “Use failure to motivate yourself” – Mika Brzezinski
  • “Focus on performance, not power” – Colin Powell
  • “Take advice from smart people” – Shai Agassi
  • “Make an impression” – Sukhinder Singh Cassidy
  • “Hire a coach” – Eric Schmidt
  • “Set realistic goals” – Meredith Whitney
  • “Listen” – Lauren Zalaznick
  • “Don’t talk shop” – Julian Robertson
  • “Treat it like it’s yours” – Thomas Keller
  • “Underpromise and overdeliver” – Robin Li
  • “Don’t pursue titles and dollars” – Miles White
  • “Self-doubt is normal” – Aaron Patzer
  • “Be nice to people” – Niklas Savander

My favourite three from the list are “Take advice from smart people”, “Keep it Simple”, and “Show, don’t tell.” I have come to experience each of these three pieces of advice as true wisdom, highly effective in a variety of situations.

To read more about this advice, its context, and where it came from, see the original article: Best advice I ever got.

Halifax Tattoo Festival Shrinking Every Year

The Royal Nova Scotia Tattoo Festival is a yearly tradition in Halifax, NS, featuring performers and military bands from around the world. The first time I attended the festival was back in 2005 and I absolutely loved it. It was such a well-organized show, with so many performers: amazing bands, dancers, acrobats, and clowns. And the best part of it all was the finale, where all the bands would play in unison, creating a grand orchestra.

I loved the NS Tattoo so much that I went back the following year, in 2006. It was not as big or as good as the 2005 show, but it was still enjoyable. In all honesty, there was a little bit of disappointment that it was getting smaller, yet the 2006 performance made up for a big part of that. 2006 was a special year for the NS Tattoo: it was the year it was granted Royal status by Her Majesty the Queen on the occasion of her 80th birthday.

In 2007, the cast shrank even more. I thought I’d give it the benefit of the doubt and attended anyway. Maybe if it had been my first time attending I would’ve liked it, but seeing how it didn’t live up to the 2006 show, let alone the 2005 performance, I must say I was greatly disappointed. I felt cheated in many ways because the prices didn’t go down, but the quality did. I had hoped that achieving the Royal designation would help the festival grow, but it was not to be.

I had to skip 2008 for personal reasons – my son was under a year old and I was 2 months away from defending my PhD dissertation – not a good time for big fun events.

This year, in 2009, “performers from around the world” really refers to only nine countries: Canada, Belgium, Denmark, Estonia, France, Germany, Sweden, United Kingdom, and United States. There are a total of 30 groups confirmed, but 20 of them are from Canada. This is very different from previous years, when we had more international participation. I am especially disappointed that the Russian and Ukrainian dancers are not in this year’s show.

My plan for this year is to go see the parade and then decide whether to attend the big show or not.

One piece of advice: if you’re planning to attend, go for the cheap tickets. I’ve tried different seating sections and they were all equally great. There are typically plenty of empty seats, so moving around is possible if your seat turns out to be not so great.

Future of the Web: Mobile + Social + Linked

By now, it should be clear to everyone that the future of the Web is mobile devices, social networking, and linked data. This realization, however, did not come to me until I attended the World Wide Web Conference (WWW2009) in Madrid, Spain, at the end of April. Several keynote speakers, tutorials/workshops, and dedicated tracks and sessions emphasized this fact. I’ll say a few words on each of the three pillars:

Mobile Web

As Sir Tim Berners-Lee said in his keynote presentation: “more people will have their first encounter with the Web through a mobile device than a laptop.” The World Wide Web Consortium (W3C) has task forces and standards to promote and standardize the mobile Web. They’ve been doing it for years, but only now are people listening seriously, because it is already happening.

What this means to the Web community is that Web sites should have a mobile version that conforms to standards and guidelines and works well on mobile devices.

Social Networking

It is a given that social networks are taking over the Web. But that’s not what “the future of the Web is social networking” means. Social networking on the Web refers to enabling end users to interact with each other on your Web site. Without this interaction, your site will fail to deliver the value expected by the increasingly savvy Web population.

Linked Data

The Semantic Web has been dubbed Web 3.0. Many believe that widespread adoption of the Semantic Web is at least a decade away. Linked Data, on the other hand, is already a reality. Linked Data consists of the now-feasible subset of the Semantic Web. At the most basic level, it is RDF + URI: a common data representation format and a standard for addressing and linking data items. The academic community emphasizes the importance of Linked Open Data – making your proprietary databases accessible on the Web in the Linked Data format. This movement is new but strong. Many tools are being developed to facilitate the transition to Linked Data for the masses.
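To make “RDF + URI” a little more concrete, here is a small sketch in plain Python that models data as subject-predicate-object triples and prints them in the N-Triples syntax. The `example.org` URIs and the helper names (`URI`, `Literal`, `to_ntriples`) are assumptions made for this illustration; the FOAF vocabulary URIs are real. In practice you would probably use an RDF library rather than hand-rolled classes.

```python
from typing import Iterable, NamedTuple, Union

FOAF = "http://xmlns.com/foaf/0.1/"  # the FOAF vocabulary namespace


class URI(str):
    """A resource (or property) identified by a URI."""


class Literal(str):
    """A plain string value."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def term(value: Union[URI, Literal]) -> str:
    """Serialize one term: URIs in angle brackets, literals in quotes."""
    if isinstance(value, URI):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples: Iterable[Triple]) -> str:
    """One 'subject predicate object .' statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


# Hypothetical resources: any data item gets its own URI.
alice = URI("http://example.org/people/alice")
bob = URI("http://example.org/people/bob")

triples = [
    Triple(alice, URI(FOAF + "name"), Literal("Alice")),
    Triple(alice, URI(FOAF + "knows"), bob),  # a link to another data item
]

print(to_ntriples(triples))
```

The second triple is the “linked” part: its object is itself a URI, so anyone can follow it to find more statements about that item, possibly on another site.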

Summary

In short, as a Web site owner, if you want to survive on the Web in the years to come, you had better ensure that your site is accessible from mobile devices, that you facilitate social interaction among your users, and that your data is open, linked, and linkable.

Thanks to Christopher Gutteridge for this wonderful tool to search and browse the WWW2009 proceedings.

WWW2009 Reading List

There were several interesting papers published at the 18th World Wide Web Conference (WWW2009) in Madrid last month. Below is a list of papers that I’m planning to read:

This list will be updated as I remember/discover more interesting papers.

If you’d like to recommend other papers to be added to this list, contact me!

Two Papers Accepted at the WWW 2009 Developer Track

The World Wide Web Conference (WWW2009) will be held in Madrid, Spain, April 20-24, 2009. I will be co-presenting two papers at the Developer track. The first paper is co-authored with Jason Hines and entitled “Query GeoParser: A Spatial-Keyword Query Parser Using Regular Expressions”. The second paper is co-authored with Christopher Adams and entitled “Creating Your Own Web-Deployed Street Map Using Open Source Software and Free Data”. Both papers will be presented in the afternoon of Friday, April 24. The schedule, as well as a link to the proceedings, can be found on the Developer’s Track page. The paper abstracts follow.

Query GeoParser: A Spatial-Keyword Query Parser Using Regular Expressions

Abstract: There has been a growing commercial interest in local information within Geographic Information Retrieval, or GIR, systems. Local search engines enable the user to search for entities that contain both textual and spatial information, such as Web pages containing addresses or a business directory. Thus, queries to these systems may contain both spatial and textual components—spatial-keyword queries. Parsing the queries requires breaking the query into textual keywords, and identifying components of the geo-spatial description. For example, the query ‘Hotels near 1567 Argyle St, Halifax, NS’ could be parsed as having the keyword ‘Hotels’, the preposition ‘near’, the street number ‘1567’, the street name ‘Argyle’, the street suffix ‘St’, the city ‘Halifax’, and the province ‘NS’. Developing an accurate query parser is essential to providing relevant search results. Such a query parser can also be utilized in extracting geographic information from Web pages.

One approach to developing such a parser is to use regular expressions. Our Query GeoParser is a simple, but powerful, regular expression-based spatial-keyword query parser. Query GeoParser is implemented in Perl and utilizes many of Perl’s capabilities in optimizing regular expressions. By starting with regular expression building blocks for common entities such as numbers and streets, and combining them into larger regular expressions, we are able to handle over 400 different cases while keeping the code manageable and easy to maintain. We employ the mark-and-match technique to improve the parsing efficiency. First, we mark numbers, city names, and states. Then, we use matching to extract keywords and geographic entities. The advantages of our approach include manageability, performance, and easy exception handling. Drawbacks include a lack of geographic hierarchy and the inherent difficulty in dealing with misspellings. We comment on our overall experience using such a parser in a production environment, what we have learnt, and suggest possible ways to deal with the drawbacks.
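The building-block idea can be sketched in a few lines. Note that this is a toy illustration in Python, not the Query GeoParser itself (which is written in Perl and covers far more cases), and it does not implement the mark-and-match step. The block names, the short suffix and preposition lists, and the single address layout are all assumptions made for this example.

```python
import re
from typing import Dict, Optional

# Small named building blocks for common entities.
NUMBER = r"(?P<number>\d+)"
STREET = r"(?P<street>[A-Za-z]+(?: [A-Za-z]+)*?)"
SUFFIX = r"(?P<suffix>St|Ave|Rd|Blvd|Dr)\.?"
CITY = r"(?P<city>[A-Za-z]+(?: [A-Za-z]+)*)"
REGION = r"(?P<region>[A-Z]{2})"
PREPOSITION = r"(?P<preposition>near|in|at)"
KEYWORDS = r"(?P<keywords>.+?)"

# Combine the blocks into larger expressions.
ADDRESS = rf"{NUMBER} {STREET} {SUFFIX}, {CITY}, {REGION}"
QUERY = re.compile(rf"^{KEYWORDS} {PREPOSITION} {ADDRESS}$")


def parse_query(query: str) -> Optional[Dict[str, str]]:
    """Split a spatial-keyword query into its parts, or return None."""
    match = QUERY.match(query.strip())
    return match.groupdict() if match else None


print(parse_query("Hotels near 1567 Argyle St, Halifax, NS"))
```

For the query from the abstract, this yields the keyword ‘Hotels’, the preposition ‘near’, the street number ‘1567’, the street name ‘Argyle’, the suffix ‘St’, the city ‘Halifax’, and the region ‘NS’. Supporting a new address layout means adding another combination of the same blocks, which is what keeps this style of parser manageable.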

Creating Your Own Web-Deployed Street Map Using Open Source Software and Free Data

Abstract: Street maps are a key element to Local Search; they make the connection between the search results, and the geography. Adding a map to your website can be easily done, using an API from a popular local search provider. However, the lists of restrictions are lengthy and customization can be costly, or impossible. It is possible to create a fully customizable web-deployed street map without sponsoring the corporate leviathans, at only the cost of your time and your server. Being able to freely style and customize your map is essential; it will distinguish your website from websites with shrink wrapped maps that everyone has seen. Using open source software adds to the level of customizability – you will not have to wait two years for the next release and then maybe get the anticipated new feature or the bug fix; you can make the change yourself. Using free data rids you of contracts, costly transactions, and hefty startup fees. As an example, we walk through creating a street map for the United States of America.

A Web-deployed street map consists of a server and a client. The server stores the map data including any custom refinements. The client requests a portion of the map and the server renders that portion and returns it to the client, which in turn displays it to the user. The map data used in this example is the TIGER/Line data. TIGER/Line data covers the whole of the USA. Another source of free road network data is OpenStreetMap, which is not as complete as TIGER/Line but includes additional data such as points of interest and streets for other countries. Sometimes the original data is not formatted in a manner that contributes to a good-looking, concise map. In such cases, data refinement is desired. For instance, performance and aesthetics of a map can be improved by transforming the street center lines to street polygons. For this task, we use the Python language, which has many extensions that make map data refinement easy. The rendering application employed is MapServer. MapServer allows you to specify a configuration file for your map, which consists of layers referencing geographical information, as well as the style attributes to specify how the layers are visualized. MapServer contains utilities to speed up the rendering process, and organize similar data. On the front end, we need a web-page embeddable client that can process requests for map movements, and scale changes in real time. In our experience, OpenLayers is the best tool for this task; it supports many existing protocols for requesting map tiles and is fast, customizable, and user friendly. Thus, deploying a street map service on the Web is feasible for individuals and not limited to big corporations.
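As a rough idea of what “transforming the street center lines to street polygons” involves, here is a minimal pure-Python sketch that widens a single straight segment into a rectangle. It is not the refinement code from the paper: the function name and the planar coordinates are assumptions, and real street data consists of multi-vertex lines whose joints need extra care (a geometry library with a buffer operation, such as Shapely, would likely be a better fit there).

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def segment_to_polygon(start: Point, end: Point, width: float) -> List[Point]:
    """Return the four corners of a rectangle of the given width
    centred on the segment from start to end (planar coordinates)."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    # Vector perpendicular to the segment, scaled to half the street width.
    nx = -dy / length * width / 2
    ny = dx / length * width / 2
    return [
        (start[0] + nx, start[1] + ny),
        (end[0] + nx, end[1] + ny),
        (end[0] - nx, end[1] - ny),
        (start[0] - nx, start[1] - ny),
    ]


# A 10-unit street segment along the x axis, 2 units wide.
print(segment_to_polygon((0.0, 0.0), (10.0, 0.0), 2.0))
```

The corners come back in a consistent order around the rectangle, so they can be written out directly as a polygon ring for the renderer.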

First Earth Day Celebration

This year was the first time I participated in Earth Day. From 8:30 PM to 9:30 PM Atlantic Time, all the lights in my house were off. My family and I enjoyed a romantic dinner lit by numerous candles.

We had a few objectives in doing this. First, to participate in a growing international movement and reduce our power consumption during this special hour. Second, to remind ourselves of the importance of conserving energy and to pledge not to be wasteful. Third, to enjoy an evening by candlelight, something that we rarely do.

The New Face of GenieKnows.com

GenieKnows.com relaunched their site with a new image and new functionality – it is now the home of version 2.0 of their local search platform. This new release brings user-contributed reviews of local businesses along with user-contributed photos. Other notable features include a cleaner interface and improved functionality.

With this move, GenieKnows positions itself as a serious player in the local search market going head-to-head against the big boys as well as the newer smaller players.

WordPress Automatic Upgrade to 2.7.1 Fails on 1and1 Hosting

I’ve been trying to upgrade to the latest WordPress (version 2.7.1) for a while without success. Every time I initiate the upgrade, whether through WordPress’ built-in automatic upgrade feature or using the automatic upgrade plugin, the process stalls when it reaches the step of downloading the latest zip file. After checking the compatibility matrix, I discovered that my hosting provider, 1and1, runs PHP4 by default, which is incompatible with the upgrade script. This problem can be easily fixed by forcing PHP5, which can be achieved by adding the following line to your .htaccess file:

AddType x-mapp-php5 .php
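
If you would rather script the change than edit the file by hand, something like the following sketch could do it. The helper names are my own invention and the path you pass in is up to you; the only part that comes from the fix above is the directive itself. It adds the line once and leaves the file alone if the line is already there.

```python
from pathlib import Path

PHP5_DIRECTIVE = "AddType x-mapp-php5 .php"


def with_directive(htaccess_text: str, directive: str = PHP5_DIRECTIVE) -> str:
    """Return the .htaccess contents with the directive appended,
    unless a line with exactly that directive already exists."""
    lines = htaccess_text.splitlines()
    if any(line.strip() == directive for line in lines):
        return htaccess_text
    lines.append(directive)
    return "\n".join(lines) + "\n"


def update_htaccess(path: Path) -> None:
    """Apply with_directive to a file, creating it if it does not exist."""
    original = path.read_text() if path.exists() else ""
    updated = with_directive(original)
    if updated != original:
        path.write_text(updated)


print(with_directive("RewriteEngine On\n"), end="")
```

Back up your existing .htaccess before running anything like this against a live site.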