Your Mom's A Recommendation System

Thursday, April 10, 2008

WikiLens vs Netflix

WARNING: This is going to sound more like an op-ed piece than a scientific analysis

After a couple of weeks of playing with WikiLens I can't help but compare it to Netflix. Yes, it is very different from Netflix. It allows any user to create new categories. It is not confined to movies (though it does have a movie category, which appears to have the most data). Users can add their own entries without permission. The engine does NOT wait for your input before giving you a prediction value. And life is great at WikiLens. Wait a sec...

It isn't as fun. I enjoyed filling in those little stars on Netflix. I enjoyed seeing the results that came back as a result of my previous entry. Somehow it doesn't feel the same over at WikiLens. Maybe it's because the interface doesn't have nice little pictures of the movie posters... No. Maybe it's because I know I'm not going to get a red envelope in the mail... Nope.

The reason is the manner in which WikiLens presents the next "item" for rating/prediction. It feels more like I'm taking a test and am ready to bubble in the next answer. Netflix's interface feels more like the end of a magic trick. The prediction is magically plucked from the inside of a hat like a rabbit (no 'Harvey' jokes please).

Earlier I mentioned that the predictions were given before I rated anything. Perhaps this is where the magic is lost. To me, the user, it doesn't feel like my tedious work of adding smiley faces is affecting the results. I know that it is, deep down in my subconscious. It just doesn't recommend the feeling to me.

It still remains a wonderful tool. Now if I can just convince more cyclists in Charlotte to enter data for me.

Trust issues

After reading the article for today's class I thought about a project I worked on last semester in an HCI class that focused on security and privacy. The goal of the project was to determine whether the average UNCC student was more likely to reveal private information that could compromise their UNCC email account to a computer or to a random stranger. The survey was disguised as a Facebook survey because so many of the surveys that occur on campus focus on Facebook. The results showed that the students surveyed were equally likely to reveal information to a computer survey or to a student surveyor.

The link below details the project. The data doesn't appear to be there for some reason. I'll work on that.

The survey.

http://hci.sis.uncc.edu:8080/itis6010-fall07/26

Thursday, April 3, 2008

Interesting Slashdot post about Netflix Prize

This is a couple of weeks old, but I thought it should at least be posted here. It is about one of the leading competitors who emerged recently on the Netflix Prize scene. The post claims that he does not have a computing or math background. A little research shows that he does.

http://developers.slashdot.org/developers/08/03/04/2348257.shtml

New Version of UC Berkeley's Joke Recommender

This article describes UC Berkeley's joke recommender and the updates that were just released. New versions of two pieces of software are out: EigenTaste is now at version 5 and Jester is now at version 4. The system now has over 4 million ratings. The article is an elementary look at this system, which runs on a collaborative filtering algorithm. It does describe the features and some of the jokes. ;) Prof. Ken Goldberg notes that the engine can be applied to anything with large inventories. Donation Dashboard is one other application that he mentions. I seem to remember talking about this earlier in the semester. Is that right?

I have noticed a trend in systems I've read about recently. Taking into account the user's most recent input seems to be popular.

The article: http://www.networkworld.com/community/node/26480

The Slashdot article: http://idle.slashdot.org/idle/08/03/31/1633223.shtml

The updated software: http://eigentaste.berkeley.edu/info.php

WikiLens Charlotte Bicycle Shops

I have started a new category on WikiLens. It is "Charlotte Bicycle Shops." The goal is to enter all bicycle shops in the area and get users to rate them. Finding a shop that suits a cyclist's needs is not always easy. Different shops sell different brands, offer different services, keep different hours and attract different styles of riders. Shops do overlap in these areas. As a cyclist I am usually loyal to one shop. However, I do find the need to shop at multiple shops for various reasons. Often when I try a new shop I am comparing it to past ones and looking for the same "feel."

To start a new system on WikiLens you have to create a new category and new items (shops) for that category. Not the easiest data insertion process... but what is?

Currently I have a reasonable number of the shops in the area entered and am twisting the arms of my friends to rate the shops. It isn't producing recommendations yet. But I don't have much user data so far and don't want to create fake data for it.

Thursday, March 27, 2008

Transition from YouTube Journaling to WikiLens

It appears that YouTube is not a personalized recommendation system. Instead, YouTube only keeps track of Favorites so the user can go back and re-watch the videos. The ratings are not being used to recommend similar videos or videos enjoyed by users with similar interests. The ratings are only being used to rank how popular a video is. Popularity determines rank in search results. If two users with dissimilar interests who have both rated videos query the same thing, they will receive the same videos. Those videos will always link to the same videos regardless of declared preference. The linking is determined by the user who submitted the video and by videos with similar words in the title.
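To make the point concrete, here is a minimal sketch of what a popularity-only search looks like. This is my own toy illustration, not YouTube's actual code: the catalog, the titles and the `popularity_search` function are all made up. The thing to notice is that the function never takes a user as input, so two users with very different tastes get identical results.

```python
from statistics import mean

# Hypothetical toy catalog: title -> star ratings from all users combined.
CATALOG = {
    "Charlotte criterium highlights": [5, 4, 5],
    "Criterium crash compilation": [3, 4],
    "How to true a bicycle wheel": [5, 5, 4, 5],
}


def popularity_search(query: str, catalog: dict[str, list[int]]) -> list[str]:
    """Return titles containing the query, highest average rating first.

    There is no user parameter, so the ranking cannot be personalized.
    """
    matches = [title for title in catalog if query.lower() in title.lower()]
    return sorted(matches, key=lambda title: mean(catalog[title]), reverse=True)


# Any user searching for "criterium" sees the same ordering.
results = popularity_search("criterium", CATALOG)
```

A personalized recommender would need at least one more argument (the user's own ratings or profile) to reorder these results.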

Since this is the case, I will be switching the focus of my journal to WikiLens. It is a recommender engine that I covered in one of my earlier posts. I will begin establishing a more in-depth user profile and will also begin a new recommender topic using the engine. I think that it will most likely have something to do with cycling. At the moment I am testing the feasibility of routes, bikes, races, teams, shops and vehicles for cyclists.

Recommender system and method for generating implicit ratings based on user interactions with handheld devices

This patent describes a system that recommends items based on how a user interacts with a handheld device. Users are given a repetitive task to do on a mobile device. Their interactions are recorded in a history. This history is used to create recommendations. Unlike Amazon's content-based system, this system takes into account how recently an event or interaction occurred. More recent interactions are considered to be more relevant.

According to the claims, the calculations for the recommendations are made in the following three ways:

"rating(item)=number of interactions(item) since datetime(item acquired)/number of total interactions (item) since datetime(item acquired)."
"rating(item)=total interaction time(item)/size(item)"
rating(item)=[total interaction time(item)/size(item)]*exp(−damping coefficient*(date−time acquired)).

The first method takes into account the recency of the interaction. The second method takes into account the amount of time that the user spent on a particular interaction. The third method takes into account something that I don't understand. I can't figure out what the difference is between claim 2 and claim 3. They appear to be the same with the exception of the calculations.
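Here is a rough sketch of the three calculations in Python, mostly to see how they differ. This is my own reading of the formulas, not code from the patent. All function and parameter names are made up, I am assuming the denominator of the first formula counts interactions across all items, and I am assuming the third formula is the second one multiplied by an exponential decay on the time elapsed since the item was acquired.

```python
import math


def frequency_rating(item_interactions: int, total_interactions: int) -> float:
    """Formula 1 (as I read it): share of interactions since acquisition
    that involved this item."""
    if total_interactions <= 0:
        return 0.0  # no history yet, so no evidence of interest
    return item_interactions / total_interactions


def time_per_size_rating(total_interaction_time: float, size: float) -> float:
    """Formula 2: total time spent on the item, normalized by its size."""
    if size <= 0:
        raise ValueError("size must be positive")
    return total_interaction_time / size


def decayed_rating(
    total_interaction_time: float,
    size: float,
    damping_coefficient: float,
    elapsed_since_acquired: float,
) -> float:
    """Formula 3 (assumed reading): formula 2 scaled by exp(-damping * elapsed),
    so items acquired long ago count for less."""
    base = time_per_size_rating(total_interaction_time, size)
    return base * math.exp(-damping_coefficient * elapsed_since_acquired)


# Hypothetical numbers: 600 seconds spent on a 200-page e-book.
fresh = decayed_rating(600, 200, damping_coefficient=0.1, elapsed_since_acquired=0)
stale = decayed_rating(600, 200, damping_coefficient=0.1, elapsed_since_acquired=10)
```

Under this reading, a damping coefficient of zero makes the third formula identical to the second, and any positive coefficient shrinks the rating as the item gets older.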

Mobile devices that the patent covers in its data gathering include cell phones, mp3 players, PDAs, and electronic book readers.

***WARNING: this was probably written by a lawyer***

http://www.freepatentsonline.com/6947922.html

http://www.google.com/patents?id=QkEWAAAAEBAJ&dq=09/596070