History of recommender systems
A brief and superficial overview
In our daily lives we face decision-making situations countless times, often without even noticing. What should we wear in the morning that suits the day's program? Which menu to choose in the dining room? Which task should we tackle first? Which school should we enroll our child in? We answer thousands of such questions, crucial or mundane, over a lifetime.
In the past we often asked experts or friends to help with these decisions, but for some time now other options have been available as well. It is no longer only the librarian or the bookstore attendant who can help us select our next book; a book-selling website such as Amazon can do so too. The videos suggested by YouTube are all based on our previous browsing and offer audiovisual content that pleases us with a relatively high hit rate.
Compared to the conventional approach of relying on our friends' suggestions, the site mentioned above makes its proposals based on the world's largest video library. Since our friends' collective knowledge covers only a fraction of that, it has an unbeatable advantage. This way we can discover songs by bands we would most likely never have encountered otherwise. With this method it is not only our acquaintances who can offer us content; everyone in the world does so, unintentionally, through the recommender system.
However, it is important to state here that, as we will see later, recommender systems also have their limitations, so in certain areas they will probably (and hopefully) never know us as well as our friends and relatives do. With this in mind, the author of this post would like to avoid giving the impression that he encourages anyone to replace ordinary human relationships with recommender systems. We should rather regard them as an opportunity: they can help us decide and save time on certain less personal questions, or find entertainment we might never have discovered without them.
An alternative definition
Recommender system: an information-filtering system that supports the user in a given decision-making situation by narrowing the set of possible options and prioritizing its elements in a specific context. The prioritization can be based on the user's explicitly or implicitly expressed preferences, as well as on the previous behavior of users with similar preferences.
The history of recommender systems
Information has always been an inseparable part of nature. When winter arrives, when the birds migrate, where to find fish and how to catch them, which berries we can eat and which bring certain death: our ancestors passed on their experience even in ancient times. They were driven by the instinct to survive. They saw themselves in their offspring and thus wanted, mostly unconsciously, to achieve immortality. After all, they did not struggle to raise their offspring only to have them die after gorging on a handful of berries.
Information could turn into power only relatively recently, five millennia ago, with the invention of writing, which became a cornerstone of the modern world. Written words preserve an idea for a long time; they display the vibrations of the human soul in fixed form, creating a new kind of eternal life.
To understand the essence of information, we had to wait until the 20th century. Claude Shannon realized that the information content of a message depends solely on how much it differs from what is expected. It is therefore not closely related to the content or the length of the message; all that counts is whether it is surprising or not. A politician on the take carries no information nowadays, but if it turns out that one is managing the public wealth entrusted to him or her with proper care, that is almost a sensation. If yesterday's news were announced on TV today, it would have zero information content, except perhaps the knowledge that the news editor has gone mad.
Shannon also showed that information can be encoded as numbers, or more precisely in two-state systems, thus creating the basic unit of the new, information-based world order. In itself one bit is 0 or 1, yes or no, exists or not, but its power lies in the fact that with chains of bits we can encode anything. By digitizing things, mankind has made information not only lasting but storable in virtually endless quantities, and at the same time we are able to distribute it extremely fast.
In parallel with the spread of computers for civilian purposes, satisfying user expectations on an ever-growing scale moved into the focus of development companies and researchers. Behind the rapidly increasing popularity of these machines lie extremely serious efforts to reduce the "friction" between man and machine. Increasingly comfortable solutions were created as developers tried to understand people's needs and personalize the services provided by computers.
The foundations of recommender systems were laid by research into cognitive science and information retrieval, and their first manifestation was the Usenet communication system created at Duke University in the second half of the 1970s, where users could share textual content with each other. The content was categorized into newsgroups and subgroups for easier searching, but the system was not directly built on, nor did it target, the preferences of individual users.
The first known solution of this kind was the computer librarian Grundy, which first interviewed users about their preferences and then recommended books based on this information. From the collected answers the system assigned the user to a stereotype group using a rather primitive method, thus recommending the same books to everyone in the same group. For more on Grundy's results and its popularity among users, see Rich's 1979 article. This approach may seem a little outdated today, but at the time it was a paradigm shift in automated services, precisely because it was personalised. It is worth noting that even now not every webshop has reached this milestone.
However, Grundy's solution quickly attracted critics in the scientific world. Nisbett and Wilson argued that "people are very weak in the study and description of their own cognitive processes". According to their studies, people often emphasize those of their attributes that make them stand out from the rest of a particular group, which makes stereotyping efforts more difficult. And of course, it can also happen that we simply want to project a different image of ourselves.
As Heli Vainio, the manager of one of Northern Europe's largest shopping malls, put it somewhat harshly in an earlier interview: "People respond to questionnaires in a way that makes them look better. I don't care about lies. I'm interested in facts." That is why she equipped her shopping centre with Wi-Fi equipment that can track visitors within, and in the immediate vicinity of, the building with an accuracy of 2 meters. The goal is to let the visitors' actions speak instead of the visitors themselves.
Over time, two fundamentally different approaches to recommender systems have evolved: collaborative filtering and content-based filtering. The former attempts to map (profile) the taste of each user and offers content that users with similar preferences liked. Content-based filtering instead relies on knowing the dimensions of the entity to be recommended (a music recommender, for example, might consider style, artist, era, orchestration, and so on) and the user's preferences along these dimensions. Every time a user likes another song, this new information is added to their profile.
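To make the collaborative side concrete, here is a minimal sketch in Python (the users, items, and ratings are invented for illustration): it predicts a score for an item the user has not yet rated as a similarity-weighted average of other users' ratings.

```python
# A minimal sketch of user-based collaborative filtering.
# Ratings are per-user dictionaries; a missing item means "not rated".
import math

ratings = {
    "alice": {"item_a": 5, "item_b": 3, "item_c": 4},
    "bob":   {"item_a": 5, "item_b": 3, "item_c": 5},
    "carol": {"item_a": 1, "item_b": 5},
}

def similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def recommend(user, ratings):
    """Score items the user has not rated, weighted by user similarity."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    return {i: scores[i] / weights[i] for i in scores if weights[i] > 0}

print(recommend("carol", ratings))  # predicted score for item_c, about 4.5
```

Since the two users most similar to `carol` rated `item_c` with 4 and 5 and are equally similar to her, the prediction lands midway between their ratings.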
The first example of collaborative filtering, and also the origin of the term, was the Tapestry system developed at Xerox PARC, which allowed its users to take notes on and comment about the documents they were reading (initially in binary form: liking or disliking). Users could therefore narrow their searches not only by the content of the documents but also by the notes and reviews of other users; once enough users participated, the system could rank thematically related documents rather well by relevance and usefulness.
GroupLens, started in 1992, was already able to make automated recommendations for Usenet articles once the user had rated a few articles in the system. In the following years many thematic recommender sites were developed, such as Ringo developed at MIT (later the Firefly music recommendation pages) or BellCore's movie recommender system.
The first solution that tried to encompass not just narrower topics but nothing less than the Internet itself was Yahoo!, then under a different name. Its two Stanford student founders created a thematic website catalog of indexed pages, which quickly gained popularity and made searching easier for millions of Internet users; based on the Alexa rankings it is still the fifth most visited website.
The roots of content-based filtering lie in the field of information retrieval, from which many of its techniques were transferred. The first documented solution came from Emanuel Goldberg in the 1920s (not counting the Jacquard loom of 1801, the predecessor of the Hollerith punch card): a "statistical machine" that attempted to automatically find documents stored on film by searching for patterns.
In the 1960s, thanks to a team of researchers at Cornell University gathered around Salton, a model for the automatic indexing of texts was created over nearly a decade, which forms the basis of the text-mining processes we know today. The procedure is very simple: documents are characterized along certain predetermined criteria (dimensions), and these index values are collected into a vector. The more similar two documents are to each other, the smaller the angle between the vectors describing them.
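The vector space idea can be illustrated in a few lines of Python (the vocabulary and the two sample "documents" below are invented for the example): each document becomes a vector of term counts, and similarity is measured by the cosine of the angle between the vectors.

```python
# A toy illustration of the vector space model: documents as term-count
# vectors, compared by the cosine of the angle between them.
import math
from collections import Counter

def to_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vocabulary = ["music", "film", "history", "system"]
d1 = to_vector("music history and the history of music", vocabulary)
d2 = to_vector("a history of film music", vocabulary)
print(cosine(d1, d2))  # fairly small angle: the documents share two terms
```

Real systems weight the counts (e.g. with tf-idf) rather than using raw frequencies, but the geometric intuition is the same.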
The next milestone was the CITE online catalogue system developed in 1979 by Tamás Doszkocs for the National Library of Medicine, which not only allowed users to search for books by category but also sorted the results by relevance based on the search terms.
Content-based filtering gained its raison d'être relatively late, in the 1990s, as a branch of information retrieval. The main reason for the delay was that building a well-functioning content-based filtering system, even for a single subject, is a huge challenge: the task is nothing less than to "understand" the examined topic and the factors shaping the users' relationship to it.
One of the first and most successful research efforts on this subject was the Music Genome Project of 1999, which aimed to "understand" and capture music through its properties. To this end more than 450 such properties were identified and their relations described by an algorithm. The basis of the procedure is that when the user likes a song, positive values are assigned to its specific properties (such as style, era, artist, orchestration, beat, etc.). Songs with similar properties then move up the preference list and are brought to the user's attention. The huge advantage over collaborative filtering is that very little information is needed at startup, whereas the latter unfortunately requires many users and much feedback to identify people with similar tastes. The disadvantage is that it can hardly, if at all, recommend anything outside the user's existing musical horizon, since it does not build on similarities between users, only on "understanding" the properties of music as an entity. Pandora Internet Radio, with its 250 million users, is based on this project even today.
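In the spirit of the property-based scoring described above, a toy content-based recommender might look like this in Python (the songs, property names, and the simple unit-weight scheme are all invented for the example, not the Music Genome Project's actual algorithm):

```python
# A toy content-based recommender: liking a song raises the weight of
# its properties in the user's profile; unseen songs are scored by how
# much their properties overlap with that profile.
songs = {
    "song_a": {"style": "jazz", "era": "1950s", "beat": "swing"},
    "song_b": {"style": "jazz", "era": "1990s", "beat": "swing"},
    "song_c": {"style": "pop",  "era": "1990s", "beat": "straight"},
}

def update_profile(profile, song):
    """When the user likes a song, raise the weight of its properties."""
    for prop, value in song.items():
        profile[(prop, value)] = profile.get((prop, value), 0.0) + 1.0

def score(profile, song):
    """Sum the learned weights of the song's properties."""
    return sum(profile.get((p, v), 0.0) for p, v in song.items())

profile = {}
update_profile(profile, songs["song_a"])  # the user liked song_a

ranked = sorted((s for s in songs if s != "song_a"),
                key=lambda s: score(profile, songs[s]), reverse=True)
print(ranked)  # song_b first: it shares style and beat with song_a
```

Note how the sketch also exhibits the limitation mentioned above: a song sharing no properties with the profile scores zero and is never surfaced, no matter how many other users love it.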
The first solution to combine collaborative and content-based filtering was Fab, developed by Stanford students and presented in 1994. Its authors point out that the objective of their hybrid system was to eliminate the by-then well-known disadvantages of the two procedures. Their model consists of two basic processes: first, content is collected for specific topics (such as websites or articles about finance); then, for each individual user, those collected items are selected which are most likely to interest that particular user, and these finally reach the user.
Combining the two approaches can be done in several ways: one procedure can be embedded in the other, as Fab's example shows, or the outputs of the two procedures can be merged into a joint recommendation, as Netflix does. Netflix's algorithm, CineMatch, was the most successful recommender system for online movie distribution in the early 2000s. It was a serious catalyst for research in the field, which had only begun its independent existence in the 1990s and now started to develop rapidly. The challenge of the 2006 Netflix Prize was to create, from the 100 million film ratings made available by the company, a recommender algorithm at least 10% better than CineMatch. The 1 million dollar prize was awarded in 2009 for a solution that combined 107 different algorithms and mixed their recommendations depending on the circumstances. We also cannot omit the biggest example of online recommender systems today, amazon.com, which recommends products using a collaborative filtering technique, taking into account previously browsed and purchased products as well as what the user is currently viewing.
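The "joint recommendation" style of hybridization can be sketched as a simple weighted blend of two score lists (the items, scores, and weights below are invented for illustration; production systems like the Netflix Prize winner mixed over a hundred models with context-dependent weights):

```python
# A minimal sketch of a hybrid recommender: blend the scores of a
# collaborative and a content-based component with fixed weights.
def blend(collab_scores, content_scores, w_collab=0.6, w_content=0.4):
    """Weighted mix of two score dictionaries keyed by item id."""
    items = set(collab_scores) | set(content_scores)
    return {
        item: w_collab * collab_scores.get(item, 0.0)
              + w_content * content_scores.get(item, 0.0)
        for item in items
    }

collab = {"film_x": 4.5, "film_y": 2.0}          # from user similarities
content = {"film_y": 4.0, "film_z": 3.5}         # from item properties
mixed = blend(collab, content)
print(max(mixed, key=mixed.get))  # film_y wins: both components back it
```

An item recommended by only one component can still surface, but an item both components agree on is favored, which is exactly the failure-mode hedging the hybrid approach is after.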
This technique is now used by many webshops to improve their sales figures. The purpose of recommender systems in online retail is to customize the storefront to the customers' current taste and needs. This creates an almost unbeatable competitive advantage for online shops over traditional brick-and-mortar stores, whose only remaining forte is that the product can be touched or tried on. Nowadays it is increasingly common that after a fitting in the shop the product is ordered online, or bought untried and, if needed, sent back using the return option.
By now recommender systems are widespread and very popular among users drowning in the flood of information, even though many people know that these systems are only trying to sell them another product. The success of these solutions is indisputable, and they have irrevocably become part of our lives; just think of YouTube or Facebook.
E. Rich (1979): User modeling via stereotypes, Cognitive Science, Vol. 3, No. 4.
M. Sanderson – W. B. Croft (2012): The History of Information Retrieval Research, Proceedings of the IEEE, Vol. 100, pp. 1444–1451.
P. Resnick – N. Iacovou – M. Suchak – P. Bergstrom – J. Riedl (1994): GroupLens: An open architecture for collaborative filtering of netnews, In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW), pp. 175–186.
R. E. Nisbett – T. D. Wilson (1977): Telling more than we can know: Verbal reports on mental processes, Psychological Review, Vol. 84, No. 3, pp. 231–259.
D. Goldberg – B. Oki – D. Nichols – D. B. Terry (1992): Using Collaborative Filtering to Weave an Information Tapestry, Communications of the ACM, Vol. 35, No. 12, pp. 61–70.
M. Sanderson – W. B. Croft (2012): The History of Information Retrieval Research, Proceedings of the IEEE, Vol. 100, pp. 1444–1451.
G. Salton – A. Wong – C. S. Yang (1975): A vector space model for automatic indexing, Communications of the ACM, Vol. 18, No. 11, pp. 613–620.
M. H. Ferrara – M. P. LaMeau (2012): Pandora Radio/Music Genome Project. Innovation Masters: History's Best Examples of Business Transformation. Detroit, pp. 267–270. Gale Virtual Reference Library.
M. Balabanović – Y. Shoham (1997): Fab: Content-based, Collaborative Recommendation, Communications of the ACM, Vol. 40, No. 3, pp. 66–72.
Recommendation system – https://en.citizendium.org/wiki/Recommendation_system
J. B. Schafer – J. A. Konstan – J. Riedl (2001): E-Commerce recommendation applications, Data Mining and Knowledge Discovery, Vol. 5, No. 1, pp. 115–153.
This article was written by: Sándor Apáthy
Onespire Data Science and Analytics Services