The start of September, that is week 10, brought something interesting: I was introduced to recommendation systems by Dr. Sarabjot Singh Anand, Co-Founder of Sabudh Foundation; Er. Niharika Arora, Data Scientist at Tatras Data; and Er. Gurmukh Singh, Trainee Data Scientist at Tatras Data.
Recommender systems are one of the most successful and widespread applications of machine learning technologies in business. You can apply recommender systems in scenarios where many users interact with many items. You can find large-scale recommender systems in retail, video on demand, or music streaming. In order to develop and maintain such systems, a company typically needs a group of expensive data scientists and engineers. That is why even large corporations such as the BBC have decided to outsource their recommendation services.
Machine learning algorithms in recommender systems are typically classified into two categories: content-based and collaborative filtering methods, although modern recommenders combine both approaches. Content-based methods rely on the similarity of item attributes, while collaborative methods calculate similarity from interactions. Below we discuss mostly collaborative methods, which enable users to discover new content dissimilar to items viewed in the past.
Collaborative methods work with the interaction matrix, which can also be called a rating matrix in the rare case when users provide explicit ratings of items. The task of machine learning is to learn a function that predicts the utility of items to each user. The matrix is typically huge, very sparse, and most of its values are missing.
The simplest algorithm computes the cosine or correlation similarity of rows (users) or columns (items) and recommends items that the k-nearest neighbors enjoyed.
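To make this concrete, here is a minimal sketch of user-based k-nearest-neighbour recommendation in NumPy; the toy interaction matrix and all names are made up for illustration:

    import numpy as np

    # Toy interaction matrix: rows are users, columns are items (1 = interacted).
    R = np.array([[1, 0, 1, 1, 0],
                  [1, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1],
                  [1, 0, 1, 1, 1]], dtype=float)

    def cosine_sim(M):
        # Pairwise cosine similarity between the rows of M.
        norms = np.linalg.norm(M, axis=1, keepdims=True)
        return (M / norms) @ (M / norms).T

    user_sim = cosine_sim(R)
    np.fill_diagonal(user_sim, 0)  # ignore self-similarity

    def recommend(user, k=2, n=2):
        # Score items by how often the k most similar users interacted with them.
        neighbors = np.argsort(user_sim[user])[-k:]
        scores = R[neighbors].sum(axis=0)
        scores[R[user] > 0] = -np.inf  # skip items the user already knows
        return np.argsort(scores)[::-1][:n]

    print(recommend(0))  # item indices recommended for user 0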
Matrix factorization based methods attempt to reduce dimensionality of the interaction matrix and approximate it by two or more small matrices with k latent components.
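As a sketch of the idea, a truncated SVD in NumPy approximates the same toy matrix from above with k latent components (illustrative only, not production code):

    import numpy as np

    k = 2  # number of latent components
    U, s, Vt = np.linalg.svd(R, full_matrices=False)  # R from the sketch above
    P = U[:, :k] * s[:k]   # user factors, shape (num_users, k)
    Q = Vt[:k, :]          # item factors, shape (k, num_items)

    R_hat = P @ Q          # low-rank approximation: predicted utility for every
    print(R_hat.round(2))  # user-item pair, including the missing ones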
Association rules can also be used for recommendation. Items that are frequently consumed together are connected with an edge in the graph. You can see clusters of best sellers (densely connected items that almost everybody interacted with) and small separated clusters of niche content.
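A minimal sketch of building such an item graph from co-consumption counts (the transactions here are invented):

    from itertools import combinations
    from collections import Counter

    # Each transaction is the set of items one user consumed together.
    transactions = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"},
                    {"A", "B", "C"}, {"D", "E"}]

    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            pair_counts[pair] += 1

    min_support = 2  # keep pairs consumed together at least twice
    edges = [(a, b, c) for (a, b), c in pair_counts.items() if c >= min_support]
    print(edges)  # these edges form the item graph described above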
EVALUATION OF RECOMMENDER SYSTEMS
More practical offline evaluation measures are recall and precision, which evaluate the percentage of correctly recommended items (out of relevant or recommended items, respectively). DCG also takes position into consideration, assuming that the relevance of items decreases logarithmically with rank.
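These measures are simple to compute for a single ranked list; a sketch assuming binary relevance (the example lists are hypothetical):

    import numpy as np

    def precision_recall_at_k(recommended, relevant, k):
        hits = len(set(recommended[:k]) & set(relevant))
        return hits / k, hits / len(relevant)

    def dcg_at_k(recommended, relevant, k):
        # Relevance discounted logarithmically by rank position.
        return sum(1.0 / np.log2(i + 2)
                   for i, item in enumerate(recommended[:k]) if item in relevant)

    recommended = [3, 1, 4, 0, 2]  # hypothetical ranked recommendations
    relevant = {1, 2}              # items the user actually liked
    print(precision_recall_at_k(recommended, relevant, 3))  # (0.33..., 0.5)
    print(dcg_at_k(recommended, relevant, 3))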
The problem we covered was the cold start problem: sometimes interactions are missing. Cold-start products or cold-start users do not have enough interactions for a reliable measurement of their interaction similarity, so collaborative filtering methods fail to generate recommendations.
In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovery of hidden semantic structures in a text body. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear in documents about cats, and "the" and "is" will appear equally in both.
I used Latent Dirichlet Allocation (LDA) to find abstract topics in a document collection; the documents I used were news articles. Others used Word2Vec, Doc2Vec, and GloVe.
LDA
LDA is an example of a topic model and is used to classify text in a document to a particular topic. It builds a topic-per-document model and a words-per-topic model, both modeled as Dirichlet distributions.
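A minimal sketch of LDA with scikit-learn (assuming a recent version); the three toy documents stand in for the real news articles:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = ["the dog chased the bone",
            "my dog loves a good bone",
            "the cat said meow to the cat"]

    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(docs)  # document-term count matrix

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(X)  # topic mixture for each document

    words = vec.get_feature_names_out()
    for t, comp in enumerate(lda.components_):
        print(t, [words[i] for i in comp.argsort()[-3:]])  # top words per topic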
Word2Vec
Word2vec is a two-layer neural net that processes text. Its input is a text corpus and its output is a set of vectors: feature vectors for words in that corpus. The purpose and usefulness of Word2vec is to group the vectors of similar words together in vector space. That is, it detects similarities mathematically. Word2vec creates vectors that are distributed numerical representations of word features, such as the context of individual words. It does so without human intervention.
Given enough data, usage and contexts, Word2vec can make highly accurate guesses about a word’s meaning based on past appearances.
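A minimal sketch with gensim (assuming the 4.x API); a real corpus would contain millions of sentences rather than four:

    from gensim.models import Word2Vec

    sentences = [["dog", "chases", "bone"],
                 ["dog", "loves", "bone"],
                 ["cat", "says", "meow"],
                 ["cat", "chases", "mouse"]]

    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)

    print(model.wv["dog"][:5])           # the learned feature vector for "dog"
    print(model.wv.most_similar("dog"))  # nearby words in vector space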
GloVe
GloVe, coined from Global Vectors, is a model for distributed word representation. The model is an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
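Pretrained GloVe vectors are distributed as plain text, one word per line followed by its components, so loading them needs no special library. A sketch (the file name is the standard pretrained download; adjust as needed):

    import numpy as np

    embeddings = {}
    with open("glove.6B.50d.txt", encoding="utf-8") as f:
        for line in f:
            word, *values = line.split()
            embeddings[word] = np.asarray(values, dtype=float)

    def most_similar(word, n=3):
        # Cosine similarity against every other word in the vocabulary.
        v = embeddings[word]
        sims = {w: v @ u / (np.linalg.norm(v) * np.linalg.norm(u))
                for w, u in embeddings.items() if w != word}
        return sorted(sims, key=sims.get, reverse=True)[:n]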
Linear Regression Numpy code

We finished coding data generation for the NumPy version of linear regression. We didn't use data from an Excel sheet or a Kaggle dataset, so we had to create our own data. For this, we created random data for our X and betas. Then we created noise, as real data always has noise, and used these to create the Y data (a sketch of fitting the betas follows after the Python course notes below). The code for the same was as follows:

    import numpy as np

    samplesize = 1000
    num_attrs = 3
    step = 0.1  # step size, kept for the gradient-descent stage

    # Random inputs plus a column of ones for the intercept term.
    x_inputs = np.random.rand(samplesize, num_attrs - 1)
    x0 = np.ones((samplesize, 1))
    x_data = np.concatenate((x0, x_inputs), axis=1)

    # True coefficients and Gaussian noise.
    betas = np.random.rand(num_attrs, 1)
    noise = np.random.randn(len(x_inputs), 1)
    y_true = x_data.dot(betas) + noise  # already shape (samplesize, 1)

Python course

We started a Udemy course on Python. The concepts we covered today were:
- Pros and cons of dynamic typing
- String indexing and slicing
- Various string methods
- String interpolation: ...
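Returning to the linear regression data: since we generated the true betas ourselves, we can check a fitting procedure against them. A minimal sketch of that next step using batch gradient descent with the step size defined above (my illustration, not necessarily how the assignment continued):

    # Recover the coefficients by batch gradient descent.
    beta_hat = np.zeros((num_attrs, 1))
    for _ in range(5000):
        grad = (2 / samplesize) * x_data.T.dot(x_data.dot(beta_hat) - y_true)
        beta_hat -= step * grad

    print(np.hstack([betas, beta_hat]))  # true vs. estimated coefficients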
This week we started trying our hands at Deep Learning assignment part 2, "Speech Recognition". In this challenge we take our knowledge of feedforward neural networks and apply it to a more useful task than recognizing handwritten digits: speech recognition. We were provided a dataset of audio recordings (utterances) and their phoneme state (subphoneme) labels. The data comes from articles published in the Wall Street Journal (WSJ) that are read aloud and labelled using the original text.

It is crucial for us to have a means to distinguish different sounds in speech that may or may not represent the same letter or combination of letters in the written alphabet. For example, the words "jet" and "ridge" both contain the same sound, and we refer to this elemental sound as the phoneme "JH". For this challenge we consider 46 phonemes in the English language.

Next we had a session with Danko sir and we did an open discussion with him on...
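Because a feedforward net classifies one frame at a time, a common preprocessing trick is to stack each frame with a window of neighbouring frames for context. A sketch of that idea in NumPy (the frame and feature counts are hypothetical; the assignment's exact setup may differ):

    import numpy as np

    def add_context(utterance, k=5):
        # Pad the utterance and stack each frame with k frames of left and
        # right context, giving a (2k + 1) * num_features input per frame.
        padded = np.pad(utterance, ((k, k), (0, 0)), mode="constant")
        return np.stack([padded[i:i + 2 * k + 1].ravel()
                         for i in range(len(utterance))])

    frames = np.random.randn(100, 40)  # one utterance: 100 frames x 40 features
    X = add_context(frames, k=5)
    print(X.shape)  # (100, 440): one training row per frame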
After completing my news article recommender last week, the upcoming week brought me the opportunity to explore t-SNE, a library technique for visualizing high-dimensional spaces. I was told by my mentor Mr. Vikram Jha to explore it and report my insights.

t-SNE

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. Before jumping to t-SNE I knew about an older dimensionality reduction technique, PCA (Principal Component Analysis), which I had first studied in the ISB videos; when Sarabjot sir explained it, my understanding became thorough.

PCA

Principal component analysis (PCA) is a technique used to emphasize variation and bring out strong patterns in a dataset. It's often used to make data easy to explore and visualize.
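A minimal sketch of the usual recipe with scikit-learn (the data here is random and stands in for real embeddings; PCA is often run first to reduce noise before t-SNE maps the result to 2-D):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE

    X = np.random.randn(200, 50)  # hypothetical high-dimensional data

    X_pca = PCA(n_components=20).fit_transform(X)
    X_2d = TSNE(n_components=2, perplexity=30,
                random_state=0).fit_transform(X_pca)
    print(X_2d.shape)  # (200, 2) coordinates, ready to scatter-plot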