Generative Models for Recommender Systems

Welcome to the fascinating world of generative models in recommender systems, where the power of artificial intelligence meets the art of personalization! Generative models are an innovative approach that goes beyond traditional algorithms to craft personalized suggestions, much like an artist painting a portrait of your interests.

Imagine a recommender system that not only understands your preferences but has the creative power to generate entirely new possibilities tailored just for you. In this domain, data is not just analysed; it becomes a canvas on which these models paint a rich tapestry of novel and diverse recommendations. Let us begin this exploration of generative models, where the boundaries of personalized recommendation are redefined.

What is a Generative Model for a Recommendation System?

Generative models for recommender systems are a class of machine learning models that aim to understand the underlying patterns in user-item interaction data and generate personalized recommendations. This means that generative models not only predict user preferences for existing items but also have the ability to generate entirely new item recommendations that are tailored to each user’s unique preferences.

The fundamental idea behind generative models is to learn the patterns and relationships within the user-item interaction data in a way that allows the model to create new, realistic examples.

  • Personalization: Generative models can create highly personalized recommendations, catering to individual tastes and preferences, as they capture the user’s unique interaction patterns.
  • Novelty: Unlike traditional models that often rely on popular items, generative models can suggest novel and diverse items, enhancing user discovery.
  • Adaptability: Generative models can adapt to changing user preferences over time, ensuring the recommendations remain relevant and up-to-date.
  • Data Efficiency: Generative models can handle sparse data more effectively and make better use of limited user-item interactions.

Let us check some search-trend facts about generative models:

[Figure: Google Trends, search interest for "Generative Model" in India, past 30 days]

This is the search trend for "Generative Model" in India over the past 30 days, with an average daily interest score above 75 (on Google Trends’ 0-100 scale). This shows how quickly the world is picking up the concept of generative models.

Let us check the Indian states where search interest in this concept is highest.

[Figure: Google Trends, search interest by Indian state]

As we can see, Karnataka and Telangana are the leading states.

[Figure: Google Trends, search interest by country, past 30 days]

These are the countries where this topic was searched the most in the last 30 days, with China and Singapore leading.

We have three popular generative models: the Variational Autoencoder (VAE), the Generative Adversarial Network (GAN), and the autoregressive model. All three can grasp the intricate relationships between users and items. They create a multidimensional representation of preferences, allowing for a better understanding of individual tastes.

Let us first discuss VAE and GAN in detail; the autoregressive model follows later in the blog.

Variational Autoencoder (VAE):

A VAE is a generative deep learning model that can detect anomalies, remove noise, and generate new content. It learns a probabilistic mapping of input data into a lower-dimensional latent space and, from that latent space, generates new data that resembles the original input.

The basic structure of a Variational Autoencoder consists of two main components:

1.Encoder: The encoder takes the input data (e.g., user-item interactions in the case of recommender systems) and maps it into a lower-dimensional latent space. The output of the encoder is not a deterministic encoding, but rather a probability distribution over the latent variables.

Now a question arises: are PCA and an encoder the same thing?

No: PCA learns linear relationships, while an encoder can learn non-linear relationships.

PCA captures data with a linear structure, while an encoder can also capture data with a more complex, non-linear structure. If the encoder is restricted to linear activation functions, it learns the same subspace as PCA.
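To make this concrete, here is a small illustrative sketch (our own example, not part of the original post; the synthetic data, layer sizes, and epoch count are arbitrary) showing that an autoencoder with linear activations reaches roughly the same reconstruction error as PCA with the same number of components:

import numpy as np
from sklearn.decomposition import PCA
from keras.layers import Input, Dense
from keras.models import Model

# Synthetic data (arbitrary, for illustration only), centered as PCA expects
X = np.random.rand(500, 20).astype('float32')
X -= X.mean(axis=0)

# PCA with 2 components
pca = PCA(n_components=2).fit(X)
pca_error = np.mean((X - pca.inverse_transform(pca.transform(X))) ** 2)

# Autoencoder with linear activations and a 2-unit bottleneck
inp = Input(shape=(20,))
code = Dense(2, activation='linear')(inp)    # linear encoder
out = Dense(20, activation='linear')(code)   # linear decoder
ae = Model(inp, out)
ae.compile(optimizer='adam', loss='mse')
ae.fit(X, X, epochs=200, batch_size=32, verbose=0)
ae_error = ae.evaluate(X, X, verbose=0)

print(pca_error, ae_error)  # the two reconstruction errors should be close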

2.Decoder: The decoder takes a sample from the latent space (i.e., the latent variables) and reconstructs the original input data from that sample. Like the encoder, the decoder produces a probability distribution for the output data.

Let us now see what a VAE for recommendation systems is:

Here is a high-level overview of how a VAE-based recommender system works:

1.Data Representation:

Convert user-item interactions into a matrix format, where rows represent users, columns represent items, and the entries contain user-item interactions (e.g., ratings, binary indicators, or other relevant metrics).

2.VAE Architecture:

  • Encoder (Q): The encoder network takes user-item interactions (or their embeddings) as input and maps them to a lower-dimensional latent space, where each user and item are represented as continuous vectors.
  • Decoder (P): The decoder network takes a point in the latent space (user and item embeddings) and reconstructs the user-item interactions from it, aiming to reconstruct the original data accurately.

3.VAE Training:

  • The VAE is trained to minimize the reconstruction loss, which measures how well the decoder can reconstruct the original user-item interactions from the latent space.
  • Additionally, VAEs incorporate a regularization term (the KL divergence) to encourage the latent space to follow a prior distribution (usually a Gaussian), which keeps the latent space smooth enough to sample new points from.
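Putting these two terms together, the quantity minimized per user is the standard negative ELBO (evidence lower bound):

\mathcal{L}(x) = -\,\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big] + D_{\mathrm{KL}}\big(q(z \mid x) \,\big\|\, \mathcal{N}(0, I)\big)

where x is a user’s interaction vector, the first term is the reconstruction loss, and the second term is the KL regularizer described above.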

4.Recommendation Generation:

  • After training, the VAE can be used to generate recommendations for users: a user’s interaction vector is encoded into the latent space and decoded back, and the reconstructed scores for items the user has not yet interacted with serve as recommendation scores.

Let us check one example code of a Variational Autoencoder. We use the MovieLens dataset to develop the recommendation model.

First, we install and import the required packages.

!pip install tensorflow --upgrade

import numpy as np
import pandas as pd
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import losses  # 'objectives' was renamed to 'losses' in Keras 2
from sklearn.model_selection import train_test_split

Load data and do preprocessing.

data = pd.read_csv('/content/ratings.csv')

# Preprocess the data to create the user-item interactions matrix
user_item_matrix = data.pivot(index='userId', columns='movieId', values='rating')
user_item_matrix = user_item_matrix.fillna(0)

# Convert the user-item interactions matrix to a NumPy array
user_item_matrix_array = user_item_matrix.to_numpy()

# Splitting the data into train and test
train_data, test_data = train_test_split(user_item_matrix_array, test_size=0.2, random_state=42)

Next we define the parameters, build the encoder and the decoder, and combine the two to build the model.

# Define VAE parameters
original_dim = user_item_matrix_array.shape[1]  # Number of items (features)
latent_dim = 32                                 # Number of latent dimensions
batch_size = 64
epochs = 50

# VAE encoder architecture
x = Input(shape=(original_dim,))
h = Dense(64, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

# Reparameterization trick
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim), mean=0., stddev=1.0)
    # std = exp(0.5 * log_var), since z_log_var is the log of the variance
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

z = Lambda(sampling)([z_mean, z_log_var])

# VAE decoder architecture
decoder_h = Dense(64, activation='relu')
decoder_mean = Dense(original_dim, activation='sigmoid')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)

# Define VAE custom loss function (binary cross-entropy + KL divergence).
# Note: on recent TensorFlow versions, a loss that references external
# tensors such as z_mean may need to be attached with model.add_loss instead.
def vae_loss(x, x_decoded_mean):
    # Reconstruction term, scaled by the number of items so it is not dwarfed by the KL term
    xent_loss = original_dim * losses.binary_crossentropy(x, x_decoded_mean)
    # KL divergence between the learned latent distribution and a standard Gaussian prior
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return xent_loss + kl_loss

# Compile VAE model
vae = Model(x, x_decoded_mean)
vae.compile(optimizer='adam', loss=vae_loss)

Training the model.

# Train the VAE model
vae.fit(train_data, train_data,
        shuffle=True,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(test_data, test_data))

# Use the trained VAE model to generate recommendations for a user
def generate_recommendations(user_data, vae_model):
    return vae_model.predict(user_data)

# Example usage
user_id = 0
user_data = np.expand_dims(user_item_matrix_array[user_id], axis=0)
recommendations = generate_recommendations(user_data, vae)

# The 'recommendations' array will contain predicted interaction probabilities for each item.
# We can then use this information to recommend the top items to the user.
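To turn these scores into an actual top-N list, we can mask out the items the user has already rated and sort the rest. This is a minimal sketch (the cutoff of 10 and the masking step are our own choices, not part of the code above):

# Rank unseen items by predicted score
scores = recommendations[0].copy()
seen = user_item_matrix_array[user_id] > 0
scores[seen] = -np.inf                       # hide items the user already rated
top_indices = np.argsort(scores)[::-1][:10]  # indices of the 10 best unseen items
top_movie_ids = user_item_matrix.columns[top_indices]
print('Top recommendations:', list(top_movie_ids))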

Generative Adversarial Network (GAN):

Let us first understand some basic things about GANs.

GANs are a powerful class of generative models that learn to generate new data that resembles a given dataset.

A GAN contains the following components:

1.Generator (G): The generator takes random noise as input and transforms it into data samples that resemble the training data.

2.Discriminator (D): The discriminator acts as a binary classifier. It takes data samples as input and tries to distinguish between real data from the training set and fake data generated by the generator.

3.Training Process: GAN training alternates between updating the generator and the discriminator. In each iteration, the generator produces fake data, and the discriminator evaluates both real and fake data, assigning each sample a probability of being real.

4.Loss Function: The generator’s loss is inversely related to the discriminator’s accuracy. The generator aims to minimize the log-probability that the discriminator assigns to the generated data being fake. On the other hand, the discriminator aims to maximize the log-probability of assigning the correct labels to real and fake data.

5.The training process continues iteratively until the generator produces realistic data samples that can deceive the discriminator effectively. At this point, the generator can be used as a generative model to produce new data that resembles the training data.
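In equation form, this adversarial game is the standard minimax objective from the original GAN formulation (Goodfellow et al., 2014):

\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

where D(x) is the discriminator’s estimated probability that x is real, and G(z) is the generator’s output for the noise vector z.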


Let us now focus on GANs for recommender systems and how they work:

1.Data Representation:

Convert user-item interactions into a matrix format, where rows represent users, columns represent items, and the entries contain user-item interactions (e.g., ratings, binary indicators, or other relevant metrics).

2.GAN Architecture:

  • Generator (G): The generator network takes random noise or user/item embeddings as input and generates a recommendation vector for each user. It aims to produce recommendations that are relevant to each user.
  • Discriminator (D): The discriminator network takes a user-item pair (user embedding and item embedding) as input and predicts whether the user-item interaction is real (i.e., from the training data) or fake (i.e., generated by the generator).
  • The generator and discriminator are trained adversarially, with the generator trying to improve the quality of its recommendations to deceive the discriminator, and the discriminator trying to distinguish between real and generated user-item pairs accurately.

3.Training Process:

  • Randomly initialize the generator and discriminator.
  • For each training iteration:

1.Sample a batch of real user-item pairs from the training data.

2.Generate a batch of fake user-item pairs using the generator.

3.Train the discriminator on both real and fake pairs, optimizing its ability to distinguish between them.

4.Train the generator, trying to maximize the discriminator’s confusion (minimizing its ability to tell real from fake).

4.Recommendation Generation:

  • After training, use the trained generator to produce personalized recommendations for users by inputting their user embeddings or random noise.
  • The generator will output a recommendation vector, which can be used to rank items based on their relevance to the user.

Let’s take some example code to understand this better.

We use the MovieLens dataset to develop the recommendation model.

First, we install and import the required packages.

!pip install numpy pandas tensorflow

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split

Next, we load the dataset and preprocess it.

data = pd.read_csv('/content/ratings.csv')

# Preprocess the data to create the user-item interactions matrix
user_item_matrix = data.pivot(index='userId', columns='movieId', values='rating')
user_item_matrix = user_item_matrix.fillna(0)

# Convert the user-item interactions matrix to a NumPy array
user_item_matrix_array = user_item_matrix.to_numpy()

# Splitting the data into train and test sets
train_data, test_data = train_test_split(user_item_matrix_array, test_size=0.3, random_state=10)

We define the parameters, build the generator and discriminator architectures, and finally combine the two to build the model.

# Define GAN parameters
original_dim = user_item_matrix_array.shape[1]  # Number of items (features)
latent_dim = 32                                 # Number of latent dimensions
batch_size = 64
epochs = 50

# GAN generator architecture
gen_input = Input(shape=(latent_dim,))
gen_x = Dense(128, activation='relu')(gen_input)
gen_output = Dense(original_dim, activation='sigmoid')(gen_x)
generator = Model(gen_input, gen_output)

# GAN discriminator architecture
disc_input = Input(shape=(original_dim,))
disc_x = Dense(128, activation='relu')(disc_input)
disc_output = Dense(1, activation='sigmoid')(disc_x)
discriminator = Model(disc_input, disc_output)

# Compile the discriminator
discriminator.compile(optimizer=Adam(learning_rate=0.0002, beta_1=0.5), loss='binary_crossentropy')

# Freeze the discriminator inside the combined model, so that training
# the combined GAN updates only the generator's weights
discriminator.trainable = False

# Combine the generator and discriminator into a GAN model
gan_input = Input(shape=(latent_dim,))
gan_output = discriminator(generator(gan_input))
gan = Model(gan_input, gan_output)
gan.compile(optimizer=Adam(learning_rate=0.0002, beta_1=0.5), loss='binary_crossentropy')

Now the model is trained.

# Training the GAN
for epoch in range(epochs):
    # Random noise generated as input to the generator
    noise = np.random.normal(0, 1, size=[batch_size, latent_dim])

    # Generate fake data using the generator
    generated_data = generator.predict(noise)

    # Select a random batch of real data from the training set
    idx = np.random.randint(0, train_data.shape[0], batch_size)
    real_data = train_data[idx]

    # Concatenate real and fake data to create the training dataset for the discriminator
    X = np.concatenate([real_data, generated_data])

    # Create labels for the discriminator (0.9 for real data, 0 for fake data)
    y_disc = np.zeros(2 * batch_size)
    y_disc[:batch_size] = 0.9  # One-sided label smoothing for stability

    # Train the discriminator (its weights are trainable in its own compiled model)
    d_loss = discriminator.train_on_batch(X, y_disc)

    # Train the generator through the combined model, where the discriminator is frozen
    noise = np.random.normal(0, 1, size=[batch_size, latent_dim])
    y_gen = np.ones(batch_size)
    g_loss = gan.train_on_batch(noise, y_gen)

# Use the trained generator to generate recommendations
def generate_recommendation(noise, generator_model):
    return generator_model.predict(noise)

# Example usage: the generator takes a latent noise vector of size
# latent_dim as input (not a raw user vector, which has original_dim entries)
noise = np.random.normal(0, 1, size=(1, latent_dim))
recommendations = generate_recommendation(noise, generator)

The ‘recommendations’ array will contain predicted interaction probabilities for each item.

We can then use this information to recommend the top items to the user, ranking them exactly as in the VAE example above.

Autoregressive Model for Recommendation Systems

Autoregressive models are another class of generative models that can be used in recommendation systems.

One popular autoregressive choice for recommendation is a Recurrent Neural Network (RNN) with a sequence-to-sequence architecture.

Let us see how an autoregressive model is used for a recommendation system:

1.Data Preparation: Convert user-item interactions into a sequence of items for each user, ordered by the timestamp or interaction sequence. For example, a user’s interaction history [item1, item2, item3] can be used to predict the next item in the sequence.

2.Autoregressive Model Architecture:

  • Encoder: The encoder network processes the input sequence (user’s interaction history) and encodes it into a fixed-length context vector. Common encoder choices include LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) layers.
  • Decoder: The decoder network takes the context vector and generates the next item in the sequence, conditioned on the previous items. It can use LSTM or GRU layers, followed by a dense layer with a softmax activation to predict the probability distribution over items.

3.Training Process:

  • The autoregressive model is trained to minimize the cross-entropy loss between the predicted item probabilities and the ground truth (the actual next item in the sequence).
  • During training, the model uses teacher forcing, where the ground truth items are fed as input to the decoder at each time step, instead of using its own generated outputs.

4.Recommendation Generation:

  • To generate recommendations for a user, provide the user’s interaction history to the trained model as input.
  • Use the model to predict the next item in the sequence.

Repeat the process, conditioning each prediction on the previously generated items, until the desired number of recommendations is obtained (a sketch of this loop follows the example code below).

# Importing important libraries
import numpy as np
import pandas as pd
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Embedding
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

data = pd.read_csv('path/to/dataset.csv')

# Preprocess the data to create the user-item interaction sequences
sequences = data.groupby('userId')['movieId'].apply(list).tolist()

# Convert movieId to integer indices, reserving index 0 for padding
unique_items = sorted(set(item for seq in sequences for item in seq))
item_to_index = {item: index + 1 for index, item in enumerate(unique_items)}
index_to_item = {index: item for item, index in item_to_index.items()}
sequences_indices = [[item_to_index[item] for item in seq] for seq in sequences]

# Create input and target sequences for training (teacher forcing: the input
# is the sequence without its last item, the target is the sequence shifted
# one step ahead)
input_sequences = [seq[:-1] for seq in sequences_indices]
target_sequences = [seq[1:] for seq in sequences_indices]

# Pad sequences to the same length for batch processing of the data
max_sequence_length = max(len(seq) for seq in sequences_indices)
input_sequences = np.array([seq + [0] * (max_sequence_length - len(seq)) for seq in input_sequences])
target_sequences = np.array([seq + [0] * (max_sequence_length - len(seq)) for seq in target_sequences])

# Converting target sequences to one-hot encoded format (+1 class for padding)
target_sequences_onehot = to_categorical(target_sequences, num_classes=len(unique_items) + 1)

# Split the data into train and test
train_data, test_data, train_targets, test_targets = train_test_split(input_sequences, target_sequences_onehot, test_size=0.2, random_state=43)

# Define RNN (sequence-to-sequence) model architecture
latent_dim = 32                     # Number of latent dimensions
num_items = len(unique_items) + 1   # +1 for the padding index 0

# Encoder: embeds the item indices and summarizes the history in its final states
encoder_inputs = Input(shape=(max_sequence_length,))
encoder_embedding = Embedding(num_items, latent_dim, mask_zero=True)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_embedding)

# Decoder: predicts the next item at every time step, conditioned on the encoder states
decoder_inputs = Input(shape=(max_sequence_length,))
decoder_embedding = Embedding(num_items, latent_dim, mask_zero=True)(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True)
decoder_outputs = decoder_lstm(decoder_embedding, initial_state=[state_h, state_c])
decoder_dense = Dense(num_items, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Train the model (the shifted sequences serve as both encoder and decoder inputs)
model.fit([train_data, train_data], train_targets,
          batch_size=64,
          epochs=50,
          validation_data=([test_data, test_data], test_targets))

# Use the trained model to predict the next item for a user
def generate_recommendations(user_data, model):
    # 'user_data' contains the user's interaction history as a padded sequence
    # of item indices, e.g. [1, 5, 2, 3, 0, 0, 0] where 0 represents padding
    seq_len = int(np.count_nonzero(user_data))  # position of the last real item
    user_data = np.expand_dims(user_data, axis=0)
    predictions = model.predict([user_data, user_data])
    # Read the prediction at the last non-padding time step,
    # skipping class 0 (the padding index)
    next_item_index = int(np.argmax(predictions[0, max(seq_len - 1, 0), 1:])) + 1
    return next_item_index

# Example usage
user_id = 0
user_data = input_sequences[user_id]
next_item_index = generate_recommendations(user_data, model)
next_item = index_to_item[next_item_index]
print("Next recommended item:", next_item)
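As noted earlier, one forward pass predicts only the next item. To obtain several recommendations, we repeat the process, feeding each prediction back into the history. A minimal sketch of this loop (our own illustration; k=5 is arbitrary, and greedy argmax decoding may occasionally repeat an item):

def generate_top_k(user_history, model, k=5):
    # Autoregressively extend a user's history by k predicted items
    history = [i for i in user_history if i != 0]  # drop padding
    recs = []
    for _ in range(k):
        seq = np.array(history[-max_sequence_length:])
        seq = np.pad(seq, (0, max_sequence_length - len(seq)))  # re-pad to model length
        seq = np.expand_dims(seq, axis=0)
        preds = model.predict([seq, seq], verbose=0)
        pos = min(len(history), max_sequence_length) - 1   # last real time step
        next_index = int(np.argmax(preds[0, pos, 1:])) + 1  # skip the padding class
        recs.append(next_index)
        history.append(next_index)
    return [index_to_item[i] for i in recs]

print("Top-5 recommendations:", generate_top_k(input_sequences[0], model))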

Conclusion

So, from this blog we got an overall idea of what generative models for recommender systems are. We discussed three approaches in brief: VAE, GAN, and autoregressive models, and we saw each applied to the MovieLens sample dataset. In practice, the choice of a generative model depends on the specific characteristics of the dataset, the desired recommendation goals, and the available computational resources. Overall, generative models offer a promising direction in the field of recommendation systems, and ongoing research continues to improve their effectiveness and address the challenges associated with their use. Thanks for reading; if you liked the blog, please leave feedback in the comment section.
