
Viturka

Documentation

Installation

To install the SDK, run the following command in a terminal:

pip install viturka

To install the deep learning library, viturka_nn, run the following command in a terminal:

pip install viturka_nn

viturka.FM_als

1. handle_missing_values

Handles missing values in a DataFrame by dropping or imputing columns based on the percentage of NaN values. Includes options for advanced imputation.

Parameters:

  • df (pd.DataFrame): The input DataFrame.
  • drop_threshold (float): Fraction of missing values (0 to 1) above which a column is dropped (default: 0.7).
  • fill_threshold (float): Fraction of missing values (0 to 1) below which a column is imputed (default: 0.3).
  • advanced_imputation (str or None): Type of advanced imputation ('knn', 'regression', or None).

Returns:

A DataFrame with missing values handled.
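The threshold logic described above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not Viturka's implementation: the function name sketch_handle_missing is hypothetical, median/mode filling is an assumed imputation strategy, and columns whose missing fraction falls between the two thresholds are simply left untouched here.

```python
import numpy as np
import pandas as pd

def sketch_handle_missing(df, drop_threshold=0.7, fill_threshold=0.3):
    """Hypothetical sketch: drop sparse columns, impute mostly complete ones."""
    out = df.copy()
    missing_fraction = out.isna().mean()  # fraction of NaN per column
    for column, fraction in missing_fraction.items():
        if fraction > drop_threshold:
            out = out.drop(columns=column)
        elif 0 < fraction < fill_threshold:
            if pd.api.types.is_numeric_dtype(out[column]):
                # Assumed strategy: median for numeric columns
                out[column] = out[column].fillna(out[column].median())
            else:
                # Assumed strategy: most frequent value otherwise
                out[column] = out[column].fillna(out[column].mode().iloc[0])
    return out

example = pd.DataFrame({
    "price": [10.0, np.nan, 30.0, 40.0, 50.0],       # 20% missing, imputed
    "notes": [np.nan, np.nan, np.nan, np.nan, "x"],  # 80% missing, dropped
})
cleaned = sketch_handle_missing(example)
```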

2. _regression_imputation

Performs regression-based imputation for a column with missing values.

Parameters:

  • df (pd.DataFrame): The DataFrame containing the column to be imputed.
  • target_column (str): The column to be imputed.

Returns:

The DataFrame with the target column imputed.
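As a rough illustration of regression-based imputation, the sketch below fits an ordinary least-squares line on the rows where the target is known and predicts the missing rows. It is not the library's code: the name sketch_regression_impute is hypothetical, and it assumes every other column is numeric and has no missing values.

```python
import numpy as np
import pandas as pd

def sketch_regression_impute(df, target_column):
    """Hypothetical sketch: fill NaNs in target_column from a linear fit."""
    out = df.copy()
    predictors = [c for c in out.columns if c != target_column]
    known = out[target_column].notna().to_numpy()
    # Design matrix with an intercept column, built from the predictor columns
    X = np.column_stack([np.ones(len(out)), out[predictors].to_numpy(dtype=float)])
    y = out.loc[known, target_column].to_numpy(dtype=float)
    # Least-squares fit on the rows where the target is observed
    coef, *_ = np.linalg.lstsq(X[known], y, rcond=None)
    # Predict the target for the rows where it is missing
    out.loc[~known, target_column] = X[~known] @ coef
    return out

frame = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, np.nan, 8.0]})
filled = sketch_regression_impute(frame, "y")
```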

3. correlation_matrix

Plots the correlation matrix for selected features in a CSV file.

Parameters:

  • file (str): Path to the CSV file.
  • selected_columns (list): List of columns for which to compute correlations.
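The computation behind such a plot can be sketched with pandas alone. The example below only computes the matrix and leaves the plotting step out; the in-memory CSV text and its column names are illustrative stand-ins for a real file.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file on disk (illustrative data)
csv_text = "a,b,c\n1,2,9\n2,4,7\n3,6,5\n4,8,3\n"
selected_columns = ["a", "b", "c"]

df = pd.read_csv(io.StringIO(csv_text), usecols=selected_columns)
corr = df[selected_columns].corr()  # Pearson correlation by default
print(corr.round(2))
```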

4. find_similar_items

Finds the most similar items to a given item using cosine similarity and latent vectors from a trained model.

Parameters:

  • model: A trained factorization machine model.
  • dict_vectorizer: A vectorizer for categorical features.
  • item_features (pd.DataFrame): Dataset containing item features.
  • item_id: The ID of the target item.
  • item_id_column (str): Column name for item IDs.
  • numerical_columns (list): List of numerical feature columns.
  • categorical_columns (list): List of categorical feature columns.
  • top_k (int): Number of similar items to return (default: 5).

Returns:

A list of top-k similar items with their similarity scores.
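The ranking step can be illustrated independently of any trained model. The sketch below assumes the latent vectors have already been extracted into a plain array, one row per item; the function name top_k_similar and the toy vectors are hypothetical.

```python
import numpy as np

def top_k_similar(latent_vectors, item_ids, target_id, top_k=5):
    """Rank items by cosine similarity to the target item's latent vector."""
    vectors = np.asarray(latent_vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against zero vectors
    target_index = item_ids.index(target_id)
    scores = unit @ unit[target_index]            # cosine similarity per item
    # Sort descending and skip the target item itself
    order = [i for i in np.argsort(-scores) if i != target_index][:top_k]
    return [(item_ids[i], float(scores[i])) for i in order]

ids = ["A", "B", "C", "D"]
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
similar = top_k_similar(vecs, ids, "A", top_k=2)
```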

5. train_model

Trains a factorization machine model on a dataset while avoiding data leakage.

Parameters:

  • file (str): Path to the CSV dataset.
  • target_column (str): Column name for the target variable.
  • numerical_columns (list): List of numerical feature columns.
  • categorical_columns (list): List of categorical feature columns.
  • item_id_column (str): Column name for item IDs.
  • n_iter (int): Number of iterations (default: 100).
  • Other parameters for model configuration.

Returns:

Trained model, combined training/test data, test target values, vectorizer, scaler, and processed DataFrame.
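This document does not say how train_model avoids data leakage. One common approach, sketched below as an assumption rather than a description of the library, is to split the data first and compute scaling statistics from the training rows only. All names and the synthetic data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # synthetic features

# Split BEFORE computing any statistics
indices = rng.permutation(len(X))
train_idx, test_idx = indices[:80], indices[80:]
X_train, X_test = X[train_idx], X[test_idx]

# Fit the scaling parameters on the training rows only...
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# ...and reuse them unchanged for the test rows
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std
```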

6. evaluate_model

Evaluates the FM model using mean squared error (MSE).

Parameters:

  • model: Trained FM model.
  • X_test: Test feature matrix.
  • y_test: True target values for the test set.
  • scaler_target: Scaler used for the target variable.

Returns:

Predicted values and MSE score.
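Because a target scaler is passed in, it seems likely that predictions are mapped back to the original units before the error is computed. The sketch below illustrates that idea under this assumption; TinyScaler is a hypothetical stand-in for a fitted scaler, and mse_in_original_units is not part of the SDK.

```python
import numpy as np

class TinyScaler:
    """Hypothetical stand-in for a fitted target scaler (mean/std scaling)."""
    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def inverse_transform(self, values):
        return np.asarray(values, dtype=float) * self.std + self.mean

def mse_in_original_units(y_pred_scaled, y_test_scaled, scaler_target):
    # Undo the target scaling so the error is expressed in original units
    y_pred = scaler_target.inverse_transform(y_pred_scaled)
    y_true = scaler_target.inverse_transform(y_test_scaled)
    return y_pred, float(np.mean((y_true - y_pred) ** 2))

scaler = TinyScaler(mean=10.0, std=2.0)
preds, mse = mse_in_original_units([0.0, 1.0], [0.5, 1.0], scaler)
```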

viturka.client

This document provides an overview of the ModelUploader class in the client module, which facilitates uploading models to a server, performing aggregation, and integrating the received global model.

Class: ModelUploader

The ModelUploader class is responsible for handling model uploads, interacting with the server, and performing local aggregation with global models.

Attributes

  • api_key: The API key for authenticating requests to the server.
  • server_url: The endpoint for uploading models to the server. Default is https://example.com/upload_model.

Methods

__init__(self, api_key)

Initializes the ModelUploader object with the provided API key.

Parameters:
  • api_key (str): The API key for authenticating server requests.

pad_to_match_shape(self, model1_weights, model2_weights)

Pads the smaller weight array with zeros to match the size of the larger array.

Parameters:
  • model1_weights (array): The first model's weights.
  • model2_weights (array): The second model's weights.
Returns:
  • A tuple of two arrays with matching shapes.
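A minimal sketch of zero-padding with NumPy follows. It assumes 1-D weight arrays and padding at the end of the shorter array; the actual method may handle other shapes or pad differently, and the name pad_to_match is hypothetical.

```python
import numpy as np

def pad_to_match(a, b):
    """Zero-pad the shorter of two 1-D arrays at the end so lengths match."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    size = max(a.shape[0], b.shape[0])
    # np.pad fills with zeros by default (mode="constant")
    a = np.pad(a, (0, size - a.shape[0]))
    b = np.pad(b, (0, size - b.shape[0]))
    return a, b

w1, w2 = pad_to_match([1.0, 2.0, 3.0], [4.0])
```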

upload_model(self, model, model_type, vectorizer=None)

Uploads a model to the server, retrieves the global model, and performs local aggregation.

Parameters:
  • model: The local model to be uploaded.
  • model_type (str): The type of the model (e.g., "linear").
  • vectorizer (optional): Additional vectorizer object for preprocessing, if applicable.
Returns:
  • The locally aggregated model after merging with the global model.
Process:
  • Serializes the local model and optional vectorizer using pickle.
  • Sends the serialized data to the server via a POST request.
  • Deserializes the received global model and performs local aggregation on weights, latent factor matrix, and bias term.
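The serialize-and-aggregate steps above can be sketched without any network call. Everything here is an assumption for illustration: the dictionary keys w, V, and w0 stand in for weights, latent factor matrix, and bias term; the server reply is simulated locally; and equal-weight averaging is only one possible aggregation rule.

```python
import pickle
import numpy as np

# Hypothetical parameter containers; the real model objects are not shown here
local = {"w": np.array([1.0, 2.0]), "V": np.ones((2, 2)), "w0": 0.5}
global_params = {"w": np.array([3.0, 4.0]), "V": np.zeros((2, 2)), "w0": 1.5}

# Serialize the local parameters (the bytes a client would send in the POST)
payload = pickle.dumps(local)

# Deserialize the server's reply (simulated here with a local round trip)
received = pickle.loads(pickle.dumps(global_params))
restored_local = pickle.loads(payload)

# Assumed aggregation rule: element-wise average of local and global values
aggregated = {key: (restored_local[key] + received[key]) / 2 for key in restored_local}
```

Note that pickle.loads can execute arbitrary code, so pickled data should only be loaded from a source you trust.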

Usage Example

from viturka.client import ModelUploader

# Initialize the uploader
uploader = ModelUploader(api_key="your_api_key")

# Define your model and optional vectorizer
local_model = ...  # Example: A scikit-learn or custom model object
vectorizer = ...   # Example: A vectorizer

# Upload and aggregate the model
aggregated_model = uploader.upload_model(local_model, model_type="linear", vectorizer=vectorizer)

Viturka API

The following examples show API payloads for uploading a model:

Bash (using curl)

curl -X POST \
  "https://viturka.com/upload_model" \
  -H "Authorization: Bearer ${api_key}" \
  -F "model=@model.pkl;filename=model.pkl" \
  -F "model_type=${model_type}"

cURL (command-line tool)


curl -X POST \
  -H "Content-Type: multipart/form-data" \
  -F "api_key=${api_key}" \
  -F "model_type=${model_type}" \
  -F "model=@model.pkl;filename=model.pkl" \
  "https://viturka.com/upload_model"
 

JavaScript (using fetch API)

const formData = new FormData();
formData.append('api_key', api_key);
formData.append('model_type', model_type);
formData.append('model', new Blob([model_data]), 'model.pkl');

fetch("https://viturka.com/upload_model", {
  method: 'POST',
  body: formData
})
.then(response => response.json())
.then(data => {
  // Process response data
})
.catch(error => {
  console.error(error);
});

Python (using requests)


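import requests

api_key = "your_api_key"  # placeholder credentials
model_type = "linear"
with open("model.pkl", "rb") as f:  # pickled model on disk
    model_data = f.read()

response = requests.post(
    "https://viturka.com/upload_model",
    files={'model': ('model.pkl', model_data)},
    data={'api_key': api_key, 'model_type': model_type}
)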

JSON


{
    "api_key": "${api_key}",
    "model_type": "${model_type}",
    "model": "data:application/octet-stream;base64,"
}
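A payload of the shape above could be assembled as follows. This sketch assumes the model field carries the base64-encoded model bytes after the data URI prefix; whether the endpoint accepts a JSON body in this form is not confirmed by this document, and build_json_payload is a hypothetical helper.

```python
import base64
import json

def build_json_payload(api_key, model_type, model_bytes):
    """Build a JSON body whose "model" field is a base64 data URI."""
    encoded = base64.b64encode(model_bytes).decode("ascii")
    return json.dumps({
        "api_key": api_key,
        "model_type": model_type,
        "model": "data:application/octet-stream;base64," + encoded,
    })

# Placeholder bytes stand in for the contents of model.pkl
body = build_json_payload("your_api_key", "linear", b"\x00\x01demo")
```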

Following is the list of collaborative filtering recommender model types offered by Viturka along with required parameters:

For more collaborative models, check out the Model Registry.

Syntax for column names of parameters: User ID is user_id and Product Name is product_name. Similarly, for all column names, avoid capital letters and use underscores in place of spaces.
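The naming rule can be applied mechanically. The helper below is a hypothetical convenience, not part of the SDK; it lowercases a label and replaces whitespace with underscores.

```python
import re

def to_column_name(label):
    """Lowercase a label and replace runs of whitespace with underscores."""
    return re.sub(r"\s+", "_", label.strip().lower())

names = [to_column_name(label) for label in ["User ID", "Product Name"]]
print(names)
```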
