Kernels

`dkregression.kernels.RBF(x, dist_norm=2, lower_bound_multiplier=0.2, upper_bound_multiplier=5, k_nearest_initial_guess=10)`

The radial basis function (RBF) kernel or squared exponential kernel, how it sometimes is also referred to uses a bell curve to calculate the relationship between two points $x_1, x_2\in\mathbb{R}^d$ in space. It has one (hyper)parameter: the sacale length $\ell$. Let $\mathbf{x}_1\in\mathbb{R}^{m\times d}$ be a matrix that represents $m$ $d$-dimensional points and $\mathbf{x}_2\in\mathbb{R}^{n\times d}$ be a matrix that represents $n$ $d$-dimensional points, then the RBF kernel is defined as: $$ k(\mathbf{x}_1,\mathbf{x}_2)=\exp\left(-\frac{\lVert\mathbf{x}_2 - \mathbf{x}_1\rVert_L^2}{2\ell^2}\right)~. $$ $L$ is the distance norm used. The kernel matrix is of shape $k(\mathbf{x}_1,\mathbf{x}_2)\in\mathbb{R}^{n\times m}$.

Parameters:

Name	Type	Description	Default
`x`	`Tensor`	Set of data points used to initizalize the kernel. Needs to be two-dimensional of shape `(n,d)` with `n` being the number of points and `d` being the dimenionsionality. This determines the initial guess for the kernel parameter $\ell$ as well as the upper and lower bounds of $\ell$ if the `DKR.fit` method is used.	required
`dist_norm`	`float`	Order of the norm that is used for calculating the pairwise distance between two points.	`2`
`lower_bound_multiplier`	`float`	Factor by which the inital guess for the kernel parameters should be multiplied to obtain a lower bound for the search conducted in `DKR.fit`. Value should fulfill `0<=lower_bound_multiplier<=1`.	`0.2`
`upper_bound_multiplier`	`int`	Factor by which the inital guess for the kernel parameters should be multiplied to obtain a upper bound for the search conducted in `DKR.fit`. Value should fulfill `1<=upper_bound_multiplier`.	`5`
`k_nearest_initial_guess`	`int`	The $k$ used in the heuristic to calculate the average distance between all points in the dataset and their k-nearest point which is used as initial guess for the kernel scale length $\ell$.	`10`

Attributes:

Name	Type	Description
`param_names`	`list`	A list of strings that contains the names of the model parameters. For the RBF kernel, this is the scale length 'l'.
`dist_norm`	`float`	The L distance norm to be used to calculate the pairwise distances in the RBF kernel function. Value needs to fulfill `0<dist_norm`.
`k_nearest_initial_guess`	`int`	The $k$ used in the heuristic to calculate teh average distance between all points in the dataset and their k-nearest point which is used as initial guess for the kernel scale length $\ell$.
`params`	`dict`	Contains the current estimate of the kernel hyperparameter $\ell$. The keys are defined in `RBF.param_names`. For the RBF kernel the only key of this dictionary is 'l'. Upon creating the RBF object, an initial guess for the scale length is calculated as the average distance of the `k_nearest_initial_guess`-th closest point for each point in `X`. This ensures that on average, a sufficient number of points contribute to the kernel regression, at least for the inital guess.
`params_lower_bound`	`dict`	Same keys as defined in `RBF.params_names`, provide the lower bounds for the kernel parameters ($\ell$ for the RBF). The value is determined by multiplying the factor `RBF.lower_bound_multiplier` with the inital guess for each of kernel parameters.
`params_upper_bound`	`dict`	Same keys as defined in `RBF.params_names`, provide the upper bounds for the kernel parameters ($\ell$ for the RBF). The value is determined by multiplying the factor `RBF.upper_bound_multiplier` with the inital guess for each of kernel parameters.

Examples:

import torch
from dkregression.kernels import RBF
from dkregression.likelihoods import UnivariateGaussianLikelihood
from dkregression.cross_validation import CrossValidation
from dkregression import DKR

X = torch.rand((100,2))
Y = torch.rand((100,1))

kernel = RBF(X,dist_norm=1) # using the L1-norm
likelihood = UnivariateGaussianLikelihood()
cv = CrossValidation()

model = DKR(kernel, likelihood, cv)
model.fit(X,Y)

Source code in src/dkregression/kernels/rbf.py

def __init__(self,x,dist_norm=2,lower_bound_multiplier=0.2,upper_bound_multiplier=5,k_nearest_initial_guess=10) -> None:
    # expects x.shape = (n,d)
    self.param_names = ["l"]
    self.dist_norm = dist_norm
    self.k_nearest_initial_guess = k_nearest_initial_guess
    self.lower_bound_multiplier = lower_bound_multiplier
    self.upper_bound_multiplier = upper_bound_multiplier
    self.init_params(x)

`kernel_matrix(x1, x2, normalize_dim=1)`

Returns the kernel matrix for the RBF kernel based on two sets of input points x1, and x2.

Parameters:

Name	Type	Description	Default
`x1`	`Tensor`	First set of input points. Needs to have shape `(n,d)`. Even for the one-dimensional case, the shape needs to be `(n,1)`.	required
`x2`	`Tensor`	First set of input points. Needs to have shape `(m,d)`. Even for the one-dimensional case, the shape needs to be `(m,1)`.	required
`normalize_dim`	`int`	Specifies if and in which dimension the kernel matrix should be normalized. `normalize_dim=-1` indicated no normalization, `normalize_dim=0` indicates column-wise normalization, and `normalize_dim=1` indicates row-wise normalization.	`1`

Returns:

Type	Description
`Tensor`	The kernel matrix of shape `(m,n)`. If `normalize` is set to `True`, the kernel matrix is normalized row-wise, i.e., `kernel_matrix(x1,x2,normalize=True)[i,:].sum()==1`.

Examples:

import torch
from dkregression.kernels import RBF

X1 = torch.rand((100,2))
X2 = torch.rand((50,2))

# the initialization is mandatory to set an initial value for the scale length
kernel = RBF(X1) 
k = kernel.kernel_matrix(X1,X2)
print(k.shape)

This results in the following output:

torch.Size([50, 100])

Source code in src/dkregression/kernels/rbf.py

def kernel_matrix(self,x1,x2,normalize_dim=1) -> torch.Tensor:
    """Returns the kernel matrix for the RBF kernel based on two sets of input points `x1`, and `x2`.

    Args:
        x1 (torch.Tensor): First set of input points. Needs to have shape `(n,d)`. Even for the one-dimensional case, the shape needs to be `(n,1)`.
        x2 (torch.Tensor): First set of input points. Needs to have shape `(m,d)`. Even for the one-dimensional case, the shape needs to be `(m,1)`.
        normalize_dim (int, optional): Specifies if and in which dimension the kernel matrix should be normalized. `normalize_dim=-1` indicated no normalization, `normalize_dim=0` indicates column-wise normalization, and `normalize_dim=1` indicates row-wise normalization.

    Returns:
        (torch.Tensor): The kernel matrix of shape `(m,n)`. If `normalize` is set to `True`, the kernel matrix is normalized row-wise, i.e., `kernel_matrix(x1,x2,normalize=True)[i,:].sum()==1`. 

    Examples:
    ```py
    import torch
    from dkregression.kernels import RBF

    X1 = torch.rand((100,2))
    X2 = torch.rand((50,2))

    # the initialization is mandatory to set an initial value for the scale length
    kernel = RBF(X1) 
    k = kernel.kernel_matrix(X1,X2)
    print(k.shape)
    ```
    This results in the following output:
    ```
    torch.Size([50, 100])
    ```
    """

    #expect that x.shape = (n,d)
    # distance matrix
    d = torch.linalg.norm(x1.unsqueeze(0)-x2.unsqueeze(1),dim=-1,ord=self.dist_norm)

    # calculate the log kernel matrix
    log_k = -d**2/(2*self.params["l"]**2)

    if normalize_dim in [0,1]:
        k = torch.softmax(log_k,dim=normalize_dim)  #this should be numerically more stable than normalizing exp(k)
    elif normalize_dim == -1:
        k = torch.exp(log_k)
    else:
        raise NotImplementedError("The normalization dimension needs to be non-negative integer identifying the dimension along which the kernel matrix should be normalized or -1 to indicate that no normalization should occur.")

    return k

Custom Kernel

It is easily possible to design a custom kernel or implement other kernel functions that are not yet part

of the DKRegression package. Doing this requires following the following template. name="__codelineno-0-1" href="#__codelineno-0-1">class CustomKernel(): def __init__(self, x, *args, **kwargs) -> None: # expect the shape of x to be (n,d) # a list of the parameters names (as strings), e.g., ["param1", "param2"] self.param_names = ... # an initial guess for the kernel parameters based on x. Needs to be # a dictionary with keys specified in self.param_names. self.params = ... # lower bounds for the kernel parameters to be used in `DKR.fit`. # Needs to be a dictionary with keys specified in self.param_names. self.params_lower_bound = ... # upper bounds for the kernel parameters to be used in `DKR.fit`. # Needs to be a dictionary with keys specified in self.param_names. self.params_upper_bound = ... def kernel_matrix(self,x1,x2,normalize_dim=1) -> torch.Tensor: # expect the shape of x1 to be (n,d) and the shape of x2 to be (m,d). k = ... # implement the custom kernel function here # handle the different normalization cases if normalize_dim in [0,1]: # if the kernel function is of the type exp(...), consider # computing log(k) above instead of `k` and use `torch.softmax`. # See `dkregression.kernels.RBF` this implementation trick. k /= torch.sum(k,dim=normalize_dim) elif normalize_dim == -1: # if using the softmax trick, don't forget to compute # the exponential of log(k) here. pass else: raise NotImplementedError("The normalization dimension needs to be \ class="s2"> non-negative integer identifying the dimension along which the \ class="s2"> kernel matrix should be normalized or -1 to indicate that no \ class="s2"> normalization should occur.") return k