Attention

  • namespace: Rindow\NeuralNetworks\Layer
  • classname: Attention

Dot-product attention layer.

The inputs are a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and a key tensor of shape [batch_size, Tv, dim]. The calculation proceeds in three steps:

  • Calculate scores with shape [batch_size, Tq, Tv] as a query-key dot product: scores = matmul(query, key, transpose_b=True).
  • Use scores to calculate a distribution with shape [batch_size, Tq, Tv]: distribution = softmax(scores).
  • Use distribution to create a linear combination of value with shape [batch_size, Tq, dim]: return matmul(distribution, value).
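
The three steps above can be sketched in plain PHP on nested arrays. This is only an illustration of the arithmetic, not the library's implementation: the function name dotProductAttention is hypothetical, it uses no Rindow classes, and it assumes the key defaults to the value tensor when omitted, as described under "Input shape".

```php
<?php
declare(strict_types=1);

/**
 * Plain-PHP sketch of dot-product attention on nested arrays.
 * Hypothetical helper for illustration; not part of Rindow Neural Networks.
 *
 * @param array      $query [batch_size, Tq, dim]
 * @param array      $value [batch_size, Tv, dim]
 * @param array|null $key   [batch_size, Tv, dim]; assumed to default to $value
 * @return array     [outputs, distribution]
 */
function dotProductAttention(array $query, array $value, ?array $key = null): array
{
    $key = $key ?? $value;
    $outputs = [];
    $distribution = [];
    foreach ($query as $b => $queryRows) {
        foreach ($queryRows as $i => $q) {
            // Step 1: scores[j] = dot(query[b][i], key[b][j])
            $scores = [];
            foreach ($key[$b] as $j => $k) {
                $dot = 0.0;
                foreach ($q as $d => $x) {
                    $dot += $x * $k[$d];
                }
                $scores[$j] = $dot;
            }
            // Step 2: softmax over the Tv axis (max subtracted for numerical stability)
            $max = max($scores);
            $exp = array_map(fn($s) => exp($s - $max), $scores);
            $sum = array_sum($exp);
            $weights = array_map(fn($e) => $e / $sum, $exp);
            $distribution[$b][$i] = $weights;
            // Step 3: outputs[b][i] = sum over j of weights[j] * value[b][j]
            $out = array_fill(0, count($value[$b][0]), 0.0);
            foreach ($weights as $j => $w) {
                foreach ($value[$b][$j] as $d => $v) {
                    $out[$d] += $w * $v;
                }
            }
            $outputs[$b][$i] = $out;
        }
    }
    return [$outputs, $distribution];
}

// Usage: all-ones inputs with query shape [4,3,5] and value shape [4,2,5].
$query = array_fill(0, 4, array_fill(0, 3, array_fill(0, 5, 1.0)));
$value = array_fill(0, 4, array_fill(0, 2, array_fill(0, 5, 1.0)));
[$outputs, $scores] = dotProductAttention($query, $value);

echo count($outputs), ',', count($outputs[0]), ',', count($outputs[0][0]), PHP_EOL; // 4,3,5
echo count($scores), ',', count($scores[0]), ',', count($scores[0][0]), PHP_EOL;    // 4,3,2
echo $scores[0][0][0], PHP_EOL; // 0.5 (uniform weights, since all scores are equal)
```

Because every score is identical for all-ones inputs, the softmax yields uniform weights over the Tv positions, which makes the shapes easy to check by hand.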

Methods

constructor

$builder->Attention(
    array $input_shapes=null,
    string $name=null,
)

You can create an Attention layer instance with the Layer Builder.

Options

  • input_shapes: Tells the first layer the shapes of the input data. The batch dimension is not included.
  • return_attention_scores: If true, the attention scores (after softmax) are returned along with the outputs.

Input shape

The input is a list of the form [query, value] or [query, value, key]. If key is omitted, the value tensor is also used as the key. The query tensor shape is [batch_size, Tq, dim]. The value tensor shape is [batch_size, Tv, dim]. The key tensor shape is [batch_size, Tv, dim].

Output shape

If return_attention_scores is true, the output is a list of [outputs, scores]; otherwise only outputs is returned. The outputs shape is [batch_size, Tq, dim]. The scores shape is [batch_size, Tq, Tv].

$attention = $builder->layers()->Attention();
....
$query = $mo->ones([4,3,5]);
$value = $mo->ones([4,2,5]);
....
[$outputs,$scores] = $attention->forward([$query,$value],true,
                                    ['return_attention_scores'=>true]);
# $outputs->shape() : [4,3,5]
# $scores->shape() : [4,3,2]

Example of usage

class Foo extends AbstractModel
{
    public function __construct($backend,$builder)
    {
        ...
        $this->attention = $builder->layers()->Attention();
        ....
    }

    protected function call(.....) : NDArray
    {
        ...
        $outputs = $this->attention->forward([$query, $value],$training);
        ...
    }
}