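Feature Extraction

All models in timm share a consistent mechanism for obtaining various types of features from the model, for tasks other than classification.

Penultimate Layer Features (Pre-Classifier Features)

The features from the penultimate model layer can be obtained in several ways without modifying the model (although you are free to modify it). You must first decide whether you want pooled or unpooled features.

Unpooled

There are three ways to obtain unpooled features. The final, unpooled features are sometimes referred to as the last hidden state. In timm, this runs up to and including the final normalization layer (for example in ViT-style models) but excludes pooling / class token selection and the final post-pooling layers.

Without modifying the network, you can call `model.forward_features(input)` on any model instead of the usual `model(input)`. This bypasses the network's head (classifier) and global pooling.

If you want to explicitly modify the network to return unpooled features, you can either create the model without a classifier and pooling, or remove them later. Both approaches remove the classifier parameters from the network.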

forward_features()

>>> import torch
>>> import timm
>>> m = timm.create_model('xception41', pretrained=True)
>>> o = m(torch.randn(2, 3, 299, 299))
>>> print(f'Original shape: {o.shape}')
>>> o = m.forward_features(torch.randn(2, 3, 299, 299))
>>> print(f'Unpooled shape: {o.shape}')

Output

Original shape: torch.Size([2, 1000])
Unpooled shape: torch.Size([2, 2048, 10, 10])

Create with no classifier and pooling

>>> import torch
>>> import timm
>>> m = timm.create_model('resnet50', pretrained=True, num_classes=0, global_pool='')
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Unpooled shape: {o.shape}')

Output

Unpooled shape: torch.Size([2, 2048, 7, 7])

Remove it later

>>> import torch
>>> import timm
>>> m = timm.create_model('densenet121', pretrained=True)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Original shape: {o.shape}')
>>> m.reset_classifier(0, '')
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Unpooled shape: {o.shape}')

Output

Original shape: torch.Size([2, 1000])
Unpooled shape: torch.Size([2, 1024, 7, 7])

Chaining unpooled output to the classifier

The last hidden state can be fed back into the head of the model using the `forward_head()` function.

>>> import torch
>>> import timm
>>> model = timm.create_model('vit_medium_patch16_reg1_gap_256', pretrained=True)
>>> output = model.forward_features(torch.randn(2,3,256,256))
>>> print('Unpooled output shape:', output.shape)
>>> classified = model.forward_head(output)
>>> print('Classification output shape:', classified.shape)

Output

Unpooled output shape: torch.Size([2, 257, 512])
Classification output shape: torch.Size([2, 1000])

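Pooled

To obtain pooled features, you can either call `forward_features()` and pool/flatten the result yourself, or modify the network as above while keeping the pooling layer intact.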

Create with no classifier

>>> import torch
>>> import timm
>>> m = timm.create_model('resnet50', pretrained=True, num_classes=0)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Pooled shape: {o.shape}')

Output

Pooled shape: torch.Size([2, 2048])

Remove it later

>>> import torch
>>> import timm
>>> m = timm.create_model('ese_vovnet19b_dw', pretrained=True)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Original shape: {o.shape}')
>>> m.reset_classifier(0)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> print(f'Pooled shape: {o.shape}')

Output

Original shape: torch.Size([2, 1000])
Pooled shape: torch.Size([2, 1024])

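Multi-scale Feature Maps (Feature Pyramid)

Object detection, segmentation, keypoint detection, and a variety of dense pixel tasks require access to feature maps from the backbone network at multiple scales. This is often done by modifying the original classification network. Since network structures vary considerably, any given object detection or segmentation library usually supports only a handful of backbones.

timm provides a consistent interface for creating any of the included models as a feature backbone that outputs feature maps for selected levels.

A feature backbone can be created by adding the argument `features_only=True` to any `create_model` call. By default, most models with a feature hierarchy output up to 5 features, up to a reduction of 32. However, this varies per model: some models have fewer hierarchy levels, and some (like ViT) have a larger number of non-hierarchical feature maps and default to outputting the last 3. The `out_indices` argument can be passed to `create_model` to specify which features you want.

Create a feature map extraction model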

>>> import torch
>>> import timm
>>> m = timm.create_model('resnest26d', features_only=True, pretrained=True)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> for x in o:
...     print(x.shape)

Output

torch.Size([2, 64, 112, 112])
torch.Size([2, 256, 56, 56])
torch.Size([2, 512, 28, 28])
torch.Size([2, 1024, 14, 14])
torch.Size([2, 2048, 7, 7])

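Query the feature information

After a feature backbone has been created, it can be queried for channel or resolution reduction information to pass to downstream heads, without requiring static config or hardcoded constants. The `.feature_info` attribute is a class encapsulating the information about the feature extraction points.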

>>> import torch
>>> import timm
>>> m = timm.create_model('regnety_032', features_only=True, pretrained=True)
>>> print(f'Feature channels: {m.feature_info.channels()}')
>>> o = m(torch.randn(2, 3, 224, 224))
>>> for x in o:
...     print(x.shape)

Output

Feature channels: [32, 72, 216, 576, 1512]
torch.Size([2, 32, 112, 112])
torch.Size([2, 72, 56, 56])
torch.Size([2, 216, 28, 28])
torch.Size([2, 576, 14, 14])
torch.Size([2, 1512, 7, 7])

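Select specific feature levels or limit the stride

There are two additional creation arguments that affect the output features.

`out_indices` selects which indices to output
`output_stride` limits the feature output stride of the network (this also works in classification mode)

Output index selection

The `out_indices` argument is supported by all models, but not all models have the same index-to-feature-stride mapping. Look at the code or check `feature_info` to compare. The out indices generally correspond to the `C(i+1)` feature level (a `2^(i+1)` reduction). For most convnet models, index 0 is the stride 2 feature and index 4 is the stride 32 feature. For many ViT or ViT-Conv hybrids, many or all of the feature maps may have the same shape, or there may be a combination of hierarchical and non-hierarchical feature maps. It is best to inspect the `feature_info` attribute to see the number of features, their channel counts, and their reduction levels.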
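As a quick worked example of the index-to-stride rule above, the sketch below computes the nominal reduction and the resulting feature map size for each index of a typical CNN. The helpers `nominal_reduction` and `feature_map_size` are made up for illustration and are not part of timm. The rule only holds for models that follow the usual `C(i+1)` layout, and the size calculation assumes the input size divides evenly by the reduction, so `feature_info` remains the authority for any specific model.

```python
def nominal_reduction(index: int) -> int:
    """Reduction for feature index i under the C(i+1) convention: 2^(i+1)."""
    return 2 ** (index + 1)


def feature_map_size(input_size: int, index: int) -> int:
    """Spatial size at a feature index, assuming input_size divides evenly."""
    return input_size // nominal_reduction(index)


reductions = [nominal_reduction(i) for i in range(5)]
sizes = [feature_map_size(224, i) for i in range(5)]
print(reductions)  # [2, 4, 8, 16, 32]
print(sizes)       # [112, 56, 28, 14, 7]
```

These sizes agree with the spatial dimensions printed for the 224 x 224 `resnest26d` example earlier.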

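`out_indices` supports negative indexing, which makes it easy to get the last, penultimate, etc. feature map. `out_indices=(-2,)` returns the penultimate feature map for any model.

Output stride (feature map dilation)

`output_stride` is achieved by converting layers to use dilated convolutions. Doing so is not always straightforward, and some networks only support `output_stride=32`.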

>>> import torch
>>> import timm
>>> m = timm.create_model('ecaresnet101d', features_only=True, output_stride=8, out_indices=(2, 4), pretrained=True)
>>> print(f'Feature channels: {m.feature_info.channels()}')
>>> print(f'Feature reduction: {m.feature_info.reduction()}')
>>> o = m(torch.randn(2, 3, 320, 320))
>>> for x in o:
...     print(x.shape)

Output

Feature channels: [512, 2048]
Feature reduction: [8, 8]
torch.Size([2, 512, 40, 40])
torch.Size([2, 2048, 40, 40])
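The reductions printed above follow from simple arithmetic: a level whose native reduction exceeds `output_stride` is dilated instead of strided, so its reduction is capped at `output_stride`. The sketch below is a simplified model of that effect, not timm's implementation. `capped_reductions` is a hypothetical helper, and the native reductions of 8 and 32 for indices 2 and 4 are assumed from the usual ResNet-style layout.

```python
def capped_reductions(native, output_stride):
    """Cap each native reduction at output_stride (dilation replaces striding)."""
    return [min(r, output_stride) for r in native]


native = [8, 32]  # assumed native reductions for out_indices=(2, 4)
reductions = capped_reductions(native, output_stride=8)
sizes = [320 // r for r in reductions]  # 320 x 320 input, as in the example above
print(reductions)  # [8, 8]
print(sizes)       # [40, 40]
```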

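Flexible intermediate feature map extraction

In addition to using the `features_only` argument with the model factory, many models support a `forward_intermediates()` method, which provides a flexible mechanism for extracting both the intermediate feature maps and the last hidden state (which can be chained to the head). This method also supports some model-specific features, such as returning class or distillation prefix tokens for some models.

Accompanying the `forward_intermediates` function is a `prune_intermediate_layers` function that lets you prune layers from the model, including the head, the final norm, and/or trailing blocks/stages that are not needed.

An `indices` argument is used by both `forward_intermediates()` and `prune_intermediate_layers()` to select the features to return or the layers to remove. As with `out_indices` in the `features_only` API, `indices` is model specific and selects which intermediates are returned.

In non-hierarchical block-based models such as ViT, the indices correspond to the blocks; in models with hierarchical stages, they usually correspond to the output of the stem plus each hierarchical stage. Both positive (from the start) and negative (relative to the end) indexing work, and `None` returns all intermediates.

When pruning a model, the `prune_intermediate_layers()` call returns an indices variable, because negative indices must be converted to absolute (positive) indices.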

import torch
import timm
model = timm.create_model('vit_medium_patch16_reg1_gap_256', pretrained=True)
output, intermediates = model.forward_intermediates(torch.randn(2,3,256,256))
for i, o in enumerate(intermediates):
    print(f'Feat index: {i}, shape: {o.shape}')

Output

Feat index: 0, shape: torch.Size([2, 512, 16, 16])
Feat index: 1, shape: torch.Size([2, 512, 16, 16])
Feat index: 2, shape: torch.Size([2, 512, 16, 16])
Feat index: 3, shape: torch.Size([2, 512, 16, 16])
Feat index: 4, shape: torch.Size([2, 512, 16, 16])
Feat index: 5, shape: torch.Size([2, 512, 16, 16])
Feat index: 6, shape: torch.Size([2, 512, 16, 16])
Feat index: 7, shape: torch.Size([2, 512, 16, 16])
Feat index: 8, shape: torch.Size([2, 512, 16, 16])
Feat index: 9, shape: torch.Size([2, 512, 16, 16])
Feat index: 10, shape: torch.Size([2, 512, 16, 16])
Feat index: 11, shape: torch.Size([2, 512, 16, 16])

import torch
import timm
model = timm.create_model('vit_medium_patch16_reg1_gap_256', pretrained=True)
print('Original params:', sum(p.numel() for p in model.parameters()))

# Prune the head, the final norm, and the last block; the returned indices are absolute
indices = model.prune_intermediate_layers(indices=(-2,), prune_head=True, prune_norm=True)
print('Pruned params:', sum(p.numel() for p in model.parameters()))

# Reuse the returned indices to fetch the penultimate intermediate from the pruned model
intermediates = model.forward_intermediates(torch.randn(2,3,256,256), indices=indices, intermediates_only=True)
for o in intermediates:
    print(f'Feat shape: {o.shape}')

Output

Original params: 38880232
Pruned params: 35212800
Feat shape: torch.Size([2, 512, 16, 16])
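Why the returned indices matter can be seen with a little arithmetic. The sketch below mimics the negative-to-absolute conversion in plain Python; `to_absolute_indices` is a hypothetical helper written for this illustration, not timm's function, and it assumes pruning keeps every block up to the largest selected index.

```python
def to_absolute_indices(indices, num_blocks):
    """Map possibly negative indices onto the range [0, num_blocks)."""
    absolute = []
    for i in indices:
        j = i + num_blocks if i < 0 else i
        if not 0 <= j < num_blocks:
            raise IndexError(f"index {i} out of range for {num_blocks} blocks")
        absolute.append(j)
    return absolute


num_blocks = 12                                   # the ViT above has 12 blocks
absolute = to_absolute_indices((-2,), num_blocks)
kept_blocks = max(absolute) + 1                   # blocks past the last index are pruned
stale = to_absolute_indices((-2,), kept_blocks)   # -2 re-resolved after pruning
print(absolute)     # [10]
print(kept_blocks)  # 11
print(stale)        # [9]
```

After pruning, `-2` would resolve to block 9 of the remaining 11 rather than the intended block 10, which is why the example above passes the returned `indices` to `forward_intermediates()` instead of reusing `(-2,)`.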
