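Editor's note (极市导读)
This article surveys several existing data augmentation tools for keypoint detection, compares their strengths and weaknesses, and presents a small tool that is compatible with 100+ keypoint data augmentation methods, with detailed hands-on code.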
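I am starting a column on keypoint detection called "只讲关键点" ("Keypoints Only"), and this is the first post in the series, which covers face, body, and hand keypoint detection. Since it is the first post, I figured it should cover something basic and practical. It is a bit like opening a new map in a cultivation novel: if you are good at brewing elixirs, you will always have an edge, whether you are at the Foundation, Golden Core, or Nascent Soul stage. So what counts as a "basic and practical" elixir for keypoint detection? Data augmentation, of course. Whichever keypoint task you are working on (face, body, or hand), you will need data augmentation, and a simple, handy augmentation tool will save you a lot of time. This may well be the most practical post in the whole column, which makes it a fitting opener. Besides introducing existing augmentation tools, it also introduces torchlm, a small tool I put together that is compatible with 100+ keypoint data augmentation methods; it is easy to use and can be installed with a single pip command. Everything below is my personal understanding, and corrections are welcome.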
GitHub (stars and PRs welcome): https://github.com/DefTruth/torchlm
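What data augmentation tools are available today? The best known are imgaug and albumentations, and torchvision also ships some commonly used image augmentations. imgaug and albumentations cover a large number of basic augmentations, including added noise, affine transforms, cropping, flipping, and rotation, and they also support transforming bounding boxes and keypoints. Their effects look like this:
In addition, Albumentations has published a paper, "Albumentations: fast and flexible image augmentations", which is quite a flex. So far albumentations includes more than 70 augmentation methods, covering images, bounding boxes, keypoints, and segmentation masks. These open-source libraries are very comprehensive and support many data types, and I use them often in my daily work.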
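After using them a lot, though, I gradually ran into some usability problems. For example, when I want to combine imgaug, albumentations, and torchvision, each has its own data type conventions, so I end up writing conversion helpers to bridge them, and the whole pipeline turns ugly.
As another example, when I wanted to write a keypoint transform in the albumentations style, I found that the class is expected to implement not only apply_to_keypoints but also apply_to_bboxes, apply_to_mask, and so on for the other data types. All I wanted was a keypoint-only transform that drops cleanly into a Compose pipeline. In the end I wrote an external function to do the job, which is not very Pythonic.
Finally, imgaug and albumentations do not check "data safety" very well. In albumentations, setting remove_invisible=True automatically drops out-of-bounds points, while False keeps them even when they lie outside the image. For keypoint detection, neither behavior is really appropriate. If out-of-bounds points are dropped, then after a chain of augmentations your 10 points may have become 8, and those 8 are most likely unusable because their indices no longer line up. If points outside the image are kept, they may introduce extra error into the training data. These are without question excellent open-source tools, but for keypoint detection specifically I still want a simpler, more consistent augmentation tool, without too much abstraction and with a one-line-code style.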
imgaug repository (12k+ stars): https://github.com/aleju/imgaug
albumentations repository (9k+ stars): https://github.com/albumentations-team/albumentations
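To make the "10 points become 8" problem concrete, here is a minimal NumPy sketch. It does not call imgaug or albumentations; the helper in_bounds and the sample coordinates are made up for illustration, and the two policies only mimic the "drop" and "keep" behaviors described above.

```python
import numpy as np

def in_bounds(landmarks: np.ndarray, width: int, height: int) -> np.ndarray:
    """Boolean mask of the (N, 2) xy points that lie inside a width x height image."""
    x, y = landmarks[:, 0], landmarks[:, 1]
    return (x >= 0) & (x < width) & (y >= 0) & (y < height)

# Ten hypothetical landmarks on a 100x100 image, indexed 0..9.
landmarks = np.stack([np.linspace(5, 95, 10), np.full(10, 50.0)], axis=1)

# A horizontal shift of +20 px pushes the rightmost points out of the image.
shifted = landmarks + np.array([20.0, 0.0])
mask = in_bounds(shifted, width=100, height=100)

dropped = shifted[mask]  # "drop" policy: fewer points, so the array no longer has one row per semantic index
kept = shifted           # "keep" policy: same count, but some points are off-image

print(len(landmarks), len(dropped), int((~mask).sum()))
```

With the "drop" policy the array length changes, so a model head that expects a fixed number of outputs can no longer consume it; with the "keep" policy the count is intact but two labels point outside the picture.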
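So what should this tool look like in my mind? I think it should be a "minimalist" toolbox that satisfies the following:
One-line code: compatible with methods from other mainstream libraries, in a single line of code.
Safety: unsafe augmentations are automatically rolled back and undone.
Zero-code automatic compatibility with numpy and Tensor data types: no conversion is required from the user.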
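No existing open-source tool seems to meet all of these requirements, so I wrote one myself, which became the transforms module of torchlm. I should stress that this is not a meaningful innovation, nor is it meant to compete with those mature open-source projects; it exists only to make these keypoint augmentations easier to use. torchlm provides nearly 30 keypoint augmentations of its own, and through torchlm.bind it is compatible, in one-line-code style, with 80+ augmentations from torchvision and albumentations. It also supports binding user-defined methods in one line and handles numpy and Tensor data types automatically, with no conversion required from the user. Both the nearly 30 native torchlm augmentations and the 80+ torchvision and albumentations augmentations bound via torchlm.bind are "safe": unsafe augmentations are automatically rolled back and undone.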
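Data format requirements: every transform takes and returns (img, landmarks), whether or not it actually changes the points; image-only transforms that do not move the points return the original landmarks unchanged. Both np.ndarray and torch.Tensor are accepted, since these are the two most common types. The list format used in albumentations is less common in practice: we often need to do math on the landmarks, and lists are inconvenient for that. img is RGB on input and output with shape (H, W, 3); landmarks are in xy format on input and output with shape (N, 2), where N is the number of points.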
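Class naming style: names follow the pattern LandmarksXXX or LandmarksRandomXXX, in line with the naming style of torchvision and albumentations. Random marks a random transform, for which a probability can be specified at construction. Non-Random transforms provided by torchlm are deterministic and always execute in a torchlm pipeline (transforms bound from other libraries are not necessarily always executed; for example, non-Random types in albumentations can still take a probability...).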
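Automatic data type conversion (autodtype): this problem comes up a lot. Say you have just happily turned your landmarks into a numpy array and want to mix torchvision's image-only transforms with albumentations, only to find that torchvision expects Tensor input while albumentations wants landmarks as a list. Hmm... numpy objects (does nobody care about being compatible with me?). The fix is not complicated: use a Python decorator to tag the function or bound transform for automatic type conversion. I wrote an autodtype decorator for exactly this kind of chore. It converts your data to the input type the function needs and, once the augmentation is done, converts it back to the original type.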
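The decorator idea can be sketched in a few lines. This is not torchlm's autodtype implementation: to stay dependency-light it converts between Python lists and NumPy arrays instead of NumPy arrays and Tensors, and the names autodtype_array and shift_right are hypothetical.

```python
import functools
import numpy as np

def autodtype_array(func):
    """Hypothetical decorator: call `func` with np.ndarray inputs, then
    convert each output back to the container type the caller passed in."""
    @functools.wraps(func)
    def wrapper(img, landmarks):
        img_was_list = isinstance(img, list)
        lms_was_list = isinstance(landmarks, list)
        out_img, out_lms = func(
            np.asarray(img), np.asarray(landmarks, dtype=np.float32)
        )
        # Restore the original types so the caller never sees the conversion.
        if img_was_list:
            out_img = out_img.tolist()
        if lms_was_list:
            out_lms = out_lms.tolist()
        return out_img, out_lms
    return wrapper

@autodtype_array
def shift_right(img: np.ndarray, landmarks: np.ndarray):
    # Array-only logic: relies on NumPy broadcasting, which a plain list lacks.
    return img, landmarks + np.array([1.0, 0.0], dtype=np.float32)

img = np.zeros((4, 4, 3), dtype=np.uint8)
_, lms_from_list = shift_right(img, [[0.0, 0.0], [2.0, 3.0]])    # list in, list out
_, lms_from_array = shift_right(img, np.array([[0.0, 0.0]]))     # array in, array out
```

The transform body only ever deals with one type, and the wrapper remembers what the caller passed so it can hand back the same kind of object.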
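Safety and simplicity of data augmentation: in keypoint detection we usually want the points to remain complete after a series of augmentations. For example, in 98-point face landmark detection, an augmentation that reduces the number of points or introduces extra error is not a good choice. That makes the "safety" of augmentation important. torchlm.bind automatically applies this safety check to any bound type or method, and every method in torchlm's transforms module supports it as well, so no strange points appear: when the number of points differs before and after a transform, the unsafe augmentation is automatically rolled back and undone.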
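A rollback check of this kind might look roughly like the sketch below. It is an illustration, not torchlm's code: safe_apply is a hypothetical helper, the point-count comparison follows the description above, and the extra in-bounds test is my own assumption about what "unsafe" could also include.

```python
import numpy as np

def safe_apply(transform, img, landmarks):
    """Run `transform`; return the untouched inputs if the result looks unsafe.

    Returns (img, landmarks, executed_flag).
    """
    try:
        new_img, new_lms = transform(img.copy(), landmarks.copy())
    except Exception:
        return img, landmarks, False  # runtime error: roll back
    h, w = new_img.shape[:2]
    # Criterion from the text: the number of points must not change.
    same_count = len(new_lms) == len(landmarks)
    # Assumption (not confirmed for torchlm): also reject off-image points.
    inside = same_count and bool(np.all(
        (new_lms[:, 0] >= 0) & (new_lms[:, 0] < w)
        & (new_lms[:, 1] >= 0) & (new_lms[:, 1] < h)
    ))
    if not inside:
        return img, landmarks, False  # unsafe result: roll back
    return new_img, new_lms, True

img = np.zeros((100, 100, 3), dtype=np.uint8)
lms = np.array([[10.0, 10.0], [90.0, 90.0]])

_, lms_drop, ok_drop = safe_apply(lambda i, l: (i, l[:-1]), img, lms)     # loses a point
_, lms_small, ok_small = safe_apply(lambda i, l: (i, l + 5.0), img, lms)  # stays inside
_, lms_big, ok_big = safe_apply(lambda i, l: (i, l + 20.0), img, lms)     # leaves the image
```

Because the wrapper works on copies, a rejected transform leaves the caller's image and landmarks exactly as they were.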
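About torchlm.bind: a few words on torchlm.bind specifically. Any torchvision, albumentations, or user-defined method bound through torchlm.bind automatically gains autodtype and the "safety" property. You only need to define a keypoint augmentation function as usual and leave the odds and ends to torchlm.bind. torchlm.bind also offers a useful parameter, prob: when it is given, torchlm.bind turns the bound transform or callable into a random-style one that executes with the given probability. Some torchvision transforms are not random and can be made random this way; user-defined functions likewise need no manual randomization, since setting prob in torchlm.bind gives a random version. The benefit is that methods from torchvision, albumentations, or your own code can all sit neatly in a single Compose pipeline, as in the following mixed example: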
import torchvision
import albumentations
import torchlm

# callable_array_noop and callable_tensor_noop are user-defined functions;
# see the custom-function example later in this article.
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind torchvision image only transforms, bind with a given prob
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),  # no manual type conversion needed
    torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
    # bind albumentations image only transforms
    torchlm.bind(albumentations.ColorJitter(p=0.5)),  # no need to check the point count externally either
    torchlm.bind(albumentations.GlassBlur(p=0.5)),
    # bind albumentations dual transforms
    torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
    torchlm.bind(albumentations.Rotate(p=0.5)),
    # bind custom callable array functions
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    # bind custom callable Tensor functions with a given prob, which turns them into random versions
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
    # ...
])
new_img, new_landmarks = transform(img, landmarks)  # img and landmarks can be np.ndarray or Tensor
See, doesn't this usage look much more elegant? torchlm.bind takes care of the odds and ends for you.
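The prob behavior described for torchlm.bind can be approximated with a small wrapper. The sketch below is a stand-in to show the idea, not torchlm's implementation; with_prob and invert are hypothetical names, and the optional rng argument exists only to make the demo reproducible.

```python
import random
from typing import Callable, Optional, Tuple
import numpy as np

Transform = Callable[[np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]

def with_prob(func: Transform, prob: float = 1.0,
              rng: Optional[random.Random] = None) -> Transform:
    """Wrap a deterministic (img, landmarks) function so it runs with probability `prob`."""
    rng = rng or random.Random()

    def wrapper(img, landmarks):
        # rng.random() is in [0, 1): prob=1.0 always runs, prob=0.0 never runs.
        if rng.random() >= prob:
            return img, landmarks  # skipped: pass the inputs through unchanged
        return func(img, landmarks)

    return wrapper

def invert(img: np.ndarray, landmarks: np.ndarray):
    # Image-only transform: the landmarks are returned as-is.
    return 255 - img, landmarks

img = np.zeros((2, 2, 3), dtype=np.uint8)
lms = np.zeros((1, 2), dtype=np.float32)

always = with_prob(invert, prob=1.0)
never = with_prob(invert, prob=0.0)
sometimes = with_prob(invert, prob=0.5, rng=random.Random(0))
hits = sum(int(sometimes(img, lms)[0].max() == 255) for _ in range(1000))
```

Since every wrapped callable keeps the same (img, landmarks) signature, random and non-random steps can be mixed freely in one list.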
torchlm is published on PyPI, so it can be installed with a single pip command.
pip3 install torchlm
# install from specific pypi mirrors use '-i'
pip3 install torchlm -i https://pypi.org/simple/
Alternatively, install from source on GitHub:
# clone torchlm repository locally
git clone --depth=1 https://github.com/DefTruth/torchlm.git
cd torchlm
# install in editable mode
pip install -e .
class LandmarksNormalize(LandmarksTransform):
    def __init__(
            self,
            mean: float = 127.5,
            std: float = 128.
    ):
        ...

class LandmarksUnNormalize(LandmarksTransform):
    def __init__(
            self,
            mean: float = 127.5,
            std: float = 128.
    ):
        ...
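As a rough illustration of what a mean/std normalization pair like the one above might compute, here is a minimal NumPy sketch. It assumes the usual (img - mean) / std formula, which this article does not confirm for torchlm; normalize and unnormalize are hypothetical helper names, and the landmarks are simply passed through.

```python
import numpy as np

def normalize(img: np.ndarray, landmarks: np.ndarray,
              mean: float = 127.5, std: float = 128.0):
    """Map uint8 pixel values to roughly [-1, 1]; landmarks are untouched."""
    return (img.astype(np.float32) - mean) / std, landmarks

def unnormalize(img: np.ndarray, landmarks: np.ndarray,
                mean: float = 127.5, std: float = 128.0):
    """Inverse of `normalize` under the same mean/std assumption."""
    return img * std + mean, landmarks

img = np.array([[[0, 127, 255]]], dtype=np.uint8)   # a single RGB pixel
lms = np.array([[0.0, 0.0]], dtype=np.float32)

norm_img, _ = normalize(img, lms)
restored, _ = unnormalize(norm_img, lms)
```

With the defaults 127.5 and 128, pixel value 0 maps to about -0.996 and 255 to about 0.996, and the round trip recovers the original values.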
class LandmarksToTensor(LandmarksTransform):
    def __init__(self):
        ...

class LandmarksToNumpy(LandmarksTransform):
    def __init__(self):
        ...

class LandmarksResize(LandmarksTransform):
    def __init__(
            self,
            size: Union[Tuple[int, int], int],
            keep_aspect: bool = False
    ):
        ...
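Resizing an image means the landmarks must be rescaled too. The sketch below shows the coordinate math only, under two assumptions that are mine rather than torchlm's documented behavior: a plain resize scales x and y independently, and keep_aspect is read as a letterbox resize with one uniform scale and centered padding. Both function names are hypothetical.

```python
import numpy as np

def resize_landmarks(landmarks, src_size, dst_size):
    """Scale (N, 2) xy landmarks from a (w, h) source to a (w, h) target."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = np.array([dst_w / src_w, dst_h / src_h], dtype=np.float32)
    return landmarks.astype(np.float32) * scale

def resize_landmarks_keep_aspect(landmarks, src_size, dst_size):
    """Letterbox variant (assumed): one uniform scale plus centering offsets."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    r = min(dst_w / src_w, dst_h / src_h)
    pad = np.array(
        [(dst_w - src_w * r) / 2, (dst_h - src_h * r) / 2], dtype=np.float32
    )
    return landmarks.astype(np.float32) * r + pad

lms = np.array([[0.0, 0.0], [100.0, 50.0]])  # points on a 200x100 image
plain = resize_landmarks(lms, (200, 100), (256, 256))
boxed = resize_landmarks_keep_aspect(lms, (200, 100), (256, 256))
```

In the plain version the top-left corner stays at (0, 0); in the letterbox version it moves down by the padding (64 px here), while the image center maps to the target center in both.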
class LandmarksClip(LandmarksTransform):
    def __init__(
            self,
            width_pad: float = 0.2,
            height_pad: float = 0.2,
            target_size: Union[Tuple[int, int], int] = None,
            **kwargs
    ):
        ...

class LandmarksRandomCenterCrop(LandmarksTransform):
    def __init__(
            self,
            width_range: Tuple[float, float] = (0.8, 1.0),
            height_range: Tuple[float, float] = (0.8, 1.0),
            prob: float = 0.5
    ):
        ...
transform = torchlm.LandmarksCompose([
    # ranges are (min, max), matching the defaults and the other examples in this article
    torchlm.LandmarksRandomCenterCrop(width_range=(0.5, 1.0), height_range=(0.5, 1.0), prob=1.),
    torchlm.LandmarksResize((256, 256))
])
class LandmarksHorizontalFlip(LandmarksTransform):
    """WARNING: HorizontalFlip augmentation mirrors the input image. When you apply
    that augmentation to keypoints that mark the side of body parts (left or right),
    those keypoints will point to the wrong side (since left on the mirrored image
    becomes right). So when you are creating an augmentation pipeline look carefully
    which augmentations could be applied to the input data. Also see:
    https://albumentations.ai/docs/getting_started/keypoints_augmentation/
    """
    def __init__(self):
        ...

class LandmarksRandomHorizontalFlip(LandmarksTransform):
    """WARNING: HorizontalFlip augmentation mirrors the input image. When you apply
    that augmentation to keypoints that mark the side of body parts (left or right),
    those keypoints will point to the wrong side (since left on the mirrored image
    becomes right). So when you are creating an augmentation pipeline look carefully
    which augmentations could be applied to the input data. Also see:
    https://albumentations.ai/docs/getting_started/keypoints_augmentation/
    """
    def __init__(
            self,
            prob: float = 0.5
    ):
        ...
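The warning in the docstrings above is easiest to see with a tiny example. One common remedy, sketched here, is to swap the indices of left/right pairs after mirroring. This is not something the torchlm classes above are stated to do; hflip_landmarks and swap_pairs are hypothetical, and the x' = (W - 1) - x convention assumes pixel-center coordinates.

```python
import numpy as np

def hflip_landmarks(landmarks: np.ndarray, width: int, swap_pairs=()):
    """Mirror (N, 2) xy landmarks horizontally, then swap left/right index pairs."""
    flipped = landmarks.astype(np.float32).copy()
    flipped[:, 0] = (width - 1) - flipped[:, 0]
    for left, right in swap_pairs:
        # Fancy indexing on the right-hand side copies, so this is a true swap.
        flipped[[left, right]] = flipped[[right, left]]
    return flipped

# Index 0: eye on the image's left, index 1: eye on the image's right, index 2: nose.
lms = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0]])

naive = hflip_landmarks(lms, width=101)                       # index 0 now sits on the right
fixed = hflip_landmarks(lms, width=101, swap_pairs=[(0, 1)])  # index 0 is on the left again
# The image itself would be mirrored with img[:, ::-1].
```

Without the swap, index 0 ends up at x=70, so a model trained on both versions sees contradictory labels for the same index.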
class LandmarksAlign(LandmarksTransform):
    def __init__(
            self,
            eyes_index: Union[Tuple[int, int], List[int]] = None
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomRotate(80, prob=1.),  # add a rotation first
    torchlm.LandmarksRandomAlign(eyes_index=(96, 97), prob=1.),  # then align to see the effect
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomAlign(LandmarksTransform):
    def __init__(
            self,
            eyes_index: Union[Tuple[int, int], List[int]] = None,
            prob: float = 0.5
    ):
        ...
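Alignment by eye positions usually means rotating so that the line through the two eyes becomes horizontal. The sketch below shows that geometry on landmarks only; it is my reading of what an eyes_index parameter suggests, not torchlm's actual code, and a real transform would apply the same rotation to the image. align_by_eyes is a hypothetical name.

```python
import numpy as np

def align_by_eyes(landmarks: np.ndarray, eyes_index) -> np.ndarray:
    """Rotate (N, 2) xy landmarks about the eye midpoint so the eye line is horizontal."""
    left = landmarks[eyes_index[0]]
    right = landmarks[eyes_index[1]]
    dx, dy = right - left
    angle = np.arctan2(dy, dx)               # current tilt of the eye line
    c, s = np.cos(-angle), np.sin(-angle)    # rotate by the opposite angle
    rot = np.array([[c, -s], [s, c]])
    center = (left + right) / 2.0
    # Row-vector form of R @ (p - center) + center.
    return (landmarks - center) @ rot.T + center

lms = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # eyes tilted by 45 degrees
aligned = align_by_eyes(lms, eyes_index=(0, 1))
```

After alignment both eyes share the same y coordinate, and the distance between them is unchanged because a rotation preserves lengths.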
class LandmarksRandomScale(LandmarksTransform):
    def __init__(
            self,
            scale: Union[Tuple[float, float], float] = 0.4,
            prob: float = 0.5,
            diff: bool = True
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(scale=(-0.5, 1.5), prob=1.),
    torchlm.LandmarksResize((256, 256), keep_aspect=True)
])

class LandmarksRandomShear(LandmarksTransform):
    def __init__(
            self,
            shear_factor: Union[Tuple[float, float], List[float], float] = 0.2,
            prob: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomShear(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomHSV(LandmarksTransform):
    def __init__(
            self,
            hue: Union[Tuple[int, int], int] = 20,
            saturation: Union[Tuple[int, int], int] = 20,
            brightness: Union[Tuple[int, int], int] = 20,
            prob: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomHSV(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomTranslate(LandmarksTransform):
    def __init__(
            self,
            translate: Union[Tuple[float, float], float] = 0.2,
            prob: float = 0.5,
            diff: bool = False
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomTranslate(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomRotate(LandmarksTransform):
    def __init__(
            self,
            angle: Union[Tuple[int, int], List[int], int] = 10,
            prob: float = 0.5,
            bins: Optional[int] = None
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomRotate(angle=80, prob=1.),
    torchlm.LandmarksResize((256, 256))
])
class LandmarksRandomBlur(LandmarksTransform):
    def __init__(
            self,
            kernel_range: Tuple[int, int] = (3, 11),
            prob: float = 0.5,
            sigma_range: Tuple[int, int] = (0, 4)
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksResize((256, 256)),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 35), prob=1.)
])

class LandmarksRandomBrightness(LandmarksTransform):
    def __init__(
            self,
            brightness: Tuple[float, float] = (-30., 30.),
            contrast: Tuple[float, float] = (0.5, 1.5),
            prob: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomBrightness(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomMask(LandmarksTransform):
    def __init__(
            self,
            mask_ratio: float = 0.1,
            prob: float = 0.5,
            trans_ratio: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomMask(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomMaskMixUp(LandmarksTransform):
    def __init__(
            self,
            mask_ratio: float = 0.25,
            prob: float = 0.5,
            trans_ratio: float = 0.5,
            alpha: float = 0.9
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomMaskMixUp(prob=1.),
    torchlm.LandmarksResize((256, 256))
])
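The article does not spell out how the mask and MixUp variants combine pixels, so the following is only a guess at the general idea: occlude a random rectangle, blending the occluder with the underlying pixels by a weight alpha instead of overwriting them. Treating mask_ratio as a fraction of each image side, using a black occluder, and the name random_mask_mixup are all assumptions of this sketch.

```python
import numpy as np

def random_mask_mixup(img, landmarks, mask_ratio=0.25, alpha=0.9, rng=None):
    """Blend a black rectangle into a random region of `img` (landmarks unchanged)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    # Assumption: mask_ratio is the fraction of each side covered by the patch.
    mh, mw = max(1, int(h * mask_ratio)), max(1, int(w * mask_ratio))
    y0 = int(rng.integers(0, h - mh + 1))
    x0 = int(rng.integers(0, w - mw + 1))

    out = img.astype(np.float32).copy()
    mask = np.zeros((mh, mw, img.shape[2]), dtype=np.float32)  # black occluder
    region = out[y0:y0 + mh, x0:x0 + mw]
    # MixUp-style blend: alpha weights the occluder, (1 - alpha) the original pixels.
    out[y0:y0 + mh, x0:x0 + mw] = alpha * mask + (1.0 - alpha) * region
    out = np.clip(np.rint(out), 0, 255).astype(img.dtype)
    return out, landmarks

img = np.full((40, 40, 3), 200, dtype=np.uint8)
lms = np.array([[20.0, 20.0]], dtype=np.float32)
out_img, out_lms = random_mask_mixup(img, lms, rng=np.random.default_rng(0))
```

With alpha=0.9 the occluded region keeps 10% of the original intensity, so an occluded face stays faintly visible while the landmark labels remain valid.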
class LandmarksRandomPatches(LandmarksTransform):
    def __init__(
            self,
            patch_dirs: List[str] = None,
            patch_ratio: float = 0.15,
            prob: float = 0.5,
            trans_ratio: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomPatches(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomPatchesMixUp(LandmarksTransform):
    def __init__(
            self,
            patch_dirs: List[str] = None,
            patch_ratio: float = 0.2,
            prob: float = 0.5,
            trans_ratio: float = 0.5,
            alpha: float = 0.9
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomPatchesMixUp(alpha=0.5, prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomBackground(LandmarksTransform):
    def __init__(
            self,
            background_dirs: List[str] = None,
            prob: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomBackground(prob=1.),
    torchlm.LandmarksResize((256, 256))
])

class LandmarksRandomBackgroundMixUp(LandmarksTransform):
    def __init__(
            self,
            background_dirs: List[str] = None,
            alpha: float = 0.3,
            prob: float = 0.5
    ):
        ...

transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomBackgroundMixUp(alpha=0.5, prob=1.),
    torchlm.LandmarksResize((256, 256))
])
class BindAlbumentationsTransform(LandmarksTransform):
    def __init__(
            self,
            transform: Albumentations_Transform_Type,
            prob: float = 1.0
    ):
        ...

class BindTorchVisionTransform(LandmarksTransform):
    def __init__(
            self,
            transform: TorchVision_Transform_Type,
            prob: float = 1.0
    ):
        ...

class BindArrayCallable(LandmarksTransform):
    def __init__(
            self,
            call_func: Callable_Array_Func_Type,
            prob: float = 1.0
    ):
        ...

class BindTensorCallable(LandmarksTransform):
    def __init__(
            self,
            call_func: Callable_Tensor_Func_Type,
            prob: float = 1.0
    ):
        ...

class LandmarksCompose(object):
    def __init__(
            self,
            transforms: List[LandmarksTransform]
    ):
        ...
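Because every transform shares the (img, landmarks) in, (img, landmarks) out convention, composing them is straightforward. The sketch below shows the general pattern with a hypothetical Compose class and two toy landmark-only steps; it is not the source of torchlm.LandmarksCompose.

```python
import numpy as np

class Compose:
    """Chain callables that each map (img, landmarks) -> (img, landmarks)."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, img, landmarks):
        for t in self.transforms:
            img, landmarks = t(img, landmarks)
        return img, landmarks

def translate(dx: float, dy: float):
    # Toy step: shifts the landmarks only; a real transform would shift the image too.
    def apply(img, landmarks):
        return img, landmarks + np.array([dx, dy])
    return apply

def scale(factor: float):
    # Toy step: scales the landmarks only.
    def apply(img, landmarks):
        return img, landmarks * factor
    return apply

pipeline = Compose([translate(10.0, 0.0), scale(2.0)])
img = np.zeros((8, 8, 3), dtype=np.uint8)
out_img, out_lms = pipeline(img, np.array([[1.0, 2.0]]))
```

The steps run in list order, so the point (1, 2) is first shifted to (11, 2) and then scaled to (22, 4).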
An example pipeline is shown below; usage is simple.
import torchlm

transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomTranslate(prob=0.5),
    torchlm.LandmarksRandomShear(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5),
    # ...
])
With torchlm.bind, 80+ augmentations from torchvision and albumentations become compatible in one line of code, with data type conversion and the "safety" check handled automatically.
import torchvision
import albumentations
import torchlm

transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind torchvision image only transforms, bind with a given prob
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
    torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
    # bind albumentations image only transforms
    torchlm.bind(albumentations.ColorJitter(p=0.5)),
    torchlm.bind(albumentations.GlassBlur(p=0.5)),
    # bind albumentations dual transforms
    torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
    torchlm.bind(albumentations.Rotate(p=0.5)),
    # ...
])
torchlm.bind can also bind user-defined augmentation methods in one line of code, again handling data type conversion and the "safety" check automatically.
import numpy as np
from typing import Tuple
from torch import Tensor
import torchlm

# First, define your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]:
    # do some transform here ...
    return img, landmarks

# Then, bind your functions and put them into the transforms pipeline.
transform = torchlm.LandmarksCompose([
    # use native torchlm transforms
    torchlm.LandmarksRandomScale(prob=0.5),
    # bind custom callable array functions
    torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
    # bind custom callable Tensor functions with a given prob
    torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
    # ...
])
torchlm offers thoughtful global debug settings: a few global options make it easier to debug your augmentations and pinpoint where things go wrong.
import torchlm
# some global setting
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)
With these global options set to True, each run of the augmentation pipeline prints some useful information to help you inspect and verify what happened.
LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False
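Execution Flag: True means the transform was executed successfully; False means it was not, either because it was skipped by the random probability or because a runtime exception occurred (when debug mode is True, torchlm interrupts the pipeline and raises detailed exception information).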
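AutoDtype Info: if you accidentally pass a Tensor to a transform that expects a numpy array, nothing breaks. torchlm uses the autodtype decorator to handle the different data types and automatically converts the output back to the original type once the transform is done.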
import cv2
import numpy as np
import torchvision
import albumentations
from torch import Tensor
from typing import Tuple
import torchlm

def callable_array_noop(
        img: np.ndarray,
        landmarks: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    # Do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(
        img: Tensor,
        landmarks: Tensor
) -> Tuple[Tensor, Tensor]:
    # Do some transform here ...
    return img, landmarks

def test_torchlm_transforms_pipeline():
    print(f"torchlm version: {torchlm.__version__}")
    seed = np.random.randint(0, 1000)
    np.random.seed(seed)

    img_path = "./2.jpg"
    anno_path = "./2.txt"
    save_path = f"./logs/2_wflw_{seed}.jpg"
    img = cv2.imread(img_path)[:, :, ::-1].copy()  # RGB
    with open(anno_path, 'r') as fr:
        lm_info = fr.readlines()[0].strip('\n').split(' ')
    landmarks = [float(x) for x in lm_info[:196]]
    landmarks = np.array(landmarks).reshape(98, 2)  # (5,2) or (98, 2) for WFLW

    # some global setting will show you useful details
    torchlm.set_transforms_debug(True)
    torchlm.set_transforms_logging(True)
    torchlm.set_autodtype_logging(True)

    transform = torchlm.LandmarksCompose([
        # use native torchlm transforms
        torchlm.LandmarksRandomScale(prob=0.5),
        torchlm.LandmarksRandomTranslate(prob=0.5),
        torchlm.LandmarksRandomShear(prob=0.5),
        torchlm.LandmarksRandomMask(prob=0.5),
        torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
        torchlm.LandmarksRandomBrightness(prob=0.),
        torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
        torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5),
        # bind torchvision image only transforms with a given bind prob
        torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),
        torchlm.bind(torchvision.transforms.RandomAutocontrast(p=0.5)),
        torchlm.bind(torchvision.transforms.RandomAdjustSharpness(sharpness_factor=3, p=0.5)),
        # bind albumentations image only transforms
        torchlm.bind(albumentations.ColorJitter(p=0.5)),
        torchlm.bind(albumentations.GlassBlur(p=0.5)),
        torchlm.bind(albumentations.RandomShadow(p=0.5)),
        # bind albumentations dual transforms
        torchlm.bind(albumentations.RandomCrop(height=200, width=200, p=0.5)),
        torchlm.bind(albumentations.RandomScale(p=0.5)),
        torchlm.bind(albumentations.Rotate(p=0.5)),
        # bind custom callable array functions with a given bind prob
        torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array, prob=0.5),
        # bind custom callable Tensor functions
        torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5),
        torchlm.LandmarksResize((256, 256)),
        torchlm.LandmarksNormalize(),
        torchlm.LandmarksToTensor(),
        torchlm.LandmarksToNumpy(),
        torchlm.LandmarksUnNormalize()
    ])

    trans_img, trans_landmarks = transform(img, landmarks)
    new_img = torchlm.draw_landmarks(trans_img, trans_landmarks, circle=2)
    cv2.imwrite(save_path, new_img[:, :, ::-1])

    # unset the global status when you are in training process
    torchlm.set_transforms_debug(False)
    torchlm.set_transforms_logging(False)
    torchlm.set_autodtype_logging(False)
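See, isn't the whole augmentation pipeline much more elegant now? torchlm's native transforms and those from torchvision and albumentations all fit naturally into one flow, and you no longer have to care whether the input is a numpy array or a Tensor. When you want to add a custom keypoint augmentation to the pipeline, all you need to do is define the method; torchlm.bind handles the odds and ends for you.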
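One-line code: compatible with methods from other mainstream libraries, in a single line of code.
Safety: unsafe augmentations are automatically rolled back and undone.
Zero-code automatic compatibility with numpy and Tensor data types: no conversion is required from the user.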
For more documentation, see the torchlm home page:
GitHub (stars and PRs welcome): https://github.com/DefTruth/torchlm
PyPI download stats: https://pepy.tech/project/torchlm
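About the author (contracted author at 极市平台)
DefTruth
Zhihu: DefTruth
A liberal arts graduate short on talent, graphics/AI algorithm engineer, and hobbyist inference-engine tinkerer.
Research areas: computer vision (detection / segmentation / matting / recognition / tracking) and computer graphics (animation driving / UE4);
tinkers with inference engines in spare time and loves open source.
Motto: keep learning; done is better than perfect.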