Finetuner makes neural network fine-tuning easier and faster by streamlining the workflow and handling all the complexity and infrastructure requirements in the cloud. With Finetuner, one can easily enhance the performance of pre-trained models and make them production-ready without expensive hardware.
This release covers Finetuner version 0.6.7, including dependencies finetuner-api 0.4.8 and finetuner-core 0.11.4.
This release contains 4 new features.
🆕 Features
Add support for cross-modal evaluation in the `EvaluationCallback` (#615)
In previous versions of Finetuner, when using the `EvaluationCallback` to calculate IR metrics, you could only use a single model to encode both the query and the index data. This meant that when training multiple models at the same time, as in CLIP fine-tuning, only one of the encoders could be used for evaluation. It is now possible to do cross-modal evaluation, where one model encodes the query data and a second model encodes the index data. This is useful in multi-modal tasks like text-to-image search.
To perform cross-modal evaluation, all you need to do is specify the `model` and `index_model` arguments in the `EvaluationCallback`, like so:
```python
import finetuner
from finetuner.callback import EvaluationCallback

run = finetuner.fit(
    model='openai/clip-vit-base-patch32',
    train_data=train_data,
    eval_data=eval_data,
    loss='CLIPLoss',
    callbacks=[
        EvaluationCallback(
            query_data=query_data,
            index_data=index_data,
            model='clip-text',
            index_model='clip-vision'
        )
    ]
)
```
See the `EvaluationCallback` section of the Finetuner documentation for details on using this callback. See also the sections Text-to-Image Search via CLIP and Using MCLIP for concrete examples of cross-modal evaluation.
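Once the run has finished, the metrics computed by the `EvaluationCallback` appear in the run logs. A minimal sketch of inspecting them, assuming the standard `Run` methods `status()` and `logs()`:

```python
# Check the run status; once the run has finished, the logs contain
# the IR metrics computed by the EvaluationCallback.
print(run.status())
print(run.logs())
```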
Add support for Multilingual CLIP (#611)
Finetuner now supports a Multilingual CLIP model from the OpenCLIP project. Multilingual CLIP models are trained on large text and image datasets in multiple languages using the CLIP contrastive learning approach. They are a good fit for text-to-image applications where the texts are in languages other than English.
The currently supported Multilingual CLIP model, `xlm-roberta-base-ViT-B-32::laion5b_s13b_b90k`, uses a ViT-B/32 image encoder and an XLM-RoBERTa base text encoder.
You can find details on how to fine-tune this specific model in the Multilingual Text-to-Image search with MultilingualCLIP section of the documentation.
```python
import finetuner

run = finetuner.fit(
    model='xlm-roberta-base-ViT-B-32::laion5b_s13b_b90k',
    train_data=train_data,
    eval_data=eval_data,
    epochs=5,
    learning_rate=1e-6,
    loss='CLIPLoss',
    device='cuda',
)
```
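As with other CLIP runs, fine-tuning this model produces two encoders, one for text and one for images. The sketch below shows one way to pull the text encoder back for encoding queries; it assumes the `finetuner.get_model`/`finetuner.encode` helpers, the `select_model` argument for CLIP artifacts, and a previously prepared `query_data` DocumentArray:

```python
import finetuner

# Retrieve the fine-tuned text encoder from the run's artifact;
# select_model='clip-vision' would return the image encoder instead.
text_encoder = finetuner.get_model(run.artifact_id, select_model='clip-text')

# Embed the multilingual query documents in place.
finetuner.encode(model=text_encoder, data=query_data)
```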
Filter models by task in `finetuner.describe_models()` (#610)
The `finetuner.describe_models()` function, which provides an overview of supported model backbones, now accepts an optional `task` argument that filters the models by task. To display all models, you can simply omit the argument.
```python
import finetuner

finetuner.describe_models()
```
To filter based on task, you need to provide a valid task name. For example:
```python
finetuner.describe_models(task='image-to-image')
```
or
```python
finetuner.describe_models(task='text-to-image')
```
Currently valid task names are `text-to-text`, `text-to-image` and `image-to-image`.
Configure the `num_items_per_class` argument in `finetuner.fit()` (#614)
The `finetuner.fit()` method now includes a new argument `num_items_per_class` that lets you set the number of items per label included in each batch. This gives you the ability to further tailor batch construction to your needs. If not set, this argument defaults to 4, which is compatible with previous versions of Finetuner.
You can easily set this when calling `finetuner.fit()`:
```python
import finetuner

run = finetuner.fit(
    model='efficientnet_b0',
    train_data=train_data,
    eval_data=eval_data,
    batch_size=128,
    num_items_per_class=8,
)
```
Note that `batch_size % num_items_per_class == 0` must hold. Otherwise, Finetuner cannot respect the given `num_items_per_class` and throws an error.
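As a quick sketch of the arithmetic behind this constraint, using the values from the example above:

```python
batch_size = 128
num_items_per_class = 8

# 128 % 8 == 0, so this combination is valid: each batch holds
# 128 / 8 = 16 distinct classes with 8 items sampled per class.
# num_items_per_class=10 would fail here, since 128 % 10 != 0.
assert batch_size % num_items_per_class == 0
```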
🤟 Contributors

We would like to thank all contributors to this release:
- Wang Bo (@bwanglzu)
- Michael Günther (@guenthermi)
- Louis Milliken (@LMMilliken)
- George Mastrapas (@gmastrapas)