Releases: allenai/allennlp
v2.0.0rc1
This is the first (and hopefully only) release candidate for AllenNLP 2.0. Please note that this is a release candidate, and the APIs are still subject to change until the final 2.0 release. We'll provide a detailed writeup with the final 2.0 release, including a migration guide. In the meantime, here are the headline features of AllenNLP 2.0:
- Support for models that combine language and vision features
- Transformer Toolkit, a suite of classes and components that make it easy to experiment with transformer architectures
- A framework for multitask training
- Revamped data loading, for improved performance and flexibility
What's new
Added 🎉
- Added `TensorCache` class for caching tensors on disk.
- Added abstraction and concrete implementation for image loading.
- Added abstraction and concrete implementation for `GridEmbedder`.
- Added abstraction and demo implementation for an image augmentation module.
- Added abstraction and concrete implementation for region detectors.
- A new high-performance default `DataLoader`: `MultiProcessDataLoading`.
- A `MultiTaskModel` and abstractions to use with it, including `Backbone` and `Head`. The `MultiTaskModel` first runs its inputs through the `Backbone`, then passes the result (and whatever other relevant inputs it got) to each `Head` that's in use.
- A `MultiTaskDataLoader`, with a corresponding `MultiTaskDatasetReader`, and a couple of new configuration objects: `MultiTaskEpochSampler` (for deciding what proportion to sample from each dataset at every epoch) and a `MultiTaskScheduler` (for ordering the instances within an epoch).
- Transformer toolkit to plug and play with modular components of transformer architectures.
- Added a command to count the number of instances we're going to be training with.
- Added a `FileLock` class to `common.file_utils`. This is just like the `FileLock` from the `filelock` library, except that it adds an optional flag `read_only_ok: bool`, which when set to `True` changes the behavior so that a warning will be emitted instead of an exception when lacking write permissions on an existing file lock. This makes it possible to use the `FileLock` class on a read-only file system.
- Added a new learning rate scheduler: `CombinedLearningRateScheduler`. This can be used to combine different LR schedulers, using one after the other.
- Added an official CUDA 10.1 Docker image.
- Moving `ModelCard` and `TaskCard` abstractions into the main repository.
- Added a util function `allennlp.nn.util.dist_reduce(...)` for handling distributed reductions. This is especially useful when implementing a distributed `Metric`.
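The backbone-then-heads flow described for `MultiTaskModel` can be illustrated with a small, framework-free sketch. Everything here is a hypothetical stand-in (`multitask_forward`, `toy_backbone`, the two toy heads, and the `name_key` output naming are assumptions made for illustration); it is not the AllenNLP implementation, only the control flow the note describes.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins: plain callables play the roles of Backbone and Head.
Backbone = Callable[[Dict[str, Any]], Dict[str, Any]]
Head = Callable[[Dict[str, Any]], Dict[str, Any]]


def multitask_forward(
    backbone: Backbone,
    heads: Dict[str, Head],
    inputs: Dict[str, Any],
    active_heads: List[str],
) -> Dict[str, Any]:
    # 1. Run the shared backbone once over the inputs.
    encoded = backbone(inputs)
    # 2. Each active head sees the backbone result plus the original inputs.
    combined = {**inputs, **encoded}
    outputs: Dict[str, Any] = {}
    for name in active_heads:
        for key, value in heads[name](combined).items():
            # Prefix outputs with the head name so heads cannot collide.
            outputs[f"{name}_{key}"] = value
    return outputs


def toy_backbone(inputs: Dict[str, Any]) -> Dict[str, Any]:
    # "Encode" each token as its length.
    return {"encoded": [len(token) for token in inputs["tokens"]]}


def length_head(combined: Dict[str, Any]) -> Dict[str, Any]:
    return {"score": sum(combined["encoded"])}


def count_head(combined: Dict[str, Any]) -> Dict[str, Any]:
    return {"score": len(combined["tokens"])}


result = multitask_forward(
    toy_backbone,
    {"length": length_head, "count": count_head},
    {"tokens": ["multi", "task"]},
    active_heads=["length", "count"],
)
print(result)
```

The point of the design is that the expensive shared computation happens once per batch, while only the heads relevant to that batch do any work.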
Changed ⚠️
- `DatasetReader`s are now always lazy. This means there is no `lazy` parameter in the base class, and the `_read()` method should always be a generator.
- The `DataLoader` now decides whether to load instances lazily or not. With the `PyTorchDataLoader` this is controlled with the `lazy` parameter, but with the `MultiProcessDataLoading` this is controlled by the `max_instances_in_memory` setting.
- `ArrayField` is now called `TensorField`, and implemented in terms of torch tensors, not numpy.
- Improved `nn.util.move_to_device` function by avoiding an unnecessary recursive check for tensors and adding a `non_blocking` optional argument, which is the same argument as in `torch.Tensor.to()`.
- If you are trying to create a heterogeneous batch, you now get a better error message.
- Readers using the new vision features now explicitly log how they are featurizing images.
- `master_addr` and `master_port` renamed to `primary_addr` and `primary_port`, respectively.
- `is_master` parameter for training callbacks renamed to `is_primary`.
- `master` branch renamed to `main`.
- Torch version bumped to 1.7.1 in Docker images.
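To make the "always lazy" change concrete, here is a minimal sketch of a reader whose `_read()` is a generator, with the loader side deciding how many instances to hold at once. `ToyReader` and `chunks_in_memory` are hypothetical names, and the chunking only loosely imitates what a setting like `max_instances_in_memory` might do; the real classes may behave differently.

```python
import itertools
from typing import Dict, Iterable, Iterator, List


class ToyReader:
    """Hypothetical reader; not a subclass of AllenNLP's DatasetReader."""

    def _read(self, lines: Iterable[str]) -> Iterator[Dict[str, List[str]]]:
        # A generator: instances are produced one at a time, never as a list.
        for line in lines:
            line = line.strip()
            if not line:
                continue
            yield {"tokens": line.split()}


def chunks_in_memory(
    instances: Iterable[Dict[str, List[str]]], max_instances_in_memory: int
) -> Iterator[List[Dict[str, List[str]]]]:
    # The loader, not the reader, decides how much to materialize at once.
    iterator = iter(instances)
    while True:
        chunk = list(itertools.islice(iterator, max_instances_in_memory))
        if not chunk:
            return
        yield chunk


reader = ToyReader()
lines = ["a b", "", "c d e", "f"]
chunks = list(chunks_in_memory(reader._read(lines), max_instances_in_memory=2))
print([len(chunk) for chunk in chunks])
```

A reader written this way works unchanged whether the loader consumes everything eagerly or streams a few instances at a time.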
Removed 👋
- Removed `nn.util.has_tensor`.
Fixed ✅
- The `build-vocab` command no longer crashes when the resulting vocab file is in the current working directory.
- Fixed typo with `LabelField` string representation: removed trailing apostrophe.
- `Vocabulary.from_files` and `cached_path` will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
- `TrackEpochCallback` is now an `EpochCallback`.
Commits
9a4a424 Moves vision models to allennlp-models (#4918)
412896b fix merge conflicts
ed322eb A helper for distributed reductions (#4920)
9ab2bf0 add CUDA 10.1 Docker image (#4921)
d82287e Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872)
5497394 Multitask example (#4898)
0f00d4d resolve _read type (#4916)
5229da8 Toolkit decoder (#4914)
4183a49 Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880)
d7c9eab improve worker error handling in MultiProcessDataLoader (#4912)
94dd9cc rename 'master' -> 'primary' for distributed training (#4910)
c9585af fix imports in file_utils
03c7ffb Merge branch 'main' into vision
effcc4e improve data loading docs (#4909)
2f54570 remove PyTorchDataLoader, add SimpleDataLoader for testing (#4907)
31ec6a5 MultiProcessDataLoader takes PathLike data_path (#4908)
5e3757b rename 'multi_process_*' -> 'multiprocess' for consistency (#4906)
df36636 Data loading cuda device (#4879)
aedd3be Toolkit: Cleaning up TransformerEmbeddings (#4900)
54e85ee disable codecov annotations (#4902)
2623c4b Making TrackEpochCallback an EpochCallback (#4893)
1d21c75 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867)
ec197c3 Create pull_request_template.md (#4891)
15d32da Make GQA work (#4884)
fbab0bd import MultiTaskDataLoader to data_loaders/__init__.py (#4885)
d1cc146 Merge branch 'main' into vision
abacc01 Adding f1 score (#4890)
9cf41b2 fix navbar link
9635af8 rename 'master' -> 'main' (#4887)
d0a07fb docs: fix simple typo, multplication -> multiplication (#4883)
d1f032d Moving modelcard and taskcard abstractions to main repo (#4881)
f62b819 Make images easier to find for Visual Entailment (#4878)
1fff7ca Update docker torch version (#4873)
7a7c7ea Only cache, no featurizing (#4870)
d2aea97 Fix typo in str (#4874)
1c72a30 Merge branch 'master' into vision
6a8d425 add CombinedLearningRateScheduler (#4871)
85d38ff doc fixes
c4e3f77 Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821)
3da8e62 Merge branch 'master' into vision
a3732d0 Fix cache volume (#4869)
832901e Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)
147fefe Merge branch 'master' into vision
87e3536 Make tests work again (#4865)
d16a5c7 Merge remote-tracking branch 'origin/master' into vision
457e56e Merge branch 'master' into vision
c8521d8 Toolkit: Adding documentation and small changes for BiModalAttention (#4859)
ddbc740 gqa reader fixes during vilbert training (#4851)
50e50df Generalizing transformer layers (#4776)
52fdd75 adding multilabel option (#4843)
7887119 Other VQA datasets (#4834)
e729e9a Added GQA reader (#4832)
52e9dd9 Visual entailment model code (#4822)
01f3a2d Merge remote-tracking branch 'origin/master' into vision
3be6c97 SNLI_VE dataset reader (#4799)
b659e66 VQAv2 (#4639)
c787230 Merge remote-tracking branch 'origin/master' into vision
db2d1d3 Merge branch 'master' into vision
6bf1924 Merge branch 'master' into vision
167bcaa remove vision push trigger
7591465 Merge remote-tracking branch 'origin/master' into vision
22d4633 improve independence of vision components (#4793)
98018cc fix merge conflicts
c780315 fix merge conflicts
5d22ce6 Merge remote-tracking branch 'origin/master' into vision
602399c update with master
ffafaf6 Multitask data loading and scheduling (#4625)
7c47c3a Merge branch 'master' into vision
12c8d1b Generalizing self attention (#4756)
63f61f0 Merge remote-tracking branch 'origin/master' into vision
b48347b Merge remote-tracking branch 'origin/master' into vision
81892db fix failing tests
98edd25 update torch requirement
8da3508 update with master
cc53afe separating TransformerPooler as a new module (#4730)
4ccfa88 Transformer toolkit: BiModalEncoder now has separate num_attention_heads for both modalities (#4728)
91631ef Transformer toolkit (#4577)
677a9ce Merge remote-tracking branch 'origin/master' into vision
2985236 This should have been part of the previously merged PR
c5d264a Detectron NLVR2 (#4481)
e39a5f6 Merge remote-tracking branch 'origin/master' into vision
f1e46fd Add MultiTaskModel (#4601)
fa22f73 Merge remote-tracking branch 'origin/master' into vision
41872ae Merge remote-tracking branch 'origin/master' into vision
f886fd0 Merge remote-tracking branch 'origin/master' into vision
191b641 make existing readers work with multi-process loading (#4597)
d7124d4 fix len calculation for new data loader (#4618)
8746361 Merge branch 'master' into vision
319794a remove duplicate padding calculations in collate fn (#4617)
de9165e rename 'node_rank' to 'global_rank' in dataset reader 'DistributedInfo' (#4608)
3d11419 Formatting updates for new version of black (#4607)
cde06e6 Changelog
1b08fd6 ensure models check runs on right branch
44c8791 ensure vision CI runs on each commit (#4582)
95e8253 Merge branch 'master' into vision
e74a736 new data loading (#4497)
6f82005 Merge remote-tracking branch 'origin/master' into vision
a7d45de Initializing a VilBERT model from a pre-trained transformer (#4495)
3833f7a Merge branch 'master' into vision
71d7cb4 Merge branch 'master' into vision
3137961 Merge remote-tracking branch 'origin/master' into vision
6cc508d Merge branch 'master' into vision
f87df83 Merge remote-tracking branch 'origin/master' into vision
0bbe84b An initial VilBERT model for NLVR...
v1.3.0
What's new
Added 🎉
- Added links to source code in docs.
- Added `get_embedding_layer` and `get_text_field_embedder` to the `Predictor` class; to specify embedding layers for non-AllenNLP models.
- Added Gaussian Error Linear Unit (GELU) as an Activation.
Changed ⚠️
- Renamed module `allennlp.data.tokenizers.token` to `allennlp.data.tokenizers.token_class` to avoid this bug.
- `transformers` dependency updated to version 4.0.1.
Fixed ✅
- Fixed a lot of instances where tensors were first created and then sent to a device with `.to(device)`. Instead, these tensors are now created directly on the target device.
- Fixed issue with `GradientDescentTrainer` when constructed with `validation_data_loader=None` and `learning_rate_scheduler!=None`.
- Fixed a bug when removing all handlers in root logger.
- `ShardedDatasetReader` now inherits parameters from `base_reader` when required.
- Fixed an issue in `FromParams` where parameters in the `params` object used to construct a class were not passed to the constructor if the value of the parameter was equal to the default value. This caused bugs in some edge cases where a subclass that takes `**kwargs` needs to inspect `kwargs` before passing them to its superclass.
- Improved the band-aid solution for segmentation faults and the "ImportError: dlopen: cannot load any more object with static TLS" by adding a `transformers` import.
- Added safety checks for extracting tar files.
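The release notes do not say what the tar safety checks consist of. A common check of this kind rejects archive members whose paths would land outside the extraction directory; the sketch below shows that idea using only the standard library. The helper names (`is_within_directory`, `unsafe_members`, `build_tar`) are hypothetical, and this is an assumption about the general technique, not a copy of the AllenNLP code.

```python
import io
import os
import tarfile
from typing import List


def is_within_directory(directory: str, target: str) -> bool:
    # Compare normalized absolute paths so "../" components cannot escape.
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def unsafe_members(tar: tarfile.TarFile, dest: str) -> List[str]:
    # Names of members that would be written outside of `dest`.
    return [
        member.name
        for member in tar.getmembers()
        if not is_within_directory(dest, os.path.join(dest, member.name))
    ]


def build_tar(names: List[str]) -> io.BytesIO:
    # Build a small in-memory archive for the demonstration.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in names:
            data = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


with tarfile.open(fileobj=build_tar(["ok/file.txt", "../evil.txt"]), mode="r") as tar:
    bad = unsafe_members(tar, "extract_here")
print(bad)
```

A caller would refuse to extract when the returned list is non-empty. Note that this sketch only looks at member paths; it does not inspect symlink or hardlink targets.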
Commits
d408f41 log import errors for default plugins (#4866)
f2a5331 Adds a safety check for tar files (#4858)
84a36a0 Update transformers requirement from <3.6,>=3.4 to >=4.0,<4.1 (#4831)
fdad31a Add ability to specify the embedding layer if the model does not use TextFieldEmbedder (#4836)
41c5224 Improve the band-aid solution for seg faults and the static TLS error (#4846)
63b6d16 fix FromParams bug (#4841)
6c3238e rename token.py -> token_class.py (#4842)
cec9209 Several micro optimizations (#4833)
48a4865 Add GELU activation (#4828)
3e62365 Bugfix for attribute inheritance in ShardedDatasetReader (#4830)
458c4c2 fix the way handlers are removed from the root logger (#4829)
5b30658 Fix bug in GradientDescentTrainer when validation data is absent (#4811)
f353c6c add link to source code in docs (#4807)
0a83271 No Docker auth on PRs (#4802)
ad8e8a0 no ssh setup on PRs (#4801)
v1.2.2
What's new
Added 🎉
- Added Docker builds for other torch-supported versions of CUDA.
- Adds `allennlp-semparse` as an official, default plugin.
Fixed ✅
- `GumbelSampler` now sorts the beams by their true log prob.
Commits
023d9bc Prepare for release v1.2.2
7b0826c push commit images for both CUDA versions
3cad5b4 fix AUC test (#4795)
efde092 upgrade ssh-agent action (#4797)
ec37dd4 Docker builds for other CUDA versions, improve CI (#4796)
0d8873c doc link quickfix
e4cc95c improve plugin section in README (#4789)
d99f7f8 ensure Gumbel sorts beams by true log prob (#4786)
9fe8d90 Makes the transformer cache work with custom kwargs (#4781)
1e7492d Update transformers requirement from <3.5,>=3.4 to >=3.4,<3.6 (#4784)
f27ef38 Fixes pretrained embeddings for transformers that don't have end tokens (#4732)
v1.2.1
What's new
Added 🎉
- Added an optional `seed` parameter to `ModelTestCase.set_up_model` which sets the random seed for `random`, `numpy`, and `torch`.
- Added support for a global plugins file at `~/.allennlp/plugins`.
- Added more documentation about plugins.
- Added sampler class and parameter in beam search for non-deterministic search, with several implementations, including `MultinomialSampler`, `TopKSampler`, `TopPSampler`, and `GumbelMaxSampler`. Utilizing `GumbelMaxSampler` will give Stochastic Beam Search.
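For readers unfamiliar with the sampling strategies named above, here is a plain-Python sketch of two of the underlying ideas: top-k filtering and the Gumbel-max trick. The functions `top_k_filter` and `gumbel_max_sample` are hypothetical and operate on Python lists; they draw a single index and are not the AllenNLP sampler classes or a full stochastic beam search.

```python
import math
import random
from typing import List, Tuple


def top_k_filter(log_probs: List[float], k: int) -> List[Tuple[int, float]]:
    # Keep the k most likely indices and renormalize their log-probabilities.
    top = sorted(enumerate(log_probs), key=lambda pair: pair[1], reverse=True)[:k]
    log_norm = math.log(sum(math.exp(lp) for _, lp in top))
    return [(index, lp - log_norm) for index, lp in top]


def gumbel_max_sample(log_probs: List[float], rng: random.Random) -> int:
    # Gumbel-max trick: add Gumbel(0, 1) noise to each log-probability and
    # take the argmax; the winner is distributed according to softmax(log_probs).
    perturbed = []
    for lp in log_probs:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        perturbed.append(lp - math.log(-math.log(u)))
    return max(range(len(log_probs)), key=perturbed.__getitem__)


rng = random.Random(0)
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]

kept = top_k_filter(log_probs, k=2)
print([index for index, _ in kept])

draw = gumbel_max_sample(log_probs, rng)
print(draw)
```

Taking the top-k perturbed scores instead of the single argmax yields k samples without replacement, which is the idea behind stochastic beam search.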
Changed ⚠️
- Pass batch metrics to `BatchCallback`.
Fixed ✅
- Fixed a bug where forward hooks were not cleaned up with saliency interpreters if there was an exception.
- Fixed the computation of saliency maps in the Interpret code when using mismatched indexing. Previously, we would compute gradients from the top of the transformer, after aggregation from wordpieces to tokens, which gives results that are not very informative. Now, we compute gradients with respect to the embedding layer, and aggregate wordpieces to tokens separately.
- Fixed the heuristics for finding embedding layers in the case of RoBERTa. An update in the `transformers` library broke our old heuristic.
- Fixed typo with registered name of ROUGE metric. Previously was `rogue`, fixed to `rouge`.
- Fixed default masks that were erroneously created on the CPU even when a GPU is available.
Commits
04247fa support global plugins file, improve plugins docs (#4779)
9f7cc24 Add sampling strategies to beam search (#4768)
f6fe8c6 pin urllib3 in dev reqs for responses (#4780)
764bbe2 Pass batch metrics to BatchCallback (#4764)
dc3a4f6 clean up forward hooks on exception (#4778)
fcc3a70 Fix: typo in metric, rogue -> rouge (#4777)
b89320c Set the device for an auto-created mask (#4774)
92a844a RoBERTa embeddings are no longer a type of BERT embeddings (#4771)
23f0a8a Ensure cnn_encoder respects masking (#4746)
b4f1a7a add seed option to ModelTestCase.set_up_model (#4769)
b7cec51 Made Interpret code handle mismatched cases better (#4733)
9759b15 allow TextFieldEmbedder to have EmptyEmbedder that may not be in input (#4761)
v1.2.0
What's new
Changed ⚠️
- Enforced stricter typing requirements around the use of `Optional[T]` types.
- Changed the behavior of `Lazy` types in `from_params` methods. Previously, if you defined a `Lazy` parameter like `foo: Lazy[Foo] = None` in a custom `from_params` classmethod, then `foo` would actually never be `None`. This behavior is now different. If no params were given for `foo`, it will be `None`. You can also now set default values for foo like `foo: Lazy[Foo] = Lazy(Foo)`. Or, if you want a default value but also want to allow for `None` values, you can write it like this: `foo: Optional[Lazy[Foo]] = Lazy(Foo)`.
- Added support for PyTorch version 1.7.
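The difference between a `None` default and a `Lazy(Foo)` default can be seen with a toy stand-in. The `Lazy` class below is a simplified, hypothetical wrapper written for this sketch (a deferred constructor with a `construct` method); it is not AllenNLP's `Lazy`, and `Foo`, `build_none_default`, and `build_lazy_default` are likewise made up for illustration.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Toy stand-in: wraps a constructor so the object is built later."""

    def __init__(self, constructor: Callable[..., T]) -> None:
        self._constructor = constructor

    def construct(self, **kwargs) -> T:
        return self._constructor(**kwargs)


class Foo:
    def __init__(self, size: int = 1) -> None:
        self.size = size


def build_none_default(foo: Optional[Lazy[Foo]] = None) -> Optional[Foo]:
    # With a None default, "no params given" really means None.
    return None if foo is None else foo.construct(size=3)


def build_lazy_default(foo: Lazy[Foo] = Lazy(Foo)) -> Foo:
    # With a Lazy(Foo) default, a Foo is always constructible.
    return foo.construct(size=3)


print(build_none_default())
print(build_lazy_default().size)
```

The caller that receives the lazy object supplies the remaining arguments (here `size`) at construction time, which is the reason for deferring construction in the first place.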
Fixed ✅
- Made it possible to instantiate `TrainerCallback` from config files.
- Fixed the remaining broken internal links in the API docs.
- Fixed a bug where Hotflip would crash with a model that had multiple TokenIndexers and the input used rare vocabulary items.
- Fixed a bug where `BeamSearch` would fail if `max_steps` was equal to 1.
Commits
7f85c74 fix docker build (#4762)
cc9ac0f ensure dataclasses not installed in CI (#4754)
812ac57 Fix hotflip bug where vocab items were not re-encoded correctly (#4759)
aeb6d36 revert samplers and fix bug when max_steps=1 (#4760)
baca754 Make returning token type id default in transformers intra word tokenization. (#4758)
5d6670c Update torch requirement from <1.7.0,>=1.6.0 to >=1.6.0,<1.8.0 (#4753)
0ad228d a few small doc fixes (#4752)
71a98c2 stricter typing for Optional[T] types, improve handling of Lazy params (#4743)
27edfbf Add end+trainer callbacks to Trainer.from_partial_objects (#4751)
b792c83 Fix device mismatch bug for categorical accuracy metric in distributed training (#4744)
v1.2.0rc1
What's new
Added 🎉
- Added a warning when `batches_per_epoch` for the validation data loader is inherited from the train data loader.
- Added a `build-vocab` subcommand that can be used to build a vocabulary from a training config file.
- Added `tokenizer_kwargs` argument to `PretrainedTransformerMismatchedIndexer`.
- Added `tokenizer_kwargs` and `transformer_kwargs` arguments to `PretrainedTransformerMismatchedEmbedder`.
- Added official support for Python 3.8.
- Added a script: `scripts/release_notes.py`, which automatically prepares markdown release notes from the CHANGELOG and commit history.
- Added a flag `--predictions-output-file` to the `evaluate` command, which tells AllenNLP to write the predictions from the given dataset to the file as JSON lines.
- Added the ability to ignore certain missing keys when loading a model from an archive. This is done by adding a class-level variable called `authorized_missing_keys` to any PyTorch module that a `Model` uses. If defined, `authorized_missing_keys` should be a list of regex string patterns.
- Added `FBetaMultiLabelMeasure`, a multi-label Fbeta metric. This is a subclass of the existing `FBetaMeasure`.
- Added ability to pass additional key word arguments to `cached_transformers.get()`, which will be passed on to `AutoModel.from_pretrained()`.
- Added an `overrides` argument to `Predictor.from_path()`.
- Added a `cached-path` command.
- Added a function `inspect_cache` to `common.file_utils` that prints useful information about the cache. This can also be used from the `cached-path` command with `allennlp cached-path --inspect`.
- Added a function `remove_cache_entries` to `common.file_utils` that removes any cache entries matching the given glob patterns. This can be used from the `cached-path` command with `allennlp cached-path --remove some-files-*`.
- Added logging for the main process when running in distributed mode.
- Added a `TrainerCallback` object to support state sharing between batch and epoch-level training callbacks.
- Added support for .tar.gz in PretrainedModelInitializer.
- Added classes: `nn/samplers/samplers.py` with `MultinomialSampler`, `TopKSampler`, and `TopPSampler` for sampling indices from log probabilities.
- Made `BeamSearch` registrable.
- Added `top_k_sampling` and `type_p_sampling` `BeamSearch` implementations.
- Pass `serialization_dir` to `Model` and `DatasetReader`.
- Added an optional `include_in_archive` parameter to the top-level of configuration files. When specified, `include_in_archive` should be a list of paths relative to the serialization directory which will be bundled up with the final archived model from a training run.
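The `authorized_missing_keys` entry in the list above describes a list of regex patterns for state-dict keys that are allowed to be absent. A rough sketch of how such patterns could be applied follows; `unauthorized_missing_keys` is a hypothetical helper, the key names are invented, and the use of `re.search` (rather than a full match) is an assumption.

```python
import re
from typing import Iterable, List


def unauthorized_missing_keys(
    missing_keys: Iterable[str], authorized_patterns: Iterable[str]
) -> List[str]:
    # A key is tolerated if any authorized pattern matches it; anything left
    # over would still be reported as a genuine loading error.
    compiled = [re.compile(pattern) for pattern in authorized_patterns]
    return [
        key
        for key in missing_keys
        if not any(pattern.search(key) for pattern in compiled)
    ]


# Invented example keys and pattern, for illustration only.
missing = ["embedder.position_ids", "classifier.weight"]
patterns = [r"position_ids$"]
remaining = unauthorized_missing_keys(missing, patterns)
print(remaining)
```

Keeping the patterns on the module class means the tolerance travels with the module that knows which of its buffers are safe to re-create.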
Changed ⚠️
- Subcommands that don't require plugins will no longer cause plugins to be loaded or have an `--include-package` flag.
- Allow overrides to be JSON string or `dict`.
- `transformers` dependency updated to version 3.1.0.
- When `cached_path` is called on a local archive with `extract_archive=True`, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to `cached_path` will know to re-extract the archive.
- Removed the `truncation_strategy` parameter to `PretrainedTransformerTokenizer`. The way we're calling the tokenizer, the truncation strategy takes no effect anyways.
- Don't use initializers when loading a model, as it is not needed.
- Distributed training will now automatically search for a local open port if the `master_port` parameter is not provided.
- In training, save model weights before evaluation.
- `allennlp.common.util.peak_memory_mb` renamed to `peak_cpu_memory`, and `allennlp.common.util.gpu_memory_mb` renamed to `peak_gpu_memory`, and they both now return the results in bytes as integers. Also, the `peak_gpu_memory` function now utilizes PyTorch functions to find the memory usage instead of shelling out to the `nvidia-smi` command. This is more efficient and also more accurate because it only takes into account the tensor allocations of the current PyTorch process.
- Make sure weights are first loaded to the cpu when using PretrainedModelInitializer, preventing wasted GPU memory.
- Load dataset readers in `load_archive`.
- Updated `AllenNlpTestCase` docstring to remove reference to `unittest.TestCase`.
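The `cached_path` change above ties the extraction directory to the archive's modification time. One way to get that property is to hash the path together with the mtime, as in the sketch below. The function `extraction_dir_for`, the `-extracted` suffix, and the choice of SHA-256 are all assumptions for illustration; the actual naming scheme is not given in these notes.

```python
import hashlib
import os
import tempfile


def extraction_dir_for(archive_path: str, cache_root: str) -> str:
    # Key the directory on both the path and the modification time, so an
    # edited archive maps to a fresh directory and gets re-extracted.
    mtime = os.path.getmtime(archive_path)
    key = f"{os.path.abspath(archive_path)}:{mtime}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_root, digest + "-extracted")


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "model.tar.gz")
    with open(archive, "wb") as handle:
        handle.write(b"v1")

    first = extraction_dir_for(archive, tmp)
    # Simulate the file changing by giving it a different modification time.
    os.utime(archive, (1_000_000_000, 1_000_000_000))
    second = extraction_dir_for(archive, tmp)
    # Nothing changed since the last call, so the directory is stable.
    third = extraction_dir_for(archive, tmp)

print(first != second, second == third)
```

Placing the directory under the cache root rather than next to the archive also means extraction works when the archive's own directory is not writable.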
Removed 👋
- Removed `common.util.is_master` function.
Fixed ✅
- Fixed a bug where the reported `batch_loss` metric was incorrect when training with gradient accumulation.
- Class decorators now displayed in API docs.
- Fixed up the documentation for the `allennlp.nn.beam_search` module.
- Ignore `*args` when constructing classes with `FromParams`.
- Ensured some consistency in the types of the values that metrics return.
- Fix a PyTorch warning by explicitly providing the `as_tuple` argument (leaving it as its default value of `False`) to `Tensor.nonzero()`.
- Remove temporary directory when extracting model archive in `load_archive` at end of function rather than via `atexit`.
- Fixed a bug where using `cached_path()` offline could return a cached resource's lock file instead of the cache file.
- Fixed a bug where `cached_path()` would fail if passed a `cache_dir` with the user home shortcut `~/`.
- Fixed a bug in our doc building script where markdown links did not render properly if the "href" part of the link (the part inside the `()`) was on a new line.
- Changed how gradients are zeroed out with an optimization. See this video from NVIDIA at around the 9 minute mark.
- Fixed a bug where parameters to a `FromParams` class that are dictionaries wouldn't get logged when an instance is instantiated `from_params`.
- Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
- Fixed a bug in the calculation of rouge metrics during distributed training where the total sequence count was not being aggregated across GPUs.
- Fixed `allennlp.nn.util.add_sentence_boundary_token_ids()` to use `device` parameter of input tensor.
- Be sure to close the TensorBoard writer even when training doesn't finish.
- Fixed the docstring for `PyTorchSeq2VecWrapper`.
Commits
01644ca Pass serialization_dir to Model, DatasetReader, and support include_in_archive (#4713)
1f29f35 Update transformers requirement from <3.4,>=3.1 to >=3.1,<3.5 (#4741)
6bb9ce9 warn about batches_per_epoch with validation loader (#4735)
00bb6c5 Be sure to close the TensorBoard writer (#4731)
3f23938 Update mkdocs-material requirement from <6.1.0,>=5.5.0 to >=5.5.0,<6.2.0 (#4738)
10c11ce Fix typo in PretrainedTransformerMismatchedEmbedder docstring (#4737)
0e64b4d fix docstring for PyTorchSeq2VecWrapper (#4734)
006bab4 Don't use PretrainedModelInitializer when loading a model (#4711)
ce14bdc Allow usage of .tar.gz with PretrainedModelInitializer (#4709)
c14a056 avoid defaulting to CPU device in add_sentence_boundary_token_ids() (#4727)
24519fd fix typehint on checkpointer method (#4726)
d3c69f7 Bump mypy from 0.782 to 0.790 (#4723)
cccad29 Updated AllenNlpTestCase docstring (#4722)
3a85e35 add reasonable timeout to gpu checks job (#4719)
1ff0658 Added logging for the main process when running in distributed mode (#4710)
b099b69 Add top_k and top_p sampling to BeamSearch (#4695)
bc6f15a Fixes rouge metric calculation corrected for distributed training (#4717)
ae7cf85 automatically find local open port in distributed training (#4696)
321d4f4 TrainerCallback with batch/epoch/end hooks (#4708)
001e1f7 new way of setting env variables in GH Actions (#4700)
c14ea40 Save checkpoint before running evaluation (#4704)
40bb47a Load weights to cpu with PretrainedModelInitializer (#4712)
327188b improve memory helper functions (#4699)
90f0037 fix reported batch_loss (#4706)
39ddb52 CLI improvements (#4692)
edcb6d3 Fix a bug in saving vocab during distributed training (#4705)
3506e3f ensure parameters that are actual dictionaries get logged (#4697)
eb7f256 Add StackOverflow link to README (#4694)
17c3b84 Fix small typo (#4686)
e0b2e26 display class decorators in API docs (#4685)
b9a9284 Update transformers requirement from <3.3,>=3.1 to >=3.1,<3.4 (#4684)
d9bdaa9 add build-vocab command (#4655)
ce604f1 Update mkdocs-material requirement from <5.6.0,>=5.5.0 to >=5.5.0,<6.1.0 (#4679)
c3b5ed7 zero grad optimization (#4673)
9dabf3f Add missing tokenizer/transformer kwargs (#4682)
9ac6c76 Allow overrides to be JSON string or dict (#4680)
55cfb47 The truncation setting doesn't do anything anymore (#4672)
990c9c1 clarify conda Python version in README.md
97db538 official support for Python 3.8 🐍 (#4671)
1e381bb Clean up the documentation for beam search (#4664)
11def8e Update bug_report.md
97fe88d Cached path command (#4652)
c9f376b Update transformers requirement from <3.2,>=3.1 to >=3.1,<3.3 (#4663)
e5e3d02 tick version for nightly releases
b833f90 fix multi-line links in docs (#4660)
d7c06fe Expose from_pretrained keyword arguments (#4651)
175c76b fix confusing distributed logging info (#4654)
fbd2ccc fix numbering in RELEASE_GUIDE
2d5f24b improve how cached_path extracts archives (#4645)
824f97d smooth out release process (#4648)
c7b7c00 Feature/prevent temp directory retention (#4643)
de5d68b Fix tensor.nonzero() function overload warning (#4644)
e8e89d5 add flag for saving predictions to 'evaluate' command (#4637)
e4fd5a0 Multi-label F-beta metric (#4562)
f0e7a78 Create Dependabot config file (#4635)
0e33b0b Return consistent types from metrics (#4632)
2df364f Update transformers requirement from <3.1,>=3.0 to >=3.0,<3.2 (#4621)
6d480aa Im...
v1.1.0
Highlights
Version 1.1 was mainly focused on bug fixes, but there are a few important new features such as gradient checkpointing with pretrained transformer embedders and official support for automatic mixed precision (AMP) training through the new torch.amp module.
Details
Added
- `Predictor.capture_model_internals()` now accepts a regex specifying which modules to capture.
- Added the option to specify `requires_grad: false` within an optimizer's parameter groups.
- Added the `file-friendly-logging` flag back to the `train` command. Also added this flag to the `predict`, `evaluate`, and `find-learning-rate` commands.
- Added an `EpochCallback` to track current epoch as a model class member.
- Added the option to enable or disable gradient checkpointing for transformer token embedders via boolean parameter `gradient_checkpointing`.
- Added a method to `ModelTestCase` for running basic model tests when you aren't using config files.
- Added some convenience methods for reading files.
- `cached_path()` can now automatically extract and read files inside of archives.
- Added the ability to pass an archive file instead of a local directory to `Vocab.from_files`.
- Added the ability to pass an archive file instead of a glob to `ShardedDatasetReader`.
- Added a new `"linear_with_warmup"` learning rate scheduler.
- Added a check in `ShardedDatasetReader` that ensures the base reader doesn't implement manual distributed sharding itself.
- Added an option to `PretrainedTransformerEmbedder` and `PretrainedTransformerMismatchedEmbedder` to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set `last_layer_only` to `False`.
- Training metrics now include `batch_loss` and `batch_reg_loss` in addition to aggregate loss across number of batches.
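The notes name a `"linear_with_warmup"` scheduler without giving its formula. A typical schedule of that name ramps the learning rate linearly from zero to its base value over the warmup steps and then decays it linearly back to zero; the function below sketches that common shape. The function name, its signature, and the exact boundary behaviour are assumptions and may differ from the registered scheduler.

```python
def linear_with_warmup(
    step: int, warmup_steps: int, total_steps: int, base_lr: float
) -> float:
    # Phase 1: ramp linearly from 0 up to base_lr over the warmup steps.
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Phase 2: decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at each of steps 0..6 with 2 warmup steps and 6 total steps.
schedule = [linear_with_warmup(step, 2, 6, 1.0) for step in range(7)]
print(schedule)
```

Warmup of this kind is commonly paired with pretrained transformers, where large early updates can destabilize fine-tuning.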
Changed
- Upgraded PyTorch requirement to 1.6.
- Beam search now supports multi-layer decoders.
- Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (`GradientDescentTrainer`) now takes a `use_amp: bool` parameter instead of the old `opt_level: str` parameter.
- Not specifying a `cuda_device` now automatically determines whether to use a GPU or not.
- Discovered plugins are logged so you can see what was loaded.
- `allennlp.data.DataLoader` is now an abstract registrable class. The default implementation remains the same, but was renamed to `allennlp.data.PyTorchDataLoader`.
- `BertPooler` can now unwrap and re-wrap extra dimensions if necessary.
Removed
- Removed the `opt_level` parameter to `Model.load` and `load_archive`. In order to use AMP with a loaded model now, just run the model's forward pass within torch's `autocast` context.
Fixed
- Fixed handling of some edge cases when constructing classes with `FromParams` where the class accepts `**kwargs`.
- Fixed division by zero error when there are zero-length spans in the input to a `PretrainedTransformerMismatchedIndexer`.
- Improved robustness of `cached_path` when extracting archives so that the cache won't be corrupted if a failure occurs during extraction.
- Fixed a bug with the `average` and `evalb_bracketing_score` metrics in distributed training.
- Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.
- Fixed how truncation was handled with `PretrainedTransformerTokenizer`. Previously, if `max_length` was set to `None`, the tokenizer would still do truncation if the transformer model had a default max length in its config. Also, when `max_length` was set to a non-`None` value, several warnings would appear for certain transformer models around the use of the `truncation` parameter.
- Fixed evaluation of all metrics when using distributed training.
- Added a `py.typed` marker. Fixed type annotations in `allennlp.training.util`.
- Fixed problem with automatically detecting whether tokenization is necessary. This affected primarily the Roberta SST model.
- Improved help text for using the --overrides command line flag.
- Removed unnecessary warning about deadlocks in `DataLoader`.
- Fixed testing models that only return a loss when they are in training mode.
- Fixed a bug in `FromParams` that caused silent failure in case of the parameter type being `Optional[Union[...]]`.
- Fixed a bug where the program crashes if `evaluation_data_loader` is an `AllennlpLazyDataset`.
- Reduced the amount of log messages produced by `allennlp.common.file_utils`.
- Fixed a bug where `PretrainedTransformerEmbedder` parameters appeared to be trainable in the log output even when `train_parameters` was set to `False`.
- Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
- Fixed checking equality of `ArrayField`s.
- Fixed a bug where `NamespaceSwappingField` did not work correctly with `.empty_field()`.
- Put more sensible defaults on the `huggingface_adamw` optimizer.
- Simplified logging so that all logging output always goes to one file.
- Fixed interaction with the python command line debugger.
- Log the grad norm properly even when we're not clipping it.
- Fixed a bug where `PretrainedModelInitializer` fails to initialize a model with a 0-dim tensor.
- Fixed a bug with the layer unfreezing schedule of the `SlantedTriangular` learning rate scheduler.
- Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
- Pinned the version of boto3 for package managers (e.g. poetry).
- Fixed issue #4330 by updating the `tokenizers` dependency.
- Fixed a bug in `TextClassificationPredictor` so that it passes tokenized inputs to the `DatasetReader` in case it does not have a tokenizer.
- `reg_loss` is only now returned for models that have some regularization penalty configured.
- Fixed a bug that prevented `cached_path` from downloading assets from GitHub releases.
- Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
- `Tqdm` output now looks much better when the output is being piped or redirected.
- Small improvements to how the API documentation is rendered.
- Only show validation progress bar from main process in distributed training.
Commits
dcc9cdc Prepare for release v1.1.0
aa750be fix Average metric (#4624)
e1aa57c improve robustness of cached_path when extracting archives (#4622)
711afaa Fix division by zero when there are zero-length spans in MismatchedEmbedder. (#4615)
be97943 Improve handling of **kwargs in FromParams (#4616)
187b24e add more tutorial links to README (#4613)
e840a58 s/logging/logger/ (#4609)
dbc3c3f Added batched versions of scatter and fill to util.py (#4598)
2c54cf8 reformat for new version of black (#4605)
2dd335e batched_span_select now guarantees element order in each span (#4511)
62f554f specify module names by a regex in predictor.capture_model_internals() (#4585)
f464aa3 Bump markdown-include from 0.5.1 to 0.6.0 (#4586)
d01cdff Update RELEASE_PROCESS.md to include allennlp-models (#4587)
3aedac9 Prepare for release v1.1.0rc4
87a61ad Bug fix in distributed metrics (#4570)
71a9a90 upgrade actions to cache@v2 (#4573)
bd9ee6a Give better usage info for overrides parameter (#4575)
0a456a7 Fix boolean and categorical accuracy for distributed (#4568)
8511274 add actions workflow for closing stale issues (#4561)
de41306 Static type checking fixes (#4545)
5a07009 Fix RoBERTa SST (#4548)
351941f Only pin mkdocs-material to minor version, ignore specific patch version (#4556)
0ac13a4 fix CHANGELOG
3b86f58 Prepare for release v1.1.0rc3
44d2847 Metrics in distributed setting (#4525)
1d61965 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547)
5b97780 tick version for nightly releases
b32608e add gradient checkpointing for transformer token embedders (#4544)
f639336 Fix logger being created twice (#4538)
660fdaf Fix handling of max length with transformer tokenizers (#4534)
15e288f EpochCallBack for tracking epoch (#4540)
9209bc9 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533)
bfecdc3 Ensure len(self.evaluation_data_loader) is not called (#4531)
5bc3b73 Fix typo in warning in file_utils (#4527)
e80d768 pin torch >= 1.6
73220d7 Prepare for release v1.1.0rc2
9415350 Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519)
146bd9e Remove link to self-attention modules. (#4512)
2401282 add back file-friendly-logging flag (#4509)
54e5c83 closes #4494 (#4508)
fa39d49 ensure call methods are rendered in docs (#4522)
e53d185 Bug fix for case when param type is Optional[Union...] (#4510)
14f63b7 Make sure we have a bool tensor where we expect one (#4505)
18a4eb3 add a requires_grad option to param groups (#4502)
6c848df Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507)
d73f8a9 More BART changes (#4500)
1cab3bf Update beam_search.py (#4462)
478bf46 remove deadlock warning in DataLoader (#4487)
714334a Fix reported loss: Bug fix in batch_loss (#4485)
db20b1f use longer tqdm intervals when output being redirected (#4488)
53eeec1 tick version for nightly releases
d693cf1 PathLike (#4479)
2f87832 only show validation progress bar from main process (#4476)
9144918 Fix reported loss (#4477)
5c97083 fix release link in CHANGELOG and formatting in README
4eb9795 Prepare for release v1.1.0rc1
f195440 update 'Models' links in README (#4475)
9c801a3 add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472)
69d2f03 Clean up Tqdm bars when output is being piped or redirected (#4470)
7b188c9 fixed bug that erronously increased last label's false positive count (#4473)
64db027 Skip ETag check if OSError (#4469)
b9d011e More BART ...
v1.1.0rc4
Changes since v1.1.0rc3
Added
- Added a workflow to GitHub Actions that will automatically close unassigned stale issues and
ping the assignees of assigned stale issues.
Fixed
- Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.
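The failure mode named in this fix can be illustrated with a toy simulation. The sketch below is not AllenNLP's code: all_reduce_sum, buggy_total, and fixed_total are hypothetical names, and the cross-worker sum is simulated with a plain Python list. It assumes the bug took the form of reducing an already-accumulated total on every step, so earlier contributions are counted again each time until the value overflows.

```python
import math
from typing import List


def all_reduce_sum(values: List[float]) -> float:
    """Stand-in for a sum across distributed workers (hypothetical helper)."""
    return sum(values)


def buggy_total(per_worker_batches: List[List[float]]) -> float:
    """Reduce the running total itself after every step."""
    num_workers = len(per_worker_batches)
    num_steps = len(per_worker_batches[0])
    totals = [0.0] * num_workers
    for step in range(num_steps):
        for worker in range(num_workers):
            totals[worker] += per_worker_batches[worker][step]
        # Every worker overwrites its total with the global sum, so the
        # next reduction adds the same history once per worker again.
        reduced = all_reduce_sum(totals)
        totals = [reduced] * num_workers
    return totals[0]


def fixed_total(per_worker_batches: List[List[float]]) -> float:
    """Reduce only the current step's values, then accumulate."""
    num_workers = len(per_worker_batches)
    num_steps = len(per_worker_batches[0])
    total = 0.0
    for step in range(num_steps):
        step_values = [per_worker_batches[w][step] for w in range(num_workers)]
        total += all_reduce_sum(step_values)
    return total


short_run = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # 2 workers, 3 steps
print(fixed_total(short_run))  # 6.0
print(buggy_total(short_run))  # 14.0

long_run = [[1.0] * 2000, [1.0] * 2000]
overflowed = buggy_total(long_run)
print(math.isinf(overflowed))               # True
print(math.isnan(overflowed / overflowed))  # True
```

In the toy version the inflated total roughly doubles per step with two workers, so a long run overflows to infinity, and any ratio of two such totals (for example an average) evaluates to nan.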
Commits
87a61ad Bug fix in distributed metrics (#4570)
71a9a90 upgrade actions to cache@v2 (#4573)
bd9ee6a Give better usage info for overrides parameter (#4575)
0a456a7 Fix boolean and categorical accuracy for distributed (#4568)
8511274 add actions workflow for closing stale issues (#4561)
de41306 Static type checking fixes (#4545)
5a07009 Fix RoBERTa SST (#4548)
351941f Only pin mkdocs-material to minor version, ignore specific patch version (#4556)
v1.1.0rc3
Changes since v1.1.0rc2
Fixed
- Fixed how truncation was handled with PretrainedTransformerTokenizer.
Previously, if max_length was set to None, the tokenizer would still do truncation if the
transformer model had a default max length in its config.
Also, when max_length was set to a non-None value, several warnings would appear
for certain transformer models around the use of the truncation parameter.
- Fixed evaluation of all metrics when using distributed training.
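The truncation rule described above comes down to truncating only when a maximum length was actually requested. The sketch below shows that rule in isolation. build_encode_kwargs is a hypothetical helper, not part of AllenNLP, and the sketch assumes the underlying Hugging Face tokenizer is called with its max_length and truncation keyword arguments.

```python
from typing import Any, Dict, Optional


def build_encode_kwargs(max_length: Optional[int]) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for a tokenizer call."""
    return {
        "max_length": max_length,
        # Truncate only when the caller asked for a limit. Setting the flag
        # explicitly in both cases leaves nothing to a model-level default.
        "truncation": max_length is not None,
    }


print(build_encode_kwargs(None))  # {'max_length': None, 'truncation': False}
print(build_encode_kwargs(512))   # {'max_length': 512, 'truncation': True}
```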
Commits
0ac13a4 fix CHANGELOG
3b86f58 Prepare for release v1.1.0rc3
44d2847 Metrics in distributed setting (#4525)
1d61965 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547)
5b97780 tick version for nightly releases
b32608e add gradient checkpointing for transformer token embedders (#4544)
f639336 Fix logger being created twice (#4538)
660fdaf Fix handling of max length with transformer tokenizers (#4534)
15e288f EpochCallBack for tracking epoch (#4540)
9209bc9 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533)
bfecdc3 Ensure len(self.evaluation_data_loader) is not called (#4531)
5bc3b73 Fix typo in warning in file_utils (#4527)
e80d768 pin torch >= 1.6
v1.1.0rc2
What's new since v1.1.0rc1
Changed
- Upgraded PyTorch requirement to 1.6.
- Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer)
now takes a use_amp: bool parameter instead of the old opt_level: str parameter.
Fixed
- Removed unnecessary warning about deadlocks in DataLoader.
- Fixed testing models that only return a loss when they are in training mode.
- Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].
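One plausible reason Optional[Union[...]] annotations are easy to mishandle is that Python's typing module flattens them: Optional[Union[int, str]] is the same type as Union[int, str, None], with three arguments rather than two. The sketch below demonstrates that flattening using only the standard library (Python 3.8 or newer). strip_none is a hypothetical helper written for this illustration, not AllenNLP's implementation.

```python
from typing import List, Optional, Union, get_args, get_origin

NoneType = type(None)


def strip_none(annotation):
    """Hypothetical helper: remove NoneType from a Union annotation."""
    if get_origin(annotation) is Union:
        remaining = tuple(arg for arg in get_args(annotation) if arg is not NoneType)
        # Union[...] over a single remaining type collapses to that type.
        return Union[remaining]
    return annotation


nested = Optional[Union[int, str]]
print(get_args(nested))  # (<class 'int'>, <class 'str'>, <class 'NoneType'>)
print(strip_none(nested) == Union[int, str])  # True
print(strip_none(Optional[int]) is int)       # True
print(strip_none(List[int]) == List[int])     # True
```

Code that recognizes an optional parameter only by looking for a two-argument Union would miss the nested case, which is why the helper filters NoneType out of however many arguments are present.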
Added
- Added the option to specify requires_grad: false within an optimizer's parameter groups.
- Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.
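As a rough illustration of the requires_grad option, the optimizer section of a training config might look like the following, written here as a Python dict and printed in its JSON form. This is a sketch under assumptions: the regexes and hyperparameter values are made up for the example, and the layout assumes each parameter group is a pair of a regex list and a dict of overrides.

```python
import json

# Hypothetical optimizer section of a training config. Each parameter group
# pairs a list of regexes over parameter names with the overrides to apply
# to the parameters those regexes match.
optimizer_config = {
    "type": "huggingface_adamw",
    "lr": 2e-5,
    "parameter_groups": [
        # Assumed name pattern: exclude embedding weights from training.
        [["embeddings"], {"requires_grad": False}],
        # Assumed name patterns: no weight decay for biases and LayerNorm weights.
        [["bias", "LayerNorm\\.weight"], {"weight_decay": 0.0}],
    ],
}

# Serialized form, as it would appear in a JSON config file.
print(json.dumps(optimizer_config, indent=2))
```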
Removed
- Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded
model now, just run the model's forward pass within torch's autocast context.
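A minimal sketch of running a forward pass inside the autocast context follows. It uses a small torch.nn.Linear as a stand-in for a loaded model, since how the model is obtained does not change the pattern. torch.cuda.amp.autocast is the context manager introduced with PyTorch 1.6; newer PyTorch releases also offer torch.autocast.

```python
import torch

# Stand-in for a loaded model; any torch.nn.Module is used the same way.
model = torch.nn.Linear(4, 2)
inputs = torch.randn(3, 4)

use_cuda = torch.cuda.is_available()
if use_cuda:
    model = model.cuda()
    inputs = inputs.cuda()

model.eval()
# This context manager only affects CUDA ops, so it is disabled on CPU.
with torch.no_grad(), torch.cuda.amp.autocast(enabled=use_cuda):
    outputs = model(inputs)

print(outputs.shape)  # torch.Size([3, 2])
```

Only the forward pass needs to sit inside the context; the model's weights stay in their original precision.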
Commits
73220d7 Prepare for release v1.1.0rc2
9415350 Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519)
146bd9e Remove link to self-attention modules. (#4512)
2401282 add back file-friendly-logging flag (#4509)
54e5c83 closes #4494 (#4508)
fa39d49 ensure call methods are rendered in docs (#4522)
e53d185 Bug fix for case when param type is Optional[Union...] (#4510)
14f63b7 Make sure we have a bool tensor where we expect one (#4505)
18a4eb3 add a requires_grad option to param groups (#4502)
6c848df Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507)
d73f8a9 More BART changes (#4500)
1cab3bf Update beam_search.py (#4462)
478bf46 remove deadlock warning in DataLoader (#4487)
714334a Fix reported loss: Bug fix in batch_loss (#4485)
db20b1f use longer tqdm intervals when output being redirected (#4488)
53eeec1 tick version for nightly releases
d693cf1 PathLike (#4479)
2f87832 only show validation progress bar from main process (#4476)
9144918 Fix reported loss (#4477)
5c97083 fix release link in CHANGELOG and formatting in README