-
Notifications
You must be signed in to change notification settings - Fork 5
/
HISTORY.txt
366 lines (358 loc) · 30.6 KB
/
HISTORY.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
# RCSB_DB HISTORY
11-Mar-2018 - Py2->Py3 and refactored for python packaging tools
4-Jul-2018 - V0.12 reformulate schema definitions directly from dictionary
metadata and targeted helper functions.
20-Jul-2018 V0.13 overhaul static schema management
28-Jul-2018 V0.14 corrections for selection and type filters and object size pruning
20-Aug-2018. V0.15 added sliced collections (e.g. entity or canonical identifier),
JSON schema description of collections, sfx/xfel examples, dictionary
methods implemented in helper classes, and incremental repository metadata
updates.
25-Aug-2018 V0.16 split out rcsb.utils.config, rcsb.utils.io, rcsb.utils.multiproc and
mock-data as a shared submodule for test data. Merged branch ' namespace'
back into master.
28-Aug-2018 V0.17 rename console scripts directory to avoid conflict with reserved keyword in Py2.
9-Sep-2018 V0.18 add support for multi-level JSON schema generation and validation,
integrate linking CIF methods to generate content at load time, add core
assembly collection, add cardinal identifiers in new categories to each
core collection, and enum normalization as a data transformation filter operation.
11-Sep-2018 V0.19 add dictionary method for citation author aggregation
adjust cardinality of entity and assembly identifier categories
14-Sep-2018 V0.20 Require at least one record in any array type, adjust constraints on iterables.
18-Sep-2018 V0.21 Require homogeneous categories/classes in JSON schema production.
22-Sep-2018 V0.22 Add method to generate _pdbx_struct_assembly.rcsb_candidate_assembly
10-Oct-2018 V0.23 Add date format to schema definitions; generate schemas and add validation tests
for repository_holdings, entity_sequence_clusters, and data_exchange schema types;
extend derived content for core_entry and core_entity collections;
add subcategory aggregation feature; and refactor api for helper methods.
12-Oct-2018 V0.24 Add rcsb_repository_holdings_transferred and rcsb_repository_holdings_insilico_models,
make datetime type mappings the same as date, check for empty required properties for
subcategory aggregates
28-Oct-2018 V0.25 Move local helper method configuration to common YAML configuration, restore some missing
modules for cockroach and crate server types, add category rcsb_accession_info in core_entry,
make audit_authors an iterable type in categories rcsb_repository_holdings_transferred,
rcsb_repository_holdings_insilico_models, and rcsb_repository_holdings_unreleased,
verify data types in sequence cluster collections.
13-Nov-2018 V0.26 Add chemical component and bird chemical component core collections with convenience
categories rcsb_chem_comp_info, rcsb_chem_comp_synonyms, rcsb_chem_comp_descriptor,
and rcsb_chem_comp_target. Add convenience category rcsb_entry_info in pdbx_core core_entry
collection. Include correspondence details for DrugBank and CCDC/CSD.
Add dictionary methods to filter core entry objects by experiment type to
remove largely vacuous categories. Add new CLI entry point for schema generation.
27-Nov-2018 V0.27 resolve inconsistent handling of multiple sources for antibody molecule types. Add
dictionary method rcsb_file_block_by_method add item pdbx_chem_comp_audit.ordinal to replace
the primary key for this category. Add new mechanism to inject private document key
attributes to support Solr access and indexing. (Feature branch multisourcefix). Extend
repository holdings collection with prerelease sequences and extended content types for
the current repository state.
28-Nov-2018 V0.28 minor update to FASTA sequence content type to repository holdings current inventory and
adjustments in constraints for rcsb_entry_info category production
30-Nov-2018 V0.29 branch birdconsolidate includes mashup of all BIRD definition data, additional content
types in current repository holdings with adjustments to filter obsolete entries, various
additional categories added to core entry collection.
1-Dec-2018 V0.30 Adjustments for roundtrip CI/CD
2-Dec-2018 V0.31 Fixes for expansion of semi-colon separated value types, add methods to
include NCBI scientific and common names.
3-Dec-2018 V0.32 interim update for troubleshooting entity collection load issues, adding search indices
for core collections, adding optional private identifier chemical component identifiers for non-polymer
entities, adjustments to bird core citation schema.
9-Dec-2018 V0.33 Add _pdbx_reference_molecule.class, _pdbx_reference_molecule.type, Add categories drugbank_info
and drugbank_target, add method rcsb_add_bird_entity_identifiers, add _rcsb_entry_info.solvent_entity_count
and adjust counts to include solvent. Add consolidated loader for BIRD data, add loader for DrugBank
corresponding data from rcsb.utils.chemref. Add CLI option for loading integrated chemical reference data.
Move time consuming schema validation tests to separate subdirectory. Add mandatory option for injected
private keys. Resolve schema loading issues with current chemical component and BIRD definitions.
12-Dec-2018 V0.34 Add rcsb_entity_id identifiers in categories struct_ref_seq and struct_ref_seq_dif.Add category rcsb_entity_poly_info
as the basis for a new core collection core_entity_monomer. Adjust logic for reporting assembly format availability.
13-Dec-2018 V0.35 Add ihm_dev collection and support for I/HM repository model files
18-Dec-2018 V0.36 Add _entity_poly.rcsb_prd_id and configuration to include private key for BIRD identifier in the core_entity collection.'
7-Jan-2019 V0.37 Simplify and consolidate site specific configuration options, streamline cli scripts, and consolidate common
schema access and building methods in SchemaDefUtil() (feature branch configpath)
9-Jan-2019 V0.38 Introduce a new data transformation filter for XML character references and miscellaneous related changes.
10-Jan-2019 V0.39 Adjustments to improve diagnostics relative remaining data issues observed loading latest data files, and
tuning of classification for RCSB candidate assemblies.
17-Jan-2019 V0.40 adjust handling of document replacement in sliced collections
25-Jan-2019 V0.41 schema extension for DrugBank core collection
16-Feb-2019 V0.42 Add entity_instance_core support, add content type merging feature to allow consolidation of data artifacts
prior to data load processing, and overhaul the slice processing to improve performance.
18-Feb-2019 V0.43 provide a more graceful handling of empty slices,
19-Feb-2019 V0.44 add slice support for pdbx_validate_* categories, and adjust category exclusions for the entity_core collection.
20-Feb-2019 V0.44 adding EntityInstanceExtractor.py preliminary version, adjust logging and cli script options.
#
22-Feb-2019 V0.45 preliminary version of tools for validation load analysis.
3-Mar-2019 V0.46 move validation analysis tools, EntityInstanceExtractor.py and related tests to package rcsb.utils.anal.
Change helper criteria for identifying chemical component objects and warn for missing chem_comp_atom category cases.
Change release filter for chem_comp_* collection to allow only status codes 'REL' and 'REF_ONLY' -
14-Mar-2019 V0.47 Additional dictionary content and helper method support for -
_entity.rcsb_macromolecular_names_combined, _rcsb_entry_info.software_programs_combined,
_rcsb_entity_container_identifiers.chem_comp_ligand, _rcsb_entity_container_identifiers.chem_comp_monomers,
Added new data types int-csv and int-scsv
Added automated subcategory aggregation in the general data processing pipeline.
New sub-category taxonomy_lineage with items: _rcsb_entity_source_organism.taxonomy_lineage_id, _rcsb_entity_source_organism.taxonomy_lineage_name, _rcsb_entity_source_organism.taxonomy_lineage_depth
and _rcsb_entity_host_organism.taxonomy_lineage_id, _rcsb_entity_host_organism.taxonomy_lineage_name,
_rcsb_entity_host_organism.taxonomy_lineage_depth
New sub-category rcsb_ec_lineage with items _entity.rcsb_ec_lineage_name _entity.rcsb_ec_lineage_id
and _entity.rcsb_ec_lineage_depth
Add_rcsb_schema_container_identifiers.schema_name, _rcsb_schema_container_identifiers.collection_name
_rcsb_schema_container_identifiers.collection_schema_version
Configuration changes to simplify the handling of versioning removing these from database and collection names
and schema file names, version configuration now managed in separate configuration options,
NCBI_TAXONOMY_LOCATOR removed, added configuration options NCBI_TAXONOMY_CACHE_PATH and ENZYME_CLASSIFICATION_CACHE_PATH,
provide separate configuration options schema version assignment within collections,
up-version api and json schema, add config option VRPT_REPO_PATH_ENV.
chem_comp category attributes normalized between chem_comp_core and bird_chem_comp_core collections.
17-Mar-2019 V0.48 adding subcategory aggregate rcsb_macromolecular_names_combined
24-Mar-2019 V0.49 address missing category issues in chemical reference data collections, various changes to
address schema validation exceptions introduced by additionalProperties: False. Add consolidated
EC item. Extend taxonomy lineage to include leaf node.
24-Mar-2019 V0.50 temporary adjustment for error handing for obsolete taxonomy ids.
25-Mar-2019 V0.51 remap merged taxons and adjust exception handling for taxonomy lineage generation
31-Mar-2019 V0.52 block_attributes: REF_PARENT_CATEGORY_NAME: REF_PARENT_ATTRIBUTE_NAME: to provide parent details for synthetic key
add 'addParentRefs' in enforceOpts to write relative $ref properties to describe parent relationships.
Address GitHub issues:
Failed int cast for None in DataTransformFactory #20 and
MySQL SchemaDefLoader skip zero values #19
3-Apr-2019 V0.53 added private schema property _primary_key to mark attributes that are part of an object primary key.
This feature is controlled by 'addPrimaryKey' enforceOpts property.
7-Apr-2019 V0.54 adding support for integrating CATH and SCOP structural classifiers and sequence difference type counts.
11-Apr-2019 V0.55 add tree loaders to etl cli tool, add entity_instance_validation, add a variety of counts, add atc codes.
13-Apr-2019 V0.56 add prototypical json schema properties '_attribute_groups' mapped to subcategory id/labels.
17-Apr-2019 V0.57/58 schema and cache adjustments for node tree loading
25-Apr-2019 V0.59 adds parent source/host organism and polymer monomer counts
3-May-2019 V0.60 adds helper support for attributes:
_rcsb_entry_info.deposited_polymer_monomer_count,
_rcsb_entry_info.polymer_entity_count_protein,
_rcsb_entry_info.polymer_entity_count_nucleic_acid,
_rcsb_entry_info.polymer_entity_count_nucleic_acid_hybrid,
_rcsb_entry_info.polymer_entity_count_DNA,
_rcsb_entry_info.polymer_entity_count_RNA,
_rcsb_entry_info.nonpolymer_ligand_entity_count
_rcsb_entry_info.selected_polymer_entity_types
_rcsb_entry_info.polymer_entity_taxonomy_count
_rcsb_entry_info.assembly_count
and categories
rcsb_entity_instance_domain_scop
rcsb_entity_instance_domain_cath
4-May-2019 V0.61 extend content in and categories rcsb_entity_instance_domain_scop
rcsb_entity_instance_domain_cath to support slicing
18-May-2019 V0.62 add rcsb_assembly_info category to the core_assembly
adjust classification of ternary complexes extending enumeration for _rcsb_entry_info.selected_polymer_entity_types
add new classifier _rcsb_entry_info.na_polymer_entity_types
and removed _rcsb_entry_info.nonpolymer_ligand_entity_count
20-May-2019 V0.63 Category rcsb_prot_sec_struct_info added to core_entity_instance and prune primary secondary structure
categories from this collection.
21-May-2019 V0.64 Handle odd ordering of records in struct_ref_seq_dif
29-Jun-2019 V0.65 Update development workflow, method code packaging, and general housekeeping
8-Aug-2019 V0.66 Refactoring dictionary method helpers and external resource provisioning to
reduce redundant processing and facilitate greater caching. Remove testing
dependencies on mock schemas and add explicit schema differencing to better
schema evolution. Add new helper methods for sites and connections and revise helper
methods for all entity and instance features.
10-Sep-2019 V0.67 Adding extended metadata types, AtcProvider() and SiftsSummaryProvider().
19-Sep-2019 V0.68 Eliminate private keys, revise/extend internal schema metadata, purge schema,
add polymer alignment methods.
24-Sep-2019 V0.69 Add additional polymer_instance features and new polymer_instance_validations methods.
25-Sep-2019 V0.70 Incorporate RCSB extensions in 'min' schemas add single collection specific identifiers core collections
Add missing required entry identifiers.
25-Sep-2019 V0.71 Adjustments for edge cases and enums after full-load tests
29-Sep-2019 V0.72 Further adjustments for edge cases and enums after full-load tests
29-Sep-2019 V0.73 Parallelize merging content types in RepositoryProvider()
29-Sep-2019 V0.74 Cleaning up more edge cases with missing or disorganized data
6-Oct-2019 V0.75 Reorganize feature value within feature range and position subcategories
7-Oct-2019 V0.76 Tweak sequence reference processing to better cope with inconsistencies and missing data.
8-Oct-2019 V0.77 Upversion schema after successful build
14-Oct-2019 V0.78 Add support for sequence isoforms, entry molecular weight, turn off SIFTS for
references and alignments, and various schema adjusts.
14-Oct-2019 V0.79 Add test cases and fix exceptions for missing sequence difference details.
14-Oct-2019 V0.80 Fix linting issue and update documentation in configuration example file -
16-Oct-2019 V0.81 Filter reference sequence assignments marked as self-reference - added tests for GO Id consistency
17-Oct-2019 V0.82 Preserve annotations on self-referenced sequences for now/shift ATC depth
19-Oct-2019 V0.83 Strip missing/empty values from subcategory aggregates/handle missing accession codes on struct_ref_seq records
19-Oct-2019 V0.84 Abandon reliance on ref_id/align_id consistency in struct_ref_* categories.
20-Oct-2019 V0.85 Update type, coverage and configuration for latest IHM dictionary update
23-Oct-2019 V0.86 Add nesting property to objects with unit cardinality,
don't propagate exact-match/suggest search contexts to full-text
25-Oct-2019 V0.87 Address a couple of anomalous sequence accession code issues and sync up mock repos
26-Oct-2019 V0.88 Address anomalous non-polymer reference sequence records.
2-Nov-2019 V0.89 Add bird specific citation handling methods and rerun
23-Nov-2019 V0.90 Move pipeline to py38
4-Dec-2019 V0.91 Add type specific entity collections (branch entity-type-division)
5-Dec-2019 V0.92 Update search contexts for repository holdings.
8-Dec-2019 V0.93 Partially handle odd cases of missing entity_poly category and adjust enumeration for nonpolymer instances features
Adjust error reporting in getProtSecStructFeatures(), tweak mandatory items for fiber diffraction, handle boundary
counts for cases with no macromolecules
10-Dec-2019 V0.94 Adding aggregation of gene names, related annotation lineage, fixed miscellaneous counting issues
14-Dec-2019 V0.941 Reorganize repository holdings collections
15-Dec-2019 V0.942 Suppress SIFTS with primary data loading
16-Dec-2019 V0.943 Extend normalizeCsvToList() to handle more inconsistent cases.
16-Dec-2019 V0.944 Add workflow module and tests to facilitate Luigi integration.
23-Dec-2019 V0.945 Add consolidated EC processing and disable auto search context filters
04-Jan-2020 V0.946 Add support for rcsb_polymer_entity.rcsb_enzyme_class_combined_depth, set single process cache management on loading
08-Jan-2020 V0.947 Add checks for missing entity gene names
10-Jan-2020 V0.949 Update dependencies for rcsb.utils.struct
20-Jan-2020 V0.950 pre-beta-rev-v1 - reorganizing assembly and feature categories, standardize on provenance_source, update enums
25-Jan-2020 V0.951 nested schema extension adjustments
27-Jan-2020 V0.952 configuration update updating ExDb collection names
28-Jan-2020 V0.953 update configuration documentation and cleanup unused files
29-Jan-2020 V0.954 finalize feature schema
30-Jan-2020 V0.955 update dependencies
30-Jan-2020 V0.956 Change PDBx MOCK repository organization from SANDBOX to FTP repository layout, handle missing data for
chromophores, and update configuration example file.
30-Jan-2020 V0.957 Update dependency version for rcsb.utils.config
2-Feb-2020 V0.958 Corrections keyword arguments in RepoLoadWorkflow()
3-Feb-2020 V0.959 Split configuration file into site and schema portions
5-Feb-2020 V0.960 Fix some inconsistencies among repository holdings collections and add consistency checks for
entity instance validation features
6-Feb-2020 V0.961 Add additional chimeric and multi-reference test cases and suppress mongodb tracebacks on replace failures
9-Feb-2020 V0.962 Cleanup combined software program list, separate tests to detect schema differences, add support for
schema updates to existing collections, update schema update CLI with compare methods.
11-Feb-2020 V0.963 Expose upsertFlag on mongo update method wrapper.
12-Feb-2020 V0.964 Extend entity and entity instance annotations with covalent linkage details
12-Feb-2020 V0.965 Add model_ids to entry_container_identifiers and update dependencies
14-Feb-2020 V0.966 Add MongoDb maintenance operations, point example config to development branch assets, update enums with
BIRD_MOLECULE_NAME.
19-Feb-2020 V0.967 Initialize zero values for feature counts and coverage
26-Feb-2020 V0.968 Add missing provenance source for rcsb_chem_comp_synonyms, filter case duplicates for taxonomy common
names, entity descriptions, and gene names.
5-Mar-2020 V0.969 Allow category nested contexts and remote tests for Uniprot-core
17-Mar-2020 V0.970 Add separate primary citation category and offset for zero based sequence database references
24-Mar-2020 V0.971 Add support for rcsb_chem_comp_annotation and various fixes for schema builder.
27-Mar-2020 V0.972 Add support for rcsb_repository_holdings_removed.repository_content_types
1-Apr-2020 V0.973 Add test cases for pdbx_audit_revision_details.description
1-Apr-2020 V0.974 Add test cases for RESID annotations
5-Apr-2020 V0.975 Adjustments to logging details for sequence processing
7-Apr-2020 V0.976 Add min/max for consolidated diffraction wavelength details
8-Apr-2020 V0.977 Adjust handling of nested contexts for subcategories.
17-Apr-2020 V0.978 Add rcsb_accession_info.has_released_experimental_data
27-Apr-2020 V0.979 Adjustments for remediated carbohydrate data files, leave BIRD_* annotations solely in definition collections.
30-Apr-2020 V0.980 New NMR repo holdings content types, specific configuration option added for ed maps list
2-May-2020 V0.981 Restore out-of-sync repository holdings data processing module and update related tests
8-May-2020 V0.982 Adding support for context_attribute_values and search path schema extensions.
13-May-2020 V0.983 Adjustments to suppress materialization of an assembly of deposited coordinates
15-May-2020 V0.984 Adjustments to suppress logging of missing assembly data
16-May-2020 V0.985 Adding test case for branched entity replace test
18-May-2020 V0.986 More complete object purge for loadType replace, explicitly filter heavy atoms in counting stats.
19-May-2020 V0.987 Reduce verbose logging of memory usage and use alternative $regex filter for mongo purging
21-May-2020 V0.988 Added additional methods to export search group and context statistics
22-May-2020 V0.989 Adding auth to entity polymer sequence residue mapping item.
3-Jun-2020 V0.990 Updating dependencies for rcsb.utils.validation and validation test conditions.
4-Jun-2020 V0.991 Adding new flag for prerelease sequence availability
5-Jun-2020 V0.992 Adding support for new enum DDL attributes.
12-Jun-2020 V0.993 Adjustments to the labeling of site, occupancy and observed features
17-Jun-2020 V0.994 Adjustments rcsb_repository_holdings_unreleased_entry collection
18-Jun-2020 V0.995 Adding entity taxonomy count.
25-Jun-2020 V0.996 Adding schema validation to automatically follow on from any load failures in mongo/PdbxLoader()
25-Jun-2020 V0.997 Update dependencies
30-Jun-2020 V0.998 Add partial reload of failed object salvage path
10-Jul-2020 V0.999 Adding support for derivative branched entity methods
15-Jul-2020 V1.100 Adding methods to export water counting statistics.
28-Jul-2020 V1.200 Add stash server config options to example configuration file
28-Jul-2020 V1.300 Tweak stash configuration
30-Jul-2020 V1.400 Add PubChem and Pharos mapping
5-Aug-2020 V1.500 Add carbohydrate content to rcsb_chem_comp_synonyms and rcsb_chem_comp_annotation
8-Aug-2020 V1.610 Limit SIFTS summary data cached
15-Aug-2020 V1.620 Update dependency version for CATH backup
21-Aug-2020 V1.630 Turn on the carbohydrate type options.
26-Aug-2020 V1.631 Adjust polymer type counting -
29-Aug-2020 V1.632 Add fix for missing data collection resolution
30-Aug-2020 V1.633 Fix issue with subcategory aggregates with unit cardinality
10-Sep-2020 V1.634 Trap missing data type exception in schema construction
24-Sep-2020 V1.635 Use alternative mapping from pdbx_poly_seq_scheme
1-Oct-2020 V1.636 Change handling of chemical component synonyms
7-Oct-2020 V1.637 Update sequence cluster data and config path
12-Oct-2020 V1.638 Update modeled residue counts and add option to use selective internal enumerations.
13-Oct-2020 V1.639 Update example data and configuration
13-Oct-2020 V1.640 Cast examples in schema, adjust handling of internal enumeration fallbacks
21-Oct-2020 V1.641 Add DrugBank brand names to rcsb_chem_comp_synonyms
23-Oct-2020 V1.642 Add rcsb_repository_holdings_combined_entry collection
24-Oct-2020 V1.643 Incorporate obsolete/supersede item in rcsb_repository_holdings_combined_entry
28-Oct-2020 V1.644 Make reference alignment options accessible from the configuration file
31-Oct-2020 V1.645 Add taxonomy identifier salvage method
1-Nov-2030 V1.646 Extend taxonomy identifier salvage method
6-Dec-2020 V1.647 Update database server configuration
9-Dec-2020 V1.648 Add validation slider image repository content type
16-Dec-2020 V1.649 Add hydrogen atom counts -
28-Dec-2020 V1.650 Corrections to indexing in multiprocessing RepoScanUtils(), fix CLI test cases
6-Jan-2021 V1.651 Enumeration updates for selected polymer types.
8-Jan-2021 V1.652 Add mongodb distinct() method.
8-Jan-2021 V1.653 Enumeration updates for selected polymer types.
12-Jan-2021 V1.654 Add primary citation author ORCIDs
15-Jan-2021 V1.655 Back out Protein/NA/Oligosaccharide selected polymer type for now
16-Jan-2021 V1.656 Further enumeration updates for polymer types.
28-Jan-2021 V1.657 RSRCC -> RSCC
29-Jan-2021 V1.658 Expose OpenEye SMILES descriptors as RCSB descriptors
30-Jan-2021 V1.659 Update rcsb.utils.taxonomy dependency to add fallback capability
12-Feb-2021 V1.660 Add ligand validation score and generation category rcsb_nonpolymer_instance_validation_score
15-Feb-2021 V1.661 Split out dictionary methods repository tools into rcsb.utils.dictionary and rcsb.utils.repository
21-Feb-2021 V1.662 Integrating latest rcsb.utils.dictionary and rcsb.utils.repository.
22-Feb-2021 V1.663 Further integration adjustments for rcsb.utils.dictionary.
25-Feb-2021 V1.664 Update dependencies and pipeline config
7-Mar-2021 V1.666 Handle inconsistent superseding data in repository holdings.
12-Mar-2021 V1.667 Add separate processing of pdb-format update list inventory.
18-Mar-2021 V1.668 Add support for embedded iterables within subcategory aggregates, add additional_properties
feature categories.
7-Apr-2021 V1.669 Updates for corrections in DictionaryAPI() method name getFullDescendentList()
28-Apr-2021 V1.670 Add content type for mmCIF format validation data
4-May-2021 V1.671 Remove deprecated schema attribute rcsb_nested_indexing_context.context_attributes.search_paths
19-May-2021 V1.673 Update dependencies in rcsb.utils.dictionary
27-May-2021 V1.674 Update dependencies to rcsb.utils.dictionary
2-Jun-2021 V1.675 Update dependencies to rcsb.utils.dictionary and diagnostic output
3-Jun-2021 V1.676 Setting long description format to markdown - update pipeline config again
5-Jun-2021 V1.677 Prune test cases
10-Jun-2021 V1.679 Update dependencies to rcsb.utils.dictionary
26-Jul-2021 V1.680 Update dependencies to rcsb.utils.dictionary, pipeline configuration, and example configuration
27-Jul-2021 V1.681 Add support for provider exclusion filters to reduce test resource burden
28-Jul-2021 V1.682 Adjustments for API extensions in DictMethodResourceProvider() and further culling of redundant unittests
29-Jul-2021 V1.683 Adjustments for API extensions in RepoLoadWorkflow() and dependency updates
1-Aug-2021 V1.684 Add support for embedded iterables in json schema generation and support scanning obsolete entries.
1-Aug-2021 V1.685 Adjust argument processing in RepoScanExec.py and update dependencies
3-Aug-2021 V1.686 Update configuration example and dependencies
25-Aug-2021 V1.687 Update dependencies
26-Aug-2021 V1.688 Update dependencies
28-Aug-2021 V1.689 Extend failure reporting for MongoDb PdbxLoader()
28-Aug-2021 V1.690 Further adjustments in reporting for MongoDb PdbxLoader()
10-Sep-2021 V1.691 Update dependencies for rcsb.utils.dictionary and rcsb.utils.io
23-Sep-2021 V1.692 Simplify updates to RepoHoldingsDataPrep() for the update collection
27-Sep-2021 V1.693 Add RepoHoldingsRemoteDataPrep and support for RepositoryProvider discoveryMode="remote"
29-Sep-2021 V1.694 Make discovery mode an internal configuration property of RepositoryProvider()
8-Oct-2021 V1.695 Pass configuration URLs to RepositoryHoldingsRemoteDataPrep() and
added support for building computed models schema
16-Oct-2021 V1.696 Add validation and loading tests for computed models
17-Oct-2021 V1.697 Standardize path details for computed models
17-Oct-2021 V1.698 Change the path hierarchy for computed models
16-Dec-2021 V1.699 Enforce pymongo version 3.12.0; latest release (4.0) results in self.__dbClient not being created successfully
30-Mar-2022 V1.700 Add pdbx_comp_model_core database to CLI commands, and
Add support for loading id code lists for mongo PdbxLoader() (preliminary)
Exclude 'ma_' categories from being declared as mandatory
20-Apr-2022 V1.701 Adding support for computed-model loading;
Updates to handle comp model local and global scores, as well as internal identifiers;
Add testRepoHoldingsRemoteDataPrep test case;
Enable use of cluster filename template configuration passed in by Luigi
29-Jun-2022 V1.702 Remove unnecessary custom handling of computed-model identifiers (will now use the internally-modified entry.id)
23-Dec-2022 V1.703 Configuration changes to support tox 4
26-Jan-2023 V1.704 Update MA_DICT_LOCATOR path in exdb-config-example.yml and add uchar5 to DataTypeApplicationInfo.py
30-Jan-2023 V1.705 Update requirements (pin SQLAlchemy)
14-Feb-2023 V1.706 Updates to PdbxLoader and RepoLoadWorkflow to support resumability of core data loading tasks
22-Feb-2023 V1.707 Updates to PdbxLoader to use case-sensitivity for brute force document purge
06-Apr-2023 V1.708 Add support for entity_id_list cifType in DataTypeApplicationInfo
26-Apr-2023 V1.709 Fix document pre-purge regex during load, and add regexPurge flag to control running that step
8-May-2023 V1.710 Fix error handling in PdbxLoader to cause failure when documents fail to load
19-May-2023 V1.711 Update DNS to PDB archive
23-Sep-2023 V1.712 Add support for pdb_id_u cifType in DataTypeApplicationInfo
6-Nov-2023 V1.713 Add maxStepLength argument to RepoLoadWorkflow
21-Nov-2023 V1.714 Add support in json schema for min and unique items for sub-categories and iterable attributes
15-Dec-2023 V1.715 Revert setting for mergeValidationReports
19-Mar-2024 V1.716 Add quality check to PdbxLoader to ensure validation report data are loaded along with the primary data;
Begin updating PdbxLoader and RepoLoadEx to support weekly update workflow CLI requirements
03-Apr-2024 V1.717 Add int_list cifType to DataTypeApplicationInfo
09-Apr-2024 V1.718 Update RepoLoadExec CLI and RepoLoadWorkflow to support CLI usage from weekly-update workflow
6-May-2024 V1.719 Updates to CLI utilities
9-May-2024 V1.720 Adjust provider type exclusion input to accept a list of types; update setuptools config
13-May-2024 V1.721 Update requirements
1-Jul-2024 V1.722 Pylinting
13-Aug-2024 V1.723 Update code for pymongo 4.x support
10-Sep-2024 V1.724 Add load completion check for the number of nonpolymer entity instances with validation data
22-Oct-2024 V1.725 Remove dependency on edmaps holdings file (no longer generating Map Coefficient MTZ files);
Add CLI support for performing final sanity check for ExDB loading and holdings in etl.load_ex.DbLoadingWorkflow task;
Update CI/CD testing to use python 3.10
23-Dec-2024 V1.726 Skip integers that exceed max int32 in DataTransformFactory