Release history and notes

Sequence-based identifiers are used for versioning (the scheme follows below):

major.minor[.revision]
  • It is always safe to upgrade within the same minor version (for example, from 0.3 to 0.3.4).

  • Minor version changes might be backwards incompatible. Read the release notes carefully before upgrading (for example, when upgrading from 0.3.4 to 0.4).

  • All backwards incompatible changes are mentioned in this document.
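
The upgrade rule above can be sketched in a few lines of Python. This is only an illustration of the versioning scheme, not code from the package; the helper names parse_version and is_safe_upgrade are hypothetical.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'major.minor[.revision]' into a 3-tuple; revision defaults to 0."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected major.minor[.revision], got {text!r}")
    revision = parts[2] if len(parts) == 3 else 0
    return parts[0], parts[1], revision


def is_safe_upgrade(current: str, target: str) -> bool:
    """True when target is the same major.minor with an equal or newer revision."""
    cur, new = parse_version(current), parse_version(target)
    return cur[:2] == new[:2] and new[2] >= cur[2]


print(is_safe_upgrade("0.3", "0.3.4"))  # True: same minor version
print(is_safe_upgrade("0.3.4", "0.4"))  # False: minor version changes
```

A False result does not mean the upgrade will break anything; it only signals that the release notes should be read first.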

0.6.1

2026-04-22

  • Improve OGL patterns.

0.6

2026-04-11

  • Added jurisdiction and scope detection for Creative Commons license URLs (e.g., http://creativecommons.org/licenses/by-nc/2.0/uk → jurisdiction “uk”, http://creativecommons.org/licenses/by-nc/3.0/igo → scope “igo”).

  • Added optional jurisdiction and scope fields to the LicenceVersion model.

  • Added prose patterns for CC URLs embedded in text (e.g., “Article under https://creativecommons.org/licenses/by-sa/3.0/”).
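
To illustrate what jurisdiction and scope detection means for the URLs quoted above, here is a standalone sketch using a regular expression. It is not the package's implementation: the names CC_URL, CCParts and split_cc_url are hypothetical, and treating "igo" as the only scope suffix is an assumption made for this example.

```python
import re
from typing import NamedTuple, Optional

# Assumption: "igo" is the only suffix treated as a scope; any other
# trailing path segment is treated as a jurisdiction code.
SCOPES = {"igo"}

CC_URL = re.compile(
    r"https?://creativecommons\.org/licenses/"
    r"(?P<code>[a-z-]+)/(?P<version>\d+\.\d+)"
    r"(?:/(?P<suffix>[a-z]{2,}))?/?$",
    re.IGNORECASE,
)


class CCParts(NamedTuple):
    code: str
    version: str
    jurisdiction: Optional[str]
    scope: Optional[str]


def split_cc_url(url: str) -> Optional[CCParts]:
    """Split a CC licence URL into code, version, jurisdiction and scope."""
    match = CC_URL.match(url.strip())
    if match is None:
        return None
    suffix = (match.group("suffix") or "").lower() or None
    scope = suffix if suffix in SCOPES else None
    jurisdiction = suffix if suffix and scope is None else None
    return CCParts(
        match.group("code").lower(), match.group("version"), jurisdiction, scope
    )


print(split_cc_url("http://creativecommons.org/licenses/by-nc/2.0/uk"))
print(split_cc_url("http://creativecommons.org/licenses/by-nc/3.0/igo"))
```

The first call yields jurisdiction "uk" and no scope; the second yields scope "igo" and no jurisdiction, matching the two examples in the notes.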

0.5.2

2026-04-10

  • Minor fixes.

0.5.1

2026-04-04

  • Improved Creative Commons recognition (additional aliases, versions and IGO variants) and expanded free-text matching.

  • Enhanced traceability: resolved licences include source line/file info surfaced in explain outputs.

  • Added integration and trace tests for CC matching and source-line explanations.

  • Docs updated to reflect traceability and plugin interface changes.

0.5

2026-03-27

Major refactoring and data consolidation

This release significantly simplifies the data model and maintenance while preserving full backwards compatibility for the public API.

  • Merged url_map.json and publishers.json into a single unified aliases/aliases.json file.

    • All URL mappings and publisher-specific shorthand aliases are now stored together with rich metadata (including urls arrays and aliases lists).

    • Removed separate data/urls/ and data/publishers/ directories (and their parsers).

    • Updated all documentation, tests, scripts, and contributing guidelines accordingly.

  • Simplified internal lookups:

    • Consolidated URL handling: LicenceNormaliser now builds a single _vkey_to_url map (version_key → list of URLs) and selects the shortest/cleanest URL for display.

    • Alias and URL resolution now use the same AliasParser (which also implements URLPlugin).

    • Trace output now consistently points to aliases.json with correct line numbers for both aliases and URLs.

  • Improved family categorisation:

    • Fixed wrongly categorised licences (notably unlicense, wtfpl, and several others now correctly map to public-domain where appropriate).

    • Updated fallback logic in _infer_family() and related tests.

  • Other fixes and cleanups:

    • Minor improvements to error handling and DataSourceError wrapping.

    • Updated exception hierarchy documentation (kept LicenceNormalisationError for backwards compatibility).

    • Cleaned up unused/deprecated references across the codebase.
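
The consolidated URL handling described above (a version_key → list of URLs map from which the shortest URL is chosen for display) might look roughly like the sketch below. The entry layout in ENTRIES and the function names are hypothetical; the real aliases.json structure and the package's selection rules may differ, and only the "shortest" criterion is modelled here.

```python
from collections import defaultdict

# Hypothetical entries; the real aliases.json layout may differ.
ENTRIES = [
    {
        "version_key": "cc-by-4.0",
        "urls": [
            "https://creativecommons.org/licenses/by/4.0/legalcode",
            "https://creativecommons.org/licenses/by/4.0/",
        ],
    },
    {"version_key": "mit", "urls": ["https://opensource.org/licenses/MIT"]},
]


def build_vkey_to_urls(entries):
    """Group all URLs by version key, dropping duplicates but keeping order."""
    vkey_to_urls = defaultdict(list)
    for entry in entries:
        for url in entry.get("urls", []):
            if url not in vkey_to_urls[entry["version_key"]]:
                vkey_to_urls[entry["version_key"]].append(url)
    return dict(vkey_to_urls)


def display_url(vkey_to_urls, version_key):
    """Pick the shortest URL for a key, or None when no URL is known."""
    urls = vkey_to_urls.get(version_key)
    if not urls:
        return None
    # Shortest first; ties are broken alphabetically so the result is stable.
    return min(urls, key=lambda u: (len(u), u))


url_map = build_vkey_to_urls(ENTRIES)
print(display_url(url_map, "cc-by-4.0"))
```

Keeping every URL in the map and deciding on the display form at lookup time means the same data can serve both resolution (any URL matches) and presentation (one canonical-looking URL).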

0.4

2026-03-26

  • Renamed project from license-normaliser to licence-normaliser.

  • Cleanup.

0.3.2

2026-03-25

  • Introduce sub-aliases in aliases.json.

  • Introduce helper scripts for finding, automatically filling, and merging duplicates.

0.3.1

2026-03-24

  • Minor fixes.

0.3

2026-03-23

  • Plugin-based architecture: LicenceNormaliser class accepts plugin classes (not instances) with lazy loading; replaces the old module-level globals.

  • 6 plugin interfaces: BasePlugin, RegistryPlugin, URLPlugin, AliasPlugin, FamilyPlugin, NamePlugin, ProsePlugin.

  • New `BasePlugin.refresh()` — classmethod to fetch/refresh data from upstream URLs; CLI uses parser IDs (spdx, opendefinition, etc.) instead of class names.

  • Thread-safe singleton: _DefaultNormaliser class with double-checked locking protects the shared LicenceNormaliser instance.

  • Public `registry_keys()`: LicenceNormaliser.registry_keys() exposes known keys; get_registry_keys() in _cache.py uses it.

  • URL population fix — inverted URL map (version_key → cleaned_url) ensures LicenceVersion.url is populated for resolved licences.

  • Removed dead code — deleted _registry.py, parsers/base.py, parsers/__init__.py, and empty DEFAULT_* module globals.

  • Docs updated — ARCHITECTURE.rst, AGENTS.md, README.rst rewritten to reflect new plugin architecture.
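
For readers unfamiliar with the double-checked locking mentioned in this release, the pattern can be sketched as follows. This is a generic illustration, not the package's _DefaultNormaliser: the classes ExpensiveNormaliser and SharedNormaliser are hypothetical stand-ins.

```python
import threading


class ExpensiveNormaliser:
    """Hypothetical stand-in for an object that is costly to construct."""

    built = 0  # counts how many instances were ever created

    def __init__(self):
        ExpensiveNormaliser.built += 1


class SharedNormaliser:
    """Holds one lazily created instance shared by all threads."""

    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get(cls):
        # First check without the lock: the cheap, already-initialised path.
        if cls._instance is None:
            with cls._lock:
                # Second check under the lock: another thread may have
                # created the instance while this one was waiting.
                if cls._instance is None:
                    cls._instance = ExpensiveNormaliser()
        return cls._instance


results = []
threads = [
    threading.Thread(target=lambda: results.append(SharedNormaliser.get()))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len({id(r) for r in results}), ExpensiveNormaliser.built)  # 1 1
```

The unlocked first check keeps the common path cheap once the instance exists, while the second check under the lock guarantees that construction happens only once.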

0.2

2026-03-21

  • Architecture rewrite — parser-based pluggable system with 8 parsers loading from JSON data files.

  • 3-level hierarchy: LicenceFamily → LicenceName → LicenceVersion with 11 families including publisher OA/TDM.

  • New data files — 170+ aliases, 41 prose patterns, 50+ publisher URLs (Elsevier, Wiley, Springer, ACS, etc.).

  • Strict mode: strict=True raises LicenceNotFoundError on unknown licences.

  • Bug fixes: cc-pdm/cc0 family inference, "+" suffix stripping for gpl-3.0+, removal of a false-positive prose pattern, ALIASES-before-REGISTRY lookup order.

  • Tests — 386 tests (up from ~80), including 165-case integration matrix.

  • Docs: AGENTS.md, ARCHITECTURE.rst, CONTRIBUTING.rst rewritten; stale literalinclude targets fixed.
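
The strict-mode behaviour introduced in this release can be pictured with the toy lookup below. It is a sketch under stated assumptions, not the package's API: the function normalise, the KNOWN table, and the "unknown" fallback returned in non-strict mode are all hypothetical, and LicenceNotFoundError is redefined locally so the example is self-contained.

```python
class LicenceNotFoundError(LookupError):
    """Local stand-in for the package's exception of the same name."""


# Hypothetical lookup table for illustration only.
KNOWN = {"mit": "mit", "cc0": "cc0-1.0", "apache-2.0": "apache-2.0"}


def normalise(text: str, strict: bool = False) -> str:
    """Return a normalised key; unknown input raises only in strict mode."""
    key = text.strip().lower()
    if key in KNOWN:
        return KNOWN[key]
    if strict:
        raise LicenceNotFoundError(f"unknown licence: {text!r}")
    # Assumption: non-strict mode falls back to a placeholder value.
    return "unknown"


print(normalise("MIT"))
print(normalise("not-a-licence"))
try:
    normalise("not-a-licence", strict=True)
except LicenceNotFoundError as exc:
    print("strict mode raised:", exc)
```

Strict mode is useful in pipelines where an unrecognised licence should stop processing instead of being silently passed through.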

0.1.1

2026-03-17

  • Remove Pydantic as dependency.

0.1

2026-03-17

  • Initial beta release.