Short Description
In Go Vendor Tools (https://fedora.gitlab.io/sigs/go/go-vendor-tools/), we use the scancode.api.get_licenses() Python API to detect license texts contained within license files and then produce an SPDX expression (relevant code). The scancode license detector is very powerful thanks to its large corpus of license texts, but its huge dependency footprint has prevented us from making it the default license scanner in go-vendor-tools (currently, askalono is the default). Many of these dependencies exist to support package scanning features that we don't use. I wonder what it would look like to extract the license scanning–related APIs into a smaller subpackage (or to make more dependencies optional Python extras) without disrupting scancode-toolkit as a whole. Do you think that'd be feasible?
Possible Labels
- new feature
- improve-license-detection
- installation and packaging
- core and api
Select Category
Describe the Update
A mechanism to use the scancode/licensedcode file license detection API (scancode.api.get_licenses()) without pulling in so many dependencies.
How This Feature will help you/your organization
See the Short Description above: a smaller dependency footprint would remove what has kept Go Vendor Tools from making scancode its default license scanner.
Possible Solution/Implementation Details
See above. Reduction of dependencies in the main scancode-toolkit package (e.g., by using extras) or extracting the license scanning API are possible solutions.
Example/Links if Any
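To illustrate the consumer side, here is a minimal sketch of how a tool like ours might treat scancode as an optional dependency and call only the license detection API. The helper names `load_scancode_get_licenses` and `detect_spdx` are hypothetical, not the actual go-vendor-tools code, and the `detected_license_expression_spdx` result key is an assumption based on recent scancode-toolkit releases.

```python
from importlib import import_module
from typing import Callable, Optional


def load_scancode_get_licenses() -> Optional[Callable[..., dict]]:
    """Return scancode.api.get_licenses, or None if scancode-toolkit is not installed."""
    try:
        return import_module("scancode.api").get_licenses
    except ImportError:
        return None


def detect_spdx(
    path: str, get_licenses: Optional[Callable[..., dict]] = None
) -> Optional[str]:
    """Return the SPDX expression detected in one license file, or None."""
    if get_licenses is None:
        get_licenses = load_scancode_get_licenses()
    if get_licenses is None:
        raise RuntimeError("scancode-toolkit is not installed")
    result = get_licenses(path)
    # Assumption: the result mapping exposes this key; adjust for the
    # scancode-toolkit version actually in use.
    return result.get("detected_license_expression_spdx")
```

With a smaller subpackage or an extra, only the import inside `load_scancode_get_licenses` would need to resolve; everything related to package scanning could stay uninstalled.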
Can you help with this Feature