Software Heritage
| Formation | June 30, 2016 |
|---|---|
| Founder | Roberto Di Cosmo, Stefano Zacchiroli |
| Type | Non‑profit |
| Headquarters | Inria |
| Location | |
| Scientific Advisors | Gérard Berry, Jean-François Abramatic, Julia Lawall, Serge Abiteboul |
| Affiliations | Inria |
| Staff | 13 |
| Website | softwareheritage |
Software Heritage is a non-profit organization that provides a service for archiving and referencing historical and contemporary software, with a focus on human-readable source code. The site was unveiled in 2016 by Inria and is supported by UNESCO. The project itself is structured as a non-profit multi-stakeholder initiative.
The stated mission of Software Heritage is to collect, preserve and share all software that is publicly available in source code form, with the goal of building a common, shared infrastructure at the service of industry, research, culture and society as a whole.
Software source code is collected by crawling code hosting platforms, such as GitHub, GitLab.com or Bitbucket, and package archives, such as npm or PyPI. It is then ingested into a special data structure, a Merkle DAG, that is the core of the archive. Each artifact in the archive is associated with a SoftWare Hash IDentifier (SWHID).
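As a rough illustration of how such identifiers can be derived from the data itself, the sketch below computes a content SWHID, which is understood here to use the same hash as a Git blob (SHA-1 over a `blob <length>\0` header followed by the bytes). The function names are hypothetical, and `toy_directory_id` is a deliberately simplified stand-in, not the real directory identifier format; it only shows the Merkle property that a parent node's identifier depends on the identifiers of its children.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Content SWHID, assuming the Git blob hashing scheme."""
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


def toy_directory_id(entries: dict) -> str:
    """Toy parent-node id: hash of the sorted (name, child id) pairs.

    Assumption for illustration only; this is NOT the real SWHID
    directory format.
    """
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(f"{name}\0{entries[name]}\n".encode())
    return h.hexdigest()


readme = content_swhid(b"hello world\n")
same_bytes = content_swhid(b"hello world\n")
changed = content_swhid(b"hello world!\n")

print(readme)                # swh:1:cnt:3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(readme == same_bytes)  # True: identical content maps to one node
# Changing a file changes the identifier of the directory that contains it.
print(toy_directory_id({"README": readme}) != toy_directory_id({"README": changed}))
```

Because identifiers are computed from content, the same file found on several hosting platforms maps to a single node, which is what allows a structure of this kind to deduplicate what it stores.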
To increase the chances of preserving the Software Heritage archive over the long term, a mirror program was established in 2018. As of October 2020, ENEA and FossID had joined it.