diff --git a/docs/how_to_identify_projects.md b/docs/how_to_identify_projects.md
new file mode 100644
index 0000000000..5ad95a489c
--- /dev/null
+++ b/docs/how_to_identify_projects.md
@@ -0,0 +1,51 @@
+# How to Identify Open Source Projects in Sustainability and Climate
+
+This document outlines a methodology for identifying relevant open-source projects in the areas of sustainability and climate, building upon the existing efforts of the Open Sustainable Technology initiative.
+
+## 1. Understanding the Scope and Principles
+
+Before beginning the search, it is crucial to understand the core principles and criteria for project inclusion as defined by OpenSustain.tech. Projects should:
+
+* Align with the [Open Sustainability Principles](https://opensustain.tech/principles/).
+* Be instrumental in preserving/restoring natural ecosystems, supporting climate change mitigation/adaptation, or enabling environmental sustainability through open technology, methods, data, knowledge, intelligence, or tools.
+* Be actively used or developed by others outside the core project or organization.
+* Be structured and documented for maintenance, reuse, and extensibility.
+* Be published under an open-source license.
+
+## 2. Leveraging Keyword-Based Search
+
+The `ost_keywords.txt` file provides a starting point: a list of `(term, count)` pairs, sorted by descending count, for terms frequently associated with open-source projects in this domain. These keywords can be used to formulate targeted search queries across various platforms.
+
+**Example Keywords:**
+
+* From `ost_keywords.txt`: `mass`, `content`, `concentration`, `radioactivity`, `atmosphere`, `water`, `carbon`, `energy`, `emission`, `wind`, `solar`. Broader domain terms that are not in the file can supplement them: `climate`, `biodiversity`, `renewable energy`, `hydrology`, `pollution`, `deforestation`, `conservation`, `sustainable development`.
+
+**Search Platforms:**
+
+* **GitHub:** Use advanced search qualifiers to combine keywords with programming languages (e.g., `language:python`, `language:R`, `language:Java`) and repository characteristics (e.g., `stars:>100`, `forks:>50`, `pushed:>2023-01-01`).
+* **GitLab, Bitbucket, Zenodo:** Extend searches to these platforms using similar keyword-based approaches.
+* **Academic Search Engines (e.g., Google Scholar, Semantic Scholar):** Search for research papers that mention open-source tools or datasets related to sustainability. Look for terms like "open-source software," "open data," or "GitHub repository" in conjunction with sustainability keywords.
+* **Specialized Search Engines/Indexes:** Explore platforms like [Libraries.io](https://libraries.io/), [PyPI](https://pypi.org/), and [rdrr.io](https://rdrr.io/) for package-level discovery.
+
+## 3. Exploring Existing Networks and Communities
+
+* **Open Source Communities:** Engage with existing open-source communities focused on environmental science, climate change, or sustainable technology. Forums, mailing lists, and social media groups can be excellent sources for discovering new projects.
+* **Conferences and Workshops:** Attend virtual or in-person conferences and workshops related to AI/ML for sustainability. Projects are often presented and discussed at these events.
+* **Journal of Open Source Software (JOSS):** Regularly review publications in JOSS for newly published open-source research software relevant to the domain.
+* **Crowdsourcing and Interviews:** As highlighted in the OpenSustain.tech methodology report, direct engagement with domain experts and practitioners through interviews or crowdsourcing initiatives can uncover valuable projects not easily found through automated means.
+
+## 4. Analyzing Project Suitability
+
+Once potential projects are identified, evaluate them against the OpenSustain.tech criteria. Pay close attention to:
+
+* **Active Development:** Check commit history, recent pull requests, and issue activity.
+* **Community Engagement:** Look for signs of external contributions, active discussions, and responsiveness from maintainers.
+* **Documentation:** Assess the clarity and completeness of project documentation, including installation guides, usage examples, and contribution guidelines.
+* **Licensing:** Verify that the project is released under a recognized open-source license.
+
+## 5. Contributing to the Open Sustainable Technology Database
+
+For any newly identified projects that meet the criteria, follow the [CONTRIBUTING.md](https://github.com/protontypes/open-sustainable-technology/blob/main/CONTRIBUTING.md) guidelines to add them to the Open Sustainable Technology database via a pull request. Be sure to provide all necessary details and a clear description of the project's relevance to sustainability.
+
+By following this approach, we can collectively enhance the visibility and impact of open-source projects driving environmental sustainability.
+
diff --git a/ost_keywords.txt b/ost_keywords.txt
new file mode 100644
index 0000000000..01e2f751a4
--- /dev/null
+++ b/ost_keywords.txt
@@ -0,0 +1 @@
+[('mass', 1241), ('content', 1157), ('concentration', 1145), ('radioactivity', 1089), ('atmosphere', 943), ('tendency', 866), ('surface', 786), ('water', 729), ('mole', 642), ('fraction', 409), ('integral', 385), ('flux', 361), ('emission', 352), ('carbon', 324), ('express', 324), ('particle', 291), ('aerosol', 281), ('nitrogen', 191), ('cloud', 176), ('ocean', 154), ('temperature', 144), ('deposition', 135), ('upward', 111), ('wave', 109), ('dioxide', 106), ('organic', 103), ('velocity', 102), ('transport', 99), ('soil', 98), ('downward', 96), ('land', 96), ('northward', 94), ('particulate', 92), ('matter', 86), ('heat', 86), ('wind', 86), ('energy', 83), ('stratiform', 82), ('production', 81), ('amount', 80), ('acid', 79), ('thickness', 76), ('eastward', 75), ('fire', 74), ('ambient', 73), ('monoxide', 72), ('downwelling', 70), ('liquid', 66), ('advection', 66), ('shortwave', 65), ('height', 63), ('pressure', 60), ('convective', 57), ('salt', 55), ('hydrogen', 55), ('agricultural', 55), ('snow', 54), ('density', 52), ('rate', 52), ('number', 52), ('inorganic', 51), ('nitrate', 49), ('longwave', 49), ('dissolve', 48), ('waste', 48), ('combustion', 48), ('assume', 46), ('radiative', 46), ('bromine', 43), ('chlorine', 42), ('parameterized', 42), ('stress', 41), ('eddy', 41), ('elemental', 40), ('vapor', 39), ('precipitation', 37), ('product', 37), ('radiance', 37), ('nmvoc', 36), ('biological', 36), ('clear', 36), ('salinity', 35), ('methyl', 34), ('spectral', 33), ('compound', 32), ('radical', 32), ('methane', 32), ('sulfur', 32), ('wavelength', 32), ('distribution', 32), ('swell', 32), ('variance', 32), ('phytoplankton', 31), ('dust', 30), ('maximum', 30), ('minus', 30), ('upwelling', 30), ('forest', 30), ('savanna', 30), ('grassland', 30), ('litter', 29), ('period', 29), ('diffusivity', 28), ('depth', 28), 
('vegetation', 27), ('optical', 26), ('mercury', 26), ('burning', 26), ('convection', 25), ('sulfate', 25), ('mix', 25), ('derivative', 25), ('treatment', 25), ('disposal', 25), ('chloride', 24), ('residential', 24), ('angle', 23), ('ammonia', 23), ('biomass', 23), ('humidity', 23), ('industrial', 23), ('geopotential', 22), ('condense', 22), ('nitrous', 22), ('ozone', 22), ('coefficient', 22), ('diffusion', 22), ('evaporation', 22), ('base', 21), ('altitude', 21), ('atomic', 21), ('gaseous', 21), ('molecular', 21), ('momentum', 21), ('photon', 21), ('irradiance', 21), ('iron', 21), ('chlorophyll', 21), ('alpha', 20), ('formaldehyde', 20), ('ratio', 20), ('frequency', 20), ('floor', 20), ('melt', 20), ('scatter', 19), ('anthropogenic', 19), ('peroxide', 19), ('toluene', 19), ('xylene', 19), ('square', 19), ('absorption', 18), ('butane', 18), ('ethane', 18), ('ethene', 18), ('ethyne', 18), ('propane', 18), ('propene', 18), ('photosynthetic', 18), ('mesoscale', 18), ('outgo', 18), ('tracer', 18), ('effective', 17), ('sigma', 17), ('benzene', 17), ('sulfide', 17), ('bromide', 17), ('force', 17), ('tide', 17), ('radius', 17), ('growth', 17), ('maritime', 17), ('troposphere', 16), ('condensation', 16), ('pinene', 16), ('nitric', 16), ('rainfall', 16), ('productivity', 16), ('runoff', 16), ('phosphorus', 16), ('threshold', 15), ('solar', 15), ('oxygen', 15), ('drag', 15), ('difference', 15), ('basal', 15), ('snowfall', 15), ('limitation', 15), ('analogue', 15), ('respiration', 15), ('flag', 14), ('correction', 14), ('gravity', 14), ('streamfunction', 14), ('dimethyl', 14), ('secondary', 14), ('relative', 14), ('canopy', 14), ('frozen', 14), ('spherical', 14), ('tropopause', 14), ('kinetic', 13), ('ammonium', 13), ('overturn', 13), ('sinking', 13), ('oxide', 12), ('river', 12), ('conservative', 12), ('geostrophic', 12), ('miscellaneous', 12), ('partial', 12), ('radiation', 11), ('instrument', 11), ('hexachlorocyclohexane', 11), ('graupel', 11), ('hexachlorobiphenyl', 11), 
('tropical', 11), ('forestry', 11), ('distance', 11), ('thermodynamics', 11), ('silicon', 11), ('moment', 11), ('sound', 11), ('backwards', 10), ('biogenic', 10), ('divalent', 10), ('isoprene', 10), ('peroxy', 10), ('volcanic', 10), ('cyclone', 10), ('taxon', 10), ('length', 10), ('geoid', 10), ('fall', 10), ('diatom', 10), ('dissipation', 10), ('fore', 10), ('effect', 10), ('turbulent', 10), ('ketone', 10), ('horizontal', 9), ('acetic', 9), ('beta', 9), ('formic', 9), ('cyanide', 9), ('hydrocarbon', 9), ('brightness', 9), ('albedo', 9), ('exclude', 9), ('direct', 9), ('drift', 9), ('moisture', 9), ('sedimentation', 9), ('natural', 9), ('day', 9), ('wavenumber', 9), ('sublimation', 9), ('plant', 9), ('heating', 9), ('alcohol', 9), ('solvent', 9), ('pentane', 9), ('acoustic', 8), ('strength', 8), ('anomaly', 8), ('aceto', 8), ('nitrile', 8), ('tetrachloride', 8), ('dichlorine', 8), ('ethanol', 8), ('hail', 8), ('hydroxyl', 8), ('limonene', 8), ('methanol', 8), ('peroxyacetyl', 8), ('peroxynitric', 8), ('radon', 8), ('shallow', 8), ('vorticity', 8), ('thermal', 8), ('flood', 8), ('shear', 8), ('gross', 8), ('slope', 8), ('diazotrophic', 8), ('aragonite', 8), ('calcite', 8), ('abiotic', 8), ('starboard', 8), ('significant', 8), ('directional', 8), ('virtual', 8), ('average', 7), ('azimuth', 7), ('zenith', 7), ('static', 7), ('enthalpy', 7), ('brox', 7), ('clox', 7), ('dinitrogen', 7), ('pentoxide', 7)]
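
The how-to describes combining keywords with GitHub search qualifiers, and the keyword file added here is a single Python-literal list of `(term, count)` tuples. A minimal sketch of how the two might be connected follows. The helper names `parse_keywords` and `build_queries`, and the default thresholds, are hypothetical and not part of this change; the qualifiers used (`language:`, `stars:`, `pushed:`) are standard GitHub repository search qualifiers. The sketch only builds query strings and makes no network calls.

```python
import ast


def parse_keywords(text):
    """Parse a Python-literal list of (term, count) tuples.

    Assumes the format seen in ost_keywords.txt; ast.literal_eval
    evaluates literals only and never executes arbitrary code.
    """
    pairs = ast.literal_eval(text.strip())
    return [(str(term), int(count)) for term, count in pairs]


def build_queries(pairs, top_n=5, language="python",
                  min_stars=100, pushed_after="2023-01-01"):
    """Build one GitHub repository search query per top-ranked term.

    The defaults are illustrative assumptions, not project policy.
    """
    # Rank by count, highest first, and keep the top_n terms.
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)[:top_n]
    qualifiers = f"language:{language} stars:>{min_stars} pushed:>{pushed_after}"
    return [f"{term} {qualifiers}" for term, _count in ranked]


if __name__ == "__main__":
    # Three entries copied from the start of ost_keywords.txt.
    sample = "[('mass', 1241), ('content', 1157), ('concentration', 1145)]"
    for query in build_queries(parse_keywords(sample), top_n=2):
        print(query)
```

The highest-count terms in the file (`mass`, `content`, `concentration`) are generic, so in practice it may be worth filtering the list or pairing each term with a domain word before searching.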