Instructions

How to add to the dataset

Sadly, the current set-up is a bit convoluted and geared towards GitHub users. For an introduction to using Git and GitHub through R, see Happy Git and GitHub for the useR.

If you have a GitHub account:

  1. Fork the repository

    Go to the GitHub repository and click the “Fork” button. This creates a copy of the repository in your GitHub account.

  2. Clone Your Fork

    In your forked repository, click the green “Code” button, copy the URL, and clone it to your local machine using Git:

    git clone https://github.com/your-username/your-forked-repo.git
  3. Update the CSV File

    • In your cloned repository, navigate to the CSV file (e.g., data/Contrib_metaData.csv).

    • Open it and add your data in the same format as the existing rows.

    • Save the file.

  4. Commit and Push Your Changes

    git add data/Contrib_metaData.csv
    git commit -m "Added new data to the CSV"
    git push origin main
  5. Submit a Pull Request

    • Go back to your fork on GitHub and click “Contribute” > “Open Pull Request.”

    • Submit your pull request, and I will review and merge your changes!
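
Step 3 asks for new data "in the same format as the existing rows". If you would rather not edit the file by hand, the following is a minimal Python sketch of one way to do that. The helper name `append_row` is hypothetical and not part of this repository, and the sketch assumes the first line of the CSV is the header row.

```python
import csv
from pathlib import Path


def append_row(csv_path, new_row):
    """Append one row (a dict) to a CSV, refusing rows whose columns differ from the header.

    Hypothetical helper for illustration; assumes the first line of the file is the header.
    """
    path = Path(csv_path)
    with path.open(newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))

    # The new row must use exactly the columns that already exist in the file.
    unknown = set(new_row) - set(header)
    missing = set(header) - set(new_row)
    if unknown or missing:
        raise ValueError(
            f"unknown columns: {sorted(unknown)}; missing columns: {sorted(missing)}"
        )

    # If the last existing row has no trailing newline, add one before appending.
    text = path.read_text(encoding="utf-8")
    with path.open("a", newline="", encoding="utf-8") as f:
        if text and not text.endswith("\n"):
            f.write("\n")
        writer = csv.DictWriter(f, fieldnames=header, lineterminator="\n")
        writer.writerow(new_row)
```

Called as `append_row("data/Contrib_metaData.csv", row)`, where `row` is a dict keyed by the column names in the file's header, it either appends the row in header order or raises a `ValueError` listing the mismatched columns.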

Codebook

All variables in the dataset, with their categories and explanations.

| Feature | Variable_name | Definition | Ranges/Categories Adopted |
|---|---|---|---|
| Contributor | dataset | Contributor ID | |
| DOI | NA | Paper ID | |
| First Author and publication year | AuthorYear | Name(s) of authors | First author and publication year used as study label. |
| Location | location | Location of the data used (country level) | |
| Overall Accuracy* | OA_reported | Effect size of interest | |
| Sample Size* | sample_size | The sample size (i.e., number of pixels or objects) | |
| Publication Year | Publication_Year | Year of publication | |
| Classification Type | classification_type | Unit of analysis in the primary study | Object-level, Pixel-level, Unclear |
| Model Group* | model_group | Type of algorithm used. Any group that makes up less than 5 is regrouped as other analysis | Use abbreviations: Decision Tree (DT), Discriminant Analysis (DA), Fuzzy (FZ), Genetic Algorithm (GA), Immune System (IS), Index-Based (IB), K-Nearest Neighbor (KNN), Maximum Likelihood (ML), Minimum Distance (MD), Neural Network (NN), Parallelepiped (PP), Random Forest (RF), Spectral Angle Mapper (SAM), Subspace (SS), and Support Vector Machines (SVM), Ot |
| Ancillary Data | ancillary | Use of non-RS data in the model | Remote Sensing Only, Ancillary Data Included |
| Indices | indices | Use of indices to enhance analysis | Used, Not Used |
| Remote Sensing Type | RS_device_type | Category of remote sensing | Active, Passive, Combined, Not Reported |
| Device Group | RS_device_group | Specific device extracted, then grouped | Landsat, Sentinel, Other, Not Reported |
| Number of Spectral Bands | RS_spectral_bands_no | Number of spectral bands used | Count the number of bands or NA |
| Spectral Bands Group* | no_band_group | Number of spectral bands, regrouped | Low: 1-4, Mid: 5-20, High: >20, Not Reported: NA |
| Spatial Resolution | RS_spatital_resolution_m | Spatial resolution in meters | e.g., 30, <1, NA |
| Confusion Matrix* | Confusion_matrix | Whether a confusion matrix was present | Reported, Not Reported |
| Majority-class Proportion* | fraction_majority_class | The proportion of the largest class | |
| Device | Rs_devices | Type of remote sensing device | Satellite, Aerial Photographic Images |
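
Before opening a pull request, it can help to check a new row against the categories listed in the codebook. The following is a rough Python sketch, not code from this repository. The names `ALLOWED`, `band_group`, and `check_row` are hypothetical. The category strings are transcribed from the codebook, and the exact labels stored in the CSV (for example the capitalisation of "High") are an assumption.

```python
import math

# Allowed values, transcribed from the "Ranges/Categories Adopted" column.
# Assumption: the CSV stores these strings exactly as written here.
ALLOWED = {
    "classification_type": {"Object-level", "Pixel-level", "Unclear"},
    "ancillary": {"Remote Sensing Only", "Ancillary Data Included"},
    "indices": {"Used", "Not Used"},
    "RS_device_type": {"Active", "Passive", "Combined", "Not Reported"},
    "RS_device_group": {"Landsat", "Sentinel", "Other", "Not Reported"},
    "Confusion_matrix": {"Reported", "Not Reported"},
}


def band_group(n_bands):
    """Map a band count to a no_band_group label: Low 1-4, Mid 5-20, High >20."""
    if n_bands is None or (isinstance(n_bands, float) and math.isnan(n_bands)):
        return "Not Reported"  # the codebook maps NA to "Not Reported"
    if n_bands < 1:
        raise ValueError("the number of spectral bands must be at least 1")
    if n_bands <= 4:
        return "Low"
    if n_bands <= 20:
        return "Mid"
    return "High"


def check_row(row):
    """Return a list of messages for categorical values not listed in the codebook."""
    problems = []
    for column, allowed in ALLOWED.items():
        value = row.get(column)
        # Columns absent from the row are skipped; only present values are checked.
        if value is not None and value not in allowed:
            problems.append(f"{column}: {value!r} is not one of {sorted(allowed)}")
    return problems
```

An empty list from `check_row` means every categorical value present in the row matches the codebook. Free-text and numeric columns such as `location` or `sample_size` are not checked in this sketch.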