Here you can find the code which scrapes and saves data from the Shopify App Store.
The scraper is used to collect the Shopify app store dataset published on Kaggle, which includes these files:
- apps
- apps_categories
- categories
- key_benefits
- pricing_plan_features
- pricing_plans
- reviews
While the dataset published on Kaggle is regularly updated, this repository lets you keep a local copy up to date independently of the released version.
A detailed dataset description can be found here.
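After a run has produced a local copy, it can be useful to confirm that every file listed above was written. The following is a minimal sketch, not part of the scraper itself: it assumes the files land in the `output/` folder and are named `<name>.csv`. The `.csv` extension and the helper name `missing_tables` are assumptions made for illustration.

```python
from pathlib import Path

# File names as listed in this README; the ".csv" extension is an assumption.
EXPECTED_TABLES = [
    "apps",
    "apps_categories",
    "categories",
    "key_benefits",
    "pricing_plan_features",
    "pricing_plans",
    "reviews",
]


def missing_tables(output_dir="output", extension=".csv"):
    """Return the expected names that have no matching file in output_dir."""
    folder = Path(output_dir)
    return [
        name
        for name in EXPECTED_TABLES
        if not (folder / f"{name}{extension}").is_file()
    ]


if __name__ == "__main__":
    missing = missing_tables()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

If the scraper writes a different extension, pass it explicitly, for example `missing_tables("output", ".json")`.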
Authenticate to GitHub Container Registry (if not already authenticated)
    docker login ghcr.io -u USERNAME -p TOKEN
Pull the container
    docker pull ghcr.io/usernam3/shopify-app-store-scraper
Run the container
    docker run -v `pwd`/output/:/app/output/ ghcr.io/usernam3/shopify-app-store-scraper
After the container has finished, check the output folder (in the current directory)
    ls -la output/

Install requirements
    pip install -r requirements.txt
Run the scraper
    scrapy crawl app_store
After the scraper has finished, check the output folder (in the current directory)
    ls -la output/

Please don't hesitate to open issues or PRs at any time if you need help with anything.