
gutenbergr - Download and Process Public Domain Works from Project Gutenberg
Download and process public domain works in the Project Gutenberg collection <https://www.gutenberg.org/>. Includes metadata for all Project Gutenberg works, so that they can be searched and retrieved.
digital-humanities, natural-language-processing, nlp, peer-reviewed, project-gutenberg, public-domain, text-mining
11.11 score 115 stars 1 dependents 1.4k scripts 2.1k downloads
robotstxt - A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately the package makes it easy to check if bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
9.44 score 69 stars 5 dependents 422 scripts 1.6k downloads
readMDTable - Read Markdown Tables into Tibbles
Efficient reading of raw markdown tables into tibbles. Designed to accept content from strings, files, and URLs with the ability to extract and read multiple tables from markdown for analysis.
data, data-analysis, data-analytics, data-extraction, data-mining, data-science, markdown, markdown-parser, markdown-table, r-programming, tables
5.86 score 8 stars 2 dependents 5 scripts 1.5k downloads