Remove the ^ from regex that is looking for Wiki article links. #114

Merged
REMitchell merged 1 commit into REMitchell:master from ACPK:patch-4 on Feb 12, 2023
Conversation

@ACPK (Contributor) commented Feb 6, 2023

NOTE: This relates to the regex on page 73; the same regex on page 78 already omits the ^.

The links being scraped are now absolute (full) URLs. The rule should look for links that contain `/wiki/` rather than start with `/wiki/`.

Ex: `https://en.wikipedia.org/wiki/Benevolent_dictator_for_life` rather than `/wiki/Benevolent_dictator_for_life`
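
To illustrate the difference, here is a minimal sketch. The two patterns are assumptions written for this example (the exact regex on page 73 may differ), and `matches` is a hypothetical helper, not code from the book:

```python
import re

# Assumed patterns for illustration only; the book's exact regex may differ.
# Both reject links whose article part contains a colon (e.g. Category: pages).
anchored = re.compile(r'^(/wiki/)((?!:).)*$')    # href must START with /wiki/
unanchored = re.compile(r'(/wiki/)((?!:).)*$')   # href only needs to CONTAIN /wiki/

relative = '/wiki/Benevolent_dictator_for_life'
absolute = 'https://en.wikipedia.org/wiki/Benevolent_dictator_for_life'


def matches(pattern, href):
    """Return True if the pattern is found anywhere in href."""
    return pattern.search(href) is not None


for name, pattern in (('anchored', anchored), ('unanchored', unanchored)):
    for href in (relative, absolute):
        print(name, href, matches(pattern, href))
```

With the `^` in place, only the relative link should match; without it, both forms should match, while links whose article part contains a colon are still rejected.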
@ACPK changed the title from "Remove ^ from Regex for Wiki links" to "Remove ^ from Regex" on Feb 6, 2023
@ACPK changed the title from "Remove ^ from Regex" to "Remove the ^ from Regex that is looking for Wiki article links." on Feb 6, 2023
@ACPK changed the title from "Remove the ^ from Regex that is looking for Wiki article links." to "Remove the ^ from regex that is looking for Wiki article links." on Feb 6, 2023
@REMitchell merged commit 8462666 into REMitchell:master on Feb 12, 2023