Filter rows in an output CSV
You can use HTMLTab in combination with xsv to output only those rows within a table that match a string or regular expression. For example, to get a list of countries whose entire sovereignty is disputed by another state, run:
htmltab --select 2 https://en.wikipedia.org/wiki/List_of_sovereign_states \
| xsv search -s 3 "None" --invert-match \
| xsv select 1
htmltab --select 2 selects the second <table> element on the Wikipedia page and converts it to CSV.
xsv search --select 3 --invert-match "None" (written with the short form -s 3 in the command above) filters out all CSV rows where the third column contains None.
xsv select 1 extracts only the first column from the remaining data.
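The two xsv steps can also be tried offline before pointing the pipeline at a live page. The sketch below is only an illustration: the three-row sample CSV and its column names are made up to stand in for the Wikipedia table, and it assumes xsv is installed and on your PATH. It also anchors the pattern as a regular expression (^None$), so that only cells that are exactly None are excluded, rather than any cell that merely contains that substring.

```shell
# Hypothetical three-row sample standing in for the Wikipedia table.
# The column names and values are illustrative, not real htmltab output.
sample='name,membership,dispute
Armenia,UN member state,Not recognised by Pakistan
France,UN member state,None
Cyprus,UN member state,Not recognised by Turkey'

# Same filter as the command above, but with an anchored regular
# expression: drop rows whose third column is exactly "None",
# then keep only the first column.
result=$(printf '%s\n' "$sample" \
  | xsv search -s 3 --invert-match '^None$' \
  | xsv select 1)

printf '%s\n' "$result"
```

With xsv available, this should print the header name followed by Armenia and Cyprus, since the France row is the only one whose third column is None. Whether the anchored pattern or the plain substring match is the better choice for the real table depends on what the cells on the page actually contain.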
As of the time of writing, that command outputs:
Common and formal names
Armenia – Republic of Armenia
China – People's Republic of China[o]
Cyprus – Republic of Cyprus
Israel – State of Israel
North Korea – Democratic People's Republic of Korea
Palestine – State of Palestine
South Korea – Republic of Korea