The source data was scraped from either the "Cars" section of each publication, where the publication had one, or from the results for the search terms "Electric Vehicle".
The original web scrape ran every five minutes and captured the headline changes that some publications made, presumably to increase clicks.
The scraping operation produced 1.9 million observations from August to January. A further web scrape is currently under way to capture a full 12-month period of headlines.
See the report section on feature engineering for details of the data engineering carried out on this data set.
# | Column | Dtype | Description |
---|---|---|---|
1 | Date | datetime64[ns] | Date / Time the scrape operation took place |
2 | Publication | object | The web site / newspaper |
3 | Headline | object | The Text of the headline |
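
Because the scrape ran every five minutes, the same headline is likely to appear in many consecutive observations. The following is a minimal sketch, not the author's pipeline, of how the three columns above might be loaded with pandas and collapsed to one row per distinct headline. The sample rows, the publication names, the year, and the helper names `load_headlines` and `collapse_repeats` are all invented for illustration; the real file name and storage format are not stated here, so the example reads from an in-memory CSV string.

```python
import io

import pandas as pd

# Hypothetical sample in the three-column layout described above.
# These rows are invented for illustration, not taken from the data set.
SAMPLE_CSV = """Date,Publication,Headline
2023-08-01 09:00:00,Example Times,Electric vehicle sales rise
2023-08-01 09:05:00,Example Times,Electric vehicle sales rise
2023-08-01 09:10:00,Example Times,Electric vehicle sales SURGE
2023-08-01 09:00:00,Sample Post,New charger network announced
2023-08-01 09:05:00,Sample Post,New charger network announced
"""


def load_headlines(source) -> pd.DataFrame:
    """Read scrape output (assumed to be CSV) and parse Date as a datetime column."""
    return pd.read_csv(source, parse_dates=["Date"])


def collapse_repeats(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (Publication, Headline) with first/last time seen."""
    return (
        df.groupby(["Publication", "Headline"], as_index=False)
        .agg(
            first_seen=("Date", "min"),
            last_seen=("Date", "max"),
            observations=("Date", "size"),
        )
        .sort_values(["Publication", "first_seen"])
        .reset_index(drop=True)
    )


df = load_headlines(io.StringIO(SAMPLE_CSV))
summary = collapse_repeats(df)
print(summary)
```

Note that the table lists no article identifier, so a reworded headline shows up in this summary as a separate row rather than being linked to the headline it replaced. Linking the two would need an extra signal, such as text similarity within the same publication.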