The initial dataset consisted of 1.9 million records. This was filtered to produce distinct datasets that enabled the analysis and visualization of electric vehicle (EV) related news headlines.
First, a dataset was created by filtering to the date on which each headline first appeared. This dataset was used to analyze the volume and distribution of EV-related headlines and to compare them with general (non-EV) news headlines.
The second dataset was filtered to capture a unique headline each day, enabling tracking of how long a particular headline remained in search results over time. This analysis helped to shed light on the longevity of EV-related headlines and their persistence in the public discourse.
To further refine the datasets, EV-related headlines were separated from non-EV related headlines, enabling more targeted analysis and comparison between general news and EV-specific news. These filtered datasets allowed exploration of the frequency, duration, and distribution of EV-related headlines, providing valuable insights into the media's coverage of Electric Vehicles.
These filtered datasets enabled in-depth analysis and visualization of EV-related news headlines, ultimately informing an understanding of the media's role in shaping public perceptions of electric vehicles.
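The two filters described above can be sketched in pandas as follows. This is a minimal illustration, not the report's actual code: the sample rows, the variable names (`scrapes`, `first_seen`, `daily`, `days_visible`) and the choice to count longevity in distinct calendar days are all assumptions.

```python
import pandas as pd

# Hypothetical sample of scraped rows; the real data has about 1.9 million records.
scrapes = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            [
                "2023-09-14 08:00",
                "2023-09-14 20:00",
                "2023-09-15 08:00",
                "2023-09-16 08:00",
            ]
        ),
        "Publication": ["thesun", "thesun", "thesun", "bbc"],
        "Headline": ["EV explodes", "EV explodes", "EV explodes", "Budget news"],
    }
)

# Dataset 1: keep only the earliest scrape of each headline per publication.
first_seen = (
    scrapes.sort_values("Date")
    .drop_duplicates(subset=["Publication", "Headline"], keep="first")
    .reset_index(drop=True)
)

# Dataset 2: keep one row per headline per calendar day.
daily = (
    scrapes.assign(day=scrapes["Date"].dt.normalize())
    .drop_duplicates(subset=["Publication", "Headline", "day"])
    .reset_index(drop=True)
)

# Longevity: number of distinct days on which each headline was seen.
days_visible = daily.groupby(["Publication", "Headline"])["day"].nunique()
```

Counting distinct days rather than the gap between first and last sighting avoids overstating persistence when a headline disappears and later returns.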
# | Column | Dtype | Meta |
---|---|---|---|
1 | Date | datetime64[ns] | Date / Time the scrape operation took place |
2 | Publication | object | The web site / newspaper |
3 | Headline | object | The Text of the headline |
There were 19,553 duplicates in this dataset for Publication, Headline and Date. Where a publication has edited a headline after first publishing it, this will show as a new headline, with a new date/time.
There were no NaN values; every observation's fields were populated.
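Checks of this kind can be reproduced with standard pandas calls. The sketch below uses a small hypothetical dataframe with the three source columns; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical rows; the second row is an exact repeat of the first.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2023-09-14 08:00", "2023-09-14 08:00", "2023-09-15 08:00"]
        ),
        "Publication": ["thesun", "thesun", "bbc"],
        "Headline": ["EV explodes", "EV explodes", "Budget news"],
    }
)

# Rows that repeat an earlier Publication / Headline / Date combination.
dup_mask = df.duplicated(subset=["Publication", "Headline", "Date"])
n_duplicates = int(dup_mask.sum())

# Total number of missing values across all columns.
n_missing = int(df.isna().sum().sum())

print(f"duplicates: {n_duplicates}, missing values: {n_missing}")
```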
Additional fields were engineered from the source data and added to the dataframe. Source fields are in title case; engineered fields are in snake case.
# | Column | Dtype | Meta |
---|---|---|---|
1 | Date | datetime64[ns] | Date / Time the scrape operation took place |
2 | Publication | object | The web site / newspaper |
3 | Headline | object | The Text of the headline |
4 | time | object | Engineered field from Date field |
5 | month | object | Engineered field from Date field |
6 | month_num | int32 | Engineered field from Date field |
7 | day_of_week | object | Engineered field from Date field |
8 | day_num | int32 | Engineered field from Date field |
9 | date_str | object | Engineered field from Date field |
10 | combi | object | Combined publication field and title field |
Dtypes include datetime64[ns] (1) and int32 (2); memory usage: 135.7+ MB.
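Fields like these can be derived from the Date column with the pandas `.dt` accessor. The sketch below is illustrative only: the string formats, the meaning of `day_num` (taken here as day of the week with Monday = 0) and the separator used in `combi` are assumptions, since the report does not state them.

```python
import pandas as pd

# Hypothetical rows with the three source columns.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2023-09-14 08:30:00", "2023-10-16 17:45:00"]),
        "Publication": ["The Sun", "The Telegraph"],
        "Headline": ["Headline A", "Headline B"],
    }
)

df["time"] = df["Date"].dt.strftime("%H:%M:%S")        # assumed format
df["month"] = df["Date"].dt.month_name()
df["month_num"] = df["Date"].dt.month.astype("int32")
df["day_of_week"] = df["Date"].dt.day_name()
# Assumption: day_num is the weekday index (Monday = 0), not the day of the month.
df["day_num"] = df["Date"].dt.dayofweek.astype("int32")
df["date_str"] = df["Date"].dt.strftime("%Y-%m-%d")    # assumed format
# Assumption: combi joins the two text fields with a simple separator.
df["combi"] = df["Publication"] + " - " + df["Headline"]
```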
The publication field was a short form field added during the web scrape operation. This was expanded to the full Publication/Web Site title.
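A minimal sketch of this expansion, using the short codes and full titles listed in this section. The dictionary name, the sample dataframe and the fallback for unlisted codes are assumptions made for illustration.

```python
import pandas as pd

# Short code -> full title, as listed in this section.
PUBLICATION_TITLES = {
    "dailymail": "The Daily Mail",
    "independent": "The Independent",
    "spectator": "The Spectator",
    "telegraph": "The Telegraph",
    "dailystar": "The Daily Star",
    "bbc": "BBC",
    "thesun": "The Sun",
    "economist": "The Economist",
    "thetimes": "The Times",
    "express": "Express",
}

# Hypothetical rows; "unlistedpaper" is an invented code used to show the fallback.
df = pd.DataFrame({"Publication": ["bbc", "thesun", "express", "unlistedpaper"]})

# Map known codes to full titles; keep the original value for any unknown code.
df["Publication"] = (
    df["Publication"].map(PUBLICATION_TITLES).fillna(df["Publication"])
)
```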
Short Code | Full Publication Title |
---|---|
"dailymail" | "The Daily Mail" |
"independent" | "The Independent" |
"spectator" | "The Spectator" |
"telegraph" | "The Telegraph" |
"dailystar" | "The Daily Star" |
"bbc" | "BBC" |
"thesun" | "The Sun" |
"economist" | "The Economist" |
"thetimes" | "The Times" |
"express" | "Express" |
Additional datasets were derived from the source data by filtering, for use in further analysis.
Dataset | Contents | # Rows |
---|---|---|
all_headlines | Original source data after cleansing | 1,827,596 |
all_ev_headlines | all_headlines filtered to EV terms: 'electric', 'ev', 'evs', 'charger', 'charging', 'charge'. Further refined to remove non-relevant headlines, e.g. "Parking Charge", "Leads the Charge..." etc. | 621,446 |
all_daily_headlines | all_headlines filtered to the Date of the headline. A daily list of headlines. | 96,639 |
ev_daily_headlines | all_daily_headlines filtered to EV terms: 'electric', 'ev', 'evs', 'charger', 'charging', 'charge'. Further refined to remove non-relevant headlines, e.g. "Parking Charge", "Leads the Charge..." etc. | 27,581 |
all_unique_headlines | all_headlines filtered to unique Publication and Headline, creating a unique list by publication of the headlines that were published. | 19,554 |
ev_unique_headlines | all_ev_headlines filtered to unique Publication and Headline, creating a unique list by publication of the EV headlines that were published. | 1,316 |
sentiment-manual-refined | ev_unique_headlines with the sentiment analysis from TextBlob, Llama 3.1b LLM and the Hand Tuned Sentiment. | 1,316 |
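The EV term filter and the unique-headline step can be sketched as below. This is an illustration under assumptions, not the report's code: whole-word, case-insensitive matching is assumed, the exclusion list contains only the two phrases quoted above (the full list is not given), and the sample rows are partly hypothetical.

```python
import re

import pandas as pd

EV_TERMS = ["electric", "ev", "evs", "charger", "charging", "charge"]
# Assumption: only the two exclusion phrases quoted in the text are known.
EXCLUDE_PHRASES = ["parking charge", "leads the charge"]

# Whole-word match so that "ev" does not match inside words such as "Every".
ev_pattern = r"\b(?:" + "|".join(map(re.escape, EV_TERMS)) + r")\b"
exclude_pattern = "|".join(map(re.escape, EXCLUDE_PHRASES))

all_headlines = pd.DataFrame(
    {
        "Publication": [
            "The Sun", "The Sun", "Express", "BBC", "BBC", "The Telegraph",
        ],
        "Headline": [
            "EV explodes as firefighters used 6,000 litres of water to extinguish fire",
            "EV explodes as firefighters used 6,000 litres of water to extinguish fire",
            "Council raises parking charge again",   # hypothetical
            "Team leads the charge in cup final",    # hypothetical
            "Every driver should read this",         # hypothetical
            "Burning electric cars must be dunked in baths of water to stop fires spreading",
        ],
    }
)

is_ev = all_headlines["Headline"].str.contains(ev_pattern, case=False, regex=True)
is_excluded = all_headlines["Headline"].str.contains(
    exclude_pattern, case=False, regex=True
)

# EV headlines with the non-relevant phrases removed.
all_ev_headlines = all_headlines[is_ev & ~is_excluded]

# One row per Publication / Headline pair.
ev_unique_headlines = all_ev_headlines.drop_duplicates(
    subset=["Publication", "Headline"]
)
```

A fixed phrase list will not catch every false positive, so a manual review pass after filtering is likely still needed.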
The Sentiment Analysis dataset used ev_unique_headlines, a filtered dataset of all the unique EV headlines, as the basis from which analysis was carried out.
Additional fields were added for each sentiment analysis method: Llama 3.1b LLM, TextBlob and Hand-Tuned.
Because the results from Llama 3.1b LLM and TextBlob varied widely, only the Hand-Tuned sentiment analysis was used for the analysis and visualisations in this report. While this approach is open to personal bias, dates and publication titles were hidden during the review process to minimise it.
Sentiments were classified in one of three ways:
Sentiment | Description |
---|---|
Positive | |
Neutral | |
Negative | The headline was viewed as negative. Typically these headlines presented unsubstantiated claims or common misinformation, directly criticized or denigrated electric vehicles, or featured negative Subject Matter Expert opinions or negative personal experiences. Generally speaking, if the headline promotes FUD (Fear, Uncertainty, Doubt) about electric vehicles, it is classified as negative. |
Title | Headline | Date | Hand Tuned Sentiment | Textblob Sentiment | Llama 3.1b LLM |
---|---|---|---|---|---|
The Sun | EV explodes as firefighters used 6,000 litres of water to extinguish fire | 14-09-2023 | Negative | Neutral | Neutral |
The Telegraph | Burning electric cars must be dunked in baths of water to stop fires spreading | 16-10-2023 | Negative | Neutral | Neutral |
Express | I'm an EV expert - these electric car tax fees should not be raised significantly more | 05-11-2023 | Negative | Positive | Neutral |
The Sun | I've driven thousands of cars - no one wants to buy an EV for these three reasons | 09-10-2023 | Negative | Positive | Neutral |
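One simple way to quantify how far the automated methods diverge from the hand-tuned labels is an agreement rate. The sketch below applies it to the four example rows above only; the function name is illustrative, and the result says nothing about the full set of 1,316 headlines.

```python
# Labels copied from the four example rows: (hand_tuned, textblob, llama).
labels = [
    ("Negative", "Neutral", "Neutral"),
    ("Negative", "Neutral", "Neutral"),
    ("Negative", "Positive", "Neutral"),
    ("Negative", "Positive", "Neutral"),
]


def agreement_rate(pairs):
    """Return the fraction of (reference, candidate) pairs with identical labels."""
    pairs = list(pairs)
    return sum(ref == cand for ref, cand in pairs) / len(pairs)


textblob_agreement = agreement_rate((hand, tb) for hand, tb, _ in labels)
llama_agreement = agreement_rate((hand, llm) for hand, _, llm in labels)

print(f"TextBlob vs hand-tuned: {textblob_agreement:.0%}")
print(f"Llama vs hand-tuned: {llama_agreement:.0%}")
```

On these four rows neither automated method matches the hand-tuned label, which is consistent with the decision to rely on the hand-tuned classification.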