
Analysing Mainstream Media Headlines for Electric Vehicles

Data Preparation and Filtering

The initial dataset consisted of 1.9 million records. This was filtered to produce distinct datasets that enabled the analysis and visualization of electric vehicle (EV) related news headlines.

First, a dataset was created by filtering to the date on which each headline first appeared. This dataset was used to analyze the volume and distribution of EV-related headlines and to compare them with general (non-EV) news headlines.

The second dataset was filtered to capture a unique headline each day, enabling tracking of how long a particular headline remained in search results over time. This analysis helped to shed light on the longevity of EV-related headlines and their persistence in the public discourse.
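The two filters described above could be sketched in pandas roughly as follows. This is a minimal illustration, not the author's code: the sample rows are invented, the names `first_seen`, `daily` and `days_visible` are hypothetical, and it assumes that "unique" means unique per publication.

```python
import pandas as pd

# Invented sample: the same headline is captured by several scrapes.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2023-10-01 08:00", "2023-10-01 14:00",
        "2023-10-02 08:00", "2023-10-02 08:00",
    ]),
    "Publication": ["bbc", "bbc", "bbc", "thesun"],
    "Headline": ["Headline A", "Headline A", "Headline A", "Headline B"],
})

# Dataset 1: keep only the first scrape in which each publication showed each headline.
first_seen = (
    df.sort_values("Date")
      .drop_duplicates(subset=["Publication", "Headline"], keep="first")
)

# Dataset 2: keep one row per publication, headline and calendar day.
daily = (
    df.assign(day=df["Date"].dt.date)
      .sort_values("Date")
      .drop_duplicates(subset=["Publication", "Headline", "day"], keep="first")
)

# Longevity: number of distinct days on which each headline was seen.
days_visible = daily.groupby(["Publication", "Headline"])["day"].nunique()
```

Counting distinct days in the daily dataset is one simple way to express how long a headline persisted; other measures, such as last date minus first date, would also be reasonable.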

To further refine the datasets, EV-related headlines were separated from non-EV headlines, enabling more targeted analysis and comparison between general news and EV-specific news. These filtered datasets allowed exploration of the frequency, duration, and distribution of EV-related headlines in the media's coverage of electric vehicles.

Together, these filtered datasets enabled in-depth analysis and visualization of EV-related news headlines, informing an understanding of the media's role in shaping public perceptions of electric vehicles.

Source Metadata

The web scrape operation stored its results as a CSV file, with each row representing one observation.
| # | Column | Dtype | Meta |
|---|--------|-------|------|
| 1 | Date | datetime64[ns] | Date/time the scrape operation took place |
| 2 | Publication | object | The web site / newspaper |
| 3 | Headline | object | The text of the headline |
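As a rough illustration of how a CSV with these three columns might be loaded, the sketch below uses `pandas.read_csv` with `parse_dates`. The inline CSV text and its rows are placeholders; the real file name and contents are not given here.

```python
import io
import pandas as pd

# Placeholder stand-in for the scraped CSV file (rows are invented).
csv_text = """Date,Publication,Headline
2023-09-14 09:30:00,thesun,Example headline one
2023-10-16 18:05:00,telegraph,Example headline two
"""

# parse_dates turns the Date column into a datetime dtype;
# the text columns are loaded as string-like columns (object in the table above).
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(df.dtypes)
```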


Duplicates

There were 19,553 duplicates in this dataset for Publication, Headline and Date. Where a publication has edited a headline after first publishing it, this will show as a new headline, with a new date/time.
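A duplicate count of this kind can be obtained with `DataFrame.duplicated`. The sketch below uses invented rows and assumes the three key columns named above; it also shows why an edited headline is not treated as a duplicate.

```python
import pandas as pd

# Invented rows: two identical captures, plus an edited version of the headline.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-10-01 08:00", "2023-10-01 08:00", "2023-10-01 09:00"]),
    "Publication": ["bbc", "bbc", "bbc"],
    "Headline": ["Headline A", "Headline A", "Headline A (edited)"],
})

key = ["Publication", "Headline", "Date"]

# duplicated() flags every repeat after the first occurrence of a key.
n_duplicates = int(df.duplicated(subset=key).sum())

# The edited headline has different text and a different timestamp, so it is kept.
deduped = df.drop_duplicates(subset=key)
```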

NaN Values

There were no NaN values; every field of every observation was populated.
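A check like this is typically a one-liner in pandas. The sketch below, on invented rows, counts missing values per column and reduces them to a single flag.

```python
import pandas as pd

# Invented rows with every field populated.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-10-01 08:00", "2023-10-02 08:00"]),
    "Publication": ["bbc", "thesun"],
    "Headline": ["Headline A", "Headline B"],
})

# Count of missing values in each column.
missing_per_column = df.isna().sum()

# True if any cell in the frame is missing.
has_missing = bool(df.isna().any().any())
```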

Feature Engineering

Additional fields were engineered from the source data and added to the dataframe. Source fields use title case; engineered fields use snake case.

| # | Column | Dtype | Meta |
|---|--------|-------|------|
| 1 | Date | datetime64[ns] | Date/time the scrape operation took place |
| 2 | Publication | object | The web site / newspaper |
| 3 | Headline | object | The text of the headline |
| 4 | time | object | Engineered field from Date field |
| 5 | month | object | Engineered field from Date field |
| 6 | month_num | int32 | Engineered field from Date field |
| 7 | day_of_week | object | Engineered field from Date field |
| 8 | day_num | int32 | Engineered field from Date field |
| 9 | date_str | object | Engineered field from Date field |
| 10 | combi | object | Combined publication field and title field |

Dtypes: datetime64[ns] (1), int32 (2), object (7). Memory usage: 135.7+ MB.
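The engineered fields listed above could be derived with the pandas `.dt` accessor along the following lines. This is a sketch under assumptions: the string formats, the meaning of `day_num` (taken here as day of week, Monday = 0) and the separator used in `combi` are not stated in the source and are guesses.

```python
import pandas as pd

# Invented single-row frame using the source column names.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-09-14 09:30:00"]),
    "Publication": ["thesun"],
    "Headline": ["Example headline"],
})

df["time"] = df["Date"].dt.strftime("%H:%M:%S")        # assumed format
df["month"] = df["Date"].dt.month_name()               # e.g. "September"
df["month_num"] = df["Date"].dt.month                  # 1-12
df["day_of_week"] = df["Date"].dt.day_name()           # e.g. "Thursday"
df["day_num"] = df["Date"].dt.dayofweek                # assumed: Monday = 0
df["date_str"] = df["Date"].dt.strftime("%Y-%m-%d")    # assumed format
df["combi"] = df["Publication"] + " " + df["Headline"] # assumed separator
```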


Additional engineering

The Publication field held a short-form code that was added during the web scrape operation. Each code was expanded to the full publication or web site title.

Short Code Full Publication Title
"dailymail" "The Daily Mail"
"independent" "The Independent"
"spectator" "The Spectator"
"telegraph" "The Telegraph"
"dailystar" "The Daily Star"
"bbc" "BBC"
"thesun" "The Sun"
"economist" "The Economist"
"thetimes" "The Times"
"express" "Express"

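One straightforward way to expand the short codes is a dictionary lookup with `Series.map`. The mapping below is copied from the table of short codes and full titles; the sample frame and the fallback for unknown codes are illustrative assumptions.

```python
import pandas as pd

# Short code -> full title, as listed in the table of publication codes.
PUBLICATION_TITLES = {
    "dailymail": "The Daily Mail",
    "independent": "The Independent",
    "spectator": "The Spectator",
    "telegraph": "The Telegraph",
    "dailystar": "The Daily Star",
    "bbc": "BBC",
    "thesun": "The Sun",
    "economist": "The Economist",
    "thetimes": "The Times",
    "express": "Express",
}

# Invented sample, including a code that is not in the mapping.
df = pd.DataFrame({"Publication": ["bbc", "thesun", "unknowncode"]})

# map() yields NaN for codes missing from the dictionary;
# fillna() then falls back to the original short code.
df["Publication"] = df["Publication"].map(PUBLICATION_TITLES).fillna(df["Publication"])
```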

Filtered Data Sets

Additional filtered datasets were derived from the source data for further analysis.

| Dataset | Contents | # Rows |
|---------|----------|--------|
| all_headlines | Original source data after cleansing | 1,827,596 |
| all_ev_headlines | all_headlines filtered to EV terms: 'electric', 'ev', 'evs', 'charger', 'charging', 'charge'. Further refined to remove non-relevant headlines, e.g. "Parking Charge", "Leads the Charge..." etc. | 621,446 |
| all_daily_headlines | all_headlines filtered to the Date of the headline. A daily list of headlines. | 96,639 |
| ev_daily_headlines | all_daily_headlines filtered to EV terms: 'electric', 'ev', 'evs', 'charger', 'charging', 'charge'. Further refined to remove non-relevant headlines, e.g. "Parking Charge", "Leads the Charge..." etc. | 27,581 |
| all_unique_headlines | all_headlines filtered to unique Publication and Headline, creating a unique list by publication of the headlines that were published. | 19,554 |
| ev_unique_headlines | all_ev_headlines filtered to unique Publication and Headline, creating a unique list by publication of the EV headlines that were published. | 1,316 |
| sentiment-manual-refined | ev_unique_headlines with the sentiment analysis from TextBlob, Llama 3.1b LLM and the Hand Tuned Sentiment. | 1,316 |
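The EV-term filter can be illustrated with a word-boundary regular expression plus an exclusion list. The terms come from the dataset descriptions above; the two exclusion phrases are only the examples quoted there (the full list used in the analysis is not given), and the function name `is_ev_headline` is hypothetical.

```python
import re

# EV terms quoted in the dataset descriptions.
EV_TERMS = ["electric", "ev", "evs", "charger", "charging", "charge"]

# Only the two example phrases from the text; the real exclusion list is presumably longer.
EXCLUDE_PHRASES = ["parking charge", "leads the charge"]

# \b word boundaries stop "ev" from matching inside words such as "every".
EV_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in EV_TERMS) + r")\b",
    re.IGNORECASE,
)

def is_ev_headline(headline: str) -> bool:
    """Return True if the headline contains an EV term and no excluded phrase."""
    text = headline.lower()
    if any(phrase in text for phrase in EXCLUDE_PHRASES):
        return False
    return EV_PATTERN.search(headline) is not None
```

With a dataframe, the same function could be applied as `all_headlines[all_headlines["Headline"].map(is_ev_headline)]`. Note that this simplified version drops any headline containing an excluded phrase, even if it also mentions an EV term elsewhere.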


Sentiment Analysis Data Set

The Sentiment Analysis dataset used ev_unique_headlines as the basis for the analysis. ev_unique_headlines is a filtered dataset of all the unique EV headlines.

Additional fields were added for each sentiment analysis method: Llama 3.1b LLM, TextBlob and Hand-Tuned.

Because the results from Llama 3.1b LLM and TextBlob varied widely, only the Hand-Tuned sentiment analysis was used for the analysis and visualisations in this report. Although hand tuning carries a risk of personal bias, dates and publication titles were hidden during the review process to minimise it.
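One way such a blinded review could be set up is sketched below: the identifying columns are dropped before labelling and the labels are joined back afterwards. The shuffling step, the column name `hand_tuned_sentiment` and the sample rows are assumptions for illustration; the source only states that dates and publication titles were hidden.

```python
import pandas as pd

# Invented sample of unique EV headlines.
ev_unique_headlines = pd.DataFrame({
    "Date": pd.to_datetime(["2023-09-14", "2023-10-16", "2023-11-05"]),
    "Publication": ["The Sun", "The Telegraph", "Express"],
    "Headline": ["Headline A", "Headline B", "Headline C"],
})

# Shuffle the rows (assumed step) and hide the identifying columns.
# The original index is kept so labels can be matched back later.
review_sheet = (
    ev_unique_headlines.sample(frac=1, random_state=42)  # seed is arbitrary
    .drop(columns=["Date", "Publication"])
)
review_sheet["hand_tuned_sentiment"] = ""  # filled in by the reviewer

# After review, join the labels back to the full records on the index.
labelled = ev_unique_headlines.join(review_sheet["hand_tuned_sentiment"])
```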

Sentiments were classified in one of three ways:

Positive: The headline was viewed as positive towards electric vehicles.

Neutral: The headline was viewed as neutral, generally reporting verifiable facts.

Negative: The headline was viewed as negative. Typically these headlines presented unsubstantiated claims or common misinformation, directly criticized or denigrated electric vehicles, or led with negative subject matter expert opinions or negative personal experiences. Generally speaking, if a headline promoted FUD (Fear, Uncertainty, Doubt) about electric vehicles, it was classified as negative.

Examples of the Sentiment Analysis results

| Title | Headline | Date | Hand Tuned Sentiment | TextBlob Sentiment | Llama 3.1b LLM |
|-------|----------|------|----------------------|--------------------|----------------|
| The Sun | EV explodes as firefighters used 6,000 litres of water to extinguish fire | 14-09-2023 | Negative | Neutral | Neutral |
| The Telegraph | Burning electric cars must be dunked in baths of water to stop fires spreading | 16-10-2023 | Negative | Neutral | Neutral |
| Express | I'm an EV expert - these electric car tax fees should not be raised significantly more | 05-11-2023 | Negative | Positive | Neutral |
| The Sun | I've driven thousands of cars - no one wants to buy an EV for these three reasons | 09-10-2023 | Negative | Positive | Neutral |
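To make the disagreement between methods concrete, the sketch below computes how often each automated method matched the hand-tuned label, using only the four example rows above. The helper `agreement_rate` is a hypothetical name, and four rows are far too few to say anything about the full dataset.

```python
# Labels from the four example rows: (hand-tuned, TextBlob, Llama 3.1b LLM).
examples = [
    ("Negative", "Neutral", "Neutral"),
    ("Negative", "Neutral", "Neutral"),
    ("Negative", "Positive", "Neutral"),
    ("Negative", "Positive", "Neutral"),
]

def agreement_rate(rows, method_index):
    """Share of rows where the method at method_index matches the hand-tuned label."""
    matches = sum(1 for row in rows if row[method_index] == row[0])
    return matches / len(rows)

textblob_agreement = agreement_rate(examples, 1)
llama_agreement = agreement_rate(examples, 2)
```

On these four examples neither automated method matches the hand-tuned label once, which is consistent with the decision to rely on the hand-tuned classification.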
