
Food Standards Agency - Alert Monitoring, Management, Analysis, Visualisations & Infographic

Web Scrape Front End

Maintenance of the data is carried out through the web scrape front end.

The Web Scrape front end is a Python app that handles the scraping, cleansing and feature engineering of incoming data from the food.gov.uk web site.

The identification of new alerts and the push-button scraping are fully automated, but a human still needs to review and edit the incoming data. This manual step improves the quality of the dataset considerably, because it adds features that the FSA does not currently maintain.

For example :

Tesco are recalling Tesco Finest 6 All Butter Pastry Mince Pies because they may contain pieces of dried glue from packaging which makes them unsafe to eat.

Additional features that can be deduced or inferred from this information:

  • Brand : in this example would be Tesco
  • Supplier : is unknown
  • Supplier Type : is unknown. Obviously someone makes these for Tesco, but who exactly is not noted in this instance.
  • Outlet : Tesco - The outlet the consumer would buy from.
  • Outlet Type : Grocer
  • Product Category : Bread or Baked Goods
  • Product Type : Bread or Baked Goods
  • Other Contaminant : Bits of Glue
    Another example from a recent alert:

    FGS Ingredients Ltd is recalling a number of products containing mustard powder because they may contain peanuts. This means these products are a possible health risk for anyone with an allergy to peanuts. These products are sold under several different brand names at several different retail stores.

    As part of the scrape process, additional product details are included :

    ** Frozen Iceland Takeaway Chinese Style Chicken Curry - 375g

  • Brand : Iceland
  • Supplier : FGS Ingredients
  • Supplier Type : Manufacturer.
  • Outlet : Iceland
  • Outlet Type : Grocer
  • Product Category : Ready Meal / Ready to Eat
  • Product Type : Ready Meal / Ready to Eat
  • Allergen Contaminant: nuts
    Recording these features is simplified through the Web Scrape front end.
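
As an illustration of the features listed in the two examples above, the sketch below holds one product row in a small record type. The class name, field names and the "Unknown" placeholder are assumptions made for this example; the actual schema used by the front end is not shown here.

```python
from dataclasses import dataclass, asdict
from typing import Optional

UNKNOWN = "Unknown"  # assumed placeholder for fields the notice does not state


@dataclass
class AlertProduct:
    """One product row attached to an alert (hypothetical field names)."""
    product_name: str
    brand: str = UNKNOWN
    supplier: str = UNKNOWN
    supplier_type: str = UNKNOWN
    outlet: str = UNKNOWN
    outlet_type: str = UNKNOWN
    product_category: str = UNKNOWN
    product_type: str = UNKNOWN
    allergen_contaminant: Optional[str] = None
    other_contaminant: Optional[str] = None


def fields_needing_review(product):
    """Return the names of fields still set to the UNKNOWN placeholder."""
    return [name for name, value in asdict(product).items() if value == UNKNOWN]


# The Tesco mince pie example, with supplier details left unknown.
mince_pies = AlertProduct(
    product_name="Tesco Finest 6 All Butter Pastry Mince Pies",
    brand="Tesco",
    outlet="Tesco",
    outlet_type="Grocer",
    product_category="Bread or Baked Goods",
    product_type="Bread or Baked Goods",
    other_contaminant="Bits of Glue",
)

print(fields_needing_review(mince_pies))
```

Keeping an explicit placeholder for unknown values makes it straightforward to list which fields a reviewer still has to fill in by hand.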

    Once the data has been collated and added to the FSA Safety Alert dataset, the analysis, the production of the visualisations and infographic, and publishing to the dev and public servers all run automatically.

    The master list

    The master list contains the data held so far: the front page search results plus the edited product information.

    Web Scrape new Data

    The master list is updated by initiating a fresh scrape of the search results from the food.gov.uk web site. Any alerts not already in the master list are added to it and identified as new.
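
The update step described above amounts to merging freshly scraped search results into the existing list while skipping anything already present. A minimal sketch follows, assuming each alert is a dict with a unique "reference" key and that new rows are marked with an "is_new" flag; both names, and the sample values, are hypothetical.

```python
def merge_new_alerts(master, scraped):
    """Append alerts whose reference is not yet in the master list.

    Returns the references that were added so they can be flagged for review.
    """
    known = {row["reference"] for row in master}
    added = []
    for row in scraped:
        ref = row["reference"]
        if ref not in known:
            # Copy the scraped row and mark it so the reviewer can spot it.
            master.append({**row, "is_new": True})
            known.add(ref)
            added.append(ref)
    return added


# Made-up sample data, not real alert references.
master_list = [{"reference": "ALERT-001", "title": "Example alert A", "is_new": False}]
search_results = [
    {"reference": "ALERT-001", "title": "Example alert A"},
    {"reference": "ALERT-002", "title": "Example alert B"},
]

new_refs = merge_new_alerts(master_list, search_results)
print(new_refs)
```

Because the set of known references is updated as rows are added, running the same scrape twice does not create duplicates.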

    Scrape Alert Notice Detail

    The user initiates a web scrape of the alert notice which populates as much of the data set as possible, leaving the remaining fields to be hand edited.
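
One way to populate part of a record before hand editing is to pattern-match the summary sentence of a notice. The sketch below is only an assumption about how such a pre-fill could work, not a description of the front end's actual parser; the pattern and the field names are hypothetical, and it only handles summaries of the form "X is/are recalling Y because Z".

```python
import re

# Matches a single-sentence summary: "<who> is|are recalling <what> because <why>."
RECALL_PATTERN = re.compile(
    r"^(?P<recalled_by>.+?)\s+(?:is|are)\s+recalling\s+"
    r"(?P<product>.+?)\s+because\s+(?P<reason>.+?)\.?$"
)


def prefill_from_summary(summary):
    """Return a draft record; fields are None when the pattern does not match."""
    match = RECALL_PATTERN.match(summary.strip())
    if match is None:
        return {"recalled_by": None, "product": None, "reason": None}
    return match.groupdict()


summary = (
    "Tesco are recalling Tesco Finest 6 All Butter Pastry Mince Pies because "
    "they may contain pieces of dried glue from packaging which makes them "
    "unsafe to eat."
)
draft = prefill_from_summary(summary)
print(draft["recalled_by"])
print(draft["product"])
```

Anything the pattern cannot extract stays empty, which matches the workflow of leaving the remaining fields to be edited by hand.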

    Searching the FSA data also produces a time series plot showing each month and year a notice was issued for the search term.
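
The data behind such a plot is a count of matching notices per month and year. A minimal sketch, assuming each alert is a dict with a "title" string and an "issued" date (hypothetical keys) and using made-up sample rows:

```python
from collections import Counter
from datetime import date


def monthly_counts(alerts, term):
    """Count alerts whose title contains `term`, grouped by (year, month)."""
    term = term.lower()
    counts = Counter(
        (alert["issued"].year, alert["issued"].month)
        for alert in alerts
        if term in alert["title"].lower()
    )
    # Sort chronologically so the result can be plotted directly.
    return sorted(counts.items())


# Made-up sample rows for illustration only.
sample = [
    {"title": "Example recall: peanuts in product A", "issued": date(2024, 1, 5)},
    {"title": "Example recall: PEANUTS in product B", "issued": date(2024, 1, 20)},
    {"title": "Example recall: glue in product C", "issued": date(2024, 2, 2)},
    {"title": "Example recall: peanuts in product D", "issued": date(2024, 3, 9)},
]

series = monthly_counts(sample, "peanuts")
print(series)
```

Months with no matching notices are simply absent from the result, so a plotting step would need to fill those gaps with zeros if a continuous axis is wanted.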

    Visualisations

    A suite of visualisations has been developed that updates automatically whenever new data is added to the FSA Safety Alert dataset.

    Infographic

    The infographic is automatically maintained and published whenever the dataset is updated. The main SVG file is amended with new values using the lxml Python library, converted to PNG format using the cairosvg library, then published to the local dev and public servers. A human still needs to review the changes whenever there are significant new alerts.
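
The SVG amendment step can be sketched as follows. To keep the example free of third-party dependencies it uses the standard library's xml.etree.ElementTree in place of lxml (lxml.etree offers a largely compatible API), and the PNG conversion is shown only as a comment. The template, the element id and the value are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialise SVG elements without a namespace prefix.
ET.register_namespace("", SVG_NS)

# Placeholder template; the real infographic SVG is not shown here.
TEMPLATE = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">'
    '<text id="total-alerts" x="10" y="35">0</text>'
    "</svg>"
)


def update_svg_values(svg_text, values):
    """Replace the text of every element whose id appears in `values`."""
    root = ET.fromstring(svg_text)
    for element in root.iter():
        element_id = element.get("id")
        if element_id in values:
            element.text = str(values[element_id])
    return ET.tostring(root, encoding="unicode")


updated_svg = update_svg_values(TEMPLATE, {"total-alerts": 128})
print(updated_svg)

# Conversion to PNG needs the third-party cairosvg package, for example:
# import cairosvg
# cairosvg.svg2png(bytestring=updated_svg.encode("utf-8"), write_to="infographic.png")
```

Addressing elements by id keeps the layout of the SVG under the designer's control while the script only touches the values.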